school:classes:cs352:start (revision 2006/02/17 00:50 by aogail; current revision 2007/05/28 06:45, external edit)
====== CS 352 ======

===== Project =====

  * [[project/

===== HCI =====

  * External/
  * Emphasizes reflexes and stuff
| + | |||
| + | ===== Evaluation ===== | ||
| + | |||
| + | * Evaluation is part of the design cycle. | ||
| + | |||
| + | ==== Why evaluate? ==== | ||
| + | |||
| + | * If you make a mistake and don't catch it, it'll screw you later. | ||
| + | * If we think of design as iterative process, we need to evaluate whether we're getting better. | ||
| + | * Also, at each stage of design we make assumptions. We need to check whether those assumptions match reality. | ||
| + | |||
| + | ==== What is evaluation? ==== | ||
| + | |||
| + | * Different from requirements gathering: | ||
| + | * Testing a hypothesis. | ||
| + | * Often use different methods, more focused. | ||
| + | * Methods you choose depend on debates: | ||
| + | * Quant. vs. quals. | ||
| + | * Controlled vs. ecological validity | ||
| + | * Cost vs. relevance. | ||
| + | |||
| + | ==== Steps Involved ==== | ||
| + | |||
| + | * Formulate hypothesis. | ||
| + | * Hypothesis = statement of fact. | ||
| + | * Important to have hypothesis for data analysis. | ||
| + | * Design a test plan. | ||
| + | * Picking a method. | ||
| + | * Selecting users. | ||
| + | * Writing out procedure. | ||
| + | * Get IRB permission. | ||
| + | * Deal with users. | ||
| + | * Deal with data. | ||
| + | |||
| + | |||

==== Testing Methods ====

  * Formative
  * Artificial/
    * Isolate variables, level playing field.
    * Removes "
    * Thoroughly documented.
    * Focus **only** on your question.
    * Issues:
      * Putting people in a contrived environment changes how they interact.
      * Results from controlled experiments can't be directly compared to the real world.

==== Hypothesis Testing ====

  * Example hypotheses:
    * X is better/
    * X improved more than Y.

  - Specify the null hypothesis (H0) and the alternative hypothesis (H1).
  - Define H1 = true iff H0 = false.
  - Select a significance level, typically α = 0.05 or α = 0.10.
  - Sample the population and calculate statistics.
  - Calculate the probability (p-value) of obtaining a sta...
    (SEE SLIDES)
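The steps above can be sketched numerically. A minimal, hedged example assuming a two-sample z-test on task times (the slides may prescribe a different test, and all numbers here are made up), using only the Python standard library:

```python
import math

def p_value_two_sample(mean_a, mean_b, var_a, var_b, n_a, n_b):
    """Two-sided p-value for H0: the two population means are equal,
    using a normal approximation (z-test)."""
    se = math.sqrt(var_a / n_a + var_b / n_b)   # standard error of the difference
    z = (mean_a - mean_b) / se                  # test statistic
    # P(|Z| >= |z|) under H0, via the standard normal CDF (math.erf)
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical data: task completion times (s) for prototypes A and B
z, p = p_value_two_sample(mean_a=12.0, mean_b=14.5,
                          var_a=9.0, var_b=16.0,
                          n_a=30, n_b=30)
alpha = 0.05
reject_h0 = p < alpha   # reject H0 only if the p-value is below alpha
```

With these invented numbers the p-value comes out well below 0.05, so H0 would be rejected at the chosen significance level.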
| + | |||
| + | ==== Dealing with Data ==== | ||
| + | |||
| + | * Academic honesty key. | ||
| + | * Falsifiability of results. | ||
| + | * Need for meticulous records. | ||
| + | * Keep records unmodified. | ||
| + | * Objectivity. | ||
| + | * Peer review. | ||
| + | * Replication. | ||
| + | * Not done in software design. | ||
| + | |||
| + | ==== Statistical Significance ==== | ||
| + | |||
| + | * Statistical significance means: Two populations differ to a significant extent along some variable. | ||
| + | * Statistical significance does NOT mean noteworthy. | ||
| + | * Typically in either rate of occurance, or the value of some result. | ||
| + | * E.g. group A 2x likely to do well on tests than group B (statistically significant), | ||
| + | |||
| + | ==== Significance: | ||
| + | |||
| + | * What does significance mean? | ||
| + | * Type I: False negative. | ||
| + | * Type II: False positive. | ||
| + | * Set significance to balance risks of type I or II errors: | ||
| + | * When might low type I and high type II (vice versa) be preferable? | ||
| + | * These types of errors may arise from equipment limits, etc. | ||
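The significance level α is exactly the Type I (false positive) rate you agree to accept. A small seeded simulation sketches this (the sampling setup is an illustrative assumption, not from the notes): when H0 is actually true, repeated tests wrongly reject it at a rate close to α, and lowering α trades Type I errors for more Type II errors.

```python
import math
import random
import statistics

def type_i_rate(alpha, trials=4000, n=25, seed=1):
    """Simulate repeated two-sided z-tests when H0 (mean = 0) is true;
    the fraction of (wrong) rejections should come out close to alpha."""
    rng = random.Random(seed)
    # two-sided critical values for the standard normal distribution
    z_crit = {0.05: 1.96, 0.10: 1.645}[alpha]
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # H0 true
        z = statistics.mean(sample) * math.sqrt(n)        # known sigma = 1
        if abs(z) > z_crit:
            rejections += 1  # Type I error: rejecting a true H0
    return rejections / trials
```

With the same random seed, the stricter α = 0.05 threshold rejects a subset of the cases that α = 0.10 rejects, so its Type I rate is never higher.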
| + | |||
| + | ==== Predictive Models ==== | ||
| + | |||
| + | * Models used to predict human behavior, responses. | ||
| + | * Stimulus-Response | ||
| + | * Hick's law: | ||
| + | * Decision time to choose among N equally likely alternatives. | ||
| + | * T = Ic log2(n+1) | ||
| + | * Ic = time to recognize each item = 150msec | ||
| + | * Useful for pilot tests. | ||
| + | * Fitt's law. | ||
| + | * Time it takes to select something on screen. | ||
| + | * ID = log2(d/w + 1.0) | ||
| + | * d = distance; w = width of target; ID = index of difficulty | ||
| + | * Cognitive - human as interpreter/ | ||
| + | * Keystroke Level Model: | ||
| + | * Puts together lots of mini-models, | ||
| + | * Assigns times for basic human operations - experimentally verified. | ||
| + | * Based upon MHP. | ||
| + | * Accounts for: | ||
| + | * Keystroking: | ||
| + | * Mouse button press: Tb | ||
| + | * Pointing: Tp | ||
| + | * Hand movement between kbd/mouse: Th | ||
| + | * Drawing straight line segments: Td | ||
| + | * " | ||
| + | * System response time: Tr | ||
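The predictive models above reduce to one-line formulas, sketched here in Python. The Hick's-law constant is the 150 ms from the notes; the KLM operator times are commonly cited textbook values and are an assumption — substitute the values from your slides.

```python
import math

IC = 0.150  # Hick's law: time to recognize each item, 150 ms (from the notes)

def hick_decision_time(n):
    """Hick's law: decision time T = Ic * log2(n + 1) to choose
    among n equally likely alternatives."""
    return IC * math.log2(n + 1)

def fitts_index_of_difficulty(d, w):
    """Fitts's law: index of difficulty ID = log2(d/w + 1.0) in bits,
    where d = distance to target and w = width of target."""
    return math.log2(d / w + 1.0)

# Keystroke-Level Model: assumed operator times in seconds.
# Tk varies widely with typing skill; Td and Tr are task/system
# dependent and are omitted here.
KLM = {"Tk": 0.2, "Tb": 0.1, "Tp": 1.1, "Th": 0.4, "Tm": 1.35}

def klm_estimate(ops):
    """Sum operator times for a sequence like ['Th', 'Tp', 'Tb']."""
    return sum(KLM[op] for op in ops)
```

For example, choosing among 3 equally likely menu items gives T = 0.15 * log2(4) = 0.3 s, and a KLM sequence "move hand to mouse, point, click" sums the Th, Tp, and Tb operator times.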
| + | |||
| + | ==== Within-Subject or Between-Subject Design ==== | ||
| + | |||
| + | * Between subjects: Pool using prototype 1, separate pool using prototype 2. | ||
| + | * Clean statistics -- less noise. | ||
| + | * Within-subjects: | ||
| + | * Removes people variations. | ||
| + | |||
| + | ===== Heuristic Evaluation ===== | ||
| + | |||
| + | ==== Discount Usability Engineering ==== | ||
| + | |||
| + | * Cheap | ||
| + | * No special labs/ | ||
| + | * More careful you are, the better it gets. | ||
| + | * Fast | ||
| + | * On order of 1 day to apply. | ||
| + | * Standard usability testing may take a week. | ||
| + | * Easy to use | ||
| + | * Can be taught in 2-4 hours. | ||
| + | * Reliance on discount UE can lead to sloppiness. | ||
| + | * Easy to ignore more thorough evaluation methods. | ||
| + | * Not all you need. | ||
| + | |||
| + | ==== HE Overview ==== | ||
| + | |||
| + | * Developed by Jacob Nielsen. | ||
| + | * Involves a set of guidelines -- heuristics. | ||
| + | * Rules come from real-world experience. | ||
| + | * Helps find usability problems in UI design. | ||
| + | * Small set (3-5) of evaluators examine UI. | ||
| + | * Independently check for compliance with usability principles (heuristics). | ||
| + | * Different evaluators will find different problems. | ||
| + | * Evaluators only communicate afterward; findings are then aggregated. | ||
| + | * Can perform on working UI or sketches. | ||
| + | * Most important ideas: | ||
| + | * Independent analysis. | ||
| + | * Performed on sketches or UI. | ||
| + | |||
| + | ==== Process ==== | ||
| + | |||
| + | * Evaluators go through UI several times. | ||
| + | * Inspect various dialogue elements. | ||
| + | * Compare with list of principles. | ||
| + | * Consider other principles/ | ||
| + | * Usability principles: | ||
| + | * Nielsen' | ||
| + | * Supplementary list of category-specific heuristics. | ||
| + | * May come from competitive analysis & user testing of existing products. | ||
| + | * Fixes for violations may be suggested by heuristics. | ||
| + | |||
| + | ==== Nielsen' | ||
| + | |||
| + | * Simple & natural dialog | ||
| + | * Speak user's language | ||
| + | * Minimize user's memory load | ||
| + | * Consistency | ||
| + | * Feedback | ||
| + | * Clearly marked exits | ||
| + | * Shortcuts | ||
| + | * Precise & constructive error messages | ||
| + | * Prevent errors | ||
| + | * Help and documentation | ||
| + | |||
| + | ==== Heuristics -- Revised Set ==== | ||
| + | |||
| + | === Visibility of System Status === | ||
| + | |||
| + | * Keep user informed about what is going on. | ||
| + | * Example: Pay attention to response time. | ||
| + | * 0.1 sec: No special indicator needed. | ||
| + | * 1.0 sec: User tends to lose track of data. | ||
| + | * 10 sec: Max. duration if user to stay focused on action. | ||
| + | * For longer delays, use progress bars. | ||
| + | |||
| + | === Match between system and real world === | ||
| + | |||
| + | * Speak user's language. | ||
| + | * Follow real world conventions. | ||
| + | |||
| + | === Consistency & Standards === | ||
| + | |||
| + | === Aesthetic and minimalist desgin === | ||
| + | |||
| + | * No irrelevant info in dialogs. | ||
| + | |||
| + | ==== HE vs. User Testing ==== | ||
| + | |||
| + | * HE much faster. | ||
| + | * HE doesn' | ||
| + | * User testing far more accurate. | ||
| + | * Takes into account actual users and tasks. | ||
| + | * HE may miss problems and find false positives. | ||
| + | * Good to alternate between HE and user testing. | ||
| + | * Find different problems. | ||
| + | * Don't waste participants. | ||
| + | |||
| + | ==== HE Results ==== | ||
| + | |||
| + | * Single evaluator achieves poor results. | ||
| + | * Only finds 35% of usability problems. | ||
| + | * 5 evaluators find ~75% of problems. | ||
| + | * If they work as team, it's back down to 35%. | ||
| + | * Why not more evaluators? | ||
| + | * Adding evaluators costs more. | ||
| + | * Many more evaluators won't find many more problems. | ||
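The diminishing returns above follow Nielsen and Landauer's model: the expected fraction of problems found by i independent evaluators is 1 - (1 - λ)^i, where λ is the fraction a single evaluator finds on their own (≈0.35 per these notes; published estimates vary). A sketch:

```python
def proportion_found(i, lam=0.35):
    """Expected fraction of usability problems found by i independent
    evaluators, each finding fraction lam alone (Nielsen/Landauer model).
    lam = 0.35 is taken from the single-evaluator figure in the notes."""
    return 1 - (1 - lam) ** i
```

Each added evaluator contributes less than the previous one, which is why small teams of 3-5 evaluators are the usual recommendation.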
| + | |||
| + | ===== Evaluation (2) & Wrap-Up ===== | ||
| + | |||
| + | ==== Evaluation Pt. 2 ==== | ||
| + | |||
| + | === Usability Testing: The Usability Lab === | ||
| + | |||
| + | * A specially designed room for conducting controlled experiments observing a task. | ||
| + | * Cameras, logging systems, people track what users do. | ||
| + | * Good lab costs $$$. | ||
| + | |||
| + | == Observation Room == | ||
| + | |||
| + | * Three cameras capture subject, subject' | ||
| + | * One-way mirror plus angled glass captures light and isolates sound between rooms. | ||
| + | * Room for several observers. | ||
| + | * Digital mixer for mixing of input images and recording to media. | ||
| + | |||
| + | == Other Capture - Software == | ||
| + | |||
| + | * Modify software to log user actions. | ||
| + | * Can give time-stamped keypress/ | ||
| + | * Commercial software available ($$$) | ||
| + | * Two problems: | ||
| + | * Too low level, want higher level events | ||
| + | * Massive amount of data; need analysis tools | ||
| + | |||
| + | == Complimentary Methods == | ||
| + | |||
| + | * Talkaloud protocols | ||
| + | * Pre/post surveys | ||
| + | * Participant screening/ | ||
| + | * Compare results to existing benchmarks | ||