©2011 www.id-book.com
Introducing Evaluation
Chapter 12
adapted by Wan C. Yoon - 2012
The aims
• Explain the key concepts used in evaluation.
• Introduce different evaluation methods.
• Show how different methods are used for different purposes at different stages of the design process and in different contexts.
• Show how evaluators mix and modify methods.
• Discuss the practical challenges.
• Illustrate how methods discussed in Chapters 7 and 8 are used in evaluation, and describe some methods that are specific to evaluation.
Why, what, where and when to evaluate
• Iterative design & evaluation is a continuous process that examines:
• Why: to check users' requirements and that users can use the product and they like it.
• What: a conceptual model, early prototypes of a new system and, later, more complete prototypes.
• Where: in natural and laboratory settings.
• When: throughout design; finished products can be evaluated to collect information to inform new products.
Bruce Tognazzini tells you why you need to evaluate
• “Iterative design, with its repeating cycle of design and testing, is the only validated methodology in existence that will consistently produce successful results. If you don’t have user-testing as an integral part of your design process you are going to throw buckets of money down the drain.”
• See AskTog.com for topical discussions about design and evaluation.
Types of evaluation
• Controlled settings involving users, e.g., usability testing & experiments in laboratories and living labs.
• Natural settings involving users, e.g., field studies to see how the product is used in the real world.
• Any settings not involving users, e.g., consultants critique the product; methods to predict, analyze & model aspects of the interface; analytics.
Usability lab
http://iat.ubalt.edu/usability_lab/
Living labs
• People’s use of technology in their everyday lives can be evaluated in living labs.
• Such evaluations are too difficult to do in a usability lab.
• E.g., the Aware Home was embedded with a complex network of sensors and audio/video recording devices (Abowd et al., 2000).
Usability testing & field studies can complement each other
Evaluation methods

| Method         | Controlled settings | Natural settings | Without users |
|----------------|---------------------|------------------|---------------|
| Observing      | x                   | x                |               |
| Asking users   | x                   | x                |               |
| Asking experts | x                   |                  | x             |
| Testing        | x                   |                  |               |
| Modeling       |                     |                  | x             |
The language of evaluation
• Analytics
• Analytical evaluation
• Controlled experiment
• Expert review or critique
• Field study
• Formative evaluation
• Heuristic evaluation
• In the wild evaluation
• Living laboratory
• Predictive evaluation
• Summative evaluation
• Usability laboratory
• User studies
• Usability testing
• Users or participants
Key points
• Evaluation & design are closely integrated in user-centered design.
• Some of the same techniques are used in evaluation as for establishing requirements, but they are used differently (e.g., observation, interviews & questionnaires).
• Three types of evaluation: laboratory-based with users, in the field with users, and studies that do not involve users.
• The main methods are: observing, asking users, asking experts, user testing, inspection, modeling users' task performance, and analytics.
• Dealing with constraints is an important skill for evaluators to develop.
Evaluation studies: From controlled to natural settings
Chapter 14
augmented and annotated by Wan C. Yoon - 2012
The aims:
• Explain how to do usability testing.
• Outline the basics of experimental design.
• Describe how to do field studies.
Usability testing
• Involves recording performance of typical users doing typical tasks.
• Controlled settings.
• Users are observed and timed.
• Data is recorded on video & key presses are logged.
• The data is used to calculate performance times, and to identify & explain errors.
• User satisfaction is evaluated using questionnaires & interviews.
• Field observations may be used to provide contextual understanding.
Experiments & usability testing
• Experiments test hypotheses to discover new knowledge by investigating the relationship between two or more things – i.e., variables.
• Usability testing is applied experimentation. • Developers check that the system is usable by the
intended user population for their tasks.• Experiments may also be done in usability testing.
Usability testing & research

Usability testing
• Improve products
• Few participants
• Results inform design
• Usually not completely replicable
• Conditions controlled as much as possible
• Procedure planned
• Results reported to developers

Experiments for research
• Discover knowledge
• Many participants
• Results validated statistically
• Must be replicable
• Strongly controlled conditions
• Experimental design
• Scientific report to scientific community
Usability testing
• Goals & questions focus on how well users perform tasks with the product.
• Comparison of products or prototypes is common.
• Focus is on time to complete a task & the number & type of errors.
• Data collected by video & interaction logging.
• Testing is central.
• User satisfaction questionnaires & interviews provide data about users' opinions.
Usability lab with observers watching a user & assistant
Portable equipment for use in the field
Testing conditions
• Usability lab or other controlled space.
• Emphasis on:
  – selecting representative users;
  – developing representative tasks.
• 5-10 users typically selected.
• Tasks usually last no more than 30 minutes.
• The test conditions should be the same for every participant.
• Informed consent form explains procedures and deals with ethical issues.
Some types of data
• Time to complete a task.
• Time to complete a task after a specified time away from the product.
• Number and type of errors per task.
• Number of errors per unit of time.
• Number of navigations to online help or manuals.
• Number of users making a particular error.
• Number of users completing a task successfully.
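As a rough illustration, metrics like these can be computed from per-participant session records. This is a minimal sketch, not part of any particular logging tool: the record fields (`time_s`, `errors`, `completed`) and the `summarize` helper are hypothetical names chosen for the example.

```python
# Hypothetical sketch: aggregating common usability-test metrics
# from per-participant session records. Field names are illustrative.

def summarize(sessions):
    """Compute completion rate, mean task time, and error metrics."""
    n = len(sessions)
    completed = [s for s in sessions if s["completed"]]
    return {
        "completion_rate": len(completed) / n,
        "mean_time_s": sum(s["time_s"] for s in completed) / len(completed),
        "errors_per_task": sum(s["errors"] for s in sessions) / n,
        "users_with_errors": sum(1 for s in sessions if s["errors"] > 0),
    }

# Invented example data: three participants on one task.
sessions = [
    {"time_s": 95, "errors": 2, "completed": True},
    {"time_s": 180, "errors": 5, "completed": False},
    {"time_s": 120, "errors": 0, "completed": True},
]
print(summarize(sessions))
```

Times are averaged over completed sessions only, since abandoned tasks would otherwise distort the mean; error counts are averaged over all participants.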
Product and Process measures
• Product measures
  – Task success/failure
  – Task completion time
  – Number (and type) of errors (per task, per unit time, etc.)
  – Subjective assessment (difficulty, uncertainty, satisfaction, etc.)
• Process measures
  – Number of incorrect responses (by type)
  – Information-gathering activities
  – Missed observations or hypotheses
  – Hesitations
  – Undo's
  – Conditional assessment of decisions
Usability engineering orientation
• Aim is improvement with each version.
• Current level of performance.
• Minimum acceptable level of performance.
• Target level of performance.
How many participants is enough for user testing?
• The number is a practical issue.
• Depends on:
  – schedule for testing;
  – availability of participants;
  – cost of running tests.
• Typically 5-10 participants.
• Some experts argue that testing should continue until no new insights are gained.
Experiments
• Predict the relationship between two or more variables.
• Independent variable is manipulated by the researcher.
• Dependent variable depends on the independent variable.
• Typical experimental designs have one or two independent variables.
• Validated statistically & replicable.
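To illustrate how a dependent variable (task time) is compared across two levels of an independent variable (interface design) in a between-subjects experiment, here is a minimal sketch using only the Python standard library. The data values are invented for the example.

```python
import math
from statistics import mean, variance

def t_statistic(a, b):
    """Welch's t statistic for two independent samples,
    e.g., task times under two interface designs."""
    va, vb = variance(a), variance(b)       # sample variances
    se = math.sqrt(va / len(a) + vb / len(b))
    return (mean(a) - mean(b)) / se

# Invented task-completion times (seconds) for two designs:
design_a = [62, 58, 71, 65, 60]
design_b = [75, 80, 69, 78, 82]
print(round(t_statistic(design_a, design_b), 2))  # prints -4.24
```

A negative statistic here means design A was faster on average; whether the difference is significant would be judged against a t distribution with the appropriate degrees of freedom.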
Different, same, matched participant design

| Design                            | Advantages                                                        | Disadvantages                                          |
|-----------------------------------|-------------------------------------------------------------------|--------------------------------------------------------|
| Different (between-subjects)      | No order effects                                                  | Many subjects & individual differences a problem       |
| Same (within-subjects)            | Few individuals, no individual differences                        | Counterbalancing needed because of ordering effects    |
| Matched (paired/matched-subjects) | Same as different participants but individual differences reduced | Cannot be sure of perfect matching on all differences  |
Hypothetical models
• Mean difference
• Linear regression and variable transformation
• ANOVA
  – Within- and between-subject models
  – Mixed models
  – The assumptions:
    • Independence
    • Normally distributed
    • Homogeneity of variances
    • Additivity and sphericity
• ANCOVA
  – When a parallel pair of models can be assumed.
Field studies
• Field studies are done in natural settings.
• “In the wild” is a term for prototypes being used freely in natural settings.
• Aim to understand what users do naturally and how technology impacts them.
• Field studies are used in product design to:
  – identify opportunities for new technology;
  – determine design requirements;
  – decide how best to introduce new technology;
  – evaluate technology in use.
Data collection & analysis
• Observation & interviews
  – Notes, pictures, recordings
  – Video
  – Logging
• Analysis
  – Data are categorized
  – Categories can be provided by theory:
    • Grounded theory
    • Activity theory
Data presentation
• The aim is to show how the products are being appropriated and integrated into their surroundings.
• Typical presentation forms include: vignettes, excerpts, critical incidents, patterns, and narratives.
Key points
• Usability testing is done in controlled conditions.
• Usability testing is an adapted form of experimentation.
• Experiments aim to test hypotheses by manipulating certain variables while keeping others constant.
• The experimenter controls the independent variable(s) but not the dependent variable(s).
• There are three types of experimental design: different-participants, same-participants, & matched-participants.
• Field studies are done in natural environments.
• “In the wild” is a recent term for studies in which a prototype is freely used in a natural setting.
• Typically observation and interviews are used to collect field-study data.
• Data is usually presented as anecdotes, excerpts, critical incidents, patterns and narratives.
Analytical evaluation
Chapter 15
adapted by Wan C. Yoon - 2012
Aims:
• Describe the key concepts associated with inspection methods.
• Explain how to do heuristic evaluation and walkthroughs.
• Explain the role of analytics in evaluation.
• Describe how to perform two types of predictive methods, GOMS and Fitts’ Law.
Inspections
• Several kinds.
• Experts use their knowledge of users & technology to review software usability.
• Expert critiques (crits) can be formal or informal reports.
• Heuristic evaluation is a review guided by a set of heuristics.
• Walkthroughs involve stepping through a pre-planned scenario noting potential problems.
Heuristic evaluation
• Developed by Jakob Nielsen in the early 1990s.
• Based on heuristics distilled from an empirical analysis of 249 usability problems.
• These heuristics have been revised for current technology.
• Heuristics are being developed for mobile devices, wearables, virtual worlds, etc.
• Design guidelines form a basis for developing heuristics.
Nielsen’s original heuristics
• Visibility of system status.
• Match between system and the real world.
• User control and freedom.
• Consistency and standards.
• Error prevention.
• Recognition rather than recall.
• Flexibility and efficiency of use.
• Aesthetic and minimalist design.
• Help users recognize, diagnose, and recover from errors.
• Help and documentation.
Discount evaluation
• Heuristic evaluation is referred to as discount evaluation when 5 evaluators are used.
• Empirical evidence suggests that on average 5 evaluators identify 75-80% of usability problems.
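Nielsen and Landauer (1993) modeled the proportion of problems found by n evaluators as 1 − (1 − λ)^n, where λ is the probability that a single evaluator detects any given problem. A minimal sketch of the model (the λ values tried below are illustrative, not empirical estimates):

```python
def proportion_found(n_evaluators, prob):
    """Nielsen & Landauer's model: proportion of all usability
    problems found by n evaluators, where `prob` is the chance
    that one evaluator finds any given problem."""
    return 1 - (1 - prob) ** n_evaluators

# Illustrative per-evaluator detection probabilities:
for p in (0.2, 0.3):
    print(p, [round(proportion_found(n, p), 2) for n in (1, 3, 5, 10)])
```

The curve rises steeply at first and then flattens, which is why adding evaluators beyond roughly five yields diminishing returns.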
No. of evaluators & problems
3 stages for doing heuristic evaluation
• Briefing session to tell experts what to do.
• Evaluation period of 1-2 hours in which:
  – Each expert works separately;
  – Take one pass to get a feel for the product;
  – Take a second pass to focus on specific features.
• Debriefing session in which experts work together to prioritize problems.
Advantages and problems
• Few ethical & practical issues to consider because users are not involved.
• Can be difficult & expensive to find experts.
• Best experts have knowledge of application domain & users.
• Biggest problems:
  – Important problems may get missed;
  – Many trivial problems are often identified;
  – Experts have biases.
Heuristics for websites focus on key criteria (Budd, 2007)
• Clarity
• Minimize unnecessary complexity & cognitive load
• Provide users with context
• Promote positive & pleasurable user experience
Cognitive walkthroughs
• Focus on ease of learning.
• Designer presents an aspect of the design & usage scenarios.
• Expert is told the assumptions about the user population, context of use, and task details.
• One or more experts walk through the design prototype with the scenario.
• Experts are guided by 3 questions.
The 3 questions
• Will the correct action be sufficiently evident to the user?
• Will the user notice that the correct action is available?
• Will the user associate and interpret the response from the action correctly?
As the experts work through the scenario they note problems.
Pluralistic walkthrough
• Variation on the cognitive walkthrough theme.
• Performed by a carefully managed team.
• The panel of experts begins by working separately.
• Then there is a managed discussion that leads to agreed decisions.
• The approach lends itself well to participatory design.
A project for you …
• http://www.id-book.com/catherb/ provides heuristics and a template so that you can evaluate different kinds of systems.
• More information about this is provided in the interactivities section of the id-book.com website.
Analytics
• A method for evaluating user traffic through a system or part of a system.
• Many examples, including Google Analytics and Visistat.
• Data include times of day & visitor IP addresses.
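A minimal sketch of the kind of aggregation such tools perform over server logs; the log format and values below are invented for illustration, not taken from any real analytics product.

```python
from collections import Counter

# Hypothetical access-log lines: visitor IP, ISO timestamp, page.
log_lines = [
    "192.0.2.10 2012-03-01T09:15 /index.html",
    "192.0.2.11 2012-03-01T09:40 /docs.html",
    "192.0.2.10 2012-03-01T14:05 /index.html",
]

# Count visits per hour of day (characters 11-12 of the timestamp).
hours = Counter(line.split()[1][11:13] for line in log_lines)
# Distinct visitor IP addresses.
visitors = {line.split()[0] for line in log_lines}

print(hours.most_common())
print(len(visitors))
```

Real analytics services add geolocation, referrer tracking, and visualization on top of essentially this kind of counting.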
Social action analysis(Perer & Shneiderman, 2008)
Predictive models
• Provide a way of evaluating products or designs without directly involving users.
• Less expensive than user testing.
• Usefulness limited to systems with predictable tasks, e.g., telephone answering systems and mobile/cell phones.
• Based on expert error-free behavior.
GOMS
• Goals – what the user wants to achieve, e.g., find a website.
• Operators – the cognitive processes & physical actions needed to attain goals, e.g., decide which search engine to use.
• Methods – the procedures to accomplish the goals, e.g., drag mouse over field, type in keywords, press the go button.
• Selection rules – decide which method to select when there is more than one.
Keystroke level model
• GOMS has also been developed to provide a quantitative model - the keystroke level model.
• The keystroke model allows predictions to be made about how long it takes an expert user to perform a task.
Response times for keystroke level operators (Card et al., 1983)
| Operator | Description                                                                 | Time (sec) |
|----------|-----------------------------------------------------------------------------|------------|
| K        | Pressing a single key or button:                                            |            |
|          |   Average skilled typist (55 wpm)                                           | 0.22       |
|          |   Average non-skilled typist (40 wpm)                                       | 0.28       |
|          |   Pressing shift or control key                                             | 0.08       |
|          |   Typist unfamiliar with the keyboard                                       | 1.20       |
| P        | Pointing with a mouse or other device on a display to select an object. This value is derived from Fitts' Law, which is discussed below. | 0.40 |
| P1       | Clicking the mouse or similar device                                        | 0.20       |
| H        | Bringing 'home' hands on the keyboard or other device                       | 0.40       |
| M        | Mentally prepare/respond                                                    | 1.35       |
| R(t)     | The response time, counted only if it causes the user to wait               | t          |
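The operator times above can simply be summed to predict expert task time. A minimal sketch; the operator sequence below is a made-up example, not from the original slides.

```python
# KLM operator times from Card et al. (1983), as in the table above.
TIMES = {"K": 0.22, "P": 0.40, "P1": 0.20, "H": 0.40, "M": 1.35}

def klm_time(ops):
    """Predicted expert, error-free time for a sequence of KLM operators."""
    return sum(TIMES[op] for op in ops)

# E.g., point at a text field, click it, home hands to the keyboard,
# mentally prepare, then type five characters:
task = ["P", "P1", "H", "M"] + ["K"] * 5
print(round(klm_time(task), 2))  # prints 3.45
```

System response time R(t) would be added on top whenever it actually makes the user wait.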
Summing together
Using KLM to calculate time to change gaze (Holleis et al., 2007)
Fitts’ Law (Fitts, 1954)
• Fitts’ Law predicts that the time to point at an object using a device is a function of the distance to the target object & the object’s size.
• The further away & the smaller the object, the longer the time to locate and point to it.
• Fitts’ Law is useful for evaluating systems for which the time to locate an object is important, e.g., a cell phone or a handheld device.
A project for you …
• Use the web & other resources to research claims that heuristic evaluation often identifies problems that are not serious & may not even be problems.
• Decide whether you agree or disagree.
• Write a brief statement arguing your position.
• Provide practical evidence & evidence from the literature to support your position.
A project for you … Fitts’ Law
• Visit Tog’s website and do Tog’s quiz, designed to give you fitts!
• http://www.asktog.com/columns/022DesignedToGiveFitts.html
Key points
• Inspections can be used to evaluate requirements, mockups, functional prototypes, or systems.
• User testing & heuristic evaluation may reveal different usability problems.
• Walkthroughs are focused, so they are suitable for evaluating small parts of a product.
• Analytics involves collecting data about users’ activity on a website or product.
• The GOMS and KLM models and Fitts’ Law can be used to predict expert, error-free performance for certain kinds of tasks.