Usability Testing
November 14, 2016
Announcements
Wednesday:
HCI in industry
VW: December 1 (no matter what)
Fall 2016 COMP 3020 2
Questions?
Today
Usability testing
Data collection and analysis
Usability test
A usability test is a “formal” method for evaluating whether a design is learnable, efficient, memorable, can reduce errors, meets users’ expectations, etc.
users are not being evaluated
the design is being evaluated
Usability test – Rough Outline
Bring in real users
Have them complete tasks with your design, while you watch (ideally with your entire team)
Measure and record things
task completion, task time, error rates
satisfaction, problem points, etc.
use a think-aloud protocol, so you can “hear what they are thinking”
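One way to keep these measurements consistent across sessions is to record one row per participant per task. The sketch below is only an illustration: the `TaskResult` fields and the sample values are assumptions, not data from an actual test.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    participant: str   # anonymized participant ID (hypothetical)
    task: str          # task label
    completed: bool    # did the participant finish the task?
    seconds: float     # time spent on the task
    errors: int        # number of errors the observer counted


def completion_rate(results):
    """Fraction of task attempts that ended in completion."""
    return sum(r.completed for r in results) / len(results)


def errors_per_attempt(results):
    """Average number of errors per task attempt."""
    return sum(r.errors for r in results) / len(results)


# Made-up observations for one task, for illustration only
results = [
    TaskResult("P1", "T1", True, 42.0, 0),
    TaskResult("P2", "T1", False, 95.0, 3),
    TaskResult("P3", "T1", True, 58.0, 1),
    TaskResult("P4", "T1", True, 61.0, 0),
]

print(completion_rate(results))     # 0.75
print(errors_per_attempt(results))  # 1.0
```

Satisfaction ratings and think-aloud notes would be kept alongside these rows rather than reduced to numbers.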
Usability test – Rough outline
Use the data to
identify problems (major ones | minor ones)
provide design suggestions to design/engineering team
iterate on the design, repeat
Important Considerations
Usually takes place in a usability lab or other controlled space
Major emphasis is on
selecting representative users
developing representative tasks
5-10 users typically selected
Tasks usually last no more than 30 minutes
The test conditions should be the same for every participant
Informed consent form explains ethical issues
Case Study: Testing MEDLINEplus
Five tasks were developed
Wanted to check categorization and navigation support
Task 1: Find information about whether a dark bump on your shoulder might be skin cancer
Task 2: Find information about whether it's safe to use Prozac during pregnancy
Task 3: Find information about whether there is a vaccine for hepatitis C
Task 4: Find recommendations about the treatment of breast cancer
Task 5: Find information about the dangers associated with drinking alcohol during pregnancy
Creating tasks
A task is designed to probe a problem
Tasks should be straightforward and require the user to find certain items, or do certain operations
They can be more complex such as solving particular problems
Sample tasks for a weather network web site:
What is the forecasted weather for Winnipeg?
What is the air quality in Los Angeles today?
What is the level of humidity in Winnipeg?
What is the forecast for Ottawa for the upcoming weekend?
Case Study: Testing MEDLINEplus
Selection of participants
9 participants from health care practices in DC area
7 Female, 2 Male
How many participants is enough for usability testing?
The number is largely a practical issue
Depends on:
schedule for testing
availability of participants
cost of running tests
Typically 5-10 participants
Some experts argue that testing should continue until no new insights are gained
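The "continue until no new insights" idea can be sketched as a simple stopping check. This is one possible reading, assuming each session's findings are reduced to a set of issue labels; the labels below are invented, and in practice a single session with nothing new is weak evidence that testing is done.

```python
def sessions_until_no_new_issues(issues_per_session):
    """Return the number of the first session that revealed no new issue.

    Returns None if every session added at least one unseen issue.
    """
    seen = set()
    for number, issues in enumerate(issues_per_session, start=1):
        new_issues = set(issues) - seen
        if not new_issues:
            return number
        seen |= new_issues
    return None


# Hypothetical issue labels noted in three consecutive sessions
sessions = [
    {"nav-label", "search-empty"},
    {"nav-label", "font-size"},
    {"search-empty", "font-size"},
]

print(sessions_until_no_new_issues(sessions))  # 3
```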
Activity
You are developing a user test for a new CS web page. Identify 6 tasks for the test:
Task 1: Identify the instructor for Comp 3020
Task 2: Find the e-mail address of the Comp 3020 prof
Task 3: Find the admission requirements for the M.Sc. Program
Task 4: Find out the first day of classes next term
Task 5: Locate the requirements for being a Co-op student
Task 6: Identify whether the graduate Graphics course is a “fundamentals” course
Activity
You are developing a user test for a new CS web page.
Who are your participants:
Students (CS, or interested)
Faculty
…?
Data Analysis
Qualitative data
Collected from interviews, some types of questionnaires, observation notes
Interpreted & used for telling a ‘story’ about what was observed
(difficult!)
Quantitative data
Collected from interaction & video logs
Presented as values, tables, charts, graphs and treated statistically
(safe!)
Making Sense of Your Data
Affinity diagrams
Making Sense of Your Data
discussion with others who watched with you
Elements of Usability Testing
Identify practical issues – select typical users
make sure you have appropriate representation
e.g. a poor sample: an e-recipe site is primarily for families, but 90% of the sample are single people
Identify practical issues – prepare testing conditions
Lab preferably
Identify practical issues – plan to run tests
Have scripts in place
Test equipment
Have recording material prepared
Deal with ethical issues
Consent form
Elements of Usability Testing
Evaluate, analyze, and present data
Report on times to complete task, number of errors
Provide simple statistical measures: mean, median, std dev.
Describe interaction patterns
e.g., four ways that people may use the interface
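The simple statistical measures named above can be computed with Python's standard `statistics` module. A minimal sketch, assuming task times in seconds; the values are invented for illustration.

```python
import statistics


def summarize(values):
    """Mean, median, and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample std dev (n - 1 denominator)
    }


# Hypothetical task completion times in seconds, one per participant
task_times = [42.0, 95.0, 58.0, 61.0, 49.0]

summary = summarize(task_times)
print(summary["mean"])             # 61.0
print(summary["median"])           # 58.0
print(round(summary["stdev"], 2))  # 20.43
```

With only 5-10 participants a single slow participant moves the mean noticeably, so reporting the median alongside it is a reasonable habit.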
Usability Testing: Presenting the Results
Rank issues in terms of severity
Not only a list of problems and issues!
Provide small suggestions on how to address
Provide evidence (video, quotes, examples) of people encountering issues
ITERATE ON THE DESIGN!!?!?!?
More on data collection…
Questionnaires
Earlier in the term we discussed questionnaire design for gathering requirements
Most user satisfaction questionnaires consist primarily of closed questions
Participants encouraged to leave their comments in space provided on the page, or in the margins
More on designing closed questions…
Question and response format – Likert scales
Likert scales are used for measuring opinions, attitudes, beliefs
E.g., Evaluating color on a web site can have the following forms
The use of color is excellent: (where 1 represents strongly disagree and 5 represents strongly agree)
1 2 3 4 5
The use of color is excellent:
strongly disagree | disagree | ok | agree | strongly agree
Question and response format – Likert scales
• Steps for designing Likert scales:
– Gather a pool of short statements about the features of the product that are to be evaluated
– Divide the items into groups containing the same number of positive and negative statements
– Create logical/conceptual groups
– Decide on the scale (5-point/3-point/9-point)
– Select items for the final questionnaire and reword as necessary
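Mixing positive and negative statements has a consequence for scoring: agreement with a negative statement should count against the product. A common way to handle this is reverse-coding, which the slides do not spell out, so the sketch below is an assumption about how the responses might be scored. The item names are hypothetical.

```python
def score_response(value, positive, points=5):
    """Score one Likert response; negative statements are reverse-coded."""
    if not 1 <= value <= points:
        raise ValueError(f"response {value} is outside 1..{points}")
    return value if positive else points + 1 - value


def scale_score(responses, items, points=5):
    """Mean score over all items, after reverse-coding negative ones."""
    scores = [
        score_response(responses[name], positive, points)
        for name, positive in items.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical items: True marks a positively worded statement
items = {
    "color_excellent": True,     # "The use of color is excellent"
    "color_distracting": False,  # "The use of color is distracting"
}
responses = {"color_excellent": 4, "color_distracting": 2}

print(scale_score(responses, items))  # 4.0
```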
Likert scales – response options
Odd/Even
If a 'neutral' response is possible, use an odd number of options (the central option is the neutral one)
If judging something binary (good/bad, male/female), consider just two response options
Even numbers 'force' respondents one way or the other
You may end up with random responses between the middle items
How wide (1 to 3, 1 to 5, or even 1 to 12?)
How well will the majority distinguish between the different levels?
If the majority are fairly uninformed about the topic, use a small number of levels
If dealing with experts, then you can use a much larger set
Anchors
Anchors are the verbal comments above the numbers ('strongly agree', etc.)
How many to include?
In factual statements (or smaller scales)
it is considered good practice to place anchors above all options, which tends to give more accurate results
News: Daily Weekly Monthly Never
Larger scales
Helpful to indicate the central (neutral) point if meaningful; having numerous anchors may not be so important
The content in the website is clear (1-10): 1 (strongly disagree) 5 (neutral) 10 (strongly agree)
Guidelines for questionnaire design
See notes from earlier in the term (recall)
Conciseness: questions should be clear and specific
e.g. should the system include a user's manual? (YES/NO)
Closed questions: when possible ask closed questions and offer a range of answers
e.g. How often do you print checks? (1: very often – 5: never)
Alternate option: Consider including a “no-opinion” option for questions that seek opinions
e.g. the payroll module is essential (…N/A)
Order: think about the ordering of questions. General questions should precede specific ones
e.g. a question about a specific feature say in a payroll module should come after asking whether the payroll module is essential
Guidelines for questionnaire design
Break up multiple questions: Avoid complex multiple questions
e.g. is the payroll system and attendance manager efficient?
Proper scales: when scales are used, make sure the ranges are appropriate and do not overlap
e.g. 10…30, 31…40, ….
Language: avoid jargon
e.g. should the display be based on Bézier curves?
Instructions: provide clear instructions on how to complete the questionnaire
e.g. please rate the performance of the following items
Compactness: a balance must be struck between white space and the need to keep the questionnaire as compact as possible
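The "proper scales" guideline is easy to check mechanically before a questionnaire goes out. A small sketch, assuming each range is written as an inclusive (low, high) pair; the function name is made up for this example.

```python
def find_overlaps(ranges):
    """Return neighbouring (low, high) ranges that share at least one value.

    Ranges are inclusive. After sorting by lower bound, any overlap between
    two ranges also shows up between some pair of neighbours.
    """
    ordered = sorted(ranges)
    return [
        (first, second)
        for first, second in zip(ordered, ordered[1:])
        if second[0] <= first[1]
    ]


# Non-overlapping age ranges: every age falls in at most one option
print(find_overlaps([(21, 29), (30, 44), (45, 60)]))  # []

# A 30-year-old could circle either option here
print(find_overlaps([(20, 30), (30, 40)]))  # [((20, 30), (30, 40))]
```

The check only looks for overlaps; gaps between ranges (ages no option covers) would need a separate test.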
Designing questionnaires
Sample questionnaire:
Participant #: _____
Please circle the most appropriate selection
Age Range: 21-29 30-44 45-60
Gender: Male Female
Internet/Web Experience
News: Daily Weekly Monthly Never
Research, information gathering: Daily Weekly Monthly Never
Top stories usage: Daily Weekly Monthly Never
Please rate (i.e. check the box) agreement or disagreement with the following statements
Response options for each statement: Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree
Statement: The navigation on the links is clear
Statement: The website contains information that is useful to me
Callout: a different typeface is used for instructions on the questionnaire
Side label on the form: factual
Analyzing questionnaire data
It helps to think about how the questionnaire will be analyzed even before designing it
Present results clearly - tables can be used for proper structure
Simple statistics can say a lot, e.g., mean, median, mode, standard deviation
Percentages are useful, but always give the population size as well
Bar graphs show categorical data well
More advanced statistics can be used if needed
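For a closed question, the points above amount to a frequency table that reports counts, percentages, and the number of respondents together. A minimal sketch with invented answers to a "How often do you read news online?" style question; a text bar stands in for a bar graph.

```python
import statistics
from collections import Counter


def frequency_table(answers, options):
    """One (option, count, percent) row per option, in questionnaire order."""
    if not answers:
        raise ValueError("no answers to tabulate")
    counts = Counter(answers)
    total = len(answers)
    return [
        (option, counts[option], round(100 * counts[option] / total, 1))
        for option in options
    ]


OPTIONS = ["Daily", "Weekly", "Monthly", "Never"]

# Hypothetical responses from eight participants
answers = ["Daily", "Weekly", "Daily", "Never",
           "Daily", "Monthly", "Weekly", "Daily"]

print(f"N = {len(answers)}")
for option, count, percent in frequency_table(answers, OPTIONS):
    print(f"{option:<8}{count:>3}{percent:>7}%  {'#' * count}")

print(statistics.mode(answers))  # Daily
```

Printing N next to the percentages matters with small samples: 12.5% here is a single participant.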
Observing People
The majority of evaluations with users involve some form of observation
Simple form of observation:
user is given a set of tasks, and the evaluator simply watches the user
So...? what do you watch? what do you do? what do you record?
Think-Aloud
Gives insight into what the user is thinking
Awkward/uncomfortable for subject
May alter the way people perform their task
Hard to talk when they are concentrating
User’s personality may not align with thinking aloud
Participatory Observation (co-discovery learning)
Main idea: remove the awkwardness of think-aloud
Two people sit down to complete tasks
Only one person is allowed to touch the interface
Variation: use a semi-knowledgeable “coach” and a novice (only the novice gets to touch the design)
Creates a natural social situation
Novice subject asking questions
Semi-knowledgeable coach giving a little feedback, but not much
The activity provides insights into the thinking process of both subjects
Indirect tracking of activities
Direct observation can be obtrusive or impossible
Alternatives:
Interaction logging:
Recording key presses, mouse buttons, interface changes
Difficulty: need to correlate specific action with the appropriate tasks and meaning (hard)
Diaries / experience sampling
What users did, when they did it, and what they thought about their interactions
Provide templates for users to fill in
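For interaction logging, the correlation difficulty noted above is easier to manage if every event is tagged with the task that was active when it happened. The sketch below is a hypothetical in-memory logger, not the API of any real logging tool; event kinds and field names are assumptions.

```python
import time


class InteractionLog:
    """Collects timestamped events, each tagged with the active task."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so tests can use a fake clock
        self._task = None     # ID of the task currently in progress
        self.events = []

    def start_task(self, task_id):
        self._task = task_id
        self.record("task_start")

    def end_task(self):
        self.record("task_end")
        self._task = None

    def record(self, kind, **details):
        """Store one event, e.g. a key press or mouse click."""
        event = {"t": self._clock(), "task": self._task, "kind": kind}
        event.update(details)
        self.events.append(event)

    def events_for(self, task_id):
        return [e for e in self.events if e["task"] == task_id]


log = InteractionLog()
log.start_task("T1")
log.record("key_press", key="w")
log.record("mouse_click", button="left", target="search")
log.end_task()
log.record("mouse_click", button="left", target="home")  # outside any task

print(len(log.events_for("T1")))  # 4
```

Tagging gives each action its task, but not its meaning; interpreting why a participant clicked still needs observation notes or think-aloud data.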
Observations: Obtrusive vs. Unobtrusive
How people behave and how they explain their behaviour are different, e.g., as with LOOK vs ASK
Observation techniques can range from being unobtrusive to obtrusive
Unobtrusive:
Observe test users but refrain from interacting with them; want to avoid influencing or encouraging questions
Obtrusive:
Interact with users by asking questions, explaining design decisions, and engaging the user in a discussion