Week 10 Workshop #2: UX Goals and Metrics (Transcript)
Human Computer Interaction / COG3103, 2014 Fall
Class hours: Tue 1-3 pm / Thurs 12-1 pm
6 November
Choosing the Right Metrics: Ten Types of Usability Studies
• Issue Based Metrics (Ch 5)
– Anything that prevents task completion
– Anything that takes someone off course
– Anything that creates some level of confusion
– Anything that produces an error
– Not seeing something that should be noticed
– Assuming something is correct when it is not
– Assuming a task is complete when it is not
– Performing the wrong action
– Misinterpreting some piece of content
– Not understanding the navigation
Workshop #2 COG_Human Computer Interaction 2
Task Success
Task Time
Errors
Efficiency
Learnability
Issue Based Metrics
Self Reported Metrics
Behavioral and Physiological Metrics
Combined and Comparative Metrics
Live Website Metrics
Card Sorting Data
Choosing the Right Metrics: Ten Types of Usability Studies
• Self Reported Metrics (Ch 6): Asking participants for information about their perception of the system and their interaction with it
– Overall interaction
– Ease of use
– Effectiveness of navigation
– Awareness of certain features
– Clarity of terminology
– Visual appeal
– Likert scales
– Semantic differential scales
– After-scenario questionnaire
– Expectation measures
– Usability Magnitude Estimation
– SUS
– CSUQ (Computer System Usability Questionnaire)
– QUIS (Questionnaire for User Interface Satisfaction)
– WAMMI (Website Analysis & Measurement Inventory)
– Product Reaction Cards
Choosing the Right Metrics: Ten Types of Usability Studies
• Behavioral and Physiological Metrics (Ch 7)
– Verbal Behaviors
• Strongly positive comment
• Strongly negative comment
• Suggestion for improvement
• Question
• Variation from expectation
• Stated confusion/frustration
– Nonverbal Behaviors
• Frowning/Grimacing/Unhappy
• Smiling/Laughing/Happy
• Surprised/Unexpected
• Furrowed brow/Concentration
• Evidence of impatience
• Leaning in close to screen
• Fidgeting in chair
• Rubbing head/eyes/neck
Choosing the Right Metrics: Ten Types of Usability Studies
• Combined and Comparative Metrics (Ch 8)
– Taking smaller pieces of raw data, such as task completion rates, time-on-task, and self-reported ease of use, to derive new metrics such as an overall usability metric or a usability scorecard
– Comparing existing usability data to expert or ideal results
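One common way to combine metrics of different types is to rescale each to a 0-1 range and average them into a single score. A minimal sketch in Python; all numbers below are hypothetical:

```python
# Combine raw usability data into one overall score by rescaling each
# metric to 0-1 and averaging (all values are hypothetical).
task_success_rate = 0.80          # fraction of tasks completed
time_score = 45 / 75              # ideal time / mean observed time
ease_score = (5.5 - 1) / (7 - 1)  # mean 7-point ease rating, rescaled to 0-1

overall = (task_success_rate + time_score + ease_score) / 3
print(round(overall, 2))  # 0.72
```

Weighting the components differently is a design choice; an unweighted average treats every metric as equally important.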
Choosing the Right Metrics: Ten Types of Usability Studies
• Live Website Metrics (Ch 9)
– Information you can glean from live data on a production website
• Server logs – page views and visits
• Click-through rates – number of times a link is shown vs. actually clicked
• Drop-off rates – abandoned processes
• A/B studies – manipulate the pages users see and compare metrics between them
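Click-through and drop-off rates are simple ratios over logged counts; a sketch with hypothetical numbers:

```python
# Hypothetical live-site counts.
impressions, clicks = 1000, 45
ctr = clicks / impressions                 # click-through rate
print(ctr)                                 # 0.045, i.e., 4.5%

# Hypothetical 3-step funnel: visitors remaining at each step.
funnel = [1000, 620, 480]
drop_off = [1 - b / a for a, b in zip(funnel, funnel[1:])]
print([round(d, 2) for d in drop_off])     # [0.38, 0.23]
```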
Choosing the Right Metrics: Ten Types of Usability Studies
• Card Sorting Data (Ch 9)
– Open card sort
• Give participants cards; they sort them and define their own groups
– Closed card sort
• Give participants cards and the names of the groups; they put the cards into the groups
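Open-sort results are often summarized as a similarity (co-occurrence) matrix: how many participants placed each pair of cards in the same group. A sketch with two hypothetical participants and four cards:

```python
from itertools import combinations
from collections import Counter

# Hypothetical open-card-sort results: each participant's list of groups.
sorts = [
    [{"A", "B"}, {"C", "D"}],     # participant 1
    [{"A", "B", "C"}, {"D"}],     # participant 2
]

# Count how often each pair of cards lands in the same group.
co = Counter()
for groups in sorts:
    for group in groups:
        for pair in combinations(sorted(group), 2):
            co[pair] += 1

print(co[("A", "B")])  # 2: both participants grouped A with B
```

The resulting matrix is what cluster analysis or a dendrogram of the sort data is built on.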
Choosing the Right Metrics: Ten Types of Usability Studies
• Increasing Awareness
– Aimed at increasing awareness of a specific piece of content
or functionality
– Why is something not noticed or used?
• Metrics
– Live Website Metrics
• Monitor interactions
• Not foolproof – a user may notice the element and decide not to click; alternatively, a user may click without really noticing it
• A/B testing to see how small changes impact user behavior
– Self Reported Metrics
• Pointing out specific elements to user and asking whether
they had noticed those elements during task
• Asking whether they were aware of the feature before the study began
– Not everyone has a good memory
• Show users different elements and ask them to choose
which one they saw during task
– Behavioral and Physiological Metrics
• Eye tracking
– Determine amount of time looking at a certain element
– Average time spent looking at a certain element
Choosing the Right Metrics: Ten Types of Usability Studies
• Problem Discovery
– Identify major usability issues
– After deployment, find out what annoys users
– Periodic checkup to see how users are interacting with the product
• Discovery vs. usability study
– Open-ended
– Participants may generate own tasks
– Strive for realism in typical task and in user’s
environment
– Comparing across participants can be difficult
• Metrics
– Issue Based Metrics
• Capture all usability issues, you can convert into type
and frequency
• Assign severity rating and develop a quick-hit list of
design improvements
– Self Reported Metrics
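Turning captured issues into a quick-hit list amounts to sorting by severity and frequency; a sketch with a hypothetical issue log:

```python
# Hypothetical issue log: (description, severity, participants affected).
issues = [
    ("Label 'Submit' unclear",      "low",    5),
    ("Search returns no feedback",  "high",   3),
    ("Menu item hidden on mobile",  "medium", 4),
    ("Checkout button not noticed", "high",   6),
]

rank = {"high": 2, "medium": 1, "low": 0}
# Quick-hit list: highest severity first, ties broken by frequency.
quick_hits = sorted(issues, key=lambda i: (rank[i[1]], i[2]), reverse=True)
print(quick_hits[0][0])  # "Checkout button not noticed"
```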
Choosing the Right Metrics: Ten Types of Usability Studies
• Creating an Overall Positive User Experience
– Not enough to be usable, want exceptional user
experience
– Thought-provoking, entertaining, slightly addictive
– Performance useful, but what user thinks, feels, and
says really matters
• Metrics
– Self Reported
• Satisfaction – common but not enough
• Exceed expectations – want user to say it was easier,
more efficient, or more entertaining than expected
• Likelihood to purchase, use in future
• Recommend to a friend
• Behavioral and Physiological
– Pupil diameter
– Heart rate
– Skin conductance
Choosing the Right Metrics: Ten Types of Usability Studies
• Comparing Designs
– Comparing more than one design alternative
– Early in the design process, teams put together semi-functional prototypes
– Evaluate using predefined set of metrics
• Participants
– Can’t ask same participant to perform same tasks with
all designs
– Even with counterbalancing of design and task order, learning can carry over from one design to the next
• Procedure
– Run the study as between-subjects, where each participant works with only one design
– Alternatively, have a primary design that each participant works with, then show the alternative designs and ask for a preference
Choosing the Right Metrics: Ten Types of Usability Studies
• Comparing Designs (continued)
• Metrics
– Task Success
• Indicates which design is more usable
• With a small sample size, of limited value
– Task Time
• Indicates which design is more usable
• With a small sample size, of limited value
– Issue Based Metrics
• Compare the frequency of high-, medium-, and low-severity issues across designs to see which one is most usable
– Self Reported Metrics
• Ask participants to choose the prototype they would most like to use in the future (forced comparison)
• Ask participants to rate each prototype along dimensions such as ease of use and visual appeal
Independent & Dependent Variables
• Independent variables
– The things you manipulate or control for
– An aspect of the study that you manipulate
– Chosen based on the research question
– e.g.:
• Characteristics of participants (e.g., age, sex, relevant experience)
• Different designs or prototypes being tested
• Tasks
• Dependent variables
– The things you measure
– Describe what happened as a result of the study
– Something you measure as a result of, or as dependent on, how you manipulate the independent variables
– e.g.:
• Task Success
• Task Time
• SUS score
• etc.
Need to have a clear idea of what you plan to manipulate and what you plan to measure
Designing a Usability Study
RQ 1
• Research Question:
– Differences in performance between males and females
• Independent variable:
– Gender
• Dependent variable:
– Task completion time
RQ 2
• Research Question:
– Differences in satisfaction between novice and expert users
• Independent variable:
– Experience level
• Dependent variable:
– Satisfaction
Types of Data
• Nominal (aka Categorical)
– e.g., Male, Female; Design A, Design B.
• Ordinal
– e.g., Rank ordering of 4 designs tested from Most Visually Appealing to
Least Visually Appealing.
• Interval
– e.g., 7-point scale of agreement: “This design is visually appealing.
Strongly Disagree . . . Strongly Agree”
• Ratio
– e.g., Time, Task Success %
NOMINAL DATA
• Definition
– Unordered groups or categories
– Without order, cannot say one is better than another
• May provide characteristics of users: independent variables that allow you to segment the data
– Windows versus Mac users
– Geographical location
– Males versus females
• What about dependent variables?
– Number of users who clicked on A vs. B
– Task success
• Usage
– Counts and frequencies
ORDINAL DATA
• Definition
– Ordered groups and categories
– Data is ordered in a certain way but intervals between measurements are not
meaningful
• Ordinal data comes from self-reported data on questionnaires
– Website rated as excellent, good, fair, or poor
– Severity rating of problem encountered as high, medium, or low
• Usage
– Looking at frequencies
– Calculating an average is meaningless (the distance between high and medium may not be the same as between medium and low)
INTERVAL DATA
• Definition
– Continuous data where differences between the measurements are meaningful
– Zero point on the scale is arbitrary
• System Usability Scale (SUS)
– Example of interval data
– Based on self-reported data from a series of questions about overall usability
– Scores range from 0 to 100
• Higher score indicates better usability
• The distance between points is meaningful because it indicates an increase or decrease in perceived usability
• Usage
– Able to calculate descriptive statistics such as average, standard deviation, etc.
– Inferential statistics can be used to generalize to a population
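For reference, SUS is scored by rescaling each of the ten 1-5 item responses to 0-4 (odd, positively worded items score the response minus 1; even items score 5 minus the response) and multiplying the sum by 2.5, giving a 0-100 score. A sketch with hypothetical responses:

```python
# Hypothetical responses to the ten SUS items (1 = strongly disagree,
# 5 = strongly agree; odd items positively worded, even items negative).
responses = [4, 2, 5, 1, 4, 2, 4, 1, 5, 2]

contributions = [
    r - 1 if i % 2 == 0 else 5 - r  # i is 0-based: even index = odd item
    for i, r in enumerate(responses)
]
sus = sum(contributions) * 2.5
print(sus)  # 85.0
```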
Ordinal vs. Interval Rating Scales
• Are these two scales different? (two example rating scales shown on the slide)
• The top scale is ordinal: you should only calculate frequencies of each response.
• The bottom scale can be considered interval: you can also calculate means.
RATIO DATA
• Definition
– Same as interval data, with the addition of an absolute zero
– Zero has inherent meaning
• Example
– The difference between a person aged 35 and a person aged 38 is the same as the difference between people aged 12 and 15
– With time to completion, you can say that one participant is twice as fast as another
• Usage
– Most analyses that you do work with both ratio and interval data
– The geometric mean is an exception; it requires ratio data
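A quick sketch of the geometric mean for a hypothetical set of task times, computed via logs for numerical stability:

```python
import math

# Hypothetical task-completion times in seconds (ratio data).
times = [10.0, 20.0, 40.0]

# Geometric mean: nth root of the product, computed via logs.
geo_mean = math.exp(sum(math.log(t) for t in times) / len(times))
print(round(geo_mean, 1))  # 20.0
```

The geometric mean is less sensitive than the arithmetic mean to the long right tail that task-time data typically has.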
Statistics for each Data Type
Confidence Intervals
• Assume this was your time data for a study with 5 participants (data table shown on slide; SD = 9.6, n = 5)
• Does that make a difference in your answer?
Calculating Confidence Intervals
– <alpha> is normally .05 (for a 95% confidence interval)
– <std dev> is the standard deviation of the set of numbers (9.6 in this example)
– <n> is how many numbers are in the set (5 in this example)
=CONFIDENCE(<alpha>,<std dev>,<n>)
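Excel's CONFIDENCE returns the normal-based margin of error z·s/√n. The same value in plain Python, using the SD of 9.6 and n of 5 from the example:

```python
import math
from statistics import NormalDist

def confidence_margin(alpha: float, std_dev: float, n: int) -> float:
    """Margin of error for a normal-based confidence interval,
    matching Excel's =CONFIDENCE(alpha, std_dev, n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = .05
    return z * std_dev / math.sqrt(n)

margin = confidence_margin(0.05, 9.6, 5)
print(round(margin, 1))  # 8.4
```

So the 95% confidence interval is the mean plus or minus about 8.4 seconds in this example.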
Excel Example: Showing Error Bars
Binary Success
• Pass/fail (or other binary criteria)
• 1’s (success) and 0’s (failure)
Confidence Interval for Task Success
• When you look at task success data across participants for a single
task the data is commonly binary:
– Each participant either passed or failed on the task.
• In this situation, you need to calculate the confidence interval using
the binomial distribution.
Example
– The easiest way to calculate the confidence interval is using Jeff Sauro’s web calculator:
– http://www.measuringusability.com/wald.htm
1=success, 0=failure. So, 6/8 succeeded, or 75%.
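The same calculation can be done directly. The sketch below computes the adjusted-Wald interval (the small-sample variant Sauro recommends, which adds z²/2 successes and z² trials before applying the Wald formula) for the 6-of-8 result:

```python
import math
from statistics import NormalDist

def adjusted_wald(successes: int, n: int, alpha: float = 0.05):
    """Adjusted-Wald confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    p_adj = (successes + z * z / 2) / (n + z * z)
    se = math.sqrt(p_adj * (1 - p_adj) / (n + z * z))
    lo, hi = p_adj - z * se, p_adj + z * se
    return max(0.0, lo), min(1.0, hi)

lo, hi = adjusted_wald(6, 8)       # 6 of 8 participants succeeded (75%)
print(round(lo, 2), round(hi, 2))  # 0.4 0.94
```

Note how wide the interval is with only 8 participants: the true success rate could plausibly be anywhere from about 40% to 94%.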
Chi-square
• Allows you to compare actual and expected frequencies for
categorical data.
=CHITEST(<actual range>,<expected range>)
Excel Example
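CHITEST returns the p-value; the same computation works in plain Python. With one degree of freedom the chi-square p-value equals 2·(1 − Φ(√χ²)), so no statistics package is needed. The A/B click counts below are hypothetical:

```python
import math
from statistics import NormalDist

# Hypothetical A/B test: of 100 visitors, 60 clicked version A and 40
# clicked version B; the null hypothesis expects an even 50/50 split.
actual = [60, 40]
expected = [50, 50]

chi2 = sum((a - e) ** 2 / e for a, e in zip(actual, expected))
# p-value for df = 1 via the normal distribution.
p = 2 * (1 - NormalDist().cdf(math.sqrt(chi2)))
print(round(chi2, 1), round(p, 3))  # 4.0 0.046
```

Here p < .05, so the observed split differs significantly from the expected 50/50.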
Comparing Means
• T-test: independent samples (between subjects)
– e.g., Apollo websites study, task times
• T-test: paired samples (within subjects)
– e.g., haptic mouse study
T-tests in Excel
=TTEST(<array1>,<array2>,x,y)
– x = 2 (for a two-tailed test) in almost all cases
– y = 2 for independent samples; y = 1 for paired samples
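The t statistic itself is easy to compute by hand; the sketch below does a paired-samples version on hypothetical time differences (not the actual haptic mouse data). =TTEST would additionally convert |t| into a p-value using the t distribution with n − 1 degrees of freedom:

```python
import math
from statistics import mean, stdev

# Hypothetical per-participant time differences (A minus B), in seconds.
diffs = [3.0, 1.0, 2.0, 4.0, 0.0]

n = len(diffs)
t = mean(diffs) / (stdev(diffs) / math.sqrt(n))  # paired-samples t statistic
df = n - 1
print(round(t, 2), df)  # 2.83 4
# |t| = 2.83 exceeds the two-tailed .05 critical value of 2.776 for df = 4,
# so the difference is (just barely) significant at p < .05.
```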
Comparing Multiple Means
• Analysis of Variance (ANOVA)
Excel: “Tools” > “Data Analysis” > “Anova: Single Factor”
Example: a study comparing 4 navigation approaches for a website
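The F statistic that the Excel tool reports can be computed directly; a sketch with three small hypothetical groups (Excel additionally derives the p-value from the F distribution):

```python
from statistics import mean

# Hypothetical task scores for three design variants.
groups = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]

grand = mean(x for g in groups for x in g)
k = len(groups)                      # number of groups
n = sum(len(g) for g in groups)      # total observations

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(round(f_stat, 2))  # 7.0
```

A large F means the variation between group means is large relative to the variation within groups.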
Exercise 10-2: Creating Benchmark Tasks and UX Targets for Your System
• Goal
– To gain experience in writing effective benchmark tasks and measurable UX targets.
• Activities
– We have shown you a rather complete set of examples of benchmark tasks and UX targets for the Ticket Kiosk System. Your job is to do something similar for the system of your choice.
– Begin by identifying which work roles and user classes you are targeting in evaluation (a brief description is enough).
– Write three or more UX table entries (rows), including your choices for each column. Have at least two UX targets based on a benchmark task and at least one based on a questionnaire.
– Create and write up a set of about three benchmark tasks to go with the UX targets in the table.
• Do NOT make the tasks too easy.
• Make tasks increasingly complex.
• Include some navigation.
• Create tasks that you can later “implement” in your low-fidelity rapid prototype.
• The expected average performance time for each task should be no more than about 3 minutes, just to keep it short and simple for you during evaluation.
– Include the questionnaire question numbers in the measuring instrument column of the appropriate UX target.
Lecture #12 COG_Human Computer Interaction 33
Exercise 10-2: Creating Benchmark Tasks and UX Targets for Your System
• Cautions and hints:
– Do not spend any time on design in this exercise; there will be time for detailed design in the next exercise.
– Do not plan to give users any training.
• Deliverables:
– Two user benchmark tasks, each on a separate sheet of paper.
– Three or more UX targets entered into a blank UX target table on your laptop or on paper.
– If you are doing this exercise in a classroom environment, finish up by reading your benchmark tasks to the class for critique and discussion.
• Schedule
– Work efficiently and complete in about an hour and a half.