TRANSCRIPT
CHI99 Panel: Comparative Evaluation of Usability Tests
Presentation by Rolf Molich, DialogDesign, Denmark
Take a web-site.
Take nine professional usability teams.
Let each team usability test the web-site.
Are the results similar?
What Have We Done?
- Nine teams have usability tested the same web-site
  - Seven professional teams
  - Two student teams
- Test web-site: www.hotmail.com (free e-mail service)
Panel Format
- Introduction (Rolf Molich)
- Five-minute statements from five participating teams
- The customer's point of view (Meeta Arcuri, Hotmail)
- Conclusions (Rolf Molich)
- Discussion (30 minutes)
Purposes of Comparison
- Survey the state of the art within professional usability testing of web-sites
- Investigate the reproducibility of usability test results
Non-Purposes of Comparison
- To pick a winner
- To make a profit
Basis for Usability Test
- Web-site address: www.hotmail.com
- Client scenario
- Access to the client through an intermediary
- Three weeks to carry out the test
What Each Team Did
- Run a standard usability test
- Anonymize the usability test report
- Send the report to Rolf Molich
Problems Found
- Total number of different usability problems found: 300
- Found by seven teams: 1
- Found by six teams: 1
- Found by five teams: 4
- Found by four teams: 4
- Found by three teams: 15
- Found by two teams: 49
- Found by one team only: 226 (75%)
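As a sanity check, the per-team counts above can be tallied programmatically. This minimal Python sketch (data transcribed from the slide) confirms that the categories sum to 300 and that single-team problems make up about 75% of the total:

```python
# Distribution from the "Problems Found" slide:
# key = number of teams that found a problem, value = how many such problems.
found_by = {7: 1, 6: 1, 5: 4, 4: 4, 3: 15, 2: 49, 1: 226}

total = sum(found_by.values())
print(total)                             # 300, matching the slide's total
print(round(100 * found_by[1] / total))  # 75 (% of problems found by only one team)
```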
Comparative Usability Evaluation 2
- Barbara Karyukina, SGI (USA)
- Klaus Kaasgaard & Ann D. Thomsen, KMD (Denmark)
- Lars Schmidt and others, Networkers (Denmark)
- Meghan Ede and others, Sun Microsystems, Inc. (USA)
- Wilma van Oel, P5 (The Netherlands)
- Meeta Arcuri, Hotmail, Microsoft Corp. (USA) (customer)
- Rolf Molich, DialogDesign (Denmark) (coordinator)
Comparative Usability Evaluation 2 (continued)
- Joseph Seeley, NovaNET Learning Inc. (USA)
- Kent Norman, University of Maryland (USA)
- Torben Norgaard Rasmussen and others, Technical University of Denmark
- Marji Schumann and others, Southern Polytechnic State University (USA)
Presentation by Barbara Karyukina, SGI, Wisconsin, USA
Challenges:
Twenty functional areas + user preferences questions
Possible Solutions:
- Two usability tests
- Surveys
- User notes
- Focus groups
Results:
- 26 tasks + 10 interview questions
- 100 findings
Presentation by Klaus Kaasgaard, Kommunedata, Denmark
Slides currently not available
Presentation by Lars Schmidt, Framtidsfabriken Networkers, Denmark
Team E
Framtidsfabriken Networkers Testlab, Denmark

Key Learnings from CUE-2
- Setting up the test
  - Insist on dialog with the customer
  - Secure complete understanding of user groups and user tasks
  - Narrow down test goals
- Writing the report
  - Use screen dumps
  - State conclusions - skip the premises
  - Test the usability of the usability report
Improving Test Methodology
- Searching for usability and usefulness
  - Hook up with different methodologies (e.g. interviews)
- Focus on website context
  - Test against e.g. YahooMail
  - Test against software-based e-mail clients
Presentation by Meghan Ede, Sun Microsystems, California, USA
Hotmail Study Requests
- 18 specific features (e.g. Registration, Login, Compose...)
- 6 questions (e.g. "How do users currently do email?")
- 24 potential study areas
Usability Methods
- Expert review
  - 6 reviewers
  - 6 questions
- Usability study
  - 6 participants (3 + 3)
  - 5 tasks (with sub-tasks)
Report Description
1. Executive summary
   - 4 main high-level themes
   - Brief study description
2. Debriefing meeting summary
   - 7 areas (e.g. overall, navigation, power features, ...)
3. Findings
   - 31 sections
   - Study requests, extra areas, bugs, task times, study Q&A
4. Study description

Total: 36 pages, 150 findings
Lessons Learned
- Importance of close contact with the product team
- Consider including:
  - severity ratings
  - more specific recommendations
  - screen shots
Discussion Issues
- How can we measure the usability of our reports?
- How to deal with the difference between the number of problems found and the number included in the report?
Presentation by Wilma van Oel, P5, The Netherlands
Wilma van Oel
P5, quality & product management consultants
Amsterdam, The Netherlands
Structure of Presentation
1. Introduction
2. Deviations in approach
   - Test design
   - Results and recommendations
3. Lessons for the future
   - Change in approach?
   - Was it worth the effort?
Introduction
- Company: P5 Consultants
- Personal background: psychologist
Test Design
- Subjects: n = 11, pilot, 'critical users', 1-hour sessions
- Data collection: log software, video recording
- Methods: lab evaluation + informal approach
- Techniques: exploration, task execution, think aloud, interview, questionnaire
- Tool: SUS (System Usability Scale)
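The SUS tool mentioned above is the standard ten-item System Usability Scale questionnaire. As an illustration, its usual scoring rule can be sketched in Python; the sample responses below are hypothetical, not P5's actual data:

```python
def sus_score(responses):
    """Standard SUS scoring for ten 1-5 Likert responses.

    Odd-numbered items contribute (response - 1), even-numbered items
    contribute (5 - response); the sum is multiplied by 2.5, giving 0-100.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses on a 1-5 scale")
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # index 0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5

# Hypothetical respondent who rates the site favourably:
print(sus_score([4, 2, 4, 1, 5, 2, 5, 1, 4, 2]))  # 85.0
```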
A Test Session
Results and Recommendations
- Negative findings: n = median
- Positive findings: n > mean
- Recommendations: general, not 'how'
- Results: 'general'; severity?
Lessons for the Future
- Change in approach?
  - Methods: add a usability inspection method
  - Procedure: extensive analysis, add session time
  - Results: less general; severity?
- Was it worth the effort?
  - Company: to get experience & benchmarking
  - Personally: to improve skills, knowledge
Presentation by Meeta Arcuri, Microsoft Corporation, California, USA
Meeta Arcuri, User Experience Manager, Microsoft Corp., San Jose, CA
CUE-2: The Customer's Perspective

Customer Summary of Findings
- New findings: ~4%
- Validation of known issues: ~67%
  - Previous findings from our lab tests
  - Findings from on-going inspections
- Remainder: beyond Hotmail usability
  - Business reasons for not changing
  - Out of Hotmail's control (partner sites)
  - Problems generic to the web

Report Content: Positive Observations
- Quick-and-dirty results
- Recommendations for problem fixes
- Participant quotes - get tone/intensity of feedback
- Exact number of participants who encountered each issue
- Background of participants
- Environment (browser, speed of connection, etc.)

Additional Strengths of Reports
- Fresh perspectives
- Lots of data on non-US users
- Recommendations from participants
- Trend reporting
- Report of outdated material on site (some help files)
- Appreciate positive findings, comments

Report Content: Weaknesses
- Some recommendations not sensitive to web issues (performance, security)
- At least one finding irreproducible (not preserving fields in the registration form)
- Frequency of issue reported was sometimes vague
- Some descriptions terse, vague - had to decipher

How Hotmail Will Use Results
- Cross-validate new findings with Hotmail Customer Service reports
- Lots of good data to cite in planning meetings
- Some good recommendations given by labs and participants

Conclusion
- Focused, iterative testing would give better results
- Wide array of user data very valuable
- Overall: good qualitative and quantitative data to help prioritize, schedule, and improve the usability of Hotmail
Presentation by Rolf Molich, DialogDesign, Denmark
Comparison of Tests
- Based only on test reports
- Liberal scoring
- Focus on major differences
- Two generally recognized textbooks:
  - Dumas and Redish, "A Practical Guide to Usability Testing"
  - Jeff Rubin, "Handbook of Usability Testing"
Resources
Team                          A    B    C     D    E    F    G    H    J
Person hours used for test  136  123   84  (16)  130   50  107   45  218
# Usability professionals     2    1    1     1    3    1    1    3    6
Number of tests               7    6    6    50    9    5   11    4    6
Usability Results
Team                  A    B    C    D    E    F    G    H    J
# Positive findings   0    8    4    7   24   25   14    4    6
# Problems           26  150   17   10   58   75   30   18   20
% Exclusive          42   71   24   10   57   51   33   56   60
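"% Exclusive" denotes the share of a team's reported problems that no other team reported. With hypothetical problem sets (the team names and problem IDs below are illustrative, not CUE-2 data), the metric can be computed like this:

```python
# Illustrative problem sets; the real CUE-2 problem lists are in the test reports.
reports = {
    "A": {"p1", "p2", "p3", "p4"},
    "B": {"p3", "p5"},
    "C": {"p5", "p6", "p7"},
}

def pct_exclusive(team, reports):
    """Percentage of `team`'s problems that no other team reported."""
    others = set().union(*(probs for t, probs in reports.items() if t != team))
    mine = reports[team]
    return 100 * len(mine - others) / len(mine)

print(round(pct_exclusive("A", reports)))  # 75: p1, p2, p4 are exclusive to A
```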
Usability Results
Team                           A    B    C    D    E    F    G    H    J
# Problems                    26  150   17   10   58   75   30   18   20
% Core problems (100% = 26)   38   73   35    8   58   54   50   27   31
Person hours used for test   136  123   84   NA  130   50  107   45  218
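One way to read the two tables together is problems reported per person hour. The sketch below (data transcribed from the tables; team D omitted because its hours are listed as NA) shows that this ratio varies widely across teams:

```python
# Person hours and problem counts per team, from the Resources and Results tables.
hours    = {"A": 136, "B": 123, "C": 84, "E": 130, "F": 50, "G": 107, "H": 45, "J": 218}
problems = {"A": 26, "B": 150, "C": 17, "E": 58, "F": 75, "G": 30, "H": 18, "J": 20}

for team in hours:
    rate = problems[team] / hours[team]
    print(f"{team}: {rate:.2f} problems per person hour")
```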
Problems Found
- Total number of different usability problems found: 300
- Found by seven teams: 1
- Found by six teams: 1
- Found by five teams: 4
- Found by four teams: 4
- Found by three teams: 15
- Found by two teams: 49
- Found by one team only: 226 (75%)
Conclusion
- If Hotmail is typical, then the total number of usability problems for a typical web-site is huge - much larger than you can hope to find in one series of usability tests
- Usability testing techniques can be improved
- We need more awareness of the usability of usability work
Download Test Reports and Slides
http://www.dialogdesign.dk/cue2.htm