2011 10 07 (uam) emadrid mfreire ucm sistema analisis usabilidad herramientas autoria contenidos e...
A Usability Analysis System for e-Learning Authoring Tools
Manuel Freire Morán
eMadrid seminar on Adaptive Systems - 7 October 2011
Contents
Introduction
Ratings and outcomes
Story-line authoring
Time-lapse visualization
Future work
Concluding remarks
Presentation
Introduction: authoring & tool evaluation
System adoption requires content authoring
reuse
Better authoring requires creativity support
imagine, create, play, share, reflect (creative thinking spiral - Resnick, 2008)
low threshold, high ceiling, wide walls
make it as simple as possible - and maybe simpler
evaluate your tools
Shneiderman et al., 2006
Story-line authoring: adventure game authoring
Story-line authoring: Weev
Story-line authoring: initial question
Ratings and outcomes: a-b testing with ratings
A-B testing: very popular in web usability
Split participants randomly into two groups
Each group uses an interface variant (A and B)
Outcomes of each group are compared
Outcome of a creative task?
Objective measures: correctness, completeness
Subjective measures: questionnaires (author satisfaction), ratings (comparative quality)
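The random split into two groups can be sketched as follows. This is a minimal illustration, not the talk's implementation: the function name `split_ab`, the user identifiers and the fixed seed are all assumptions made for the example.

```python
import random

def split_ab(participants, seed=None):
    """Randomly split participants into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    # The first half uses interface variant A, the rest uses variant B.
    return {"A": shuffled[:half], "B": shuffled[half:]}

# Hypothetical user identifiers; the seed only makes the example repeatable.
groups = split_ab([f"user{i:02d}" for i in range(20)], seed=7)
print(len(groups["A"]), len(groups["B"]))
```

Shuffling once and cutting the list in half keeps the groups balanced, which a per-user coin flip would not guarantee.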
Ratings and outcomes: evaluating ratings
N users are requested to rate M users each
All users are rated the same number of times
No user rated twice by same rater
No user rates self
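One way to satisfy the three constraints above is a shuffled circular assignment: each rater rates the next M users around a randomly ordered circle. The sketch below is an assumption about how such an assignment could be built; the slides do not say which scheme the system uses, and `assign_ratings` is a hypothetical name.

```python
import random

def assign_ratings(users, m, seed=None):
    """Return {rater: [rated users]} with m distinct, non-self targets each."""
    order = list(users)
    n = len(order)
    if not 0 < m < n:
        raise ValueError("need 0 < m < number of users")
    rng = random.Random(seed)
    rng.shuffle(order)  # hide any meaningful ordering of the user list
    # The rater at position i rates the next m users around the circle,
    # so every user is also rated exactly m times (by the m users before it).
    return {
        order[i]: [order[(i + k) % n] for k in range(1, m + 1)]
        for i in range(n)
    }

assignment = assign_ratings(range(20), m=4, seed=1)
```

This sketch does not balance how many A and B users each rater sees; a real assignment would probably need that as an extra constraint.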
Unknown individual rating distribution. How to determine if results are significant?
H0: each A vs B comparison is a fair coin toss, so count(A rated better than B) follows a binomial distribution with p = 1/2
one-sided binomial test
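Written out, the one-sided test reads as follows. The symbols $n$ (number of A vs B comparisons) and $x$ (number of comparisons won by the variant under test) are introduced here for illustration, and the comparisons are assumed to be independent.

```latex
X \sim \mathrm{Binomial}\!\left(n, \tfrac{1}{2}\right) \text{ under } H_0,
\qquad
p\text{-value} = P(X \ge x \mid H_0) = \sum_{k=x}^{n} \binom{n}{k} \left(\tfrac{1}{2}\right)^{n}
```

H0 is rejected when this p-value falls below the chosen significance level.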
Ratings and outcomes: statistical treatment
Figure: example ratings (5/10, 8/10, 7/10, 4/10, 6/10) given to storylines from groups a and b, counted as pairwise outcomes: 1 x b>a, 4 x a>b
Ratings and outcomes: experiment
Experiment with 20 users (first-time Weev users)
5-minute tutorial on tool use (applicable to A&B)
"Little Red Riding-Hood" script and resources
1 hour's time; after editing, rating screen
Outcome
70 total A vs B comparisons; in 40, B > A
p-value: 0.141 (does not reject the null hypothesis)
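The reported p-value can be reproduced with an exact one-sided binomial tail under a fair-coin null hypothesis. The function name below is illustrative, and the sketch assumes the 70 comparisons are treated as independent trials.

```python
from math import comb

def binomial_p_one_sided(successes, trials):
    """P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials

# 40 of the 70 comparisons favoured B.
p = binomial_p_one_sided(40, 70)
print(f"{p:.3f}")  # prints 0.141
```

At the conventional 0.05 level, roughly 43 or more of the 70 comparisons would have had to favour B for the null hypothesis to be rejected.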
Ratings and outcomes: simulations
Ratings and outcomes: more simulations
Increase ratings-per-user
Increase number of users
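The slides do not describe the simulation model, so the following is only a hypothetical sketch of how such a simulation might look. It assumes that each comparison independently favours B with a fixed probability (here the observed 40/70), and that more users or more ratings per user simply produce more comparisons. It then estimates how often the one-sided binomial test would reject H0.

```python
import random
from math import comb

def p_value(successes, trials):
    """One-sided P(X >= successes) for X ~ Binomial(trials, 0.5)."""
    tail = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail / 2 ** trials

def critical_count(trials, alpha):
    """Smallest number of B wins whose p-value is below alpha."""
    for k in range(trials + 1):
        if p_value(k, trials) < alpha:
            return k
    return trials + 1  # no outcome is significant at this alpha

def estimated_power(comparisons, prob_b_wins, alpha=0.05, runs=2000, seed=0):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    threshold = critical_count(comparisons, alpha)
    rejections = 0
    for _ in range(runs):
        wins = sum(rng.random() < prob_b_wins for _ in range(comparisons))
        if wins >= threshold:
            rejections += 1
    return rejections / runs

# Assumed effect size: B wins 40/70 of comparisons, as in the experiment.
for n in (70, 140, 280):
    print(n, estimated_power(n, prob_b_wins=40 / 70))
```

Under these assumptions the rejection rate grows with the number of comparisons, which is the motivation for raising either the ratings per user or the number of users.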
Ratings and outcomes: increasing ratings
Ratings and outcomes: increasing users
Time-lapse visualization: idea
UI instrumentation sends data to server
server timestamps records for each user
UI screen captures
Action logs
last screen capture sample used for rating
ratings-file is just one more record type
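The record model described above could be sketched as a small server-side store. Everything here is an assumption made for illustration: the class names `Record` and `RecordStore`, the type labels "capture", "log" and "ratings", and the in-memory storage are not taken from the actual system.

```python
from dataclasses import dataclass
from time import time

@dataclass(frozen=True)
class Record:
    user: str
    kind: str         # hypothetical labels: "capture", "log", "ratings"
    payload: bytes
    timestamp: float  # assigned by the server on arrival, not by the client

class RecordStore:
    def __init__(self, clock=time):
        self._clock = clock
        self._by_user = {}

    def receive(self, user, kind, payload):
        """Timestamp an incoming record and file it under its user."""
        record = Record(user, kind, payload, self._clock())
        self._by_user.setdefault(user, []).append(record)
        return record

    def last_capture(self, user):
        """Most recent screen capture for a user, or None if there is none."""
        captures = [r for r in self._by_user.get(user, []) if r.kind == "capture"]
        return captures[-1] if captures else None

store = RecordStore()
store.receive("user01", "capture", b"first frame")
store.receive("user01", "log", b"added node")
store.receive("user01", "capture", b"second frame")
```

Treating ratings as one more `kind` means the same store and the same per-user timeline cover captures, logs and rating files alike.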
How do you begin to analyze this data?
Build videos from screen-captures
Visualization tool to scan time-lapse data
Time-lapse visualization: interface
detail / difference view
user / record selection
time-lapse view of records for first selected user
time-lapse view for next selected user
Time-lapse visualization: insights
User is stuck
Creative flow
Rectangular placement
Time-lapse visualization: tasks and directions
Tasks for UI evaluation
What did this user do?
- glance at timeline
What happened between here and there?
- use difference function
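The talk does not say how the difference view is computed. As a rough sketch of the idea, the function below compares two captures pixel by pixel and reports the bounding box of what changed. It assumes both captures are equally sized grids of pixel values; `changed_region` is a hypothetical name.

```python
def changed_region(before, after):
    """Bounding box (top, left, bottom, right) of differing pixels, or None."""
    rows = [y for y, (ra, rb) in enumerate(zip(before, after)) if ra != rb]
    if not rows:
        return None  # the two captures are identical
    cols = [
        x
        for ra, rb in zip(before, after)
        for x, (pa, pb) in enumerate(zip(ra, rb))
        if pa != pb
    ]
    return min(rows), min(cols), max(rows), max(cols)

# Two tiny 3x4 "screen captures"; 0 and 1 stand in for pixel colours.
frame_a = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
frame_b = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0]]
print(changed_region(frame_a, frame_b))  # (1, 1, 2, 2)
```

Highlighting only the changed region lets an evaluator see at a glance what a user edited between two points on the timeline.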
Still missing
Variable-length time lapses ("time zoom")
Graphical display of textual log data
Filter & Sort
Future work: next experiment
Server
Finer-grained data collection
Better rating distributions, more ratings
Support for experimenter note-taking
Analysis tool improvements
Test new automatic layout assistant
Do people use it?
Do people rate "automatically-improved" storylines as better than before?
Concluding remarks
A/B usability testing methodology for creative/aesthetic tasks
Reusable server
Collects images, logs, rating results
Configurable rating, allows real-time monitoring
Reusable client
Specific handling for each record-type (e.g., logs)