My App is an Apparatus: How to do Mobile HCI Research in the Large
DESCRIPTION
Since the introduction of application stores for mobile devices there has been increasing interest in using this distribution platform to collect user feedback. Mobile application stores can make research prototypes widely available and enable researchers to conduct user studies “in the wild” with participants from all over the world. Using apps as an apparatus goes beyond just distributing research prototypes. Considering apps as a research tool means distributing specifically designed prototypes in order to extend our understanding of mobile HCI. In this tutorial we provide an overview of recent research in this domain. We will show that well-designed tasks and users’ motivation are crucial aspects. We will discuss how to design app-based experiments, what kind of users one can expect, and how to avoid ethical and legal issues.
TRANSCRIPT
How to do Mobile HCI Research in the large?
Niels HenzeUniversity of StuttgartVisualization and Interactive Systems Institute
Martin PielotTelefónica I+DHCI and Mobile Computing Group
… but let’s start with a question:
Who of you ever participated in a user study?
Do you think that any of these guys ever did?
Photo by Robertobra, http://en.wikipedia.org/wiki/File:Guarani_Family.JPG (GFDL)
Outline
1. Limitations of common studies
2. Into the large
3. Types of studies
4. What is so special?
5. What works for us
6. Wrap up
User studies at MobileHCI 2010
20% acceptance rate
43 short+long papers
subjects per paper
subject’s gender
often a biased sample
http://nhenze.net/?p=810
undergraduate or graduate students at the local university studying a variety of majors
members in a joint research project
most participants were students
studying or working in the University of Glasgow
university students
most subjects were students with a background in computer sciences
students or employees at our university
all with a university degree, recruited in the Institute community
recruited through flyers, posters and various mailing lists at the university
10 university students and 2 participants are marketing professionals
small samples
artificial context
artificial task
convenient samples
Some male students from the lab took part in our study...
Small sample size isn’t necessarily an issue for a study
Not every study needs a perfect sample of the population
Focussing on studies with few subjects prevents many findings
We stew in our own juices if we use our own students by default
User studies at MobileHCI 2011
22.8% acceptance rate
63 short+long papers
subjects per paper
http://nhenze.net/?p=865
Some motivation
Large numbers are expensive in the lab
– 1,000 subjects for an hour -> 10,000€
– 1,000 subjects for an hour -> 6 months
– 1,000 subjects from around the world -> impossible

Different contexts are hard to address
– We have no airplane in our lab
– Don’t want to buy train tickets for my participants
– And what are the relevant contexts anyway?
Outline
1. Limitations of common studies
2. Into the large
3. Types of studies
4. What is so special?
5. What works for us
6. Wrap up
Example of getting large…

Target selection on mobile phones
thirty right-handed subjects
different target locations and sizes
[Park2008MobileHCI]

Taps are skewed
fixed posture
single device
Korean students
vague results
[Park2008MobileHCI]
…same thing in the large
game published on the Android Market
we inform the player about the study
just looks like an ordinary game
participants get some introduction
they tap the targets
we vary targets’ size and position
there is even a high score list

published on the Android Market
100,000 installations in three months
120 million touch events
more than a hundred different devices
players from all over the world
[Park2008MobileHCI]
[Henze2011MobileHCI]
Outline
1. Limitations of common studies
2. Into the large
3. Types of studies
4. What is so special?
5. What works for us
6. Wrap up
Types of work
Proof of concept
– Showing that an idea/concept/product works
– Lots of users, good ratings, positive comments, ...

App stores as research tool
– Experience report
– Ethical and legal issues

Investigating app-specific aspects
– How a specific app is used
– Compare different visualizations

Observing general aspects
– Learn about how people and devices behave
– How apps are used, how people touch the screen, ...
Proof of concept
Smule’s iPhone Ocarina
music instrument for the iPhone
million installations
[Wang2009NIME]
Shapewriter
developed gesture-based keyboard + notepad
qualitative feedback from App Store comments
[Zhai2009CHI]
App stores as research tool
Into the wild with Hungry Yoshi
location-based game for the iPhone
94,642 unique downloaders
investigated how to get subjective feedback
[McMillan2010Pervasive]
Experience from 5 studies
compare amount of collected data
experience with collecting qualitative data
discuss internal and external validity
[Henze2011IJMHCI]
[Chart: share of users opting in across five apps (SINLA, PocketNavigator, MapExplorer, Poke the Rabbit, Tap It), values 0.46%, 7.32%, 54.76%, 83.68%, 81.31%]
Local vs. wild
local study with 11 participants
wild study with over 10,000 users
combine the findings of both approaches
[Morrison2012CHI]
Investigating app-specific aspects
Ratings for Mobile Applications
compare amount of collected data
experience with collecting qualitative data
discuss internal and external validity
[Girardello2010DSZ]
Compare off-screen visualisations
using repeated measures
using a tutorial for a map application
and using a simple game
[Henze2010MobileHCI] [Henze2010NordiCHI]
Observing general aspects
Falling Asleep with … appazaar
[Böhmer2011MobileHCI]
A Study of Battery Life
[Ferreira2011Pervasive]
proof of concept
app stores as a research tool
ethics and legal issues
investigating app-specific aspects
investigating general aspects
[Wang2009NIME]
[Zhai2009CHI]
[Gilbertson2008CiE]
[Oliver2010HotPlanet]
[McMillan2010RiL]
[Miluzzo2010RiL]
[Henze2011IJMHCI]
[McMillan2010Pervasive]
[Cramer2010UbiComp]
[Morrison2010RiL]
[Poppinga2010OMUE]
[Pielot2011ELV]
[Henderson2009HotPlanet]
[Morrison2011CHI]
[Norcie2011ELV]
[Girardello2010DSZ]
[Riccamboni2010IB]
[Kuhn2010MM]
[Yan2011MobiSys]
[Budde2010IoT]
[Karpischek2011RiL]
[Henze2010MobileHCI]
[Henze2010NordiCHI]
[Hood2011IJTR]
[Henze2011MobileHCIa]
[Henze2011MobileHCIb]
[Watzdorf2010LocWeb]
[Ferreira2011Pervasive]
[Buddharaju2010CHI]
[Sahami2011CHI]
[Verkasalo2010MB]
[Böhmer2011MobileHCI]
Outline
1. Limitations of common studies
2. Into the large
3. Types of studies
4. What is so special?
5. What works for us
6. Wrap up
but what is special about app store studies?
App-based vs. other studies

             | Common controlled studies       | Mining existing data | App-based studies
Participants | Few participants                | Many participants    | Many participants
Context      | Artificial context              | Natural context      | Natural context
Task         | Defined task                    | No tasks             | Defined tasks (if needed)
Control      | Total control over participants | No control           | Weak control over participants
Sample       | Heavily biased sample           | Unbiased sample      | Biased to unbiased sample
You have to “sell” your study

The study has a goal
– Collect information about specific behaviour
– Performance for a specific task

Users have to install the app of their own free will
– App needs a purpose
– Good ratings, high ranking

Find a compromise
– Maintain the goals of the study
– Attract sufficient participants
Types of apps
Applications Games Widgets
Participants
How do we count the number of participants?
[Chart: installations vs. opt-in vs. active users, ranging from 0 to 100,000]
[McMillan2010Pervasive] [Morrison2010RiL]
Participants
How do we count the number of participants?
A good sample of the population?
[Chart: age distribution of US Android users vs. US population, age groups 18-34, 35-44, 45-54, 55-64, 65+]
[Nielsen2011] [USCensusBureau2008]
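One way to quantify how far an installed base deviates from the population is to compare the two age distributions directly. A minimal sketch, with invented placeholder shares (not the actual Nielsen/Census figures):

```python
def sample_bias(sample_shares, population_shares):
    """Per-bucket over/under-representation plus total variation distance.

    Both arguments are dicts mapping age bucket -> share (summing to 1.0).
    """
    bias = {k: sample_shares[k] - population_shares[k] for k in population_shares}
    tvd = sum(abs(v) for v in bias.values()) / 2  # total variation distance
    return bias, tvd

# Placeholder numbers for illustration only
android = {"18-34": 0.55, "35-44": 0.20, "45-54": 0.13, "55-64": 0.08, "65+": 0.04}
census  = {"18-34": 0.31, "35-44": 0.18, "45-54": 0.19, "55-64": 0.16, "65+": 0.16}
bias, tvd = sample_bias(android, census)
```

A positive bias for a bucket means that group is over-represented among app users; the total variation distance gives a single number for how skewed the sample is.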
Collecting information
Objective data
– As early as possible [Henze2011IJMHCI]
– More than just the task performance
  • All aspects that affect the results
  • E.g. device type, locale, time, screen size, resolution, ...
  • In particular: a version number
– Compromise between permissions and data to collect
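The bullets above can be turned into a concrete log record. A minimal sketch in Python; all field names and values here are illustrative, not the format used in the tutorial's apps (on Android you would read the device fields from `android.os.Build` and `Locale.getDefault()`):

```python
import json
import time

def make_log_record(event, payload, app_version):
    """Bundle a raw measurement with the context needed to interpret it later."""
    record = {
        "event": event,               # e.g. "touch", "level_completed"
        "payload": payload,           # the raw, unaggregated measurement
        "timestamp": time.time(),     # client-side time of the event
        "app_version": app_version,   # crucial: separates data per release
        "device_model": "Nexus One",  # placeholder; real apps read Build.MODEL
        "locale": "en_US",            # placeholder; real apps read the device locale
        "screen": {"width": 480, "height": 800},
    }
    return json.dumps(record)

line = make_log_record("touch", {"x": 120, "y": 301, "target": 3}, "1.4.2")
```

Logging the version number with every record means a later analysis can discard or separate data from buggy releases.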
Collecting information
Subjective data
– App Store comments can provide information
  • but usually don't [Henze2011IJMHCI]
  • Might help to claim an app is great (e.g. [Zhai2009CHI])
  • Ratings without baseline are meaningless
– Investigated how to get subjective feedback [McMillan2010Pervasive]
  • In-game “tasks” with dynamically loaded questions
  • Integration with Facebook
  • Interviewed 10 people over VoIP for $25
Collecting information
You have to measure what you intend to measure!
Case Study: Pocket Navigator [Pielot2012CHI]
motivation: distraction
one in six (17%) cell-toting adults say they have been so distracted while talking or texting that they have physically bumped into another person or an object
Madden and Rainie, 2010, http://pewinternet.org/Reports/2010/Cell-Phone-Distractions.aspx
pocketnavigator
navigation system similar to Google Maps
runs on OpenStreetMap
key innovation: convey navigation information in vibration patterns
evaluated in a field study
vibration patterns found to be effective
they reduce level of distraction

but users were no experts
and did not use navigation support out of a necessity

Instead of bringing the user into the “lab”,
we bring the lab to the user’s daily life
Collecting data, Feb – Dec 2011

quick facts
18,000 downloads
mostly US and Europe

Between Feb – Dec 2011:
8,187 routes calculated
34,035,316 log entries
9,400 hours of usage

a lot of data! But …
pedestrian navigation?
we cannot prevent people from using the app anywhere, e.g. in cars
in fact, 87% of all log data are from indoor use
hence filtering (route length, travel time, movement speed) is required

lessons learned
double-check that you measure the intended use!
filtering data might be necessary
acknowledge the fact that there is always uncertainty!
[Pielot2012CHI]
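The filtering step described above, dropping recorded trips that cannot plausibly be pedestrian navigation, can be sketched as follows. The thresholds are invented for illustration; they are not the values used in the Pocket Navigator study:

```python
def is_pedestrian_trip(route_length_m, travel_time_s):
    """Heuristic filter: keep only trips that plausibly happened on foot."""
    if travel_time_s <= 0:
        return False
    speed_ms = route_length_m / travel_time_s
    return (
        50 <= route_length_m <= 20_000   # discard trivial and absurdly long routes
        and 60 <= travel_time_s          # discard trips shorter than a minute
        and 0.3 <= speed_ms <= 3.0       # roughly walking speed (1-10 km/h)
    )

trips = [
    (1200, 1000),  # 1.2 km in ~17 min: plausible walking
    (9000, 600),   # 9 km in 10 min: driving, filter out
    (10, 5),       # sensor noise
]
kept = [t for t in trips if is_pedestrian_trip(*t)]
```

Whatever thresholds are chosen, they should be reported with the results, since they directly shape what "usage" means in the analysis.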
Collecting information
You have to measure what you intend to measure!
Another Example: TypeIt
TypeIt
compare approaches to improve text entry
people play as long as they want
[Henze2012CHIa, Henze2012CHIb, Henze2012Text]
TypeIt
condition affects the number of played levels
4 conditions
An ANOVA shows that the feedback has a significant effect on the total number of levels played (p<.01).
Factor out the number of played levels using an ANCOVA.
“Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression.” (Wikipedia)
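The ANOVA mentioned above boils down to comparing between-group and within-group variance. A stdlib-only sketch of the F statistic for a one-way design; the per-condition level counts are invented, not TypeIt data:

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over lists of observations per group."""
    all_obs = [x for g in groups for x in g]
    n, k = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    # between-group sum of squares: how far group means sit from the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # within-group sum of squares: spread of observations around their group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within)

# levels played per participant, one list per feedback condition (invented numbers)
conditions = [
    [12, 15, 11, 14],
    [22, 25, 21, 24],
    [13, 12, 16, 15],
]
f = one_way_anova_f(conditions)  # compare against the F(2, 9) critical value
```

An ANCOVA extends this by adding a continuous covariate (here, the number of played levels) to the model, so the condition effect is tested after the covariate's contribution is removed; in practice one would use a statistics package rather than hand-rolling it.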
Ready for prime time
users don’t care if it’s a research prototype
low quality results in low ratings
and few installations

“FC the rabbit.... uninstalled” – Godimus Prime
“Stupid waste of time!!!” – cailan
“Stupid waste of time.” – lance
“Realy stupid” – hope
“1 word...... dumb!” – josue
“Its ok” – erika
“boring and dumb.” – Beba
“What the hell is this??” – Luci
“Boo!” – Cullen Girl
“5 stars if there is a way to turn the music off. Doesnt go to well with slipknot” – Allen
“Stupid and offincive to my pet rabbit bayleigh” – Logan
Ethical and legal issues
“Primum non nocere”/”First, do no harm” (Thomas Sydenham)
“One should treat others as one would like others to treat oneself” [Flew1979Dictionary]
Informed consent
Presentation highly affects the conversion rate
[Chart: conversion rates of 6.96%, 57.28%, 67.42%, and 87.57% for different presentations]
[Pielot2011ELV]

Participants aren’t aware what data is collected
[Morrison2011CHI]
Regulations
Which rules to follow?
e.g. the EU Data Protection Directive [Henderson2009HotPlanet]

“any information relating to an identified or identifiable natural person”
• Transparency: the persons whose data are being collected or accessed have the right to be informed when such data processing is taking place.
• Legitimate purpose: data can only be collected for specific purposes.
• Proportionality: data should be processed in a fashion that is not excessive beyond the purposes for which they were collected.
Outline
1. Limitations of common studies
2. Into the large
3. Types of studies
4. What is so special?
5. What works for us
6. Wrap up
… or what works for us
Games vs. Apps
our games are more successful
[Chart: number of installations per app (SINLA, MapExplorer, Poke the Rabbit, Tap It!, Type It!, Hit It!), ranging from 0 to 400,000]

there are more apps than games available in the Android Market
apps 84.4%, games 15.6%
http://www.androlib.com/appstatstype.aspx

players execute the strangest tasks
widgets and background services are perfect for longitudinal observations
but sometimes an app is just the only option
Informing the user
provide information in the Market
show a modal dialog at the first start
provide more information and a link to an about page
Publishing
fancy screenshots and icon (that’s the first thing someone sees)
title & description contain words users search for
of course I don’t want to miss a single user
prepare a dedicated webpage for each app
Playing with the market
frequent updates
rate your app as soon as it becomes available
Keep it simple
focused and specialized studies
learning by doing
release early, often, and try it again if it doesn’t work
Logging
use HTTP and port 80 to transmit data
store unaggregated measures [Henze2012CHI]
consider limited resources, seriously!

CSV files from ~400,000 users: 392,401 files, 27,331,383,646 bytes in total
Compressed binary data from less than 3,000 users
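The resource gap between the two logging formats on the slide can be illustrated in a few lines. The event stream is synthetic and the record layout is an assumption, not the format used in the deployments described above:

```python
import gzip
import struct

# 10,000 synthetic touch events: timestamp (ms), x, y
events = [(1_300_000_000_000 + i * 17, (i * 37) % 480, (i * 53) % 800)
          for i in range(10_000)]

# naive approach: one CSV text line per event
csv_bytes = "\n".join(f"{t},{x},{y}" for t, x, y in events).encode()

# fixed-width binary records (8-byte timestamp, two 2-byte coordinates), gzipped
packed = b"".join(struct.pack("<qHH", t, x, y) for t, x, y in events)
binary_bytes = gzip.compress(packed)

assert len(binary_bytes) < len(csv_bytes)  # binary+gzip is substantially smaller
```

The saving compounds on the server side: with hundreds of thousands of users, the difference between plain CSV and compressed binary is the difference between gigabytes and a manageable archive.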
Advertisements
does not work!
200$ for AdMob over a couple of days
TapSnap: http://tiny.cc/tapsnap

well, sometimes it does!
100$ for AppBrain on a single day
TypeIt II: http://tiny.cc/TypeIt2

focus all your efforts on a very short time
get additional users naturally
What to do?
No harm!
Inform the user
Don’t store data you don’t want

Choose a type of app
Games worked for me
But if you have a great system anyway...

Sell your study
You compete with commercial apps
Graphics, design, ...

Release
Keywords, description, ...
Rate and comment
Focus your advertisement efforts

Test it
Well, I don’t do that
At least fix it

Think about the data
Do you store everything interesting?
Can you store data from 10,000 users?
Can you analyse it?
small samples -> large
artificial context -> natural?
artificial task -> artificial task?
convenient samples -> very convenient samples
but how bad is it?
How to do Mobile HCI Research in the large?
Niels HenzeUniversity of StuttgartVisualization and Interactive Systems Institute
Martin PielotTelefónica I+DHCI and Mobile Computing Group
ethnography, controlled experiments, observations, … can all work in the large
collect data early, release often, be flexible
respect ethics, consider regulations
References
[Morrison2012CHI] Alistair Morrison, Donald McMillan, Stuart Reeves, Scott Sherwood, Matthew Chalmers: A Hybrid Mass Participation Approach to Mobile Software Trials. Proc. CHI, 2012.
[Wang2009NIME] Ge Wang: Designing Smule’s iPhone Ocarina. Proc. NIME, 2009.
[Zhai2009CHI] Shumin Zhai, Per Ola Kristensson, Pengjun Gong, Michael Greiner, Shilei Peng, Liang Liu, Anthony Dunnigan: Shapewriter on the iPhone: From the Laboratory to the Real World. Adjunct Proc. CHI, 2009.
[Gilbertson2008CiE] Paul Gilbertson, Paul Coulton, Fadi Chehimi, Tamas Vajk: Using 'Tilt' as an Interface to Control 'No Button' 3-D Mobile Games. ACM Computers in Entertainment, 2008.
[Oliver2010HotPlanet] Earl Oliver: The Challenges in Large-Scale Smartphone User Studies. Invited talk @ HotPlanet, 2010.
[McMillan2010RiL] Donald McMillan: iPhone Software Distribution for Mass Participation. Proc. Research in the Large Workshop @ UbiComp, 2010.
[Miluzzo2010RiL] Emiliano Miluzzo, Nicholas D. Lane, Hong Lu, Andrew T. Campbell: Research in the App Store Era: Experiences from the CenceMe App Deployment on the iPhone. Proc. Research in the Large Workshop @ UbiComp, 2010.
[Henze2011IJMHCI] Niels Henze, Martin Pielot, Benjamin Poppinga, Torben Schinke, Susanne Boll: My App is an Experiment: Experience from User Studies in Mobile App Stores. International Journal of Mobile Human Computer Interaction (IJMHCI), 2011.
[McMillan2010Pervasive] Donald McMillan, Alistair Morrison, Owain Brown, Malcolm Hall, Matthew Chalmers: Further into the Wild: Running Worldwide Trials of Mobile Systems. Proc. Pervasive, 2010.
[Cramer2010UbiComp] Henriette Cramer, Mattias Rost, Nicolas Belloni, Didier Chincholle, Frank Bentley: Research in the Large. Using App Stores, Markets, and Other Wide Distribution Channels in Ubicomp Research. Adjunct Proc. UbiComp, 2010.
[Morrison2010RiL] Alistair Morrison, Stuart Reeves, Donald McMillan, Matthew Chalmers: Experiences of Mass Participation in Ubicomp Research. Proc. Research in the Large Workshop @ UbiComp, 2010.
[Poppinga2010OMUE] Benjamin Poppinga, Martin Pielot, Niels Henze, Susanne Boll: Unsupervised User Observation in the App Store: Experiences with the Sensor-based Evaluation of a Mobile Pedestrian Navigation Application. Proc. OMUE @ NordiCHI, 2010.
[Pielot2011ELV] Martin Pielot, Niels Henze, Susanne Boll: Experiments in App Stores – How to Ask Users for their Consent? Proc. CHI Workshop on Ethics, Logs & Videotape, 2011.
[Henderson2009HotPlanet] Tristan Henderson, Fehmi Ben Abdesslem: Scaling Measurement Experiments to Planet-Scale: Ethical, Regulatory and Cultural Considerations. Proc. HotPlanet, 2009.
[Morrison2011CHI] Alistair Morrison, Owain Brown, Donald McMillan, Matthew Chalmers: Informed Consent and Users' Attitudes to Logging in Large Scale Trials. Adjunct Proc. CHI, 2011.
[Norcie2011ELV] Greg Norcie: Ethical and Practical Considerations for Compensation of Crowdsourced Research Participants. Proc. CHI Workshop on Ethics, Logs & Videotape, 2011.
[Girardello2010DSZ] Andrea Girardello, Florian Michahelles: Explicit and Implicit Ratings for Mobile Applications. 3. Workshop “Digitale Soziale Netze” at the 40. Jahrestagung der Gesellschaft für Informatik, Leipzig, 2010.
[Riccamboni2010IB] Rodolfo Riccamboni, Alessio Mereu, Chiara Boscarol: Keys to Nature: A Test on the iPhone Market. Tools for Identifying Biodiversity: Progress and Problems, 2010.
[Kuhn2010MM] Michael Kuhn, Roger Wattenhofer, Samuel Welten: Social Audio Features for Advanced Music Retrieval Interfaces. Proc. MM, 2010.
[Yan2011MobiSys] Bo Yan, Guanling Chen: AppJoy: Personalized Mobile Application Discovery. Proc. MobiSys, 2011.
[Budde2010IoT] Andreas Budde, Florian Michahelles: Product Empire – Serious Play with Barcodes. Proc. IoT, 2010.
[Karpischek2011RiL] Stephan Karpischek, Geron Gilad, Florian Michahelles: Towards a Better Understanding of Mobile Shopping Assistants – A Large Scale Usage Analysis of a Mobile Bargain Finder Application. Proc. Research in the Large Workshop @ UbiComp, 2011.
[Henze2010MobileHCI] Niels Henze, Susanne Boll: Push the Study to the App Store: Evaluating Off-Screen Visualizations for Maps in the Android Market. Proc. MobileHCI, 2010.
[Henze2010NordiCHI] Niels Henze, Benjamin Poppinga, Susanne Boll: Experiments in the Wild: Public Evaluation of Off-Screen Visualizations in the Android Market. Proc. NordiCHI, 2010.
[Hood2011IJTR] Jeffrey Hood, Elizabeth Sall, Billy Charlton: A GPS-based Bicycle Route Choice Model for San Francisco, California. Transportation Letters: The International Journal of Transportation Research, 2011.
[Henze2011MobileHCIa] Niels Henze, Enrico Rukzio, Susanne Boll: 100,000,000 Taps: Analysis and Improvement of Touch Performance in the Large. Proc. MobileHCI, 2011.
[Henze2011MobileHCIb] Niels Henze, Susanne Boll: Release Your App on Sunday Eve: Finding the Best Time to Deploy Apps. Adjunct Proc. MobileHCI, 2011.
[Henze2012CHIa] Niels Henze, Enrico Rukzio, Susanne Boll: Observational and Experimental Investigation of Typing Behaviour using Virtual Keyboards on Mobile Devices. Proc. CHI, 2012.
[Henze2012CHIb] Niels Henze: Hit It!: An Apparatus for Upscaling Mobile HCI Studies. Proc. CHI Extended Abstracts, 2012.
[Henze2012Text] Niels Henze: Ten Male Colleagues Took Part in Our Lab-Study About Mobile Texting. Proc. Workshop on Designing and Evaluating Text Entry Methods @ CHI, 2012.
[Watzdorf2010LocWeb] Stephan von Watzdorf, Florian Michahelles: Accuracy of Positioning Data on Smartphones. Proc. LocWeb, 2010.
[Ferreira2011Pervasive] Denzil Ferreira, Anind K. Dey, Vassilis Kostakos: Understanding Human-Smartphone Concerns: A Study of Battery Life. Proc. Pervasive, 2011.
[Buddharaju2010CHI] Pradeep Buddharaju, Yuichi Fujiki, Ioannis Pavlidis, Ergun Akleman: A Novel Way to Conduct Human Studies and Do Some Good. Adjunct Proc. CHI, 2010.
[Sahami2011CHI] Alireza Sahami, Michael Rohs, Robert Schleicher, Sven Kratz, Alexander Müller, Albrecht Schmidt: Real-Time Nonverbal Opinion Sharing through Mobile Phones during Sports Events. Proc. CHI, 2011.
[Verkasalo2010MB] Hannu Verkasalo: Analysis of Smartphone User Behavior. Proc. Ninth International Conference on Mobile Business, 2010.
[Böhmer2011MobileHCI] Matthias Böhmer, Brent Hecht, Johannes Schöning, Antonio Krüger, Gernot Bauer: Falling Asleep with Angry Birds, Facebook and Kindle – A Large Scale Study on Mobile Application Usage. Proc. MobileHCI, 2011.
[Agarwal2010HotNets] Sharad Agarwal, Ratul Mahajan, Alice Zheng, Victor Bahl: There’s an App for That, but It Doesn’t Work. Diagnosing Mobile Applications in the Wild. Proc. HotNets, 2010.
[Morrison2010RiL] Alistair Morrison, Matthew Chalmers: SGVis: Analysis of Mass Participation Trial Data. Proc. Research in the Large Workshop @ UbiComp, 2010.
[Lane2010CM] Nicholas D. Lane, Emiliano Miluzzo, Hong Lu, Daniel Peebles, Tanzeem Choudhury, Andrew T. Campbell: A Survey of Mobile Phone Sensing. IEEE Communications Magazine, 2010.