wait-learning: leveraging wait time for second language...

Wait-Learning: Leveraging Wait Time forSecond Language Education

Carrie J. Cai1 Philip J. Guo1,2 James R. Glass1 Robert C. Miller1

1MIT CSAILCambridge, MA USA

{cjcai, glass, rcm}@mit.edu

2University of RochesterRochester, NY [email protected]

ABSTRACTCompeting priorities in daily life make it difficult for thosewith a casual interest in learning to set aside time for regularpractice. In this paper, we explore wait-learning: leverag-ing brief moments of waiting during a person’s existing con-versations for second language vocabulary practice, even ifthe conversation happens in the native language. We presentan augmented version of instant messaging, WaitChatter, thatsupports the notion of wait-learning by displaying contextu-ally relevant foreign language vocabulary and micro-quizzesjust-in-time while the user awaits a response from her conver-sant. Through a two week field study of WaitChatter with 20people, we found that users were able to learn 57 new wordson average during casual instant messaging. Furthermore, wefound that users were most receptive to learning opportunitiesimmediately after sending a chat message, and that this tim-ing may be critical given user tendency to multi-task duringwaiting periods.

Author KeywordsWait-learning; Micro-learning; Second language learning.

ACM Classification KeywordsH.5.2. User interfaces: User-centered design; K.3.1. Com-puter Uses in Education: Computer-assisted instruction

INTRODUCTIONLearning a second language requires significant time and ef-fort on a recurring basis. Living in a country where the lan-guage is spoken is often necessary to reach fluency, yet manylanguage learners do not have the time to dedicate years oftheir lives to immersive instruction. Even for those who at-tempt to learn informally, the busyness of daily life makes itdifficult to schedule regular time for practice. Learners mustmaintain the executive motivation [12] necessary to study thesecond language on a repeated basis.

Despite the struggle to find time for learning, there are count-less moments in a day that go wasted, due to suboptimal

Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full cita-tion on the first page. Copyrights for components of this work owned by others thanACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re-publish, to post on servers or to redistribute to lists, requires prior specific permissionand/or a fee. Request permissions from [email protected] 2015, April 18–23, 2015, Seoul, Republic of Korea.Copyright is held by the owner/author(s). Publication rights licensed to ACM.ACM 978-1-4503-3145-6/15/04 ...$15.00.http://dx.doi.org/10.1145/2702123.2702267

Figure 1: WaitChatter presents vocabulary exercises in thelearning panel while the user awaits an IM response. Here,the user is being quizzed on a word and must enter the sec-ond language (L2) translation given the native language (L1)prompt. The word is highlighted in the chat history becauseit appeared in the context of the conversation.

scheduling or necessary waiting [19]. Recent work on micro-learning has explored ways to distribute traditional languagestudy into many micro moments throughout a person’s dailylife [16]. In this work, we extend micro-learning and intro-duce the notion of wait-learning. The novelty of this ap-proach lies in the targeted use of time that users would or-dinarily spend waiting. The proposed benefit is that learningduring wait time will minimize the likelihood that learning isperceived by learners as intrusive or time-consuming.

Due to inherent difficulties in predicting whether someone iswaiting based purely on tracking their activities [19], we turnto an activity that necessarily involves waiting within the ac-tivity itself. Instant messaging (IM) offers a powerful oppor-tunity for wait-learning due to its semi-asynchronous nature.Because messages being typed are unseen by the conversantand can be revised before being sent, a user must often waitfor a brief moment in anticipation of a response, with no guar-antee that the other party will in fact reply [28].

In this paper, we explore the notion of wait-learning withWaitChatter (Figure 1), an extension of instant messagingthat presents vocabulary exercises while the user awaits anIM response. The interaction is asymmetric, meaning that theuser can engage in learning without any knowledge of thisactivity by the other party. Based on findings from a wizard-of-oz exploratory study [7], we designed and implementedthe WaitChatter system as an extension of Google Chat inthe Chrome web browser. Through a two-week field studyin which 20 participants used WaitChatter on their personalcomputers, we find that users are most receptive to learningin the moments after sending a chat message. We also showthat wait-learning can facilitate learning. On average, partic-ipants learned 57 new vocabulary words over the course oftwo weeks, while exchanging on average 170 chats per day.

The main contributions of this paper are:

• the central idea of wait-learning, an extension of micro-learning that leverages wait time for education• a system that implements wait-learning by presenting

learning exercises at automatically-detected moments ofwaiting during instant messaging• a naturalistic evaluation that suggests users are most recep-

tive to wait-learning in the moments after sending a chatmessage

BACKGROUND AND RELATED WORKOur research on wait-learning extends existing work onmicro-learning to use wait time for education. It also buildsupon prior research on instant messaging and interruptions.

Micro-Learning and Contextual Micro-LearningBased on growing evidence that spaced exposure [11] andcontinued exposure [30] to second language vocabulary resultin greater learning gains, recent work on micro-learning [16]has explored methods to distribute learning into small unitsthroughout a person’s day-to-day life. Micro-learning haslargely been implemented in the form of contextual micro-learning, through mobile applications that provide opportu-nities to learn contextually relevant vocabulary based on sur-rounding objects [5], location [10, 14] and text [29].

Spaced exposure has been posited to benefit learning, even inthe absence of context [11]. Due to low interactivity, certaininstructional elements that can be reasonably learned in iso-lation may be more appropriate for micro-learning. For ex-ample, the study of vocabulary, computer terminology, andchemical symbols are posited to impose a lower cognitiveload in comparison to learning algebra, which has more in-teractive parts [26].

Prior work on contextual micro-learning has tended to focusmore on what the user is learning rather than when to presentthese learning opportunities. However, several researchershave aimed to make use of the transition time between tasksto present learning opportunities. Lernschoner [16] activatesa learning program whenever the learner’s computer screenbecomes idle. MicroMandarin [14] and Vocabulary Wall-paper [10] leveraged moments while users are visiting cer-tain locations to present location-sensitive vocabulary, though

even these studies tended to focus more on the content deliv-ered rather than the timing of those deliveries.

Micro-Waiting during Instant MessagingPrior research suggests that even micro-diversions during fre-quent activities could be perceived as disruptive if the userfeels that regular tasks are being delayed [29]. Beyond micro-learning, our work on wait-learning specifically targets mo-ments when users would ordinarily spend waiting.

Research on mobile micro-waiting suggests it is difficult touse activity information alone to predict idleness, possibly be-cause users may be more receptive to content delivered duringfleeting intervals of idle time within an ongoing activity [19].In one study on in-home interruptability, participants ratedwatching television as both a high and low engagement activ-ity depending on their goals [27].

During instant messaging, waiting often occurs within the ac-tivity itself. Unlike email and face-to-face communication,instant messaging conversations range from rapid exchangesto long periods of time passing between messages [23]. Dueto its semi-asynchronous nature, users frequently use waitingbreaks between conversation turns to attend to other informalactivities [18]. Prior work has explored ways to reduce theattention cost of multi-tasking, such as by predicting inter-ruptability [2, 15] and examining how IM notifications affectother tasks [9]. While these examples aim to decrease timewasted, our work instead makes use of this time by encourag-ing users to engage in learning while waiting for a response.

Attention and InterruptionAlthough wait-learning aims to leverage idle time, previouswork on interruptability suggests that the detailed manner andtiming of delivery can still cause significant differences intask performance and levels of frustration [1]. Because ourminds dynamically allocate and release resources throughouttask execution [20], the timing of information delivery rela-tive to a user’s ongoing task may affect interruption cost [4].

Because decreases in workload tend to be larger at bound-aries corresponding to larger chunks of a task [4, 22], a sys-tem that presents information during IM should differentiatebetween the many boundaries that might exist, favoring thosethat represent more salient breaks in workflow [4]. In onestudy on face-to-face communication, users processed addi-tional information best when presented with small batches ofinformation, and when the user was not speaking [24].

WAITCHATTER DESIGN AND IMPLEMENTATIONIn order to explore and empirically validate wait-learning as itapplies to instant messaging, we developed WaitChatter (Fig-ure 1) as an extension of Google Chat that runs in the Webbrowser when a Gmail page is in view. The client is packagedas a Chrome extension with a Python-based server backend.

User Interface DesignA central design challenge is the small amount of space avail-able for designing interactions. Because instant messaging isintended to be the main task, with learning as a secondary

(a) Study Mode for a non-contextual word.

(b) Study Mode for a con-textual word.

(c) Quiz Mode at easylevel (translate to L1).

(d) Quiz Mode at difficultlevel (translate to L2).

(e) Mastery panel slidesup when word is learned.

(f) Clicking the arrowfetches the next exercise.

Figure 2: WaitChatter teaches words during automatically detected wait-moments. Here are components of the user interface.

benefit, user interface elements should not demand a substan-tial context switch away from the IM conversation.

Based on findings from an exploratory study [7], the learningpanel is situated directly below the chat input box to minimizethe visual and motor effort of switching between chatting andlearning activities. To avoid user concern over whether chatconversants can view WaitChatter content, we keep the exer-cises within the learning panel, but highlight a keyword insidethe chat history if it is selected and presented for learning.

Vocabulary exercises appear and remain on the learning panelfor 10 seconds, during which the user can either do the exer-cise or ignore it by doing nothing. If the user has not in-teracted with the exercise for 10 seconds, it fades away. Tominimize disruption, the learning panel remains present at alltimes regardless of whether an exercise is present. After theuser completes an initial exercise, he can fetch a follow-upone by clicking the right arrow (Figure 2f) or hitting the rightarrow key on the keyboard. This functionality allows the userto continuously learn more words during longer wait times.

A vocabulary word is displayed in study mode (Figure 2a, 2b)the first time it is presented for learning, and in quiz mode(Figure 2c, 2d) during subsequent exposures. In study mode,the L2 (second language) vocabulary word is shown along-side a button which the user can click to view the L1 (nativelanguage) translation. Once a learner reveals the new word,the user is asked to indicate whether he or she already knewthat word. If not, the word is added to the learner’s vocabu-lary list, a growing list of words that the user did not alreadyknow but has been exposed to via WaitChatter. In quiz mode,the user is shown the L2 word and provides the L1 translationif the exercise is at the easy level (Figure 2c), or the reverseif the exercise is at the difficult level (Figure 2d). Users cansubmit a blank response if they don’t remember the word.

VocabularyWaitChatter aims to support the learning of second languagevocabulary. We limit our implementation to nouns because,in addition to being closely tied to the meaning of a conver-sation, nouns are also typically acquired before verbs and are

easier to translate automatically because they are less context-dependent. A related system [29] also focused on nouns forsimilar reasons.

Findings from a prior study indicate that contextual and non-contextual vocabulary serve complementary roles in learning,balancing needs across language levels, and that users maybenefit from a combination of the two [14]. Although thatstudy examined context with respect to location relevance, inWaitChatter the IM conversation itself provides a ripe oppor-tunity for in-context learning. The nouns in WaitChatter areeither drawn from a built-in word list (non-contextual, Fig-ure 2a), or selected on-the-fly from words used by either con-versant in the IM conversation (contextual, (Figure 2b). Non-contextual words could in theory be drawn from a number ofsources, such as a word bank seeded by the learner or teacher.

For each chat message exchanged during an IM conversation,WaitChatter determines whether an adequate contextual wordexists. First, it identifies nouns in the chat message using theSenna part-of-speech tagger [8]. Then, each noun is trans-lated on-the-fly using Google Translate. To maximize thechances that the word is translated correctly in context, Wait-Chatter sends both the word and the entire chat message toGoogle Translate, and considers an L1/L2 pair accepted onlyif the L2 word appears in both translation results. To miti-gate against inaccurate results, a dictionary icon (Figure 2f)is displayed which the user can click to see the word’s dic-tionary definition. For privacy reasons, WaitChatter logs onlythe length of a chat message and the L1/L2 pair displayed inan exercise, but not the content of the chat message itself.

Because exercises can only be displayed during appropriatewait-learning moments (described below), WaitChatter keepsa running set of accepted contextual L1/L2 pairs, for whichthe L1 word is still within the visible part of the chat history(viewport) but not yet displayed for learning. Once the wordscrolls off the viewport, WaitChatter discards the L1/L2 pairfrom being considered for learning.

Vocabulary Scheduling AlgorithmWaitChatter uses the Leitner schedule [17] for scheduling theorder of learning exercises. The Leitner schedule is based

on the principle of spaced repetition: given that humans ex-hibit a negatively exponential forgetting curve [13], repeti-tions should occur at increasingly spaced intervals so that theyare encountered just as they are about to be forgotten.

We define each flashcard in the system as an L1/L2 pair.WaitChatter maintains a set of five unlearned flashcards anda correct count for each flashcard, which represents the num-ber of correct responses to that flashcard. This count is in-cremented when the learner answers the flashcard correctlyand decremented if not. Flashcards with a correct count of nare displayed every nth Leitner session, so that better knowncards are reviewed less frequently. In WaitChatter, flashcardsare displayed at the easy level when the correct count is be-low three, and at the difficult level otherwise. A flashcardis “learned” and retired when its correct count reaches four(Figure 2e), opening up a slot for a new card to be added.

In WaitChatter, the Leitner algorithm was modified to enablea combination of contextual and non-contextual words. First,if an opportunity arises for a new card to be added, WaitChat-ter picks a contextual word if one is available. If no contex-tual word can be found, WaitChatter shows the next unusednon-contextual word at random from the same Leitner ses-sion. To capitalize on contextual opportunities, any word al-ready on the vocabulary list that re-appears in the context ofthe conversation is prioritized to be displayed at the next waitopportunity, regardless of the Leitner state.

During pilot studies, some users indicated a preference forseeing new words over repeatedly seeing old ones. To max-imize engagement, we rotate words within the same Leitnersession whenever a word is ignored so that learners are un-likely to ignore the same stale word consecutively. Further-more, if a new slot becomes available during a follow-up ex-ercise, WaitChatter will display the newly added flashcard atthe next initial exercise than showing it immediately, and in-stead display the next flaschard in the Leitner schedule. Sincefollow-up exercises are requested by the user whereas initialexercises are not, we display new flashcards during initial ex-ercises with the intent of maximizing engagement.

Detecting Waiting OpportunitiesResults from an exploratory study [7] identified two situationsin which a user may be waiting during an instant messagingconversation: 1) waiting for the conversant to start respond-ing, and 2) waiting for the conversant to finish responding.Figure 3 shows these two types of waiting opportunities inthe flow of a typical conversation.

The first case, which we name i sent, occurs after a user hassent a chat message and is waiting to see whether his conver-sant will respond. Because a common IM behavior is to type asequence of short chat messages as part of one conversationalturn [18, 21], an exercise that is naively delivered immedi-ately after a chat is sent may interrupt a follow-up messagethat the user is in the midst of composing. For this reason,WaitChatter waits for an additional amount of hesitation timeafter a message is sent, and subsequently triggers a learningexercise only if the user has not typed more. We used 1.5 sec-onds of hesitation time to balance against the user tendency to

Figure 3: Detection of waiting opportunities in WaitChatter.

leave the chat window after sending a message. According toa prior study [2], the probability that the message window isstill in focus is approximately 60% after 1 second, and dropsto approximately 50% within 5 seconds.

The second case, which we name you typing, occurs whenthe conversant has started typing a response but has not yetsent the message. In Google Chat and other similar instantmessaging applications, users see an indicator (i.e. “Corinneis typing...”) which signals that the conversant has startedtyping. WaitChatter triggers an exercise when the indicatorappears in the user’s chat window and the user is not typing.In both i sent and you typing, the exercise is only triggered ifthe cursor focus is still inside the chatbox.

The creation of an exercise requires server-side interaction, around-trip process that results in a small but unpredictable de-lay. WaitChatter will cancel an exercise before it is displayedif it detects that the user has typed between the time that thebrowser requested an exercise and received a server response.

EVALUATIONTo evaluate WaitChatter and the extent to which wait-learningcan help second language acquisition, we ran a two-weekfield study in which participants used WaitChatter in GoogleChat on their personal computers during their normal instantmessaging activities.

The questions our study sought to answer are:

• Learning: To what extent can users learn vocabulary usingWaitChatter?• Timing: What are the best times to present learning exer-

cises for the purpose of wait-learning?

VocabularyFor ease of user recruitment, our implementation of Wait-Chatter teaches Spanish and French, but could be extendedto other languages. Non-contextual vocabulary were drawnfrom high frequency English nouns as measured in the BritishNational Corpus.1 The words were translated to Spanish andFrench using Google Translate, after which two native speak-ers manually reviewed the word list for inaccurate transla-tions and highly ambiguous words. The final non-contextualword lists consisted of 446 words in each language.1http://www.natcorp.ox.ac.uk/

ProcedureParticipants first met with a researcher to have WaitChatter in-stalled on their personal computers. They were then given awalkthrough of how WaitChatter could be used. Participantsused WaitChatter as they pleased for the next two weeks. Dur-ing the study, WaitChatter prompted participants to indicatewhether or not they already knew a word the first time it ap-peared. Known words were not added to the user’s vocab-ulary list. At the end of two weeks, participants returnedto complete a post-study questionnaire, semi-structured in-terview, and vocabulary quiz. The quiz tested all vocabularythe user had been exposed to (but didn’t already know) whileusing WaitChatter, and was divided into two parts: partici-pants translated from L1 to L2 in the recall quiz, and from L2to L1 in the recognition quiz. Within each part, the order ofquestions was randomized. Users were asked not to guess.

Timing VariationTo better understand how the timing of exercises may af-fect the learner’s capacity to engage in learning, we exposedeach participant to two versions of our application: the de-tected wait version uses the i sent and you typing waiting op-portunities as described above. The random version displaysprompts at random whenever WaitChatter determines that auser has been actively instant messaging. We define a user’sinstant messaging activity as active if the Gmail page is infocus, has had keyboard or mouse activity within the last 30seconds, and contains at least one open chat window whichhad keyboard activity within the last 5 minutes. Because auser may be waiting for an instant messaging response whileengaged in nearby tasks on the same page, we consider mo-ments when the Gmail page is in focus even though the chat-box is not as viable candidates for wait-learning, so long asthe user is active as defined above. We also require at leastone chatbox to be open because an open chatbox indicatesthe likely presence of an ongoing conversation [2].

Each participant used the detected wait and random versionson alternating days. To ensure that users are exposed to Wait-Chatter prompts at approximately equal frequencies on thedetected wait and random versions, the desired frequencyon a random condition day was determined by calculatingthe cumulative frequency, which is the number of exercisesshown per time active, from the user’s previous detected waitcondition days. On random days, this desired frequency wasused every second the user was active to probabilistically de-termine whether an exercise should be shown.

We examined three metrics: whether the learner responded tothe exercise, the time taken to respond, and users’ subjectiveimpression of how well the exercises integrated into the flowof their activities. We define response time as the time be-tween a prompt being displayed and the user’s cursor focus-ing into the answer box. To capture subjective impressions,users were asked to complete a daily survey with two 7-pointLikert scale questions: 1) “In the past day, [WaitChatter] ex-ercises appeared at good moments within the flow of my dailyactivities”, and 2) “I enjoyed using [WaitChatter] today.” Thesurvey was sent via email every evening, instructing users tocomplete it once they finish chatting at the end of the day.

Participants21 participants were recruited by emails sent through univer-sity department, dorm, and foreign language course emaillists. We selected only those who were regular users ofGoogle Chat in the web browser, desired to learn or were cur-rently learning Spanish or French, and were not traveling formore than 2 days of the two-week study period. One partic-ipant was dropped midway through the study because theystopped instant messaging and completing the daily surveysafter the sixth day. Participants were given a $30 gift card fortheir time, and also entered into a raffle for one $100 gift card.

The 20 participants who completed the study included 12males and 8 females, ages 19 to 35 (mean=25.5). They con-sisted of mostly undergraduate and graduate students (17 outof 20), as well as two alumni working in industry and oneresearch scientist. Eleven chose to learn French and ninelearned Spanish. Ten users had prior formal education in thelanguage ranging from elementary school (2) and middle orhigh school (6) to university-level classes (2). Eight of theparticipants had studied the language informally through us-ing language learning software, traveling in a foreign country,or talking to friends. Six participants had never studied thelanguage before, either formally or informally. The partici-pants typically use Google Chat on their computers “Severaltimes an hour” (9) or “Several times a day” (11), mostly forsocial reasons or to chat with labmates about research.

RESULTS AND DISCUSSIONOverall, we observed 47393 instant messages exchanged bythe 20 participants, who communicated with a total of 249friends. Users exchanged an average of 170 chats per day.

LearningIn just two weeks of casual usage, participants were on aver-age able to recall 57 new words, equivalent to approximatelyfour words per day. Participants were on average exposedto 106.7 words, 87.7 (sd=64.8) of which they didn’t alreadyknow. Among those, approximately 40-60% of words werecognates. In post-study quizzes, users translated 57.1 words(66%) correctly to L2 and 80.2 words (92%) correctly to L1.In quiz translations to L2, 15% of wrong answers appearedto be spelling errors or near-misses. The user who was ex-posed to the most new words (256) translated 161 correctlyto L2 and 232 correctly to L1. The most infrequent chatter(55 chats per day) learned 17 new words. These results sug-gest that down time during instant messaging can serve as aviable channel for learning, at least for bite-sized information.

Some participants wished old words they had already mas-tered could be revived after a few days so that they wouldn’tbe as easily forgotten. Because there was no limit on thenumber of times a learner could continuously fetch more ex-ercises, it was possible for a user to “learn” a flashcard in asingle sitting. A spacing algorithm that incorporates tempo-ral effects, such as that described in [25], could be used in thefuture to improve long term retention.

Usage PatternsDuring post-study interviews, users reported behavior that re-sembled episodes in which they were learning while waiting.

For example, participants said they tended to complete exer-cises “while waiting for people to respond,” “while the otherperson is thinking,” or “when it was unpredictable how longmy friend would be gone for.”

Users indicated that they were most likely to interact withWaitChatter during a continuous but casual IM conversation.They were least likely to engage with exercises if they werehaving a particularly time-sensitive conversation or if the na-ture of the conversation was serious or work-related. As oneuser put it, “The best times are when I’m talking continuouslywith one person, but we’re not having a very heated conver-sation. Just like hi, how are you, and when the material ismore light.” Thus, WaitChatter usage seemed to occur duringperiods of “outeraction” [23], when people communicate forthe purpose of maintaining social connection and awareness,rather than for specific information exchange.

High-usage participants indicated that they frequently usedthe “fetch more” feature to complete a long sequence of exer-cises if the conversation was particularly sporadic: “At somepoints I needed to wait for the other person to respond. Thelonger they take, the more words I would go through.” Oth-ers were more likely to use WaitChatter to complete only oneor two exercises in the transition time between chatting and aprimary task. Because instant messaging is itself a commonbreak-time activity, IM functioned as a bootstrap for learningwhen the main task did not require high mental effort: “If I’mdoing something else at the same time like packing and stuff,I might hear the ping, read the reply, do a word or two, get upand go back to what I was doing.”

Overall, we found that people tend to fill their wait timeby doing casual tasks, and that wait-learning in many waysserved as a timely replacement for those alternative down-time activities: “There were definitely times when I wouldkeep clicking the arrow because whatever’s on TV was reallyboring.” Another user remarked, “Maybe I’m just chattingand looking at Facebook. Instead I would use WaitChatterbecause it’s more productive.”

Responsiveness to Timing ConditionsWe first measure the amount of time between messages dur-ing IM conversations to understand the extent to which learn-ing exercises can be feasibly completed during this time. Wefound that the time between the user sending a message andreceiving a message from the conversant (Figure 4) exhibitsa distribution with a peak and a long tail, similar to respon-siveness distributions reported in a prior IM study [2]. Thisintermessage distribution includes instances in which the con-versant has started composing a message before the user hassent his message. The time taken to complete an exercise wasshort (median=1.83 seconds), well within the median inter-message time (11 seconds). However, because intermessagetime is most commonly short (mode=4 seconds), particularlyduring conversations with frequent exchanges, it is reason-able that learning exercises be lightweight.

To understand how the user’s ability to respond to exercisescould be affected by different timing conditions, we looked atthe following measures: whether the user responded to the ex-

Figure 4: Histogram of intermessage time: the time betweenthe user sending a message and receiving a message from theconversant. Bin size is 1 second.

ercise, and the time taken to respond, as defined in the TimingVariation section above. For each user, data on the first de-tected wait condition day and the first random condition daywere excluded from analysis to avoid novelty effects. Fur-thermore, because follow-up exercises are requested by theuser whereas initial exercises are not, we focus our analysisonly on initial quiz exercises.

Because i sent and you typing exercises occurred only whilethe user had cursor focus inside the chatbox, we subdividedexercises on the random condition days into random inside,when the chatbox had focus, and random outside, whenthe chatbox did not have focus. We compared i sent andyou typing only to random inside trials. In all cases, the userhad been actively instant messaging as defined in the Tim-ing Variation section above. Table 1 displays a summary ofconditions.

Whether the learner respondedWe found that 43.5% of the exercises received a response inthe random inside condition, whereas only 31.2% receiveda response in the random outside condition. A generalizedlinear mixed effects analysis with the Timing condition (ran-dom inside and random outside) as the fixed effect and Par-ticipant as a random effect found that random outside de-liveries were significantly less likely to receive a response(p<0.0001). This analysis excludes one participant who didnot receive any random outside exercises. Consistent with aprior study [2] which found that chat window focus may bea strong indicator for chat responsiveness, our results implythat this applies to learning interactions as well.

Comparing i sent, you typing, and random inside, we foundthat response rate was highest for i sent (49.1%), followedby random inside (44.5%) and you typing (41.2%). Ina generalized linear mixed effects analysis, p-values were0.0085 (i sent>you typing), 0.0498 (i sent>random inside,and 0.397 (you typing=random inside), of which only the

Condition Day Shown Condition Requirementi sent detected wait (odd days) 1.5 seconds after user sends a chat, provided he has not started typing againyou typing detected wait (odd days) “[conversant name] is typing...” indicator appears and user is not typingrandom inside random (even days) Chat window is in focusrandom outside random (even days) Chat window is not in focus

Table 1: Summary of conditions.

first passed the Bonferroni-corrected threshold of 0.0167(i sent>you typing).

The relatively low response rate in the you typing conditioncould be due to a number of factors. you typing moments mayoccur when users are visually focused on another screen, evenif the cursor is focused inside the chatbox. Conversely, ran-dom inside prompts that happen to be delivered mid-typingare likely to be noticed, and could conceivably be attended tobefore the exercise disappears. Moreover, as chat messagestend to be short [21], users may already be receiving a re-sponse by the time they are able to attend to the exercise.

Time to respondAs shown in Figure 5, response time for the random insidecondition (mean=3973 ms, sd=792) was faster than that of therandom outside condition (mean=4888, sd=1282). A pairedt-test found this difference to be significant (F(1,16)=13.24,p<0.005). This analysis excludes three participants who didnot respond to any random outside exercises. The longer timeto respond in the random outside condition could be due tothe mental cost of switching from another activity, or the extradistance traveled to reach the answer box.

Comparing i sent, you typing, and random inside (Fig-ure 6), we found that i sent had the fastest responsetime (mean=3628, sd=637), followed by random inside(mean=4009, sd=792), and you typing (mean=4209,sd=969). A repeated measures ANOVA showed a significanteffect of condition (F(2,38) = 4.00, p<0.05). Post-hoc Bon-ferroni tests revealed that learners were significantly quickerto engage with prompts delivered during the i sent conditionthan those presented during random inside (p<0.05). Userswere also quicker to engage with i sent than you typing, adifference that was marginally significant (p=0.05).

Figure 5: Mean response times for random inside and ran-dom outside exercises. Error bars show SE of the mean.

Results suggest that exercises appearing during you typingwere not easier to process while conversing, despite occur-ring at a detected wait moment. It is possible that, duringyou typing, users are already in the midst of planning theirnext message, or concentrating on what their friend will sayin their response. The mental resources released may besmall due to a large carryover of information being activelymaintained in short-term memory, which is characteristic oflow-level task boundaries or non-boundaries [4]. Conversely,i sent occurs almost immediately after the user has finishedtyping, and may thus be more consistent with moments oflower mental workload because it occurs between the com-pletion of one subtask and the beginning of the next [22].

During the study, it appeared that in the random inside con-dition, many users responded quickly to an exercise evenif it appeared while they were typing. We thus examinedresponse times depending on the number of keystrokes re-maining in the message being composed (Figure 7). Wesplit these instances into two equally sized groups, andfound that the response time of prompts arriving just beforethe user sent a chat (within the last 8 keystrokes) is lower(median=2216ms, sd=2248) than those arriving earlier (me-dian=3446ms, sd=2175). This result suggests that exercisesarriving immediately prior to message completion may re-ceive fast engagement, possibly because the user is almostdone typing and has mentally queued up the exercise.

Lastly, we evaluate the 1.5 second hesitation time that Wait-Chatter used to ensure the user was not typing a followupmessage before it delivered a learning exercise in the i sentcondition. Due to missing log data for some users, we reportpreliminarily on the seven users for whom we had a full setof millisecond-accurate data, including all times they pressedthe enter key. We limit our analysis to events where the user

Figure 6: Mean response times for i sent, you typing, andrandom inside. Error bars show SE of the mean.

Figure 7: Box plots of response times to exercises arrivingjust before a chat is sent (≤ 8 keystrokes left), compared tolonger before a chat is sent (> 8 keystrokes left).

Figure 8: If the user has started typing within 30 secondsof sending a chat, there is a 70% chance that this re-typingoccurs within 1.5 seconds. In contrast, if the user starts typingwithin 30 seconds after receiving a chat, there is only a 18%that it occurs within 1.5 seconds.

re-started typing within 30 seconds, based on prior researchfindings that 28-30 seconds is a typical amount of time be-tween conversational turns [3]. Among these instances, 70%occurred within WaitChatter’s hesitation threshold of 1.5 sec-onds, making it a reasonable estimate (Figure 8). Neverthe-less, the hesitation threshold ought to be balanced against theuser tendency to leave the chat window after sending a mes-sage. A more lenient implementation of WaitChatter mightset the hesitation threshold to be even lower than 1.5 seconds.

Perception of Time and DisruptionBecause WaitChatter specifically makes use of wait time, wewere interested in how users perceived the time and effortthey spent on learning, as compared to alternative learningmethods. Here, we report on results from the post-study ques-tionnaires and interviews related to this question.

Perception of Time SpentAs shown in Figure 9, responses to 7-point Likert scale ques-tions (1=strongly disagree, 7=strongly agree) indicated that

users found WaitChatter very enjoyable (mean=6.15). Theyalso felt that they would continue using WaitChatter if theycould (mean=6.15), and would engage in vocabulary practicemore frequently than they would otherwise (mean=6.6). Onthe last question, 15 out of 20 participants submitted a ratingof 7.

Figure 9: Post-study questionnaire results, with standard er-ror bars.

During interviews, nearly every participant expressed thatWaitChatter felt less time-consuming compared to otherchannels of learning because they did not need to set asidetime for learning. As one user stated, “The key thing is that Ididn’t feel like I was taking extra time out of my day to dedi-cate to learning vocabulary words. It was just sort of time thatwould be wasted otherwise.” Another compared it to typicalbreak-time activities: “Some people play Angry Birds, but forme, I would play with [WaitChatter]. At least I’m learningsome French words.”

Many contrasted WaitChatter to language courses and soft-ware, which they felt required a conscious effort to scheduletime for learning. One person commented, “With Duolingoyou have to think ‘I have to go do this now’, whereas with[WaitChatter] it’s already done for you, spoonfed to you.”Another said, “With this I never had to make time or put awaythings to the future. Whereas learning from Rosetta Stone,you have to schedule time.” Most who had used vocabulary-learning mobile applications indicated that they eventuallygave up, citing time as a major factor.

Several users noted that the little time required to completea WaitChatter exercise ironically encouraged them to inter-act more with it overall. One person described the low timecommitment as follows: “It’s just like, here, literally justtake 2 seconds!” Another appreciated the regularity of ex-posure: “You’re just constantly getting new words. It mightbe a slower rate overall [compared to classes], but it’s neat inthe aspect that you’re always chugging along, learning newvocabulary.” These comments suggest that user engagementcould hinge more on the perceived than the actual time spent.

Perception of DisruptionOverall, participants felt that WaitChatter integrated well intotheir daily lives because they already instant message regu-larly: “I already have gmail open and I’m always chattingwith people anyway.” Furthermore, the ease of access waskey to their frequent usage: “It being so close to the chatbox

made me do it more. If it was a separate thing it wouldn’t beas easy – an extra click away is a large amount of effort.”

In the 7-point Likert scale questions sent daily (1=stronglydisagree, 7=strongly agree), users on average felt that the ex-ercises “appeared at good moments within the flow of mydaily activities” (mean=5.45, sd=1.05) and that they “enjoyedusing WaitChatter today” (mean=5.61, sd=1.02). A WilcoxonSigned-rank test found no significant difference between userratings on detected wait versus random condition days, foreither question (p=0.71 and p=0.57).

During interviews, users indicated that they did not noticesystematic differences in the timing of exercises, and that theappearance of exercises almost never felt disruptive becausethey always had the option to ignore them: “It was just amatter of choice. Subconsciously I would see that a word hasappeared. Sometimes it would pique my interest and I wouldlook at it, but if not I just wouldn’t look at it so it wasn’t reallydisrupting anything.” These findings are consistent with thenotion that interruptions which do not occlude the primarytask are perceived to be less mentally demanding and less an-noying [6]. Unlike pop-up notifications, the learning panelwas in a persistent self-allocated space, making it potentiallyless distracting even in cases when timing was sub-optimal.

Some users reported feeling frustrated when they could notattend to the exercise in time because they were still typing along message. Hence, while users did not perceive the timingof appearances to be intrusive, they may be less tolerant ofthe premature disappearance of an exercise.

LIMITATIONS AND FUTURE WORKOne limitation of this study design was that the frequency ofsurveys may have been too coarse-grained to capture moretransient impressions. We sent the survey only once a dayto prevent the frequency of the survey itself from negativelyimpacting user experience. Future studies may consider sam-pling user impressions closer in time to the user’s actual en-counter with learning exercises. In addition, we alternatedtiming conditions on a daily basis to balance the frequency ofexercises. While it is possible that more frequent switchingcould better mask the conditions, users expressed during theinterviews that they did not detect a noticeable pattern in thetiming of exercises.

Furthermore, our implementation was limited to vocabularylearning. Some users wished the system could additionallyteach sentence structures and phrases. It remains to be seenwhether more complex structures can be taught in a way thatstill preserves the bite-sized nature of wait-learning. For ex-ample, a more sophisticated system could present more com-plex exercises when users actively fetch more exercises, indi-cating that they are open to learning more, or when the wait-ing period is particularly long.

DESIGN IMPLICATIONSThe timing and manner in which learning components appearduring waiting opportunities can have important implicationsfor actual end-user use. In the following sections, we considerfour implications related to the use of wait time for education.

Engaging Prior to Context-SwitchingLearning exercises should be delivered immediately at thestart of or prior to waiting, rather than times that are wellinto the waiting period. Because users are likely to attend toother tasks while waiting, they may be less likely to engage inlearning once they have already shifted attention to other de-fault waiting activities. Results from our user study indicatethat educational prompts delivered right after the user has senta message tend to receive the fastest engagement, and that thismay be true even if exercises are presented within the last fewkeystrokes while the user is still typing.

Multiplexing the Learning ComponentTo ensure a non-intrusive user experience, learning compo-nents should appear in a way that allows the user to be suffi-ciently informed of the arrival of a learning opportunity with-out losing interactivity with their main task. Participants inour study found WaitChatter to be non-intrusive even whenexercises appeared amidst typing, citing their degree of con-trol and ability to ignore prompts as primary reasons.

Extending Engagement with Optional ContinuationsBecause the system could be imperfect in predicting when auser is interruptable, educational prompts should serve as anentryway to further learning. If a prompt is well-timed andthe user chooses to interact, this opportunity should be capi-talized to encourage continuation of learning. In WaitChatter,the right-arrow enabled users to fetch more exercises if theirwaiting period was particularly long, and in some cases gen-erated momentum such that the user continued fetching untilshe had reached a particular learning goal.

Using Micro-Structures to Complement Micro-WaitingBecause expected time commitment may affect engagementmore than actual time spent, learning exercises designed tocapture user attention during waiting periods should be bite-sized and low-pressure. Users reported that a core benefit ofWaitChatter was that it demanded very low time commitment,which ironically invited more frequent usage. In line withprior research on education [26], our research suggests that,beyond vocabulary flashcards, general learning tasks whichimpose a low cognitive load and can be reasonably learned inisolation may be more appropriate for wait-learning.

CONCLUSIONIn this paper, we introduced the idea of wait-learning andevaluated a system that offers wait-learning opportunities inIM. We found that exercises displayed at the beginning of apotential waiting period receive higher engagement, and thatexercises should be optional and demand a low cognitive loadto minimize intrusiveness. While this work investigated theeffectiveness of wait-learning within a single medium (IM),the general classes of measures that were investigated – theamount learned during wait time and the timing of learningopportunities – occur in other situations and generalize toother potential forms of wait-learning. In the future, it wouldbe beneficial to investigate wait-learning in other media con-texts and educational domains, enticing even the busiest usersto engage in life-long learning.

ACKNOWLEDGMENTSWe thank Max Goldman and the members of the UID groupat MIT CSAIL for all their help and feedback. This work wasfunded by Quanta Computer and MIT Lincoln Laboratory.

REFERENCES1. Adamczyk, P. D., and Bailey, B. P. If not now, when?:

the effects of interruption at different moments withintask execution. In CHI ’04, ACM (2004), 271–278.

2. Avrahami, D., Fussell, S. R., and Hudson, S. E. Imwaiting: timing and responsiveness in semi-synchronouscommunication. In CSCW ’08, ACM (2008), 285–294.

3. Avrahami, D., and Hudson, S. E. Communicationcharacteristics of instant messaging: effects andpredictions of interpersonal relationships. In CSCW ’06,ACM (2006), 505–514.

4. Bailey, B. P., and Iqbal, S. T. Understanding changes inmental workload during execution of goal-directed tasksand its application for interruption management. TOCHI14, 4 (2008), 21.

5. Beaudin, J. S., Intille, S. S., Tapia, E. M., Rockinson, R.,and Morris, M. E. Context-sensitive microlearning offoreign language vocabulary on a mobile device. InAmbient Intelligence. Springer, 2007, 55–72.

6. Bohmer, M., Lander, C., Gehring, S., Brumby, D. P., andKruger, A. Interrupted by a phone call: exploringdesigns for lowering the impact of call notifications forsmartphone users. In CHI ’14, ACM (2014), 3045–3054.

7. Cai, C. J., Guo, P. J., Glass, J., and Miller, R. C.Wait-learning: leveraging conversational dead time forsecond language education. In CHI’14 ExtendedAbstracts, ACM (2014), 2239–2244.

8. Collobert, R., Weston, J., Bottou, L., Karlen, M.,Kavukcuoglu, K., and Kuksa, P. Natural languageprocessing (almost) from scratch. The Journal ofMachine Learning Research 12 (2011), 2493–2537.

9. Czerwinski, M., Cutrell, E., and Horvitz, E. Instantmessaging: Effects of relevance and timing. In Peopleand computers XIV: Proceedings of HCI, vol. 2, BritishComputer Society (2000), 71–76.

10. Dearman, D., and Truong, K. Evaluating the implicitacquisition of second language vocabulary using a livewallpaper. In CHI ’12, ACM (2012), 1391–1400.

11. Dempster, F. N. Effects of variable encoding and spacedpresentations on vocabulary learning. Journal ofEducational Psychology 79, 2 (1987), 162.

12. Dornyei, Z., and Otto, I. Motivation in action: A processmodel of l2 motivation. Working Papers in AppliedLinguistics (1998), 43–69.

13. Ebbinghaus, H. Memory: A contribution to experimentalpsychology. No. 3. Teachers college, Columbiauniversity, 1913.

14. Edge, D., Searle, E., Chiu, K., Zhao, J., and Landay,J. A. Micromandarin: mobile language learning incontext. In CHI ’11, ACM (2011), 3169–3178.

15. Fogarty, J., Hudson, S. E., Atkeson, C. G., Avrahami,D., Forlizzi, J., Kiesler, S., Lee, J. C., and Yang, J.Predicting human interruptibility with sensors. TOCHI12, 1 (2005), 119–146.

16. Gassler, G., Hug, T., and Glahn, C. Integrated microlearning–an outline of the basic method and first results.Interactive Computer Aided Learning 4 (2004).

17. Godwin-Jones, R. Emerging technologies from memorypalaces to spacing algorithms: approaches tosecondlanguage vocabulary learning. Language,Learning & Technology 14, 2 (2010).

18. Isaacs, E., Walendowski, A., Whittaker, S., Schiano,D. J., and Kamm, C. The character, functions, and stylesof instant messaging in the workplace. In CSCW ’02,ACM (2002), 11–20.

19. Isaacs, E., Yee, N., Schiano, D. J., Good, N.,Ducheneaut, N., and Bellotti, V. Mobile microwaitingmoments: The role of context in receptivity to contentwhile on the go.

20. Kahneman, D. Attention and effort. Citeseer, 1973.

21. Ling, R., and Baron, N. S. Text messaging and imlinguistic comparison of american college data. Journalof Language and Social Psychology 26, 3 (2007),291–298.

22. Miyata, Y., and Norman, D. A. Psychological issues insupport of multiple activities. User centered systemdesign: New perspectives on human-computerinteraction (1986), 265–284.

23. Nardi, B. A., Whittaker, S., and Bradner, E. Interactionand outeraction: instant messaging in action. In CSCW’00, ACM (2000), 79–88.

24. Ofek, E., Iqbal, S. T., and Strauss, K. Reducingdisruption from subtle information delivery during aconversation: mode and bandwidth investigation. In CHI’13, ACM (2013), 3111–3120.

25. Pavlik, P. I., and Anderson, J. R. Using a model tocompute the optimal schedule of practice. Journal ofExperimental Psychology: Applied 14, 2 (2008), 101.

26. Sweller, J., Van Merrienboer, J. J., and Paas, F. G.Cognitive architecture and instructional design.Educational psychology review 10, 3 (1998), 251–296.

27. Takemae, Y., Chaki, S., Ohno, T., Yoda, I., and Ozawa,S. Analysis of human interruptibility in the homeenvironment. In CHI’07 Extended Abstracts, ACM(2007), 2681–2686.

28. Tree, J. E. F., Mayer, S. A., and Betts, T. E. Groundingin instant messaging. Journal of Educational ComputingResearch 45, 4 (2011), 455–475.

29. Trusty, A., and Truong, K. N. Augmenting the web forsecond language vocabulary learning. In CHI ’11, ACM(2011), 3179–3188.

30. Webb, S. The effects of repetition on vocabularyknowledge. Applied Linguistics 28, 1 (2007), 46–65.

wait-learning: leveraging wait time for second language...

Documents