engaging children as a storyteller: backchanneling models...

1
Engaging Children as a Storyteller: Backchanneling Models for Social Robots Mirko Gelsomini * , Hae Won Park, Jin Joo Lee, Cynthia Breazeal MIT Media Lab 20 Ames Street, E15-468, Cambridge, MA, USA gelso, haewon, jinjoo, [email protected] ABSTRACT In this video, we provide an overview of the analyses, de- sign, and evaluation of a backchannel opportunity prediction (BOP) model for a social robot listener. 1. AN ATTENTIVE ROBOT LISTENER Our aim is to develop a robot companion that can as- sess children’s language skills and storytelling ability and present a personalized context to effectively improve their syntactic and lexical skills. In the process, it is important for a robot listener to convey its attention to the child story- teller through listener responses, i.e., backchanneling. Here, we provide an analysis of young children’s nonverbal be- havior with respect to how they encode and decode listener responses and speaker cues. Based on our findings, we de- veloped a BOP model that detects four main speaker-cue events based on prosodic features in speech. We evaluated this model in a human-subjects study where children told stories to an audience of two robots, one demonstrating a contingent backchannel behavior and the other providing non-contingent responses. A full description of this work is presented in [1]. We still have very little knowledge regarding young chil- dren’s speaking and listening dynamics and how a robot companion should decode these behaviors and encode its own in a way children can understand. From a video corpus of preschool-aged child dyads taking turns telling stories to each other, we identified backchannel behaviors – partner gazes, leaning toward, smiles, utterances, brow raises, and nods – that indicate a positive engagement state of the lis- tener. We also identified speaker cues—gaze, pitch, word, energy, pauses taken singly and in combinations— that chil- dren listeners acknowledge and respond to. From a within subject study, we characterized the bidirectionality of non- verbal behaviors in that children understand the function of nonverbal behaviors both in the role of the communica- * Mirko Gelsomini is also affiliated with Politecnico di Mi- lano, Italy. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). HRI ’17 Companion March 06-09, 2017, Vienna, Austria c 2017 Copyright held by the owner/author(s). ACM ISBN 978-1-4503-4885-0/17/03. DOI: http://dx.doi.org/10.1145/3029798.3036651 Figure 1: Tega detects prosodic cues from the sto- ryteller’s speech to provide backchannel feedback. tor and recipient. Children can decode nonverbal behaviors they encode. Based on our findings, we developed a rule-based BOP models based on wordiness, long pause, and combinations of pitch & pause, and energy & pause events. Twenty-three children were recruited to tell stories to two “twin” Tegas (Figure 1), contingent and non-contingent backchanneling robots. We hypothesize that children will identify the robot with contingent backchannel behavior more attentive, shown both in children’s behavioral and subjective measures. The results of this study show that our BOP model produces contingent backchanneling responses that convey more at- tentive listening behavior, and children prefer telling stories to the BOP model robot. More specifically, we found that: 1) Children direct their storytelling more toward the con- tingent BOP robot, 2) children express more positive affect toward the BOP contingent robot, and 3) children perceive the BOP contingent robot as more attentive and interested in their story. Our aim was to evaluate two crucial questions that mo- tivate the design and development of social robot learning companions that can successfully foster early language skills of preschoolers. Can a social robot generate backchannel behaviors that are perceived as attentiveness by preschool children? Is contingency (as opposed to frequency) of agent feedback crucial toward creating these perceptions? Our study suggests that contingency matters as it leads children to appropriately direct their storytelling to their audience, conveys more attentive listening and gazing behavior, and gains children’s preference. 2. REFERENCES [1] H. W. Park, M. Gelsomini, J. J. Lee, and C. Breazeal. Telling stories to robots: The effect of backchanneling on a child’s storytelling. In Proceedings of the 12th ACM/IEEE international conference on Human robot interaction. ACM, 2017.

Upload: others

Post on 23-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Engaging Children as a Storyteller: Backchanneling Models ...robotic.media.mit.edu/wp-content/uploads/sites/7/2017/01/gelsomini… · Engaging Children as a Storyteller: Backchanneling

Engaging Children as a Storyteller:Backchanneling Models for Social Robots

Mirko Gelsomini∗

, Hae Won Park, Jin Joo Lee, Cynthia BreazealMIT Media Lab

20 Ames Street, E15-468, Cambridge, MA, USAgelso, haewon, jinjoo, [email protected]

ABSTRACTIn this video, we provide an overview of the analyses, de-sign, and evaluation of a backchannel opportunity prediction(BOP) model for a social robot listener.

1. AN ATTENTIVE ROBOT LISTENEROur aim is to develop a robot companion that can as-

sess children’s language skills and storytelling ability andpresent a personalized context to effectively improve theirsyntactic and lexical skills. In the process, it is importantfor a robot listener to convey its attention to the child story-teller through listener responses, i.e., backchanneling. Here,we provide an analysis of young children’s nonverbal be-havior with respect to how they encode and decode listenerresponses and speaker cues. Based on our findings, we de-veloped a BOP model that detects four main speaker-cueevents based on prosodic features in speech. We evaluatedthis model in a human-subjects study where children toldstories to an audience of two robots, one demonstrating acontingent backchannel behavior and the other providingnon-contingent responses. A full description of this workis presented in [1].

We still have very little knowledge regarding young chil-dren’s speaking and listening dynamics and how a robotcompanion should decode these behaviors and encode itsown in a way children can understand. From a video corpusof preschool-aged child dyads taking turns telling stories toeach other, we identified backchannel behaviors – partnergazes, leaning toward, smiles, utterances, brow raises, andnods – that indicate a positive engagement state of the lis-tener. We also identified speaker cues—gaze, pitch, word,energy, pauses taken singly and in combinations— that chil-dren listeners acknowledge and respond to. From a withinsubject study, we characterized the bidirectionality of non-verbal behaviors in that children understand the functionof nonverbal behaviors both in the role of the communica-

∗Mirko Gelsomini is also affiliated with Politecnico di Mi-lano, Italy.

Permission to make digital or hard copies of part or all of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for third-party components of this work must be honored.For all other uses, contact the owner/author(s).

HRI ’17 Companion March 06-09, 2017, Vienna, Austriac© 2017 Copyright held by the owner/author(s).

ACM ISBN 978-1-4503-4885-0/17/03.

DOI: http://dx.doi.org/10.1145/3029798.3036651

Figure 1: Tega detects prosodic cues from the sto-ryteller’s speech to provide backchannel feedback.

tor and recipient. Children can decode nonverbal behaviorsthey encode.

Based on our findings, we developed a rule-based BOPmodels based on wordiness, long pause, and combinationsof pitch & pause, and energy & pause events. Twenty-threechildren were recruited to tell stories to two “twin” Tegas(Figure 1), contingent and non-contingent backchannelingrobots. We hypothesize that children will identify the robotwith contingent backchannel behavior more attentive, shownboth in children’s behavioral and subjective measures. Theresults of this study show that our BOP model producescontingent backchanneling responses that convey more at-tentive listening behavior, and children prefer telling storiesto the BOP model robot. More specifically, we found that:1) Children direct their storytelling more toward the con-tingent BOP robot, 2) children express more positive affecttoward the BOP contingent robot, and 3) children perceivethe BOP contingent robot as more attentive and interestedin their story.

Our aim was to evaluate two crucial questions that mo-tivate the design and development of social robot learningcompanions that can successfully foster early language skillsof preschoolers. Can a social robot generate backchannelbehaviors that are perceived as attentiveness by preschoolchildren? Is contingency (as opposed to frequency) of agentfeedback crucial toward creating these perceptions? Ourstudy suggests that contingency matters as it leads childrento appropriately direct their storytelling to their audience,conveys more attentive listening and gazing behavior, andgains children’s preference.

2. REFERENCES[1] H. W. Park, M. Gelsomini, J. J. Lee, and C. Breazeal. Telling

stories to robots: The effect of backchanneling on a child’sstorytelling. In Proceedings of the 12th ACM/IEEEinternational conference on Human robot interaction. ACM,2017.