Collection and Analysis of Multimodal Interaction in Direction Giving Dialogues
Towards an Automatic Gesture Selection Mechanism for Metaverse Avatars
Takeo Tsukamoto, Yumi Muroya, Masashi Okamoto, Yukiko Nakano
Seikei University, Japan
Overview
- Introduction
- Research Goal and Questions
- Approach
- Data Collection Experiment
- Analysis
- Conclusion and Future Work
Introduction
- Online 3D virtual worlds based on Metaverse applications, e.g. Second Life (SL), are growing steadily in popularity
- ⇒ The communication methods are limited to: online chat with speech balloons, and manual gesture generation
Introduction (Cont.)
- Human face-to-face communication is largely dependent on non-verbal behaviors
- Ex.: direction-giving dialogues, in which many spatial gestures are used to illustrate directions and the physical relationships of buildings and landmarks
- How can we implement natural non-verbal behaviors in Metaverse applications?
Research Goal and Questions
Goal:
- Establish natural communication between avatars in the Metaverse, based on human face-to-face communication
Research Questions:
- Automation (gesture selection): how to automatically generate proper gestures?
- Comprehensibility (gesture display): how to intelligibly display gestures to the interlocutor?
Previous Work
- An automatic gesture selection mechanism for Japanese chat texts in Second Life [Tsukamoto, 2010]
- Example chat text: "You keep going straight on this road, then you will be able to find a house having a round window on your left."
Proxemics
- Proxemics is important for implementing comprehensible gestures in the Metaverse
- Previous work does not consider proxemics ⇒ there are cases where an avatar's gesture becomes unintelligible to the others
Approach
- Conduct an experiment to collect human gestures in direction-giving dialogues
- Collect the participants' verbal and non-verbal data
- Analyze the relationship between gestures and proxemics
Data Collection Experiment
- Direction Giver (DG): knows the way to any place on the campus of Seikei Univ.
- Direction Receiver (DR): knows nothing about the campus of Seikei Univ.
Experimental Procedure
- The DR asks the way to a specific building
- The DG explains how to get to the building
(Figure: the DG and the DR standing face to face)
Experimental Instructions
- Direction Receiver: instructed to completely understand the way to the goal through a conversation with the DG
- Direction Giver: instructed to confirm that the DR understood the directions correctly after the explanation was finished
Experimental Materials
- Each pair recorded a conversation for each goal place
Experimental Equipment
- Motion capture sensors: head, shoulder, right arm, and abdomen
- Headset microphone
- Video camera
(Figure: placement of the experimental equipment)
Collected Data

Transcription of Utterances (video data):

Utterer  Start Time (sec.)  End Time (sec.)  Utterance Content
DG       6.8153             7.9568           Well,
DG       8.3196             10.591           It is hard to understand because there are many buildings.
DR       10.5869            10.7703          Yes
DG       10.8244            12.1624          Uh… there is the connecting corridor
DG       12.3124            13.4045          at the front
DR       13.0544            13.3586          Yes
DG       13.5765            14.5465          and that's
Motion Capture Data
Analysis
- Investigated the DG's gesture distribution with respect to proxemics
- Analyzed 30 dialogues collected from 10 pairs
- The analysis focused on the movements of the DG's right arm during gesturing
Automatic Gesture Annotation
Extracted features (see the sketch below):
- Movement of position (x, y, z)
- Rotation (x, y, z)
- Relative position of the right arm to the shoulder (x, y, z)
- Distance between the right arm and the shoulder
Binary judgment: Gesturing / Not gesturing
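To make the feature set concrete, here is a minimal Java sketch of how such a per-frame feature vector could be computed from the right-arm and shoulder sensors. The class, method, and parameter names are hypothetical; the slides do not specify an implementation.

```java
// Hypothetical per-frame feature extraction for the right-arm sensor.
// All names are illustrative; the original computation is not shown in the slides.
public final class GestureFeatures {

    /**
     * @param armPos      right-arm sensor position (x, y, z) in this frame
     * @param armPrevPos  right-arm sensor position in the previous frame
     * @param armRot      right-arm sensor rotation (x, y, z)
     * @param shoulderPos shoulder sensor position (x, y, z)
     * @return a 10-dimensional feature vector matching the list above
     */
    public static double[] extract(double[] armPos, double[] armPrevPos,
                                   double[] armRot, double[] shoulderPos) {
        double[] f = new double[10];
        for (int i = 0; i < 3; i++) {
            f[i]     = armPos[i] - armPrevPos[i];   // movement of position
            f[3 + i] = armRot[i];                   // rotation
            f[6 + i] = armPos[i] - shoulderPos[i];  // arm relative to shoulder
        }
        // Euclidean distance between right arm and shoulder
        f[9] = Math.sqrt(f[6] * f[6] + f[7] * f[7] + f[8] * f[8]);
        return f;
    }
}
```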
- Manually annotating non-verbal behaviors is very time-consuming ⇒ the gesture occurrences were annotated automatically
- More than 77% of the gestures are right-arm gestures
- Built a decision tree that identifies right-arm gestures; Weka J48 was used for the decision tree learning
Automatic Gesture Annotation (Cont.)
- As the result of 10-fold cross-validation, the accuracy is 97.5%: accurate enough for automatic annotation (a training sketch follows below)
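Since the slides name Weka's J48 learner, the training and 10-fold cross-validation step might look as follows using Weka's Java API. This is a minimal sketch: the ARFF file name and its layout (the features above plus a binary class attribute) are our assumptions.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class GestureTreeTrainer {
    public static void main(String[] args) throws Exception {
        // Hypothetical ARFF file: one instance per frame, with the extracted
        // features and a binary class (Gesturing / Not gesturing).
        Instances data = new DataSource("gesture_features.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        J48 tree = new J48();  // Weka's C4.5 decision tree learner

        // 10-fold cross-validation, as reported in the slides
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(tree, data, 10, new Random(1));
        System.out.printf("Accuracy: %.1f%%%n", eval.pctCorrect());

        // Train on all data for use as an automatic annotator
        tree.buildClassifier(data);
    }
}
```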
(Figure: example of automatic annotation)
Gesture Display Space
- Defined as the overlap among the DG's front area, the DR's front area, and the DR's front field of vision (a geometric sketch follows the figure below)
(Figure: the gesture display space between the Direction Giver and the Direction Receiver, showing its center, the distance of the DG from the center, the distance of the DR from the center, the DG's and DR's body direction vectors, and the DR's front field of vision)
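The slides define this space only by the figure, so the following 2-D sketch is speculative: it assumes each front area can be modeled as the half-plane in front of the body direction vector, and the DR's front field of vision as a cone with an assumed half-angle. A floor point belongs to the gesture display space if it passes all three tests.

```java
// Speculative 2-D model of the gesture display space (floor coordinates).
// Half-plane front areas and a 60-degree half-angle field of vision are
// assumptions on our part; the slides give only a diagram.
public final class GestureDisplaySpace {
    private static final double FOV_HALF = Math.toRadians(60);  // assumed

    public static boolean contains(double[] p,
                                   double[] dgPos, double[] dgDir,
                                   double[] drPos, double[] drDir) {
        return inFront(p, dgPos, dgDir)                  // DG's front area
            && inFront(p, drPos, drDir)                  // DR's front area
            && angle(sub(p, drPos), drDir) <= FOV_HALF;  // DR's field of vision
    }

    private static boolean inFront(double[] p, double[] pos, double[] dir) {
        return dot(sub(p, pos), dir) > 0;  // strictly in front of the body plane
    }

    private static double[] sub(double[] a, double[] b) {
        return new double[] { a[0] - b[0], a[1] - b[1] };
    }
    private static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1];
    }
    private static double angle(double[] a, double[] b) {
        return Math.acos(dot(a, b) / (Math.hypot(a[0], a[1]) * Math.hypot(b[0], b[1])));
    }
}
```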
Categories of Proxemics

Category (dialogues)  Conditions
Normal (12/30)        450mm ≦ Distance Both-center ≦ 950mm
Close_to_DG (4/30)    Distance DG-center ≦ 450mm and 450mm ≦ Distance DR-center ≦ 950mm
Close_to_DR (8/30)    Distance DR-center ≦ 450mm and 450mm ≦ Distance DG-center ≦ 950mm
Close_to_Both (2/30)  Distance Both-center ≦ 450mm
Far_from_Both (4/30)  950mm ≦ Distance DG-center or 950mm ≦ Distance DR-center

450mm to 950mm is defined as the standard distance from the center of the gesture display space: human arm length is 60cm to 80cm, and a 15cm margin is applied on each side (60cm - 15cm = 45cm; 80cm + 15cm = 95cm). These conditions translate directly into the classifier sketched below.
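A direct transcription of the table into code; the class and method names are ours, and since the table's boundary cases overlap at exactly 450mm and 950mm, the order of the checks here is our choice.

```java
// Classifies a dialogue's proxemics from the distances (in millimetres)
// of the DG and DR to the center of the gesture display space.
public final class ProxemicsCategory {
    public static String classify(double dDG, double dDR) {
        if (dDG >= 950 || dDR >= 950) return "Far_from_Both";
        if (dDG <= 450 && dDR <= 450) return "Close_to_Both";
        if (dDG <= 450)               return "Close_to_DG";  // DR within 450-950
        if (dDR <= 450)               return "Close_to_DR";  // DG within 450-950
        return "Normal";              // both within 450-950
    }
}
```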
Analysis: Relationship between Proxemics and Gesture Distribution
- Analyzed the distribution of gestures by plotting the DG's right arm positions
(Figure: right-arm position plots per category; compared with Normal, the Close_to_DG distribution is similar, Close_to_DR is wider, and Close_to_Both is smaller)
Analysis: Relationship between Proxemics and Gesture Distribution (Cont.)
- Size of the gesture distribution range: Close_to_Both < Normal = Close_to_DG < Close_to_DR (a sketch of one possible range metric follows below)
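The slides compare the distribution ranges only visually. One simple way to quantify the range, which is our assumption rather than the authors' metric, is the RMS distance of the right-arm positions from their mean, computed per proxemics category:

```java
import java.util.List;

// Hypothetical spread metric: RMS distance of the DG's right-arm positions
// (x, y) from their mean. A larger value means a wider gesture distribution.
// Assumes a non-empty list of positions.
public final class GestureSpread {
    public static double spread(List<double[]> armPositions) {
        double mx = 0, my = 0;
        for (double[] p : armPositions) { mx += p[0]; my += p[1]; }
        int n = armPositions.size();
        mx /= n;
        my /= n;
        double sq = 0;
        for (double[] p : armPositions) {
            double dx = p[0] - mx, dy = p[1] - my;
            sq += dx * dx + dy * dy;  // squared distance from the mean
        }
        return Math.sqrt(sq / n);
    }
}
```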
Applying the Proxemics Model
- Create avatar gestures based on our proxemics model, to test whether the findings are applicable
(Figure: generated avatar gestures in the Close_to_DG and Close_to_DR conditions)
Conclusion
- Conducted an experiment to collect human gestures in direction-giving dialogues
- Investigated the relationship between proxemics and the gesture distribution
- Proposed five types of proxemics, characterized by the distance from the center of the gesture display space
- Found that the gesture distribution range differed depending on the proxemics of the participants
Future Work
- Establish a computational model for determining gesture direction
- Examine the effectiveness of the model: whether users perceive the avatar's gestures as appropriate and informative
Thank you for your attention
Related Work
- [Breitfuss, 2008] Built a system that automatically adds gestural behavior and eye gaze, based on linguistic and contextual information of the input text
- [Tepper, 2004] Proposed a method for generating novel iconic gestures. Spatial information about the locations and shapes of landmarks is used to represent the concepts of words; from a set of parameters, iconic gestures are generated without relying on a lexicon of gesture shapes
- [Bergmann, 2009] Represented individual variation of gesture shapes using a Bayesian network, and built an extensive corpus of multimodal behaviors in a direction-giving and landmark description task