King's Research Portal

Document Version: Peer reviewed version

Link to publication record in King's Research Portal

Citation for published version (APA): Celiktutan, O., & Gunes, H. (2015). Computational analysis of human-robot interactions through first-person vision: Personality and interaction experience. In IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). IEEE.

Citing this paper
Please note that where the full-text provided on King's Research Portal is the Author Accepted Manuscript or Post-Print version, this may differ from the final Published version. If citing, it is advised that you check and use the publisher's definitive version for pagination, volume/issue, and date of publication details. And where the final published version is provided on the Research Portal, if citing you are again advised to check the publisher's website for any subsequent corrections.

General rights
Copyright and moral rights for the publications made accessible in the Research Portal are retained by the authors and/or other copyright owners, and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the Research Portal for the purpose of private study or research.
• You may not further distribute the material or use it for any profit-making activity or commercial gain.
• You may freely distribute the URL identifying the publication in the Research Portal.

Take down policy
If you believe that this document breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim.

Download date: 09. Jun. 2020


Computational Analysis of Human-Robot Interactions through First-person Vision: Personality and Interaction Experience

Oya Celiktutan∗ and Hatice Gunes∗

Abstract— In this paper, we analyse interactions with Nao, a small humanoid robot, from the viewpoint of human participants through an ego-centric camera placed on their forehead. We focus on the human participants' and the robot's personalities and their impact on the human-robot interactions. We automatically extract nonverbal cues (e.g., head movement) from the first-person perspective and explore the relationship of these cues with the participants' self-reported personality and their interaction experience. We generate two types of behaviour for the robot (i.e., extroverted vs. introverted) and examine how the robot's personality and behaviour affect the findings. Significant correlations are obtained between the extroversion and agreeableness traits of the participants and the perceived enjoyment with the extroverted robot. Plausible relationships are also found between the measures of interaction experience and personality and the first-person vision features. We then use computational models to automatically predict the participants' personality traits from these features. Promising results are achieved for the traits of agreeableness, conscientiousness and extroversion.

I. INTRODUCTION

Designing robots with socio-emotional skills is an emerging yet challenging research field. Numerous applications [1] give us a positive outlook; however, the capabilities of current social robots are still quite limited. One of the challenges is understanding human behaviours, i.e., the underlying mechanisms by which humans respond to and interact with real-life situations, and how to model these mechanisms for the embodiment of naturalistic, human-inspired behaviours in robots.

Personality research has received considerable interest in psychology, and its connections to human-robot interaction have become increasingly prominent over the last decade. Many researchers in psychology have suggested that personality plays a key role in understanding human behaviours, abilities and preferences in everyday situations, such as an individual's relationships with others, occupational choices, etc. These findings have motivated a significant body of work on automatic personality analysis from verbal and nonverbal behavioural cues [2], [3]. Considerable effort has also been put into defining optimal protocols and behaviours to express a specific type of personality through a robotic platform [4] or a virtual agent [5], with the aim of improving humans' interaction experience with intelligent user interfaces.

Mutual gaze, focus of attention and head movement are some of the essential components of effective social interaction. Research on nonverbal cues has shown that gaze and head movement are also significant predictors of personality.

∗The authors are with the School of Electrical Engineering and Computer Science, Queen Mary University of London, E1 4NS London, UK, {o.celiktutandikici,h.gunes}@qmul.ac.uk


Fig. 1. (a) The human-robot interaction setup. (b-d) Simultaneous snapshots from first-person videos: the robot's camera (b) and the ego-centric cameras placed on the foreheads of the participants (c-d).

For example, dominance and extroversion have been found to be related to holding a direct facial posture and long durations of eye contact during interaction, whereas shyness and social anxiety are highly correlated with gaze aversion [6]. Many studies of extroversion have shown that extroverted people are more energetic, exhibiting higher head movement frequency, more hand gestures and more posture shifts than introverted people [7], [8]. This is often accompanied by the fact that extroverted people attract more attention in a group of individuals [9]. Another study showed that persuasion (confidence in what we are saying) is influenced, i.e., either enhanced or undermined, by the listener's head movements (nodding and shaking) [10]. Recent works [11], [12], [13], [14] have shown that these nonverbal cues can also be measured from first-person vision, i.e., through an ego-centric camera placed on a person's forehead.

Due to the abovementioned importance of personality in human-human and human-robot interactions, this paper focuses on nonverbal cues conveyed through gaze, attention and head movement in a multiparty human-robot interaction scenario, as shown in Fig. 1-a. We present an experimental study where two human participants are involved in a structured conversation driven by the robot and each participant wears an ego-centric camera placed on their forehead. First, we concentrate on the interaction experience with the robot and the influence of human personality on the perceived interaction experience. Secondly, we investigate how participants' personality traits and interaction experience are manifested through nonverbal cues (e.g., head movement). We extract various low-level features from the first-person perspective to encode nonverbal cues, and show how these features correlate with the interaction measures of enjoyment, empathy, extroversion, positivity and realism. We examine these correlations with respect to the different behaviours exhibited by the robot (extroverted vs. introverted). Significant correlations are obtained between the extroversion and agreeableness traits of the participants and the perceived enjoyment with the extroverted robot. Plausible relationships are also found between the measures of interaction experience and personality and the first-person vision features. We then use computational models to automatically predict the participants' personality traits from these features. Promising results are achieved for the traits of agreeableness, conscientiousness and extroversion. The implications of these findings are discussed with respect to studies in human-robot interaction and design.

II. RELATED WORK

Personality Analysis. Most of the existing automatic personality recognition methods have been reviewed in a recent survey paper [2]. Among these methods, [15], [16], [17] constitute prominent works that exploited features based on gaze patterns, head movements or social attention. Oertel and Salvi [15] relied on features extracted from eye-gaze patterns only, to model group involvement and individual engagement in game-based interactions. Subramanian et al. [16] explored social attention features based on head pose, as well as proximity features such as the distance between the target subject and the others and the velocity of the target subject in a given time window, in a cocktail party scenario. In a study of small group meetings, Aran and Gatica-Perez [17] combined audio and motion features with a set of high-level features based on head, body, speaking activity and focus of attention.

Personality Synthesis. In the context of human-machine interaction, some works have focused on machine personality to improve the quality of the human experience with a virtual agent or a robot: humans tend to be attracted by characters who have either matching personality traits (similarity rule) or non-matching personality traits (complementarity rule) [18]. Aly and Tapus [4] synthesised combined verbal and nonverbal behaviour for a robot based on the similarity rule, i.e., an extroverted person is matched with an extroverted robot and an introverted person with an introverted robot. The robot exhibited either high rates of gestures while looking up (extroverted behaviour) or low rates of gestures while looking down (introverted behaviour). They reported that individuals prefer to interact with a robot that has a personality similar to theirs. Cerekovic et al. [19] considered two virtual agents (Obadiah and Poppy) from the SEMAINE [20] system, where each participant evaluated their interaction with both agents along three dimensions: quality, rapport and likeness. Their experimental results supported the complementarity rule, namely, extroverted people tend to like Obadiah (gloomy and neurotic, with low variation in speech and a flat tone), whereas people who score high on neuroticism tend to like Poppy (cheerful and extroverted, with frequent gesturing and head nods).

De Graaf and Allouch [21] examined the effect of humans' prior expectations (low vs. high) on the personality they assign to a robot and on their tendency to attribute their own personality traits to the robot. Their findings suggested that extroverted people with high expectations tend to find the robot more extroverted. In a game scenario for children, Jang et al. [22] used gaze direction and other audio-visual cues (speech, facial expression, etc.) to classify two states of engagement (engaged vs. not engaged) from continuously annotated videos. They obtained the best classification results when they used smaller temporal windows (i.e., 1 second) for summarising the features and the engagement level over time.

First-person Vision. With the development of wearable consumer cameras, many researchers have recently shifted their focus to the acquisition and analysis of first-person (ego-centric) videos. A recent survey [11] provided a comprehensive review of the state-of-the-art techniques in first-person vision by grouping them under three main application domains: video summarisation, object recognition and activity detection/recognition. The majority of the reviewed works resorted to gaze- and attention-related features extracted from head motion and visual saliency [12], [13], [14].

Despite a long list of challenges (e.g., randomness, high variability in illumination and camera motion), first-person vision is advantageous for many reasons. For example, first-person vision provides the most relevant part of the data for recognising social interactions [23], [24]: people with whom the camera wearer interacts tend to be centred in the scene, and are less likely to be occluded when captured from a co-located, first-person perspective rather than from a static, third-person perspective. Fathi et al. [23] focused on categorising social interactions among groups of individuals into three classes (i.e., dialogue, discussion and monologue). They modelled the patterns of attention shifts and mutual gaze over time. In particular, they detected and combined each individual's face location and orientation in 3D space with first-person head cues for each frame, and learned the temporal relationships between the frames using Hidden Conditional Random Fields.

In the context of human-robot interaction, Ryoo and Matthies [24] placed an ego-centric camera on the forehead of a teddy bear (robot) and aimed at recognising what activity others are performing towards it, e.g., shaking hands with the robot, hugging it, throwing an object at it, punching it, etc. They extracted both global and local motion features and built separate histograms of visual words. Global features were computed based on optical flow. For local features, volumes containing salient motion were detected by applying a spatio-temporal filter and then described using blur features [25]. They used multi-channel kernels for activity recognition and proposed a kernel-based activity learning method to model the hierarchical structures of complex activities.

No work to date has focused on (i) multiparty human-robot interactions where each participant wears an ego-centric camera, and (ii) computational analysis of personality manifested through nonverbal cues in first-person videos.


III. OUR WORK

This paper combines the areas of personality computation and first-person vision in the context of human-robot interaction. We aim to analyse (i) human-robot interactions using first-person vision (the participant's perspective), and (ii) the relationships between participants' and robot's personalities and the features extracted from first-person vision (the participant's perspective), which has not been attempted before.

We present a multi-party interaction study where two (human) participants are involved in a structured conversation driven by the robot and each participant wears an ego-centric camera placed on their forehead (see Fig. 1-a and 1-b). We ask each participant to fill in a personality questionnaire prior to the interaction, and to evaluate their interaction experience with the robot after the interaction. We extract a set of low-level features from the first-person videos to represent nonverbal cues including gaze direction, attention and head movement. Our motivation is that the direction of gaze, focus of attention and head pose are linked in various activities, as indicated by [14], [26], [27]. For example, a large gaze shift almost always leads to a large head rotation. For the analysis of these behaviours, first-person vision provides the most relevant information and reduces the need for complex camera systems [11].

We study the influence of the human participants' personalities on the (perceived) interaction experience and how the automatically extracted first-person vision features correlate with the participants' self-reported personality traits and interaction experience. More specifically, we address the following research questions: (i) Can we assess human-robot interactions using first-person vision (interaction experience & first-person vision features)? and (ii) What are the relationships between the participants' and the robot's personalities and the features extracted from first-person vision (personality & first-person vision features)?

IV. METHOD

To investigate the research questions above, we conducted a controlled study where two participants were asked to answer a set of personal questions posed by a robot. The participants were exposed to one of the two personalities of the robot, i.e., extroverted or introverted.

A. Experimental Design

For the robot test-bed, we used the humanoid robot Nao developed by Aldebaran Robotics [28], with NaoQi version 2.1, head version 4.0 and body version 25. The robot was controlled remotely in a Wizard-of-Oz setup during the interaction. An experimenter (i.e., operator) used a computer to supervise the robot through its camera and to manage the dialogue and the turn taking.
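The paper does not describe the control code itself; the following is a minimal, hypothetical sketch of how such a Wizard-of-Oz trigger loop could look with the NaoQi Python SDK. The IP address, utterance lists and operator pacing are illustrative placeholders, and the speech-rate/pitch settings used to vary the robot's voice are omitted.

```python
# Minimal Wizard-of-Oz trigger sketch (not the authors' actual control code).
# Assumes the NaoQi Python SDK (Python 2); address and utterances are placeholders.
from naoqi import ALProxy

ROBOT_IP, PORT = "192.168.1.10", 9559   # hypothetical robot address

speech = ALProxy("ALAnimatedSpeech", ROBOT_IP, PORT)   # speech with accompanying gestures
posture = ALProxy("ALRobotPosture", ROBOT_IP, PORT)

UTTERANCES = {
    "EXT": ["Would you like me to dance for you?", "It is amazingly exciting!"],
    "INT": ["Hmm ... well, ok ... would you like me to play music for you?"],
}

def wizard_loop(condition):
    """Let the hidden operator fire scripted utterances for the chosen condition."""
    posture.goToPosture("Stand", 0.6)
    for i, line in enumerate(UTTERANCES[condition]):
        raw_input("Press Enter to trigger utterance %d " % i)  # operator-paced turn taking
        speech.say(line)

if __name__ == "__main__":
    wizard_loop("EXT")
```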

Based on the research findings summarised in Section I, we manipulated the robot's behaviours and generated two types of personality, i.e., an extroverted robot and an introverted robot. While the extroverted robot displayed hand gestures and talked faster and louder, the introverted robot sounded hesitant, was less energetic and exhibited no hand gestures during the interaction. Table I illustrates the difference in observable behaviours between the two robot personality types and provides representative statements from the robot's repertoire.

TABLE I
SYNTHESISING EXTROVERTED (EXT) AND INTROVERTED (INT) ROBOT PERSONALITIES THROUGH VERBAL AND NONVERBAL BEHAVIOURS.

Type | Verbal | Nonverbal
EXT  | "Would you like me to dance for you?"; "It is amazingly exciting!" | Displays hand gestures and posture shifts; talks faster with a higher voice pitch
INT  | "Hmm ... well, ok ... would you like me to play music for you?"; "Well good ..." | Displays an almost static posture; talks slower with a lower voice pitch

TABLE II
QUESTIONS ASKED BY THE ROBOT IN THE COURSE OF INTERACTION.

ID | Question
1  | How has your day been?
2  | How do you feel right now/about being here?
3  | What do you do for a living? Do you like your job?
4  | I have a personal question for you. Is there something you would like to change in your life?
5  | Can you tell me about the best memory you have or the best event you have experienced in your life?
6  | Can you tell me about an unpleasant or sad memory you have had in your life?
7  | What are your feelings toward robots? Do you like them?
8  | Have you watched Wall-e? Do you like it?

The robot asked personal questions to each participant, as summarised in Table II. The robot initiated the conversation by greeting the participants and by asking them neutrally "You on my right, could you please stand up? Thank you! What is your name?". Then the robot continued by asking about their occupations, experiences and so on, at each turn specifying the name of the participant to whom the question was directed. These questions (see Table II) aim to elicit different emotional states (positive vs. negative) and various facets of personality.

B. Procedure

PhD students and post-docs in our department participated in the study. Each participant was guided into the experimental room, which had controlled lighting. The participants were asked to sit on a chair located near the edge of a table. The robot was initially seated and situated on the table.

First, the participants were provided with an informed consent form that explained briefly how the experiment would proceed. The participants were then requested to fill in a pre-study questionnaire asking them to provide demographic data and further information about their general behavioural tendencies (see Section V). The participants were told that the robot would ask them a number of questions and that they were expected to respond, but no information was provided regarding the Wizard-of-Oz setup. Then the experimenter left the participants alone with the robot, and the interaction session started with the robot standing on the table and greeting the participants. After the interaction, the experimenter asked each participant to fill in a post-study questionnaire to evaluate their interaction experience (see Section V). All measures were on a 10-point Likert scale (from very low to very high). Each interaction session lasted 10-15 minutes.


V. DATA AND LABELS

Data. Each interaction session was recorded using two ego-centric cameras placed on the foreheads of the participants and the robot's own camera. Sound was recorded via the microphones inside the ego-centric cameras and the robot's head. We recorded 12 interaction sessions and collected approximately 3 hours of recordings. Each session involved two participants, with 18 participants in total. Four participants took part more than once, with the condition that they were exposed to different robot personalities (extroverted vs. introverted) each time. Prior to data collection, the experimenter switched the lights on and off, and the resulting appearance change, co-occurring in all cameras, was used to synchronise (in time) the multiview videos (i.e., the videos taken from the 2 ego-centric cameras and the robot's own camera). Synchronisation was based on the amount of appearance change between two successive frames, measured using gray-level histograms.
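As a rough illustration of this synchronisation step, the sketch below scans a video for the frame at which the gray-level histogram differs most from that of the preceding frame, i.e., the light-switch event. It is a minimal sketch assuming OpenCV and NumPy; the function name and the L1 histogram distance are our choices, not necessarily the authors'.

```python
# Find the light-switch frame used to align the multiview recordings (illustrative sketch).
import cv2
import numpy as np

def sync_frame_index(video_path, bins=64):
    """Return the index of the frame with the largest histogram change w.r.t. its predecessor."""
    cap = cv2.VideoCapture(video_path)
    prev_hist, scores = None, []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [bins], [0, 256]).ravel()
        hist /= hist.sum() + 1e-8                          # normalise to compare across frames
        if prev_hist is not None:
            scores.append(np.abs(hist - prev_hist).sum())  # L1 distance between histograms
        prev_hist = hist
    cap.release()
    return int(np.argmax(scores)) + 1  # +1 because scores start at the second frame
```

Aligning the streams then amounts to offsetting each video by its detected index.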

For the analyses, we focused on the videos acquired by the human participants' ego-centric cameras only. We segmented each recording into short clips, where each clip contains the robot asking a question to one of the participants and the target participant responding. This yielded on average 19 episodes per participant per session, and a total of 456 clips. Each clip has a duration ranging from 20 to 120 seconds. Fig. 1-c and 1-d illustrate simultaneous snapshots from these recordings.

Labels. The pre-study questionnaire aims to assess personal behavioural tendencies, i.e., how individuals see themselves in the way they approach problems, like to work, deal with feelings and manage relationships with others, along the widely known Big Five personality traits [2]. These five personality traits are extroversion (assertive, outgoing, energetic, friendly, socially active), neuroticism (having a tendency towards negative emotions such as anxiety, depression or anger), openness (having a tendency towards changing experience, adventure, new ideas), agreeableness (cooperative, compliant, trustworthy) and conscientiousness (self-disciplined, organised, reliable, consistent). A commonly used method to measure these traits is the Big Five Inventory [29]. In the pre-study questionnaire we used the BFI-10 [30], a short and commonly used version of the Big Five Inventory. In this version, each item contributes to the score of a particular trait on a 10-point Likert scale.

The post-study questionnaire consists of the five items listed in Table III, which evaluate the participants' interaction experience with the robot and their impressions about the robot's behaviours on a 10-point Likert scale.

VI. FIRST-PERSON VISION FEATURES

As mentioned in Section I, nonverbal cues conveyed through gaze direction, attention and head movement carry important information regarding an individual's personality traits and mental states. Head movement can lead to significant motion in the first-person videos, which can be characterised by optical flow and motion blur. Attention shifts and rapid scene changes may also cause drastic illumination changes. We used simple and computationally efficient low-level features to describe blur, illumination changes and optical flow due to ego-motion. These features were first proposed in [31] for classifying first-person videos over multiple datasets. We extracted 12 blur, 6 illumination and 22 optical flow features, which resulted in 40 first-person vision features per clip.

Blur features were computed based on the no-reference blur estimation algorithm of [25]. Given a frame, this algorithm yields two values, vertical blur (BLUR-Ver) and horizontal blur (BLUR-Hor), ranging from 0 to 1 (the best and the worst quality, respectively). We also calculated the maximum blur (BLUR-Max) over the vertical and the horizontal values. For illumination, we simply calculated the mean (ILLU-Mean) and the median (ILLU-Med) of the pixel intensity values per frame.
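The sketch below illustrates per-frame blur and illumination features of this kind. The directional blur score is a simplified re-implementation of the idea behind the metric of [25] (how much gradient energy is removed by additional smoothing along one axis), not the exact published algorithm; OpenCV and NumPy are assumed.

```python
# Per-frame blur and illumination features (simplified, not the exact algorithm of [25]).
import cv2
import numpy as np

def directional_blur(gray, axis):
    """Return a [0,1] blur score along one axis (0 = vertical neighbours, 1 = horizontal)."""
    g = gray.astype(np.float32)
    k = np.ones((1, 9), np.float32) / 9.0
    blurred = cv2.filter2D(g, -1, k if axis == 1 else k.T)   # extra smoothing along the axis
    d_orig = np.abs(np.diff(g, axis=axis))
    d_blur = np.abs(np.diff(blurred, axis=axis))
    lost = np.maximum(d_orig - d_blur, 0).sum()              # gradient energy removed by smoothing
    return float(1.0 - lost / (d_orig.sum() + 1e-8))         # ~0 for a sharp frame, ~1 for a blurred one

def frame_features(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blur_ver = directional_blur(gray, axis=0)
    blur_hor = directional_blur(gray, axis=1)
    return {
        "BLUR-Ver": blur_ver,
        "BLUR-Hor": blur_hor,
        "BLUR-Max": max(blur_ver, blur_hor),
        "ILLU-Mean": float(gray.mean()),
        "ILLU-Med": float(np.median(gray)),
    }
```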

For optical flow, we used the SIFT flow algorithm proposed in [32]. We computed a dense optical flow estimate for each frame, where we set the grid size to 4. We converted the x and y flow estimates of each pixel into magnitude and angle, and then quantised the angles into 8 orientation bins. We calculated the mean (MAG-Mean) and the median (MAG-Med) of the magnitude values per frame. For the angle values, two types of features were computed over a frame: (i) the number of times the angle bin i contained the highest motion energy in a frame (ANG-Nrg-i); and (ii) the total number of pixels belonging to the angle bin i (ANG-Count-i). These features were normalised such that the sum over all 8 bins was 1.
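A hedged sketch of these per-frame flow features is given below. For brevity it uses OpenCV's Farneback dense flow as a stand-in for the SIFT flow algorithm of [32], and the ANG-Nrg/ANG-Count definitions follow our reading of the text, so they may differ from the authors' implementation.

```python
# Per-frame optical-flow magnitude and angle-histogram features (illustrative sketch).
import cv2
import numpy as np

N_BINS = 8

def flow_features(prev_gray, gray):
    flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])        # magnitude and angle per pixel
    bins = np.minimum((ang / (2 * np.pi) * N_BINS).astype(int), N_BINS - 1)
    count = np.bincount(bins.ravel(), minlength=N_BINS).astype(float)
    energy = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=N_BINS)
    count /= count.sum() + 1e-8                                    # normalise so bins sum to 1
    energy /= energy.sum() + 1e-8
    feats = {"MAG-Mean": float(mag.mean()), "MAG-Med": float(np.median(mag))}
    for i in range(N_BINS):
        feats["ANG-Count-%d" % (i + 1)] = float(count[i])
        feats["ANG-Nrg-%d" % (i + 1)] = float(energy[i])
    return feats
```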

Since the frame rate of the ego-centric cameras was high (60 frames per second), all features were extracted from frames sampled every 200 milliseconds instead of at adjacent time instants. A clip was then summarised by computing a total of 40 features over its frames. These features are statistics over the low-level blur, illumination and optical flow features, i.e., the mean (Mean), median (Med) and standard deviation (Std) over all frames in a video, the absolute mean (Abs-Mean) over all frames, statistics after applying z-score normalisation (z) across all frames, and statistics of the first (d1) and second (d2) temporal derivatives.
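The clip-level summarisation can be pictured as follows: each per-frame feature becomes a time series that is collapsed into a handful of statistics. The exact mapping of statistics to features (yielding the 40 values) is not fully specified in the text, so the sketch below only shows the building blocks.

```python
# Collapse one per-frame feature series into clip-level statistics (illustrative sketch).
import numpy as np

def summarise(series):
    x = np.asarray(series, dtype=float)
    z = (x - x.mean()) / (x.std() + 1e-8)          # z-score normalised series
    d1 = np.diff(x)                                 # first temporal derivative
    d2 = np.diff(x, n=2)                            # second temporal derivative
    return {
        "Mean": x.mean(),
        "Med": float(np.median(x)),
        "Std": x.std(),
        "Abs-Mean": np.abs(x).mean(),
        "z-Std": z.std(),                           # example statistic on the normalised series
        "d1-Abs-Mean": np.abs(d1).mean(),           # cf. MAG-Mean-d1-Abs-Mean in Table V
        "d2-Abs-Mean": np.abs(d2).mean(),
    }
```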

VII. RESULTS AND ANALYSIS

This section presents the correlation analysis between the Big Five personality traits and the interaction experience, and also examines how these measures are linked to the automatically extracted first-person vision features (Section VI). We test the statistical significance of the correlations (against the null hypothesis of no correlation) using a t-distribution test.
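For reference, the sketch below shows how such a correlation screen can be run with SciPy; pearsonr's p-value corresponds to the test against zero correlation, which is equivalent to the t-distribution test mentioned above. The trait and feature containers are hypothetical placeholders, not the study data.

```python
# Screen trait-feature pairs for significant Pearson correlations (illustrative sketch).
import numpy as np
from scipy.stats import pearsonr

def significant_correlations(traits, features, alpha=0.01):
    """traits/features: dicts mapping a name to one value per participant (or clip)."""
    hits = []
    for t_name, t_vals in traits.items():
        for f_name, f_vals in features.items():
            r, p = pearsonr(np.asarray(t_vals), np.asarray(f_vals))
            if p < alpha:
                hits.append((t_name, f_name, round(r, 2), p))
    return hits
```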

A. Interaction Experience and First-person Vision Features

We examined how the individual measures of interaction experience (see Table III) correlated with the automatically extracted first-person vision features. While no significant relationship was found for the extroverted robot condition, the interactions with the introverted robot resulted in correlations with the horizontal blur features. The statistically significant correlation values ranged between 0.30 and 0.38 (at a significance level of p < 0.01). Although not high, these values indicate that there was apparent horizontal head movement during the interaction with the introverted robot. A visual inspection of the videos showed that, as the interaction progressed, the participants appeared less engaged and their attention drifted away from the robot. This was not the case in the extroverted robot condition.


TABLE III
POST-STUDY QUESTIONNAIRE TO EVALUATE THE INTERACTION EXPERIENCE WITH THE ROBOT.

Question | Interaction Measure
I enjoyed the interaction with the robot. | Enjoyment
I thought the robot was being supportive. | Empathy
I thought the robot was assertive and social. | Extroversion
I thought the robot was being positive. | Positivity
I found the robot behaviour realistic. | Realism

TABLE IV
SIGNIFICANT CORRELATIONS BETWEEN THE BIG FIVE PERSONALITY TRAITS OF THE PARTICIPANTS AND THEIR INTERACTION EXPERIENCE MEASURES (AT A SIGNIFICANCE LEVEL OF p < 0.05, *p < 0.01). EXT: EXTROVERSION, AGR: AGREEABLENESS, CON: CONSCIENTIOUSNESS, NEU: NEUROTICISM, OPE: OPENNESS.

Trait | Extroverted Robot Condition | Introverted Robot Condition
EXT   | Enjoyment (0.85*); Empathy (0.58) | -
AGR   | Enjoyment (0.62) | -
CON   | Positivity (0.71*) | Positivity (0.71)
NEU   | Realism (0.60) | -
OPE   | - | Positivity (0.70); Realism (0.67)

B. Personality and Interaction Experience

We investigated the possible links between the Big Five personality traits of the participants, the extroversion/introversion trait of the robot and the participants' interaction experience with the robot. In Table IV, the significant results are given with their respective correlation values in parentheses.

For the extroverted robot condition, perceived enjoyment with the robot is found to be significantly correlated with the participants' extroversion trait, which supports the similarity rule [18], [4]. We also observe that the robot's perceived empathy positively correlates with the participants' extroversion trait. This might be due to the fact that extroverted people feel more in control of their interactions and judge them as more intimate and less incompatible [33], [34]. A study of agreeableness reported that more agreeable people showed strong self-reported rapport when they interacted with a virtual agent [35]. Cuperman and Ickes [36] also indicated that more agreeable people report having more enjoyable interactions. Similarly, we observe that perceived enjoyment with the robot is highly correlated with the agreeableness trait of the participants. A significant relationship is also established between the robot's perceived realism and the neuroticism trait of the participants. People who score high on neuroticism tend to perceive their interactions as being forced and strained [36]; the artificial behaviours of the robot might therefore appear realistic to them.

For the introverted robot condition, no significant correlations were obtained with the participants' extroversion, agreeableness and neuroticism traits. People who score high on conscientiousness tend to interact with others by showing greater attentiveness and responsiveness [36]. This might explain the significant correlation with the interaction measure of positivity regardless of the robot's personality, as the robot always provided feedback to the participants in the course of the interaction.

TABLE V
SELECTED STATISTICALLY SIGNIFICANT CORRELATIONS BETWEEN THE PARTICIPANTS' PERSONALITY TRAITS AND FIRST-PERSON VISION FEATURES (AT A SIGNIFICANCE LEVEL OF p < 0.01). BLUR: BLUR, ILLU: ILLUMINATION, MAG: OPTICAL FLOW MAGNITUDE, ANG: OPTICAL FLOW ANGLE, EXT: EXTROVERSION, AGR: AGREEABLENESS, CON: CONSCIENTIOUSNESS, NEU: NEUROTICISM, OPE: OPENNESS.

Trait | Extroverted Robot Condition | Introverted Robot Condition
EXT   | - | BLUR-Ver-Mean (-0.55); BLUR-Ratio-Med (-0.49)
AGR   | BLUR-Ver-Mean (0.36) | BLUR-Max-Med (0.35)
CON   | BLUR-Ver-Mean (0.34); ILLU-Mean-Std (-0.33) | BLUR-Ver-Med (-0.53); BLUR-Ratio-Med (-0.48); ANG-Nrg-1 (0.35)
NEU   | BLUR-Ver-Mean (-0.40); BLUR-Ratio-Med (-0.36); ILLU-Med-Std (-0.38); ANG-Nrg-1 (0.41); ANG-Count-2 (-0.42) | BLUR-Ver-Mean (0.68); BLUR-Max-Std (0.40); BLUR-Ratio-Med (0.61); MAG-Mean-Mean (0.35); MAG-Mean-d1-Abs-Mean (0.38)
OPE   | BLUR-Max-Mean (0.34); ILLU-Med-Std (0.39); ANG-Count-3 (0.33) | BLUR-Hor-Mean (0.47); BLUR-Ver-Mean (-0.43); MAG-Mean-Mean (-0.35); ANG-Count-1 (0.35)


C. Personality and First-person Vision Features

The goal of this analysis was to study the one-to-one relationships between the Big Five personality traits of the participants and the automatically extracted first-person features. Table V shows the prominent features and the significant correlations.

One can observe that, in general, the introverted robot condition produced a larger number of significant correlations with the extracted features. This can be attributed to the reason explained in Section VII-A, i.e., the participants' attention shifted more when interacting with the introverted robot. For the extroverted robot condition, the neuroticism trait of the participants showed significant relationships with all three types of features (blur, illumination and optical flow), in particular with the blur features. No significant correlations were found between the participants' extroversion trait and the first-person features. For the introverted robot condition, the conscientiousness, neuroticism and openness traits of the participants showed significant relationships with the blur and optical flow features. However, no correlations were found with the illumination features.

Looking at Table V, one significant relationship is between agreeableness and the vertical blur feature, which can be associated with head nodding and with being positive and supportive. In Section VII-B, we observed that extroverted people tend to enjoy the interaction with the extroverted robot more than the interaction with the introverted robot. Our experimental results further show that extroversion is negatively correlated with the blur (motion) features for the introverted robot. This result indicates that less energetic (introverted) people liked the introverted robot more, and that it is possible to deduce this from the extracted first-person vision features.

The significant correlations found between the participants' personality traits and the first-person vision features pave the way for automatic personality prediction. To that end, we employed a linear Support Vector Regression method with nested leave-one-subject-out cross-validation. The optical flow angle features (ANG-Nrg and ANG-Count) yielded the best prediction results in terms of the coefficient of determination (R2) and the root-mean-square error (RMSE), where we obtained a mean R2 of 0.19 and a mean RMSE of 1.63 over all traits. The method successfully modelled the relationship between the first-person vision features and the traits of agreeableness (R2 = 0.48, RMSE = 1.37), conscientiousness (R2 = 0.27, RMSE = 1.55) and extroversion (R2 = 0.20, RMSE = 1.72). Similarly, the study in [17] applied Ridge regression to predict the extroversion trait. Although the database, Likert scale and visual feature set used there are completely different, they also obtained their best results with motion-based features (R2 = 0.31). Taking this result as a baseline, our results for agreeableness, conscientiousness and extroversion show that prediction of personality traits from first-person vision is a promising research direction.
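The following is a minimal sketch of such a prediction setup (a linear SVR evaluated with leave-one-subject-out cross-validation, with an inner loop tuning the regularisation parameter), written with scikit-learn. It reflects our reading of "nested leave-one-subject-out" rather than the authors' exact pipeline, and the data arrays are placeholders.

```python
# Linear SVR with (nested) leave-one-subject-out cross-validation (illustrative sketch).
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import LeaveOneGroupOut, GridSearchCV
from sklearn.metrics import r2_score, mean_squared_error

def predict_trait(X, y, subjects):
    """X: clip-level feature matrix, y: trait scores, subjects: subject id per row."""
    X, y, subjects = np.asarray(X), np.asarray(y, dtype=float), np.asarray(subjects)
    y_pred = np.zeros_like(y)
    for train, test in LeaveOneGroupOut().split(X, y, groups=subjects):
        # inner leave-one-subject-out loop over the training subjects to pick C
        inner = list(LeaveOneGroupOut().split(X[train], y[train], groups=subjects[train]))
        model = GridSearchCV(SVR(kernel="linear"), {"C": [0.01, 0.1, 1.0, 10.0]}, cv=inner)
        model.fit(X[train], y[train])
        y_pred[test] = model.predict(X[test])
    return r2_score(y, y_pred), float(np.sqrt(mean_squared_error(y, y_pred)))
```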

VIII. CONCLUSION AND FUTURE WORK

This paper investigated potential relationships between the human participants' and the robot's personalities, and the impact of these on human-robot interactions from the first-person perspective. One important finding is that perceived enjoyment was significantly higher for the extroverted robot condition. This result coincides with the literature reporting that extroverted robots have a positive effect on the interaction experience [37]. Although low-level first-person vision features proved useful for the automatic analysis of human-robot interactions and of participants' personality traits, higher-level features are needed to further model multiparty behaviours (e.g., mutual gaze, synchrony, attention given, attention received) and to better understand the interaction taking place.

ACKNOWLEDGMENT

This research work is funded by the EPSRC under its IDEAS Factory Sandpits call on Digital Personhood, grant ref: EP/L00416X/1.

REFERENCES

[1] "Japanese bank introduces robot workers to deal with customers in branches," www.theguardian.com/world/2015/feb/04/japanese-bank-introduces-robot-workers-to-deal-with-customers-in-branches, Accessed: 2015-02-21.
[2] A. Vinciarelli and G. Mohammadi, "A survey of personality computing," IEEE TAC, 2014.
[3] O. Celiktutan et al., "Maptraits 2014: Introduction to the audio/visual mapping personality traits challenge," in ACM ICMI, 2014.
[4] A. Aly and A. Tapus, "A model for synthesizing a combined verbal and nonverbal behavior based on personality traits in human-robot interaction," in ACM/IEEE HRI, 2013.
[5] M. Neff et al., "Evaluating the effect of gesture and language on personality perception in conversational agents," in IVA, LNCS, 2010.
[6] R. J. Larsen and T. K. Shackelford, "Gaze avoidance: Personality and social judgments of people who avoid direct face-to-face contact," Pers. and Indiv. Differ., vol. 21, no. 6, pp. 907-917, 1996.
[7] R. E. Riggio and H. S. Friedman, "Impression formation: The role of expressive behavior," J. Pers. Soc. Psychol., vol. 50, no. 2, pp. 421-427, 1986.
[8] R. Lippa, "The nonverbal display and judgment of extraversion, masculinity, femininity, and gender diagnosticity: A lens model analysis," J. Pers. Soc. Psychol., vol. 32, no. 1, pp. 80-107, 1998.
[9] B. Lepri et al., "Connecting meeting behavior with extraversion: A systematic study," IEEE TAC, vol. 3, no. 4, pp. 443-455, Jan. 2012.
[10] P. Brinol and R. E. Petty, "Overt head movements and persuasion: A self-validation analysis," J. Pers. Soc. Psychol., vol. 84, no. 6, pp. 1123-1139, 2003.
[11] A. Betancourt et al., "The evolution of first person vision methods: A survey," arXiv, 2014.
[12] A. Fathi, Y. Li, and J. M. Rehg, "Learning to recognize daily actions using gaze," in ECCV, 2012.
[13] K. Yamada et al., "Attention prediction in egocentric video using motion and visual saliency," in AIVT, LNCS, 2012.
[14] Y. Li et al., "Learning to predict gaze in egocentric video," in ICCV, 2013.
[15] C. Oertel and G. Salvi, "A gaze-based method for relating group involvement to individual engagement in multimodal multiparty dialogue," in ACM ICMI, 2013.
[16] R. Subramanian et al., "On the relationship between head pose, social attention and personality prediction for unstructured and dynamic group interactions," in ACM ICMI, 2013.
[17] O. Aran and D. Gatica-Perez, "One of a kind: Inferring personality impressions in meetings," in ACM ICMI, 2013.
[18] S. Buisine and J. C. Martin, "The influence of user's personality and gender on the processing of virtual agents' multimodal behavior," Advances in Psychol. Res., vol. 65, pp. 1-14, 2009.
[19] A. Cerekovic et al., "How do you like your virtual agent?: Human-agent interaction experience through nonverbal features and personality traits," in HBU, LNCS, 2014.
[20] "SEMAINE project," http://www.semaine-project.eu/, Accessed: 2015-02-21.
[21] M. M. A. de Graaf and S. B. Allouch, "Expectation setting and personality attribution in HRI," in ACM/IEEE HRI, 2014.
[22] M. Jang et al., "Building an automated engagement recognizer based on video analysis," in ACM/IEEE HRI, 2014.
[23] A. Fathi et al., "Social interactions: A first-person perspective," in IEEE CVPR, 2012.
[24] M. S. Ryoo and L. Matthies, "First-person activity recognition: What are they doing to me?," in IEEE CVPR, 2013.
[25] F. Crete et al., "The blur effect: Perception and estimation with a new no-reference perceptual blur metric," Electronic Imaging, vol. 6492, 2007.
[26] M. F. Land, "The coordination of rotations of the eyes, head and trunk in saccadic turns produced in natural situations," Exp. Brain Res., vol. 159, no. 2, pp. 151-160, 2004.
[27] R. Stiefelhagen et al., "From gaze to focus of attention," LNCS, vol. 1614, pp. 761-768, 1999.
[28] "Who is NAO?," https://www.aldebaran.com/en/humanoid-robot/nao-robot, Accessed: 2015-02-21.
[29] O. John et al., "The Big Five Inventory versions 4a and 54," Tech. Rep., Ins. of Pers. and Soc. Res., 1991.
[30] B. Rammstedt and O. P. John, "Measuring personality in one minute or less: A 10-item short version of the Big Five Inventory in English and German," J. of Res. in Pers., vol. 41, no. 1, pp. 203-212, 2007.
[31] C. Tan et al., "Understanding the nature of first-person videos: Characterization and classification using low-level features," in IEEE CVPRW, 2014.
[32] C. Liu et al., "SIFT flow: Dense correspondence across scenes and its applications," IEEE TPAMI, vol. 33, no. 5, pp. 978-994, 2011.
[33] A. M. von der Putten et al., "How our personality shapes our interactions with virtual characters - Implications for research and development," in IVA, 2010.
[34] A. W. Heaton and A. W. Kruglanski, "Person perception by introverts and extraverts under time pressure: Effects of need for closure," Pers. and Soc. Psychol. Bull., vol. 17, no. 2, pp. 161-165, 1991.
[35] S.-H. Kang et al., "Agreeable people like agreeable virtual humans," in IVA, LNCS, 2008.
[36] R. Cuperman and W. Ickes, "Big Five predictors of behavior and perceptions in initial dyadic interactions: Personality similarity helps extraverts and introverts, but hurts disagreeables," J. Pers. and Soc. Psychol., vol. 97, no. 4, pp. 667-684, 2009.
[37] B. B. W. Meerbeek et al., "The influence of robot personality on perceived and preferred level of user control," Interact. Stud., vol. 9, no. 2, pp. 204-229, 2008.