Emotions in Engineering: Methods for the Interpretation of Ambiguous Emotional Content Emily Mower April 29, 2011


  • Emotions in Engineering: Methods for the Interpretation of Ambiguous Emotional Content

    Emily Mower April 29, 2011

  • Motivation

    • Increasing prevalence of interactive technology
    • Importance of emotion understanding
    • Engineering research starting to overlap with human behavioral research:
        • Autism
        • Depression
        • Marital therapy
        • General interaction dynamics
        • Psychiatric disorders

  • Motivating Example

    Emotional Computer Assistant
        • Provides interaction assistance
        • Describes the emotions of others
        • Allows user to understand stimuli for proper response

    [Figure: two User/Other exchanges. In the first, the reaction shown is "???". In the second, the Assistant says “The other person is frustrated… this is a mix of anger and sadness”, followed by the reply “I am sorry”.]

  • Focus of this Presentation


    Emotion Profiles: A novel mid-level representation for quantifying emotion

    • Overview:
        • Alleviates limitations of current frameworks
        • Captures shades of emotion
        • Represents ambiguous utterances
        • Component of classification
        • Stand-alone representation
        • Interpretable and informative
        • Can be used in a user-personalization framework

    Key finding: EPs can be used to track the emotional trajectory of audio-visual utterances

    [Figure: Emotion Profile Representation of "Frustration" over the classes Angry, Happy, Neutral, Sad]

  • Data Overview: USC IEMOCAP

    • Data:
        • 5 male-female pairs of actors
        • Audio, video, motion-capture (x,y,z)
    • Elicitation Strategy:
        • Scripted sessions
        • Improvisation scenarios
    • Emotional descriptors:
        • Categorical
        • Dimensional

    *Data collection led by Carlos Busso, UT Dallas


  • Feature Extraction and Selection

    • Extraction:
        • Utterance-length
        • Mean, variance, range, upper-quantile, lower-quantile, quantile range
    • Final feature set:
        • Principal Feature Analysis
        • Top 30 features

    Audio Features:
        • Prosodic: pitch and energy
        • Spectral: Mel Filterbank Coefficients

    Video Features:
        • Motion capture relative distances: Mouth, Eyebrows, Cheek, Forehead
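    The utterance-length statistics listed above can be sketched as follows. This is a minimal illustration, not the presenter's extraction code. It assumes the upper and lower quantiles are the 75th and 25th percentiles (the slide does not say which quantiles were used), and the function name and the pitch values are made up.

```python
import statistics


def utterance_functionals(contour):
    """Collapse a frame-level feature contour into utterance-level statistics."""
    if len(contour) < 2:
        raise ValueError("need at least two frames")
    # ASSUMPTION: "quantile" is read as quartile; n=4 yields the three quartile cut points.
    lower_q, _median, upper_q = statistics.quantiles(contour, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(contour),
        "variance": statistics.pvariance(contour),  # population variance
        "range": max(contour) - min(contour),
        "upper_quantile": upper_q,
        "lower_quantile": lower_q,
        "quantile_range": upper_q - lower_q,
    }


# Hypothetical pitch contour in Hz, one value per frame.
pitch_hz = [100.0, 110.0, 120.0, 130.0, 140.0]
stats = utterance_functionals(pitch_hz)
```

    Applying the same function to every audio and motion-capture contour gives one fixed-length vector per utterance, which is the kind of input a feature selection step reducing to 30 features would operate on.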

  • Emotion Profiles


    Describe the presence or absence of multiple emotion classes in a single clip using an estimate of classifier confidence

    • Binary Support Vector Machine classifications
        • Self vs. other
        • Matlab implementation
    • Output:
        • Binary yes/no for class membership
        • Distance from hyperplane
    • Interpretation:
        • Weight the binary output by the distance from the hyperplane (“confidence”)

    Classification:
        • Angry vs. Not Angry
        • Happy vs. Not Happy
        • Sad vs. Not Sad
        • Neutral vs. Not Neutral
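    A minimal sketch of the confidence weighting described on this slide, assuming linear classifiers. The weights, offsets, and function names are made-up illustrations, not trained models, and the Matlab implementation mentioned above is not reproduced here.

```python
import math

EMOTIONS = ("angry", "happy", "neutral", "sad")


def signed_distance(weights, offset, features):
    """Signed distance of a feature vector from a linear hyperplane."""
    score = sum(w * x for w, x in zip(weights, features)) + offset
    norm = math.sqrt(sum(w * w for w in weights))
    return score / norm


def emotion_profile(classifiers, features):
    """Build a 4-dimensional profile from four self-vs-other classifiers."""
    profile = {}
    for emotion in EMOTIONS:
        weights, offset = classifiers[emotion]
        dist = signed_distance(weights, offset, features)
        vote = 1 if dist >= 0 else -1        # binary yes/no for class membership
        profile[emotion] = vote * abs(dist)  # vote weighted by distance ("confidence")
    return profile


# HYPOTHETICAL (weights, offset) pairs for a 2-dimensional feature space.
classifiers = {
    "angry": ([3.0, 4.0], 0.0),
    "happy": ([0.0, 1.0], -2.0),
    "neutral": ([1.0, 0.0], -1.0),
    "sad": ([-3.0, -4.0], 5.0),
}
profile = emotion_profile(classifiers, [1.0, 1.0])
```

    In this linear sketch the weighted vote is simply the signed distance, so a strongly positive entry means confident presence of that emotion and a strongly negative entry means confident absence.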

  • Emotion Profile Construction

    [Figure: profile construction pipeline; axes labeled Val. and Act.]

    • Form semantic clusters using disjoint set of speakers
    • Train Self vs. Other Binary Classifiers on Each Semantic Cluster (4 Binary Classifiers)
    • Use trained binary classifiers to create an estimate of the emotion content of Utterances From Test Speaker (4-Dimensional Profiles For Test Speaker)

    Symbol legend:
        • Target value
        • Lagrange multiplier
        • Weight vector
        • Offset
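    The four legend entries (target value, Lagrange multiplier, weight vector, offset) match the terms of the standard support vector machine decision function. The equation itself is not in this transcript, so the following is an assumed reconstruction in its linear form. The symbols $y_i$ (target value), $\alpha_i$ (Lagrange multiplier), $\mathbf{w}$ (weight vector), $b$ (offset), and $\mathbf{x}_i$ (training vectors) are chosen here for illustration.

```latex
f(\mathbf{x}) = \mathbf{w}^{\top}\mathbf{x} + b,
\qquad
\mathbf{w} = \sum_{i} \alpha_i \, y_i \, \mathbf{x}_i
```

    Under this reading, the sign of $f(\mathbf{x})$ gives the binary self vs. other decision and $f(\mathbf{x}) / \lVert \mathbf{w} \rVert$ is the signed distance from the hyperplane. The slides do not state which kernel was used, so the linear form may not be the one shown in the talk.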

  • Distance-Based Profile Measures

    [Figure: distance-based profile measures; legend: Angry, Happy]

  • Emotograms: Dynamic Emotion Profiles

    [Figure: Emotogram for an Utterance Labeled “Happy”]

    [Figure: Emotion Profile for an Utterance Labeled “Happy”; axis labels: A, H, N, S (Angry, Happy, Neutral, Sad)]

  • Problem Setup


    Goal: Classify the affective state of clips at the utterance level using Emotograms

    • Features extracted over 10 (5m/5f) IEMOCAP speakers:
        • Motion capture: relative distances
        • Audio: prosodic, spectral
        • Feature Selection: Principal Feature Analysis (30 features)
    • Extract EPs over window lengths: 0.25 to 2 seconds
        • Train binary angry, happy, neutral, sad SVMs on disjoint set of speakers (9)
    • Model the trajectory of the EPs
        • Train angry, happy, neutral, sad HMMs on disjoint set of speakers (9)
    • Validation:
        • Leave-one-subject-out cross-validation (over each test speaker, merged results)
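    The validation scheme above can be sketched as follows: each of the 10 speakers is held out once while the SVMs and HMMs are trained on the other 9. The speaker IDs are hypothetical, and reading "merged results" as pooling the correct/total counts over all folds is an assumption.

```python
def leave_one_subject_out(speaker_ids):
    """Yield (test_speaker, train_speakers) pairs, one fold per speaker."""
    speakers = sorted(set(speaker_ids))
    for test_speaker in speakers:
        train_speakers = [s for s in speakers if s != test_speaker]
        yield test_speaker, train_speakers


def merged_accuracy(fold_results):
    """Pool (correct, total) counts over all folds, then divide once.

    ASSUMPTION: this is one plausible reading of "merged results".
    """
    correct = sum(c for c, _ in fold_results)
    total = sum(t for _, t in fold_results)
    return correct / total


# HYPOTHETICAL speaker IDs standing in for the 10 (5m/5f) IEMOCAP speakers.
speakers = [f"{gender}{i}" for i in range(1, 6) for gender in ("F", "M")]
folds = list(leave_one_subject_out(speakers))
```

    Pooling counts rather than averaging per-fold accuracies keeps speakers with many utterances from being under-weighted, which is one reason to merge results this way.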


  • Emotogram Construction

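    The content of this slide is not in the transcript. As a rough sketch of the idea stated earlier (extract EPs over short windows and track them over time), an emotogram can be treated as a sequence of emotion profiles, one per sliding window. The hop size, frame rate, function names, and the toy profile function are all assumptions made for illustration.

```python
def window_bounds(duration_s, window_s, hop_s):
    """Start/end times of sliding windows that fit fully inside the utterance."""
    bounds = []
    index = 0
    while index * hop_s + window_s <= duration_s + 1e-9:
        start = index * hop_s
        bounds.append((start, start + window_s))
        index += 1
    return bounds


def build_emotogram(frames, frame_rate_hz, window_s, hop_s, profile_fn):
    """Return one emotion profile per window: a time series of EPs."""
    duration_s = len(frames) / frame_rate_hz
    emotogram = []
    for start, end in window_bounds(duration_s, window_s, hop_s):
        lo = int(round(start * frame_rate_hz))
        hi = int(round(end * frame_rate_hz))
        emotogram.append(profile_fn(frames[lo:hi]))
    return emotogram


def toy_profile(window):
    """HYPOTHETICAL stand-in for the four trained binary classifiers."""
    m = sum(window) / len(window)
    return {"angry": -m, "happy": m, "neutral": 0.0, "sad": -m}


# Eight frames at 4 Hz (a 2-second utterance), 1-second windows, 0.5-second hop.
frames = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
emotogram = build_emotogram(frames, 4.0, 1.0, 0.5, toy_profile)
```

    The resulting sequence is what a trajectory model such as an HMM would consume; the window length here sits inside the 0.25 to 2 second range named in the problem setup.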

  • Results


  • Conclusions and Future Directions


    • Hierarchical system improves classification performance over all sentence lengths when compared to static only (absolute / relative):
        • 6+: 7.84% / 11.75%
        • 3-6: 3.55% / 5.48%
        • 1.5-3: 0.54% / 0.87%
    • Largest improvement with longest sentences:
        • Implies that there exists a recognized pattern of emotion fluctuation
    • Human ability:
        • We can tell when emotions sound “wrong”
        • Flat affect is a diagnostic tool
    • Implication:
        • Emotion modulations can be modeled by people
        • This modulation may be modeled using a grammar
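    The absolute / relative pairs in the first bullet of this slide can be related by simple arithmetic. The sketch below assumes that the absolute figure is a gain in accuracy percentage points and that the relative figure is that gain divided by the static-only accuracy. The slide states neither definition, so the implied baselines are a consistency check rather than reported results.

```python
def implied_baseline(absolute_gain_pts, relative_gain_pct):
    """Static-only accuracy (in %) implied by an absolute/relative gain pair.

    ASSUMPTION: relative gain = absolute gain / baseline accuracy.
    """
    return 100.0 * absolute_gain_pts / relative_gain_pct


# (absolute, relative) pairs copied from the slide, keyed by sentence-length bin.
reported = {"6+": (7.84, 11.75), "3-6": (3.55, 5.48), "1.5-3": (0.54, 0.87)}
baselines = {bin_: implied_baseline(a, r) for bin_, (a, r) in reported.items()}
```

    Under these assumptions all three bins imply a static-only accuracy in the low-to-mid 60% range, which suggests the two columns are at least mutually consistent.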


  • Published Work in Emotion Profiles


    1. Emily Mower and Shrikanth Narayanan. “A Hierarchical Static-Dynamic Framework for Emotion Classification.” International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Prague, Czech Republic. May 2011.

    2. Emily Mower, Maja J Matarić, Shrikanth Narayanan. “Framework for Automatic Human Emotion Classification Using Emotional Profiles.” IEEE Transactions on Audio, Speech, and Language Processing, 2010.

    3. Emily Mower, Maja J Matarić, Shrikanth Narayanan. “Robust Representations for Out-of-Domain Emotions Using Emotion Profiles.” Spoken Language Technology (SLT). Berkeley, CA, December 2010.

    4. Emily Mower, Kyu J. Han, Sungbok Lee, Shrikanth S. Narayanan. “A Cluster-Profile Representation of Emotion Using Agglomerative Hierarchical Clustering.” InterSpeech. Makuhari, Japan, September 2010.

    5. Emily Mower, Angeliki Metallinou, Chi-Chun Lee, Abe Kazemzadeh, Carlos Busso, Sungbok Lee, Shrikanth Narayanan. “Interpreting Ambiguous Emotional Expressions.” ACII Special Session: Recognition of Non-Prototypical Emotion from Speech - The Final Frontier? (Invited paper). Amsterdam, The Netherlands, September 2009.


  • Thanks!


    Questions?
