expressive generation for interactive stories - ucsc · natural language and dialogue systems lab...
Post on 04-Jun-2018
NATURAL LANGUAGE AND DIALOGUE SYSTEMS LAB UC SANTA CRUZ
Natural Language and Dialogue Systems Lab
Expressive Generation for Interactive Stories
Marilyn A. Walker, Ricky Grant, Jennifer Sawyer, Grace I. Lin, Noah Wardrip-Fruin, and Michael Buell
The Character Creator Project NSF Creative IT IIS-1002921
Stories told through dialogue
• What characters say
• How they say it
• How they react to what other characters say
Interactive Stories, Role Playing Games
Role-playing games are a type of interactive narrative game involving a story where the player takes on a role of a character in the story world.
Role-playing games are one of the most successful types of games
World of Warcraft, one of the largest role-playing games, has 11.5 million players by recent counts
Role playing games used to have fixed visuals
Now games usually dynamically generate animations/scenes
Dialogue in games is where visuals were 20 years ago
Authoring is very expensive and limits game play
Expensive:
• The script for Baldur’s Gate 2 was 3,257 pages long.
• Planescape: Torment contained nearly a million words of dialogue.
• The Old Republic (2011) had authors developing content since 2006, with a team of at least twelve full-time writers.
Limiting:
• Unlike film, games are interactive.
• Exponential growth in authoring: every time a choice point is supported, a new dialogue tree must be written for each branch.
• Dialogue choices often do not reflect the player’s previous choices.
• If a character is killed, any subplots with that character must be removed.
Immersivity, Story, Meaningful Choice
Are such dialogue choices as meaningful as they might be?
Do they make us engage with the story?
Goal: support player agency, narrative choice
Solution: Write more dialogue and better dialogue trees?
Procedural Language Generation: A Key Technology
Provides:
• Parameters & Models
• Abstract & Modular Interfaces
• Trainable: Machine Learning Techniques
=> Greater Scalability, More Immersivity, Better Stories
There is no other way to personalize dialogue interaction to an individual and their history of playing the game
Similar issues across all dialogue applications
Narrative Dialogue Generation Challenges
Revealing Subtext: key parts of narrative are not explicitly stated
• Character Personality: I am a friendly person.
• Character Emotion: I am feeling hesitant.
• Character Motivation: I intend to flatter you.
Expressing a unique Character Voice
• Who is this person?
• More than what we need for other dialogue applications
Natural Language Generation
Most NLG systems have not focused on expressing character and personality for narrative applications
Expressive Natural Language Generation (ENLG) focuses on stylistic, social aspects of the linguistic behavior of dramatic characters
• Politeness theory: Walker et al., 1997; André et al., 2000; Cassell & Bickmore, 2003; Wang et al., 2005
• Personality: Ball & Breese, 2000; Loyall & Bates, 1997; Isard et al., 2006; Mairesse & Walker, 2007, 2008
• Archetypes: Rowe, Ha & Lester, 2009
Character Creator: Tool for author creativity
Tool for automatically ‘rendering’ variations in dialogue
Learn models of character voice (linguistic style) from film screenplays
Use the learned models to control the parameters of an expressive NLG engine (PERSONAGE)
Apply the learned models to control the style of character dialogue in a story
Test human perceptions of the resulting generated utterances
Film Corpus
862 film scripts from IMSDb, as of May 19, 2010
• 7,400 characters
• 664,000 lines of dialogue
• roughly 9.6 million tokens
Annie Hall: Getting a lift
Scene from Annie Hall: Lobby of Sports Club
ALVY: Uh … you-you wanna lift?
ANNIE: (turning and aiming her thumb over her shoulder) Oh, why-uh … y-y-you gotta car?
ALVY: No, um … I was gonna take a cab.
ANNIE: (laughing) Oh, no, I have a car.
ALVY: You have a car? (Annie smiles, hands folded in front of her.) So … (clears his throat) I don’t understand why … if you have a car, so then-then wh-why did you say “Do you have a car?” … like you wanted a lift?
The Terminator: getting a lift
Scene from The Terminator: Cigar biker
TERMINATOR: I need your clothes, your boots, and your motorcycle.
CIGAR BIKER: You forgot to say please.
(Terminator hurls Cigar, all 230 pounds of him, clear over the bar, through the serving window into the kitchen, where he lands on the big flat GRILL. We hear a SOUND like SIZZLING BACON as Cigar screams, flopping, jerking. He rolls off in a smoking heap.)
What can we learn from a corpus?
Reveal Subtext: the way a character says something is one way to reveal subtext and character emotion
• Short vs. long turns => friendliness, formality
• Word choice => level of education
• Disfluencies, stuttering => anxiety, hesitation
• Direct forms vs. indirect forms => extraversion, aggression
Character Voice: learning to model specific characters or sets of characters should produce individual character voices
Method
1. Collect movie scripts from IMSDb
2. Extract utterances for each character
3. Select leading roles (dialogue > 60 turns)
4. Generate features reflecting linguistic behaviors
[Diagram: the Pulp Fiction script is split into Vincent’s, Jules’, and others’ dialogue; Jules’ and Vincent’s dialogue are each turned into features: LIWC results, tag question ratio, overall polarity, and other features.]
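Steps 1 to 3 of the method can be sketched in a few lines. This is only an illustration: it assumes each turn sits on one line in the form `NAME: utterance`, as in the scene excerpts on these slides, whereas real IMSDb scripts need a more careful parser. The names `extract_turns` and `leading_roles` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: a speaker label is an all-caps name followed by a colon.
SPEAKER_LINE = re.compile(r"^([A-Z][A-Z .'\-]+):\s*(.+)$")


def extract_turns(script_text):
    """Group dialogue turns by character name (step 2)."""
    turns = defaultdict(list)
    for line in script_text.splitlines():
        match = SPEAKER_LINE.match(line.strip())
        if match:
            turns[match.group(1).strip()].append(match.group(2).strip())
    return dict(turns)


def leading_roles(turns, min_turns=60):
    """Keep characters with more than min_turns turns (step 3)."""
    return {name: utts for name, utts in turns.items() if len(utts) > min_turns}
```

Lines without a speaker label, such as stage directions, are simply skipped in this sketch.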
Method (cont)
5. Learn models of character (z-scores)
6. Generate new utterances using learned models to control parameters of our dialogue generator
[Diagram: the generated features yield Jules’ and Vincent’s learned models, which drive the PERSONAGE generator (ENLG engine) to produce SpyFeet utterances for Jules, Vincent, and others. Story domain: SpyFeet utterances.]
Background: Personage Generator
Automatically producing interesting dialogue
Parameters: lots of different parameters that produce interesting variations in character voices
• But which ones?
Models that control the parameters
Tools that let authors control the parameters & models
Piloted an approach of exposing parameters and models directly to creative writers
Thinking of character voices in terms of parameters is not natural to the creative process
But working with examples, and variations on examples, fits better with existing writing practice
PERSONAGE Generator: BIG FIVE Theory
• Conscientiousness: Dutiful vs. impulsive
• Emotional stability: Calm vs. anxious
• Openness to experience: Imaginative vs. conventional
• Agreeableness: Kind vs. unfriendly
• Extraversion: Sociable, assertive vs. quiet
Linguistic Reflexes of Personality: 50 years of studies
Extraversion (Furnham, 1990)
• Talk more, faster, louder and more repetitively
• Fewer pauses and hesitations
• Lower type/token ratio
• Less formal, more references to context (Heylighen & Dewaele, 2002)
• More positive emotion words (Pennebaker & King, 1999), e.g. happy, pretty, good
Neuroticism (Pennebaker & King, 1999)
• 1st person singular pronouns
• Negative emotion words
Conscientiousness (Pennebaker & King, 1999)
• Fewer negations and negative emotion words
Low but significant correlations
PERSONAGE Architecture: 67 Parameters
INPUT: Dialog Act, Content Pool
• Content Planner: VERBOSITY, RESTATEMENTS, CONTENT POLARITY, …
• Syntactic Template Selection: SYNTACTIC COMPLEXITY, SELF-REFERENCE, …
• Aggregation: CONTRAST (e.g. however, but), JUSTIFY (e.g. so, since), PERIOD, …
• Pragmatic Marker Insertion: EXCLAMATION, HEDGES (e.g. kind of, rather, basically, you know), FILLED PAUSES (e.g. err…), SWEAR WORDS (e.g. damn), IN GROUP MARKERS (e.g. pal), STUTTERING (e.g. Ri-Ri-River), TAG QUESTIONS, …
• Lexical Choice: FREQUENCY OF USE, WORD LENGTH, VERB STRENGTH
• Realization
OUTPUT UTTERANCE
Restaurant Recommendations: 1000’s of variants
Alt | Realization | Extra
5 | Err... it seems to me that Le Marais isn’t as bad as the others. | 1.83
4 | Right, I mean, Le Marais is the only restaurant that is any good. | 2.83
8 | Ok, I mean, Le Marais is a quite french, kosher and steak house place, you know and the atmosphere isn’t nasty, it has nice atmosphere. It has friendly service. It seems to me that the service is nice. It isn’t as bad as the others, is it? | 5.17
9 | Well, it seems to me that I am sure you would like Le Marais. It has good food, the food is sort of rather tasty, the ambience is nice, the atmosphere isn’t sort of nasty, it features rather friendly servers and its price is around 44 dollars. | 5.83
3 | I am sure you would like Le Marais, you know. The atmosphere is acceptable, the servers are nice and it’s a french, kosher and steak house place. Actually, the food is good, even if its price is 44 dollars. | 6.00
10 | It seems to me that Le Marais isn’t as bad as the others. It’s a french, kosher and steak house place. It has friendly servers, you know but it’s somewhat expensive, you know! | 6.17
2 | Basically, actually, I am sure you would like Le Marais. It features friendly service and acceptable atmosphere and it’s a french, kosher and steak house place. Even if its price is 44 dollars, it just has really good food, nice food. | 6.17
Rule-Based Extraversion Generation
[Figure: histogram of utterance count (0 to 40) by extraversion rating (1.0 to 7.0), with separate distributions for extravert and introvert utterances]
• Use correlations in literature to set parameters
• Significant perceptual differences, p < .01
• As binary classification, 90% accuracy
Learning Method: Parameter Estimation Models
Data: 160 randomly generated utterances + generation decisions + ratings
Training multiple continuous parameter models
• Independence assumption between parameters
• Best regression models selected through cross-validation
Example: CONTENT POLARITY
CONTENT POLARITY = −0.102 × emotional stability + 0.970 × agreeableness − 0.110 × conscientiousness + 0.013 × openness to experience + 0.054
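The CONTENT POLARITY example is a plain linear regression over trait scores, so evaluating it is a weighted sum plus an intercept. The sketch below uses only the coefficients shown on the slide; the function name, the dictionary keys, and the scale of the trait inputs are assumptions, since the slide does not say how the trait scores are normalized.

```python
# Coefficients copied from the CONTENT POLARITY example above.
CONTENT_POLARITY_WEIGHTS = {
    "emotional_stability": -0.102,
    "agreeableness": 0.970,
    "conscientiousness": -0.110,
    "openness": 0.013,
}
CONTENT_POLARITY_INTERCEPT = 0.054


def content_polarity(traits):
    """Evaluate the linear model for a dict of trait scores.

    Assumption: traits are already on whatever scale the model was trained on.
    """
    weighted = sum(
        weight * traits[name] for name, weight in CONTENT_POLARITY_WEIGHTS.items()
    )
    return CONTENT_POLARITY_INTERCEPT + weighted
```

Agreeableness dominates the result: its coefficient is roughly nine times larger in magnitude than any other term. Under the independence assumption, each of the other parameters would get its own model of the same shape.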
Learning Method Evaluation Experiment
24 subjects rated 50 utterances
Each utterance hits a combination of Big Five targets
Correlation between target scores and average ratings
Example targets: Extraversion = 3.5, Neuroticism = 1.7, Agreeableness = 6.5, Conscientiousness = 4.0, Openness = 4.5
PERSONAGE is the target ‘rendering engine’ for models learned from film dialogue
Character Creator
Tools for increasing author creativity when writing interactive stories
Create parameter models by data mining utterance sets from lead characters in film dialogues
Discriminative features that map to generation parameters
IMDB: 862 Films
Genre*
Gender
Directors
Film Period: Now-2005, 2005-2000, 2000-1995, 1995-1990, 1990-1985, 1985-1980, before 1980
* images from AMC’s www.filmsite.org
• LEAD CHARACTERS: ~ 2500 characters with > 60 turns
Pulp Fiction
Scene from Pulp Fiction: Jack Rabbit Slim’s
Vincent: What do you think about what happened to Antwan?
Mia: Who's Antwan?
Vincent: Tony Rocky Horror.
Mia: He fell out of a window.
Vincent: That's one way to say it. Another way is, he was thrown out. Another way is, he was thrown out by Marsellus. And even another way is he was thrown out of a window by Marsellus because of you.
Mia: Is that a fact?
Vincent: No it's not, it's just what I heard.
Mia: Who told you this?
Vincent: They.
(Mia and Vincent smile.)
Mia: They talk a lot, don't they?
Vincent: They certainly do.
Extracting Film Dialogue Features (Step 4)
Sample Feature Set | Sample Features
Basic | Number of sentences per turn, number of verbs, number of verbs per sentence, etc.
Linguistic Inquiry and Word Count (LIWC) ratios | Categories: Discrepancy (should, would, could), Pos Emo (love, nice, sweet), Neg Emo (hurt, ugly, nasty), Negate (no, not, never), Certainty (always, never), Tentative (maybe, perhaps, guess), etc.
Polarity | SentiWordNet 3.0 (based on WordNet 3.0) assigns sentiment scores: positivity, negativity, objectivity, e.g. “healthy”: pos=0.75, neg=0, ob=0.25. We accumulate sentiment scores of words in dialogue.
Verb Strength | Sentiment scores of verbs only
Tag Question Ratio | e.g. “You’re John, aren’t you?”
Pragmatic Marker | “you know”, “I mean”, “well”, etc.
…etc.
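A few of the surface features above can be approximated with regular expressions. The sketch below is a rough illustration, not the lab's extractor: the tag-question pattern, the naive sentence splitter, and the choice to normalize tag questions per sentence and markers per turn are all assumptions. The LIWC and SentiWordNet features need their own lexicons and are left out.

```python
import re

# Assumption: a tag question ends in ", <auxiliary>(n't) <pronoun>?".
_AUX = (
    r"(?:(?:is|are|was|were|do|does|did|have|has|had|could|would|should)"
    r"(?:n['’]t)?|can(?:['’]t)?|will|won['’]t)"
)
TAG_QUESTION = re.compile(
    r",\s*" + _AUX + r"\s+(?:I|you|he|she|it|we|they)\s*\?", re.IGNORECASE
)
# Marker list taken from the Pragmatic Marker row above.
PRAGMATIC_MARKERS = ("you know", "i mean", "well")


def split_sentences(turn):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", turn.strip()) if s]


def extract_features(turns):
    """Compute a small feature vector from a non-empty list of turns."""
    if not turns:
        raise ValueError("need at least one turn")
    sentences = [s for turn in turns for s in split_sentences(turn)]
    text = " ".join(turns).lower()
    tag_count = sum(1 for s in sentences if TAG_QUESTION.search(s))
    marker_count = sum(
        len(re.findall(r"\b" + re.escape(m) + r"\b", text)) for m in PRAGMATIC_MARKERS
    )
    return {
        "sentences_per_turn": len(sentences) / len(turns),
        "tag_question_ratio": tag_count / max(len(sentences), 1),
        "pragmatic_markers_per_turn": marker_count / len(turns),
    }
```

Each character's turns would be passed through `extract_features` separately to obtain one feature vector per character.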
Learning Character Models (Step 5)
Each character is represented by a vector of feature values
Two ways of learning character models:
1. Z-scores: train models representing individual characters
2. Classification: train models representing groups of characters
We found z-score models to be more useful for generating utterances, so we then focused on z-scores
Char | Gen | LIWC-Posemo | LIWC-Tentat | LIWC-Discrep | LIWC-Negate | Tag-ratio | Verb-strength | …etc.
Annie | F | 3.33 | 2.08 | 1.27 | 3.57 | 0.0472 | 0.009 | …
Alvy | M | 2.66 | 1.64 | 1.54 | 3.31 | 0.0347 | 0.011 | …
… etc.
Learning Character Models: Z-Scores
Z-scores: individual models trained by normalizing an individual character model against a representative population
• Example: normalize Annie (Annie Hall) against all female characters
• A z-score > 1 or < -1 is more than one standard deviation away from the average
• Indicates parameters that should be high or low
Annie’s z-score = (Annie’s vector − averaged female population) / standard deviation of female population
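The z-score normalization is applied feature by feature. A minimal sketch, assuming each character is a dict of feature values and that the population standard deviation (rather than the sample one) is used; the slides do not say which, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(character, population):
    """Normalize one character's features against a reference population."""
    result = {}
    for feature, value in character.items():
        values = [member[feature] for member in population]
        mu, sigma = mean(values), pstdev(values)
        # A feature with no variance carries no signal; report 0 instead of dividing by 0.
        result[feature] = 0.0 if sigma == 0 else (value - mu) / sigma
    return result


def significant(z, threshold=1.0):
    """Keep features more than `threshold` standard deviations from the mean."""
    return {feature: score for feature, score in z.items() if abs(score) > threshold}
```

For Annie, `population` would hold the feature vectors of all female characters in the corpus. The features returned by `significant` are the ones the model should push high or low.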
Example: Model Learned for Annie
PERSONAGE parameter | Description | Sample mapped features (from character model) | Annie
Verbosity | Control # of propositions in the utterances | Number of sentences per turn, words per sentence | 0.78
Content polarity | Control polarity of propositions expressed | Polarity-overall, LIWC-Posemo, LIWC-Negemo, LIWC-Negate | 0.77
Polarization | Control expressed polarity as neutral or extreme | 1 if polarity-overall is strong negative or positive | 0.72
Concessions | Emphasize one attribute over another | Category-concession | 0.83
Positive content first | Determine whether positive propositions (including the claim) are uttered first | Accept-ratio, Accept-first-ratio | 1.00
… etc.
Map character model to PERSONAGE parameters: weighted average of features. Parameters are either binary or scalar in the range 0…1.
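One way to picture the mapping is a weighted average of rescaled z-scores. The slides give neither the weights nor the rescaling, so everything below is an assumption made for illustration: the clipping range of ±3, the linear squash into 0…1, and the equal weights in the usage note.

```python
def squash(z, limit=3.0):
    """Hypothetical rescaling: clip z to [-limit, limit], then map linearly to [0, 1]."""
    clipped = max(-limit, min(limit, z))
    return (clipped + limit) / (2 * limit)


def parameter_value(z_scores, weights):
    """Weighted average of the rescaled features mapped to one parameter."""
    total = sum(weights.values())
    return sum(w * squash(z_scores[name]) for name, w in weights.items()) / total
```

For instance, with z-scores of 1.5 for sentences per turn and 0.0 for words per sentence, equally weighted, `parameter_value` returns 0.625 for verbosity. A binary parameter could be obtained by thresholding the same value.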
Evaluating Character Models
Would like to be able to estimate the quality of a character model with objective metrics (still developing)
Objective metrics:
• Number of significant features found for individual characters
• Confidence in feature value estimate
Experiments:
• 3 male, 3 female characters, all with a large number of turns
• For each character, randomize dialogue turns and separate them into incrementing segments of ~100 turns
• Segment 1 contains the first ~100 turns, segment 2 contains the first ~100 turns plus the next ~100 turns, and so on
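The incrementing segments can be built by shuffling once and taking growing prefixes. A minimal sketch; the function name, the fixed seed, and the exact segment size of 100 are assumptions.

```python
import random


def incremental_segments(turns, segment_size=100, seed=0):
    """Shuffle the turns once, then return cumulative prefixes of growing size."""
    shuffled = list(turns)
    random.Random(seed).shuffle(shuffled)
    ends = range(segment_size, len(shuffled) + segment_size, segment_size)
    return [shuffled[:end] for end in ends]
```

Each segment contains the previous one, so a character model can be recomputed per segment to see how the number of significant features changes as more dialogue becomes available. The last segment holds all turns even when the total is not a multiple of the segment size.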
Evaluating Character Models: Corpus Size
Evaluating Character Models: Summary
When we have more utterances for an individual character
• Indiana Jones: 3 films
• Hermione Granger: 7 films
Then: the number of features with z>2 or z<-2, as well as z>3 or z<-3, shows an upward trend
Conclusion: more dialogue yields more significant features in the model
Perhaps use TV series?
Evaluating Character Models: Corpus Size on Archetype Groupings
Objective metric: effect of corpus size on number of significant features found for archetype groupings
Result: combining characters by archetypes reduces number of significant features
Leading action heroes: significant attributes and z-scores
• Bourne: (10) LIWC Cause (z=2.05), LIWC Self (z=1.91), LIWC I (z=1.87), LIWC WPS (z=-1.16), LIWC Posemo (z=-1.21), word though (z=-1.41), etc.
• Bourne + The Rock: (9) word since (z=1.56), LIWC Period (z=1.31), LIWC I (z=1.13), word though (z=-1.41), word so (z=-1.51), etc.
• Bourne + The Rock + Independence Day: (7) word because (z=1.29), word though (z=-1.41), word so (z=-1.53)
• Bourne + The Rock + Independence Day + Die Hard: (5) word because (z=1.12), word so (z=-1.31), word though (z=-1.41)
Evaluating Character Models: Archetype
Conclusion:
• Combining different characters within an archetype results in dialogue being “blended” into the whole male population
• At least so far, it works better to use individual models rather than archetype groupings
• Still needed: a better method for combining whole-population models with individual models
Character Creator: Tool for author creativity
Learn models of character voice from film screenplays
Use the learned models to control the parameters of an expressive NLG engine (PERSONAGE)
Apply the learned models to character dialogue in the SpyFeet story domain (a different domain!)
Test human perceptions of the resulting generated utterances
Story Domain: SpyFeet
The story:
• Dr. Cartmill is up to no good…
• 5 guardians of nature (animal spirits) know Cartmill’s plot
Your job: perform tasks to gain the guardians of nature’s trust in order to uncover Dr. Cartmill’s plot
Characters: Wolf, Tiger Beetle, Tortoise, Sparrow, Otter, Dr. Cartmill
Sample character lines:
• Yes, the 5 of us are the guardians of nature...
• And we know what Cartmill is up to…
• What would you do to find out?
• May I have some cabbage?
• Oooo… fresh blood…
SPYFEET: Dynamic NPC/Player dialogue
NSF RI project w/ Kurniawan & Wardrip-Fruin
RPG outdoor AR story
Hypotheses: dynamic elements will increase self-efficacy for exercise, replayability, and immersion
Natural Language and Dialogue Systems http://nlds.soe.ucsc.edu
Original and Generated Utterances
Annie (Annie Hall) original dialogue sample
• H’m? That’s, uh … that’s pretty serious stuff there. Yeah? Yeah? M’hm? M’hm. Yeah. U-huh.
• Hi. Hi, hi. Well, bye. Oh, yeah? So do you. Oh, God, whatta- whatta dumb thing to say, right? I mean, you say it, “You play well,” and right away … I have to say well. Oh, oh … God, Annie. Well … oh, well … la-de-da, la-de-da, la-la
Learning Linguistic Features =>
Annie’s Learned Z-Score Model for our ENLG engine
Verbosity=0.78, Content polarity=0.77, Polarization=0.72, Repetition polarity=0.79, Concessions=0.83, Concessions polarity=0.26, Positive content first=1.00, First person in claim=0.6, Claim polarity=0.57, … etc.
Generation =>
Generated dialogue (SpyFeet story domain)
• Come on, I don’t know, do you? People say Cartmill is strange while I don’t rush to um.. judgment.
• I don’t know. I think that you brought me cabbage, so I will tell something to you, alright?
• Yea, I’m not sure, would you be? Wolf wears a hard shell but he is really gentle.
• I see. I am not sure. Obviously, I respect Wolf. However, he isn’t my close friend, is he?
CAN PEOPLE PERCEIVE THE CHARACTER THAT WAS MODELLED?
Perceptual Experiment
3 scenes, 6 film characters (3 male, 3 female)
• Alvy, Annie (Annie Hall)
• Indy, Marion (Indiana Jones – Raiders)
• Vincent, Mia (Pulp Fiction)
Mix character models with personality models
Collect perceptions of character as well as Big Five
• Ten Item Personality Inventory (TIPI)
• Rating 1 to 7 for each Big Five trait
• Analyze 3 traits only: extraversion, emotional stability, agreeableness
29 users (13 female, 16 male), ages 22 to 44
Web-based experiment
Perceptual Experiment Hypotheses
H1: Rule-based Big Five personality models will be perceived as expressing intended traits in our story domain
• Previously tested ONLY in restaurant domain with PERSONAGE generator
H2: Utterances generated using character models will be perceived as being more similar to that character than utterances generated using another randomly selected character model
Generate Utterances from Character Models
Alvy (Annie Hall)
• I don’t know. People say Cartmill is st-strange, alright? Err... on the other hand, I don’t rush to judgment.
• Right, I am not sure, would you be? I will tell something you because you br-brought me cabbage.
Annie (Annie Hall)
• Come on, I don’t know, do you? People say Cartmill is strange while I don’t rush to um.. judgment.
• I don’t know. I think that you brought me cabbage, so I will tell something to you, alright?
Indy (Indiana Jones)
• I don’t rush to judgment, but people say Cartmill is strange.
• I will tell something you since you brought me cabbage.
• Wolf is gentle but he wears a hard shell.
Vincent (Pulp Fiction)
• Basically, I don’t rush to judgment. On the other hand, people say Cartmill is strange, he is strange.
• Yeah, I can answer since you brought me cabbage that.
Perceptual Experiment Part 1: Original Utterances
Part 1: Personality of Original Character
• Could be used as intermediate representation for models
• Perceived differences perhaps not as distinct as one would like?
Trait | Alvy | Annie | Indy | Marion | Mia | Vincent
Extraversion | 2.8 | 4.4 | 4.2 | 5.5 | 4.8 | 4.6
Emotional Stability | 2.0 | 2.5 | 5.0 | 3.8 | 4.4 | 4.1
Agreeableness | 4.0 | 4.5 | 3.3 | 3.9 | 4.0 | 4.1
Perceptual Experiment Part 2: Generated Utterances
Rule-based personality models vs. film-corpus character models: users do not know which ones are which
Perceptual Experiment Part 3: Can People Tell?
Example: read 3 scenes for Marion, then read 6 sets of generated utterances to determine similarity of style to original utterances
Hypoth 1: Personality Models Perceived?
Result: H1 confirmed for Extraversion and Emotional Stability
• High/low perceived in both extraversion and emotional stability (significant difference, p<0.001)
• High/low not perceived in agreeableness
• Limited set of utterances tested do not show variability in agreeableness?
• Need additional parameters in PERSONAGE generator to show traits of agreeableness?
Trait | High | Low | P-value
Extraversion | 5.2 | 3.3 | <0.001
Emotional Stability | 5.5 | 2.7 | <0.001
Agreeableness | 3.4 | 3.4 | --
Hypoth 2: Character Models Perceived?
Average similarity scores (1 to 7) between character and character models. Perfect performance: a matrix with highest values along the diagonal.
* significant differences between character and character models of each row
Rows: Character; columns: Character Models
Character | Alvy | Annie | Indy | Marion | Mia | Vincent
Alvy | 5.2 | 4.2* | 2.1* | 2.6* | 2.8* | 2.3*
Annie | 4.2 | 4.3 | 2.8* | 3.4* | 3.9 | 2.9*
Indy | 1.4* | 2.2* | 4.5 | 2.8* | 3.3* | 3.8*
Marion | 1.6* | 2.8* | 3.7 | 3.1 | 4.1* | 4.2*
Mia | 1.7* | 2.4* | 4.3 | 3.2 | 3.6 | 4.3
Vincent | 2.1* | 3.2* | 4.5 | 3.5* | 3.6* | 4.6
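The "highest values along the diagonal" criterion can be checked directly against the scores in the table. The sketch below copies the averages (without the significance stars) and assumes rows are the original characters and columns are the character models, as the table header suggests; the function name is hypothetical.

```python
CHARACTERS = ["Alvy", "Annie", "Indy", "Marion", "Mia", "Vincent"]

# Rows: original character; columns: character model used for generation.
SIMILARITY = [
    [5.2, 4.2, 2.1, 2.6, 2.8, 2.3],
    [4.2, 4.3, 2.8, 3.4, 3.9, 2.9],
    [1.4, 2.2, 4.5, 2.8, 3.3, 3.8],
    [1.6, 2.8, 3.7, 3.1, 4.1, 4.2],
    [1.7, 2.4, 4.3, 3.2, 3.6, 4.3],
    [2.1, 3.2, 4.5, 3.5, 3.6, 4.6],
]


def diagonal_wins(names, matrix):
    """Characters whose own model received the highest score in their row."""
    return [name for i, name in enumerate(names) if matrix[i][i] == max(matrix[i])]


print(diagonal_wins(CHARACTERS, SIMILARITY))
# prints ['Alvy', 'Annie', 'Indy', 'Vincent']
```

Four of the six characters score highest with their own model. For Marion and Mia, the Indy, Mia, or Vincent models score higher than their own.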
Summary
• Character parameter models learned from film dialogue
• Models are applied to utterances in a NEW domain
• Utterances produced using character models are generally perceived as similar to that character
• Corpus-based character models more specific to character voice than Big Five personality
Current work:
• Better ways to learn initial models
• Ways to improve initial models with author feedback
Current experiments: Improve models
Refining Corpus Models with Active Learning
Hybrid Models
• Film corpus models
• User feedback on desired (more like personality learned models)
Refine corpus models to author-desired utterance styles using active learning (iterative supervised learning)
The learner chooses PERSONAGE parameters to generate utterances and queries users for a similarity rating to their ideal utterances
Active Learning Model
1. System picks a parameter to generate utterances and asks the author for a similarity rating to the target model
2. Author compares these utterances with the target model utterances (in their own head) and rates them 1 to 7
3. System uses the author rating to update the parameter values accordingly
4. Repeat the entire process until the initial model and target model are “close enough”
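The four steps form a query-and-update loop. The sketch below is not the lab's learner: it swaps in a simple random hill climb and replaces the author with a simulated rater that scores by distance to a known target. The parameter names in the usage note, the step size, and the rating formula are all assumptions.

```python
import random


def simulated_rating(params, target):
    """Stand-in for the author: 7 when params match the target, falling toward 1."""
    distance = sum(abs(params[k] - target[k]) for k in target) / len(target)
    return 1.0 + 6.0 * (1.0 - distance)


def refine(initial, target, iterations=25, step=0.1, close_enough=6.9, seed=0):
    """Nudge one parameter at a time and keep changes that raise the rating."""
    rng = random.Random(seed)
    current = dict(initial)
    best = simulated_rating(current, target)
    for _ in range(iterations):
        name = rng.choice(sorted(current))               # step 1: pick a parameter
        candidate = dict(current)
        moved = candidate[name] + rng.choice((-step, step))
        candidate[name] = min(1.0, max(0.0, moved))      # parameters stay in 0..1
        rating = simulated_rating(candidate, target)     # step 2: ask for a rating
        if rating > best:                                # step 3: update the model
            current, best = candidate, rating
        if best >= close_enough:                         # step 4: stop when close
            break
    return current, best
```

A call such as `refine({"verbosity": 0.2}, {"verbosity": 0.8})` returns the refined parameters together with their final simulated rating. With a real author in the loop, `simulated_rating` would be replaced by the 1 to 7 rating the author gives to the generated utterances.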
Initial and Target Models: Simulation first
Simulate an initial model progressing to a target model by querying the target model for utterance similarity ratings
Initial Model
1. Given a film character model (z-scores), map to PERSONAGE input parameter values
2. Vary unspecified parameter values to generate different samples
3. Feed back similarity ratings (1 to 7) between these samples and the original film character dialogue
4. Create ARFF files for learning these initial models
Target Model (blended model to drive simulations)
• Mix film character model (step 1 of initial model) and Big Five personality model (previous work on restaurant recommendation)
• Example: Mia from Pulp Fiction and extraversion
Preliminary Results
Pilot author annotation (not Mechanical Turk) on learning the initial model
Initial Model (Mia from Pulp Fiction) | Active-Learning Model (after 25 iterations) | Target Model (Mia + extraversion)
I don't race to assessment. People, however, say Cartmill is strange. | I am sorry but people say Cartmill is unusual. I don't, however, race to judgement. | I am sorry but people say Cartmill is unusual. I don't race to judgment.
I will tell you something. Since you brought me cabbage, I will tell you something. | I will tell you something since you brought me cabbage.
Wolf is gentle. He, however, wears a hard shell. | Wolf is gentle. He, however, wears a hard shell. | Wolf is gentle. He, however, wears a hard shell.
Since Sparrow brings excitement to my life, I am allies with her. | Since Sparrow brings excitement to my life, I am allies with her. | I am allies with Sparrow since she brings excitement to my life.
Analysis
After 25 iterations, some utterances are closer to the target:
• Active-Learning Model: I am sorry but people say Cartmill is unusual. I don't, however, race to judgment.
• Target Model: I am sorry but people say Cartmill is unusual. I don't race to judgment.
However, some don’t change
Currently working on refining the learning process to improve performance (requiring fewer iterations)
Questions?
Extra SLIDES
Pulp Fiction: Dialogue strategy of clarification Q, repetition of other
Scene from Pulp Fiction: Jack Rabbit Slim’s after food has arrived
Vincent: What do you think about what happened to Antwan?
Mia: Who's Antwan?
Vincent: Tony Rocky Horror.
Mia: He fell out of a window.
Vincent: That's one way to say it. Another way is, he was thrown out. Another way is, he was thrown out by Marsellus. And even another way is he was thrown out of a window by Marsellus because of you.
Mia: Is that a fact?
Vincent: No it's not, it's just what I heard.
Mia: Who told you this?
Vincent: They.
(Mia and Vincent smile.)
Mia: They talk a lot, don't they?
Vincent: They certainly do.
Personage in restaurant recommendation domain
Natural Language Generation Architecture
• What to Say: Content Planner
• How to Say It: Sentence Planner, Surface Realizer, Prosody Assigner
• What is Heard: Speech Synthesizer
We want to map findings like those into this architecture
What does the Content Planner do?
USER: I’d like a French restaurant on the Upper West Side.
SYSTEM DM:
  speech-act: recommend
  relations: justify(nuc:1, sat:2); justify(nuc:1, sat:3); justify(nuc:1, sat:4); justify(nuc:1, sat:5); justify(nuc:1, sat:6)
  content:
    1. assert(best(LeMarais))
    2. assert(has-att(LeMarais, service(3)))
    3. assert(has-att(LeMarais, fqual(5)))
    4. assert(has-att(LeMarais, price(44)))
    5. assert(has-att(LeMarais, ftype(french-kosher)))
    6. assert(has-att(LeMarais, décor(4)))
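The content plan above is a small data structure: a speech act, numbered assertions, and rhetorical relations linking a nucleus to its satellites. One possible encoding is sketched below; the class and field names are hypothetical and are not PERSONAGE's internal representation.

```python
from dataclasses import dataclass


@dataclass
class ContentPlan:
    speech_act: str
    assertions: dict  # assertion id -> assertion text
    relations: list   # (relation name, nucleus id, satellite id)

    def satellites_of(self, nucleus):
        """Ids of the assertions that justify (or otherwise support) a nucleus."""
        return [sat for _, nuc, sat in self.relations if nuc == nucleus]


# The Le Marais plan from the slide, transcribed into this structure.
plan = ContentPlan(
    speech_act="recommend",
    assertions={
        1: "best(LeMarais)",
        2: "has-att(LeMarais, service(3))",
        3: "has-att(LeMarais, fqual(5))",
        4: "has-att(LeMarais, price(44))",
        5: "has-att(LeMarais, ftype(french-kosher))",
        6: "has-att(LeMarais, décor(4))",
    },
    relations=[("justify", 1, sat) for sat in range(2, 7)],
)
```

Here `plan.satellites_of(1)` lists the five attribute assertions that justify the claim that Le Marais is the best choice. A content planner that varies verbosity could drop some of these satellites before the plan reaches the sentence planner.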
CP selects the assertions & relations to achieve dialog goal
Content VARIES (positive, negative, user preferences, plan structures)
• Right, I mean, Le Marais is the only restaurant that is any good.
• Ok, I mean, Le Marais is a quite french, kosher and steak house place, you know and it has nice atmosphere. It seems to me that the service is nice. It isn’t as bad as the others, is it?
• I am sure you would like Le Marais, you know. The servers are nice and it’s a french, kosher and steak house place. Actually, the food is good, even if its price is 44 dollars.
• Basically, actually, I am sure you would like Le Marais. It features friendly service and it’s a french, kosher and steak house place. Even if its price is 44 dollars, it just has really good food, nice food.
What does the Sentence Planner do?
USER: I’d like a French restaurant on the Upper West Side.
SYSTEM DM:
  speech-act: recommend
  relations: justify(nuc:1, sat:2); justify(nuc:1, sat:3); justify(nuc:1, sat:4); justify(nuc:1, sat:5)
  content:
    1. assert(best(LeMarais))
    2. assert(has-att(LeMarais, service(3)))
    3. assert(has-att(LeMarais, fqual(5)))
    4. assert(has-att(LeMarais, price(44)))
    5. assert(has-att(LeMarais, ftype(french-kosher)))
SP maps content plan to many forms & stylistic variants
Possible realizations (among 100s):
• I am sure you would like Le Marais, you know. The servers are nice and it’s a french, kosher and steak house place. Actually, the food is good, even if its price is 44 dollars.
• It seems to me that Le Marais isn’t as bad as the others. It’s a french, kosher and steak house place. It has friendly servers, but it’s somewhat expensive, even if the food is pretty good.
• Basically, actually, I am sure you would like Le Marais. It features friendly service you know and it’s a french, kosher and steak house place. Even if its price is 44 dollars, it just has really good food, nice food.
Example of Pragmatic Transformation
Negation insertion: “X has awful food” => “X doesn’t have good food”
Input tree:
  have (class: verb)
    Subj: Wok Mania (class: proper noun, number: sg)
    Obj: food (class: noun, number: sg, article: none)
      ATTR: awful (class: adjective)
Transformation: look for the antonym of the adjective in the WordNet database (“good”), then
  - Negate verb
  - Replace adjective by antonym
Output tree:
  have (class: verb, negated: true)
    Subj: Wok Mania (class: proper noun, number: sg)
    Obj: food (class: noun, number: sg, article: none)
      ATTR: good (class: adjective)
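The negation-insertion transformation can be mimicked on a flat clause record. This is a toy sketch: a one-entry dictionary stands in for the WordNet antonym lookup, the clause is a plain dict rather than a syntactic tree, and `realize` only handles the singular verb "have". All names are hypothetical.

```python
# Toy stand-in for the WordNet antonym lookup; only the slide's example is included.
ANTONYMS = {"awful": "good"}


def negate_with_antonym(clause):
    """Negate the verb and replace the adjective by its antonym, if one is known."""
    adjective = clause["attr"]
    if adjective not in ANTONYMS:
        return dict(clause)  # no antonym available: leave the clause unchanged
    transformed = dict(clause)
    transformed["attr"] = ANTONYMS[adjective]
    transformed["negated"] = not clause.get("negated", False)
    return transformed


def realize(clause):
    """Very small realizer for clauses of the form 'X has <adj> <noun>'."""
    verb = "doesn't have" if clause.get("negated") else "has"
    return f'{clause["subj"]} {verb} {clause["attr"]} {clause["obj"]}'


clause = {"subj": "Wok Mania", "verb": "have", "attr": "awful", "obj": "food"}
print(realize(clause))                       # prints: Wok Mania has awful food
print(realize(negate_with_antonym(clause)))  # prints: Wok Mania doesn't have good food
```

The propositional content stays the same while the surface form becomes more indirect, which is the kind of stylistic variation the pragmatic marker stage is meant to control.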
Character In Film: One utterance tells it all
Utterances express *multiple* personality traits
So where are we?
• A flexible, real-time generator
• Socially relevant & personality parameters
• Methods for automatically training
• PERSONAGE trained for ‘Big Five personality’ but could be trained to optimize another feedback measure, e.g. dramatic character
• Personalize both content and form
• Standard meaning representations: DB Relations, Content Plan, AI planner
Corpus Annotation: 3 Human Judges (Ten-Item Personality Inventory, Gosling et al. 03)
Extraversion = 3.5, Neuroticism = 2.0, Agreeableness = 6.5, Conscientiousness = 4.0, Openness = 1.5