
NATURAL LANGUAGE AND DIALOGUE SYSTEMS LAB UC SANTA CRUZ


Expressive Generation for Interactive Stories

Marilyn A. Walker, Ricky Grant, Jennifer Sawyer, Grace I. Lin, Noah Wardrip-Fruin, and Michael Buell

The Character Creator Project NSF Creative IT IIS-1002921

Stories told through dialogue

  What characters say
  How they say it
  How they react to what other characters say

Interactive Stories, Role Playing Games

  Role-playing games are a type of interactive narrative game involving a story where the player takes on a role of a character in the story world.

  Role-playing games are one of the most successful types of games

  World of Warcraft, one of the largest role-playing games, has 11.5 million players by recent counts

Role playing games used to have fixed visuals

Now games usually dynamically generate animations/scenes

Dialogue in games is where visuals were 20 years ago

Authoring is very expensive and limits game play

  Expensive:
    Script for Baldur’s Gate 2 was 3,257 pages long.
    Planescape: Torment contained nearly a million words of dialogue.
    The Old Republic (2011) had authors developing content since 2006, with a team of at least twelve full-time writers.

  Limiting:
    Unlike film, games are interactive
    Exponential growth in authoring: every time a choice point is supported, write a new dialogue tree for each branch.
    Dialogue choices often do not reflect player’s previous choices
    If a character is killed, any subplots with that character must be removed

Immersivity, Story, Meaningful Choice

  Are such dialogue choices as meaningful as they might be?

  Do they make us engage with the story?

  Goal: support player agency, narrative choice

  Solution: Write more dialogue and better dialogue trees?

Procedural Language Generation: A Key Technology

  Provides Parameters & Models
  Abstract & Modular Interfaces
  Trainable: Machine Learning Techniques

=> Greater Scalability, More Immersivity, Better Stories

  There is no other way to make it possible to personalize dialogue interaction to an individual and their history playing the game

  Similar issues across all dialogue applications

Narrative Dialogue Generation Challenges

  Revealing Subtext: key parts of narrative are not explicitly stated
    Character Personality: I am a friendly person.
    Character Emotion: I am feeling hesitant.
    Character Motivation: I intend to flatter you.

  Expressing a unique Character Voice
    Who is this person?
    More than what we need for other dialogue applications

Natural Language Generation

  Most NLG systems have not focused on expressing character and personality for narrative applications

  Expressive Natural Language Generation (ENLG) focuses on stylistic, social aspects of the linguistic behavior of dramatic characters
    Politeness theory: Walker et al., 1997; André et al., 2000; Cassell & Bickmore, 2003; Wang et al., 2005
    Personality: Ball & Breese, 2000; Loyall & Bates, 1997; Isard et al., 2006; Mairesse & Walker, 2007, 2008
    Archetypes: Rowe, Ha & Lester, 2009

Character Creator: Tool for author creativity

  Tool for automatically ‘rendering’ variations in dialogue

  Learn models of character voice (linguistic style) from film screenplays

  Use the learned models to control the parameters of an expressive NLG engine (PERSONAGE)

  Apply the learned models to control the style of character dialogue in a story

  Test human perceptions of the resulting generated utterances

Film Corpus

  862 film scripts from IMSDb, as of May 19, 2010
  7,400 characters
  664,000 lines of dialogue
  About 9.6 million tokens

Scene from Annie Hall: Lobby of Sports Club

ALVY: Uh … you-you wanna lift?
ANNIE: (turning and aiming her thumb over her shoulder) Oh, why-uh … y-y-you gotta car?
ALVY: No, um … I was gonna take a cab.
ANNIE: (laughing) Oh, no, I have a car.
ALVY: You have a car? (Annie smiles, hands folded in front of her) So … (clears his throat) I don’t understand why … if you have a car, so then-then wh-why did you say “Do you have a car?” … like you wanted a lift?

Annie Hall: Getting a lift

The Terminator: getting a lift

Scene from The Terminator: Cigar biker

TERMINATOR: I need your clothes, your boots, and your motorcycle. CIGAR BIKER: You forgot to say please.

Terminator hurls Cigar, all 230 pounds of him, clear over the bar, through the serving window into the kitchen, where he lands on the big flat GRILL. We hear a SOUND like SIZZLING BACON as Cigar screams, flopping jerking. He rolls off in a smoking heap.

What can we learn from a corpus?

  Reveal Subtext: The way a character says something is one way to reveal subtext and character emotion
    Short vs. long turns => friendliness, formality
    Word choice => level of education
    Disfluencies, stuttering => anxiety, hesitation
    Direct forms vs. indirect forms => extraversion, aggression

  Character Voice: Learning to model specific characters or sets of characters should produce individual character voices

Method

  1. Collect movie scripts from IMSDb
  2. Extract utterances for each character
  3. Select leading roles (dialogue > 60 turns)
  4. Generate features reflecting linguistic behaviors

[Figure: the Pulp Fiction script is split into Vincent’s, Jules’, and others’ dialogue; each lead character’s dialogue yields LIWC results, a tag question ratio, an overall polarity, and other features.]

Method (cont)

  5. Learn models of character (z-scores)
  6. Generate new utterances using learned models to control parameters of our dialogue generator

[Figure: the generated features give Jules’ and Vincent’s learned models; the PERSONAGE generator (ENLG engine) applies them to utterances from the SpyFeet story domain, producing Jules, Vincent, and other variants of the SpyFeet utterances.]

Background: Personage Generator

Automatically producing interesting dialogue

  Parameters: lots of different parameters that produce interesting variations in character voices
    But which ones?
  Models that control the parameters
  Tools that let authors control the parameters & models
  Piloted an approach of exposing parameters and models directly to creative writers
    Not natural to the creative process to think of character voices in terms of parameters
    But working with examples, and variations on examples, fits better with existing writing practice

PERSONAGE Generator: BIG FIVE Theory

  Conscientiousness: Dutiful vs. impulsive
  Emotional stability: Calm vs. anxious
  Openness to experience: Imaginative vs. conventional
  Agreeableness: Kind vs. unfriendly
  Extraversion: Sociable, assertive vs. quiet


Linguistic Reflexes of Personality: 50 years of studies

  Extraversion (Furnham, 1990)
    Talk more, faster, louder and more repetitively
    Fewer pauses and hesitations
    Lower type/token ratio
    Less formal, more references to context (Heylighen & Dewaele, 2002)
    More positive emotion words (Pennebaker & King, 1999), e.g. happy, pretty, good

  Neuroticism (Pennebaker & King, 1999)
    1st person singular pronouns
    Negative emotion words

  Conscientiousness (Pennebaker & King, 1999)
    Fewer negations and negative emotion words

  Low but significant correlations

PERSONAGE Architecture: 67 Parameters

INPUT: Dialog Act, Content Pool

  Content Planner: VERBOSITY, RESTATEMENTS, CONTENT POLARITY, …
  Syntactic Template Selection: SYNTACTIC COMPLEXITY, SELF-REFERENCE, …
  Aggregation: CONTRAST (e.g. however, but), JUSTIFY (e.g. so, since), PERIOD, …
  Pragmatic Marker Insertion: EXCLAMATION, HEDGES (e.g. kind of, rather, basically, you know), FILLED PAUSES (e.g. err…), SWEAR WORDS (e.g. damn), IN GROUP MARKERS (e.g. pal), STUTTERING (e.g. Ri-Ri-River), TAG QUESTIONS
  Lexical Choice: FREQUENCY OF USE, WORD LENGTH, VERB STRENGTH
  Realization

OUTPUT UTTERANCE

Restaurant Recommendations: 1000’s of variants

Alternative realizations (with extraversion rating where shown)

  Alt 5 (extraversion rating 1.83): Err... it seems to me that Le Marais isn’t as bad as the others.
  Alt 4 (extraversion rating 2.83): Right, I mean, Le Marais is the only restaurant that is any good.
  Alt 8: Ok, I mean, Le Marais is a quite french, kosher and steak house place, you know and the atmosphere isn’t nasty, it has nice atmosphere. It has friendly service. It seems to me that the service is nice. It isn’t as bad as the others, is it?
  Alt 9: Well, it seems to me that I am sure you would like Le Marais. It has good food, the food is sort of rather tasty, the ambience is nice, the atmosphere isn’t sort of nasty, it features rather friendly servers and its price is around 44 dollars.
  Alt 3: I am sure you would like Le Marais, you know. The atmosphere is acceptable, the servers are nice and it’s a french, kosher and steak house place. Actually, the food is good, even if its price is 44 dollars.
  Alt 10: It seems to me that Le Marais isn’t as bad as the others. It’s a french, kosher and steak house place. It has friendly servers, you know but it’s somewhat expensive, you know!
  Alt 2: Basically, actually, I am sure you would like Le Marais. It features friendly service and acceptable atmosphere and it’s a french, kosher and steak house place. Even if its price is 44 dollars, it just has really good food, nice food.

[Figure: extraversion rating scale from 1.0 (introvert) to 7.0 (extravert).]

Rule-Based Extraversion Generation

  Use correlations in literature to set parameters
  Significant perceptual differences, p < .01
  As binary classification, 90% accuracy

Learning Method: Parameter Estimation Models

  Data: 160 randomly generated utterances + generation decisions + ratings

  Training Multiple Continuous Parameters Models
    Independence assumption between parameters
    Best regression models selected through cross-validation

  Example: CONTENT POLARITY

CONTENT POLARITY = - 0.102 x emotional stability + 0.970 x agreeableness - 0.110 x conscientiousness + 0.013 x openness to experience + 0.054
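
As a rough illustration, the regression above can be evaluated directly for a set of target trait scores. The coefficients are copied from the example; the trait scale is not stated on the slide, so the sketch assumes targets already normalized to 0…1, and the function and key names are hypothetical.

```python
# Coefficients copied from the CONTENT POLARITY example on the slide.
COEFFICIENTS = {
    "emotional_stability": -0.102,
    "agreeableness": 0.970,
    "conscientiousness": -0.110,
    "openness_to_experience": 0.013,
}
INTERCEPT = 0.054


def content_polarity(traits):
    """Linear parameter estimate for one generation parameter.

    `traits` maps trait names to target scores. Assumption: the scores
    are normalized to 0..1 (the slide does not state the scale).
    """
    return INTERCEPT + sum(w * traits[name] for name, w in COEFFICIENTS.items())


if __name__ == "__main__":
    # A highly agreeable target with mid-range values elsewhere.
    target = {
        "emotional_stability": 0.5,
        "agreeableness": 0.9,
        "conscientiousness": 0.5,
        "openness_to_experience": 0.5,
    }
    print(round(content_polarity(target), 3))
```

Under the independence assumption listed above, one such model would be fitted per parameter, so a full set of target scores yields one estimate for each generation parameter.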

Learning Method Evaluation Experiment

  24 subjects rated 50 utterances
  Each utterance hits a combination of Big Five targets
  Correlation between target scores and average ratings

Example: Extraversion = 3.5, Neuroticism = 1.7, Agreeableness = 6.5, Conscientiousness = 4.0, Openness = 4.5

PERSONAGE is the target ‘rendering engine’ for models learned from film dialogue

Character Creator

  Tools for increasing author creativity when writing interactive stories

  Create parameter models by data mining utterance sets from lead characters in film dialogues

  Discriminative features that map to generation parameters


IMDB: 862 Films

Genre*

Gender

Directors

Film Period: Now-2005, 2005-2000, 2000-1995, 1995-1990, 1990-1985, 1985-1980, before 1980

* images from AMC’s www.filmsite.org

•  LEAD CHARACTERS: ~ 2500 characters with > 60 turns


Scene from Pulp Fiction: Jack Rabbit Slim’s

Vincent: What do you think about what happened to Antwan?
Mia: Who's Antwan?
Vincent: Tony Rocky Horror.
Mia: He fell out of a window.
Vincent: That's one way to say it. Another way is, he was thrown out. Another way is, he was thrown out by Marsellus. And even another way is he was thrown out of a window by Marsellus because of you.
Mia: Is that a fact?
Vincent: No it's not, it's just what I heard.
Mia: Who told you this?
Vincent: They.
(Mia and Vincent smile.)
Mia: They talk a lot, don't they?
Vincent: They certainly do.

Pulp Fiction

Extracting Film Dialogue Features (Step 4)

Sample feature sets and sample features

  Basic: number of sentences per turn, number of verbs, number of verbs per sentence, etc.
  Linguistic Inquiry and Word Count (LIWC) ratios: categories such as Discrepancy (should, would, could), Pos Emo (love, nice, sweet), Neg Emo (hurt, ugly, nasty), Negate (no, not, never), Certainty (always, never), Tentative (maybe, perhaps, guess), etc.
  Polarity: SentiWordNet 3.0 (based on WordNet 3.0) assigns sentiment scores for positivity, negativity, and objectivity, e.g. “healthy”: pos=0.75, neg=0, obj=0.25. We accumulate the sentiment scores of the words in the dialogue.
  Verb Strength: sentiment scores of verbs only
  Tag Question Ratio: “You’re John, aren’t you?”
  Pragmatic Marker: “you know”, “I mean”, “well”, etc.
  …etc.
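
A few of the features above can be sketched with the standard library alone. This is only an illustration, not the project's extraction code: the sentence splitter, the tag-question pattern, and the choice of sentences as the denominator of the tag question ratio are all assumptions, and the pattern misses irregular contractions such as "won't" or "can't".

```python
import re

# Pragmatic markers named on the slide.
PRAGMATIC_MARKERS = ("you know", "i mean", "well")

# Assumed pattern: ", <auxiliary>[n't] <pronoun>?" at the end of a sentence.
TAG_QUESTION = re.compile(
    r",\s*(?:is|are|was|were|do|does|did|have|has|had|could|would|should)"
    r"(?:n't|n’t)?\s+(?:i|you|he|she|it|we|they)\s*\?",
    re.IGNORECASE,
)


def split_sentences(turn):
    """Naive sentence splitter: break after '.', '?' or '!'."""
    return [s for s in re.split(r"(?<=[.?!])\s+", turn.strip()) if s]


def extract_features(turns):
    """Compute a few basic features over a non-empty list of dialogue turns."""
    sentences = [s for turn in turns for s in split_sentences(turn)]
    text = " ".join(turns).lower()
    tag_count = sum(1 for s in sentences if TAG_QUESTION.search(s))
    return {
        "sentences_per_turn": len(sentences) / len(turns),
        # Assumption: the ratio is taken over sentences, not turns.
        "tag_question_ratio": tag_count / len(sentences),
        "pragmatic_markers": {
            m: len(re.findall(r"\b" + re.escape(m) + r"\b", text))
            for m in PRAGMATIC_MARKERS
        },
    }


if __name__ == "__main__":
    turns = ["They talk a lot, don't they?", "Well, I mean, it is late. You know that."]
    print(extract_features(turns))
```

LIWC and SentiWordNet features need their respective lexicons and are left out of the sketch.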


Learning Character Models (Step 5)

  Each character is represented by a vector of feature values

Two ways of learning character models
  1. Z-scores: train models representing individual characters
  2. Classification: train models representing groups of characters

We found z-score models to be more useful for generating utterances, so we then focused on z-scores

Char   Gen  LIWC-Posemo  LIWC-Tentat  LIWC-Discrep  LIWC-Negate  Tag-ratio  Verb-strength  …etc.
Annie  F    3.33         2.08         1.27          3.57         0.0472     0.009
Alvy   M    2.66         1.64         1.54          3.31         0.0347     0.011
… etc.

Learning Character Models: Z-Scores

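Z-scores: individual models trained by normalizing an individual character model against a representative population

Example: normalize Annie (Annie Hall) against all female characters

$z_{\text{Annie}} = \dfrac{x_{\text{Annie}} - \mu_{\text{female}}}{\sigma_{\text{female}}}$

where $x_{\text{Annie}}$ is Annie’s feature vector, $\mu_{\text{female}}$ is the average over the female population, and $\sigma_{\text{female}}$ is the standard deviation of the female population.

  z-score > 1 or < -1 is more than one standard deviation away from the average
  Indicates parameters that should be high or low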
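
The normalization above can be sketched per feature as follows. Annie's two feature values come from the table earlier in the deck; the population vectors are made-up placeholders, and using the population standard deviation (rather than the sample one) is an assumption.

```python
from statistics import mean, pstdev


def z_scores(character, population):
    """Z-score of each feature of `character` against `population`.

    `character` is a dict of feature values; `population` is a list of
    such dicts for the reference group (e.g. all female characters).
    """
    scores = {}
    for feature, value in character.items():
        values = [member[feature] for member in population]
        mu, sigma = mean(values), pstdev(values)
        # A feature that never varies in the population carries no signal.
        scores[feature] = (value - mu) / sigma if sigma > 0 else 0.0
    return scores


def significant(scores, threshold=1.0):
    """Keep features more than `threshold` standard deviations from the mean."""
    return {f: z for f, z in scores.items() if abs(z) > threshold}


if __name__ == "__main__":
    annie = {"liwc_posemo": 3.33, "liwc_tentat": 2.08}  # values from the table
    # Hypothetical reference population, for illustration only.
    population = [
        {"liwc_posemo": 2.0, "liwc_tentat": 2.0},
        {"liwc_posemo": 2.5, "liwc_tentat": 2.2},
        {"liwc_posemo": 3.0, "liwc_tentat": 1.9},
    ]
    print(significant(z_scores(annie, population)))
```

Raising `threshold` to 2 or 3 gives the stricter significance counts used when evaluating the character models.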

Example: Model Learned for Annie

PERSONAGE parameter: description; sample mapped features (from character model); value where shown

  Verbosity: control the number of propositions in the utterance; mapped from number of sentences per turn, words per sentence
  Content polarity: control the polarity of the propositions expressed; mapped from Polarity-overall, LIWC-Posemo, LIWC-Negemo, LIWC-Negate
  Polarization: control the expressed polarity as neutral or extreme; 1 if polarity-overall is strong negative or positive
  Concessions: emphasize one attribute over another; mapped from Category-concession; value 0.83
  Positive content first: determine whether positive propositions, including the claim, are uttered first; mapped from Accept-ratio, Accept-first-ratio; value 1.00
  … etc.

Map character model to PERSONAGE parameters: weighted average of features. Parameters are either binary or scalar in the range 0…1.
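
A minimal sketch of the weighted-average mapping, assuming the mapped features have already been rescaled to 0…1. The slide does not give the weights or the rescaling, so the weights, feature values, and the 0.5 threshold below are hypothetical.

```python
def scalar_parameter(features, weights):
    """Weighted average of the mapped features, clamped to the 0..1 range.

    Assumption: every feature in `weights` is already rescaled to 0..1.
    """
    total = sum(weights.values())
    value = sum(features[name] * w for name, w in weights.items()) / total
    return min(1.0, max(0.0, value))


def binary_parameter(value, threshold=0.5):
    """Turn a scalar estimate into an on/off parameter (assumed threshold)."""
    return 1 if value >= threshold else 0


if __name__ == "__main__":
    # Hypothetical rescaled feature values and weights for Verbosity.
    features = {"sentences_per_turn": 0.8, "words_per_sentence": 0.7}
    weights = {"sentences_per_turn": 2.0, "words_per_sentence": 1.0}
    verbosity = scalar_parameter(features, weights)
    print(round(verbosity, 2), binary_parameter(verbosity))
```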

Evaluating Character Models

  Would like to be able to estimate the quality of a character model with objective metrics (still developing)

  Objective metrics:
    Number of significant features found for individual characters
    Confidence in feature value estimate

  Experiments
    3 male, 3 female, all with a large number of turns
    For each character, randomize dialogue turns and separate into incrementing segments of ~100 turns
    Segment 1 contains the first ~100 turns, segment 2 contains the first ~100 turns plus the next ~100 turns, and so on
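
The incrementing segments described above can be built as nested prefixes of a shuffled turn list. The function name, the fixed seed, and the exact segment size of 100 are illustrative assumptions.

```python
import random


def cumulative_segments(turns, segment_size=100, seed=0):
    """Shuffle the turns once, then return growing prefixes.

    Segment 1 holds the first `segment_size` turns, segment 2 the first
    2 * `segment_size`, and so on; the last segment holds every turn.
    """
    shuffled = list(turns)
    random.Random(seed).shuffle(shuffled)
    ends = range(segment_size, len(shuffled) + segment_size, segment_size)
    return [shuffled[:end] for end in ends]


if __name__ == "__main__":
    turns = [f"turn {i}" for i in range(250)]
    print([len(segment) for segment in cumulative_segments(turns)])
```

Recomputing the character model on each segment then shows how the number of significant features changes as more dialogue becomes available.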

Evaluating Character Models: Corpus Size

Evaluating Character Models: Summary

  When we have more utterances for an individual character
    Indiana Jones: 3 films
    Hermione Granger: 7 films

  Then: the number of features with z>2 or z<-2, as well as with z>3 or z<-3, shows an upward trend

  Conclusion: more dialogue yields more significant features in the model

  Perhaps use TV Series?

Evaluating Character Models: Corpus Size on Archetype Groupings

  Objective metric: effect of corpus size on number of significant features found for archetype groupings
  Result: combining characters by archetypes reduces number of significant features

Leading action heroes significant attributes and z-scores

Bourne: (10) LIWC Cause (z=2.05), LIWC Self (z=1.91), LIWC I (z=1.87), LIWC WPS (z=-1.16), LIWC Posemo (z=-1.21), word though (z=-1.41), etc.

Bourne + The Rock: (9) word since (z=1.56), LIWC Period (z=1.31), LIWC I (z=1.13), word though (z=-1.41), word so (z=-1.51), etc.

Bourne + The Rock + Independence Day: (7) word because (z=1.29), word though (z=-1.41), word so (z=-1.53)

Bourne + The Rock + Independence Day + Die Hard: (5) word because (z=1.12), word so (z=-1.31) word though (z=-1.41)

Conclusion
  Combining different characters within an archetype results in dialogue being “blended” into the whole male population
  At least so far, it works better to use individual models rather than archetype groupings
  Still needed: a better method for combining whole-population models with individual models

Evaluating Character Models: Archetype

Character Creator: Tool for author creativity

  Learn models of character voice from film screenplays

  Use the learned models to control the parameters of an expressive NLG engine (PERSONAGE)

  Apply the learned models to character dialogue in the SpyFeet story domain
    A different domain!

  Test human perceptions of the resulting generated utterances

[Figure: SpyFeet characters (Tiger Beetle, Tortoise, Sparrow, Dr. Cartmill) with speech bubbles: “Yes, the 5 of us are the guardians of nature...”, “What would you do to find”, “May I have some cabbage?”, “And we know what Cartmill is up to…”, “Oooo… fresh blood…”]

The story:
•  Dr. Cartmill is up to no good…
•  5 guardians of nature (animal spirits) know Cartmill’s plot

Your job: Perform tasks to gain the guardians of nature’s trust in order to uncover Dr. Cartmill’s plot

Story Domain: SpyFeet

  NSF RI: project w/ Kurniawan & Wardrip-Fruin
  RPG outdoor AR story.
  Hypotheses: Dynamic elements will increase self-efficacy for exercise, replayability, and immersion

Natural Language and Dialogue Systems http://nlds.soe.ucsc.edu

SPYFEET: Dynamic NPC/Player dialogue

Annie (Annie Hall) original dialogue sample

•  H’m? That’s, uh … that’s pretty serious stuff there. Yeah? Yeah? M’hm? M’hm. Yeah. U-huh. •  Hi. Hi, hi. Well, bye. Oh, yeah? So do you. Oh, God, whatta- whatta dumb thing to say, right? I mean, you say it, “You play well,” and right away … I have to say well. Oh, oh … God, Annie. Well … oh, well … la-de-da, la-de-da, la-la

Generated dialogue (SpyFeet story domain)

• Come on, I don’t know, do you? People say Cartmill is strange while I don’t rush to um.. judgment.

•  I don’t know. I think that you brought me cabbage, so I will tell something to you, alright?

• Yea, I’m not sure, would you be? Wolf wears a hard shell but he is really gentle.

•  I see. I am not sure. Obviously, I respect Wolf. However, he isn’t my close friend, is he?

Original and Generated Utterances

Annie’s Learned Z-Score Model for our ENLG engine

Verbosity = 0.78, Content polarity = 0.77, Polarization = 0.72, Repetition polarity = 0.79, Concessions = 0.83, Concessions polarity = 0.26, Positive content first = 1.00, First person in claim = 0.6, Claim polarity = 0.57, … etc.

Learning Linguistic Features

Generation

CAN PEOPLE PERCEIVE THE CHARACTER THAT WAS MODELLED?

Perceptual Experiment

  3 scenes, 6 film characters (3 male, 3 female)
    Alvy, Annie (Annie Hall)
    Indy, Marion (Indiana Jones – Raiders)
    Vincent, Mia (Pulp Fiction)
  Mix character models with personality models
  Collect perceptions of character as well as Big Five
    Ten Item Personality Inventory (TIPI)
    Rating 1 to 7 for each Big Five trait
    Analyze 3 traits only: extraversion, emotional stability, agreeableness
  29 users (13 female, 16 male), ages 22 to 44
  Web-based experiment

Perceptual Experiment Hypotheses

  H1: Rule-based Big Five Personality models will be perceived as expressing intended traits in our story domain
    Previously tested ONLY in restaurant domain with PERSONAGE generator

  H2: Utterances generated using character models will be perceived as being more similar to that character than utterances generated using another randomly selected character model

Generate Utterances from Character Models

Alvy (Annie Hall)
•  I don’t know. People say Cartmill is st-strange, alright? Err... on the other hand, I don’t rush to judgment.
•  Right, I am not sure, would you be? I will tell something you because you br-brought me cabbage.

Annie (Annie Hall)
•  Come on, I don’t know, do you? People say Cartmill is strange while I don’t rush to um.. judgment.
•  I don’t know. I think that you brought me cabbage, so I will tell something to you, alright?

Indy (Indiana Jones)
•  I don’t rush to judgment, but people say Cartmill is strange.
•  I will tell something you since you brought me cabbage.
•  Wolf is gentle but he wears a hard shell.

Vincent (Pulp Fiction)
•  Basically, I don’t rush to judgment. On the other hand, people say Cartmill is strange, he is strange.
•  Yeah, I can answer since you brought me cabbage that.

Perceptual Experiment: Part1: Original Utterances

Part 1: Personality of Original Character

Could be used as intermediate representation for models

Perceived differences perhaps not as distinct as one would like?

Trait                Alvy  Annie  Indy  Marion  Mia  Vincent
Extraversion         2.8   4.4    4.2   5.5     4.8  4.6
Emotional Stability  2.0   2.5    5.0   3.8     4.4  4.1
Agreeableness        4.0   4.5    3.3   3.9     4.0  4.1

Rule-based personality models vs. film-corpus character models: users do not know which ones are which

Perceptual Experiment Part 2: Generated Utterances

Perceptual Experiment Part 3: Can People Tell?

Example: read 3 scenes for Marion, then read 6 sets of generated utterances to determine similarity of style to original utterances

Result: H1 confirmed for Extraversion and Emotional Stability

  High/low perceived in both extraversion and emotional stability (significant difference, p<0.001)

  High/low not perceived in agreeableness
    Limited set of utterances tested do not show variability in agreeableness?
    Need additional parameters in PERSONAGE generator to show traits of agreeableness?

Trait                High  Low  P-value
Extraversion         5.2   3.3  <0.001
Emotional Stability  5.5   2.7  <0.001
Agreeableness        3.4   3.4  --

Hypoth 1: Personality Models Perceived?

Hypoth 2: Character Models Perceived?

Average similarity scores (1 to 7) between character and character models. Perfect Performance: A matrix with highest values along diagonal

*significant differences between character and character models of each row

Rows: character. Columns: character models.

Character  Alvy  Annie  Indy  Marion  Mia   Vincent
Alvy       5.2   4.2*   2.1*  2.6*    2.8*  2.3*
Annie      4.2   4.3    2.8*  3.4*    3.9   2.9*
Indy       1.4*  2.2*   4.5   2.8*    3.3*  3.8*
Marion     1.6*  2.8*   3.7   3.1     4.1*  4.2*
Mia        1.7*  2.4*   4.3   3.2     3.6   4.3
Vincent    2.1*  3.2*   4.5   3.5*    3.6*  4.6

Summary

  Character parameter models learned from film dialogue
  Models are applied to utterances in NEW domain
  Utterances produced using character models are generally perceived as similar to that character
  Corpus-based character models more specific to character voice than Big Five personality
  Current work:
    Better ways to learn initial models
    Ways to improve initial models with author feedback

Current experiments: Improve models

Refining Corpus Models with Active Learning

  Hybrid Models
    Film corpus models
    User feedback on the desired style (more like the personality learned models)

  Refine corpus models to author-desired utterance styles using active learning (iterative supervised learning)

  The learner chooses PERSONAGE parameters to generate utterances and query users for similarity rating to their ideal utterances

Active Learning Model

  1. System picks a parameter to generate utterances and asks the author for a similarity rating to the target model
  2. Author compares these utterances with target model utterances (in their own head) and rates them 1 to 7
  3. System uses the author rating to update the parameter values accordingly
  4. Repeat the entire process until the initial model and target model are “close enough”
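
The loop above can be simulated with a stand-in for the author. The sketch below swaps in simple hill climbing for the learner and a distance-based rater for the human, so it shows only the shape of the loop, not the project's actual learning method; every name and the update rule are assumptions.

```python
import random


def simulated_rating(params, target):
    """Hypothetical author: 7 means identical to the target, 1 maximally far.

    Assumes all parameters are scalars in 0..1.
    """
    distance = sum(abs(params[k] - target[k]) for k in target) / len(target)
    return 1 + 6 * (1 - distance)


def refine(initial, target, iterations=25, step=0.1, seed=0):
    """Steps 1-4: pick a parameter, query a rating, keep improvements."""
    rng = random.Random(seed)
    current = dict(initial)
    best = simulated_rating(current, target)
    for _ in range(iterations):
        name = rng.choice(sorted(current))          # step 1: pick a parameter
        candidate = dict(current)
        nudged = candidate[name] + rng.choice((-step, step))
        candidate[name] = min(1.0, max(0.0, nudged))
        rating = simulated_rating(candidate, target)  # step 2: author rates
        if rating > best:                             # step 3: update values
            current, best = candidate, rating
    return current, best                              # step 4: stop after N rounds


if __name__ == "__main__":
    initial = {"verbosity": 0.3, "content_polarity": 0.4}   # made-up values
    target = {"verbosity": 0.8, "content_polarity": 0.7}    # made-up values
    print(refine(initial, target))
```

With a real author in the loop, `simulated_rating` would be replaced by the collected 1 to 7 rating, and the fixed iteration count by a “close enough” stopping test.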

Initial and Target Models: Simulation first

Simulate an initial model progressing to a target model by querying the target model for utterance similarity ratings

  Initial Model
    1. Given a film character model (z-scores), map to PERSONAGE input parameter values
    2. Vary unspecified parameter values to generate different samples
    3. Collect feedback as similarity ratings (1 to 7) between these samples and the original film character dialogue
    4. Create ARFF files for learning these initial models

  Target Model (blended model to drive simulations)
    Mix film character model (step 1 of initial model) and Big Five personality model (previous work on restaurant recommendation)
    Example: Mia from Pulp Fiction and extraversion

Preliminary Results

Pilot author annotation (not Mechanical Turk) on learning the initial model

Columns: Initial Model (Mia from Pulp Fiction) / Active-Learning Model (after 25 iterations) / Target Model (Mia + extraversion)

Initial Model: I don't race to assessment. People, however, say Cartmill is strange.

Active-Learning Model: I am sorry but people say Cartmill is unusual. I don't, however, race to judgment.

Target Model: I am sorry but people say Cartmill is unusual. I don't race to judgment.

I will tell you something. Since you brought me cabbage, I will tell you something.

I will tell you something since you brought me cabbage.

Wolf is gentle. He, however, wears a hard shell.

Since Sparrow brings excitement to my life, I am allies with her.

I am allies with Sparrow since she brings excitement to my life.

Analysis

  After 25 iterations, some utterances are closer to the target:
    Active-Learning Model: I am sorry but people say Cartmill is unusual. I don't, however, race to judgment.
    Target Model: I am sorry but people say Cartmill is unusual. I don't race to judgment.

  However, some don’t change
  Currently working on refining the learning process to improve performance (requiring fewer iterations)


Questions?

Extra SLIDES


Pulp Fiction: Dialogue strategy of clarification Q, repetition of other

Personage in restaurant recommendation domain

Natural Language Generation Architecture

  What to say: Content Planner
  How to say it: Sentence Planner, Surface Realizer, Prosody Assigner
  What is heard: Speech Synthesizer

We want to map findings like those into this architecture

What does the Content Planner do?

USER: I’d like a French restaurant on the Upper West Side.
SYSTEM DM:
  speech-act: recommend
  relations: justify(nuc:1, sat:2); justify(nuc:1, sat:3); justify(nuc:1, sat:4); justify(nuc:1, sat:5); justify(nuc:1, sat:6)
  content:
    1. assert(best (LeMarais))
    2. assert(has-att (LeMarais, service (3)))
    3. assert(has-att (LeMarais, fqual (5)))
    4. assert(has-att (LeMarais, price (44)))
    5. assert(has-att (LeMarais, ftype (french-kosher)))
    6. assert(has-att (LeMarais, décor (4)))

CP selects the assertions & relations to achieve dialog goal

Content VARIES (positive, negative, user preferences, plan structures)
  Right, I mean, Le Marais is the only restaurant that is any good.
  Ok, I mean, Le Marais is a quite french, kosher and steak house place, you know and it has nice atmosphere. It seems to me that the service is nice. It isn’t as bad as the others, is it?
  I am sure you would like Le Marais, you know. The servers are nice and it’s a french, kosher and steak house place. Actually, the food is good, even if its price is 44 dollars.
  Basically, actually, I am sure you would like Le Marais. It features friendly service and it’s a french, kosher and steak house place. Even if its price is 44 dollars, it just has really good food, nice food.

What does the Sentence Planner do?

USER: I’d like a French restaurant on the Upper West Side.
SYSTEM DM:
  speech-act: recommend
  relations: justify(nuc:1, sat:2); justify(nuc:1, sat:3); justify(nuc:1, sat:4); justify(nuc:1, sat:5)
  content:
    1. assert(best (LeMarais))
    2. assert(has-att (LeMarais, service (3)))
    3. assert(has-att (LeMarais, fqual (5)))
    4. assert(has-att (LeMarais, price (44)))
    5. assert(has-att (LeMarais, ftype (french-kosher)))

SP maps content plan to many forms & stylistic variants

Possible realizations (among 100s)
  I am sure you would like Le Marais, you know. The servers are nice and it’s a french, kosher and steak house place. Actually, the food is good, even if its price is 44 dollars.
  It seems to me that Le Marais isn’t as bad as the others. It’s a french, kosher and steak house place. It has friendly servers, but it’s somewhat expensive, even if the food is pretty good.
  Basically, actually, I am sure you would like Le Marais. It features friendly service you know and it’s a french, kosher and steak house place. Even if its price is 44 dollars, it just has really good food, nice food.

Example of Pragmatic Transformation

  Negation insertion
    “X has awful food” => “X doesn’t have good food”

Input syntactic structure:
  have (class: verb)
    Subj: Wok Mania (class: proper noun, number: sg)
    Obj: food (class: noun, number: sg, article: none), modified by awful (class: adjective)

Transformation:
  - Look for an antonym of “awful” in the WordNet database: “good”
  - Negate verb
  - Replace adjective by antonym

Output syntactic structure:
  have (class: verb, negated: true)
    Subj: Wok Mania (class: proper noun, number: sg)
    Obj: food (class: noun, number: sg, article: none), modified by good (class: adjective)
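
The negation-insertion transformation can be sketched on a flattened version of the structure above. The `Clause` type, the tiny antonym table standing in for the WordNet lookup, and the one-verb realizer are all hypothetical simplifications.

```python
from dataclasses import dataclass, replace

# Hypothetical stand-in for the WordNet antonym lookup.
ANTONYMS = {"awful": "good"}

# Surface forms for the single verb handled here: (plain, negated).
VERB_FORMS = {"have": ("has", "doesn't have")}


@dataclass(frozen=True)
class Clause:
    subject: str
    verb: str
    adjective: str
    obj: str
    negated: bool = False


def insert_negation(clause):
    """Negate the verb and swap the adjective for its antonym."""
    antonym = ANTONYMS.get(clause.adjective)
    if antonym is None:
        return clause  # no antonym known: leave the clause unchanged
    return replace(clause, adjective=antonym, negated=not clause.negated)


def realize(clause):
    """Very small surface realizer for third-person singular subjects."""
    plain, negated = VERB_FORMS[clause.verb]
    verb = negated if clause.negated else plain
    return f"{clause.subject} {verb} {clause.adjective} {clause.obj}"


if __name__ == "__main__":
    original = Clause("Wok Mania", "have", "awful", "food")
    print(realize(original))
    print(realize(insert_negation(original)))
```

Because the transformation leaves the clause untouched when no antonym is found, it can be applied optionally, under the control of a generation parameter.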

Character In Film: One utterance tells it all

Utterances express *multiple* personality traits

So where are we?

  A flexible, real-time generator
  Socially relevant & personality parameters
  Methods for automatically training
    Personage trained for ‘Big Five personality’ but could train to optimize other feedback measure, e.g. dramatic character
  Personalize both content and form
  Standard meaning representations: DB Relations, Content Plan, AI planner

Corpus Annotation: 3 Human Judges (Ten-Item Personality Inventory, Gosling et al. 03)

Extraversion = 3.5 Neuroticism = 2.0

Agreeableness = 6.5 Conscient. = 4.0 Openness = 1.5
