Dialogue Acts
Bin Zhang
February 1, 2011
Outline
A. Stolcke et al. 2000, Dialogue act modeling for automatic tagging and recognition of conversational speech
D. Davidov et al. 2010, Semi-supervised recognition of sarcastic sentences in Twitter and Amazon
Dialogue Act
I It is a specialized speech act
I Typical dialogue acts
  I Statement
  I Question
  I Backchannel
  I Agreement
  I Disagreement
  I Apology
Examples of Dialogue Acts
Dialogue Act Labeling
I Unlabeled dialogue data from the Switchboard corpus
  I Contains conversational telephone speech
  I Speech is segmented into utterances
  I Each utterance has a dialogue act label
I Tag set
  I Dialogue Act Markup in Several Layers (DAMSL)
  I SWBD-DAMSL tag set: originally 50 tags, reduced to 42 tags for better annotator consistency
I 1,155 conversations (205,000 utterances) annotated by 8 linguistics graduate students in 3 months, κ = 0.80 (excellent agreement)
Common Dialogue Act Types
Statements: descriptive, narrative, or personal statements
Opinions: often include hedges such as I think, I believe, it seems, and I mean
Questions: yes-no questions, declarative questions, WH questions
Backchannels: short utterances that play discourse-structuring roles, e.g., indicating that the speaker should go on talking
Abandoned utterances: those that the speaker breaks off without finishing, and that are followed by a restart
Turn exits: similar to abandoned utterances, but with speaker change
Answers: yes answers and no answers
Agreements: mark the degree to which a speaker accepts some previous proposal, plan, opinion, or statement
Sequence Modeling of Dialogue Acts
I A conversation can be considered as a sequence of utterances
I Dialogue acts of the utterances in a conversation are inter-dependent
I Denote U as the dialogue act sequence of the conversation, and E as the evidence about the conversation (audio and/or text)

U* = argmax_U P(U | E)

I Using Bayes' rule

U* = argmax_U P(E | U) P(U)

P(E | U): dialogue act likelihood
P(U): discourse grammar
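The Bayes decomposition above can be sketched as a brute-force search over candidate dialogue act sequences. This is a toy illustration with made-up acts and probability tables, not the paper's actual model:

```python
import itertools
import math

# Toy dialogue act inventory and hypothetical scores (assumptions,
# not values from Stolcke et al. 2000).
ACTS = ["statement", "question", "backchannel"]

def log_likelihood(evidence, act):
    # Stand-in per-utterance likelihood P(E_i | U_i)
    table = {
        ("uh-huh", "backchannel"): 0.8, ("uh-huh", "statement"): 0.1,
        ("uh-huh", "question"): 0.1,
        ("is it", "question"): 0.7, ("is it", "statement"): 0.2,
        ("is it", "backchannel"): 0.1,
    }
    return math.log(table.get((evidence, act), 0.05))

def log_prior(seq):
    # Stand-in discourse grammar P(U): uniform over act sequences
    return len(seq) * math.log(1.0 / len(ACTS))

def decode(evidence_seq):
    # U* = argmax_U P(E|U) P(U), searched exhaustively
    best = max(
        itertools.product(ACTS, repeat=len(evidence_seq)),
        key=lambda seq: sum(log_likelihood(e, u)
                            for e, u in zip(evidence_seq, seq)) + log_prior(seq),
    )
    return list(best)
```

Exhaustive search is exponential in the conversation length; the HMM assumptions on the next slide are what make the search tractable.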
Dialogue Act HMM
I Markov assumption

P(U_i | U_1, U_2, ..., U_{i-1}) = P(U_i | U_{i-k}, ..., U_{i-1})

I Independence assumption

P(E | U) = ∏_i P(E_i | U_i)
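Under these two assumptions (with k = 1) the decoder becomes a standard HMM Viterbi search. A minimal sketch, where the transition, emission, and initial tables are caller-supplied stand-ins:

```python
import math

def viterbi(evidence, states, log_trans, log_emit, log_init):
    """Find argmax over state sequences of prod_i P(U_i|U_{i-1}) P(E_i|U_i)."""
    # delta[s] = best log score of any path ending in state s
    delta = {s: log_init[s] + log_emit(evidence[0], s) for s in states}
    back = []
    for e in evidence[1:]:
        prev = delta
        delta, ptr = {}, {}
        for s in states:
            best_p = max(prev, key=lambda p: prev[p] + log_trans[(p, s)])
            delta[s] = prev[best_p] + log_trans[(best_p, s)] + log_emit(e, s)
            ptr[s] = best_p
        back.append(ptr)
    # Backtrace from the best final state
    last = max(delta, key=delta.get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

The dynamic program is linear in the conversation length, versus the exponential cost of enumerating all act sequences.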
Discourse Grammar
I As is the nature of dialogue, the dialogue acts of the utterances in a conversation are highly related
I Back-off dialogue act n-gram models serve as a simple and efficient discourse model
I If the speaker is known, speaker-dependent dialogue act n-gram models perform better
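A dialogue act bigram with back-off can be sketched as follows. This uses interpolated absolute discounting, a simplification rather than the paper's exact back-off scheme:

```python
from collections import Counter

def train_backoff_bigram(tag_sequences, discount=0.5):
    """Bigram P(u_i | u_{i-1}) over dialogue act tags, with absolute
    discounting and interpolation with the unigram distribution."""
    unigrams, bigrams = Counter(), Counter()
    for seq in tag_sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq, seq[1:]))
    total = sum(unigrams.values())

    def _alpha(prev, context):
        # Mass freed by discounting, redistributed via the unigram model
        seen = sum(1 for (p, _), c in bigrams.items() if p == prev and c > 0)
        return discount * seen / context

    def prob(prev, cur):
        context = sum(c for (p, _), c in bigrams.items() if p == prev)
        p_uni = unigrams[cur] / total
        if context == 0:
            return p_uni  # unseen context: back off entirely to unigrams
        count = bigrams[(prev, cur)]
        discounted = max(count - discount, 0) / context
        return discounted + _alpha(prev, context) * p_uni

    return prob
```

With the discounted mass redistributed through the unigram model, the conditional probabilities for each seen context still sum to one.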
Combining Evidence
I Words (W)
I ASR acoustics (A)
I Prosodic features (F): pitch, duration, energy, etc., of the speech signal

P(W, A, F | U)
Experimental Results
Sarcasm
I Sarcasm – the activity of saying or writing the opposite of what you mean, or of speaking in a way intended to make someone else feel stupid or show them that you are angry
Example from Twitter
Example from Amazon
Sarcastic Sentences in Two Genres
I In Twitter messages, sarcastic sentences appear in a wide range of contexts. Some sarcastic sentences are marked #sarcasm by the user
I In Amazon product reviews, sarcastic sentences usually appear in negative reviews
Examples of Sarcasm
I thank you Janet Jackson for yet another year of Super Bowl classic rock! (Twitter)
I He's with his other woman: XBox 360. It's 4:30 fool. Sure I can sleep through the gunfire (Twitter)
I Wow GPRS data speeds are blazing fast (Twitter)
I [I] Love The Cover (book, amazon)
I Defective by design (music player, amazon)
The Semi-supervised Approach
I Data processing
I Pattern extraction
I Pattern selection
I Data enrichment
I Classification
Data Processing
I For Twitter: actual user names, URLs, and hashtags are replaced by the tokens [USER], [URL], and [HASHTAG], respectively
I For Amazon: product, author, company, and book names are replaced by [PRODUCT], [AUTHOR], [COMPANY], and [TITLE], respectively
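The Twitter normalization step can be sketched with regular expressions; the token names follow the slide, while the exact patterns are assumptions:

```python
import re

def normalize_tweet(text):
    """Replace user mentions, URLs, and hashtags with generic tokens."""
    text = re.sub(r"https?://\S+", "[URL]", text)   # URLs first, before @ and #
    text = re.sub(r"@\w+", "[USER]", text)           # user mentions
    text = re.sub(r"#\w+", "[HASHTAG]", text)        # hashtags
    return text
```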
Pattern Extraction
I Construct pattern templates
  I Words are classified into high-frequency words and content words (less frequent)
  I Punctuation marks are considered high-frequency words
  I Templates are created using both word classes
I Instantiation
  I Templates are instantiated from the data, with high-frequency words replaced by actual words
  I Hundreds of patterns collected
  I During matching, partial matching is allowed
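Template construction can be sketched as: count corpus frequencies, mark each word as a high-frequency word (HFW) or a content word (CW), then keep HFWs verbatim while abstracting CWs into slots, as in Davidov and Rappoport's pattern definition. The thresholds here are illustrative, not the paper's:

```python
from collections import Counter

def extract_patterns(sentences, hf_threshold=2, min_hfw=1, max_len=6):
    """Turn sentences into patterns of literal high-frequency words (HFW)
    and abstract content-word slots ("CW")."""
    counts = Counter(w for s in sentences for w in s.split())

    def to_pattern(sentence):
        return tuple("CW" if counts[w] < hf_threshold else w
                     for w in sentence.split())

    patterns = set()
    for s in sentences:
        p = to_pattern(s)
        # Keep short patterns that contain at least one literal HFW
        if len(p) <= max_len and sum(1 for t in p if t != "CW") >= min_hfw:
            patterns.add(p)
    return patterns
```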
Pattern Selection
I Remove patterns that originate from only one book/product (Amazon)
I Remove patterns that occur in both classes
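The two filters can be sketched as set operations over per-product and per-class pattern occurrences (a minimal illustration with hypothetical bookkeeping structures):

```python
def select_patterns(pattern_sources, pattern_classes):
    """Keep patterns that generalize across products and discriminate classes.

    pattern_sources: pattern -> set of products/books it came from
    pattern_classes: pattern -> set of class labels it occurs with
    """
    return {
        p for p in pattern_sources
        if len(pattern_sources[p]) > 1                   # not tied to one product
        and len(pattern_classes.get(p, set())) == 1      # occurs in only one class
    }
```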
Data Enrichment
I Used to get more labeled data
I Motivation: sarcastic sentences usually co-occur
I Use labeled sentences as queries to search for more sentences on the web
I Found sentences are assigned a label similar to that of the query
Classification
I Feature vectors are composed of:
  I Match score of each pattern in the sentence
  I Additional punctuation-based features
I A k-nearest neighbor classifier is used
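A minimal version of the classification step: given feature vectors of pattern match scores, label a sentence by majority vote among its k nearest training neighbors. Euclidean distance and unweighted voting are simplifying assumptions; the paper uses a weighted kNN variant:

```python
import math
from collections import Counter

def knn_classify(query_vec, train_vecs, train_labels, k=3):
    """Label a feature vector by majority vote among its k nearest neighbors."""
    nearest = sorted(
        range(len(train_vecs)),
        key=lambda i: math.dist(query_vec, train_vecs[i]),  # Euclidean distance
    )
    votes = Counter(train_labels[i] for i in nearest[:k])
    return votes.most_common(1)[0][0]
```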
Data Annotation
I Sentences are annotated using five labels (1, 2, 3, 4, 5), 1 being not sarcastic, 5 being clearly sarcastic. In classification, 1 and 2 are negative labels; 3, 4, and 5 are positive labels
I The Twitter #sarcasm hashtag is found to be biased and noisy
I Inter-annotator agreement (fair agreement)
  I Twitter: κ = 0.41
  I Amazon: κ = 0.34
I Amazon training data

             Positive   Negative
  Seed           80        505
  Enriched      471       5020

I Evaluation data: for each genre, 90 positive + 90 negative sentences
Experimental Results
I Baseline (star-sentiment, Amazon only): low-rating reviews with strong positive sentiment
Thanks!
Improving Speech Recognition using Dialogue Acts
I Word recognition (with evidence)

W*_i = argmax_{W_i} P(W_i | A_i, E)

I Dialogue acts can be considered as a factor that the acoustic and language models depend on
I When the dialogue acts are unknown, the models are mixtures of dialogue act-dependent models
I Mixture-of-posteriors

P(W_i | A_i, E) = ∑_{U_i} [P(W_i | U_i) P(A_i | W_i) / P(A_i | U_i)] P(U_i | E)

I Mixture-of-LMs

P(W_i | A_i, E) ≈ [P(A_i | W_i) / P(A_i)] ∑_{U_i} P(W_i | U_i) P(U_i | E)
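The language-model part of the mixture-of-LMs approximation can be sketched as: the act-conditioned LM scores are mixed under the dialogue act posteriors before being combined with the acoustic score (toy numbers, not the paper's models):

```python
def mixture_lm_score(word_lm, act_posterior):
    """Sum over U_i of P(W_i | U_i) P(U_i | E): a dialogue-act-mixture
    language model probability for one word hypothesis W_i.

    word_lm: act -> P(W_i | U_i); act_posterior: act -> P(U_i | E)
    """
    return sum(word_lm[u] * act_posterior[u] for u in act_posterior)
```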
Experimental Results