SpeakerLDA: Discovering Topics in Transcribed Multi-Speaker Audio Contents @ SLAM 2015
SpeakerLDA: Discovering Topics in Transcribed Multi-Speaker Audio Contents
Damiano Spina, Johanne R. Trippas, Lawrence Cavedon, Mark Sanderson
1
An Extreme Example: Discussing "Merengue" (Spanish)
What is the dialogue about?
Not considering speakers: {dance, egg, whip, Terpsichore, Latin, America, white, dessert}
VS.
Considering speakers: {dance, Terpsichore, Latin, America} {dessert, whip, white, egg}
2
Hypothesis
Considering information about speakers (i.e., which words/fragments correspond to each speaker) would improve topic discovery.
3
Example: Topic Discovery for Recommendation
More Like This:
{dance, Terpsichore, Latin, America} → more content about dance
{dessert, whip, white, egg} → more content about desserts
4
Topic Discovery in Multi-Speaker Audio Contents: Applications
Multi-Speaker Audio Contents:
Podcasts (news, shows, interviews, etc.)
Meetings
TV programs
Applications:
Content-based recommendation: more like this
Clustering: group search results according to topics (e.g., search result presentation)
5
Research Question
What is the impact, in terms of effectiveness, of adding speaker information to a topic model, compared to traditional approaches (i.e., LDA)?
6
Topic Discovery
[Image from Blei, D. Probabilistic Topic Models, Communications of the ACM, 2012]
Each topic: a distribution over words
Each document: a distribution over topics
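The generative model above can be inverted to recover per-document topic distributions. As a minimal sketch (not the implementation used in the paper), here is LDA inference via collapsed Gibbs sampling in plain Python; the function name `lda_gibbs` and the toy corpus are illustrative:

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Return per-document topic distributions for tokenized `docs`."""
    rng = random.Random(seed)
    vocab = {w for d in docs for w in d}
    V = len(vocab)
    # z[d][i]: topic assigned to the i-th token of document d (random init)
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    ndk = [[0] * n_topics for _ in docs]               # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                                # topic totals
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment, then resample
                k = z[d][i]
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + beta * V)
                           for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    # Smooth and normalize doc-topic counts into distributions
    return [[(c + alpha) / (len(doc) + alpha * n_topics) for c in ndk[d]]
            for d, doc in enumerate(docs)]

docs = [["dance", "latin", "america", "dance"],
        ["dessert", "egg", "whip", "white"],
        ["dance", "latin", "dessert", "egg"]]
theta = lda_gibbs(docs, n_topics=2)  # one topic distribution per document
```

In practice an optimized library implementation would be used; the sketch only makes the "distribution over topics per document" output concrete.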
7
Topic Discovery vs. Topic Segmentation
Topic Discovery: characterizes documents according to topics; 1 document ~ distribution of topics
Topic Segmentation: characterizes how a conversation evolves over time in terms of topics; 1 document ~ sequence of topics (e.g., t1 t3 t2 t3 t2 t1 over time)
8
Topic Discovery vs. Topic Segmentation
                              | Topic Discovery          | Topic Segmentation
Not using speaker information | LDA [Blei et al., 2003]  | TextTiling [Hearst, 1997]; [Purver et al., 2006]
Using speaker information     | ?                        | SITS [Nguyen et al., 2012]
9
Topic Discovery vs. Topic Segmentation
                              | Topic Discovery          | Topic Segmentation
Not using speaker information | LDA [Blei et al., 2003]  | TextTiling [Hearst, 1997]; [Purver et al., 2006]
Using speaker information     | SpeakerLDA (RQ)          | SITS [Nguyen et al., 2012] (RQ')
10
Proposed Approach: SpeakerLDA
1. Split documents (D) according to speakers (S)
2. Run LDA
3. Combine the topic distributions obtained for each speaker's pseudo-document
11
Proposed Approach: SpeakerLDA
12
Evaluation Framework
Topic models are typically evaluated by:
computing intrinsic metrics (e.g., perplexity) of the model on an unseen set of documents, or
applying the model to external information access tasks (e.g., topic detection as a clustering task), which needs a manually annotated ground truth
One possible measure: Precision/Recall of clustering relationships
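Pairwise precision/recall of clustering relationships can be made concrete as follows; this is a generic sketch of the pair-counting measure (a "relationship" being a pair of documents assigned to the same cluster), and the document IDs and labels are made up:

```python
from itertools import combinations

def same_cluster_pairs(assignment):
    """All unordered document pairs that share a cluster label."""
    return {frozenset(pair) for pair in combinations(assignment, 2)
            if assignment[pair[0]] == assignment[pair[1]]}

def pairwise_pr(system, gold):
    """Precision/recall of the system's same-cluster pairs vs. the gold ones."""
    sys_pairs = same_cluster_pairs(system)
    gold_pairs = same_cluster_pairs(gold)
    tp = len(sys_pairs & gold_pairs)
    precision = tp / len(sys_pairs) if sys_pairs else 0.0
    recall = tp / len(gold_pairs) if gold_pairs else 0.0
    return precision, recall

gold = {"d1": "dance", "d2": "dance", "d3": "dessert", "d4": "dessert"}
system = {"d1": 0, "d2": 0, "d3": 0, "d4": 1}
# system pairs: {d1,d2}, {d1,d3}, {d2,d3}; gold pairs: {d1,d2}, {d3,d4}
p, r = pairwise_pr(system, gold)  # precision = 1/3, recall = 1/2
```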
13
Evaluation Framework II
Is there any test collection suitable for measuring differences between our approach and existing topic models?
It must satisfy the following conditions:
Each topic is discussed in two or more documents
It includes spoken documents with two or more speakers
The AMI Corpus satisfies both conditions!
14
The AMI Corpus
Augmented Multi-Party Interaction (AMI) Corpus
100 hours of recorded audio
More than 100 meetings with multiple speakers (generally 4)
Real and elicited scenario-driven meetings
Speakers play different roles: interface designer, project manager, industrial designer, marketing expert
Manual transcriptions, including speaker segmentation
Transcripts segmented according to topics and subtopics
15
16
Generating a Gold Standard for Topic Discovery
17
Work in Progress
Compare the effectiveness of SpeakerLDA vs. LDA (and vs. topic segmentation approaches)
Extrinsic evaluation: compare system outputs to a clustering gold standard
18
AMI Corpus Topic Segmentation annotations as clustering gold standard
Varying initial number of topics
Considering the n most frequent topics in the topic-document distribution for topic assignment
19
Work in Progress
Compare the effectiveness of SpeakerLDA vs. LDA (and vs. topic segmentation approaches)
Extrinsic evaluation: compare system outputs to a clustering gold standard
Challenge: How to define a valid clustering gold standard from topic segmentation annotations?
Opportunity: Compare the system output to a topic distribution gold standard, generating distributions from the annotated segments
20
Gold topic distribution for the meeting IS1008c:
{closing=0.09, opening=0.03, components...=0.21, discussion=0.06, industrial...=0.21, interface=0.21, marketing...=0.20}
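One way to derive such a gold distribution from segmentation annotations is to let each annotated segment contribute its length to its topic's probability mass. The sketch below uses word counts as segment lengths (durations would work equally well); the segment data are illustrative, not the real IS1008c annotations:

```python
def gold_distribution(segments):
    """segments: list of (topic_label, n_words) pairs from the annotated
    segmentation -> dict mapping topic label to probability mass."""
    total = sum(n for _, n in segments)
    dist = {}
    for topic, n in segments:
        dist[topic] = dist.get(topic, 0.0) + n / total
    return dist

# Made-up segment lengths; repeated labels (e.g., "interface") accumulate.
segments = [("opening", 30), ("interface", 210), ("discussion", 60),
            ("marketing", 200), ("closing", 90), ("interface", 410)]
dist = gold_distribution(segments)
# e.g., dist["interface"] = (210 + 410) / 1000 = 0.62
```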
21
Conclusions
We propose SpeakerLDA, a topic model that takes speaker information into account to discover what a set of audio documents (such as podcasts) is about.
It can be used for clustering search results or for content-based recommendation (more like this).
We are currently investigating how to generate a clustering gold standard from the topic segmentation annotations in the AMI Corpus.
Open question: evaluate topic models by comparing against a topic distribution gold standard?
22
Thank you!
- For dessert we have...'Merengue'!
23
@damiano10
24