sp11 cs288 lecture 25 -- summarization (2pp)
TRANSCRIPT
Statistical NLP, Spring 2011
Lecture 25: Summarization
Dan Klein – UC Berkeley
Document Summarization
Multi-document Summarization
… 27,000+ more
Extractive Summarization
Selection (mid-’90s – present)
• Maximum Marginal Relevance [Carbonell and Goldstein, 1998]
  – Maximize similarity to the query
  – Minimize redundancy
  – Greedy search over sentences
[Figure: sentences s1–s4 and a query Q]
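The greedy MMR loop can be sketched as follows (a minimal illustration, not the lecture's code; the names `mmr_select` and `jaccard` and the default `lam` are mine — `lam` is the λ trade-off between query relevance and redundancy):

```python
def mmr_select(sentences, query, sim, k, lam=0.7):
    """Greedy Maximum Marginal Relevance: repeatedly pick the sentence
    most similar to the query and least similar to what was already picked."""
    selected, pool = [], list(sentences)
    while pool and len(selected) < k:
        def mmr(s):
            redundancy = max((sim(s, t) for t in selected), default=0.0)
            return lam * sim(s, query) - (1 - lam) * redundancy
        best = max(pool, key=mmr)
        selected.append(best)
        pool.remove(best)
    return selected

def jaccard(a, b):
    """Word-overlap similarity; any similarity in [0, 1] would do here."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0
```

With lam = 1 this reduces to pure query relevance; lowering lam increasingly penalizes overlap with already-selected sentences.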
Selection
• Maximum Marginal Relevance
• Graph algorithms [Mihalcea 05++]
  – Nodes are sentences
  – Edges are similarities
  – Stationary distribution represents node centrality
[Figure: similarity graph over sentences s1–s4]
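The stationary distribution can be computed by a PageRank-style power iteration over the row-normalized similarity matrix (a sketch under that assumption; the function name and damping constant are illustrative):

```python
def centrality(sim, damping=0.85, iters=200):
    """Stationary distribution of a random walk on the sentence graph:
    nodes are sentences, edge weights are pairwise similarities."""
    n = len(sim)
    # Row-normalize similarities into transition probabilities (no self-loops).
    P = []
    for i in range(n):
        row = [sim[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        P.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    # Power iteration with damping, starting from the uniform distribution.
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [(1 - damping) / n + damping * sum(pi[i] * P[i][j] for i in range(n))
              for j in range(n)]
    return pi
```

Sentences with high stationary probability are central to the document set and are selected first.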
Selection
• Maximum Marginal Relevance
• Graph algorithms
• Word distribution models
Input document distribution P_D ~ summary distribution P_A:

  w         P_D(w)    P_A(w)
  Obama     0.017     ?
  speech    0.024     ?
  health    0.009     ?
  Montana   0.002     ?
Selection
• Maximum Marginal Relevance
• Graph algorithms
• Word distribution models
SumBasic [Nenkova and Vanderwende, 2005]:
  1. Value(w_i) = P_D(w_i)
  2. Value(s_i) = sum of its word values
  3. Choose s_i with largest value
  4. Adjust P_D(w)
  5. Repeat until the length constraint is met
Selection
• Maximum Marginal Relevance
• Graph algorithms
• Word distribution models
• Regression models
  – A learned function F(x) scores each sentence from its features
  – Frequency is just one of many features:

  sentence   word values   position   length
  s1         12            1          24
  s2         4             2          14
  s3         6             3          18
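In the simplest case F(x) is linear in the features. A toy sketch using the feature rows from the slide's table (the weight vector here is made up purely for illustration; in the real systems it is learned):

```python
def F(x, w):
    """Linear sentence scorer F(x) = w . x; in practice w is fit by
    regression against human importance annotations."""
    return sum(wi * xi for wi, xi in zip(w, x))

# [word values, position, length] rows from the slide's table
feats = {"s1": [12, 1, 24], "s2": [4, 2, 14], "s3": [6, 3, 18]}
w = [1.0, -0.5, 0.05]   # hypothetical learned weights
ranking = sorted(feats, key=lambda s: F(feats[s], w), reverse=True)
```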
Selection
• Maximum Marginal Relevance
• Graph algorithms
• Word distribution models
• Regression models
• Topic model-based [Haghighi and Vanderwende, 2009]
Selection
[Figures: results comparing H & V 09 and PYTHY]
Selection
• Maximum Marginal Relevance
• Graph algorithms
• Word distribution models
• Regression models
• Topic models
• Globally optimal search [McDonald, 2007]
  – Optimal search using MMR
  – Integer Linear Program
[Figure: sentences s1–s4 and a query Q]
Selection
[Gillick and Favre, 2008]
  s1: Universal health care is a divisive issue.
  s2: Obama addressed the House on Tuesday.
  s3: President Obama remained calm.
  s4: The health care bill is a major test for the Obama administration.

  concept   value
  obama     3
  health    2
  house     1

  Length limit: 18 words

  summary         length   value
  {s1, s3}        17       5       (greedy)
  {s2, s3, s4}    17       6       (optimal)
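For an example this small, the maximum-coverage objective can be verified by exhaustive search (a sketch only: the sentence lengths below are made up, and the real system replaces this enumeration with an ILP):

```python
from itertools import combinations

def best_summary(lengths, concepts, values, limit):
    """Exact maximum concept coverage: enumerate subsets of sentences,
    keep the feasible one whose covered concepts have the most value."""
    best, best_val = (), 0
    names = list(lengths)
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            if sum(lengths[s] for s in combo) > limit:
                continue   # violates the summary length limit
            covered = set().union(*(concepts[s] for s in combo)) if combo else set()
            val = sum(values[c] for c in covered)
            if val > best_val:
                best, best_val = combo, val
    return best, best_val
```

Note the objective counts each concept once however many selected sentences contain it, which is exactly why greedy sentence-by-sentence selection can be suboptimal.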
Selection
Maximize Concept Coverage [Gillick and Favre, 2009]
Optimization problem: Set Coverage

  maximize over extractive summaries s of document set D:  Σ_{c ∈ C(s)} v_c

  where v_c is the value of concept c and C(s) is the set of concepts present in summary s
Results

                  Baseline   2009
  Bigram Recall   4.00       6.85
  Pyramid         23.5       35.0
Selection
Integer Linear Program for the maximum coverage model
[Gillick, Riedhammer, Favre, Hakkani-Tur, 2008]
  – Objective: total concept value
  – Constraint: summary length limit
  – Constraints: maintain consistency between selected sentences and concepts
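One common way to write those pieces down (a sketch based on the Gillick & Favre formulation; x_j indicates whether sentence j is selected, y_i whether concept i is credited, Occ_ij = 1 if sentence j contains concept i, l_j is sentence j's length, L the length limit, w_i the concept values):

  maximize    Σ_i w_i · y_i                       (total concept value)
  subject to  Σ_j l_j · x_j ≤ L                   (summary length limit)
              x_j · Occ_ij ≤ y_i       ∀ i, j     (a selected sentence credits its concepts)
              Σ_j x_j · Occ_ij ≥ y_i   ∀ i        (a concept is credited only if some selected sentence contains it)
              x_j, y_i ∈ {0, 1}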
Selection
This ILP is tractable for reasonable problems [Gillick and Favre, 2009]
Results [Gillick & Favre, 2009]
• 52 submissions, 27 teams
• 44 topics, 10 input docs, 100-word summaries
[Figures: Gillick & Favre vs. other systems on four evaluations]
  – Rating scale 1–10; humans in [8.3, 9.3]
  – Rating scale 1–10; humans in [8.5, 9.3]
  – Rating scale 0–1; humans in [0.62, 0.77]
  – Rating scale 0–1; humans in [0.11, 0.15]
Error Breakdown? [Gillick and Favre, 2008]
Selection
How to include sentence position?
• First sentences are unique
• Some interesting work on sentence ordering [Barzilay et al., 1997; 2002]
• But choosing independent sentences is easier
  – First sentences usually stand alone well
  – Sentences without unresolved pronouns
  – Classifier trained on OntoNotes: <10% error rate
• Baseline ordering module (chronological) is not obviously worse than anything fancier
Problems with Extraction
It is therefore unsurprising that Lindsay pleaded not guilty yesterday afternoon to the charges filed against her, according to her publicist.
What would a human do?
Sentence Rewriting
[Berg-Kirkpatrick, Gillick, and Klein 11]
Sentence Rewriting
[Berg-Kirkpatrick, Gillick, and Klein 11]
New optimization problem: Safe Deletions
  – D(s): the set of branch-cut deletions made in creating summary s
  – v_d: the value of deletion d
How do we know how much a given deletion costs?
Learning
[Berg-Kirkpatrick, Gillick, and Klein 11]
• Features: [figure]
• Embed the ILP in a cutting-plane algorithm.
Results
7.75
6.85
4.00
Now2009
Baseline
Bigram Recall
35.0
23.5
Now2009
Baseline
Pyramid
41.3
Beyond Extraction / Compression?
• Sentence extraction is limiting ... and boring!
• But abstractive summaries are much harder to generate…
• in 25 words? http://www.rinkworks.com/bookaminute/