Abstractive Text Summarization
Isaac Koak
November 19, 2016
University of Minnesota, Morris


Page 1: Abstractive Text Summarization · Outline: 1. Introduction (Automatic Summarization and Motivation); 2. Background (N-gram; Document Vector Space and Cosine Similarity); 3. Application (WikiWrite); 4. Results

Abstractive Text Summarization

Isaac Koak
November 19, 2016

University of Minnesota, Morris

Page 2

Disappointing Wiki :(

[clicked red link] ⇓

Page 3

Outline

1. Introduction

Automatic Summarization and Motivation

2. Background

N-gram

Document Vector Space and Cosine Similarity

3. Application

WikiWrite

4. Results

Summary Evaluation Metrics

WikiWrite Results

Page 4

Introduction

Automatic Summarization and Motivation

Page 5

Automatic Summarization

Goal: use a computer program to produce a concise summary that retains the most important points of the original document

Two categories:

• Extractive - concatenates words, phrases, or sentences from the original document to form the summary

• Abstractive - attempts a deeper analysis of the text and summarizes it in its own words

Page 6

Human-written Revision Operations: Hongyan Jing, 2002

Operation                        Extractive   Abstractive
Sentence Reduction               ✓            ✓
Sentence Combination             ✓            ✓
Syntactic Transformation         ✓            ✓
Lexical Paraphrasing                          ✓
Generalization or Specification               ✓
Reordering                       ✓            ✓

Page 7

Motivation: Why Abstractive Text Summarization?

Information overload - the difficulty a person can have understanding an issue and making decisions, caused by the presence of too much information. - Wikipedia

• Extractive methods are stagnating
• Greater range of applications
• The future looks exciting

Page 8

Background

N-gram

Page 9

N-gram

n-gram - a contiguous sequence of n words

• Example - “the cat is black”
• unigrams - the, cat, is, black
• bigrams - the cat, cat is, is black
• trigrams - the cat is, cat is black
• 1-skip-bigram - the is, cat black
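The bullets above can be reproduced in a few lines of Python (a minimal sketch, assuming whitespace tokenization; the function name is mine, not from the presentation):

```python
def ngrams(text, n, skip=0):
    """Return the n-grams of a sentence; skip > 0 yields skip-grams,
    e.g. n=2, skip=1 gives the 1-skip-bigrams."""
    words = text.split()
    step = skip + 1
    # Each n-gram takes every `step`-th word starting at position i.
    return [" ".join(words[i:i + n * step:step])
            for i in range(len(words) - (n - 1) * step)]

ngrams("the cat is black", 1)          # unigrams
ngrams("the cat is black", 2)          # bigrams
ngrams("the cat is black", 2, skip=1)  # 1-skip-bigrams
```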

Page 10

Term Frequency (TF)

• TF - how many times a term appears in a document.

Term    Count
the     3
cat     1
is      3
black   1

Table 1: Document 1

TF(t, d) = (number of times term t appears in document d) / (total number of terms in document d)

TF(“the”) = 3/8 = 0.38

TF(“cat”) = 1/8 = 0.13
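The TF values above can be checked directly (a small sketch; `tf` takes a pre-tokenized document, and the variable names are mine):

```python
from collections import Counter

def tf(term, document):
    """Term frequency: occurrences of `term` divided by the
    total number of terms in the document (a token list)."""
    return Counter(document)[term] / len(document)

# Document 1 from Table 1: the x3, cat x1, is x3, black x1 (8 terms).
doc1 = ["the", "the", "the", "cat", "is", "is", "is", "black"]
tf("the", doc1)  # 3/8 = 0.375, i.e. 0.38 rounded
tf("cat", doc1)  # 1/8 = 0.125, i.e. 0.13 rounded
```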

Page 11

Term Frequency (TF): Stop Words

stop words - common words that carry little meaning

• English: a, an, the, is, on, that
• Biology: cell
• Computer Science: algorithm

Page 12

Term Frequency-Inverse Document Frequency (TF-IDF)

TF(t, d) = (number of times term t appears in document d) / (total number of terms in document d)

IDF(t, D) = log( (total number of documents in collection D) / (number of documents containing term t) )

• IDF - measures how informative a term is across the document collection.
• TF-IDF - how important a term is to a document within a collection of documents.

TFIDF = TF ∗ IDF

Page 13

TF-IDF: Example

Term    Count
the     3
cat     1
is      3
black   1

Table 2: Document 1

Term    Count
the     3
dog     1
is      3
black   1

Table 3: Document 2

TF(“the”, d1) = 3/8 = 0.38

IDF(“the”, D) = log(2/2) = 0

TF(“dog”, d2) = 1/8 = 0.13

IDF(“dog”, D) = log(2/1) = 0.30

TFIDF(“dog”, d2, D) = 0.13 ∗ 0.30 = 0.04

TFIDF(“the”, d2, D) = 0.38 ∗ 0 = 0.00
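The worked example can be verified in code. This is a sketch using base-10 logarithms, which matches the slide's log(2) ≈ 0.30; the function and variable names are mine:

```python
import math
from collections import Counter

def tf(term, document):
    # Term frequency within one tokenized document.
    return Counter(document)[term] / len(document)

def idf(term, documents):
    # log10(total documents / documents containing the term).
    df = sum(1 for doc in documents if term in doc)
    return math.log10(len(documents) / df)

def tfidf(term, document, documents):
    return tf(term, document) * idf(term, documents)

# Documents 1 and 2 from Tables 2 and 3.
d1 = ["the", "the", "the", "cat", "is", "is", "is", "black"]
d2 = ["the", "the", "the", "dog", "is", "is", "is", "black"]
D = [d1, d2]

tfidf("dog", d2, D)  # 0.125 * 0.301... ≈ 0.04
tfidf("the", d2, D)  # 0.0 ("the" appears in every document)
```

Note that "dog" gets a nonzero score because it distinguishes document 2, while "the" scores zero because it appears in every document.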

Page 14

Background

Document Vector Space and Cosine Similarity

Page 15

Cosine Similarity: Document Vector Space Model

1. a hen lives on a farm.
2. a cow lives on a farm.

Term    Doc 1   Doc 2
a       2       2
lives   1       1
farm    1       1
hen     1       0
on      1       1
cow     0       1

Table 4: Term Count

• hen (Doc 1): [2, 1, 1, 1, 1, 0]
• cow (Doc 2): [2, 1, 1, 0, 1, 1]

Page 16

Cosine Similarity

cosine similarity - measure of similarity between two vectors

cos(θ) = (A · B) / (∥A∥ ∥B∥)

• Used to find similarity between documents
• cosineSimilarity(hen, cow) = 0.87

• cos(0◦) = 1 : similar
• cos(90◦) = 0 : not similar
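The 0.87 above follows from the formula applied to the two count vectors (a sketch with vectors as plain Python lists; names are mine):

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (A . B) / (||A|| * ||B||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

hen = [2, 1, 1, 1, 1, 0]  # Doc 1
cow = [2, 1, 1, 0, 1, 1]  # Doc 2
cosine_similarity(hen, cow)  # 7/8 = 0.875, the slide's 0.87
```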

Page 17

Application

WikiWrite

Page 18

WikiWrite: Banerjee and Mitra, 2016

1. Current methods assume that the Wikipedia categories are known
2. Copyright violations
3. Coherence issues

Page 19

WikiWrite: Framework Overview

Page 20

WikiWrite: Paragraph Vector Distributed Memory (PVDM)

• Vector space model that preserves semantics
• Purpose for WikiWrite:

1. Identification of similar articles on Wikipedia
2. Inference of vector representations of new paragraphs retrieved from the web

Page 21

WikiWrite: Red-linked Entity

Page 22

WikiWrite: Similar Articles

• Use PV-DM to find Wikipedia articles similar to the red-linked entity

• Example - Sonia Bianchetti
• Referee, International Skating Union (ISU), Judge, etc.

Page 23

WikiWrite: Information Retrieval

• Query reformulation - rewrite the query to be more specific.
• Example: Machine Learning → “Machine Learning” algorithm intelligence.

• Grab the top 20 Google search results.

Page 24

WikiWrite: Classifiers

• Classifiers - place each query result in the right section of the new article.

Page 25

WikiWrite: Summarization - Generate New Sentences

• A word-graph approach
• nodes: bigrams
• edges: adjacency relationships between bigrams

• Sentence pairs with cosine similarity ≥ 0.8 are thrown away as redundant
• Sentences with the same parent sentence are weighted heavily

Page 26

WikiWrite: Sentence Importance and Linguistic Quality

• Sentence importance - cosine similarity between the new sentence and the reformulated query

• Linguistic quality - a trigram language model finds the best sequence of words
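The presentation does not spell out the language model, but a standard trigram model with add-alpha smoothing illustrates the idea of scoring word-sequence fluency (a hedged sketch; the toy corpus, names, and smoothing constant are all illustrative, not from the paper):

```python
import math
from collections import Counter

def train_trigram_lm(sentences):
    """Count trigrams and their bigram contexts over a small corpus."""
    tri, bi = Counter(), Counter()
    for s in sentences:
        toks = s.split()
        for i in range(len(toks) - 2):
            tri[tuple(toks[i:i + 3])] += 1
            bi[tuple(toks[i:i + 2])] += 1
    return tri, bi

def log_prob(sentence, tri, bi, vocab_size, alpha=1.0):
    """Add-alpha smoothed trigram log-probability; higher = more fluent."""
    toks = sentence.split()
    lp = 0.0
    for i in range(len(toks) - 2):
        t = tuple(toks[i:i + 3])
        lp += math.log((tri[t] + alpha) / (bi[t[:2]] + alpha * vocab_size))
    return lp

tri, bi = train_trigram_lm(["the cat sat on the mat",
                            "the cat lay on the mat"])
fluent = log_prob("the cat sat on the mat", tri, bi, vocab_size=6)
scrambled = log_prob("cat the on sat mat the", tri, bi, vocab_size=6)
# The fluent ordering scores higher than the scrambled one.
```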

Page 27

WikiWrite: Summarization - Coherence

• Adjacent-sentence coherence ensures global paragraph coherence

• Coherence score: transition frequency ⇒ transition probability

• Multiply the transition probabilities of individual features (nouns and verbs)

• Use cosine similarity to reduce redundancy

Page 28

WikiWrite: Summarization - Paraphrasing

• Paraphrase Database (PPDB)
• Make modifications to sentences and assign a readability score

Page 29

WikiWrite: Summarization - Paraphrasing Example

• “The NSSP initiative will lead to significant economic benefits for both countries”

1. significant economic => considerable economic
2. economic benefits => financial advantages

• Readability score for 1: “lead to considerable economic benefit for”

• Result: “The NSSP initiative will result in major financial advantages for the two countries”

Page 30

Results

Summary Evaluation Metrics

Page 31

Summary Evaluation Metrics

• Recall-Oriented Understudy for Gisting Evaluation (ROUGE) - measures the n-gram overlap between a generated summary and reference summaries.

ROUGE-N = (number of n-grams in the references that also appear in the system summary) / (total number of n-grams in the references)

• F-Measure - harmonic mean of precision and recall
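The formula above can be sketched for a single reference (the full metric sums matches over all references; function names and the clipping detail are mine, following the standard recall-oriented definition):

```python
from collections import Counter

def ngram_counts(tokens, n):
    # Multiset of n-grams in a token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(system, reference, n=2):
    """Recall-oriented ROUGE-N: matched reference n-grams (clipped by
    the system counts) over the total n-grams in the reference."""
    sys_c = ngram_counts(system.split(), n)
    ref_c = ngram_counts(reference.split(), n)
    matched = sum(min(count, sys_c[gram]) for gram, count in ref_c.items())
    return matched / sum(ref_c.values())

# Only "the gunman" matches out of three reference bigrams: 1/3.
rouge_n("police killed the gunman", "police kill the gunman")
```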

Page 32

ROUGE: Chin-Yew Lin, 2004

• ROUGE-N: N-gram based co-occurrence statistics

• ROUGE-L: LCS-based statistics

Page 34

ROUGE: Chin-Yew Lin, 2004

• ROUGE-N: N-gram-based co-occurrence statistics
• ROUGE-L: LCS-based (longest common subsequence) statistics
• ROUGE-S: skip-bigram-based co-occurrence statistics
• ROUGE-W: weighted version of ROUGE-L that favors consecutive LCSes

Page 35

ROUGE: Example ROUGE-N

• Summary: police killed the gunman
• Ref1: police kill the gunman
• Ref2: the gunman kill police

1. ROUGE-2: Ref1 = Ref2
• matched bigram: “the gunman”

Page 36

ROUGE: Example ROUGE-L

• Summary: police killed the gunman
• Ref1: police kill the gunman
• Ref2: the gunman kill police

1. ROUGE-2: Ref1 = Ref2
• matched bigram: “the gunman”

2. ROUGE-L: Ref1 > Ref2
• Ref1: “police the gunman”
• Ref2: “the gunman”
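ROUGE-L rests on the longest common subsequence, and a standard dynamic-programming sketch reproduces the Ref1 > Ref2 ranking above (the function name is mine):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            # Extend the LCS on a match, else carry the best so far.
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

summary = "police killed the gunman".split()
lcs_len(summary, "police kill the gunman".split())  # 3: "police the gunman"
lcs_len(summary, "the gunman kill police".split())  # 2: "the gunman"
```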

Page 37

Results

WikiWrite Results

Page 38

WikiWrite: Experiments

• 1000 randomly selected popular articles
• Baseline systems:

1. WikiKreator: assumes Wikipedia categories are known
2. Perceptron-ILP: extractive

• Experiments:
1. Section classification
2. Content selection
3. Generate new articles

Page 39

WikiWrite: Section Classification

• Predict the section title given the section content

Table 5: Section Classification Results

Technique     F1 Score   Average Time
WikiWrite     0.622      ~2 mins
WikiKreator   0.481      ~10 mins

Page 40

WikiWrite: Content Selection

• Reconstruct Wikipedia articles using knowledge from the web
• WikiWrite (Ref) - doesn’t use the reformulated query

Table 6: Content Selection Results

Technique         ROUGE-1   ROUGE-2
WikiWrite         0.441     0.223
WikiWrite (Ref)   0.520     0.257
WikiKreator       0.371     0.183
Perceptron-ILP    0.342     0.169

Page 41

WikiWrite: Generated Articles

• Generate new articles.

Table 7: 50 Generated Wikipedia articles

Statistic                           Value
Number of articles in mainspace     47
Entire edit retained                12
Modification of content             35
Average number of edits             11
Percentage of references retained   72%

Page 42

Conclusion

• Abstractive summarization can be effective in generating Wikipedia articles

• Look into research ethics before committing: link
• Abstractive summarization is attracting more researchers.
• Deep learning using neural networks is the future!

Page 43

Thanks!

Get the source of this theme and the demo presentation from

github.com/matze/mtheme

The theme itself is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Questions?

Page 44

References

1. Siddhartha Banerjee and Prasenjit Mitra. WikiWrite: Generating Wikipedia articles automatically. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9–15 July 2016, pages 2740–2746, 2016.

2. Siddhartha Banerjee and Prasenjit Mitra. WikiKreator: Improving Wikipedia stubs automatically. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26–31, 2015, Beijing, China, Volume 1: Long Papers, pages 867–877, 2015.

3. Siddhartha Banerjee and Prasenjit Mitra. Filling the gaps: Improving Wikipedia stubs. In Proceedings of the 2015 ACM Symposium on Document Engineering, DocEng ’15, pages 117–120, New York, NY, USA, 2015. ACM.

4. Hongyan Jing. Using hidden Markov modeling to decompose human-written summaries. Computational Linguistics, 28(4):527–543, December 2002.

5. Chin-Yew Lin. ROUGE: A package for automatic evaluation of summaries. In Marie-Francine Moens and Stan Szpakowicz, editors, Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, pages 74–81, Barcelona, Spain, July 2004. Association for Computational Linguistics.