Neural Summarization by Extracting Sentences and Words

Neural Summarization by Extracting Sentences and Words, Jianpeng Cheng and Mirella Lapata, ACL 2016. Presenter: Tomonori Kodaira

Upload: kodaira-tomonori

Post on 05-Apr-2017


TRANSCRIPT

Page 1: Neural Summarization by Extracting Sentences and Words

Neural Summarization by Extracting Sentences and Words

Jianpeng Cheng and Mirella Lapata

ACL 2016

Presenter: Tomonori Kodaira

Page 2: Neural Summarization by Extracting Sentences and Words

Intro

• Task: Single document summarization (extracting sentences or words)

• Model: a neural network-based hierarchical document reader or encoder and an attention-based content extractor.

• Data: DailyMail news


Page 3: Neural Summarization by Extracting Sentences and Words

Problem Formulation

• Sentence Extraction

Create a summary from document D by selecting a subset of j sentences, predicting a label $y_L \in \{0, 1\}$ for each sentence.

• Word Extraction

A language generation task with the output vocabulary restricted to the original document.

Page 4: Neural Summarization by Extracting Sentences and Words

Data

• They created two large-scale datasets.

Page 5: Neural Summarization by Extracting Sentences and Words

Data (sentence extraction)

• They retrieved hundreds of thousands of news articles and their corresponding highlights from DailyMail.

• They designed a rule-based system that determines whether a document sentence matches a highlight (Woodsend and Lapata, 2010).

Page 6: Neural Summarization by Extracting Sentences and Words

Data (word extraction dataset)

• In cases where all highlight words come from the original document, the pair is added to the dataset.

• For out-of-vocabulary (OOV) words, they check whether a neighbor, represented by pre-trained embeddings, is in the original document and can serve as a substitute.

• If they cannot find any substitutes, they discard the pair.

• The resulting word extraction dataset contains 170K articles.
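The filtering steps above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy `embeddings` dictionary, and the `min_sim` threshold are all assumptions, since the slides do not say how "neighbor" is defined or thresholded.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def align_highlight(doc_tokens, highlight_tokens, embeddings, min_sim=0.5):
    """Rewrite a highlight using document words only, or return None to discard the pair.

    embeddings: dict mapping word -> vector (stand-in for pre-trained embeddings).
    min_sim: hypothetical similarity threshold; the slides do not give one.
    """
    doc_vocab = set(doc_tokens)
    aligned = []
    for word in highlight_tokens:
        if word in doc_vocab:
            aligned.append(word)  # word already appears in the document
            continue
        if word not in embeddings:
            return None  # no embedding, so no substitute can be found
        # Look for the most similar document word in embedding space.
        candidates = [
            (cosine(embeddings[word], embeddings[w]), w)
            for w in doc_vocab
            if w in embeddings
        ]
        if not candidates:
            return None
        best_sim, best_word = max(candidates)
        if best_sim < min_sim:
            return None  # nearest neighbor is too far away: discard the pair
        aligned.append(best_word)
    return aligned
```

A pair is kept only when `align_highlight` returns a token list; a `None` result corresponds to the "discard the pair" case.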


Page 7: Neural Summarization by Extracting Sentences and Words

Neural Summarization Model

• Key components:

• neural network-based hierarchical document reader

• attention-based hierarchical content extractor.


Page 8: Neural Summarization by Extracting Sentences and Words

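Document Reader (Convolutional Sentence Encoder)

• sentence matrix $W \in \mathbb{R}^{n \times d}$ (n words, d-dimensional word embeddings)

• kernel $K \in \mathbb{R}^{c \times d}$ of width $c$

• sum the sentence vectors obtained with different kernel widths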
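To make the shapes concrete, here is a small pure-Python sketch of a convolutional sentence encoder. The slide only gives the shapes of W and K and the final summation; the tanh non-linearity, the max-over-time pooling, and the function names are assumptions that follow the usual convolution-and-pooling recipe. Padding is omitted, so the sentence must be at least as long as the widest kernel.

```python
from math import tanh

def conv_feature(W, K, b=0.0):
    """Slide a kernel K (c x d) over a sentence matrix W (n x d), then max-pool over time.

    Returns one scalar feature for this kernel.
    """
    n, c = len(W), len(K)
    activations = []
    for j in range(n - c + 1):
        # Element-wise product of the window W[j:j+c] with K, summed to a scalar.
        total = sum(W[j + r][k] * K[r][k] for r in range(c) for k in range(len(K[r])))
        activations.append(tanh(total + b))
    return max(activations)

def encode_sentence(W, kernels_by_width):
    """Build one sentence vector per kernel width, then sum them element-wise.

    kernels_by_width: dict {width c: list of kernels}; every width must hold the
    same number of kernels so the per-width vectors have equal length.
    """
    per_width = [
        [conv_feature(W, K) for K in kernels]
        for kernels in kernels_by_width.values()
    ]
    return [sum(column) for column in zip(*per_width)]

# Toy example: 3 words, 2-dimensional embeddings, kernel widths 1 and 2.
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = {
    1: [[[1.0, 0.0]], [[0.0, 1.0]]],
    2: [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]],
}
sentence_vector = encode_sentence(W, kernels)
```

With two kernels per width, `sentence_vector` has two entries; the number of kernels per width therefore sets the sentence embedding size.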

Page 9: Neural Summarization by Extracting Sentences and Words

Document Reader (Recurrent Document Encoder)

• Long Short-Term Memory (LSTM) activation units ameliorate the vanishing gradient problem when training on long sequences (Hochreiter and Schmidhuber, 1997).

Page 10: Neural Summarization by Extracting Sentences and Words

Sentence Extractor

• Their sentence extractor applies attention to directly extract salient sentences after reading them.

• At the beginning of training, they set $p_{t-1}$ to the true label of the previous sentence; as training goes on, they gradually shift its value to the predicted label.
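One way to picture the gradual shift from true to predicted labels is a simple interpolation. The slides do not give the actual schedule, so the linear blend below, along with the function and parameter names, is only an assumed illustration.

```python
def previous_sentence_signal(true_label, predicted_prob, step, total_steps):
    """Blend the true previous label with the model's predicted probability.

    alpha grows linearly from 0 (use the true label) to 1 (use the prediction).
    The linear schedule is an assumption, not taken from the slides.
    """
    alpha = min(1.0, max(0.0, step / total_steps))
    return (1.0 - alpha) * true_label + alpha * predicted_prob

# Early in training the true label dominates; late in training the prediction does.
early = previous_sentence_signal(1, 0.2, step=0, total_steps=1000)
late = previous_sentence_signal(1, 0.2, step=1000, total_steps=1000)
```

Feeding the blended value to the extractor keeps early training stable while still exposing the model to its own predictions before test time.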

Page 11: Neural Summarization by Extracting Sentences and Words

Word Extractor

• a sequential labeling model

• use n-gram features collected from the document to rerank candidate summaries obtained via beam decoding.

• incorporate the features in a log-linear reranker whose feature weights are optimized with minimum error rate training (Och, 2003)
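The reranking step can be sketched as a weighted sum of features per candidate. Everything concrete here is assumed for illustration: the feature set (model log-probability plus bigram and trigram overlap with the document), the feature names, and the hand-set weights, which in the setup described above would instead be tuned with minimum error rate training.

```python
def ngrams(tokens, n):
    """Set of distinct n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def ngram_overlap(candidate, document, n):
    """Fraction of the candidate's distinct n-grams that also occur in the document."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    return len(cand & ngrams(document, n)) / len(cand)

def rerank(candidates, document, weights):
    """Sort beam candidates by a log-linear score (weighted feature sum), best first.

    candidates: list of (tokens, model_log_prob) pairs from beam decoding.
    weights: dict feature name -> weight (hand-set here; hypothetical values).
    """
    def score(item):
        tokens, log_prob = item
        features = {
            "log_prob": log_prob,
            "bigram": ngram_overlap(tokens, document, 2),
            "trigram": ngram_overlap(tokens, document, 3),
        }
        return sum(weights[name] * value for name, value in features.items())

    return sorted(candidates, key=score, reverse=True)

document = "the cat sat on the mat".split()
beam = [("cat the sat".split(), -1.0), ("the cat sat".split(), -1.2)]
ranked = rerank(beam, document, {"log_prob": 1.0, "bigram": 1.0, "trigram": 1.0})
```

In this toy case the second candidate has a lower model score but moves to the top because its n-grams all appear in the document.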


Page 12: Neural Summarization by Extracting Sentences and Words

Experimental Setup

• Datasets:

• two datasets created from DailyMail news: 90% for training, 5% for validation and 5% for testing

• DUC-2002 single document summarization task.
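A 90/5/5 split like the one listed above could be produced along these lines. The shuffling, the fixed seed, and the function name are assumptions; the slides do not say how documents were assigned to each portion.

```python
import random

def split_dataset(items, seed=0):
    """Shuffle and split into 90% train, 5% validation, 5% test.

    Integer arithmetic avoids floating-point rounding; any remainder goes to test.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded shuffle, reproducible (assumption)
    n_train = len(items) * 90 // 100
    n_val = len(items) * 5 // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(100))
```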


Page 13: Neural Summarization by Extracting Sentences and Words

Experimental Setup

• Parameters:

• Adam (learning rate 0.01)

• The two momentum parameters: 0.99 and 0.999

• Batch size of 20 documents

• Sizes of word, sentence, and document embeddings: 150, 300, and 750 (word embeddings are pre-trained)

• Kernel sizes {1, 2, 3, 4, 5, 6, 7}

• Dropout 0.5

• The depth of each LSTM module: 1
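For reference, the settings listed on this slide can be collected into a single configuration object. The class and field names are hypothetical; only the values come from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed on the slide; field names are illustrative."""
    optimizer: str = "adam"
    learning_rate: float = 0.01
    momentum_params: tuple = (0.99, 0.999)  # the two Adam momentum parameters
    batch_size: int = 20                    # documents per mini-batch
    word_dim: int = 150                     # pre-trained word embeddings
    sentence_dim: int = 300
    document_dim: int = 750
    kernel_widths: tuple = (1, 2, 3, 4, 5, 6, 7)
    dropout: float = 0.5
    lstm_depth: int = 1                     # one layer per LSTM module

config = TrainingConfig()
```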

Page 14: Neural Summarization by Extracting Sentences and Words

Results

• LEAD (the leading three sentences)

• LREG (logistic regression)

• ILP

• NN-ABS (Rush et al., 2015)

• TGRAPH (Parveen et al., 2015)

• URANK (Wan, 2010)

• NN-SE (sentence extractor)

• NN-WE (word extractor)
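Of the systems listed, LEAD is simple enough to sketch in a few lines. The regex-based sentence splitter below is a naive assumption used only for illustration; a real evaluation would rely on a proper sentence tokenizer.

```python
import re

def lead_baseline(document, k=3):
    """Return the first k sentences of a document as its summary.

    Sentences are split naively on whitespace that follows '.', '!' or '?'.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", document.strip()) if s]
    return sentences[:k]

summary = lead_baseline("First point. Second point! Third point? Fourth point.")
```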

Page 15: Neural Summarization by Extracting Sentences and Words

Results

• They evaluated the generated summaries by eliciting human judgments for 20 randomly sampled DUC 2002 test documents.

• Subjects were asked to rank the summaries from best to worst (with ties allowed).

• They collected 5 responses per document.

Page 16: Neural Summarization by Extracting Sentences and Words

Results


Page 17: Neural Summarization by Extracting Sentences and Words

Conclusion

• They developed two classes of models based on sentence and word extraction.

• Future Work:

• combining their model with a tree-based algorithm (Cohn and Lapata, 2009)

• or applying it in a phrase-based setting (Lebret et al., 2015).