![Page 1: Classifying Relations via Long Short Term Memory Networks ... · Classifying Relations via Long Short Term Memory Networks along Shortest Dependency Paths . 2 Outline 1. Overview](https://reader031.vdocument.in/reader031/viewer/2022040409/5ec698793f83e745073e85eb/html5/thumbnails/1.jpg)
Institute of Software, School of EECS, Peking University
Yan Xu, Lili Mou, Ge Li, Yunchuan Chen, Hao Peng, Zhi Jin
Sep 21, 2015
Classifying Relations via Long Short Term Memory Networks
along Shortest Dependency Paths
Outline
1. Overview of Relation Extraction
2. Linguistic Phenomenon Related to Relation Extraction
3. Deep Neural Networks
- Multi-Channel Long Short Term Memory Networks
- Regularization: Dropout
4. Experiments
5. Conclusions
Information Extraction
A trillion gallons of water have been poured into an empty region of outer space.
A trillion gallons of [water]e1 have been poured into an empty [region]e2 of outer space.
Named Entity Recognition
Relation Classification
Entity-Destination ([water]e1, [region]e2)
SemEval 2010 Task 8 - Dataset
Evaluation Exercises on Semantic Evaluation - ACL SigLex event
Task 8 – Multi-Way Classification of Semantic Relations Between Pairs of Nominals
Training data
- 8000 training samples
"The system as described above has its greatest application in an arrayed <e1>configuration</e1> of antenna <e2>elements</e2>."
Component-Whole(e2, e1)
Testing data
- 2717 testing samples
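The training sample above marks the two nominals with inline `<e1>` and `<e2>` tags. A minimal sketch of how such a sentence could be split into plain text and entity strings; the function name `parse_tagged_sentence` is an assumption for illustration, not part of the task's tooling:

```python
import re

# Matches <e1>...</e1> or <e2>...</e2>; the backreference \1 ties the
# closing tag to the same name as the opening tag.
ENTITY_TAG = re.compile(r"<(e[12])>(.*?)</\1>")

def parse_tagged_sentence(tagged):
    """Return (plain_sentence, {"e1": text, "e2": text}) for a tagged sample."""
    entities = {tag: text for tag, text in ENTITY_TAG.findall(tagged)}
    # Drop the tags but keep the entity text in place.
    plain = ENTITY_TAG.sub(lambda match: match.group(2), tagged)
    return plain, entities

sample = ("The system as described above has its greatest application in an "
          "arrayed <e1>configuration</e1> of antenna <e2>elements</e2>.")
plain, entities = parse_tagged_sentence(sample)
```

Here `entities` maps `e1` to "configuration" and `e2` to "elements", the pair that the label Component-Whole(e2, e1) refers to.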
SemEval 2010 Task 8 - Relations
(1) Cause-Effect
(2) Instrument-Agency
(3) Product-Producer
(4) Content-Container
(5) Entity-Origin
(6) Entity-Destination
(7) Component-Whole
(8) Member-Collection
(9) Message-Topic
(10) Other
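Because a label such as Component-Whole(e2, e1) also records the argument order, the nine named relations each come in two directions, while Other has none. A small sketch of the resulting label space; the label string format is an assumption modeled on the sample annotation:

```python
# The nine directed relations listed on the slide; "Other" is undirected.
RELATIONS = [
    "Cause-Effect", "Instrument-Agency", "Product-Producer",
    "Content-Container", "Entity-Origin", "Entity-Destination",
    "Component-Whole", "Member-Collection", "Message-Topic",
]

def directed_labels(relations):
    """Expand each relation into both argument orders and append Other."""
    labels = []
    for name in relations:
        labels.append(f"{name}(e1, e2)")
        labels.append(f"{name}(e2, e1)")
    labels.append("Other")
    return labels

LABELS = directed_labels(RELATIONS)
```

This gives 2 x 9 + 1 = 19 distinct labels.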
Motivation
Which type of sentence structure is most appropriate?
Which types of linguistic information can be incorporated?
Dependency Parse Tree
Shortest Dependency Path (SDP)
Directionality
Dependency trees are a kind of directed graph
The relation between two entities is itself directional: a relation over (e1, e2) differs from the same relation over (e2, e1)
Subpath1: [water]e1 of gallons poured
Subpath2: poured into [region]e2
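The two sub-paths above meet at the entities' lowest common ancestor in the dependency tree. A minimal sketch of that split, assuming a hand-written head table for the example sentence (the `HEADS` indices are an illustrative assumption, not parser output):

```python
TOKENS = ["A", "trillion", "gallons", "of", "water", "have", "been",
          "poured", "into", "an", "empty", "region", "of", "outer", "space"]

# Hypothetical dependency heads: token index -> head index; the root
# ("poured", index 7) maps to None.
HEADS = {0: 1, 1: 2, 2: 7, 3: 2, 4: 3, 5: 7, 6: 7, 7: None,
         8: 7, 9: 11, 10: 11, 11: 8, 12: 11, 13: 14, 14: 12}

def path_to_root(index, heads):
    """Collect the node indices from `index` up to the root, inclusive."""
    path = []
    while index is not None:
        path.append(index)
        index = heads[index]
    return path

def split_sdp(e1, e2, heads):
    """Return the two sub-paths, each running from an entity up to the common ancestor."""
    up1 = path_to_root(e1, heads)
    up2 = path_to_root(e2, heads)
    on_path2 = set(up2)
    # The first node on e1's upward path that also lies on e2's upward path.
    ancestor = next(node for node in up1 if node in on_path2)
    return up1[:up1.index(ancestor) + 1], up2[:up2.index(ancestor) + 1]

sub1, sub2 = split_sdp(4, 11, HEADS)   # 4 = "water", 11 = "region"
left = [TOKENS[i] for i in sub1]
right = [TOKENS[i] for i in sub2]
```

With this head table, `left` is water, of, gallons, poured and `right` is region, into, poured; the second list runs from the entity up to the ancestor, the reverse of how Subpath2 is written on the slide.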
Info 1: Word Representations
Word Embeddings
- Word2vec – maps a word to a real-valued vector capturing the word’s syntactic and semantic information (Mikolov, NIPS’ 2013)
Toy example
- Average embeddings + SVM: nearly 79% F1-score
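The feature side of the toy example can be sketched as follows. The three-dimensional vectors are made-up values rather than real word2vec output, and the SVM step is left out:

```python
# Toy embedding table; the numbers are illustrative assumptions.
EMBEDDINGS = {
    "water":  [0.2, 0.4, -0.1],
    "poured": [0.6, -0.2, 0.3],
    "region": [0.1, 0.1, 0.5],
}

def average_embedding(words, table):
    """Average the vectors of all words found in the table, dimension by dimension."""
    vectors = [table[word] for word in words if word in table]
    if not vectors:
        raise ValueError("no known words to average")
    dim = len(vectors[0])
    return [sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)]

# "of" is not in the table and is skipped.
features = average_embedding(["water", "poured", "region", "of"], EMBEDDINGS)
```

The resulting fixed-length vector is what a classifier such as an SVM would take as input.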
Info 2: POS tags
Pair each word on the path with its POS tag
Map the tags heuristically to coarse-grained POS categories
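One way such a heuristic could look, assuming Penn Treebank input tags. The prefix rules and category names below are hypothetical, not the mapping used in the experiments:

```python
def coarse_pos(tag):
    """Collapse a Penn Treebank tag into a coarse category (illustrative rules only)."""
    if tag.startswith("NN"):
        return "NOUN"
    if tag.startswith("VB"):
        return "VERB"
    if tag.startswith("JJ"):
        return "ADJ"
    if tag.startswith("RB"):
        return "ADV"
    if tag in ("IN", "TO"):
        return "PREP"
    return "OTHER"

# Hand-assigned tags for the words on the first sub-path (an assumption).
path = [("water", "NN"), ("of", "IN"), ("gallons", "NNS"), ("poured", "VBN")]
coarse = [(word, coarse_pos(tag)) for word, tag in path]
```

Collapsing NN and NNS, or VBN and VBD, into one category keeps the POS vocabulary small.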
Info 3: Grammatical Relations
A grammatical relation expresses the dependency between a governing word and a dependent word
Some grammatical relations, such as “nsubj”, “dobj”, or “pobj”, strongly reflect semantic relations
In our experiments, grammatical relations are grouped into 19 classes, mainly based on the categories proposed by De Marneffe in LREC’ 2006
Info 4: Hypernyms
With the prior knowledge of hypernyms, we know that “water is a kind of substance.”
This hints that the entities water and region are more likely to be in an “Entity-Destination” relation than in other relations such as “Communication-Topic” or “Cause-Effect”.
Hypernyms are obtained with the help of SuperSenseTagger (Ciaramita, EMNLP’ 06)
Recurrent Neural Network
The recurrent neural network is naturally suited to modeling sequential data.
Gradient vanishing
Weight matrices for the input connections
Weight matrices for the recurrent connections
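The two kinds of weight matrices named above fit the standard recurrent update. A sketch under assumed notation, since none of the symbols appear in the slide text: $x_t$ is the input at step $t$, $h_t$ the hidden state, $W_{\text{in}}$ and $W_{\text{rec}}$ the input and recurrent weights, $b_h$ a bias, and $f$ a nonlinearity.

```latex
h_t = f\left(W_{\text{in}}\, x_t + W_{\text{rec}}\, h_{t-1} + b_h\right)
```

Backpropagating through many steps multiplies the gradient repeatedly by factors that depend on $W_{\text{rec}}$, which is how the gradient can vanish.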
Long Short Term Memory Networks
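For reference, one common formulation of an LSTM unit is sketched below. The symbols are assumptions ($i_t$, $f_t$, $o_t$ are the input, forget, and output gates, $g_t$ the candidate, $c_t$ the memory cell, $\odot$ the element-wise product), and the variant used in this work may differ in detail.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
g_t &= \tanh(W_g x_t + U_g h_{t-1} + b_g) \\
c_t &= i_t \odot g_t + f_t \odot c_{t-1} \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The additive update of $c_t$ is what lets information and gradients persist over longer spans than in the plain recurrent network.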
Framework of SDP-LSTM
Dropout strategies
Randomly omitting
Omitting vs. memorizing
- Apply dropout to the different types of neural network layers separately.
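"Randomly omitting" can be sketched as below. The slides do not say which variant is used; this example uses inverted dropout, which rescales the surviving units at training time, and the layer values and rate are assumptions:

```python
import random

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; divide survivors by (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [value / keep if rng.random() >= rate else 0.0 for value in values]

rng = random.Random(0)                      # fixed seed for repeatable masks
embedding_layer = [0.5, -1.0, 2.0, 0.25]    # hypothetical activations
dropped = dropout(embedding_layer, rate=0.5, rng=rng)
unchanged = dropout(embedding_layer, rate=0.0, rng=rng)
```

Applying the function with a different `rate` per layer type is one way to compare dropout on separate layers.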
Training Objective
Penalized cross-entropy error
Number of classes
Frobenius norm
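The three ingredients named above (cross-entropy over the classes, plus a Frobenius-norm penalty) can be combined as in the following sketch. The penalty coefficient `lam`, the squared form of the norm, and all numeric values are assumptions for illustration:

```python
import math

def penalized_cross_entropy(target, predicted, weight_matrices, lam):
    """Cross-entropy summed over classes plus lam times the squared Frobenius norms."""
    # Only classes with non-zero target probability contribute.
    cross_entropy = -sum(t * math.log(y)
                         for t, y in zip(target, predicted) if t > 0)
    # Squared Frobenius norm = sum of squared entries of each matrix.
    penalty = sum(w * w
                  for matrix in weight_matrices
                  for row in matrix
                  for w in row)
    return cross_entropy + lam * penalty

target = [0.0, 1.0, 0.0]                 # one-hot ground truth over 3 toy classes
predicted = [0.2, 0.7, 0.1]              # hypothetical softmax output
weights = [[[1.0, -2.0], [0.5, 0.0]]]    # one toy 2x2 weight matrix
loss = penalized_cross_entropy(target, predicted, weights, lam=0.01)
```

In the real task the sums would run over all relation classes and all weight matrices of the network.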
Effect of Dropout Strategies
Effects of Different Channels
Channel effects
Traditional recurrent neural network: 82.8%
LSTM over one path: 82.2%
Comparison
New Mission Impossible
86 % ?
Conclusions
Classifying relations is a challenging task due to the inherent ambiguity of natural languages and the diversity of sentence expressions
The shortest dependency path can be a valuable resource for relation classification
Treating the shortest dependency path as two sub-paths helps to capture the directionality
LSTM units are effective in feature detection and propagation along the shortest dependency path
Thanks!
Q&A