dependency parsing - com4513/6513 natural language processing · relation extraction, e.g. identify...
TRANSCRIPT
![Page 1: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/1.jpg)
1
Dependency ParsingCOM4513/6513 Natural Language Processing
Nikos [email protected]
@nikaletras
Computer Science Department
Week 5Spring 2020
![Page 2: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/2.jpg)
2
In previous lectures...
Text Classification: Given an instance x (e.g. document),predict a label y ∈ YTasks: sentiment analysis, topic classification, etc.
Algorithm: Logistic Regression
![Page 3: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/3.jpg)
3
In previous lectures...
Sequence labelling: Given a sequence of wordsx = [x1, ...xN ], predict a sequence of labels y ∈ YN
Tasks: part of speech tagging, named entity recognition, etc.
Algorithms: Hidden Markov Models, Conditional RandomFields
![Page 4: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/4.jpg)
4
In this lecture...
Model richer linguistic representations: graphs
Dependency parses: Graphs representing syntactic relationsbetween words in a sentence
Two approaches:
Graph-based Dependency ParsingTransition-based Dependency Parsing
![Page 5: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/5.jpg)
4
In this lecture...
Model richer linguistic representations: graphs
Dependency parses: Graphs representing syntactic relationsbetween words in a sentence
Two approaches:
Graph-based Dependency ParsingTransition-based Dependency Parsing
![Page 6: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/6.jpg)
5
Dependecny Parsing: Applications
Relation extraction, e.g. identify entity pairs (AM, ArcticMonkeys), (Abbey Road, Beatles), (Different Class, Pulp)with the relation music album by
Question answering
Sentiment analysis
![Page 7: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/7.jpg)
6
What is a Dependency Parse?
Dependency parse (or tree): Graph representing syntacticrelations between words in a sentence
Nodes (or vertices): Words in a sentence
Edges (or arcs): Syntactic relations between words, e.g. dog
is the subject (nsubj) of likes (list of standard dependencyrelations)
Dependency Parsing: Automatically identify the syntacticrelations between words in a sentence
![Page 8: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/8.jpg)
6
What is a Dependency Parse?
Dependency parse (or tree): Graph representing syntacticrelations between words in a sentence
Nodes (or vertices): Words in a sentence
Edges (or arcs): Syntactic relations between words, e.g. dog
is the subject (nsubj) of likes (list of standard dependencyrelations)
Dependency Parsing: Automatically identify the syntacticrelations between words in a sentence
![Page 9: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/9.jpg)
6
What is a Dependency Parse?
Dependency parse (or tree): Graph representing syntacticrelations between words in a sentence
Nodes (or vertices): Words in a sentence
Edges (or arcs): Syntactic relations between words, e.g. dog
is the subject (nsubj) of likes (list of standard dependencyrelations)
Dependency Parsing: Automatically identify the syntacticrelations between words in a sentence
![Page 10: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/10.jpg)
7
Graph constraints
Connected: every word can be reached from any other wordignoring edge directionality
Acyclic: can’t re-visit the same word on a directed path
Single-Head: every word can have only one head
![Page 11: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/11.jpg)
7
Graph constraints
Connected: every word can be reached from any other wordignoring edge directionality
Acyclic: can’t re-visit the same word on a directed path
Single-Head: every word can have only one head
![Page 12: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/12.jpg)
7
Graph constraints
Connected: every word can be reached from any other wordignoring edge directionality
Acyclic: can’t re-visit the same word on a directed path
Single-Head: every word can have only one head
![Page 13: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/13.jpg)
8
Well-formed Dependency Parse?
Connected?
NO
Acyclic? YES
Single-headed? YES
Solution?
![Page 14: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/14.jpg)
8
Well-formed Dependency Parse?
Connected? NO
Acyclic? YES
Single-headed? YES
Solution?
![Page 15: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/15.jpg)
8
Well-formed Dependency Parse?
Connected? NO
Acyclic?
YES
Single-headed? YES
Solution?
![Page 16: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/16.jpg)
8
Well-formed Dependency Parse?
Connected? NO
Acyclic? YES
Single-headed? YES
Solution?
![Page 17: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/17.jpg)
8
Well-formed Dependency Parse?
Connected? NO
Acyclic? YES
Single-headed?
YES
Solution?
![Page 18: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/18.jpg)
8
Well-formed Dependency Parse?
Connected? NO
Acyclic? YES
Single-headed? YES
Solution?
![Page 19: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/19.jpg)
9
Well-formed Dependency Parse
Add a special root node with edges to any nodes without heads(main verb and punctuation).
![Page 20: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/20.jpg)
10
Dependency Parsing: Problem setup
Training data is pairs of word sequences (sentences) anddependency trees:
Dtrain = {(x1,G 1x )...(xM ,G M
x )}xm = [x1, ...xN ]
graph Gx = (Vx,Ax)
vertices Vx = {0, 1, ...,N}edges Ax = {(i , j , k)|i , j ∈ V , k ∈ L(labels)}
We want to learn a model to predict the best graph:
Gx = arg maxGx∈Gx
score(Gx, x)
![Page 21: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/21.jpg)
10
Dependency Parsing: Problem setup
Training data is pairs of word sequences (sentences) anddependency trees:
Dtrain = {(x1,G 1x )...(xM ,G M
x )}xm = [x1, ...xN ]
graph Gx = (Vx,Ax)
vertices Vx = {0, 1, ...,N}edges Ax = {(i , j , k)|i , j ∈ V , k ∈ L(labels)}
We want to learn a model to predict the best graph:
Gx = arg maxGx∈Gx
score(Gx, x)
![Page 22: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/22.jpg)
11
Learning a dependency parser
We want to learn a model to predict the best graph:
Gx = arg maxGx∈Gx
score(Gx, x)
where the Gx is a well-formed dependency tree.
Can we learn it using what we know so far?
Enumerationover all possible graphs will be expensive.
How about a classifier that predicts each edge? Maybe.But predicting an edge makes some edges invalid due to theacyclic and single-head constraints.
![Page 23: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/23.jpg)
11
Learning a dependency parser
We want to learn a model to predict the best graph:
Gx = arg maxGx∈Gx
score(Gx, x)
where the Gx is a well-formed dependency tree.
Can we learn it using what we know so far? Enumerationover all possible graphs will be expensive.
How about a classifier that predicts each edge? Maybe.But predicting an edge makes some edges invalid due to theacyclic and single-head constraints.
![Page 24: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/24.jpg)
11
Learning a dependency parser
We want to learn a model to predict the best graph:
Gx = arg maxGx∈Gx
score(Gx, x)
where the Gx is a well-formed dependency tree.
Can we learn it using what we know so far? Enumerationover all possible graphs will be expensive.
How about a classifier that predicts each edge?
Maybe.But predicting an edge makes some edges invalid due to theacyclic and single-head constraints.
![Page 25: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/25.jpg)
11
Learning a dependency parser
We want to learn a model to predict the best graph:
Gx = arg maxGx∈Gx
score(Gx, x)
where the Gx is a well-formed dependency tree.
Can we learn it using what we know so far? Enumerationover all possible graphs will be expensive.
How about a classifier that predicts each edge? Maybe.But predicting an edge makes some edges invalid due to theacyclic and single-head constraints.
![Page 26: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/26.jpg)
12
Maximum Spanning Tree
Spanning Tree: In graph theory, a spanning tree T of anundirected graph G is a subgraph that includes all of thevertices of G, with the minimum possible number of edges.
Tree: In computer science, a tree is a widely used datastructure (Abstract Data Type) that simulates a hierarchicaltree structure, with a root value and subtrees of children witha parent node, represented as a set of linked nodes.
![Page 27: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/27.jpg)
12
Maximum Spanning Tree
Spanning Tree: In graph theory, a spanning tree T of anundirected graph G is a subgraph that includes all of thevertices of G, with the minimum possible number of edges.
Tree: In computer science, a tree is a widely used datastructure (Abstract Data Type) that simulates a hierarchicaltree structure, with a root value and subtrees of children witha parent node, represented as a set of linked nodes.
![Page 28: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/28.jpg)
13
Maximum Spanning Tree
Score all edges, but keep only the max spanning tree usingChu-Liu-Edmonds algorithm, a modification to Kruskal’s algorithmfor extracting Maximum Spanning Trees.
Exact solution in O(N2) time using Chu-Liu-Edmonds.
![Page 29: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/29.jpg)
14
Kruskal’s algorithm
Input scored edges E
sort E by cost (opposit of score)
G = {}while G not spanning do:
pop the next edge e
if connecting different trees :
add e to G
Return G
![Page 30: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/30.jpg)
15
Graph-based Dependency Parsing
Decompose the graph score into arc scores:
Gx = arg maxGx∈Gx
score(Gx, x)
= arg maxGx∈Gx
w · Φ(Gx, x) (linear model)
= arg maxGx∈Gx
∑(i ,j ,l)∈Ax
w · φ((i , j , l), x) (arc-factored)
Can learn the weights with a Conditional Random Field!
![Page 31: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/31.jpg)
16
Feature Representation
What should φ((head , dependent, label), x) be?
unigram: head=had, head=VERB
bigram: head=had & dependent=effect
head=VERB & dependent=NOUN & between=ADJ
head=had & label=dobj & other-label=nsubj
NO!!! Breaks the arc-factored scoring
![Page 32: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/32.jpg)
16
Feature Representation
What should φ((head , dependent, label), x) be?
unigram: head=had, head=VERB
bigram: head=had & dependent=effect
head=VERB & dependent=NOUN & between=ADJ
head=had & label=dobj & other-label=nsubj
NO!!! Breaks the arc-factored scoring
![Page 33: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/33.jpg)
16
Feature Representation
What should φ((head , dependent, label), x) be?
unigram: head=had, head=VERB
bigram: head=had & dependent=effect
head=VERB & dependent=NOUN & between=ADJ
head=had & label=dobj & other-label=nsubj
NO!!! Breaks the arc-factored scoring
![Page 34: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/34.jpg)
16
Feature Representation
What should φ((head , dependent, label), x) be?
unigram: head=had, head=VERB
bigram: head=had & dependent=effect
head=VERB & dependent=NOUN & between=ADJ
head=had & label=dobj & other-label=nsubj
NO!!! Breaks the arc-factored scoring
![Page 35: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/35.jpg)
16
Feature Representation
What should φ((head , dependent, label), x) be?
unigram: head=had, head=VERB
bigram: head=had & dependent=effect
head=VERB & dependent=NOUN & between=ADJ
head=had & label=dobj & other-label=nsubjNO!!! Breaks the arc-factored scoring
![Page 36: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/36.jpg)
17
More Global models
Even though inference and learning are global, features arelocalised to arcs.
Can we have more global features?
Yes we can! Considersubgraphs spanning a few edges. But inference becomesharder, requiring more complex dynamic programs and cleverapproximations.
Is it worth it? Syntactic parsing has many applications, thusbetter compromises between speed and accuracy are alwayswelcome!
![Page 37: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/37.jpg)
17
More Global models
Even though inference and learning are global, features arelocalised to arcs.
Can we have more global features? Yes we can! Considersubgraphs spanning a few edges. But inference becomesharder, requiring more complex dynamic programs and cleverapproximations.
Is it worth it? Syntactic parsing has many applications, thusbetter compromises between speed and accuracy are alwayswelcome!
![Page 38: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/38.jpg)
18
Transition-based Dependency Parsing
Graph-based dependency parsing restricts the features toperform joint inference efficiently.
Transition-based dependency parsing trades joint inferencefor feature flexibility.
No more argmax over graphs, just use a classifier with anyfeatures we want!
![Page 39: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/39.jpg)
18
Transition-based Dependency Parsing
Graph-based dependency parsing restricts the features toperform joint inference efficiently.
Transition-based dependency parsing trades joint inferencefor feature flexibility.
No more argmax over graphs, just use a classifier with anyfeatures we want!
![Page 40: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/40.jpg)
18
Transition-based Dependency Parsing
Graph-based dependency parsing restricts the features toperform joint inference efficiently.
Transition-based dependency parsing trades joint inferencefor feature flexibility.
No more argmax over graphs, just use a classifier with anyfeatures we want!
![Page 41: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/41.jpg)
19
Joint vs incremental prediction
Joint: score (and enumerate) complete outputs (graphs)
![Page 42: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/42.jpg)
20
Joint vs incremental prediction
Incremental: predict a sequenceof actions (transitions)constructing the output
![Page 43: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/43.jpg)
21
Transition system
The actions A the classifier f can predict and their effect on thestate which tracks the prediction: St+1 = S1(α1 . . . αt)
What should the actions (transitions) be for dependency parsing?
![Page 44: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/44.jpg)
21
Transition system
The actions A the classifier f can predict and their effect on thestate which tracks the prediction: St+1 = S1(α1 . . . αt)
What should the actions (transitions) be for dependency parsing?
![Page 45: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/45.jpg)
22
Transition system setup
Input: Vertices Vx = {0, 1, ...,N} (words sentence x)
State S = (Stack,B,A):
Arcs A (dependencies predicted so far)Buffer Buf (words left to process)Stack Stack (last-in, first out memory)
Initial state: S0 = ([], [0, 1, ...,N], {})Final state: Sfinal = (Stack , [],A)
![Page 46: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/46.jpg)
23
Transition system
Shift (Stack , i |Buf ,A)→ (Stack|i ,Buf ,A): push next word fromthe buffer (i) to stack
Reduce (Stack|i ,Buf ,A)→ (Stack,Buf ,A): pop word top of thestack (i) if it has a head
Right-Arc(label) (Stack|i , j |Buf ,A)→(Stack |i |j ,Buf ,A ∪ {(i , j , l)}): create edge (i , j , label) between topof the stack (i) and next in buffer (j), push j
Left-Arc(label) (Stack|i , j |Buf ,A)→ (Stack, j |Buf ,A ∪ {(j , i , l)}):create edge (j , i , label) and pop i , if i has no head
![Page 47: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/47.jpg)
23
Transition system
Shift (Stack , i |Buf ,A)→ (Stack|i ,Buf ,A): push next word fromthe buffer (i) to stack
Reduce (Stack|i ,Buf ,A)→ (Stack ,Buf ,A): pop word top of thestack (i) if it has a head
Right-Arc(label) (Stack|i , j |Buf ,A)→(Stack |i |j ,Buf ,A ∪ {(i , j , l)}): create edge (i , j , label) between topof the stack (i) and next in buffer (j), push j
Left-Arc(label) (Stack|i , j |Buf ,A)→ (Stack, j |Buf ,A ∪ {(j , i , l)}):create edge (j , i , label) and pop i , if i has no head
![Page 48: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/48.jpg)
23
Transition system
Shift (Stack , i |Buf ,A)→ (Stack|i ,Buf ,A): push next word fromthe buffer (i) to stack
Reduce (Stack|i ,Buf ,A)→ (Stack ,Buf ,A): pop word top of thestack (i) if it has a head
Right-Arc(label) (Stack|i , j |Buf ,A)→(Stack |i |j ,Buf ,A ∪ {(i , j , l)}): create edge (i , j , label) between topof the stack (i) and next in buffer (j), push j
Left-Arc(label) (Stack|i , j |Buf ,A)→ (Stack, j |Buf ,A ∪ {(j , i , l)}):create edge (j , i , label) and pop i , if i has no head
![Page 49: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/49.jpg)
23
Transition system
Shift (Stack , i |Buf ,A)→ (Stack|i ,Buf ,A): push next word fromthe buffer (i) to stack
Reduce (Stack|i ,Buf ,A)→ (Stack ,Buf ,A): pop word top of thestack (i) if it has a head
Right-Arc(label) (Stack|i , j |Buf ,A)→(Stack |i |j ,Buf ,A ∪ {(i , j , l)}): create edge (i , j , label) between topof the stack (i) and next in buffer (j), push j
Left-Arc(label) (Stack |i , j |Buf ,A)→ (Stack, j |Buf ,A ∪ {(j , i , l)}):create edge (j , i , label) and pop i , if i has no head
![Page 50: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/50.jpg)
24
Example
Stack = []Buffer = [ROOT, Economic, news, had, little, effect, on, financial,markets, .]
Action?
Shift
![Page 51: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/51.jpg)
24
Example
Stack = []Buffer = [ROOT, Economic, news, had, little, effect, on, financial,markets, .]
Action? Shift
![Page 52: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/52.jpg)
25
Example
Stack = [ROOT]Buffer = [Economic, news, had, little, effect, on, financial,markets, .]
Action?
Shift
![Page 53: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/53.jpg)
25
Example
Stack = [ROOT]Buffer = [Economic, news, had, little, effect, on, financial,markets, .]
Action? Shift
![Page 54: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/54.jpg)
26
Example
Stack = [ROOT, Economic]Buffer = [news, had, little, effect, on, financial, markets, .]
Action?
Left-Arc(amod)
![Page 55: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/55.jpg)
26
Example
Stack = [ROOT, Economic]Buffer = [news, had, little, effect, on, financial, markets, .]
Action? Left-Arc(amod)
![Page 56: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/56.jpg)
27
Example
Stack = [ROOT]Buffer = [news, had, little, effect, on, financial, markets, .]
Action?
Shift
![Page 57: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/57.jpg)
27
Example
Stack = [ROOT]Buffer = [news, had, little, effect, on, financial, markets, .]
Action? Shift
![Page 58: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/58.jpg)
28
Example
Stack = [ROOT, news]Buffer = [had, little, effect, on, financial, markets, .]
Action?
Left-Arc(nsubj)
![Page 59: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/59.jpg)
28
Example
Stack = [ROOT, news]Buffer = [had, little, effect, on, financial, markets, .]
Action? Left-Arc(nsubj)
![Page 60: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/60.jpg)
29
Example
Stack = [ROOT]Buffer = [had, little, effect, on, financial, markets, .]
Action?
Right-Arc(root)
![Page 61: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/61.jpg)
29
Example
Stack = [ROOT]Buffer = [had, little, effect, on, financial, markets, .]
Action? Right-Arc(root)
![Page 62: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/62.jpg)
30
Example
Stack = [ROOT, had]Buffer = [little, effect, on, financial, markets, .]
Action?
Shift
![Page 63: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/63.jpg)
30
Example
Stack = [ROOT, had]Buffer = [little, effect, on, financial, markets, .]
Action? Shift
![Page 64: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/64.jpg)
31
Example
Stack = [ROOT, had, little]Buffer = [effect, on, financial, markets, .]
Action?
Left-Arc(amod)
![Page 65: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/65.jpg)
31
Example
Stack = [ROOT, had, little]Buffer = [effect, on, financial, markets, .]
Action? Left-Arc(amod)
![Page 66: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/66.jpg)
32
Example
Stack = [ROOT, had]Buffer = [effect, on, financial, markets, .]
Action?
Right-Arc(dobj)
![Page 67: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/67.jpg)
32
Example
Stack = [ROOT, had]Buffer = [effect, on, financial, markets, .]
Action? Right-Arc(dobj)
![Page 68: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/68.jpg)
33
Example
Stack = [ROOT, had, effect]Buffer = [on, financial, markets, .]
Action?
let’s fast-forward...
![Page 69: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/69.jpg)
33
Example
Stack = [ROOT, had, effect]Buffer = [on, financial, markets, .]
Action? let’s fast-forward...
![Page 70: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/70.jpg)
34
Example
Stack = [ROOT, had, .]Buffer = []
Empty buffer.
DONE!
![Page 71: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/71.jpg)
34
Example
Stack = [ROOT, had, .]Buffer = []
Empty buffer. DONE!
![Page 72: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/72.jpg)
35
Other transition systems?
This was the arc-eager system. Others:
arc-standard (3 actions)easy-first (not left-to-right), etc.
All operate with actions combining:
moving words from the buffer to the stack and back(shift/un-shift)popping words from the stack (reduce)creating labeled arcs left and right
Intuition: Define actions that are easy to learn
![Page 73: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/73.jpg)
35
Other transition systems?
This was the arc-eager system. Others:
arc-standard (3 actions)
easy-first (not left-to-right), etc.
All operate with actions combining:
moving words from the buffer to the stack and back(shift/un-shift)popping words from the stack (reduce)creating labeled arcs left and right
Intuition: Define actions that are easy to learn
![Page 74: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/74.jpg)
35
Other transition systems?
This was the arc-eager system. Others:
arc-standard (3 actions)easy-first (not left-to-right), etc.
All operate with actions combining:
moving words from the buffer to the stack and back(shift/un-shift)popping words from the stack (reduce)creating labeled arcs left and right
Intuition: Define actions that are easy to learn
![Page 75: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/75.jpg)
35
Other transition systems?
This was the arc-eager system. Others:
arc-standard (3 actions)easy-first (not left-to-right), etc.
All operate with actions combining:
moving words from the buffer to the stack and back(shift/un-shift)popping words from the stack (reduce)creating labeled arcs left and right
Intuition: Define actions that are easy to learn
![Page 76: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/76.jpg)
36
Transition-based Dependency Parsing
Input: sentence x
state S1 = initialize(x); timestep t = 1
while St not final do
action αt = arg maxα∈A
f (α,St)
St+1 = St(αt); t = t + 1
What is f ?
A multiclass classifierWhat do we need to learn it?
learning algorithm (e.g. logistic regression)
labelled training data
feature representation
![Page 77: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/77.jpg)
36
Transition-based Dependency Parsing
Input: sentence x
state S1 = initialize(x); timestep t = 1
while St not final do
action αt = arg maxα∈A
f (α,St)
St+1 = St(αt); t = t + 1
What is f ? A multiclass classifierWhat do we need to learn it?
learning algorithm (e.g. logistic regression)
labelled training data
feature representation
![Page 78: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/78.jpg)
36
Transition-based Dependency Parsing
Input: sentence x
state S1 = initialize(x); timestep t = 1
while St not final do
action αt = arg maxα∈A
f (α,St)
St+1 = St(αt); t = t + 1
What is f ? A multiclass classifierWhat do we need to learn it?
learning algorithm (e.g. logistic regression)
labelled training data
feature representation
![Page 79: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/79.jpg)
37
What are the right actions?
We only have sentences labelledwith graphs:Dtrain = {(x1,G 1
x )...(xM ,G Mx )}
Ask an oracle to tell us theactions constructing the graph!
In our case, a set of rules comparing the current stateS = (Stack ,Buffer ,ArcsPredicted) with Gx returning the correctaction as label
![Page 80: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/80.jpg)
37
What are the right actions?
We only have sentences labelledwith graphs:Dtrain = {(x1,G 1
x )...(xM ,G Mx )}
Ask an oracle to tell us theactions constructing the graph!
In our case, a set of rules comparing the current stateS = (Stack,Buffer ,ArcsPredicted) with Gx returning the correctaction as label
![Page 81: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/81.jpg)
37
What are the right actions?
We only have sentences labelledwith graphs:Dtrain = {(x1,G 1
x )...(xM ,G Mx )}
Ask an oracle to tell us theactions constructing the graph!
In our case, a set of rules comparing the current stateS = (Stack ,Buffer ,ArcsPredicted) with Gx returning the correctaction as label
![Page 82: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/82.jpg)
38
Learning from an oracle
Given a labelled sentence and a transition system, an oracle returnsstates labelled with the correct actions.
Dtrain = {(x1,G 1x )...(xM ,G M
x )}xm = [x1, ..., xN ]
graph Gx = (Vx,Ax)
vertices Vx = {0, 1, ...,N}edges Ax = {(i , j , k)|i , j ∈ V , k ∈ L(labels)}
states Sm= [S1, ...,ST ]
actions αm= [α1, ..., αT ]
![Page 83: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/83.jpg)
39
Feature Representation
Stack = [ROOT, had, effect]Buffer = [on, financial, markets, .]
What features would help us predict the correction actionRight-Arc(prep)?
![Page 84: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/84.jpg)
40
Feature Representation
Words/PoS in stack and buffer:wordS1=effect, wordB1=on, wordS2=had, posS1=NOUN,etc.
Dependencies so far:depS1=dobj, depLeftChildS1=amod,depRightChildS1=NULL, etc.
Previous actions:αt−1 = Right-Arc(dobj), etc.
![Page 85: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/85.jpg)
40
Feature Representation
Words/PoS in stack and buffer:wordS1=effect, wordB1=on, wordS2=had, posS1=NOUN,etc.
Dependencies so far:depS1=dobj, depLeftChildS1=amod,depRightChildS1=NULL, etc.
Previous actions:αt−1 = Right-Arc(dobj), etc.
![Page 86: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/86.jpg)
40
Feature Representation
Words/PoS in stack and buffer:wordS1=effect, wordB1=on, wordS2=had, posS1=NOUN,etc.
Dependencies so far:depS1=dobj, depLeftChildS1=amod,depRightChildS1=NULL, etc.
Previous actions:αt−1 = Right-Arc(dobj), etc.
![Page 87: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/87.jpg)
41
Transition-based vs Graph-based parsing
Transition-based tends to be better on shorter sentences,graph-based on longer ones
Graph-based tends to be better on long-range dependencies
Graph-based lacks the rich structural features
Transition-based is greedy and suffers from early mistakes
Actually, can we ameliorate the greedy issue?
Use Beam Search!
![Page 88: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/88.jpg)
41
Transition-based vs Graph-based parsing
Transition-based tends to be better on shorter sentences,graph-based on longer ones
Graph-based tends to be better on long-range dependencies
Graph-based lacks the rich structural features
Transition-based is greedy and suffers from early mistakes
Actually, can we ameliorate the greedy issue?Use Beam Search!
![Page 89: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/89.jpg)
41
Transition-based vs Graph-based parsing
Transition-based tends to be better on shorter sentences,graph-based on longer ones
Graph-based tends to be better on long-range dependencies
Graph-based lacks the rich structural features
Transition-based is greedy and suffers from early mistakes
Actually, can we ameliorate the greedy issue?Use Beam Search!
![Page 90: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/90.jpg)
41
Transition-based vs Graph-based parsing
Transition-based tends to be better on shorter sentences,graph-based on longer ones
Graph-based tends to be better on long-range dependencies
Graph-based lacks the rich structural features
Transition-based is greedy and suffers from early mistakes
Actually, can we ameliorate the greedy issue?Use Beam Search!
![Page 91: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/91.jpg)
41
Transition-based vs Graph-based parsing
Transition-based tends to be better on shorter sentences,graph-based on longer ones
Graph-based tends to be better on long-range dependencies
Graph-based lacks the rich structural features
Transition-based is greedy and suffers from early mistakes
Actually, can we ameliorate the greedy issue?Use Beam Search!
![Page 92: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/92.jpg)
42
Non-Projectivity
Arcs are crossing each other
long-range dependencies
free word order
![Page 93: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/93.jpg)
43
Non-projective Transition-based parsing
The standard stack-based systems cannot do it.
But there are extensions:
swap actions: word reoderingk-planar parsing: use multiple stacks (usually 2)
Standard graph-based parsing handles non-projectivity.
![Page 94: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/94.jpg)
44
Incremental Language Processing
Other problems solved with similar approaches (a.k.a.transition-based, greedy):
semantic parsing (converting a natural language utterance to alogical form)coreference resolution
Whenever you have a problem with a very large space ofoutputs, worth considering
![Page 95: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/95.jpg)
45
Evaluation
Head-finding word-accuracy:unlabelled: % of words with the right headlabelled: % of words with the right head and label
Sentence accuracy: % of sentences with correct graph
![Page 96: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/96.jpg)
46
Bibliography
Chapter 11 from Eisenstein
Nivre and McDonald’s tutorial slides
Nivre’s article on deterministic transition-based dependencyparsing
Nivre and McDonald’s paper comparing their approaches
![Page 97: Dependency Parsing - COM4513/6513 Natural Language Processing · Relation extraction, e.g. identify entity pairs (AM, Arctic Monkeys), (Abbey Road, Beatles), (Di erent Class, Pulp)](https://reader033.vdocument.in/reader033/viewer/2022042420/5f3779143d229c7a6f7ce6d5/html5/thumbnails/97.jpg)
47
Coming up next week...
Feed-forward Neural Networks
Getting ready for Assignment 2!