  • Dependency Parsing - Part II

    Pawan Goyal

    CSE, IIT Kharagpur

    October 16, 2014

    Pawan Goyal (IIT Kharagpur) Dependency Parsing - Part II October 16, 2014 1 / 48

  • Maximum Spanning Tree Based

    Basic Idea: Starting from all possible connections, find the maximum spanning tree.

  • Directed Spanning Trees

    A directed spanning tree of a (multi-)digraph $G = (V, A)$ is a subgraph $G' = (V', A')$ such that:

    - $V' = V$
    - $A' \subseteq A$, and $|A'| = |V'| - 1$
    - $G'$ is a tree (acyclic)

    A spanning tree of the following (multi-)digraphs

  • Weighted Directed Spanning Trees

    Assume we have a weight function for each arc in a multi-digraph $G = (V, A)$.

    Define $w_{ijk} \ge 0$ to be the weight of $(i, j, k) \in A$ for a multi-digraph.

    Define the weight of a directed spanning tree $G'$ of graph $G$ as

    $$w(G') = \sum_{(i,j,k) \in G'} w_{ijk}$$

  • Maximum Spanning Trees (MST)

    Let $T(G)$ be the set of all spanning trees for graph $G$.

    The MST problem

    Find the spanning tree $G'$ of the graph $G$ that has the highest weight:

    $$G' = \arg\max_{G' \in T(G)} w(G') = \arg\max_{G' \in T(G)} \sum_{(i,j,k) \in G'} w_{ijk}$$

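For a very small graph, the MST definition can be checked directly by enumerating candidate trees. The following is a minimal sketch, not an efficient algorithm. It assumes a simple digraph (one arc per ordered pair, so the label index k is dropped), vertex 0 as the root of every tree, and hypothetical weights chosen only for illustration. The names `is_tree` and `brute_force_mst` are made up for this sketch.

```python
from itertools import product


def is_tree(head):
    """True if every vertex reaches the root (vertex 0) by following heads."""
    for start in head:
        seen, v = set(), start
        while v != 0:
            if v in seen:          # came back to a visited vertex: cycle
                return False
            seen.add(v)
            v = head[v]
    return True


def brute_force_mst(weights, n):
    """Enumerate all head assignments for vertices 1..n and keep the heaviest tree.

    weights: {(head, dependent): weight}; returns (head_map, total_weight).
    """
    best, best_weight = None, float("-inf")
    for heads in product(range(n + 1), repeat=n):
        head = dict(zip(range(1, n + 1), heads))   # head[d] = chosen head of d
        if any((h, d) not in weights for d, h in head.items()):
            continue                               # uses an arc that is not in A
        if not is_tree(head):
            continue
        total = sum(weights[h, d] for d, h in head.items())
        if total > best_weight:
            best, best_weight = head, total
    return best, best_weight


# Hypothetical weights on a root (0) plus three vertices.
weights = {(0, 1): 9, (0, 2): 10, (0, 3): 9,
           (1, 2): 20, (2, 1): 30, (2, 3): 30,
           (3, 2): 0, (1, 3): 3, (3, 1): 11}
best_tree, best_weight = brute_force_mst(weights, 3)
```

Each candidate has exactly |V| - 1 arcs because every non-root vertex gets one head. The enumeration grows exponentially with the number of vertices, so it is useful only as a check on tiny inputs.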

  • Finding MST

    Directed Graph

    For each sentence $x$, define the directed graph $G_x = (V_x, E_x)$ given by

    $$V_x = \{x_0 = \text{root}, x_1, \ldots, x_n\}$$
    $$E_x = \{(i, j) : i \ne j,\ (i, j) \in [0:n] \times [1:n]\}$$

    $G_x$ is a graph with

    - the sentence words and the dummy root symbol as vertices,
    - a directed edge between every pair of distinct words, and
    - a directed edge from the root symbol to every word

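The construction of the sentence graph can be sketched in a few lines. This is an illustrative sketch only: the function name `build_sentence_graph` and the `"<root>"` placeholder string are assumptions, and arcs are stored as index pairs (i, j) meaning "i is the head of j".

```python
def build_sentence_graph(words):
    """Return (vertices, edges) of the dense graph for one sentence.

    Vertex 0 is the dummy root; vertices 1..n are the words.
    Edges are all (i, j) with i in [0:n], j in [1:n], i != j,
    so no edge ever points into the root.
    """
    n = len(words)
    vertices = ["<root>"] + list(words)
    edges = [(i, j) for i in range(n + 1) for j in range(1, n + 1) if i != j]
    return vertices, edges


vertices, edges = build_sentence_graph(["John", "saw", "Mary"])
```

For a sentence of n words this gives n arcs out of the root plus n(n - 1) arcs between distinct words, that is n² arcs in total.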

  • MST and Dependency Graph

    Theorem: Every valid dependency graph for sentence $x$ is equivalent to a directed spanning tree for $G_x$ that originates out of vertex 0.

    Three problems

    - Defining $w_{ijk}$ corresponding to a feature space
    - Learning $w_{ijk}$ from gold-standard data
    - At run-time, inferring the MST for a sentence $x$ using the learnt weights

    Let's talk about the inference part first.

  • Chu-Liu-Edmonds Algorithm

    Each vertex in the graph greedily selects the incoming edge with the highest weight.

    If a tree results, it must be a maximum spanning tree. If not, there must be a cycle.

    - Identify the cycle and contract it into a single vertex.
    - Recalculate edge weights going into and out of the cycle.

  • Chu-Liu-Edmonds Algorithm

    x = John saw Mary

    Build the directed graph

  • Chu-Liu-Edmonds Algorithm

    Find the highest scoring incoming arc for each vertex

    If this is a tree, then we have found MST.

  • Chu-Liu-Edmonds Algorithm

    If not a tree, identify cycle and contract

    Recalculate arc weights into and out-of cycle

  • Chu-Liu-Edmonds Algorithm

    Outgoing arc weights: equal to the max of the outgoing arc over all vertices in the cycle

    e.g., John → Mary is 3 and saw → Mary is 30, so the arc from the contracted vertex to Mary gets weight 30.

  • Chu-Liu-Edmonds Algorithm

    Incoming arc weights: equal to the weight of the best spanning tree that includes the head of the incoming arc and all nodes in the cycle

    root → saw → John is 40
    root → John → saw is 29

  • Chu-Liu-Edmonds Algorithm

    Calling the algorithm again on the contracted graph:

    This is a tree and the MST for the contracted graph

    Go back up the recursive call and reconstruct final graph

  • Chu-Liu-Edmonds Algorithm

    The edge from $w_{js}$ to Mary was from saw.

    The edge from root to $w_{js}$ represented a tree from root to saw to John.

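The walk-through above can be sketched as a short recursive implementation. This is a minimal sketch of the greedy-select, contract, recurse, expand scheme, assuming a simple digraph with integer vertex ids, no arcs into the root, and at least one incoming arc for every non-root vertex. The arc weights are assumptions: they were chosen to agree with the values quoted above (John → Mary = 3, saw → Mary = 30, root → saw → John = 40, root → John → saw = 29). The function names are hypothetical.

```python
def find_cycle(head):
    """Return the vertices of one cycle in {dependent: head}, or None."""
    for start in head:
        path, v = [], start
        while v in head and v not in path:
            path.append(v)
            v = head[v]
        if v in path:                       # walked back onto the path
            return path[path.index(v):]
    return None


def chu_liu_edmonds(scores, root=0):
    """scores: {(head, dep): weight}. Returns {dep: head} of the max spanning tree."""
    nodes = {n for arc in scores for n in arc}
    # Step 1: each non-root vertex greedily picks its best incoming arc.
    head = {}
    for d in nodes - {root}:
        head[d] = max((h for (h, dd) in scores if dd == d),
                      key=lambda h: scores[h, d])
    cycle = find_cycle(head)
    if cycle is None:
        return head
    # Step 2: contract the cycle into a new vertex c and recompute arc weights.
    c = max(nodes) + 1
    cycle_weight = sum(scores[head[v], v] for v in cycle)
    new_scores, origin = {}, {}
    for (h, d), w in scores.items():
        if h in cycle and d in cycle:
            continue
        if h in cycle:                      # outgoing: keep the max over the cycle
            key, val = (c, d), w
        elif d in cycle:                    # incoming: enter at d, drop d's cycle arc
            key, val = (h, c), w + cycle_weight - scores[head[d], d]
        else:
            key, val = (h, d), w
        if key not in new_scores or val > new_scores[key]:
            new_scores[key], origin[key] = val, (h, d)
    # Step 3: solve the contracted graph, then map its arcs back.
    contracted = chu_liu_edmonds(new_scores, root)
    tree = {}
    for d, h in contracted.items():
        orig_h, orig_d = origin[h, d]
        tree[orig_d] = orig_h
    for v in cycle:                         # remaining cycle arcs are kept
        tree.setdefault(v, head[v])
    return tree


ROOT, JOHN, SAW, MARY = 0, 1, 2, 3
scores = {(ROOT, JOHN): 9, (ROOT, SAW): 10, (ROOT, MARY): 9,
          (JOHN, SAW): 20, (SAW, JOHN): 30, (SAW, MARY): 30,
          (MARY, SAW): 0, (JOHN, MARY): 3, (MARY, JOHN): 11}
tree = chu_liu_edmonds(scores)
tree_weight = sum(scores[h, d] for d, h in tree.items())
```

With these assumed weights the greedy step selects the cycle John ⇄ saw, the contracted graph is already a tree, and expansion attaches both John and Mary to saw, with saw attached to the root.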

  • Arc weights as linear classifiers

    $$w_{ijk} = \mathbf{w} \cdot \mathbf{f}(i, j, k)$$

    Arc weights are a linear combination of features of the arc $\mathbf{f}(i, j, k)$ and a corresponding weight vector $\mathbf{w}$

    What arc features?

  • Arc Features f (i, j,k)

    Features: identities of the words $w_i$ and $w_j$ for a label $l_k$

    head = saw & dependent = with

  • Arc Features f (i, j,k)

    Features: part-of-speech tags of the words $w_i$ and $w_j$ for a label $l_k$

    head-pos = Verb & dependent-pos = Preposition

  • Arc Features f (i, j,k)

    Features: part-of-speech of words surrounding and between $w_i$ and $w_j$

    inbetween-pos = Noun
    inbetween-pos = Adverb

    dependent-pos-right = Pronoun
    head-pos-left = Noun

  • Arc Features f (i, j,k)

    Features: number of words between $w_i$ and $w_j$, and their orientation

    arc-distance = 3
    arc-direction = right

  • Arc Features f (i, j,k)

    Features: combinations

    head-pos = Verb & dependent-pos = Preposition & arc-label = PP
    head-pos = Verb & dependent = with & arc-distance = 3

    No limit: any feature over arc $(i, j, k)$ or input $x$

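To make the feature templates concrete, here is a small sketch of a sparse feature function and the linear arc score. The feature names follow the slides, but the exact set of templates, the function names `arc_features` and `arc_score`, the example sentence, the tags, and the weights are all assumptions for illustration.

```python
from collections import Counter


def arc_features(words, tags, i, j, label):
    """Sparse feature vector f(i, j, k) for an arc from head i to dependent j."""
    lo, hi = min(i, j), max(i, j)
    feats = [
        f"head={words[i]}&dependent={words[j]}&arc-label={label}",
        f"head-pos={tags[i]}&dependent-pos={tags[j]}&arc-label={label}",
        f"arc-distance={hi - lo - 1}",                 # number of words in between
        f"arc-direction={'right' if i < j else 'left'}",
    ]
    # One feature per part-of-speech tag strictly between head and dependent.
    feats += [f"inbetween-pos={tags[m]}" for m in range(lo + 1, hi)]
    return Counter(feats)


def arc_score(w, feats):
    """w . f(i, j, k) for a sparse weight dict w; unseen features score 0."""
    return sum(w.get(name, 0.0) * value for name, value in feats.items())


words = ["<root>", "John", "saw", "Mary"]
tags = ["ROOT", "Noun", "Verb", "Noun"]
feats = arc_features(words, tags, 2, 3, "OBJ")
w = {"head-pos=Verb&dependent-pos=Noun&arc-label=OBJ": 2.0,
     "arc-direction=right": 0.5}
score = arc_score(w, feats)
```

Storing features as strings in a `Counter` keeps the vector sparse, which suits the very large feature spaces that conjunction templates produce.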

  • Learning the parameters

    Re-write the inference problem

    $$
    \begin{aligned}
    G' &= \arg\max_{G' \in T(G_x)} \sum_{(i,j,k) \in G'} w_{ijk} \\
       &= \arg\max_{G' \in T(G_x)} \mathbf{w} \cdot \sum_{(i,j,k) \in G'} \mathbf{f}(i, j, k) \\
       &= \arg\max_{G' \in T(G_x)} \mathbf{w} \cdot \mathbf{f}(G')
    \end{aligned}
    $$

    Which can be plugged into learning algorithms

  • Inference-based Learning

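    Training data: $T = \{(x_t, G_t)\}_{t=1}^{|T|}$

    1. $\mathbf{w}^{(0)} = 0$; $i = 0$
    2. for $n : 1..N$
    3.   for $t : 1..|T|$
    4.     Let $G' = \arg\max_{G'} \mathbf{w}^{(i)} \cdot \mathbf{f}(G')$
    5.     if $G' \ne G_t$
    6.       $\mathbf{w}^{(i+1)} = \mathbf{w}^{(i)} + \mathbf{f}(G_t) - \mathbf{f}(G')$
    7.       $i = i + 1$
    8. return $\mathbf{w}^{(i)}$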

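The update loop above is a perceptron-style learner, and it can be sketched as follows. The sketch assumes trees are represented as frozensets of (head, dependent) arcs and that a `decode` function returns the highest-scoring tree for an input. In a real parser `decode` would be MST inference; here a toy decoder over two candidate trees stands in for it. All names and the toy data are hypothetical.

```python
from collections import Counter


def tree_features(tree, arc_feats):
    """f(G): sum of the arc feature vectors over all arcs in the tree."""
    total = Counter()
    for arc in tree:
        total.update(arc_feats(arc))
    return total


def perceptron_train(data, decode, arc_feats, epochs=5):
    """data: list of (x, gold_tree). Returns the learned sparse weight vector."""
    w = Counter()                                   # w^(0) = 0
    for _ in range(epochs):                         # for n : 1..N
        for x, gold in data:                        # for t : 1..|T|
            predicted = decode(x, w)                # argmax_G' w . f(G')
            if predicted != gold:
                w.update(tree_features(gold, arc_feats))         # + f(G_t)
                w.subtract(tree_features(predicted, arc_feats))  # - f(G')
    return w


# Toy setup: one indicator feature per arc, and an "input" that is simply
# the tuple of candidate trees the decoder may choose from.
def arc_feats(arc):
    return Counter({f"arc={arc[0]}->{arc[1]}": 1})


def decode(candidates, w):
    def score(tree):
        return sum(w[k] * v for k, v in tree_features(tree, arc_feats).items())
    return max(candidates, key=score)


tree_a = frozenset({(0, 1), (1, 2)})
tree_b = frozenset({(0, 2), (2, 1)})
data = [((tree_a, tree_b), tree_b)]
w = perceptron_train(data, decode, arc_feats, epochs=3)
```

Because the weights change only when the prediction is wrong, training on this toy example stops updating after the first mistake is corrected.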

  • Dependency Parsing as a Constraint Satisfaction problem

    The last two approaches were data-driven

  • Constraint Satisfaction

    Uses Constraint Dependency Grammar (CDG)

    Grammatical rules are given as constraints on word-to-word modification

    Parsing is formalized as a constraint satisfaction problem

    Parsing is an eliminative process rather than a constructive process

    Constraint satisfaction removes values that contradict constraints

  • Constraint Propagation

    Three steps

    Step 1. Form initial constraint network using a core grammar

    Step 2. Remove local inconsistencies

    Step 3. If ambiguity remains, add new constraints and repeat step 2.

  • Constraint Propagation Example

    Sentence: Put the block on the floor on the table in the room

    Simplified representation: V1 NP2 PP3 PP4 PP5

    Correct analysis

  • Initial Constraints

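The eliminative view can be illustrated on the simplified representation V1 NP2 PP3 PP4 PP5. The constraints below are assumptions made for this sketch, not the tables from the slides: the verb attaches to the root, the NP attaches to the verb, each PP attaches to some earlier word, and no two arcs cross. The sketch forms the candidate heads for each word and then discards every combination that violates the no-crossing constraint. It enumerates combinations rather than running a propagation procedure, so it only shows which analyses survive.

```python
from itertools import product

CATS = {1: "V", 2: "NP", 3: "PP", 4: "PP", 5: "PP"}


def initial_domains(cats):
    """Candidate heads per word under the assumed unary constraints (0 = root)."""
    domains = {}
    for word, cat in cats.items():
        if cat == "V":
            domains[word] = {0}
        elif cat == "NP":
            domains[word] = {h for h in cats if h < word and cats[h] == "V"}
        else:                                   # PP: any earlier word
            domains[word] = {h for h in cats if h < word}
    return domains


def crosses(a, b):
    """True if arcs a and b (as position pairs) cross each other."""
    (l1, r1), (l2, r2) = sorted(a), sorted(b)
    return l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1


def consistent_analyses(domains):
    """All head assignments that satisfy the binary no-crossing constraint."""
    words = sorted(domains)
    analyses = []
    for heads in product(*(sorted(domains[w]) for w in words)):
        arcs = [(h, w) for h, w in zip(heads, words) if h != 0]
        if not any(crosses(a, b) for a in arcs for b in arcs):
            analyses.append(dict(zip(words, heads)))
    return analyses


domains = initial_domains(CATS)
analyses = consistent_analyses(domains)
```

Under these assumed constraints 14 analyses remain, which agrees with the count of 14 possible analyses given on the slides. Resolving the remaining ambiguity requires the additional constraints introduced next.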

  • Adding New Constraints

    Still 14 possible analyses.

    Introduce more constraints:

  • Modified Tables

    No object can be on two different objects.

    Once every arc has been resolved, we get a unique dependency tree

  • Example from a Sanskrit Sentence: POS along with dependency

    Consider the sentence:

    rāmaḥ vanaṃ gacchati

    1. rāmaḥ = rāma {masc.} {sg.} {nom.}
    2. rāmaḥ = rā {pr.} {1p} {pl.}
    3. vanaṃ = vana {neu.} {sg.} {nom.}
    4. vanaṃ = vana {neu.} {sg.} {acc.}
    5. gacchati = gam {pr.} {3p.} {sg.}
    6. gacchati = gam {pr. part.} {masc.} {sg.} {loc.}
    7. gacchati = gam {pr. part.} {neu.} {sg.} {loc.}

  • Example from a Sanskrit Sentence

    Figure: Possible Relations

  • Example from a Sanskrit Sentence

    Figure: Adjacency and Possible paths

    A path P is a sequence of edges which connects the nodes from S to F.

  • Example from a Sanskrit Sentence

    Figure: A sample Path

    For example, S-1-3-5-F is a path.

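The path space can be enumerated with a short sketch. It assumes that every morphological analysis of one word is adjacent to every analysis of the next word, and that a path picks exactly one analysis per word between the start node S and the final node F. The adjacency figure is not reproduced here, so this structure is an assumption. The analysis numbers 1 to 7 are the ones listed for the example sentence.

```python
from itertools import product

# Analyses per word, in sentence order, using the numbering from the example.
ANALYSES = [[1, 2], [3, 4], [5, 6, 7]]


def all_paths(analyses):
    """Every S-to-F path that selects one analysis for each word."""
    return [("S", *choice, "F") for choice in product(*analyses)]


paths = all_paths(ANALYSES)
```

Under this assumption there are 2 × 2 × 3 = 12 candidate paths, and S-1-3-5-F is one of them. The parser's task is then to decide which path also admits a consistent set of dependency relations.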

  • How Does the Parser Work?

    Figure: Adjacency and Possible paths

    Figure: Relations
