CS546: Machine Learning and Natural Language

Preparation to the Term Project:
- Dependency Parsing
- Dependency Representation for Semantic Role Labeling

Slides for Dependency Parsing are based on Joakim Nivre and Sandra Kuebler's slides from the ACL-06 tutorial


Page 1:

CS546: Machine Learning and Natural Language

Preparation to the Term Project:
- Dependency Parsing
- Dependency Representation for Semantic Role Labeling

Slides for Dependency Parsing are based on Joakim Nivre and Sandra Kuebler's slides from the ACL-06 tutorial

Page 2:

Outline

– Dependency Parsing:
  • Formalism
  • Dependency parsing algorithms
– Semantic Role Labeling:
  • Dependency formalism
  • Basic approach for the first part of the term project
– Pipeline for the first assignment

Page 3:

• Formalization by Lucien Tesnière [Tesnière, 1959]
• The idea was known long before (e.g., Panini, India, >2000 years ago)
• Studied extensively in the Prague School approach to syntax
• (In the US, research focused more on the constituent formalism)

Page 4:

Page 5:

Page 6:

(or Constituent Structure)

Page 7:

Page 8:

Constituent vs Dependency

• There are advantages of dependency structures:
  – for free (or semi-free) word order languages
  – easier to convert to predicate-argument structure
  – ...
• But there are drawbacks too...
• You can try to convert one representation into another
  – but, in general, these formalisms are not equivalent

Page 9:

Dependency structures for NLP tasks

• Most approaches have focused on constituent tree-based features
• But this is changing:
  – Machine Translation (e.g., Menezes & Quirk, 07)
  – Summarization and sentence compression (e.g., Filippova & Strube, 08)
  – Opinion mining (e.g., Lerman et al., 08)
  – Information extraction, Question Answering (e.g., Bouma et al., 06)

Page 10:

Page 11:

Page 12:

All these conditions will be violated for the semantic dependency graphs we will consider later

Page 13:

You can think of it as (related to) planarity
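Projectivity can be tested as absence of crossing arcs. A minimal sketch, assuming a `heads` array encoding (heads[i] is the head index of word i, or -1 for the root); the function name and encoding are illustrative, not from the slides:

```python
def is_projective(heads):
    """Check (approximate) projectivity of a dependency tree.

    heads[i] is the head (0-based index) of word i, or -1 for the root.
    Two arcs cross if exactly one endpoint of one arc lies strictly
    inside the span of the other.
    """
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads) if h >= 0]
    for i, (l1, r1) in enumerate(arcs):
        for l2, r2 in arcs[i + 1:]:
            # one endpoint strictly inside the other arc's span, one outside
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                return False
    return True
```

For example, the tree with heads `[1, -1, 1]` (both outer words attached to the middle word) is projective, while `[-1, 3, 0, 0]` contains the crossing arcs (1,3) and (0,2).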

Page 14:

Algorithms

• Global inference algorithms:
  – graph-based approaches
  – transition-based approaches
• We will not consider:
  – rule-based systems
  – constraint satisfaction

Page 15:

Converting to Constituent Formalism

Idea:
• Convert dependency structures to constituent structures
  – easy for projective dependency structures
• Apply algorithms for constituent parsing to them
  – e.g., CKY; if some of you attend the class on parsing by Julia Hockenmaier, it was/will be covered there

Page 16:

Converting to Constituent Formalism

• Different independence assumptions lead to different statistical models
  – both accuracy and parsing time (dynamic programming) vary

Page 17:

Page 18:

• Features f(i, j) can include dependence on any words in the sentence, i.e., f(i, j, sent)
• But the score still decomposes over edges in the graph
• Strong independence assumption
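The edge-factored decomposition can be sketched as follows; the `feats` function shown is a hypothetical toy feature extractor for illustration, not the feature set from the slides:

```python
def tree_score(weights, feats, tree, sent):
    """Edge-factored score: the tree score is a sum of per-edge scores.

    feats(head, dep, sent) returns feature names for one arc; features
    may look at any word in the sentence, but the score still
    decomposes over individual edges (a strong independence assumption).
    """
    total = 0.0
    for head, dep in tree:                 # tree: list of (head, dep) arcs
        for f in feats(head, dep, sent):
            total += weights.get(f, 0.0)
    return total

# hypothetical toy feature function, for illustration only
def feats(head, dep, sent):
    return [f"hw={sent[head]}|dw={sent[dep]}", f"dist={abs(head - dep)}"]
```

With `weights = {"hw=saw|dw=John": 1.0, "dist=1": 0.5}` and the sentence "John saw Mary", the tree with arcs (1,0) and (1,2) scores 1.0 + 0.5 + 0.5 = 2.0.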

Page 19:

Online Learning (Structured Perceptron)

• Joint feature representation:
  – we will talk about it more later
• Algorithm:
  – here we run MST or Eisner's algorithm
  – features over edges only
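The structured perceptron update can be sketched as below, with the decoder (e.g., MST or Eisner's algorithm) abstracted as a black-box `decode` function; all names here are illustrative assumptions, not the slides' notation:

```python
def structured_perceptron(train, decode, feats, epochs=5):
    """Structured perceptron for dependency parsing (sketch).

    train:  list of (sentence, gold_tree) pairs
    decode: argmax over trees under the current weights (e.g. MST or
            Eisner's algorithm); treated here as a black box
    feats:  maps (tree, sentence) to a feature-count dict
    """
    weights = {}
    for _ in range(epochs):
        for sent, gold in train:
            pred = decode(weights, sent)
            if pred != gold:
                # reward gold features, penalize predicted ones
                for f, v in feats(gold, sent).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in feats(pred, sent).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights
```

After a mistake, the weight vector moves toward the gold tree's features and away from the predicted tree's, so the same mistake becomes less likely on the next pass.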

Page 20:

Parsing Algorithms

• Here, when we say parsing algorithm (= derivation order), we often mean a mapping:
  – given a tree, map it to the sequence of actions which creates this tree
• Tree T is equivalent to this sequence of actions: d1, ..., dn
• Therefore, P(T) = P(d1, ..., dn)
• P(T) = P(d1, ..., dn) = P(d1) P(d2|d1) ... P(dn|dn-1, ..., d1)
• Ambiguous: sometimes "parsing algorithm" refers to the decoding algorithm used to find the most likely sequence

You can use classifiers here and search for the most likely sequence (recall Maryam's talk)
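The action-sequence view can be made concrete with a greedy transition-based parser. A minimal sketch of an arc-standard system (shift, left-arc, right-arc); `next_action` stands in for the classifier over configurations and is a hypothetical interface, not the slides' exact system:

```python
def greedy_parse(sent, next_action):
    """Greedy transition-based parsing, arc-standard style (sketch).

    next_action(stack, queue) plays the role of the classifier
    P(d_n | d_{n-1}, ..., d_1): given the current parser configuration,
    it returns the best next action.
    """
    stack, queue, arcs = [], list(range(len(sent))), []
    while queue or len(stack) > 1:
        action = next_action(stack, queue)
        if action == "shift":
            stack.append(queue.pop(0))
        elif action == "left-arc":        # stack[-2] <- stack[-1]
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif action == "right-arc":       # stack[-2] -> stack[-1]
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs
```

For "John saw Mary", the action sequence shift, shift, left-arc, shift, right-arc produces the arcs (saw → John) and (saw → Mary).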

Page 21:

Page 22:

• Most algorithms are restricted to projective structures, but not all

Page 23:

It can handle only projective structures

Page 24:

Page 25:

Page 26:

Page 27:

Page 28:

Page 29:

Page 30:

Page 31:

How to learn in this case?

• Your training examples are {(dj; d1, ..., dn-1)} -- collections of parsing contexts
• You want to predict the correct actions P(dn|dn-1, ..., d1)
• How to define a feature representation of (dn-1, ..., d1)?
• Instead of (dn-1, ..., d1), you can think in terms of:
  – the partial tree corresponding to them
  – the current contents of the queue (Q) and stack (S)
  – the most important features are the top of S and the front of Q (only between them can you potentially create links)
• (Inference: you can do it greedily or with beam search)
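Such a configuration-based feature representation can be sketched as below; the feature names and the word/tag arrays are illustrative assumptions:

```python
def config_features(stack, queue, sent, tags):
    """Features of a parser configuration (sketch).

    Instead of the full history (d_{n-1}, ..., d_1), the configuration
    is summarized by the stack S and queue Q; the most informative
    positions are the top of S (s0) and the front of Q (q0), since
    only between them can a new arc be created.
    """
    feats = []
    s0 = stack[-1] if stack else None
    q0 = queue[0] if queue else None
    if s0 is not None:
        feats += [f"s0.w={sent[s0]}", f"s0.t={tags[s0]}"]
    if q0 is not None:
        feats += [f"q0.w={sent[q0]}", f"q0.t={tags[q0]}"]
    if s0 is not None and q0 is not None:
        feats.append(f"s0.t|q0.t={tags[s0]}|{tags[q0]}")   # conjoined tags
    return feats
```

Real systems add many more positions (s1, q1, children of s0, ...) and conjunctions, but the pattern is the same.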

Page 32:

Results: Transition-based vs Graph-Based

CoNLL-2006 Shared Task, average over 12 languages (Labeled Attachment Score):
  McDonald et al. (MST): 80.27
  Nivre et al. (Transitions): 80.19

• Results are essentially the same
• A lot of research in both directions
  – e.g., Latent Variable Models for Transition-Based Parsing (Titov and Henderson, 07): the best single-model system in CoNLL-2007 (third overall)

Page 33:

Non-Projective Parsing

• Graph-based algorithms (McDonald)
• Post-processing of projective algorithms (Hall and Novak, 05)
• Transition-based algorithms which handle non-projectivity (Attardi, 06; Titov et al., 08; Nivre et al., 08)
• Pseudo-projective parsing: removing non-projective (crossing) links and encoding them in labels (Nivre and Nilsson, 05)


Page 35:

Page 36:

Page 37:

First Phase of Term Project

– The goal is to construct joint syntax-SRL (Semantic Role Labeling) dependency structures
– Similar to the CoNLL-2008, 09 Shared Tasks
– The 2nd phase will focus on SRL
– Now we need to create the entire pipeline:
  • Tagger: SVM tagger
  • Pseudo-projective transformations: tool by Nilsson & Nivre
  • Dependency parser: MaltParser by Nivre et al.
  • Implement a basic classifier for SRL (see next slide)
– Due after Spring Break
– I'll send the description by email

Page 38:

First Phase of Term Project

Syntactic structure

Semantic structure

• Properties of the semantic (SRL) structure:
  – multiple heads (parents)
  – need to annotate predicates with senses (predicates are the potential parents in the graph) – not indicated in the figure

This is not the most standard formalism for SRL

Page 39:

SRL Pipeline

– 1st stage: for every word, decide whether it is a predicate (binary classification)
– 2nd stage: for all the words which are predicates, predict their sense
– 3rd stage: for every pair of words, decide:
  • word A is an argument of word B
  • word B is an argument of word A
  • there is no SRL relation between them
  (constraint: only predicates can be parents)
– 4th stage: label all the relations
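The four stages above can be sketched as one pipeline function; the four classifiers are passed in as black boxes, and all names here are illustrative assumptions, not a prescribed interface:

```python
def srl_pipeline(sent, is_predicate, sense, arg_relation, label):
    """Four-stage SRL pipeline (sketch); each classifier is a black box.

    Stage 1: binary predicate identification for every word.
    Stage 2: sense prediction for each identified predicate.
    Stage 3: for every (predicate, word) pair, decide whether the word
             is an argument (only predicates can be parents).
    Stage 4: label each predicate-argument relation.
    """
    preds = [i for i in range(len(sent)) if is_predicate(sent, i)]   # stage 1
    senses = {p: sense(sent, p) for p in preds}                      # stage 2
    arcs = [(p, a) for p in preds for a in range(len(sent))
            if a != p and arg_relation(sent, p, a)]                  # stage 3
    labeled = [(p, a, label(sent, p, a)) for p, a in arcs]           # stage 4
    return senses, labeled
```

Plugging in toy rule-based classifiers for "John loves Mary" yields one predicate with two labeled arguments; in the assignment, each stage would instead be a trained classifier.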

Page 40:

SRL Pipeline

– Use any features:
  • hint: dependency parse features are going to be very useful
  • see the CoNLL-2008 shared task papers for which features were useful
– Use any learning algorithm:
  • you can use a package (e.g., SNoW)
  • or implement one yourself (e.g., averaged perceptron is easy)
– Do not use any SRL tools

Page 41:

Next lectures

– I will be away for 2 weeks
– Next week (Mar 9 – Mar 15):
  • Wednesday: Alex Klementiev on weak supervision
  • Friday: Kevin Small on active learning
  + student presentation by Ryan on Friday
– 2nd week (Mar 16 – Mar 22):
  • work on the project
– The 1st phase will be due around April 1 (exact dates later)