Joint Ambiguity Modeling in NLP

Woodley Packard

Universitet i Oslo

March 21, 2011

Introduction

Ambiguity is a central phenomenon in natural language, affecting accuracy and efficiency in most if not all types of NLP.

Types of Ambiguity

◮ Word boundaries

  ◮ Difficult in some languages – e.g. Chinese

  ◮ Norwegian examples lifted from recent mailing list activity:

    ◮ Lege-ring, sei-del, sel-skap, bru-sau-tomat, sports-av-iser...
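Word-boundary ambiguity of this kind can be made concrete by enumerating every way a string splits into known words. The following is a minimal sketch, not a real segmenter: the function name `segmentations` and the five-entry lexicon are hypothetical, chosen only to reproduce the "bru-sau-tomat" example above.

```python
def segmentations(word, lexicon):
    """Return every way to split `word` into a sequence of lexicon entries."""
    if not word:
        return [[]]  # one way to segment the empty string: no pieces
    results = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        if prefix in lexicon:
            # Extend each segmentation of the remainder with this prefix.
            for rest in segmentations(word[i:], lexicon):
                results.append([prefix] + rest)
    return results


# Hypothetical toy lexicon (an assumption for illustration only).
LEXICON = {"brus", "automat", "bru", "sau", "tomat"}

analyses = segmentations("brusautomat", LEXICON)
for pieces in analyses:
    print("-".join(pieces))
```

With a realistic lexicon the number of candidate splits grows quickly, which is why a model is needed to rank them.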

Types of Ambiguity

◮ Sentence Boundaries

  ◮ Again, not marked in some languages

  ◮ Even in English, the markers are overloaded:

    ◮ The CEO had the P.R. Department leaders make risky moves.

    ◮ The citizens voted in the U.S. Presidential Election polls.
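To see why the overloaded period is a problem, consider a naive splitting rule. The sketch below assumes a deliberately simplistic heuristic (a sentence ends at a period followed by whitespace and an uppercase letter); the rule and the name `naive_split` are illustrative assumptions, not a method from the talk.

```python
import re

# Naive rule (assumed for illustration): a boundary is whitespace that
# follows a period and precedes an uppercase letter.
NAIVE_BOUNDARY = re.compile(r"(?<=\.)\s+(?=[A-Z])")


def naive_split(text):
    """Split text into 'sentences' using the naive period heuristic."""
    return NAIVE_BOUNDARY.split(text)


examples = [
    "The CEO had the P.R. Department leaders make risky moves.",
    "The citizens voted in the U.S. Presidential Election polls.",
]

splits = [naive_split(sentence) for sentence in examples]
for pieces in splits:
    print(pieces)
```

Each example is a single sentence, yet the heuristic cuts it in two after the abbreviation, because the period in "P.R." or "U.S." looks exactly like a sentence-final period.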

Types of Ambiguity

◮ Word Senses

  ◮ English word “bank”:

    ◮ (n) Financial institution

    ◮ (n) Side of a river / stream

    ◮ (n) Repository of resources (tree bank, bank of switches)

    ◮ (v) To bet everything (on ...)

    ◮ (v) Tilting to make a turn in an airplane

    ◮ (v) To do business at a bank (financial institution)

    ◮ ... others.

Types of Ambiguity

◮ Syntactic Ambiguity

  ◮ I saw the man with the telescope.

◮ Anaphora

  ◮ Jessie and Alex grew up together, but eventually he moved to the West coast and she moved to the East coast.

◮ ... others.

Traditional Approaches to Dealing with Ambiguity

◮ How well can we resolve some particular type of ambiguity?

◮ Rich literature about how to handle each type of ambiguity.

◮ None are completely solved problems.

◮ Some have been solved fairly well (word boundaries, sentence boundaries).

◮ But most have room for improvement.

An Emerging Trend in Research

◮ Results about how diverse types of information can be helpful in ambiguity resolution.

◮ Large body of research about using syntax as a guide for resolving anaphora.

Some others:

◮ “Using Syntactic Dependency as Local Context to Resolve Word Sense Ambiguity” D. Lin, ACL 1997.

◮ “Improving Parsing and PP attachment Performance with Sense Information” E. Agirre, T. Baldwin, D. Martinez, ACL 2008.

Joint Modeling of Ambiguity

Idea: Instead of modeling individual marginal distributions for each type of ambiguity, or conditional models involving two types of ambiguity, what if we model a joint distribution for all the types of ambiguity we are interested in at once?

◮ Gain: modeling flexibility

◮ Cost: greater complexity

◮ Possible framework: graphical models

◮ Inference and parameter estimation: sometimes tractable, sometimes not.
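One way to see what a joint distribution buys is a toy case where deciding each ambiguity separately from its marginal disagrees with deciding both at once. The sketch below uses invented probabilities and hypothetical labels (`attach_high`, `sense_1`, and so on); it illustrates the general point only and is not taken from any experiment in this talk.

```python
# Toy joint distribution over (syntax analysis, word sense).
# All numbers and labels are invented for illustration.
joint = {
    ("attach_high", "sense_1"): 0.3,
    ("attach_high", "sense_2"): 0.3,
    ("attach_low", "sense_1"): 0.4,
    ("attach_low", "sense_2"): 0.0,
}


def marginal(joint, axis):
    """Sum the joint over the other variable to get a marginal."""
    totals = {}
    for outcome, p in joint.items():
        totals[outcome[axis]] = totals.get(outcome[axis], 0.0) + p
    return totals


syntax_marginal = marginal(joint, 0)
sense_marginal = marginal(joint, 1)

# Decide each ambiguity in isolation, then pair up the winners.
independent_choice = (
    max(syntax_marginal, key=syntax_marginal.get),
    max(sense_marginal, key=sense_marginal.get),
)

# Decide both ambiguities together.
joint_choice = max(joint, key=joint.get)

print("independent:", independent_choice, joint[independent_choice])
print("joint:      ", joint_choice, joint[joint_choice])
```

In this toy table the two isolated decisions combine into a pairing that is less probable than the best joint pairing, which is the kind of interaction a joint model can capture. The price is a table (or model) whose size grows with the product of the individual outcome spaces.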

A Graphical Model for Ambiguity

[Figure: graphical model over three nodes, Anaphora, Syntax, and WSD, with edges annotated by Hobbs 1978, Lin 1997, and Agirre et al. 2008.]

Other Dependencies Seem Likely

◮ Information about anaphora may be helpful in disambiguating syntax.

◮ Word Priming: information about the senses of words in a given sentence may be helpful for determining word senses in nearby subsequent sentences.

◮ Syntactic Priming: information about the constructions used in a given sentence may be helpful for determining the syntax of nearby subsequent sentences.

◮ Maybe others too!

A Revised Graphical Model for Ambiguity

[Figure: revised graphical model with an Anaphora node and per-sentence nodes Syntax1, Syntax2, Syntax3 and WSD1, WSD2, WSD3.]

A Roadmap for my Project

◮ Build basic disambiguation systems for a few types of ambiguity

◮ Set baselines for how well we can disambiguate without a joint model

◮ Learn how to combine information from disparate systems to form a joint model

◮ Evaluate the joint model’s performance on each type of ambiguity vs. the baselines

Where am I on that roadmap?

◮ Not very far along, really. I’ve built a syntax disambiguation system and set a strong baseline for syntax disambiguation in isolation.

◮ Started some experiments into using more global information, but so far nothing worth reporting.

Next, shifting gears...

I’ll describe the work I’ve done exploring the space of syntax disambiguation for HPSG grammars.

Syntax Disambiguation

◮ Given an utterance, find the best analysis licensed by your grammar.

◮ With broad-coverage grammars, there can be a lot of candidate analyses!

◮ ERG licenses more than 10,000 distinct analyses for:

  I would still have an appointment slot free on Tuesday, the sixth of April, but only in the afternoon.

Syntax Disambiguation

The different analyses are usually not all semantically equivalent. How do we know which meaning was intended by the speaker?

Common solution: annotate the intended meaning on a sufficiently large corpus of example utterances, and then apply machine learning techniques to build a model that will allow us to guess the intended meaning on unseen data.

Layout

Modeling Ambiguity

Syntax Disambiguation

  Maximum Entropy Models

  MaxEnt for the ERG

  Evaluation Metrics

Conditional log-linear models

\[
p(y \mid x) = \frac{e^{w \cdot f(x, y)}}{\sum_{y'} e^{w \cdot f(x, y')}}
\]

where w is a vector of feature weights, typically learned from the training data.

◮ Conditional probability model for classification/ranking problems.

◮ Describe relationship of class y to input x by n real-valued feature functions f_j(x, y).

◮ Equivalently, a vector-valued feature function f(x, y).

Since the denominator is a function only of x and does not depend on y, determining arg max_y p(y|x) amounts to maximizing w · f(x, y).
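
As a rough illustration of the formula and the arg max remark, here is a minimal Python sketch. It assumes each candidate y is represented by a sparse dictionary of feature values; the names score, conditional_probs, and best_candidate and the toy weights are made up for this example and do not come from the talk.

```python
import math

def score(w, features):
    """Dot product w . f(x, y) for a sparse feature dictionary."""
    return sum(w.get(name, 0.0) * value for name, value in features.items())

def conditional_probs(w, candidates):
    """Return p(y|x) for every candidate y of one input x."""
    scores = [score(w, f) for f in candidates]
    top = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)  # the denominator: it depends on x only, not on y
    return [e / z for e in exps]

def best_candidate(w, candidates):
    """arg max_y p(y|x), computed from the raw scores without normalizing."""
    return max(range(len(candidates)), key=lambda i: score(w, candidates[i]))

# Toy weights and three candidate analyses (illustrative values only).
w = {"a": 1.0, "b": -1.0}
candidates = [{"a": 1}, {"b": 1}, {}]
probs = conditional_probs(w, candidates)
```

Ranking by the raw score and ranking by the normalized probability pick the same candidate, which is why the denominator can be skipped at prediction time.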

Maximum Entropy Models

◮ One common way of selecting w.

◮ Also known as multinomial logistic regression.

◮ Corresponds to maximum likelihood estimation: maximize the conditional likelihood of the training data.

◮ Alternate description: model with maximum entropy subject to E_p(f(x, y)) = f̃(x, y), the empirical expectation of f(x, y) on the training data.

◮ Popular applications: POS tagging, NER, parse disambiguation, ...

◮ Other techniques for picking w: SVMs, perceptrons, ...
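
The two descriptions above meet in the gradient of the conditional log-likelihood, which is the empirical feature count minus the model's expected feature count; it vanishes exactly when the expectation constraint holds. The sketch below uses plain gradient ascent only to make that visible. The data layout (a list of candidate feature dictionaries plus the index of the gold candidate), the function names, the step size, and the iteration count are all assumptions of this example. Practical toolkits typically use a quasi-Newton optimizer and a regularization term instead.

```python
import math
from collections import defaultdict

def score(w, features):
    """Dot product w . f(x, y) for a sparse feature dictionary."""
    return sum(w.get(name, 0.0) * value for name, value in features.items())

def log_likelihood_gradient(w, data):
    """Empirical minus expected feature counts over the training data."""
    grad = defaultdict(float)
    for candidates, gold in data:
        scores = [score(w, f) for f in candidates]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        z = sum(exps)
        # Empirical counts: features of the gold candidate.
        for name, value in candidates[gold].items():
            grad[name] += value
        # Expected counts under the current model p(y|x).
        for f, e in zip(candidates, exps):
            p = e / z
            for name, value in f.items():
                grad[name] -= p * value
    return grad

def train(data, steps=200, rate=0.5):
    """Unregularized gradient ascent; a deliberately simple stand-in optimizer."""
    w = defaultdict(float)
    for _ in range(steps):
        for name, g in log_likelihood_gradient(w, data).items():
            w[name] += rate * g
    return w

# Two toy training items: (candidate feature dicts, index of the gold candidate).
data = [
    ([{"good": 1}, {"bad": 1}], 0),
    ([{"bad": 1, "other": 1}, {"good": 1}], 1),
]
weights = train(data)
```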

The ERG and WeScience

◮ English Resource Grammar (ERG): a broad-coverage precision computational grammar of English based on HPSG (Head-driven Phrase Structure Grammar).

◮ WeScience: a treebank of gold ERG analyses for around 9000 sentences from Wikipedia in the domain of computational linguistics.

How well can we disambiguate when parsing WeScience inputs just by looking at ERG analyses in isolation?

◮ Use Maximum Entropy modeling; x = an input sentence and y = a candidate analysis.

◮ f(x, y) = a vector of features describing a candidate analysis, possibly making reference to the input item.

◮ Train a MaxEnt model on WeScience.

◮ To disambiguate an unseen input, select the candidate analysis with the largest p(y|x).

◮ Scoring: on a held-out test set, for what proportion of the inputs can our model identify the correct gold analysis? (exact match)

◮ Use 10-fold cross-validation to reduce measurement noise.
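
The evaluation protocol in the list above can be sketched as follows. This is only an illustration: the round-robin fold assignment, the item layout (candidates, gold index), and the train callback that returns a predict function are assumptions of the example, not details given in the talk.

```python
def k_folds(items, k=10):
    """Split items into k disjoint folds by round-robin assignment."""
    return [items[i::k] for i in range(k)]

def exact_match(predict, test_items):
    """Proportion of inputs for which the predicted candidate is the gold one."""
    hits = sum(1 for candidates, gold in test_items if predict(candidates) == gold)
    return hits / len(test_items)

def cross_validate(items, train, k=10):
    """Average exact-match accuracy over k train/test splits.

    `train` takes a list of training items and returns a function that maps
    a candidate list to the index of the chosen candidate. Every fold must be
    non-empty, so at least k items are needed.
    """
    folds = k_folds(items, k)
    scores = []
    for i, test in enumerate(folds):
        training = [item for j, fold in enumerate(folds) if j != i for item in fold]
        predict = train(training)
        scores.append(exact_match(predict, test))
    return sum(scores) / k
```

Averaging over the ten held-out folds means every treebanked item is used for testing exactly once, which is what reduces the noise of a single train/test split.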

An ERG analysis consists of a derivation tree and an MRS meaning representation.

Simplified candidate analysis for "The very large cat meowed.":

SB-HD_MC_C
    SP-HD_N_C
        D_-_THE_LE  The
        AJ-HDN_NORM_C
            SP-HD_HC_C
                AV_-_DG-V_LE  very
                AJ_-_I_LE  large
            N_SG_ILR
                N_-_C_LE  cat
    W_PERIOD_PLR
        V_PST_OLR
            V_-_LE  meowed.

{ the_q(x1), cat_n(x1), large_a(e2, x1), meow_v(e3, x1), very_x_deg(e4, e2) }

What’s left to do?

Define f(x, y).

◮ Note that since the analysis y includes all the information needed to reconstruct x, we may as well just talk about f(y) instead of f(x, y).

◮ We’ll make f(y) a sparse and very high-dimensional vector, with a few 1’s and 2’s here and there.

◮ Each dimension is called a feature.

Cookie Cutter Features

◮ In previous work on HPSG parse disambiguation, each feature records the number of times some particular pattern of tree nodes or MRS predications/variables is found in the candidate analysis y.

◮ Defining f(y) amounts to making a list of subgraphs to look for in the tree and MRS for y.

◮ A straightforward way of defining a bunch of “interesting” subgraphs to look for is to decide on a “cookie-cutter” shape, and use that to cut out sections of all the trees in the treebank.

◮ This allows us to quickly enumerate tens or hundreds of thousands of subgraphs that can occur in analyses, and build feature vectors out of them.

Baseline Features

One of the simplest useful cookie cutters looks like this:

Cookie Cutter        Example Subgraph

      ?                      SP-HD_HC_C
     / \                     /        \
    ?   ?         AV_-_DG-V_LE        AJ_-_I_LE

This cookie-cutter matches about 57,000 distinct subgraphs from WeScience.
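
A minimal sketch of applying this cookie cutter to one derivation tree is shown below. The tuple-based tree encoding, the underscore spelling of the node labels, and the choice to count unary nodes and lexical types over words as matches are assumptions of this example, not details confirmed by the talk.

```python
from collections import Counter

# A node is (label, children); a word is a plain string.
# Simplified derivation tree for "The very large cat meowed."
TREE = ("SB-HD_MC_C", [
    ("SP-HD_N_C", [
        ("D_-_THE_LE", ["The"]),
        ("AJ-HDN_NORM_C", [
            ("SP-HD_HC_C", [
                ("AV_-_DG-V_LE", ["very"]),
                ("AJ_-_I_LE", ["large"]),
            ]),
            ("N_SG_ILR", [("N_-_C_LE", ["cat"])]),
        ]),
    ]),
    ("W_PERIOD_PLR", [("V_PST_OLR", [("V_-_LE", ["meowed."])])]),
])

def label(node):
    """Label of a node, or the word itself for a leaf."""
    return node if isinstance(node, str) else node[0]

def local_tree_features(node, counts=None):
    """Count every (parent, child, ...) subgraph cut out by the baseline shape."""
    if counts is None:
        counts = Counter()
    if isinstance(node, str):
        return counts
    name, children = node
    counts[(name,) + tuple(label(child) for child in children)] += 1
    for child in children:
        local_tree_features(child, counts)
    return counts

features = local_tree_features(TREE)
```

Running the same function over every tree in a treebank and collecting the distinct keys would give the feature inventory; the resulting Counter is the sparse f(y) for one candidate.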

Baseline Features (cont’d)

◮ So, we get a 57,000-dimensional feature space.

◮ MaxEnt model accuracy: 40.4%

◮ For comparison, random choice accuracy: 8.1%

◮ It turns out 40.4% is a fairly strong baseline.

◮ We’ll take this as our baseline against which to evaluate other ideas.

Can we do better?

◮ Of course!

◮ I tried about 60 different combinations of feature sets.

◮ I’ll show you several of the most interesting ones.

◮ The baseline features are included in addition to those I’ll describe.

Grandparenting

The single most helpful feature type just adds some parents to the baseline cookie cutter:

Cookie cutter    Ancestors added above the baseline shape    Accuracy
GP[1]            1                                           42.97%
GP[2]            2                                           44.46%
GP[3]            3                                           44.9%
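
One way to read GP[k] is "the baseline local tree plus the labels of its k nearest ancestors". The sketch below implements that reading. The tree encoding, the label spellings, and the decision to skip nodes that have fewer than k ancestors (rather than padding with a root marker) are assumptions of this example.

```python
from collections import Counter

# A node is (label, children); a word is a plain string.
TREE = ("AJ-HDN_NORM_C", [
    ("SP-HD_HC_C", [
        ("AV_-_DG-V_LE", ["very"]),
        ("AJ_-_I_LE", ["large"]),
    ]),
    ("N_SG_ILR", [("N_-_C_LE", ["cat"])]),
])

def grandparent_features(node, k, ancestors=(), counts=None):
    """Count local trees extended upward by their k nearest ancestor labels."""
    if counts is None:
        counts = Counter()
    if isinstance(node, str):
        return counts
    name, children = node
    if len(ancestors) >= k:
        # Slice from an explicit start index so that k = 0 yields no context.
        context = ancestors[len(ancestors) - k:]
        local = (name,) + tuple(c if isinstance(c, str) else c[0] for c in children)
        counts[context + local] += 1
    for child in children:
        grandparent_features(child, k, ancestors + (name,), counts)
    return counts

gp1 = grandparent_features(TREE, 1)
```

With k = 0 the function reduces to the baseline cookie cutter, so the baseline features and GP[k] features can be merged into a single sparse vector by giving each key a prefix for its k.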

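Uncles

[Figure: the "uncles" cookie cutter, drawn as a tree of ? nodes with 1, 2, 3, and 2 nodes on successive levels]

Accuracy: 42.97%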

Syntactic Dependencies

◮ Converts the tree into a list of syntactic dependencies.

◮ e.g.: SB-HD_MC_C(meowed. : V_-_LE, cat : N_-_C_LE)

◮ Each such dependency is considered a feature.

◮ Accuracy: 42.97%
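
The conversion can be sketched for the running example as follows. The sketch assumes that the head daughter of every binary rule is the rightmost one. That happens to hold for the rules in this particular tree but is not true of the ERG in general, where a real implementation would need per-rule head information. The tree encoding, label spellings, and function names are likewise hypothetical.

```python
# A node is (label, children); a word is a plain string.
TREE = ("SB-HD_MC_C", [
    ("SP-HD_N_C", [
        ("D_-_THE_LE", ["The"]),
        ("AJ-HDN_NORM_C", [
            ("SP-HD_HC_C", [
                ("AV_-_DG-V_LE", ["very"]),
                ("AJ_-_I_LE", ["large"]),
            ]),
            ("N_SG_ILR", [("N_-_C_LE", ["cat"])]),
        ]),
    ]),
    ("W_PERIOD_PLR", [("V_PST_OLR", [("V_-_LE", ["meowed."])])]),
])

def lexical_head(node):
    """Return (word, lexical type), following the last daughter (assumed head)."""
    name, children = node
    if isinstance(children[-1], str):
        return (children[-1], name)
    return lexical_head(children[-1])

def dependencies(node, out=None):
    """Emit (rule, head, dependent) for every binary rule in the tree."""
    if out is None:
        out = []
    name, children = node
    if isinstance(children[0], str):  # a lexical type over a word
        return out
    if len(children) == 2:
        dependent, head = children  # assumption: the head is the right daughter
        out.append((name, lexical_head(head), lexical_head(dependent)))
    for child in children:
        dependencies(child, out)
    return out

deps = dependencies(TREE)
```

The first triple produced for this tree corresponds to the example dependency in the list above; counting each distinct triple gives the feature values.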

Lexicalizations

◮ Decorate each node in the tree with information about the lexical head of that subtree.

◮ Some decoration choices: part of speech, lexical type, HEAD value, lexeme name, surface form, stem.

◮ Best performing decoration: lexeme name

◮ Accuracy with baseline cookie cutter: 44.16%

◮ We can do this with other cookie cutters as well.

◮ Accuracy with GP[2] cookie cutter: 45.89%

MRS Features

{ the_q(x1), cat_n(x1), large_a(e2, x1), meow_v(e3, x1), very_x_deg(e4, e2) }

So far, we’ve only described features extracted from the derivation tree portion of the analysis.

◮ What about the MRS?

◮ Tried two schemes for encoding the MRS into features: variable-centric and predication-centric.

Variable-centric MRS Features

{ the_q(x1), cat_n(x1), large_a(e2, x1), meow_v(e3, x1), very_x_deg(e4, e2) }

For each MRS variable, features for all (unordered) pairs and triples from the set of relations that are known to apply to that variable.

Example triple for x1: (large_a.ARG1, meow_v.ARG1, cat_n.ARG0)

Example pair for e2: (large_a.ARG0, very_x_deg.ARG1)

Accuracy: 41.40%
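
A minimal sketch of the variable-centric scheme, under stated assumptions: the MRS is represented as a list of (predicate, role-to-variable) pairs, the first argument of each predication on the slide is read as ARG0 and the second as ARG1, and each feature is a sorted tuple so that pairs and triples are unordered. The function name and data layout are illustrative only.

```python
from itertools import combinations

# The example MRS, with argument positions read as ARG0, ARG1 (an assumption).
MRS = [
    ("the_q", {"ARG0": "x1"}),
    ("cat_n", {"ARG0": "x1"}),
    ("large_a", {"ARG0": "e2", "ARG1": "x1"}),
    ("meow_v", {"ARG0": "e3", "ARG1": "x1"}),
    ("very_x_deg", {"ARG0": "e4", "ARG1": "e2"}),
]

def variable_centric_features(mrs):
    """Return all unordered pairs and triples of relation.role names per variable."""
    by_variable = {}
    for predicate, roles in mrs:
        for role, variable in roles.items():
            by_variable.setdefault(variable, []).append(f"{predicate}.{role}")
    features = []
    for relations in by_variable.values():
        relations = sorted(relations)        # canonical order makes tuples unordered
        for size in (2, 3):
            features.extend(combinations(relations, size))
    return features

for feature in variable_centric_features(MRS):
    print(feature)
```

Variable names themselves are left out of the features, since they are arbitrary labels; only the relations and roles attached to each variable are recorded.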

Predication-centric MRS Features

{ the_q(x1), cat_n(x1), large_a(e2, x1), meow_v(e3, x1), very_x_deg(e4, e2) }

One feature for each elementary predication in the MRS, describing the relations pointed to by the non-ARG0 roles.

Example feature: very_x_deg(ARG1=large_a)

Accuracy: 42.61%

Both MRS feature sets combined: 42.63% ...
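
The predication-centric scheme might be sketched like this. The data layout is the same assumed one (argument positions read as ARG0, ARG1). How the original system handled a variable that is the ARG0 of several predications (here x1, shared by the_q and cat_n) is not stated on the slide; joining the names with `+` is purely an assumption of this sketch.

```python
# The example MRS, with argument positions read as ARG0, ARG1 (an assumption).
MRS = [
    ("the_q", {"ARG0": "x1"}),
    ("cat_n", {"ARG0": "x1"}),
    ("large_a", {"ARG0": "e2", "ARG1": "x1"}),
    ("meow_v", {"ARG0": "e3", "ARG1": "x1"}),
    ("very_x_deg", {"ARG0": "e4", "ARG1": "e2"}),
]

def predication_centric_features(mrs):
    """Return one feature string per elementary predication."""
    # Map each variable to the predicates that have it as their ARG0.
    owners = {}
    for predicate, roles in mrs:
        owners.setdefault(roles["ARG0"], []).append(predicate)
    features = []
    for predicate, roles in mrs:
        parts = []
        for role in sorted(roles):
            if role == "ARG0":
                continue
            # Assumption: several owners of one variable are joined with "+".
            targets = "+".join(sorted(owners.get(roles[role], [])))
            parts.append(f"{role}={targets}")
        features.append(f"{predicate}({','.join(parts)})")
    return features

print(predication_centric_features(MRS))
```

On the example MRS this reproduces the slide's feature `very_x_deg(ARG1=large_a)` and yields exactly one feature per predication.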

Combining Several Feature Sets

◮ Information contained in different feature templates is not orthogonal

◮ Subadditivity of performance improvements

◮ However, modest improvements are possible.

◮ Best combination I’ve found: all the MRS features, lexeme name lexicalization, GP[2], and a few other features I didn’t describe that don’t perform well in isolation.

◮ Accuracy: 47.52%

MaxEnt Disambiguation for WeScience: Summary

Picking the right analysis is hard.

Random choice     8.1%
Strong baseline  40.4%
Best model       47.5%

Layout

Modeling Ambiguity

Syntax Disambiguation
Maximum Entropy Models
MaxEnt for the ERG
Evaluation Metrics

Doesn’t 47% seem kind of sad?

Well, yes, in a way.

◮ But there are other ways to evaluate that give us much cheerier numbers!

◮ 47% of the time we get an exactly correct answer.

◮ But we don’t assign ourselves any partial credit for getting a partially correct answer.

◮ Many other metrics do.
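
To make the partial-credit point concrete, here is a small sketch contrasting exact match with a PARSEVAL-style bracket score. It is a simplification: brackets are treated as sets of (label, start, end) spans rather than multisets, and the toy gold and predicted trees are invented for illustration.

```python
def bracket_f1(gold, predicted):
    """F1 over two sets of brackets (simplified PARSEVAL, sets instead of multisets)."""
    gold, predicted = set(gold), set(predicted)
    matched = len(gold & predicted)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)

def unlabeled(brackets):
    """Drop the labels, keeping only the (start, end) spans."""
    return {(start, end) for _label, start, end in brackets}

# Invented example: the prediction gets one label wrong but all spans right.
gold = {("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3)}
predicted = {("S", 0, 3), ("NP", 0, 1), ("NP", 1, 3)}

exact_match = 1.0 if gold == predicted else 0.0
labeled_f1 = bracket_f1(gold, predicted)
unlabeled_f1 = bracket_f1(unlabeled(gold), unlabeled(predicted))
print(exact_match, labeled_f1, unlabeled_f1)
```

A single wrong label costs the whole sentence under exact match, while the bracket scores still reward the parts of the analysis that are right.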

Some Other Metrics

Metric               Random   Baseline   Best Model

Exact Tree Match       8.1%      40.4%      47.5%
Exact MRS Match        8.8%      41.3%      48.5%

Unlabeled PARSEVAL    80.7%      93.3%      94.7%
Labeled PARSEVAL      70.3%      88.0%      90.5%
Unlabeled Syn-Deps    79.2%      92.1%      93.8%
Labeled Syn-Deps      71.0%      89.1%      91.3%
Elementary Deps       82.1%      94.2%      95.4%
Leaf Ancestor         79.0%      92.4%      93.7%

Do metrics agree with each other?

What if models that perform well under Exact Tree Match don’t necessarily perform well under, say, PARSEVAL?

◮ For an arbitrary pair of evaluation metrics, that could happen.

◮ Different metrics can evaluate different aspects of a model’s performance.

◮ But how about for the metrics that people commonly employ as overall figures of merit?

Fortunately, this doesn’t turn out to be much of an issue.

◮ Conducted two experiments

◮ First: optimizing MaxEnt meta-parameter

◮ Second: picking feature combinations

Optimizing MaxEnt meta-parameter

◮ MaxEnt has a regularization parameter

◮ Controls the trade-off between generalization and overfitting

◮ Given an evaluation metric, we can determine the “optimal” value for the regularization parameter through cross-validation.

◮ How does this optimal value vary as a function of which metric is used?
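
The selection procedure can be sketched as a cross-validated grid search. This is a generic harness, not the code used for the talk: `select_regularization`, `train`, and `score` are hypothetical names, and the toy `train` and `score` functions below are stand-ins that exist only so the sketch runs. Real code would train a MaxEnt model on the training folds and score it with the chosen evaluation metric.

```python
import math
from statistics import mean

def select_regularization(candidates, folds, train, score):
    """Return the candidate value with the highest mean held-out score."""
    def cv_score(value):
        scores = []
        for i, held_out in enumerate(folds):
            # Train on every fold except the held-out one.
            training = [item for j, fold in enumerate(folds) if j != i for item in fold]
            scores.append(score(train(training, value), held_out))
        return mean(scores)
    return max(candidates, key=cv_score)

# Toy stand-ins (assumptions): the "model" is just the parameter value,
# and the "metric" peaks when that value is 1.
def toy_train(training, value):
    return value

def toy_score(model, held_out):
    return -abs(math.log10(model))

grid = [0.001, 0.01, 0.1, 1, 10, 100, 1000]
folds = [[1, 2], [3, 4], [5, 6]]
best = select_regularization(grid, folds, toy_train, toy_score)
print(best)
```

Passing a different `score` function is all it takes to repeat the search under another metric, which is the comparison the question above asks about.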

Optimizing MaxEnt meta-parameter (cont’d)

[Figure: Regularized Performance of pcfg baseline. x-axis: Regularization Variance Parameter (log scale, 0.001 to 1000); y-axis: Exact Match Accuracy (%), 24 to 42; series: pcfg baseline.]

Optimizing MaxEnt meta-parameter (cont’d)

[Figure: Z-Score Comparison of Metrics. x-axis: Regularization (log scale, 0.001 to 1000); y-axis: Z-Scores, -1.5 to 1.]
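
Z-scores presumably serve here to put metrics with very different ranges (roughly 40% for exact match, above 90% for PARSEVAL) on one axis. A minimal sketch of that standardization follows; using the population standard deviation is an assumption, and the two score curves are invented values for illustration only.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize a metric's scores across regularization settings."""
    mu = mean(values)
    sigma = pstdev(values)          # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Invented curves: one score per regularization setting, for two metrics.
curves = {
    "metric_a": [30.0, 38.0, 41.0, 40.0],
    "metric_b": [88.0, 92.0, 93.5, 93.0],
}
standardized = {name: z_scores(scores) for name, scores in curves.items()}
for name, scores in standardized.items():
    print(name, [round(s, 2) for s in scores])
```

After standardization, curves that peak at the same regularization value line up visually regardless of each metric's absolute scale.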

Optimizing MaxEnt meta-parameter (cont’d)

Evidently, in practice the optimum meta-parameter is almost exactly the same for all the different metrics.

◮ Maximum error rate increase from optimizing with a different metric from the set listed a few slides ago, on baseline feature configuration: 0.41%.

◮ Average of that figure over all the feature set configurations: 0.81%.

◮ Conclusion: it doesn’t really matter what metric you use to optimize the meta-parameter.
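
One way such a "maximum error rate increase" could be computed is sketched below. It assumes the figure is an absolute difference in percentage points; the slides do not say whether it is absolute or relative. The function name and the score table are hypothetical, with made-up numbers.

```python
def max_error_increase(scores):
    """scores[metric][value] is the accuracy (%) obtained with that parameter value.

    For every pair (tuning metric, judging metric), pick the value that is best
    under the tuning metric and measure how much worse it is, under the judging
    metric, than the judging metric's own optimum. Return the largest such gap.
    """
    worst = 0.0
    for tuning in scores:
        chosen = max(scores[tuning], key=scores[tuning].get)
        for judging in scores:
            best = max(scores[judging].values())
            worst = max(worst, best - scores[judging][chosen])
    return worst

# Made-up accuracies for three regularization values under two metrics.
toy_scores = {
    "metric_a": {0.1: 40.0, 1: 42.0, 10: 41.5},
    "metric_b": {0.1: 88.0, 1: 89.8, 10: 90.0},
}
print(max_error_increase(toy_scores))
```

In this toy table the two metrics prefer different values, yet the cost of tuning with the "wrong" one stays at half a point, which is the kind of small gap the slide reports.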

Selecting a feature set combination

◮ Above, I described a handful of feature configurations; I actually tested about 60 different combinations.

◮ In principle, the metric that I used to decide which one was best matters.

◮ However, in fact the metrics all ranked the same configuration as the best.
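
Checking that agreement is a simple comparison of per-metric winners. The sketch below assumes results are stored as a metric-to-configuration score table; the metric names, configuration names, and scores are invented.

```python
def best_per_metric(results):
    """Map each metric to the configuration that scores highest under it."""
    return {
        metric: max(by_config, key=by_config.get)
        for metric, by_config in results.items()
    }

def metrics_agree(results):
    """True when every metric picks the same configuration as best."""
    return len(set(best_per_metric(results).values())) == 1

# Invented scores for three hypothetical configurations under two metrics.
toy_results = {
    "metric_a": {"config_1": 0.40, "config_2": 0.47, "config_3": 0.45},
    "metric_b": {"config_1": 0.88, "config_2": 0.91, "config_3": 0.90},
}
print(best_per_metric(toy_results), metrics_agree(toy_results))
```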

Metrics: Summary

There are many different syntax disambiguation evaluation metrics available.

However, from the point of view of optimizing a model, there is little difference between the 6 or so most commonly used metrics.

Syntax Disambiguation: Conclusion

My best combination model appears to represent a decent approximation of the best performance available from current techniques when viewed through any of the commonly used metrics.

Hence it is a suitable baseline for judging the success of future forays into joint disambiguation.