Linguistics 187 Week 4: Ambiguity and Robustness

Post on 15-Jan-2016


Page 1: Linguistics 187 Week 4 Ambiguity and Robustness. Discourse Language has pervasive ambiguity  walk untieable knot bank ? Noun or Verb (untie)able or un(tieable)?

Linguistics 187 Week 4

Ambiguity and Robustness

Page 2: Language has pervasive ambiguity

Levels: Discourse, Entailment, Semantics, Syntax, Morphology, Tokenization

– walk: noun or verb?
– untieable knot: (untie)able or un(tieable)?
– bank: river or financial?
– Every man loves a woman. The same woman or each their own?
– John told Tom he had to go. Who had to go?
– I like Jan. |Jan|.| or |Jan.|.| (sentence end or abbreviation)
– John didn’t wait to go. Now or never?
– Bill fell. John kicked him. Because or after?
– The duck is ready to eat. Cooked or hungry?

Page 3: Ambiguity

Syntactically legitimate ambiguity (vs. spurious ambiguity: “boys and girls” & pushup)

Sources:
– Alternative c-structure rules
– Disjunctions in f-structure description
– Lexical categories

XLE’s display/computation of ambiguity

Dealing with ambiguity
– Recognize legitimate ambiguity
– OT marks for preferences (later in the course)
– Stochastic disambiguation

Page 4: Syntactic Ambiguity

Lexical
– part of speech
– subcategorization frames

Syntactic
– attachments
– coordination

Implemented system highlights interactions

Page 5: Lexical Ambiguity: POS

verb-noun
– I saw her duck.
– I saw [NP her duck].
– I saw [NP her] [VP duck].

noun-adjective
– the [N/A mean] rule
– that child is [A mean].
– he calculated the [N mean].

Page 6: Morphology and POS ambiguity

English has impoverished morphology and hence extreme POS ambiguity
– leaves:
» leave +Verb +Pres +3sg
» leaf +Noun +Pl
» leave +Noun +Pl
– will: +Noun +Sg; +Aux; +Verb +base

Even languages with extensive morphology have ambiguities
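The POS ambiguity of these forms can be pictured as a lookup from a surface form to all of its morphological analyses. This is a minimal Python sketch: the table `ANALYSES` and both helper functions are hypothetical illustrations that only copy the tags listed on this slide; they are not the output or interface of a real morphological analyzer.

```python
# Hypothetical toy table: surface form -> list of (lemma, tags) analyses.
# The entries copy the slide's examples; a real analyzer would compute them.
ANALYSES = {
    "leaves": [
        ("leave", ("+Verb", "+Pres", "+3sg")),
        ("leaf", ("+Noun", "+Pl")),
        ("leave", ("+Noun", "+Pl")),
    ],
    "will": [
        ("will", ("+Noun", "+Sg")),
        ("will", ("+Aux",)),
        ("will", ("+Verb", "+base")),
    ],
}


def pos_tags(form):
    """Return the set of part-of-speech tags available for a surface form."""
    return {tags[0] for _lemma, tags in ANALYSES.get(form, [])}


def count_readings(words):
    """Count combinations of analyses if every word's choice were free.

    Words missing from the toy table are treated as having one reading.
    """
    total = 1
    for word in words:
        total *= max(1, len(ANALYSES.get(word, [])))
    return total
```

Even a two-word string such as "will leaves" has several tag combinations before the syntax filters any of them, which is why lexical ambiguity interacts so strongly with the rules.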

Page 7: Lexical ambiguity: Subcat frames

Words often have more than one subcategorization frame
– transitive/intransitive: I broke it. / It broke.
– intransitive/oblique: He went. / He went to London.
– transitive/transitive with infinitive: I want it. / I want it to leave.

Page 8: Subcat-Rule interactions

OBL vs. ADJUNCT with intransitive/oblique
– He went to London.

[ PRED ‘go<(^ SUBJ)(^ OBL)>’

SUBJ [PRED ‘he’]

OBL [PRED ‘to<(^ OBJ)>’

OBJ [ PRED ‘London’]]]

[ PRED ‘go<(^ SUBJ)>’

SUBJ [PRED ‘he’]

ADJUNCT { [PRED ‘to<(^ OBJ)>’

OBJ [ PRED ‘London’]]}]

Page 9: OBL-ADJUNCT cont.

Passive by phrase
– It was eaten by the boys.

[ PRED ‘eat<(^ OBL-AG)(^ SUBJ)>’
  SUBJ [PRED ‘it’]
  OBL-AG [PRED ‘by<(^ OBJ)>’
          OBJ [PRED ‘boy’]]]

– It was eaten by the window.

[ PRED ‘eat<NULL(^ SUBJ)>’
  SUBJ [PRED ‘it’]
  ADJUNCT { [PRED ‘by<(^ OBJ)>’
             OBJ [PRED ‘window’]]}]

Page 10: XCOMP-ADJUNCT

to infinitives can be arguments or adjuncts (purpose clauses)
– I want her to leave.

[ PRED ‘want<(^ SUBJ)(^ XCOMP)>(^ OBJ)’

SUBJ [ PRED ‘I’ ]

OBJ [ PRED ‘her’ ]1

XCOMP [ PRED ‘leave<(^ SUBJ)>’

SUBJ [ 1 ] ] ]

Page 11: XCOMP-ADJUNCT cont.

– I want money to buy that.

[ PRED ‘want<(^ SUBJ)(^ OBJ)>’

SUBJ [ PRED ‘I’ ]

OBJ [ PRED ‘money’ ]

ADJUNCT { [ PRED ‘buy<(^ SUBJ)(^ OBJ)>’

SUBJ [ PRED ‘pro’ ]

OBJ [ PRED ‘that’ ] ] } ]

But both sentences get both analyses
– The syntax does not have world knowledge

Page 12: OBJ-TH and Noun-Noun compounds

Many OBJ-TH verbs are also transitive
– I took the cake. I took Mary the cake.

The grammar needs a rule for noun-noun compounds
– the tractor trailer, a grammar rule

These can interact
– I took the grammar rules
– I took [NP the grammar rules]
– I took [NP the grammar] [NP rules]

Page 13: Syntactic Ambiguities

Even without lexical ambiguity, there is legitimate syntactic ambiguity
– PP attachment
– Coordination

Want to:
– constrain these to legitimate cases
– make sure they are processed efficiently

Page 14: PP Attachment

PP adjuncts can attach to VPs and NPs

Strings of PPs in the VP are ambiguous
– I see the girl with the telescope.
– I see [the girl with the telescope].
– I see [the girl] [with the telescope].

This ambiguity is reflected in:
– the c-structure (constituency)
– the f-structure (ADJUNCT attachment)

Page 15: PP attachment cont.

This ambiguity multiplies with more PPs
– I saw the girl with the telescope
– I saw the girl with the telescope in the garden
– I saw the girl with the telescope in the garden on the lawn

The syntax has no way to determine the attachment, even if humans can.
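How fast the readings multiply can be made concrete with a small counting sketch in Python. It rests on assumptions that are not stated on the slides: each PP may attach to the verb, to the object noun, or to the noun inside an earlier PP, and attachments may not cross. The function names are hypothetical, and nothing here reflects how XLE itself computes or packs the attachments.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_attachments(open_sites, pps_left):
    """Count non-crossing attachments of the remaining PPs.

    open_sites: number of heads a PP can still attach to.
    Attaching to site j (1-based, counted from the verb) closes all sites
    to its right and opens one new site: the noun inside the PP just attached.
    """
    if pps_left == 0:
        return 1
    return sum(
        count_attachments(j + 1, pps_left - 1)
        for j in range(1, open_sites + 1)
    )


def readings(n_pps):
    """Readings for 'V NP PP ... PP' with n_pps prepositional phrases."""
    # Start with two open sites: the verb and the object noun.
    return count_attachments(2, n_pps)
```

Under these assumptions `readings(1)`, `readings(2)` and `readings(3)` give 2, 5 and 14 for the three example sentences, so each added PP more than doubles the number of analyses.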

Page 16: Ambiguity in coordination

Vacuous ambiguity of non-branching trees
– this can be avoided (pushup)

Legitimate ambiguity
– old men and women
» old [N men and women]
» [NP old men ] and [NP women ]
– I turned and pushed the cart
» I [V turned and pushed ] the cart
» I [VP turned ] and [VP pushed the cart ]

Page 17: Grammar Engineering and ambiguity

Large-scale grammars will have lexical and syntactic ambiguities

With real data they will interact, resulting in many parses
– these parses are (syntactically) legitimate
– they are not intuitive to humans (but more plausible words can make them better)

XLE provides tools to manage ambiguity
– grammar writer interfaces
– computation

Page 18: XLE display

Four windows
– c-structure (top left)
– f-structure (bottom left)
– packed f-structure (top right)
– choice space (bottom right)

C-structure and f-structure “next” buttons

Other two windows are packed representations of all the parses
– clicking on a choice will display that choice in the left windows

Page 19: Example

I see the girl in the garden

PP attachment ambiguity
– both ADJUNCTS
– difference in ADJUNCT-TYPE

Page 20: Packed F-structure and Choice space

Page 21: Sorting through the analyses

“Next” button on c-structure and then f-structure windows
– impractical with many choices
– independent vs. interacting ambiguities
– hard to detect spurious ambiguity

The packed representations show all the analyses at once
– (in)dependence more visible
– click on choice to view
– spurious ambiguities appear as blank choices
» but legitimate ambiguities may also do so

Page 22: Ambiguity Demo

Files
– eng-week4-demo.lfg
– eng-week4-demo-test.lfg

Attachment
– the girl ate the banana with the monkey

Subcategorization
– the girl thought about the banana

Feature
– the sheep laughed

All three (2 c-structures; 8 analyses)
– the girl thought about the banana with the monkey

Page 23: XLE Ambiguity Management

The sheep liked the fish. How many sheep? How many fish?

Options multiplied out
– The sheep-sg liked the fish-sg.
– The sheep-pl liked the fish-sg.
– The sheep-sg liked the fish-pl.
– The sheep-pl liked the fish-pl.

Options packed
– The sheep {sg | pl} liked the fish {sg | pl}

Packed representation is a “free choice” system
– Encodes all dependencies without loss of information
– Common items represented, computed once
– Key to practical efficiency
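The difference between the multiplied-out and the packed form can be sketched in a few lines of Python. The dictionary layout and the function names are illustrative assumptions, not XLE's internal representation; the point is only that the packed form stores each option once while the analyses are recovered by free combination.

```python
from itertools import product

# Packed form: each ambiguous word stores its own options once.
packed = {
    "sheep": ["sg", "pl"],
    "fish": ["sg", "pl"],
}


def multiply_out(packed_choices):
    """Enumerate every analysis by freely combining the packed options."""
    words = list(packed_choices)
    return [
        dict(zip(words, combo))
        for combo in product(*(packed_choices[w] for w in words))
    ]


def packed_size(packed_choices):
    """Number of option entries stored in the packed form."""
    return sum(len(options) for options in packed_choices.values())
```

With two ambiguous words the two sizes happen to coincide (4 entries, 4 analyses), but the packed size grows additively with each new ambiguous word while the number of multiplied-out analyses grows multiplicatively.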

Page 24: Dependent choices

Das Mädchen sah die Katze (The girl saw the cat)
– Mädchen: nom or acc
– Katze: nom or acc

Again, packing avoids duplication … but it’s wrong: it doesn’t encode all dependencies, and the choices are not free.

– Das Mädchen-nom sah die Katze-nom: bad
– Das Mädchen-nom sah die Katze-acc: The girl saw the cat
– Das Mädchen-acc sah die Katze-nom: The cat saw the girl
– Das Mädchen-acc sah die Katze-acc: bad

Who do you want to succeed?
– I want to succeed John: want intrans, succeed trans
– I want John to succeed: want trans, succeed intrans

Page 25: Solution: Label dependent choices

– Das Mädchen-nom sah die Katze-nom: bad
– Das Mädchen-nom sah die Katze-acc: The girl saw the cat
– Das Mädchen-acc sah die Katze-nom: The cat saw the girl
– Das Mädchen-acc sah die Katze-acc: bad

• Label each choice with distinct Boolean variables p, q, etc.
• Record acceptable combinations as a Boolean expression φ
• Each analysis corresponds to a satisfying truth-value assignment (free choice from the true lines of φ’s truth table)

Das Mädchen sah die Katze
– Mädchen: p: nom, ¬p: acc
– Katze: q: nom, ¬q: acc

φ = (p ∧ ¬q) ∨ (¬p ∧ q)
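The labeling scheme can be sketched in Python. The reading of p and q as "is nominative" flags and the formula itself (exactly one of the two NPs is nominative) are reconstructed from the four combinations listed above, so treat both as assumptions; the function names and the ASCII key `Maedchen` are likewise hypothetical.

```python
from itertools import product


# p is true when Mädchen is nominative, false when it is accusative;
# q is true when Katze is nominative, false when it is accusative.
def phi(p, q):
    """Acceptable combinations: exactly one of the two NPs is nominative."""
    return (p and not q) or (not p and q)


def analyses():
    """Return one analysis per satisfying truth-value assignment of phi."""
    case = {True: "nom", False: "acc"}
    return [
        {"Maedchen": case[p], "Katze": case[q]}
        for p, q in product([True, False], repeat=2)
        if phi(p, q)
    ]
```

The case options are still stored once per word, as in the packed form; the Boolean expression is what records that only two of the four combinations are analyses.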

Page 26: Ambiguity and Robustness

Large-scale grammars are massively ambiguous

Grammars parsing real text need to be robust
– "loosening" rules to allow robustness increases ambiguity even more

Need a way to control the ambiguity
– version of Optimality Theory (OT)

Page 27: Theoretical OT

Grammar has a set of violable constraints

Constraints are ranked by each language
– This gives cross-linguistic variation

Candidates (analyses) compete
– John waited for Mary. vs. John waited for 3 hours.

Constraint ranking determines winning candidate

Issues for XLE
– Candidates can be very ungrammatical
» we have a grammar to produce grammatical analyses
» even with robust, ungrammatical analyses, these are controlled
– Generation, not parsing direction
» we know what the string is already
» for generation we have a very specified analysis

Page 28: XLE OT

– Incorporate idea of ranking and (dis)preference
– Filter syntactic and lexical ambiguity
– Reconcile robustness and accuracy
– Allow parsing grammar to be used for generation

Page 29: XLE OT Implementation

OT marks in
– grammar rules
– templates
– lexical entries

CONFIG states
– preference vs. dispreference
– ranking
– parsing vs. generation orders

Page 30: The o:: projection

OT marks are not f-structure features

OT marks are in their own projection
– c-structure
– f-structure
– o-structure (set of OT marks)

Page 31: The o:: projection

The o-structure is just a set of marks
{ PPadj GuessedN }

Instead of ^ and !, have o::* (NB: !f::*)

PP: (^ ADJUNCT)=!
    PPadj $ o::* ;

– the f-structure is exactly the same
– there is now an additional o-structure

Page 32: Ranking analyses

Specify relative importance of OT marks in the CONFIG
OPTIMALITYORDER Mark3 Mark2 +Mark1.
(importance decreases from left to right)

Comparing analyses
– Find the most important mark where the analyses differ
– Prefer the analysis with
» the fewest dispreference marks (no +)
» the most preference marks (+)

Page 33: Ranking analyses (continued)

OPTIMALITYORDER Mark3 Mark2 +Mark1.
(importance decreases from left to right)

– an analysis with Mark2 is preferred over an analysis with Mark3
– an analysis with no mark is preferred over an analysis with Mark2 or Mark3
– an analysis with one Mark2 is preferred over one with two Mark2
– an analysis with Mark1 is preferred over an analysis with no mark
– an analysis with two Mark1 is preferred over an analysis with one Mark1
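The comparison procedure can be sketched as a sort key in Python. This is a simplified model, not XLE's implementation: it assumes the order string lists marks from most to least important, treats marks absent from the order as neutral, and ignores the special marks (NOGOOD, STOPPOINT, CSTRUCTURE) and grouping. The function names are hypothetical.

```python
from collections import Counter


def parse_order(spec):
    """Turn 'Mark3 Mark2 +Mark1.' into [(name, is_preference), ...]."""
    return [
        (mark.lstrip("+"), mark.startswith("+"))
        for mark in spec.rstrip(".").split()
    ]


def sort_key(marks, order):
    """Build a tuple that sorts better analyses first.

    For each ranked mark, most important first: the count for a
    dispreference mark (fewer is better), or the negated count for a
    preference mark (more is better).
    """
    counts = Counter(marks)
    return tuple(
        -counts[name] if is_pref else counts[name]
        for name, is_pref in order
    )


def optimal(analyses, spec):
    """Return the analyses (given as lists of marks) with the best key."""
    order = parse_order(spec)
    best = min(sort_key(a, order) for a in analyses)
    return [a for a in analyses if sort_key(a, order) == best]
```

Because Python compares tuples position by position, the first position where two keys differ corresponds to the most important mark on which the two analyses differ.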

Page 34: Difference with Theoretical OT

Theoretical OT: only dispreference marks

XLE OT:
– dispreference marks: Mark1
– preference marks: +Mark1
– NOTE: + is only indicated in the CONFIG; only the name (Mark1) appears in the grammar

Deciding which to use can be difficult

Page 35: Example: PP ambiguities

– John waited for Mary.
– John waited for 3 hours.

Rule with OT marks, using the template OT(_mark) = _mark $ o::*.

VP --> V
       (NP: (^ OBJ)=!)
       PP*: { (^ OBL)=!
              @(OT PPobl)
            | ! $ (^ ADJUNCT)
              @(OT PPadj)}.

Page 36: Basic Structures

John waited for Mary
f-str: [ PRED 'wait<SUBJ OBL>'
         SUBJ [ PRED 'John']
         OBL [ PRED 'for<OBJ>'
               OBJ [ PRED 'Mary' ]]]
o-str: { PPobl }

John waited for Mary
f-str: [ PRED 'wait<SUBJ>'
         SUBJ [ PRED 'John']
         ADJ {[ PRED 'for<OBJ>'
                OBJ [ PRED 'Mary' ]]}]
o-str: { PPadj }

Page 37: Ranking for Example

Disprefer ADJUNCTs
– OPTIMALITYORDER PPadj.
– Problem: will disprefer adjuncts even when no OBL analysis is possible

Prefer OBLs
– OPTIMALITYORDER +PPobl.
– Problem: will prefer OBL even when the other analysis was not an ADJUNCT
– Still probably better than dispreferring ADJUNCTs
– Solution: local OT marks (not discussed here)

Page 38: Special OT marks in XLE

Separate other marks into fields

Marks preceding
– NOGOOD: remove parts of the grammar, for debugging or specializing
– STOPPOINT: apply on a second pass, for extending the grammar on failure
– CSTRUCTURE: filter when the c-structure is built, for speed

There is lots of discussion in the XLE documentation; the reading on the web is a bit out of date for these marks

Page 39: The NOGOOD Mark

OT marks can be used to remove parts of the grammar
– rules or rule parts
– templates or template parts
– lexical items or parts of them

Use for
– grammar adaptation/sharing
– grammar development

Example
– OPTIMALITYORDER FrontMatter NOGOOD.

Page 40: NOGOOD Example

ROOT rule allows for front matter for special corpus

ROOT --> (FR-MAT: (^ ID)=!
                  @(OT FrontMatter))
         S.

FR-MAT --> NUMBER
           (PERIOD).

Example input: 1. The light flashes.

Page 41: FR-MAT

Grammars for corpora with front matter will not rank the OT mark FrontMatter (unranked marks are neutral)

Grammars for corpora without front matter will make the OT mark a NOGOOD
– OPTIMALITYORDER FrontMatter NOGOOD.
– Effective ROOT rule: ROOT --> S.

Allows rule sharing across grammars

Can also be used for debugging

Page 42: Robustness

What to do if the grammar doesn't provide an analysis?

Graceful failure
– FRAGMENTs
– Specific relaxations

Ungrammatical analysis only if no grammatical one

Avoid ungrammatical analyses in generation

Page 43: Robustness: STOPPOINT

On first pass, STOPPOINT is treated as NOGOOD
– Small, fast grammar for standard constructions

If first pass fails, ignore STOPPOINT and extend grammar
– Relaxation possibilities precede STOPPOINT
– OPTIMALITYORDER BadDetNAgr STOPPOINT.

Page 44: STOPPOINT Mark example

Example:
– NP: this boy
– NP: this boys

Template call with OT mark:

DEMON(_P _N) = (^ SPEC PRED)='_P'
               { (^ NUM)=c _N
               | (^ NUM)~= _N
                 @(OT BadDetNAgr)}.

Lexical entry:

this DET XLE @(DEMON %stem sg).

Ranking:

OPTIMALITYORDER BadDetNAgr STOPPOINT.

Page 45: Structures for STOPPOINT example

NP: this boy
f-str: [ PRED 'boy'
         NUM sg
         SPEC [ PRED 'this' ]]
o-str: { }

NP: this boys
f-str: [ PRED 'boy'
         NUM pl
         SPEC [ PRED 'this' ]]
o-str: { BadDetNAgr }

– Parsing "this boys" will be slow: the grammar has to parse a second time
– But the ungrammatical input gets a parse
– Only put OT marks behind the STOPPOINT if they will be rarely triggered
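The two-pass behavior can be pictured with a toy Python sketch. It is a deliberate simplification and an assumption about the control flow only: it filters a precomputed list of candidate analyses, whereas XLE reparses the input with the extended grammar, and it covers dispreference marks only. The function name and the data layout are hypothetical.

```python
def parse_with_stoppoint(candidates, behind_stoppoint):
    """Pick analyses in two passes.

    candidates: list of (analysis, marks) pairs a toy grammar could produce.
    behind_stoppoint: set of marks listed before STOPPOINT in the order.
    Returns (pass_number, analyses); pass_number is None if both passes fail.
    """
    # Pass 1: marks behind the STOPPOINT behave like NOGOOD.
    first = [
        analysis
        for analysis, marks in candidates
        if not (set(marks) & behind_stoppoint)
    ]
    if first:
        return 1, first
    # Pass 2: the marked (relaxed) analyses are allowed as well.
    second = [analysis for analysis, _marks in candidates]
    return (2, second) if second else (None, [])
```

A grammatical input is resolved on the first pass; only input that has no unmarked analysis pays for the second pass, which is why rarely triggered marks are the ones worth placing behind the STOPPOINT.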

Page 46: Preference marks and STOPPOINT

Preference marks behind the STOPPOINT are tried first (counter to intuition)
– OPTIMALITYORDER +MWE STOPPOINT.

Use MWE readings if at all possible

If that fails, do a second pass with the analytic (non-MWE) structure (inefficient if the first pass fails)

Example:

print` quality N * @(NOUN %STEM) @(OT MWE).

– The [N print quality] is excellent.
– I want to [V print] [NP quality documents].

Page 47: CSTRUCTURE Marks

Apply marks before f-structure constraints are processed
– OPTIMALITYORDER NoCloseQuote Guessed CSTRUCTURE.

Improve performance by filtering early

May lose some analyses
– coverage/efficiency tradeoff

Page 48: CSTRUCTURE example: Guessed

Only use guessed form if another form is not found in the morphology/lexicon
– OPTIMALITYORDER Guessed CSTRUCTURE.

Trade-off: lose some parses, but much faster
– The foobar is good.
» no entry for foobar ==> parse with guessed N
– The audio is good.
» audio: only A in morphology ==> no parse

Page 49: CSTRUCTURE example: Quote

Only allow unbalanced quote marks if there is no other quote mark
– Then I left." vs. He said, "they appeared."

METARULEMACRO: … _CAT QT: @(OT NoCloseQt); …

XLE only tries balanced version, not double unbalanced version
– failure when really needed two unbalanced quotes

Page 50: Combining the OT marks

All the types of OT marks can be used in one grammar
– the ordering of NOGOOD, CSTRUCTURE, and STOPPOINT is important

Example

OPTIMALITYORDER
    Verbmobil NOGOOD
    Guessed CSTRUCTURE
    +MWE Fragment STOPPOINT
    RareForm StrandedP +Obl.

Page 51: Other Features

Grouping: have marks treated as being of equal importance
– OPTIMALITYORDER (Paren Appositive) Adjunct.

Ungrammatical markup: have XLE report analyses with this mark with a *
– these are treated like any dispreference mark for determining the optimal analyses
– OPTIMALITYORDER *NoDetAgr STOPPOINT.

Page 52: Generation

XLE uses the same basic grammar to parse and generate

Do not always want to generate all the possibilities that can be parsed

Put in special OT marks for generation to block or prefer certain strings
– fix up bad subject-verb agreement
– only allow certain adverb placements
– control punctuation options

GENOPTIMALITYORDER

Page 53: OT Marks: Main points

Ambiguity: broad coverage results in ambiguity
– OT marks allow preferences

Robustness: want fall back parses only when regular parses fail
– OT marks allow multipass grammar

XLE provides for complex orderings of OT marks
– NOGOOD, CSTRUCTURE, STOPPOINT
– preference, dispreference, ungrammatical
– see the XLE documentation for details

Page 54: FRAGMENT grammar

What to do when the grammar does not get a parse
– always want some type of output
– want the output to be maximally useful

Why might it fail:
– construction not covered yet
– "bad" input
– took too long (XLE parsing parameters)

Page 55: Grammar engineering approach

– First try to get a complete parse
– If fail, build up chunks that get complete parses (c-str and f-str)
– Have a fall back for things without even chunk parses
– Link these chunks and fall backs together in a single f-structure

Page 56: Basic idea

XLE has a REPARSECAT which it tries if there is no complete parse

Grammar writer specifies what category the possible chunks are

OT marks are used to
– build the fewest chunks possible
– disprefer using the fall back over the chunks
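The two preferences can be sketched as a small dynamic program in Python. Everything here is an illustrative assumption rather than XLE's algorithm: `parsable` is a hypothetical set of spans that already have complete parses, any single word may fall back to a TOKEN chunk, and avoiding fall-back tokens is ranked above minimizing the number of chunks.

```python
def best_chunking(words, parsable):
    """Split words into chunks, minimizing (fall-back tokens, total chunks).

    parsable: hypothetical set of (start, end, category) spans that have a
    complete parse, with 0 <= start < end <= len(words).
    """
    n = len(words)
    # best[i] = (cost, chunks) for the suffix words[i:]; cost is a tuple
    # (number of fall-back tokens, number of chunks).
    best = [None] * (n + 1)
    best[n] = ((0, 0), [])
    for i in range(n - 1, -1, -1):
        options = []
        # Fall back: treat words[i] as an unanalyzed token.
        (tokens, count), rest = best[i + 1]
        options.append(
            ((tokens + 1, count + 1), [("TOKEN", words[i:i + 1])] + rest)
        )
        # Use any parsable chunk starting at position i.
        for start, end, category in parsable:
            if start == i:
                (tokens, count), rest = best[end]
                options.append(
                    ((tokens, count + 1), [(category, words[i:end])] + rest)
                )
        best[i] = min(options, key=lambda option: option[0])
    return best[0][1]
```

For the input "the the dog appears", with hypothetical parsable spans for the NP "the dog" and the S "the dog appears", the sketch selects one TOKEN chunk for the first "the" followed by a single S chunk.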

Page 57: Sample output

the the dog appears.

Split into:
– "token" the
– sentence "the dog appears"
– ignore the period

Page 58: C-structure

Page 59: F-structure

Page 60: How to get this

FRAGMENTS --> { NP: (^ FIRST)=! @(OT-MARK Fragment)
              | S: (^ FIRST)=! @(OT-MARK Fragment)
              | TOKEN: (^ FIRST)=! @(OT-MARK Fragment) }
              (FRAGMENTS: (^ REST)=! ).

Lexicon:

-token TOKEN * (^ TOKEN)=%stem @(OT-MARK Token).

Page 61: Why First-Rest?

FIRST-REST

[ FIRST [ PRED … ]
  REST [ FIRST [ PRED … ]
         REST … ] ]

– Efficient
– Encodes order

Possible alternative: set

{ [ PRED … ]
  [ PRED … ] }

– Not as efficient (copying)
– Even less efficient if mark scope facts
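The FIRST-REST encoding is essentially a linked list of chunks. The following Python sketch uses nested dictionaries as a stand-in for f-structures; the function names are hypothetical, and omitting REST on the last chunk is an assumption made here for the sketch.

```python
def first_rest(chunks):
    """Nest a list of chunk f-structures as FIRST/REST, preserving order."""
    structure = None
    for chunk in reversed(chunks):
        node = {"FIRST": chunk}
        if structure is not None:
            node["REST"] = structure
        structure = node
    return structure


def flatten(structure):
    """Recover the ordered chunk list from a FIRST/REST structure."""
    chunks = []
    while structure is not None:
        chunks.append(structure["FIRST"])
        structure = structure.get("REST")
    return chunks
```

Because the nesting itself records the order, the chunk sequence can be read back without extra precedence information, which an unordered set would need to carry separately.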

Page 62: Accuracy?

Evaluation against gold standard: “PARC 700” f-structure bank for Wall Street Journal

Measure: F-score on dependency triples
– F-score: harmonic mean of precision and recall
– Dependency triples: separate f-structure features, e.g. Subj(run, dog), Tense(run, past)

Results for best-matching f-structure:
– Full parses: F=88.5
– Fragment parses: F=76.7

(Riezler et al., 2002)
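The measure can be illustrated with a short Python sketch. It assumes the balanced F-score (the harmonic mean of precision and recall) computed over sets of dependency triples; the triple layout and the function name are hypothetical, and the code is not the evaluation script behind the reported numbers.

```python
def f_score(gold, predicted):
    """Precision, recall and their harmonic mean over dependency triples."""
    gold, predicted = set(gold), set(predicted)
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Triples in the style of the slide: (feature, head, value).
gold = {("Subj", "run", "dog"), ("Tense", "run", "past")}
predicted = {("Subj", "run", "dog"), ("Tense", "run", "pres")}
```

In this toy case one of the two predicted triples matches the gold standard, so precision, recall and F-score are all 0.5.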

Page 63: Fragments summary

– XLE has a chunking strategy for when the grammar does not provide a full analysis
– Each chunk gets full c-str and f-str
– The grammar writer defines the chunks based on what will be best for that grammar and application

Quality
– Fragments have reasonable but degraded f-scores
– Usefulness in applications is being tested
