cs460/626 : natural language processing/speech nlp and the

47
CS460/626 : Natural Language Processing/Speech NLP and the Web Processing/Speech, NLP and the Web (Lecture 28– Grammar; Constituency, Dependency) Dependency) Pushpak Bhattacharyya Pushpak Bhattacharyya CSE Dept., IIT Bombay 21 t M h 2011 21 st March, 2011

Upload: others

Post on 14-Apr-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CS460/626 : Natural Language Processing/Speech NLP and the

CS460/626 : Natural Language Processing/Speech NLP and the WebProcessing/Speech, NLP and the Web

(Lecture 28– Grammar; Constituency, Dependency)Dependency)

Pushpak BhattacharyyaPushpak BhattacharyyaCSE Dept., IIT Bombay

21 t M h 201121st March, 2011

Page 2: CS460/626 : Natural Language Processing/Speech NLP and the

Grammar

A finite set of rules that generates only and all sentences of athat generates only and all sentences of a language.that assigns an appropriate structural g pp pdescription to each one.

Page 3: CS460/626 : Natural Language Processing/Speech NLP and the

Grammatical Analysis Techniques

Two main devices

Breaking up a StringLabeling the Constituents

– MorphologicalC i l

– SequentialHi hi l – Categorial

– Functional – Hierarchical– Transformational

Page 4: CS460/626 : Natural Language Processing/Speech NLP and the

Breaking up and LabelingSequential Breaking up

Seq enti l B e king p nd Mo phologi lSequential Breaking up and Morphological LabelingSequential Breaking up and Categorial Labeling Seque a ea g up a d Ca ego a abe gSequential Breaking up and Functional Labeling

Hierarchical Breaking upHierarchical Breaking upHierarchical Breaking up and Categorial LabelingHierarchical Breaking up and Functional Labelingg p g

Page 5: CS460/626 : Natural Language Processing/Speech NLP and the

Sequential Breaking up

• That student solved the problems.

that student solve ed the problem s+ + + + + +

Page 6: CS460/626 : Natural Language Processing/Speech NLP and the

Sequential Breaking up and Morphological LabelingMorphological Labeling

That student solved the problems.

th t t d t l d h blthat student solve ed the problem s

word word stem affix word stem affix

Page 7: CS460/626 : Natural Language Processing/Speech NLP and the

Sequential Breaking up and Categorial LabelingCategorial Labeling

This boy can solve the problem.this boy can solve the problem

Det N Aux V Det N

• They called her a taxi.They call ed taxi

Pron V Affi N

her

P on

a

DetPron V Affix NPron Det

Page 8: CS460/626 : Natural Language Processing/Speech NLP and the

Sequential Breaking up and Functional LabelingFunctional Labeling

They called taxiher a

Subject Verbal IndirectDirectSubject Verbal IndirectObject

Direct Object

They called taxiher a

Subject Verbal DirectObject

Indirect Object

Page 9: CS460/626 : Natural Language Processing/Speech NLP and the

Hierarchical Breaking up

Old men and women

Old Old men

andmen and women

andwomen

Old men and women Old men and women

womenandmenmenOld menOld

Page 10: CS460/626 : Natural Language Processing/Speech NLP and the

Hierarchical Breaking up and Categorial LabelingCategorial Labeling

Poor John ran away.

S

VPNP

V AdvA N

ran awayPoor John

Page 11: CS460/626 : Natural Language Processing/Speech NLP and the

Hierarchical Breaking up and F ti l L b liFunctional Labeling

• Immediate Constituent (IC) AnalysisImmediate Constituent (IC) Analysis• Construction types in terms of the function of the

constituents:– Predication (subject + predicate)– Modification (modifier + head)– Complementation (verbal + complement)– Subordination (subordinator + dependent unit)– Coordination (independent unit + coordinator)

Page 12: CS460/626 : Natural Language Processing/Speech NLP and the

Predication

[Birds]subject [fly]predicate

SS

PredicateSubject

Birds flyy

Page 13: CS460/626 : Natural Language Processing/Speech NLP and the

Modification

[A]modifier [flower]head

John [slept]head [in the room]modifier

S

PredicateSubject PredicateSubject

John H d M difiJohn Head Modifier

slept In the roomslept

Page 14: CS460/626 : Natural Language Processing/Speech NLP and the

ComplementationComplementationHe [saw]verbal [a lake]complement

S

PredicateSubject

He Verbal Complement

saw a lake

19/12/2004 ICON 2004 Tutorial

Page 15: CS460/626 : Natural Language Processing/Speech NLP and the

SubordinationSubordinationJohn slept [in]subordinator [the room]dependent unit

S

PredicateSubject

John Head Modifier

slept Subordinator Dependent Unit

the roomin

Page 16: CS460/626 : Natural Language Processing/Speech NLP and the

Coordination

[John came in time] independent unit [but]coordinatorp[Mary was not ready] independent unit

SS

C di tI d d t U it Independent UnitCoordinatorIndependent Unit Independent Unit

John came in time but Mary was not ready

Page 17: CS460/626 : Natural Language Processing/Speech NLP and the

An Example

SIn the morning, the sky looked much brighter.An Example

S

HeadModifier

Subordinator DU PredicateSubject

HeadHead Verbal ComplementModifierModifier

HeadModifier

In the morning, the sky looked much brighter

Page 18: CS460/626 : Natural Language Processing/Speech NLP and the

Hierarchical Breaking up and Categorial / Functional LabelingCategorial / Functional Labeling

Hierarchical Breaking up coupled with C t i l /F ti l L b li iCategorial /Functional Labeling is a very powerful device.B t th bi iti hi hBut there are ambiguities which demand something more powerful.E L f G dE.g., Love of God

Someone loves GodG d lGod loves someone

Page 19: CS460/626 : Natural Language Processing/Speech NLP and the

Hierarchical Breaking up

Categorial Labeling Functional Labeling

Love of God Love of God

Noun Ph

Prepositional Phrase

Head ModifierPhrase Phrase

DUSub

Godoflove love of God

Page 20: CS460/626 : Natural Language Processing/Speech NLP and the

Types of Generative GrammarypFinite State Model

( i l)(sequential)Phrase Structure Model

(sequential + hierarchical) + (categorial)

Transformational ModelTransformational Model (sequential + hierarchical + transformational) +

(categorial + functional)(categorial + functional)

Page 21: CS460/626 : Natural Language Processing/Speech NLP and the

Phrase Structure Grammar (PSG)Phrase Structure Grammar (PSG)

A phrase-structure grammar G consists of a f t l (V T S P) hfour tuple (V, T, S, P), where V is a finite set of alphabets (or vocabulary)

E g N V A Adv P NP VP AP AdvP PPE.g., N, V, A, Adv, P, NP, VP, AP, AdvP, PP, student, sing, etc.

T is a finite set of terminal symbols: T ⊂ VT is a finite set of terminal symbols: T ⊂ VE.g., student, sing, etc.

S is a distinguished non-terminal symbol, also g y ,called start symbol: S ∈ VP is a set of productions.

Page 22: CS460/626 : Natural Language Processing/Speech NLP and the

Noun Phrases

• John • the student • the intelligent student

NP NP NPNP

N

NP

NDet

NP

NDet AdjP

John studentthe studentthe intelligent

j

Page 23: CS460/626 : Natural Language Processing/Speech NLP and the

Noun Phrase

• his first five PhD students

NP

QuantDet Ord NN

five

Quant

his

Det

first

Ord

students

N

PhD

N

fivehis first studentsPhD

Page 24: CS460/626 : Natural Language Processing/Speech NLP and the

Noun PhraseNoun Phrase• The five best students of my class

NP

fi

Quant

th

Det NAP PP

fivethe studentsbest of my class

Page 25: CS460/626 : Natural Language Processing/Speech NLP and the

Verb Phrases• can sing • can hit the ball

VP VPVP

VAux

VP

NPAux V

singcan the ball

NP

can

Aux

hit

V

g

Page 26: CS460/626 : Natural Language Processing/Speech NLP and the

Verb PhraseVerb Phrase• Can give a flower to Mary

VP

NPAux V PP

a flower

NP

can

Aux

give

V

to Mary

PP

g o a y

Page 27: CS460/626 : Natural Language Processing/Speech NLP and the

Verb Phrase• may make John the chairman

VP

NPAux V NP

John

NP

may

Aux

make

V

the chairman

NP

y e c a a

Page 28: CS460/626 : Natural Language Processing/Speech NLP and the

Verb Phrase

• may find the book very interesting

VP

NPAux V AP

the bookmay find very interesting

Page 29: CS460/626 : Natural Language Processing/Speech NLP and the

Prepositional Phrases

• in the classroom • near the river

PPPP

NPPNPP

the rivernearthe classroomin

Page 30: CS460/626 : Natural Language Processing/Speech NLP and the

Adjective PhrasesAdjective Phrases• intelligent • very honest • fond of sweets

AP AP AP

A ADegree PPA

intelligent honestvery of sweetsfond

Page 31: CS460/626 : Natural Language Processing/Speech NLP and the

Adjective Phrase

• very worried that she might have done badly in the i tassignment

AP

S’Degree A S

very

Degree

worried

A

that she might have done badly in the

very worried

assignment

Page 32: CS460/626 : Natural Language Processing/Speech NLP and the

Phrase Structure Rules

Rewrite Rules:

• The boy hit the ball.

Rewrite Rules:(i) S NP VP(ii) NP D t N(ii) NP Det N(iii) VP V NP

h(iv) Det the(v) N man, ball(v) V hit

We interpret each rule X Y as the instruction rewrite X as Y.

Page 33: CS460/626 : Natural Language Processing/Speech NLP and the

DerivationDerivation• The boy hit the ball.

SentenceNP + VP (i)Det + N + VP (ii)Det + N + V + NP (iii)The + N + V + NP (iv)The + N + V + NP (iv)The + boy + V + NP (v)The + boy + hit + NP (vi)The + boy + hit + Det + N (ii)The + boy + hit + the + N (iv)The + boy + hit + the + ball (v)The + boy + hit + the + ball (v)

Page 34: CS460/626 : Natural Language Processing/Speech NLP and the

PSG Parse Tree

The boy hit the ball.S

VPNP

VND t NPVNDet

the

NP

Nb Dethitthe Nboy Dethit

the ballthe ball

Page 35: CS460/626 : Natural Language Processing/Speech NLP and the

PSG Parse Tree

John wrote those words in the Book of ProverbsSProverbs.S

VPNP

VPropN NP PP

NPP

John wrote thosewords

in

th of

NP PP

thebook

ofproverbs

Page 36: CS460/626 : Natural Language Processing/Speech NLP and the

Penn POS Tags

• John wrote those words in the Book of Proverbs.

[John/NNP ]t /VBDwrote/VBD

[ those/DT words/NNS ]in/IN [ the/DT Book/NN ][ the/DT Book/NN ]of/IN [ Proverbs/NNS ][ Proverbs/NNS ]

Page 37: CS460/626 : Natural Language Processing/Speech NLP and the

Penn Treebank• John wrote those words in the Book of Proverbs.

(S (NP-SBJ (NP John))(VP t(VP wrote

(NP those words)(PP-LOC in(NP (NP-TTL (NP the Book)(NP (NP TTL (NP the Book)

(PP of(NP Proverbs)))(NP Proverbs)))

Page 38: CS460/626 : Natural Language Processing/Speech NLP and the

PSG Parse Tree

Official trading in the shares will start in Pa is on No 6Paris on Nov 6. S

VPNP

NP

PP

PP

PPVAux

NAP

PPNPP

PPVAux

A

official trading will start on Nov 6in the shares in Paris

Page 39: CS460/626 : Natural Language Processing/Speech NLP and the

Penn POS Tagsg• Official trading in the shares will start in Paris on

N 6

[ Official/JJ trading/NN ]Nov 6.

in/IN [ the/DT shares/NNS ][ the/DT shares/NNS ]will/MD start/VB in/IN [ P i /NNP ][ Paris/NNP ]on/IN [ Nov./NNP 6/CD ]

Page 40: CS460/626 : Natural Language Processing/Speech NLP and the

Penn Treebank

• Official trading in the shares will start in Paris on Nov 6

( (S (NP-SBJ (NP Official trading)(PP in

Nov 6.

(PP in(NP the shares)))

(VP will(VP start(PP-LOC in

(NP Paris))(NP Paris))(PP-TMP on

(NP (NP Nov 6)( ( )

Page 41: CS460/626 : Natural Language Processing/Speech NLP and the

Penn POS Tag SsetAdjective: JJAdverb: RBCardinal Number: CDDeterminer: DTPreposition: INCoordinating Conjunction CCSubordinating Conjunction: INSingular Noun: NNPlural Noun: NNSPlural Noun: NNSPersonal Pronoun: PPProper Noun: NPVerb base form: VBVerb base form: VBModal verb: MDVerb (3sg Pres): VBZWh-determiner: WDTWh determiner: WDTWh-pronoun: WP

Page 42: CS460/626 : Natural Language Processing/Speech NLP and the

Diff b tDifference between constituency and dependencyconstituency and dependency

Page 43: CS460/626 : Natural Language Processing/Speech NLP and the

Constituency GrammarConstituency GrammarCategorical Uses part of speechContext Free Grammar (CFG)Basic elements Phrases

Page 44: CS460/626 : Natural Language Processing/Speech NLP and the

Dependency Grammar

FunctionalContext Free GrammarContext Free GrammarBasic elements Units of Predication/ Modification/ Complementation/Modification/ Complementation/ Subordination/ Co-ordination

Page 45: CS460/626 : Natural Language Processing/Speech NLP and the

Bridge between Constituency and Dependency parse

Constituency uses phrasesDependencies consist of Head-modifier combinationThis is a cricket bat.This is a cricket bat.

Cricket (Category: N, Functional: Adj)Bat (Category: N, Functional: N)

For languages which are free word order we use g gdependency parser to uncover the relations between the words.Raam ne Shaam ko dekha . (Ram saw Shyam)Sh k R d kh (R Sh )Shaam ko Ram ne dekha. (Ram saw Shyam)Case markers cling to the nouns they subordinate

Page 46: CS460/626 : Natural Language Processing/Speech NLP and the

Example of CG and DG output

Birds Fly.Birds Fly.

S S

NP VP Subject Predicate

N VBirds fly

Birds fly

Birds fly

Birds fly

Page 47: CS460/626 : Natural Language Processing/Speech NLP and the

Some probabilistic parsers and why they are used

Stanford, Collins, Charniack, RASPWhy Probabilistic parsersWhy Probabilistic parsers

For a single sentence we can have multiple parsesparses. Probability for the parse is calculated and then the parse with the highest probabilitythen the parse with the highest probability is selected.This is needed in many applications of NLP,This is needed in many applications of NLP, that need parsing.