Mid-term Reviews - Texas A&M University
faculty.cse.tamu.edu/huangrh/Fall17/mid-term-review.pdf

Mid-term Reviews: Preprocessing, language models, Sequence models, Syntactic Parsing
Preprocessing
• What is a lemma? What is a wordform?
• What is a word type? What is a token?
• What is tokenization?
• What is lemmatization?
• What is stemming?
How many words?
• I do uh main- mainly business data processing
• Fragments, filled pauses
• Seuss’s cat in the hat is different from other cats!
• Lemma: same stem, part of speech, rough word sense
  • cat and cats = same lemma
• Wordform: the full inflected surface form
  • cat and cats = different wordforms
How many words?
they lay back on the San Francisco grass and looked at the stars and their
• Type: an element of the vocabulary.
• Token: an instance of that type in running text.
• How many?
  • 15 tokens (or 14)
  • 13 types (or 12) (or 11?)
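A quick sketch of the count for the sentence above; treating "San Francisco" as one token, or "they"/"their" as one lemma, gives the "(or ...)" variants.

```python
sentence = ("they lay back on the San Francisco grass "
            "and looked at the stars and their")
tokens = sentence.split()          # plain whitespace tokenization
print(len(tokens))                 # 15 tokens
print(len(set(tokens)))            # 13 types: "the" and "and" each repeat
```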
Issues in Tokenization
• Finland’s capital → Finland Finlands Finland’s ?
• what’re, I’m, isn’t → what are, I am, is not
• Hewlett-Packard → Hewlett Packard ?
• state-of-the-art → state of the art ?
• Lowercase → lower-case lowercase lower case ?
• San Francisco → one token or two?
• m.p.h., PhD. → ??
Lemmatization
• Reduce inflections or variant forms to base form
  • am, are, is → be
  • car, cars, car's, cars' → car
• the boy's cars are different colors → the boy car be different color
• Lemmatization has to find the correct dictionary headword form
• Context dependent; for instance: in our last meeting (noun, lemma meeting) vs. We're meeting (verb, lemma meet) tomorrow.
Stemming
• Reduce terms to their stems in information retrieval
• Stemming is a crude chopping of affixes
  • language dependent
  • e.g., automate(s), automatic, automation all reduced to automat
• For example, compressed and compression are both accepted as equivalent to compress. Stemmed, that sentence becomes:
  for exampl compress and compress ar both accept as equival to compress
• Context independent
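A minimal sketch of the automat- example using NLTK's Porter stemmer (assumes nltk is installed; exact outputs depend on the stemmer variant, so treat the slide's automat as approximate):

```python
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()
for word in ["automates", "automatic", "automation"]:
    # per the slide, all three collapse to (roughly) the stem "automat"
    print(word, "->", stemmer.stem(word))
```

Note the stemmer needs no context or dictionary, which is also why it happily produces non-words like "exampl" and "equival" above.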
Naïve Bayes
• How to train a naïve Bayes model? How to estimate prior probabilities and conditional probabilities?
• How to apply Laplace smoothing?
Bayes' Rule Applied to Documents and Classes
• For a document d and a class c:

$$P(c \mid d) = \frac{P(d \mid c)\,P(c)}{P(d)}$$
Learning the Multinomial Naïve Bayes Model
• First attempt: maximum likelihood estimates
  • simply use the frequencies in the data
$$\hat{P}(w_i \mid c_j) = \frac{\mathrm{count}(w_i, c_j)}{\sum_{w \in V} \mathrm{count}(w, c_j)}$$

$$\hat{P}(c_j) = \frac{\mathrm{doccount}(C = c_j)}{N_{doc}}$$
Laplace (add-1) smoothing: unknown words
Add one extra word to the vocabulary, the "unknown word" $w_u$:

$$\hat{P}(w_u \mid c) = \frac{\mathrm{count}(w_u, c) + 1}{\left(\sum_{w \in V} \mathrm{count}(w, c)\right) + |V| + 1} = \frac{1}{\left(\sum_{w \in V} \mathrm{count}(w, c)\right) + |V| + 1}$$

(The second equality holds because count(w_u, c) = 0: the unknown word is never seen in training.)
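A minimal training sketch using these estimates; the four toy documents and class names are illustrative (they mirror a well-known textbook example), not from the slides:

```python
from collections import Counter, defaultdict
import math

docs = [("chinese beijing chinese", "c"), ("chinese chinese shanghai", "c"),
        ("chinese macao", "c"), ("tokyo japan chinese", "j")]

class_counts = Counter(c for _, c in docs)       # for the prior
word_counts = defaultdict(Counter)               # per-class word frequencies
for text, c in docs:
    word_counts[c].update(text.split())
vocab = {w for text, _ in docs for w in text.split()}

def log_prior(c):
    return math.log(class_counts[c] / len(docs))

def log_likelihood(w, c):
    # add-1 smoothing; the extra "+ 1" in the denominator reserves
    # probability mass for the unknown word w_u, as on the slide
    total = sum(word_counts[c].values())
    return math.log((word_counts[c][w] + 1) / (total + len(vocab) + 1))

def predict(text):
    return max(class_counts, key=lambda c: log_prior(c)
               + sum(log_likelihood(w, c) for w in text.split()))

print(predict("chinese chinese chinese tokyo japan"))   # -> 'c'
```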
Maxent and Perceptron
• What are the differences between a generative model and a discriminative model?
• What are features in a discriminative model?
• What's the relation between maxent and logistic regression?
• What's the general form of maxent?
• What's the form of a perceptron classifier?
Joint vs. Conditional Models
• We have some data {(d, c)} of paired observations d and hidden classes c.
• Joint (generative) models place probabilities over both observed data and the hidden stuff (generate the observed data from the hidden stuff): P(c, d)
• All the classic StatNLP models: n-gram models, Naive Bayes classifiers, hidden Markov models, probabilistic context-free grammars, IBM machine translation alignment models
Joint vs. Conditional Models
• Discriminative (conditional) models take the data as given, and put a probability over hidden structure given the data: P(c | d)
• Logistic regression, conditional loglinear or maximum entropy models, conditional random fields
• Also, SVMs, (averaged) perceptron, etc. are discriminative classifiers (but not directly probabilistic)
Features
• In NLP uses, usually a feature specifies:
  1. an indicator function (a yes/no boolean matching function) of properties of the input, and
  2. a particular class

$$f_i(c, d) \equiv [\Phi(d) \wedge c = c_j] \quad \text{(value is 0 or 1)}$$

• Each feature picks out a data subset and suggests a label for it
Feature-Based Linear Classifiers
• Exponential (log-linear, maxent, logistic, Gibbs) models:
• Make a probabilistic model from the linear combination $\sum_i \lambda_i f_i(c, d)$
  • P(LOCATION | in Québec) = e^{1.8} e^{−0.6} / (e^{1.8} e^{−0.6} + e^{0.3} + e^{0}) = 0.586
  • P(DRUG | in Québec) = e^{0.3} / (e^{1.8} e^{−0.6} + e^{0.3} + e^{0}) = 0.238
  • P(PERSON | in Québec) = e^{0} / (e^{1.8} e^{−0.6} + e^{0.3} + e^{0}) = 0.176
• The weights are the parameters of the probability model, combined via a "soft max" function
$$P(c \mid d, \lambda) = \frac{\exp \sum_i \lambda_i f_i(c, d)}{\sum_{c'} \exp \sum_i \lambda_i f_i(c', d)}$$

The exp makes votes positive; the denominator normalizes votes.
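A quick numeric check of the Québec example above (summed feature weights 1.8 − 0.6 for LOCATION, 0.3 for DRUG, 0 for PERSON, as on the slide):

```python
import math

scores = {"LOCATION": 1.8 - 0.6, "DRUG": 0.3, "PERSON": 0.0}
Z = sum(math.exp(s) for s in scores.values())   # soft-max normalizer
for c, s in scores.items():
    print(c, round(math.exp(s) / Z, 3))         # 0.586, 0.238, 0.176
```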
Perceptron Algorithm
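The slide carries only the title, so here is a minimal sketch of the standard binary perceptron update (labels y in {−1, +1}; the exact variant covered in lecture may differ):

```python
def perceptron_train(data, n_features, epochs=10):
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in data:                    # x: feature vector, y: gold label
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:               # mistake: push weights toward y
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b
```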
Language Modeling
• How to calculate the probability of a sentence using a language model?
• What are the main smoothing algorithms for language models?
• Extrinsic vs. intrinsic evaluation
• Intrinsic evaluation metric of language models
Bigram estimates of sentence probabilities
P(<s> I want english food </s>)
  = P(I | <s>) × P(want | I) × P(english | want) × P(food | english) × P(</s> | food)
  = .000031
An example
<s> I am Sam </s>
<s> Sam I am </s>
<s> I do not like green eggs and ham </s>

$$P(w_i \mid w_{i-1}) = \frac{c(w_{i-1}, w_i)}{c(w_{i-1})}$$
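A sketch estimating these bigram MLE probabilities from the toy corpus above:

```python
from collections import Counter

corpus = ["<s> I am Sam </s>",
          "<s> Sam I am </s>",
          "<s> I do not like green eggs and ham </s>"]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    tokens = sent.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p(w, prev):                      # P(w | prev) = c(prev, w) / c(prev)
    return bigrams[(prev, w)] / unigrams[prev]

print(p("I", "<s>"))                 # 2/3
print(p("Sam", "am"))                # 1/2
```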
Backoff and Interpolation
• Sometimes it helps to use less context
  • Condition on less context for contexts you haven't learned much about
• Backoff: use trigram if you have good evidence, otherwise bigram, otherwise unigram
• Interpolation: mix unigram, bigram, trigram
• Interpolation works better
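For reference, simple linear interpolation has the standard form (the λ's sum to one and are typically tuned on held-out data):

$$\hat{P}(w_i \mid w_{i-2} w_{i-1}) = \lambda_1 P(w_i \mid w_{i-2} w_{i-1}) + \lambda_2 P(w_i \mid w_{i-1}) + \lambda_3 P(w_i), \qquad \lambda_1 + \lambda_2 + \lambda_3 = 1$$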
Advanced smoothing algorithms
• Intuition used by many smoothing algorithms (Good-Turing, Kneser-Ney):
  • Use the count of things we've seen to help estimate the count of things we've never seen
Kneser-Ney Smoothing I ("smart" backoff)
• Better estimate for probabilities of lower-order unigrams!
  • Shannon game: I can't see without my reading ___________? (glasses, not Francisco)
  • "Francisco" is more common than "glasses"
  • ... but "Francisco" always follows "San"
• Instead of P(w) ("How likely is w?"), use P_continuation(w): "How likely is w to appear as a novel continuation?"
  • For each word, count the number of unique bigram types it completes
  • Every bigram type was a novel continuation the first time it was seen

$$P_{\text{CONTINUATION}}(w) \propto \left|\{w_{i-1} : c(w_{i-1}, w) > 0\}\right|$$
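A sketch of the continuation count, the quantity this probability is proportional to:

```python
from collections import defaultdict

def continuation_counts(tokens):
    contexts = defaultdict(set)            # word -> set of distinct left contexts
    for prev, w in zip(tokens, tokens[1:]):
        contexts[w].add(prev)
    return {w: len(ctx) for w, ctx in contexts.items()}

tokens = "we went to San Francisco and flew to San Jose".split()
counts = continuation_counts(tokens)
print(counts["Francisco"])   # 1: only ever follows "San"
print(counts["to"])          # 2: follows both "went" and "flew"
```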
Extrinsic evaluation of N-gram models
• Best evaluation for comparing models A and B: put each model in a task
  • spelling corrector, speech recognizer, MT system
• Run the task, get an accuracy for A and for B
  • How many misspelled words corrected properly
  • How many words translated correctly
• Compare accuracy for A and B
Perplexity
The best language model is one that best predicts an unseen test set (gives the highest P(sentence)). Perplexity is the inverse probability of the test set, normalized by the number of words:

$$PP(W) = P(w_1 w_2 \ldots w_N)^{-\frac{1}{N}} = \sqrt[N]{\frac{1}{P(w_1 w_2 \ldots w_N)}}$$

Chain rule:

$$PP(W) = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_1 \ldots w_{i-1})}}$$

For bigrams:

$$PP(W) = \sqrt[N]{\prod_{i=1}^{N} \frac{1}{P(w_i \mid w_{i-1})}}$$

Minimizing perplexity is the same as maximizing probability.
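A worked check on the toy corpus from the earlier bigram slide:

```python
import math

# bigram probabilities for "<s> I am Sam </s>" under the toy corpus above
probs = {("<s>", "I"): 2/3, ("I", "am"): 2/3,
         ("am", "Sam"): 1/2, ("Sam", "</s>"): 1/2}

def perplexity(tokens):
    n = len(tokens) - 1                        # number of predicted words
    logprob = sum(math.log(probs[(prev, w)])   # assumes no zero-prob bigrams
                  for prev, w in zip(tokens, tokens[1:]))
    return math.exp(-logprob / n)

print(perplexity("<s> I am Sam </s>".split()))   # (1/9)^(-1/4) ≈ 1.73
```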
Sequence Tagging
• What is sequence tagging? What are common sequence tagging problems in NLP?
• What is the form of a trigram HMM?
• What's the runtime complexity of the Viterbi algorithm for a trigram HMM?
Part-of-Speech Tagging
INPUT: Profits soared at Boeing Co., easily topping forecasts on Wall Street, as their CEO Alan Mulally announced first quarter results.
OUTPUT: Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V forecasts/N on/P Wall/N Street/N ,/, as/P their/POSS CEO/N Alan/N Mulally/N announced/V first/ADJ quarter/N results/N ./.
N = Noun, V = Verb, P = Preposition, Adv = Adverb, Adj = Adjective, ...
Named Entity Extraction as Tagging
INPUT: Profits soared at Boeing Co., easily topping forecasts on Wall Street, as their CEO Alan Mulally announced first quarter results.
OUTPUT: Profits/NA soared/NA at/NA Boeing/SC Co./CC ,/NA easily/NA topping/NA forecasts/NA on/NA Wall/SL Street/CL ,/NA as/NA their/NA CEO/NA Alan/SP Mulally/CP announced/NA first/NA quarter/NA results/NA ./NA
NA = No entity, SC = Start Company, CC = Continue Company, SL = Start Location, CL = Continue Location, ...
Why the Name?
$$p(x_1 \ldots x_n, y_1 \ldots y_n) = \underbrace{q(\text{STOP} \mid y_{n-1}, y_n) \prod_{j=1}^{n} q(y_j \mid y_{j-2}, y_{j-1})}_{\text{Markov chain}} \times \underbrace{\prod_{j=1}^{n} e(x_j \mid y_j)}_{x_j\text{'s are observed}}$$
The Viterbi Algorithm with Backpointers
Input: a sentence x_1 ... x_n, parameters q(s | u, v) and e(x | s).
Initialization: Set π(0, *, *) = 1.
Definition: S_{-1} = S_0 = {*}, S_k = S for k ∈ {1 ... n}.
Algorithm:
• For k = 1 ... n,
  • For u ∈ S_{k-1}, v ∈ S_k,
    π(k, u, v) = max_{w ∈ S_{k-2}} (π(k-1, w, u) × q(v | w, u) × e(x_k | v))
    bp(k, u, v) = argmax_{w ∈ S_{k-2}} (π(k-1, w, u) × q(v | w, u) × e(x_k | v))
• Set (y_{n-1}, y_n) = argmax_{(u, v)} (π(n, u, v) × q(STOP | u, v))
• For k = (n-2) ... 1, y_k = bp(k+2, y_{k+1}, y_{k+2})
• Return the tag sequence y_1 ... y_n
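A compact sketch of this decoder; the dict-based q and e parameters and the tag-set argument S are my own layout choices (it also assumes the sentence has at least one nonzero-probability tag sequence):

```python
def viterbi(x, S, q, e):
    """x: word list; S: tag list; q[(v, w, u)] = q(v | w, u); e[(x, s)] = e(x | s)."""
    n = len(x)
    tags = lambda k: ["*"] if k <= 0 else S          # S_{-1} = S_0 = {*}
    pi = {(0, "*", "*"): 1.0}
    bp = {}
    for k in range(1, n + 1):
        for u in tags(k - 1):
            for v in tags(k):
                best_w, best_p = None, 0.0
                for w in tags(k - 2):
                    p = (pi.get((k - 1, w, u), 0.0)
                         * q.get((v, w, u), 0.0)       # q(v | w, u)
                         * e.get((x[k - 1], v), 0.0))  # e(x_k | v)
                    if p > best_p:
                        best_w, best_p = w, p
                pi[(k, u, v)] = best_p
                bp[(k, u, v)] = best_w
    # best final tag pair, scored with q(STOP | u, v)
    best_pair, best_p = None, 0.0
    for u in tags(n - 1):
        for v in tags(n):
            p = pi.get((n, u, v), 0.0) * q.get(("STOP", u, v), 0.0)
            if p > best_p:
                best_pair, best_p = (u, v), p
    y = [None] * (n + 1)
    y[n - 1], y[n] = best_pair
    for k in range(n - 2, 0, -1):                     # follow backpointers
        y[k] = bp[(k + 2, y[k + 1], y[k + 2])]
    return y[1:]
```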
The Viterbi Algorithm: Running Time
• O(n|S|³) time to calculate q(s | u, v) × e(x_k | s) for all k, s, u, v.
• n|S|² entries in π to be filled in.
• O(|S|) time to fill in one entry.
• ⇒ O(n|S|³) time in total.
Syntactic Parsing
• What's a PCFG?
• What's the probability of a parse tree under a PCFG?
• What's the Chomsky normal form of a CFG?
• What's the runtime complexity of the CKY algorithm?
A Probabilistic Context-Free Grammar (PCFG)
S → NP VP 1.0
VP → Vi 0.4
VP → Vt NP 0.4
VP → VP PP 0.2
NP → DT NN 0.3
NP → NP PP 0.7
PP → P NP 1.0
Vi → sleeps 1.0
Vt → saw 1.0
NN → man 0.7
NN → woman 0.2
NN → telescope 0.1
DT → the 1.0
IN → with 0.5
IN → in 0.5

• Probability of a tree t with rules α₁ → β₁, α₂ → β₂, ..., αₙ → βₙ is $p(t) = \prod_{i=1}^{n} q(\alpha_i \to \beta_i)$, where $q(\alpha \to \beta)$ is the probability for rule α → β.
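A sketch scoring the tree for "the man sleeps" under this grammar; the nested-tuple tree encoding and dict layout are my own choices:

```python
# q maps (lhs, rhs-tuple) -> rule probability, from the grammar above
q = {("S", ("NP", "VP")): 1.0, ("NP", ("DT", "NN")): 0.3,
     ("VP", ("Vi",)): 0.4, ("DT", ("the",)): 1.0,
     ("NN", ("man",)): 0.7, ("Vi", ("sleeps",)): 1.0}

def tree_prob(t):
    label, *children = t
    if isinstance(children[0], str):          # preterminal: X -> word
        return q[(label, (children[0],))]
    rhs = tuple(c[0] for c in children)
    p = q[(label, rhs)]                       # rule at this node ...
    for c in children:
        p *= tree_prob(c)                     # ... times the subtrees
    return p

tree = ("S",
        ("NP", ("DT", "the"), ("NN", "man")),
        ("VP", ("Vi", "sleeps")))
print(tree_prob(tree))   # 1.0 * 0.3 * 1.0 * 0.7 * 0.4 * 1.0 = 0.084
```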
Chomsky Normal Form
A context-free grammar G = (N, Σ, R, S) in Chomsky Normal Form is as follows:
• N is a set of non-terminal symbols
• Σ is a set of terminal symbols
• R is a set of rules which take one of two forms:
  • X → Y₁ Y₂ for X ∈ N, and Y₁, Y₂ ∈ N
  • X → Y for X ∈ N, and Y ∈ Σ
• S ∈ N is a distinguished start symbol
The Full Dynamic Programming Algorithm
Input: a sentence s = x_1 ... x_n, a PCFG G = (N, Σ, S, R, q).
Initialization: For all i ∈ {1 ... n}, for all X ∈ N,
  π(i, i, X) = q(X → x_i) if X → x_i ∈ R, 0 otherwise.
Algorithm:
• For l = 1 ... (n - 1)
  • For i = 1 ... (n - l)
    • Set j = i + l
    • For all X ∈ N, calculate
      π(i, j, X) = max_{X → Y Z ∈ R, s ∈ {i ... (j-1)}} (q(X → Y Z) × π(i, s, Y) × π(s+1, j, Z))
      and
      bp(i, j, X) = argmax_{X → Y Z ∈ R, s ∈ {i ... (j-1)}} (q(X → Y Z) × π(i, s, Y) × π(s+1, j, Z))
Output: Return π(1, n, S) = max_{t ∈ T(s)} p(t), and backpointers bp which allow recovery of argmax_{t ∈ T(s)} p(t).

What's the runtime complexity?
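A sketch of the same dynamic program in code; the binary/lex rule encoding is my own choice, and the grammar must already be in CNF:

```python
def cky(words, binary, lex):
    """binary: (Y, Z) -> list of (X, prob); lex: word -> list of (X, prob)."""
    n = len(words)
    pi = {}                                   # (i, j, X) -> best probability
    bp = {}                                   # backpointers for tree recovery
    for i, w in enumerate(words, start=1):    # initialize diagonal cells
        for X, prob in lex.get(w, []):
            pi[(i, i, X)] = prob
    for l in range(1, n):                     # span length
        for i in range(1, n - l + 1):
            j = i + l
            for s in range(i, j):             # split point
                for (Y, Z), rules in binary.items():
                    left = pi.get((i, s, Y), 0.0)
                    right = pi.get((s + 1, j, Z), 0.0)
                    if left == 0.0 or right == 0.0:
                        continue
                    for X, prob in rules:
                        p = prob * left * right
                        if p > pi.get((i, j, X), 0.0):
                            pi[(i, j, X)] = p
                            bp[(i, j, X)] = (s, Y, Z)
    return pi, bp

# tiny illustrative CNF grammar (not the one from the slide, which has
# a unary nonterminal rule VP -> Vi and so is not in CNF)
binary = {("NP", "VP"): [("S", 1.0)], ("DT", "NN"): [("NP", 1.0)]}
lex = {"the": [("DT", 1.0)], "dog": [("NN", 1.0)], "barks": [("VP", 1.0)]}
pi, bp = cky("the dog barks".split(), binary, lex)
print(pi[(1, 3, "S")])   # 1.0
```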
Dependency Parsing
• Can you draw a dependency parse tree for a simple sentence?
• What is projectivity?
Dependency Grammar and Dependency Structure
• Dependency syntax postulates that syntactic structure consists of lexical items linked by binary asymmetric relations ("arrows") called dependencies
• The arrow connects a head (governor, superior, regent) with a dependent (modifier, inferior, subordinate)
• Usually, dependencies form a tree (connected, acyclic, single-head)
[Figure: dependency tree for "Bills on ports and immigration were submitted by Senator Brownback, Republican of Kansas", with arcs labeled nsubjpass, auxpass, nn, prep, pobj, cc, conj, appos]
Projectivity
• Dependencies from a CFG tree using heads must be projective
  • There must not be any crossing dependency arcs when the words are laid out in their linear order, with all arcs above the words
• But dependency theory normally does allow non-projective structures to account for displaced constituents
  • You can't easily get the semantics of certain constructions right without these non-projective dependencies
• Who did Bill buy the coffee from yesterday?
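A small sketch that tests projectivity by looking for crossing arcs; the arcs for the example sentence are one plausible non-projective analysis (head choices are illustrative):

```python
def is_projective(arcs):
    """arcs: (head, dependent) pairs over 1-based word positions."""
    for h1, d1 in arcs:
        a, b = sorted((h1, d1))
        for h2, d2 in arcs:
            c, d = sorted((h2, d2))
            if a < c < b < d:          # spans overlap without nesting: crossing
                return False
    return True

# Who(1) did(2) Bill(3) buy(4) the(5) coffee(6) from(7) yesterday(8)
arcs = [(4, 2), (4, 3), (6, 5), (4, 6), (4, 7), (7, 1), (4, 8)]
print(is_projective(arcs))   # False: the arc from "from"(7) to "Who"(1)
                             # crosses the arc from "buy"(4) to "yesterday"(8)
```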