Named Entity Tagging: thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides
TRANSCRIPT
![Page 1: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/1.jpg)
Named Entity Tagging
Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides
![Page 2: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/2.jpg)
Outline
Named Entities and the basic idea
IOB Tagging
A new classifier: Logistic Regression
• Linear regression
• Logistic regression
• Multinomial logistic regression = MaxEnt
Why classifiers aren’t as good as sequence models
A new sequence model: MEMM = Maximum Entropy Markov Model
![Page 3: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/3.jpg)
Named Entity Tagging
Slide from Jim Martin
CHICAGO (AP) — Citing high fuel prices, United Airlines said Friday it has increased fares by $6 per round trip on flights to some cities also served by lower-cost carriers. American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said. United, a unit of UAL, said the increase took effect Thursday night and applies to most routes where it competes against discount carriers, such as Chicago to Dallas and Atlanta and Denver to San Francisco, Los Angeles and New York.
![Page 4: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/4.jpg)
Named Entity Tagging
**CHICAGO** (AP) — Citing high fuel prices, **United Airlines** said Friday it has increased fares by $6 per round trip on flights to some cities also served by lower-cost carriers. **American Airlines**, a unit of **AMR**, immediately matched the move, spokesman **Tim Wagner** said. **United**, a unit of **UAL**, said the increase took effect Thursday night and applies to most routes where it competes against discount carriers, such as **Chicago** to **Dallas** and **Atlanta** and **Denver** to **San Francisco**, **Los Angeles** and **New York**.
Slide from Jim Martin
![Page 5: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/5.jpg)
Named Entity Recognition
Find the named entities and classify them by type
Typical approach
• Acquire training data
• Encode using IOB labeling
• Train a sequential supervised classifier
• Augment with pre- and post-processing using available list resources (census data, gazetteers, etc.)
Slide from Jim Martin
![Page 6: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/6.jpg)
Temporal and Numerical Expressions
Temporals
• Find all the temporal expressions
• Normalize them based on some reference point
Numerical Expressions
• Find all the expressions
• Classify by type
• Normalize
Slide from Jim Martin
![Page 7: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/7.jpg)
NE Types
Slide from Jim Martin
![Page 8: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/8.jpg)
NE Types: Examples
Slide from Jim Martin
![Page 9: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/9.jpg)
Ambiguity
Slide from Jim Martin
![Page 10: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/10.jpg)
Biomedical Entities
Disease
Symptom
Drug
Body Part
Treatment
Enzyme
Protein
Difficulty: discontiguous or overlapping mentions
Abdomen is soft, nontender, nondistended, negative bruits
![Page 11: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/11.jpg)
NER Approaches
As with partial parsing and chunking, there are two basic approaches (and hybrids)
Rule-based (regular expressions)
• Lists of names
• Patterns to match things that look like names
• Patterns to match the environments that classes of names tend to occur in
ML-based approaches
• Get annotated training data
• Extract features
• Train systems to replicate the annotation
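The three rule-based ingredients listed on this slide can be sketched in a few lines. This is only an illustration: the gazetteer entries, the two regular expressions, and the function name `rule_based_ner` are assumptions made for the example, not part of the slides.

```python
import re

# Hypothetical list of names (gazetteer); a real system would load a large resource.
GAZETTEER = {"United Airlines": "ORG", "American Airlines": "ORG", "Chicago": "LOC"}

# Pattern for things that look like names: two or more capitalized words in a row.
CAP_SEQ = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")

# Pattern for an environment in which person names tend to occur.
PERSON_CONTEXT = re.compile(r"\bspokesman ([A-Z][a-z]+ [A-Z][a-z]+)")


def rule_based_ner(text):
    """Return (mention, label) pairs found by list lookup and context patterns."""
    found = []
    for name, label in GAZETTEER.items():
        if name in text:
            found.append((name, label))
    for match in PERSON_CONTEXT.finditer(text):
        found.append((match.group(1), "PER"))
    return found


sentence = ("American Airlines, a unit of AMR, immediately matched the move, "
            "spokesman Tim Wagner said.")
print(CAP_SEQ.findall(sentence))   # name-like candidates
print(rule_based_ner(sentence))    # labeled mentions
```

Note that the name-shape pattern only proposes candidates; deciding the class still needs the list or the context pattern, which is one reason hybrid systems are common.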
Slide from Jim Martin
![Page 12: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/12.jpg)
ML Approach
Slide from Jim Martin
![Page 13: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/13.jpg)
Encoding for Sequence Labeling
We can use IOB encoding:
… United Airlines said Friday it has increased
B_ORG I_ORG O O O O O
the move , spokesman Tim Wagner said.
O O O O B_PER I_PER O
How many tags?
For N classes we have 2*N+1 tags
• An I and B for each class and one O for no-class
Each token in a text gets a tag
Can use simpler IO tagging if what?
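A minimal sketch of IOB encoding and decoding, assuming entity mentions are given as token spans `(start, end, label)` with an exclusive end. The function names and the span format are illustrative choices, not something the slides prescribe.

```python
def iob_encode(tokens, spans):
    """Turn non-overlapping (start, end, label) spans into one IOB tag per token."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = "B_" + label
        for i in range(start + 1, end):
            tags[i] = "I_" + label
    return tags


def iob_decode(tags):
    """Recover (start, end, label) spans; I_ tags are assumed to match their B_ tag."""
    spans = []
    start = label = None
    for i, tag in enumerate(tags + ["O"]):  # the extra "O" closes a trailing span
        if start is not None and not tag.startswith("I_"):
            spans.append((start, i, label))
            start = label = None
        if tag.startswith("B_"):
            start, label = i, tag[2:]
    return spans


tokens = "United Airlines said Friday it has increased".split()
tags = iob_encode(tokens, [(0, 2, "ORG")])
print(list(zip(tokens, tags)))
print(iob_decode(tags))
```

The B tag is what lets the decoder separate two adjacent mentions of the same class; IO tagging cannot represent that boundary.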
![Page 14: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/14.jpg)
NER Features
Slide from Jim Martin
![Page 15: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/15.jpg)
How to do NE tagging?
Classifiers
• Naïve Bayes
• Logistic Regression
Sequence Models
• HMMs
• MEMMs
• CRFs
Sequence models work better
![Page 16: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/16.jpg)
Linear Regression
Example from Freakonomics (Levitt and Dubner 2005)
• Fantastic/cute/charming versus granite/maple
Can we predict price from # of adjs?
![Page 17: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/17.jpg)
Linear Regression
![Page 18: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/18.jpg)
Multiple Linear Regression
Predicting values:
In general:
Let’s pretend an extra “intercept” feature f0 with value 1
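With the intercept folded in as a feature f0 = 1, a prediction is just the dot product of a weight vector and a feature vector. The sketch below assumes that standard form; the weights and feature values are made-up numbers for illustration only.

```python
def predict(weights, features):
    """Return the sum of w_i * f_i, with the intercept feature f0 = 1 prepended."""
    f = [1.0] + list(features)
    return sum(w * x for w, x in zip(weights, f))


# Hypothetical model: w0 is the intercept, w1 and w2 weight two real-valued features.
weights = [10.0, -2.0, 0.5]
print(predict(weights, [3, 4]))
```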
![Page 19: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/19.jpg)
Learning in Linear Regression
Consider one instance $x_j$
We’d like to choose weights to minimize the difference between predicted and observed value for $x_j$:
This is an optimization problem that turns out to have a closed-form solution
![Page 20: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/20.jpg)
Put the feature vectors $f^{(i)}$ of the training observations into a matrix $X$, one row per observation
Put the observed values in a vector $y$
Formula that minimizes the cost:
$W = (X^T X)^{-1} X^T y$
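For a single feature plus an intercept, $X^T X$ is a 2×2 matrix and the closed-form solution can be written out by hand. The sketch below does exactly that in plain Python, assuming each row of $X$ is `[1, x]`; the function name and the toy data are illustrative.

```python
def fit_simple_linear_regression(xs, ys):
    """Solve W = (X^T X)^{-1} X^T y for rows [1, x]; returns (w0, w1)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular (all x values are identical)")
    w0 = (sxx * sy - sx * sxy) / det
    w1 = (n * sxy - sx * sy) / det
    return w0, w1


# Toy data lying exactly on y = 1 + 2x
print(fit_simple_linear_regression([0, 1, 2, 3], [1, 3, 5, 7]))
```

With more features the same formula applies, but one would normally hand the matrix inversion to a linear algebra library rather than expand it by hand.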
![Page 21: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/21.jpg)
Logistic Regression
![Page 22: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/22.jpg)
Logistic Regression
But in these language problems we are doing classification
• Predicting one of a small set of discrete values
Could we just use linear regression for this?
![Page 23: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/23.jpg)
Logistic regression
Not possible: the result doesn’t fall between 0 and 1
Instead of predicting prob, predict ratio of probs:
but still not good: doesn’t lie between 0 and 1
So how about if we predict the log:
![Page 24: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/24.jpg)
Logistic regression
Solving this for p(y=true)
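The algebra behind this step, written out under the usual textbook assumption that the log of the odds is modeled as a linear function $w \cdot f$ of the features (the notation $w$, $f$ is assumed here):

```latex
\begin{align*}
\ln \frac{p(y=\text{true} \mid x)}{1 - p(y=\text{true} \mid x)} &= w \cdot f \\
\frac{p(y=\text{true} \mid x)}{1 - p(y=\text{true} \mid x)} &= e^{w \cdot f} \\
p(y=\text{true} \mid x) &= \bigl(1 - p(y=\text{true} \mid x)\bigr)\, e^{w \cdot f} \\
p(y=\text{true} \mid x)\,\bigl(1 + e^{w \cdot f}\bigr) &= e^{w \cdot f} \\
p(y=\text{true} \mid x) &= \frac{e^{w \cdot f}}{1 + e^{w \cdot f}} = \frac{1}{1 + e^{-w \cdot f}}
\end{align*}
```

The last expression is the logistic function applied to $w \cdot f$, which always lies strictly between 0 and 1.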
![Page 25: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/25.jpg)
Logistic function
![Page 26: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/26.jpg)
Logistic Regression
How do we do classification?
Or:
Or back to explicit sum notation:
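One common way to read the decision rule: predict "true" when p(y=true) exceeds 0.5, which is the same as checking whether the weighted sum of features is positive. The sketch below assumes that rule; the weights and feature values are invented for the example.

```python
import math


def p_true(weights, features):
    """Logistic function applied to the explicit sum over w_i * f_i."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def classify(weights, features):
    """True iff p(y=true) > 0.5, equivalently iff the weighted sum is > 0."""
    return p_true(weights, features) > 0.5


weights = [1.0, -2.0]                  # hypothetical weights, f0 = 1 is the intercept
print(classify(weights, [1.0, 0.2]))   # weighted sum is 0.6
print(classify(weights, [1.0, 1.0]))   # weighted sum is -1.0
```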
![Page 27: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/27.jpg)
Multinomial logistic regression
Multiple classes:
One change: indicator functions f(c,x) instead of real values
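A small sketch of the multi-class case, assuming the usual MaxEnt form in which each class gets a score equal to the weighted sum of its indicator features and the scores are normalized with an exponential. The two indicator functions, their weights, and the class set are hypothetical.

```python
import math


def maxent_probs(weights, feature_fns, classes, x):
    """P(c | x) proportional to exp(sum_i w_i * f_i(c, x)), normalized over classes."""
    scores = {c: sum(w * f(c, x) for w, f in zip(weights, feature_fns)) for c in classes}
    top = max(scores.values())  # subtract the max score for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exps.values())
    return {c: e / z for c, e in exps.items()}


# Hypothetical indicator features f(c, x): each fires for one class in one context.
def f_per_after_spokesman(c, x):
    return 1 if c == "PER" and x["prev"] == "spokesman" else 0


def f_org_airlines(c, x):
    return 1 if c == "ORG" and x["word"].endswith("Airlines") else 0


probs = maxent_probs([2.0, 1.5], [f_per_after_spokesman, f_org_airlines],
                     ["PER", "ORG", "O"], {"word": "Tim", "prev": "spokesman"})
print(probs)
```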
![Page 28: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/28.jpg)
Estimating the weight
Gradient Iterative Scaling
![Page 29: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/29.jpg)
Features
![Page 30: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/30.jpg)
Summary so far
Naïve Bayes Classifier
Logistic Regression Classifier
• Sometimes called MaxEnt classifiers
![Page 31: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/31.jpg)
How do we apply classification to sequences?
![Page 32: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/32.jpg)
Sequence Labeling as Classification
Classify each token independently, but use information about the surrounding tokens as input features (sliding window).
Slide from Ray Mooney
John saw the saw and decided to take it to the table.
classifier
NNP
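A sliding window can be sketched as a feature extractor that looks at the current token and a fixed number of neighbors on each side. The feature names, the padding symbol, and the default window size below are illustrative assumptions; the slides do not fix a feature set.

```python
def window_features(tokens, i, size=2):
    """Features for token i: the word itself plus up to `size` neighbors per side."""
    feats = {
        "word": tokens[i].lower(),
        "is_capitalized": tokens[i][0].isupper(),
    }
    for offset in range(-size, size + 1):
        if offset == 0:
            continue
        j = i + offset
        # Positions outside the sentence get a padding symbol.
        feats[f"word[{offset:+d}]"] = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
    return feats


tokens = "John saw the saw and decided to take it to the table .".split()
print(window_features(tokens, 3, size=1))  # the second "saw"
print(window_features(tokens, 0, size=1))  # "John", at the sentence start
```

Each token's feature dictionary is then passed to the classifier on its own, which is why the two occurrences of "saw" can only be told apart by their neighbors.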
![Page 33: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/33.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
VBD
Slide from Ray Mooney
![Page 34: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/34.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
DT
Slide from Ray Mooney
![Page 35: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/35.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
NN
Slide from Ray Mooney
![Page 36: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/36.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
CC
Slide from Ray Mooney
![Page 37: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/37.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
VBD
Slide from Ray Mooney
![Page 38: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/38.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
TO
Slide from Ray Mooney
![Page 39: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/39.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
VB
Slide from Ray Mooney
![Page 40: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/40.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
PRP
Slide from Ray Mooney
![Page 41: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/41.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
IN
Slide from Ray Mooney
![Page 42: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/42.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
DT
Slide from Ray Mooney
![Page 43: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/43.jpg)
Sequence Labeling as Classification
John saw the saw and decided to take it to the table.
classifier
NN
Slide from Ray Mooney
![Page 44: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/44.jpg)
Using Outputs as Inputs
Better input features are usually the categories of the surrounding tokens, but these are not available yet
Can use category of either the preceding or succeeding tokens by going forward or back and using previous output
Slide from Ray Mooney
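Going forward and feeding each output back in as a feature can be sketched as a simple left-to-right loop. The `toy_classifier` below is a hand-written stand-in for a trained classifier, and its rules and small lexicon are assumptions made purely so the example runs.

```python
def tag_forward(tokens, classify):
    """Tag left to right, passing the previous output tag to the classifier."""
    tags = []
    for token in tokens:
        prev_tag = tags[-1] if tags else "<s>"  # "<s>" marks the sentence start
        tags.append(classify(token, prev_tag))
    return tags


def toy_classifier(token, prev_tag):
    """Hypothetical rules: 'saw' is a noun after a determiner, otherwise a verb."""
    if token == "saw":
        return "NN" if prev_tag == "DT" else "VBD"
    lexicon = {"John": "NNP", "the": "DT", "and": "CC"}
    return lexicon.get(token, "NN")


print(tag_forward("John saw the saw".split(), toy_classifier))
```

Backward classification is the mirror image: iterate from the last token to the first and pass the tag of the following token instead.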
![Page 45: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/45.jpg)
Forward Classification
John saw the saw and decided to take it to the table.
classifier
NNP
Slide from Ray Mooney
![Page 46: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/46.jpg)
Forward Classification
NNP
John saw the saw and decided to take it to the table.
classifier
VBD
Slide from Ray Mooney
![Page 47: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/47.jpg)
Forward Classification
NNP VBD
John saw the saw and decided to take it to the table.
classifier
DT
Slide from Ray Mooney
![Page 48: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/48.jpg)
Forward Classification
NNP VBD DT
John saw the saw and decided to take it to the table.
classifier
NN
Slide from Ray Mooney
![Page 49: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/49.jpg)
Forward Classification
NNP VBD DT NN
John saw the saw and decided to take it to the table.
classifier
CC
Slide from Ray Mooney
![Page 50: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/50.jpg)
Forward Classification
NNP VBD DT NN CC
John saw the saw and decided to take it to the table.
classifier
VBD
Slide from Ray Mooney
![Page 51: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/51.jpg)
Forward Classification
NNP VBD DT NN CC VBD
John saw the saw and decided to take it to the table.
classifier
TO
Slide from Ray Mooney
![Page 52: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/52.jpg)
Forward Classification
NNP VBD DT NN CC VBD TO
John saw the saw and decided to take it to the table.
classifier
VB
Slide from Ray Mooney
![Page 53: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/53.jpg)
Backward Classification
Disambiguating “to” in this case would be even easier backward.
… DT NN
John saw the saw and decided to take it to the table.
classifier
IN
Slide from Ray Mooney
![Page 54: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/54.jpg)
Backward Classification
… IN DT NN
John saw the saw and decided to take it to the table.
classifier
PRP
Slide from Ray Mooney
![Page 55: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/55.jpg)
Backward Classification
… PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
VB
Slide from Ray Mooney
![Page 56: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/56.jpg)
Backward Classification
… VB PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
TO
Slide from Ray Mooney
![Page 57: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/57.jpg)
Backward Classification
… TO VB PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
VBD
Slide from Ray Mooney
![Page 58: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/58.jpg)
Backward Classification
… VBD TO VB PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
CC
Slide from Ray Mooney
![Page 59: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/59.jpg)
Backward Classification
… CC VBD TO VB PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
VBD
Slide from Ray Mooney
![Page 60: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/60.jpg)
Backward Classification
… VBD CC VBD TO VB PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
DT
Slide from Ray Mooney
![Page 61: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/61.jpg)
Backward Classification
… DT VBD CC VBD TO VB PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
VBD
Slide from Ray Mooney
![Page 62: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/62.jpg)
Backward Classification
… VBD DT VBD CC VBD TO VB PRP IN DT NN
John saw the saw and decided to take it to the table.
classifier
NNP
Slide from Ray Mooney
![Page 63: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/63.jpg)
NER as Sequence Labeling
![Page 64: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/64.jpg)
Why classifiers aren’t as good as sequence models
![Page 65: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/65.jpg)
Problems with using Classifiers for Sequence Labeling
It’s not easy to integrate information from hidden labels on both sides
We make a hard decision on each token
• We’d rather choose a global optimum
• The best labeling for the whole sequence
• Keeping each local decision as just a probability, not a hard decision
![Page 66: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/66.jpg)
Probabilistic Sequence Models
Probabilistic sequence models allow integrating uncertainty over multiple, interdependent classifications and collectively determine the most likely global assignment
Two standard models
• Hidden Markov Model (HMM)
• Conditional Random Field (CRF)
• Maximum Entropy Markov Model (MEMM) is a simplified version of CRF
![Page 67: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/67.jpg)
HMMs vs. MEMMs
Slide from Jim Martin
![Page 68: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/68.jpg)
HMMs vs. MEMMs
Slide from Jim Martin
![Page 69: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/69.jpg)
HMMs vs. MEMMs
Slide from Jim Martin
![Page 70: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/70.jpg)
HMM (top) and MEMM (bottom)
![Page 71: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/71.jpg)
Viterbi in MEMMs
We condition on the observation AND the previous state:
HMM decoding:
$$v_t(j) = \max_{i=1}^{N} v_{t-1}(i)\, a_{ij}\, b_j(o_t)$$
Which is the HMM version of:
$$v_t(j) = \max_{i=1}^{N} v_{t-1}(i)\, P(s_j \mid s_i)\, P(o_t \mid s_j)$$
MEMM decoding:
$$v_t(j) = \max_{i=1}^{N} v_{t-1}(i)\, P(s_j \mid s_i, o_t)$$
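The MEMM recurrence can be sketched in a few lines. This is a minimal illustration, not the implementation behind the slides: `memm_viterbi`, `toy_local_prob`, the start symbol `"<s>"`, and every probability in the toy table are assumptions made up for the example. In a real MEMM, `local_prob` would be a trained MaxEnt classifier returning P(state | previous state, observation).

```python
def memm_viterbi(observations, states, local_prob, start="<s>"):
    """Most probable state sequence for a non-empty observation list.

    local_prob(state, prev_state, obs) is assumed to return
    P(state | prev_state, obs), as a MaxEnt classifier would.
    """
    # v[t][s]: probability of the best path that ends in state s at time t
    v = [{s: local_prob(s, start, observations[0]) for s in states}]
    backptr = [{}]
    for t in range(1, len(observations)):
        obs = observations[t]
        v.append({})
        backptr.append({})
        for s in states:
            # Score of reaching s from each possible previous state
            scores = {p: v[t - 1][p] * local_prob(s, p, obs) for p in states}
            best_prev = max(states, key=lambda p: scores[p])
            v[t][s] = scores[best_prev]
            backptr[t][s] = best_prev
    # Follow back-pointers from the best final state
    path = [max(states, key=lambda s: v[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return path

def toy_local_prob(state, prev_state, word):
    """Hand-set stand-in for a classifier; uses capitalization only."""
    capitalized = word[0].isupper()
    if prev_state in ("B", "I"):
        dist = ({"B": 0.10, "I": 0.70, "O": 0.20} if capitalized
                else {"B": 0.05, "I": 0.05, "O": 0.90})
    else:  # previous tag is O or the start symbol, so I is not allowed
        dist = ({"B": 0.80, "I": 0.00, "O": 0.20} if capitalized
                else {"B": 0.05, "I": 0.00, "O": 0.95})
    return dist[state]

tags = memm_viterbi(["United", "Airlines", "said"], ["B", "I", "O"], toy_local_prob)
print(tags)
```

Unlike the HMM version, there is no separate transition and emission term: a single conditional probability covers both the previous state and the current observation.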
![Page 72: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/72.jpg)
Decoding in MEMMs
![Page 73: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/73.jpg)
Evaluation Metrics
![Page 74: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/74.jpg)
Precision
Precision: how many of the names we returned are really names?
Recall: how many of the names in the text did we find?
$$\text{Precision} = \frac{\text{Number of correct names given by system}}{\text{Total number of names given by system}}$$
$$\text{Recall} = \frac{\text{Number of correct names given by system}}{\text{Total number of actual names in the text}}$$
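The two ratios are simple to compute once system output and gold annotations are in a comparable form. The sketch below assumes each name is a `(start, end, type)` tuple and counts a name as correct only on an exact match of span and type. That representation, the function name `precision_recall`, and the sample data are all assumptions for illustration.

```python
def precision_recall(predicted, gold):
    """Precision and recall over sets of (start, end, type) entity tuples."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # exact span-and-type matches
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical annotations: four names in the text, three returned by the system
gold = {(0, 2, "ORG"), (5, 6, "PER"), (9, 10, "LOC"), (12, 13, "LOC")}
predicted = {(0, 2, "ORG"), (5, 6, "ORG"), (9, 10, "LOC")}

p, r = precision_recall(predicted, gold)
print(p, r)
```

Two of the three returned names are correct, so precision is 2/3. Two of the four actual names were found, so recall is 1/2. The name with the right span but the wrong type counts against both.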
![Page 75: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/75.jpg)
F-measure
F-measure is a way to combine these:
$$F = \frac{2PR}{P + R}$$
More generally:
$$F_\beta = \frac{(\beta^2 + 1)\,PR}{\beta^2 P + R}$$
![Page 76: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/76.jpg)
F-measure
Harmonic mean is the reciprocal of the arithmetic mean of reciprocals:
$$\text{HarmonicMean}(a_1, a_2, \ldots, a_n) = \frac{n}{\frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n}}$$
Hence F-measure is:
$$F = \frac{2}{\frac{1}{P} + \frac{1}{R}} = \frac{2PR}{P + R}$$
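A short check confirms that the balanced F-measure equals the harmonic mean of precision and recall. The function names `f_measure` and `harmonic_mean`, and the sample values, are illustrative assumptions. The `beta` parameter follows the usual weighted form, where values above 1 favor recall.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(values) / sum(1.0 / v for v in values)

def f_measure(precision, recall, beta=1.0):
    """Weighted F-measure; beta=1 gives the balanced F1 score."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when nothing is correct
    b2 = beta * beta
    return (b2 + 1) * precision * recall / (b2 * precision + recall)

p, r = 0.5, 1.0  # made-up precision and recall
f1 = f_measure(p, r)
f2 = f_measure(p, r, beta=2.0)
print(f1, harmonic_mean([p, r]), f2)
```

With these values, F1 and the harmonic mean agree, and F2 is higher than F1 because recall is the stronger of the two scores.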
![Page 77: Named Entity Tagging Thanks to Dan Jurafsky, Jim Martin, Ray Mooney, Tom Mitchell for slides](https://reader035.vdocument.in/reader035/viewer/2022062806/56649eb35503460f94bbb1f5/html5/thumbnails/77.jpg)
Outline
Named Entities and the basic idea
IOB Tagging
A new classifier: Logistic Regression
  Linear regression
  Logistic regression
  Multinomial logistic regression = MaxEnt
Why classifiers aren’t as good as sequence models
A new sequence model:
  MEMM = Maximum Entropy Markov Model