Algorithms for NLP (demo.clab.cs.cmu.edu/algo4nlp20/slides/SP20 IITP lecture 10 -- HM…)


Page 1: Algorithms for NLPdemo.clab.cs.cmu.edu/algo4nlp20/slides/SP20 IITP lecture 10 -- HM… · smoothing. Rows are labeled with the conditioning event; thus BIND) is 0.7968. NN DT Janet


Yulia Tsvetkov

Algorithms for NLP

IITP, Spring 2020

HMMs, POS tagging, NER

Plan

▪ POS tagging recap
▪ HMMs, Viterbi
▪ HMMs+
    ▪ dealing with UNKs
    ▪ 3-gram HMMs
    ▪ multilingual POS tagging
▪ Featurizing HMMs
▪ MEMM, CRF
▪ NER
▪ HMMs in speech recognition


https://universaldependencies.org


Levels of linguistic knowledge

Slide credit: Noah Smith

Sequence Labeling

▪ map a sequence of words to a sequence of labels
▪ Part-of-speech tagging (Church, 1988; Brants, 2000)
▪ Named entity recognition (Bikel et al., 1999)
▪ Text chunking and shallow parsing (Ramshaw and Marcus, 1995)
▪ Word alignment of parallel text (Vogel et al., 1996)
▪ Compression (Conroy and O’Leary, 2001)
▪ Acoustic models, discourse segmentation, etc.


Sequence labeling as classification
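One simple reading of this slide title is to label each token independently, as its own classification problem. The sketch below uses a most-frequent-tag baseline, which is an assumption chosen for illustration and not a method the slides specify. The function names, the toy training data, and the default tag for unseen words are all hypothetical.

```python
from collections import Counter, defaultdict


def train_most_frequent_tag(tagged_sentences):
    """Map each lower-cased word to the tag it carries most often in training."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {word: tag_counts.most_common(1)[0][0]
            for word, tag_counts in counts.items()}


def tag(words, model, default_tag="NN"):
    """Label every token on its own; unseen words get default_tag (an assumption)."""
    return [model.get(w.lower(), default_tag) for w in words]


# Toy training data, invented for this example.
train = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("bill", "NN"), ("passed", "VBD")],
    [("Janet", "NNP"), ("will", "MD"), ("back", "VB"), ("the", "DT"), ("bill", "NN")],
]
model = train_most_frequent_tag(train)
print(tag(["Janet", "will", "back", "the", "dog"], model))
# ['NNP', 'MD', 'VB', 'DT', 'NN']
```

Because each decision ignores the neighbouring tags, a classifier of this kind cannot use context to resolve ambiguous words, which is one motivation for sequence models such as HMMs.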


the future is independent of the past given the present
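Written as an equation, this statement is the first-order Markov assumption. The notation is an assumption, since the slide text gives only the verbal form: q_i stands for the state (for example, the POS tag) at position i.

```latex
P(q_i \mid q_1, \dots, q_{i-1}) = P(q_i \mid q_{i-1})
```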


[Figure: observation sequence o1, o2, ..., on]
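In POS tagging with an HMM, the observations o1 ... on are the words and the hidden states are the tags. The following is a minimal sketch of Viterbi decoding for a first-order HMM, not the lecture's own implementation. The tag set, the probability tables, and the function name are hypothetical toy values made up for this example.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for obs under a first-order HMM."""

    def log(p):
        # Log space avoids underflow; zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s].get(obs[0], 0.0)) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # Pick the predecessor state that maximizes the path score into s.
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(obs[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Toy model; every number here is invented for illustration.
states = ["DT", "NN", "VB"]
start_p = {"DT": 0.8, "NN": 0.1, "VB": 0.1}
trans_p = {
    "DT": {"DT": 0.05, "NN": 0.9, "VB": 0.05},
    "NN": {"DT": 0.1, "NN": 0.3, "VB": 0.6},
    "VB": {"DT": 0.5, "NN": 0.4, "VB": 0.1},
}
emit_p = {
    "DT": {"the": 1.0},
    "NN": {"dog": 0.6, "barks": 0.4},
    "VB": {"dog": 0.1, "barks": 0.9},
}
print(viterbi(["the", "dog", "barks"], states, start_p, trans_p, emit_p))
# ['DT', 'NN', 'VB']
```

The table has one cell per (position, state) pair and each cell looks at every predecessor state, so the running time grows linearly with sentence length and quadratically with the number of tags.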


ssssssssppppeeeeeeetshshshshllllaeaeaebbbbb

“speech lab”
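The string above reads like one phone label per acoustic frame, where each phone spans several consecutive frames before the sequence is reduced to the words "speech lab". That reading is an assumption. Under it, the reduction step can be sketched as collapsing runs of identical frame labels. The phone symbols, the run lengths, and the function name below are hypothetical and do not reproduce the slide's exact string.

```python
from itertools import groupby


def collapse_frames(frame_labels):
    """Collapse each run of identical consecutive frame labels into one label."""
    return [label for label, _ in groupby(frame_labels)]


# Illustrative frame-level labels; each phone is held for several frames.
frames = (["s"] * 8 + ["p"] * 4 + ["iy"] * 3 + ["ch"] * 4
          + ["l"] * 4 + ["ae"] * 3 + ["b"] * 5)
print(collapse_frames(frames))
# ['s', 'p', 'iy', 'ch', 'l', 'ae', 'b']
```

A plain run-length collapse also merges two genuinely adjacent occurrences of the same phone, so a real decoder needs more than this step.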
