Deep Learning Methods for Natural Language Processing

Garrett Hoffman, Director of Data Science @ StockTwits

Learning Distributed Representations of Words with Word2Vec

Sparse Representation

Sparse Representation Drawbacks

Distributed Representation

Word2Vec

“Distributed Representations of Words and Phrases and their Compositionality”, Mikolov et al. (2013)

Word2Vec - Generating Data

McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.

Word2Vec - Skip-gram Network Architecture

McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.

Word2Vec - Embedding Layer

McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.

Word2Vec - Skip-gram Network Architecture

McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.

Word2Vec - Output Layer

McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.

Word2Vec - Intuition

McCormick, C. (2017, January 11). Word2Vec Tutorial Part 2 - Negative Sampling.

Word2Vec - Negative Sampling

McCormick, C. (2017, January 11). Word2Vec Tutorial Part 2 - Negative Sampling.

Word2Vec - Results

https://www.tensorflow.org/tutorials/word2vec

Pre-Trained Word Embedding

Doc2Vec

“Distributed Representations of Sentences and Documents”

Recurrent Neural Networks and their Variants

Sequence Models

Recurrent Neural Networks (RNNs)

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Long-Term Dependency Problem

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Long Short-Term Memory (LSTM)

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTM - Forget Gate

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTM - Learn Gate

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTM - Update Gate

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

LSTM - Output Gate

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Gated Recurrent Unit (GRU)

http://colah.github.io/posts/2015-08-Understanding-LSTMs/

Types of RNNs

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

LSTM Network Architecture

Learning Embeddings End-to-End

Dropout

Bidirectional LSTM

http://colah.github.io/posts/2015-09-NN-Types-FP/

Convolutional Neural Networks for Language Tasks

Computer Vision Models

Convolutional Neural Networks (CNNs)

http://colah.github.io/posts/2014-07-Conv-Nets-Modular/

CNNs - Convolution Function

Input Vector:

0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1

Kernel / Filter:

0 0 2
1 2 0
1 2 2

CNNs - Convolution Function

Input Vector:

0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1

Kernel / Filter:

0 0 0
1 0 0
0 2 0

Output Vector:

2 3 4 3
0 1 1 1
1 2 2 2
2 2 3 3
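
The convolution on this slide can be reproduced with a short sketch. It assumes stride 1, no padding, and no kernel flip (cross-correlation, the usual convention in deep learning libraries); these settings are not stated on the slide but are consistent with its 4x4 output. The function name `conv2d_valid` is hypothetical.

```python
def conv2d_valid(x, k):
    """Slide kernel k over x (stride 1, no padding, no kernel flip)."""
    kh, kw = len(k), len(k[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            # Element-wise product of the kernel and the current window, summed.
            sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Input and kernel copied from the slide.
input_matrix = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 1, 2],
    [0, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1],
]
kernel = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 2, 0],
]

output = conv2d_valid(input_matrix, kernel)
for row in output:
    print(*row)
# 2 3 4 3
# 0 1 1 1
# 1 2 2 2
# 2 2 3 3
```

A 6x6 input with a 3x3 kernel yields a 4x4 output because the kernel fits in 6 - 3 + 1 = 4 positions along each axis.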

CNNs - Max Pooling Function

Input Vector:

2 3 4 3
0 1 1 1
1 2 2 2
2 2 3 3

Output Vector:

3 4
2 3
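
The pooling step on this slide can be sketched the same way. The sketch assumes a 2x2 window with stride 2, which is not stated on the slide but matches the reduction from a 4x4 input to a 2x2 output. The function name `max_pool2d` is hypothetical.

```python
def max_pool2d(x, size=2, stride=2):
    """Take the maximum over each size x size window, moving by `stride`."""
    out_h = (len(x) - size) // stride + 1
    out_w = (len(x[0]) - size) // stride + 1
    return [
        [
            max(
                x[i * stride + a][j * stride + b]
                for a in range(size)
                for b in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Feature map copied from the slide (the output of the convolution step).
feature_map = [
    [2, 3, 4, 3],
    [0, 1, 1, 1],
    [1, 2, 2, 2],
    [2, 2, 3, 3],
]

pooled = max_pool2d(feature_map)
for row in pooled:
    print(*row)
# 3 4
# 2 3
```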

Convolutional Neural Networks (CNNs)

CNN Architecture for Text

State of the Art in NLP - Generalized Language Models

Generalized Language Modeling

Types of RNNs

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Generalized Language Modeling

$P(w_n \mid w_1, \ldots, w_{n-1})$
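
As a reading aid for the conditional probability on this slide: by the chain rule of probability, a model that predicts each next word from the words before it also assigns a probability to a whole sequence. The index $i$ is introduced here for illustration and does not appear on the slide.

```latex
P(w_1, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})
```

For $i = 1$ the conditioning context is empty, so the first factor is simply $P(w_1)$.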

Current SOTA

ULMFiT

http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html

ULMFiT - GLM Pre-Training

AWD-LSTM

ULMFiT - Refine GLM for Target Task

Discriminative Fine-Tuning

Slanted Triangular Learning Rates (STLR)

ULMFiT - Target Task Classification Training

Concat Pooling

Gradual Unfreeze

BERT / GPT-2 - Transformer Model

Transformer Model

Attention Mechanism

http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html

Transformer Model

“Attention Is All You Need”, Vaswani et al. (2017)

Transformer Model

http://mlexplained.com/2017/12/29/attention-is-all-you-need-explained/

Practical Considerations for Modeling with Your Data

Practical Considerations
