nlp: overview, non-dnn approaches, computational linguistics

NLP: overview, non-DNN approaches, computational linguistics Jonathan K. Kummerfeld Postdoctoral Research Fellow, CSE


Post on 27-Apr-2022


Page 1: NLP: overview, non-DNN approaches, computational linguistics

NLP: overview, non-DNN approaches, computational linguistics

Jonathan K. Kummerfeld

Postdoctoral Research Fellow, CSE

Page 2: NLP: overview, non-DNN approaches, computational linguistics

Part 1: Introduction

Page 3: NLP: overview, non-DNN approaches, computational linguistics

Who am I?

Researcher in NLP

First paper, 2008, on detecting non-compositional verb-particle constructions, e.g., “ferret out” and “hand in”

Since then, topics including:
- Parsing
- Dialogue
- Crowdsourcing
- Cybercrime analysis
- Playing Diplomacy

Page 4: NLP: overview, non-DNN approaches, computational linguistics

What is this lecture about?

An overview of:

• Natural Language Processing (NLP)

• Computational Linguistics (CL)

Note: For NLP, Lectures 2 and 9 covered what; this lecture covers how.

Page 5: NLP: overview, non-DNN approaches, computational linguistics

Recap: NLP in Conversational AI

Review: Conversational Flow

1. Spoken language (i.e., sound)

2. Automated Speech Recognition (ASR) produces the utterance: “Please pay Dr. Leach $1000.”

3. Natural language understanding
- Intent Classification: transfer_money
- Slot mapping: recipient: “Dr. Leach”, amount: “$1000”

4. Business Logic: Deduct $1000 from account, add $1000 to recipient account

5. Response Generation (Template Responses)
- “OK, I gave Dr. Leach $1000.”
- “Sorry fam, you don’t have enough cash”

6. Text-to-speech (TTS)
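The understanding, business logic, and response steps of this flow can be sketched in a few lines of Python. This is only an illustration under assumptions: the keyword list, the regular expression, and the function names (`classify_intent`, `map_slots`, `respond`) are made up for the example, and a real assistant would use learned models rather than hand-written patterns.

```python
import re

# Hypothetical keyword rule for a single intent (not from the slides).
INTENT_KEYWORDS = {"transfer_money": ("pay", "send", "transfer")}

# Template responses, keyed by whether the transfer succeeded.
RESPONSES = {
    True: "OK, I gave {recipient} {amount}.",
    False: "Sorry fam, you don't have enough cash",
}


def classify_intent(utterance):
    """Return the first intent whose keyword appears in the utterance."""
    words = re.findall(r"[a-z]+", utterance.lower())
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in keywords for word in words):
            return intent
    return None


def map_slots(utterance):
    """Pull out the recipient and amount with a simple pattern."""
    match = re.search(r"pay (?P<recipient>.+?) (?P<amount>\$\d+)", utterance)
    return match.groupdict() if match else {}


def respond(utterance, balance):
    """Run understanding, business logic, and template response generation."""
    if classify_intent(utterance) != "transfer_money":
        return None, balance
    slots = map_slots(utterance)
    if not slots:
        return None, balance
    amount = int(slots["amount"].lstrip("$"))
    ok = amount <= balance
    if ok:
        balance -= amount  # deduct from the sender's account
    return RESPONSES[ok].format(**slots), balance
```

Calling `respond("Please pay Dr. Leach $1000.", 1500)` returns the success template and a balance of 500; with a balance of 200 it returns the "not enough cash" template and leaves the balance unchanged.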

Page 6: NLP: overview, non-DNN approaches, computational linguistics

Recap: Other NLP Applications

Applications: Text Classification (www.wired.com)

Applications: Sentiment/Opinion

Applications: Machine Translation

Page 7: NLP: overview, non-DNN approaches, computational linguistics

Part 2: Natural Language Processing

Page 8: NLP: overview, non-DNN approaches, computational linguistics

Example task: Sentiment Analysis

(+) This movie was great! Would watch again

(+) The movie was gross and overwrought, but I liked it

(-) This movie was not really very enjoyable

Example from Bo Pang, Lillian Lee, Shivakumar Vaithyanathan (2002), via Greg Durrett’s Lecture Slides for CS378 at UT Austin

Page 9: NLP: overview, non-DNN approaches, computational linguistics

Key components of NLP Systems

Disclaimer: These slides focus on widely used forms of supervised learning.

There is an enormous range of other methods that don't fit neatly into this framework. Take some AI / ML classes and you’ll learn about them!

Page 10: NLP: overview, non-DNN approaches, computational linguistics

Key components of NLP Systems

Data: Examples of the language phenomena we want our system to handle

Model: A function that maps (input, output) pairs to scores

Inference Method: A way to make a prediction for an example given a Model

Learning Method: A way to update a Model given Data and an Inference Method

Page 11: NLP: overview, non-DNN approaches, computational linguistics

Rule-based Methods

Data: None in theory, but rules are hard to write without examples

Model: A set of rules
If the review contains “good” return positive

Inference Method: Code to run the rules
    if 'enjoyable' in review:
        return True

Learning Method: People edit the rules
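A rule set like this can be run with very little code. The sketch below is an assumption-laden illustration: the word list and the function name `rule_based_sentiment` are invented for the example, and matching is by plain substring.

```python
# Hypothetical word list; on the slide, people write and edit rules like these.
POSITIVE_WORDS = ("good", "great", "enjoyable", "liked")


def rule_based_sentiment(review):
    """Return '+' if any positive word appears in the review, else '-'."""
    text = review.lower()
    for word in POSITIVE_WORDS:
        if word in text:
            return "+"
    return "-"  # default when no rule fires (an arbitrary choice)
```

On the three example reviews this returns "+" every time, so "This movie was not really very enjoyable" is labeled incorrectly: the rule sees "enjoyable" and ignores "not". Fixing that means a person editing the rules, which is the Learning Method for this approach.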

Page 12: NLP: overview, non-DNN approaches, computational linguistics

Example task: Sentiment Analysis

(+) This movie was great! Would watch again

(+) The movie was gross and overwrought, but I liked it

(-) This movie was not really very enjoyable

Example from Bo Pang, Lillian Lee, Shivakumar Vaithyanathan (2002), via Greg Durrett’s Lecture Slides for CS378 at UT Austin

Page 13: NLP: overview, non-DNN approaches, computational linguistics

Linear Models

Data: A set of examples with their true labels
(+) This movie was great! Would watch again
(+) The movie was gross and overwrought, but I liked it
(-) This movie was not really very enjoyable

Model: A set of features and weights
“liked”: +1.2, “gross”: -1.4, “gross AND liked”: +0.3, …

Inference Method: A range of methods
    prediction = sign[sum of the weights of the features in the review]

Learning Method: A range of methods - this is Machine Learning
    if prediction is wrong:
        for each feature used:
            weight of feature += sign of answer
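The inference and learning rules above can be written out as a small runnable sketch. It is a perceptron-style learner under stated assumptions: each distinct lowercase word is one feature, labels are +1 and -1, and the names `features`, `predict`, and `train` are chosen for the example rather than taken from the slides.

```python
from collections import defaultdict

# The three labeled reviews from the slide (+1 for positive, -1 for negative).
EXAMPLES = [
    ("This movie was great! Would watch again", +1),
    ("The movie was gross and overwrought, but I liked it", +1),
    ("This movie was not really very enjoyable", -1),
]


def features(review):
    """One feature per distinct lowercase word (an assumed feature set)."""
    return {word.strip("!,.").lower() for word in review.split()}


def predict(weights, review):
    """Inference: the sign of the summed weights of the review's features."""
    total = sum(weights[f] for f in features(review))
    return 1 if total > 0 else -1


def train(examples, epochs=10):
    """Learning: when a prediction is wrong, nudge each used feature's weight."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for review, label in examples:
            if predict(weights, review) != label:
                for f in features(review):
                    weights[f] += label
    return weights
```

After `weights = train(EXAMPLES)`, all three reviews are classified correctly. The learned weights are specific to this tiny dataset and would not generalize; "gross", for instance, ends up with a positive weight because it only appears in a positive review.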

Page 14: NLP: overview, non-DNN approaches, computational linguistics

Example task: Sentiment Analysis

(+) The movie was gross and overwrought, but I liked it

(-) I liked the start, but overall it was too gross

Example from Bo Pang, Lillian Lee, Shivakumar Vaithyanathan (2002), via Greg Durrett’s Lecture Slides for CS378 at UT Austin

Page 15: NLP: overview, non-DNN approaches, computational linguistics

Non-Linear Models

Data: A set of examples with their true labels
(+) This movie was great! Would watch again
(+) The movie was gross and overwrought, but I liked it
(-) This movie was not really very enjoyable

Model: A set of weights and structure of the neural network

Learning Method: A range of methods - this is Machine Learning

Embedded slide: Language Preprocessing: Why?

• Curating and cleaning data is critically important
• Statistical models, classical machine learning, and deep learning are all based on the same type of data as inputs

“Do my taxes.” → ? (transfer, Withdraw, Fill_form, Delete_form, Hello)

Page 16: NLP: overview, non-DNN approaches, computational linguistics

Method Comparison

Rule-based
- Data: None in theory, but rules are hard to write without examples
- Model: A set of rules
- Inference Method: Code to run the rules
- Learning Method: People edit the rules

Linear
- Data: A set of examples with their true labels
- Model: A set of features and weights
- Inference Method: A range of methods
- Learning Method: A range of methods - this is Machine Learning

Non-Linear
- Data: A set of examples with their true labels
- Model: A set of weights and structure of the neural network
- Inference Method: A range of methods
- Learning Method: A range of methods - this is Machine Learning

Page 17: NLP: overview, non-DNN approaches, computational linguistics

Data types

Data: (input, output)

Input - Various sized pieces of text

Output - Example
- A set of labels: (sentence, sentiment)
- Structured: (paragraph, argument structure)
- Free text: (document, summary)

Page 18: NLP: overview, non-DNN approaches, computational linguistics

Data sizes

Language modeling: 1 million - 30 trillion+ words

Translation: 60 million words from each of 21 languages

Sentiment: 12,000 sentences

Sentence Structure: 40,000 sentences

Page 19: NLP: overview, non-DNN approaches, computational linguistics

Models are functions that take an (input, label) pair and return a score

    def model(text, label):
        . . . do stuff . . .
        return score

[Figure: plot with axes Inputs, Scores with label 1, Scores with label 2]

Page 20: NLP: overview, non-DNN approaches, computational linguistics

Models are functions that take an (input, label) pair and return a score

[Figure: plot with axes Inputs, Scores with label 1, Scores with label 2]

$\mathrm{score}_1 = a_1 x + k_1$

$\mathrm{score}_2 = a_2 x + k_2$

Page 21: NLP: overview, non-DNN approaches, computational linguistics

Models are functions that take an (input, label) pair and return a score

[Figure: plot with axes Input 1, Input 2; Scores with label 1 ???; Scores with label 2 ???]

$\mathrm{score}_1 = a_1 x_1 + b_1 x_2 + k_1$

$\mathrm{score}_2 = a_2 x_1 + b_2 x_2 + k_2$

Page 22: NLP: overview, non-DNN approaches, computational linguistics

Models are functions that take an (input, label) pair and return a score

[Figure: plot with axes Input 1, Input 2, Input 3; Scores with label 1 ???; Scores with label 2 ???]

$\mathrm{score}_1 = a_1 x_1 + b_1 x_2 + k_1$

$\mathrm{score}_2 = a_2 x_1 + b_2 x_2 + k_2$

Page 23: NLP: overview, non-DNN approaches, computational linguistics

Models are functions that take an (input, label) pair and return a score

    def model(text, label):
        . . . do stuff . . .
        return score

$x_1 = \begin{cases} 1 & \text{if ‘good’ in text} \\ 0 & \text{otherwise} \end{cases}$
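Putting the indicator feature together with a per-label linear score gives one possible way to fill in the "do stuff" part. This is a minimal sketch, assuming a single feature and two labels; the label names and the numbers in `PARAMS` are invented for illustration.

```python
# Hypothetical weights: one (a, k) pair per label, as in score = a * x1 + k.
PARAMS = {
    "positive": {"a": 2.0, "k": -0.5},
    "negative": {"a": -2.0, "k": 0.5},
}


def x1(text):
    """Indicator feature: 1 if 'good' appears in the text, 0 otherwise."""
    return 1 if "good" in text else 0


def model(text, label):
    """Score one (input, label) pair with a linear function of the feature."""
    params = PARAMS[label]
    score = params["a"] * x1(text) + params["k"]
    return score
```

With these made-up numbers, `model("a good movie", "positive")` is 1.5 and `model("a good movie", "negative")` is -1.5, so the positive label scores higher whenever "good" is present.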

Page 24: NLP: overview, non-DNN approaches, computational linguistics

Check every output option?

Up to thousands of labels - all good.

Exponential set of labels - hmmm
- Translation: all possible sentences
- Parsing: $2^{(\text{words squared})}$

Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711
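Checking every output option amounts to scoring each label and keeping the best one. The sketch below shows that, plus the slide's rough count for parsing, to make the blow-up concrete. `toy_model` is a stand-in scorer invented for the example, and `parse_output_bound` simply evaluates 2 ** (words squared) as written on the slide.

```python
def predict(model, text, labels):
    """Inference by enumeration: score every label and keep the best."""
    return max(labels, key=lambda label: model(text, label))


def toy_model(text, label):
    """Stand-in scorer (an assumption): reward 'good' for the positive label."""
    has_good = 1 if "good" in text else 0
    return has_good if label == "positive" else 1 - has_good


def parse_output_bound(num_words):
    """The slide's rough count of parsing outputs: 2 ** (words squared)."""
    return 2 ** (num_words ** 2)
```

Enumeration is fine for `predict(toy_model, "a good movie", ["positive", "negative"])`, but `parse_output_bound(10)` is already 2 ** 100, far too many outputs to score one by one.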

Page 25: NLP: overview, non-DNN approaches, computational linguistics

Algorithms to the rescue!

Greedy Search
Best(word1), Best(word2 given word1), …

A* Search

Dynamic Programming
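To contrast two of these strategies, here is a small sketch of greedy search and a Viterbi-style dynamic program over a sequence of word choices. It assumes the score of each word depends only on the previous word; the candidate words and the integer scores in `SCORES` are invented, and A* search is not shown.

```python
def greedy(candidates, score):
    """Pick the best word at each position given only the previous choice."""
    sequence, prev, total = [], None, 0
    for options in candidates:
        best = max(options, key=lambda word: score(prev, word))
        total += score(prev, best)
        sequence.append(best)
        prev = best
    return sequence, total


def dynamic_programming(candidates, score):
    """Viterbi-style search: keep the best path ending in each word."""
    # best maps a word to (total score, path) for the best path ending there.
    best = {word: (score(None, word), [word]) for word in candidates[0]}
    for options in candidates[1:]:
        new_best = {}
        for word in options:
            prev_word = max(best, key=lambda p: best[p][0] + score(p, word))
            prev_total, prev_path = best[prev_word]
            new_best[word] = (prev_total + score(prev_word, word), prev_path + [word])
        best = new_best
    total, path = max(best.values(), key=lambda item: item[0])
    return path, total


# Hypothetical scores for a two-word output; unlisted pairs score 0.
SCORES = {
    (None, "the"): 10, (None, "a"): 9,
    ("the", "cat"): 1, ("the", "dog"): 2,
    ("a", "cat"): 10, ("a", "dog"): 3,
}


def pair_score(prev, word):
    """Look up the score of a word given the previous word."""
    return SCORES.get((prev, word), 0)
```

With `candidates = [["the", "a"], ["cat", "dog"]]`, greedy search commits to "the" first and ends with "the dog" (total 12), while dynamic programming finds "a cat" (total 19). Greedy is faster but can miss the best overall output.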

Page 26: NLP: overview, non-DNN approaches, computational linguistics

Core idea: Apply the model, see what it gets wrong, update it to fix the mistake

[Figure: plot with axes Inputs, Scores with label 1, Scores with label 2]

Page 27: NLP: overview, non-DNN approaches, computational linguistics

Core idea: Apply the model, see what it gets wrong, update it to fix the mistake

$\mathrm{score}_1 = a_1 x_1 + b_1 x_2 + \dots$

How many examples at once?
1, a few, all

How much should a, b, etc. change?
+/-1, vary based on error, vary based on previous updates
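Both questions show up as parameters in a training loop. The sketch below is one way to expose them, assuming mistake-driven updates on a linear model with labels +1 and -1: `batch_size` controls how many examples are looked at before the weights change, and `step` controls how far they move. The two step rules are illustrative choices, and steps that depend on previous updates are not shown.

```python
def score(weights, features):
    """Linear score: the sum of weight * value over the example's features."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def fixed_step(margin):
    """Always change each weight by +/-1 times the feature value."""
    return 1.0


def error_scaled_step(margin):
    """Illustrative rule: bigger mistakes get bigger updates."""
    return 1.0 + abs(margin)


def train(examples, batch_size=1, step=fixed_step, epochs=5):
    """Apply mistake-driven updates, one batch of examples at a time."""
    weights = {}
    for _ in range(epochs):
        for start in range(0, len(examples), batch_size):
            batch = examples[start:start + batch_size]
            changes = {}
            for features, label in batch:
                margin = score(weights, features)
                predicted = 1 if margin > 0 else -1
                if predicted != label:
                    for name, value in features.items():
                        delta = label * step(margin) * value
                        changes[name] = changes.get(name, 0.0) + delta
            # Weights only change after the whole batch has been scored.
            for name, change in changes.items():
                weights[name] = weights.get(name, 0.0) + change
    return weights
```

Setting `batch_size=1` updates after every example, a small number gives "a few", and `batch_size=len(examples)` uses all of them at once.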

Page 28: NLP: overview, non-DNN approaches, computational linguistics

Core idea: Apply the model, see what it gets wrong, update it to fix the mistake

$\mathrm{score}_1 = a_1 x_1 + b_1 x_2 + \dots$

Key property for linear models: convexity

No longer true for neural networks :(
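For reference, convexity can be stated as follows, writing $L(w)$ for the training loss as a function of the weights. The symbols $L$, $w$, $w'$, and $\lambda$ are notation assumed here, not taken from the slides.

```latex
L\big(\lambda w + (1 - \lambda) w'\big) \le \lambda L(w) + (1 - \lambda) L(w')
\quad \text{for all } w, w' \text{ and } \lambda \in [0, 1]
```

When this holds, every local minimum of the loss is also a global minimum, so updates that keep reducing the loss cannot get stuck at a poor solution. Linear models with common losses such as hinge or logistic loss have this property; neural networks generally do not.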

Page 29: NLP: overview, non-DNN approaches, computational linguistics

End of Part 2

Questions?

Page 30: NLP: overview, non-DNN approaches, computational linguistics

Part 3: Computational Linguistics

Page 31: NLP: overview, non-DNN approaches, computational linguistics

Sub-Areas of Computational Linguistics

1. Natural Language Processing / Human Language Technology
All the things you’ve seen so far.

2. Computational Psycholinguistics
Studying humans, e.g., how do we learn language?

3. Digital Linguistics
Using computation to support language documentation and linguistic research.

4. Other Applications
Social science research that uses language, e.g., literary theory, political science

Partially based on Steve Abney’s Ling 441 notes

Page 32: NLP: overview, non-DNN approaches, computational linguistics

Computational Psycholinguistics

Using computational methods to study the way people learn and use language.

https://penntoday.upenn.edu/2016-03-17/latest-news/eye-tracking-study-penn-linguist-reveals-inner-workings-human-mind

Page 33: NLP: overview, non-DNN approaches, computational linguistics

Digital Linguistics

Using computational methods to support language documentation and linguistic research.

https://archive.mpi.nl/tla/elan

By Noahedits - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=84815849

Page 34: NLP: overview, non-DNN approaches, computational linguistics

Literary Theory

15,099 English novels published between 1700 and 1899

Models to understand pronoun references and identify sentence structure

Page 35: NLP: overview, non-DNN approaches, computational linguistics

Political Science

24,236 press releases

Model to identify topics in text

Page 36: NLP: overview, non-DNN approaches, computational linguistics

Computational Sociolinguistics

Using computational methods to study the influence of society and language on each other.

LGBTQ 1986 - 2015

Page 37: NLP: overview, non-DNN approaches, computational linguistics

Thanks! What questions do you have?