
9.012 Brain and Cognitive Sciences II

Part VIII: Intro to Language & Psycholinguistics

- Dr. Ted Gibson

Presented by

Liu Lab

Fighting for Freedom with Cultured Neurons

Nathan Wilson

Distributed Representations, Simple Recurrent Networks, and Grammatical Structure

Jeffrey L. Elman (1991), Machine Learning

Distributed Representations / Neural Networks

• are meant to capture the essence of neural computation: many small, independent units calculating very simple functions in parallel.

Distributed Representations / Neural Networks: EXPLICIT RULES?

Distributed Representations / Neural Networks: EMERGENCE!


Feedforward Neural Network (from Sebastian’s teaching)

Don’t forget the nonlinearity!
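The forward pass such a network computes can be sketched in a few lines of numpy. The layer sizes and weight scales here are made up for illustration; the sigmoid plays the role of the nonlinearity the slide warns about.

```python
import numpy as np

def sigmoid(z):
    # The nonlinearity: without it, stacked linear layers collapse into one linear map.
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, W_ih, W_ho):
    """One pass: one-hot input -> hidden -> output."""
    hidden = sigmoid(W_ih @ x)
    return sigmoid(W_ho @ hidden)

rng = np.random.default_rng(0)
W_ih = rng.normal(0, 0.1, (8, 13))   # 13 inputs -> 8 hidden units (sizes are made up)
W_ho = rng.normal(0, 0.1, (13, 8))   # 8 hidden units -> 13 outputs
x = np.zeros(13); x[0] = 1.0         # a one-hot input, like the slides' 1000000000000
y = feedforward(x, W_ih, W_ho)
```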


Recurrent Network (also from Sebastian)

Why Apply Network / Connectionist Modeling to Language Processing?

• Connectionist Modeling is Good at What it Does

• Language is a HARD problem

What We Are Going to Do

• Build a network
• Let it learn how to “read”
• Then test it!

– Give it some words in a reasonably grammatical sentence

– Let it try to predict the next word, based on what it knows about grammar

– BUT: We’re not going to tell it any of the rules

What We Are Going to Do

• Build a network

Feedforward Neural Network (from Sebastian’s teaching)

[Figure: three-layer network; a one-hot INPUT vector feeds a HIDDEN layer, which feeds a one-hot OUTPUT vector.]

Methods > Network Implementation > Structure

What We Are Going to Do

• Build a network
• Let it learn how to “read”

Methods > Network Implementation > Training

Words We’re Going to Teach It:

- Nouns: boy | girl | cat | dog | boys | girls | cats | dogs
- Proper Nouns: John | Mary
- “Who”
- Verbs: chase | feed | see | hear | walk | live | chases | feeds | sees | hears | walks | lives
- “End Sentence”

Methods > Network Implementation > Training

1. Encode Each Word with Unique Activation Pattern

- boy => 000000000000000000000001
- girl => 000000000000000000000010
- feed => 000000000000000000000100
- sees => 000000000000000000001000
. . .
- who => 010000000000000000000000
- End sentence => 100000000000000000000000
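Step 1 is a one-hot (“localist”) code over the 24-word vocabulary, sketched below. Which word gets which bit is arbitrary; the slides happen to give “boy” the rightmost bit and “End sentence” the leftmost.

```python
# One-hot encoding for the slides' 24-word vocabulary:
# 8 nouns + 2 proper nouns + "who" + 12 verbs + an end-of-sentence marker.
VOCAB = (
    "boy girl cat dog boys girls cats dogs John Mary who "
    "chase feed see hear walk live chases feeds sees hears walks lives"
).split() + ["."]                     # "." stands in for the End-Sentence marker

def one_hot(word):
    vec = [0] * len(VOCAB)            # 24 units, all off...
    vec[VOCAB.index(word)] = 1        # ...except the one for this word
    return vec
```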

2. Feed these words sequentially to the network (only feed words in sequences that make good grammatical sense!)
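Step 2 means the training stream contains only grammatical sequences. A hypothetical generator for such a stream, restricted to simple subject-verb(-object) sentences; the transitivity classes are read off the results charts later in the deck, and Elman’s actual grammar was richer (it included relative clauses).

```python
import random

SING_NOUNS = ["boy", "girl", "cat", "dog", "John", "Mary"]
PLUR_NOUNS = ["boys", "girls", "cats", "dogs"]
SING_VERBS = ["chases", "feeds", "sees", "hears", "walks", "lives"]
PLUR_VERBS = ["chase", "feed", "see", "hear", "walk", "live"]
DO_REQUIRED = {"chase", "chases", "feed", "feeds"}     # must take a direct object
DO_IMPOSSIBLE = {"live", "lives"}                      # must not take one

def make_sentence(rng):
    """One simple grammatical sentence: subject, agreeing verb, maybe object, '.'"""
    if rng.random() < 0.5:
        subj, verb = rng.choice(SING_NOUNS), rng.choice(SING_VERBS)
    else:
        subj, verb = rng.choice(PLUR_NOUNS), rng.choice(PLUR_VERBS)
    words = [subj, verb]
    if verb in DO_REQUIRED or (verb not in DO_IMPOSSIBLE and rng.random() < 0.5):
        words.append(rng.choice(SING_NOUNS + PLUR_NOUNS))
    words.append(".")
    return words

rng = random.Random(0)
stream = [w for _ in range(100) for w in make_sentence(rng)]  # the training tape
```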

Methods > Network Implementation > Structure

[Figure: network structure built up layer by layer; a one-hot INPUT vector (e.g. 1000000000000) projects to a HIDDEN layer, which projects to a one-hot OUTPUT vector.]


What We Are Going to Do

• Build a network
• Let it learn how to “read”


If learning word relations, need some sort of memory from word to word!

Feedforward Neural Network (from Sebastian’s teaching)

Recurrent Network (also from Sebastian)


[Figure: Elman network; the HIDDEN layer’s activations are copied into a CONTEXT layer, which is fed back as an extra input to the HIDDEN layer at the next time step (INPUT + CONTEXT → HIDDEN → OUTPUT).]

Methods > Network Implementation > Structure
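The context layer can be sketched as a copy-back step in code. A minimal Elman-style recurrence; the layer sizes and the neutral 0.5 starting context are illustrative choices, not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ElmanStep:
    """Minimal Elman-style recurrence: after each step, the hidden activations
    are copied one-for-one into a CONTEXT vector, which is fed back as an
    extra input alongside the next word."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in  = rng.normal(0, 0.1, (n_hidden, n_in))
        self.W_ctx = rng.normal(0, 0.1, (n_hidden, n_hidden))
        self.W_out = rng.normal(0, 0.1, (n_out, n_hidden))
        self.context = np.full(n_hidden, 0.5)   # neutral starting context

    def step(self, x):
        h = sigmoid(self.W_in @ x + self.W_ctx @ self.context)
        self.context = h.copy()                 # copy-back link: fixed, not learned
        return sigmoid(self.W_out @ h)

net = ElmanStep(n_in=24, n_hidden=10, n_out=24)
x = np.zeros(24); x[0] = 1.0                    # the same one-hot word, twice...
y1 = net.step(x)
y2 = net.step(x)                                # ...different outputs: the context changed
```

This is the memory the previous slide asks for: the same input word can yield a different prediction depending on what came before.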


BACKPROP!
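Training is ordinary backpropagation, with the context treated as a fixed extra input at each step (no backprop through time). A toy single-example update, assuming a softmax output with cross-entropy loss and an arbitrary learning rate; the paper’s exact loss and settings may differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_step(W_in, W_ctx, W_out, x, context, target, lr=0.1):
    """One backprop update for an Elman-style step; the context is a fixed input."""
    h = sigmoid(W_in @ x + W_ctx @ context)       # hidden activations
    y = softmax(W_out @ h)                        # next-word distribution
    loss = -np.log(y[target])                     # cross-entropy vs. true next word

    d_out = y.copy(); d_out[target] -= 1.0        # grad at output pre-activation
    d_h = (W_out.T @ d_out) * h * (1.0 - h)       # back through the sigmoid
    W_out -= lr * np.outer(d_out, h)              # in-place gradient-descent updates
    W_in  -= lr * np.outer(d_h, x)
    W_ctx -= lr * np.outer(d_h, context)
    return loss

rng = np.random.default_rng(0)
n_words, n_h = 24, 10
W_in  = rng.normal(0, 0.1, (n_h, n_words))
W_ctx = rng.normal(0, 0.1, (n_h, n_h))
W_out = rng.normal(0, 0.1, (n_words, n_h))
x = np.zeros(n_words); x[3] = 1.0                 # current word (one-hot)
ctx = np.full(n_h, 0.5)                           # context from the previous step
loss_before = train_step(W_in, W_ctx, W_out, x, ctx, target=7)
loss_after  = train_step(W_in, W_ctx, W_out, x, ctx, target=7)
```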

What We Are Going to Do

• Build a network
• Let it learn how to “read”
• Then test it!

– Give it some words in a reasonably grammatical sentence

– Let it try to predict the next word, based on what it knows about grammar

– BUT: We’re not going to tell it any of the rules

- After hearing: “boy…”
- Network SHOULD predict next word is: “chases”
- NOT: “chase”

Subject and verb should agree!

Results > Emergent Properties of Network > Subject-Verb Agreement

Results > Emergent Properties of Network > Noun-Verb Agreement

[Figure: network output activations (0.0–1.0) across word classes (Single Noun, Plural Noun, “Who”, Single/Plural Verb with DO Optional/Required/Impossible, End of Sentence) for the prompt “boy…”. Rotated axis label: “What Word Network Predicts Is Next”.]


- Likewise, after hearing: “boys…” (or boyz!)
- Network SHOULD predict next word is: “chase”
- NOT: “chases”

Again, subject and verb should agree!

Results > Emergent Properties of Network > Noun-Verb Agreement

[Figure: network output activations across word classes for the prompt “boys…”.]


There’s a difference between nouns and verbs. There are even different kinds of nouns that require different kinds of verbs.

- After hearing: “chase”
- Network SHOULD predict next word is: some direct object (like “boys”)
- NOT: “.”

Hey, if a verb needs an argument, it only makes sense to give it one!

Results > Emergent Properties of Network > Verb-Argument Agreement

- Likewise, after hearing the verb: “lives”
- Network SHOULD predict next word is: “.”
- NOT: “dog”

If the verb doesn’t make sense with an argument, it falls upon us to withhold one from it.

Results > Emergent Properties of Network > Verb-Argument Agreement
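What the network is being tested on here can be written out as an explicit rule, for contrast with what it learns implicitly. A hypothetical checker of which continuations a verb’s direct-object (DO) class allows; treating “see/hear/walk” as the DO-optional verbs is a guess from the chart labels, not from the paper.

```python
# Hypothetical rule-based statement of verb-argument agreement.
DO_REQUIRED = {"chase", "chases", "feed", "feeds"}
DO_IMPOSSIBLE = {"live", "lives"}
NOUNS = {"boy", "girl", "cat", "dog", "boys", "girls", "cats", "dogs", "John", "Mary"}

def legal_next(verb, next_word):
    """May next_word follow verb, given the verb's direct-object class?"""
    if verb in DO_REQUIRED:
        return next_word in NOUNS                   # must receive an object, not "."
    if verb in DO_IMPOSSIBLE:
        return next_word == "."                     # must not receive an object
    return next_word in NOUNS or next_word == "."   # DO optional: either is fine
```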

[Figure: network output activations across word classes for the prompt “boy chases…”.]

Results > Emergent Properties of Network > Verb-Argument Agreement


[Figure: network output activations across word classes for the prompt “boy lives…”.]

Results > Emergent Properties of Network > Verb-Argument Agreement


There are different kinds of verbs that require different kinds of nouns.

- After hearing: “boy who Mary chases…”
- Network might predict next word is: “boys”, since it learned that “boys” follows “Mary chases”
- But if it’s smart, it may realize that “chases” is linked to “boy”, not “Mary”. In which case you need a verb next, not a noun!

A good litmus test for some intermediate understanding?

Results > Emergent Properties of Network > Longer-Range Dependence
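The long-range bookkeeping the network must do implicitly can be written as an explicit rule, which is exactly what it is never given. A hypothetical checker covering the deck’s one-level “who” relative clauses:

```python
# The head noun's number determines the main verb, however much
# relative-clause material intervenes.
SING = {"boy", "girl", "cat", "dog", "John", "Mary"}
SING_V = {"chases", "feeds", "sees", "hears", "walks", "lives"}
PLUR_V = {"chase", "feed", "see", "hear", "walk", "live"}

def required_verb_number(prefix):
    """Number the next verb must carry, given a prefix (list of words), e.g.
    ["boys", "who", "Mary"]           -> "sing" (in the clause: agrees with Mary)
    ["boys", "who", "Mary", "chases"] -> "plur" (main clause: agrees with boys)"""
    if "who" in prefix:
        rel = prefix[prefix.index("who") + 1:]
        if rel and not any(w in SING_V | PLUR_V for w in rel):
            # Still inside the relative clause: agree with its subject.
            return "sing" if rel[-1] in SING else "plur"
    # Main clause (or a subject relative like "boy who ..."): agree with the head.
    return "sing" if prefix[0] in SING else "plur"
```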

Results > Emergent Properties of Network > Verb-Argument Agreement

[Figure: network output activations across word classes for the prompt “boys who Mary…”.]


Results > Emergent Properties of Network > Subject-Verb Agreement

[Figure: network output activations across word classes for the prompt “boys who Mary chases…”.]

[Figure: network output activations across word classes for the prompt “boys who Mary chases feed…”.]

[Figure: network output activations across word classes for the prompt “boys who Mary chases feed cats…”.]

What We Are Going to Do

• Build a network
• Let it learn how to “read”
• Then test it!

– Give it some words in a reasonably grammatical sentence

– Let it try to predict the next word, based on what it knows about grammar

– BUT: We’re not going to tell it any of the rules

Did Network Learn About Grammar?

• It learned there are different classes of nouns that need singular and plural verbs.

• It learned there are different classes of verbs that have different requirements in terms of direct objects.

• It learned that sometimes there are long-distance dependencies that don’t follow from immediately preceding words => relative clauses and constituent structure of sentences.

Once You Have a Successful Network, Can Examine its Properties with Controlled I/O Relationships

• Boys hear boys.
• Boy hears boys.
• Boy who boys chase chases boys.
• Boys who boys chase chase boys.


Distributed Representations / Neural Networks: EXPLICIT RULES?


What Does it Mean, “No Explicit Rules?”

• Does it just mean the mapping is “too complicated?”

• “Too difficult to formulate?”
• “Unknown?”

• Possibly just our own failure to understand the mechanism, rather than a description of the mechanism itself.

General Advantages of Distributed Models

• Representation is distributed, which, while not limitless, is less rigid than models with a strict mapping from concept to node.

• Generalizations are captured at a higher level than the input, i.e. abstractly, so generalization to new input is possible.

FOUND / ISOLATED 4-CELL NEURAL NETWORKS

9.012 Brain and Cognitive Sciences II

Part VIII: Intro to Language & Psycholinguistics

- Dr. Ted Gibson

Presented by

Liu Lab

Fighting for Freedom with Cultured Neurons

“If you have built castles in the air, your work need not be lost; that is where they should be. Now put the foundations under them.”

-- Henry David Thoreau