Post on 13-Dec-2015

Understanding

• I Hear and I Forget

• I See and I Remember

• I Do and I Understand

Attributed to Confucius, ~500 BCE

How could a mass of chemical cells produce language and thought?

Will computers think and speak?

How much can we know about our own experience?

How do we learn new concepts?

Does our language determine how we think?

Is language Innate?

How do children learn grammar?

How did languages evolve?

Why do we experience everything the way that we do?

Constrained Best Fit in Nature

inanimate
• physics: lowest energy state
• chemistry: molecular minima

animate
• biology: fitness, MEU (Neuroeconomics)
• vision: threats, friends
• language: errors, NTL

Embodiment

Of all of these fields, the learning of languages would be the most impressive, since it is the most human of these activities. This field, however, seems to depend rather too much on the sense organs and locomotion to be feasible.

Alan Turing (Intelligent Machinery, 1948)

The Mirror System

The mirror system, like the motor system, is somatotopically organized.

Foot actions Hand actions Mouth actions

Buccino et al., 2001

humans watching videos of actions without objects

humans watching same actions with objects

Pre-Natal Tuning: Internally generated tuning signals

• But in the womb, what provides the feedback to establish which neural circuits are the right ones to strengthen?
– Not a problem for motor circuits: the feedback and control networks for basic physical actions can be refined as the infant moves its limbs, and indeed this is what happens.
– But there is no vision in the womb. Recent research shows that systematic moving patterns of activity are spontaneously generated pre-natally in the retina. A predictable pattern, changing over time, provides excellent training data for tuning the connections between visual maps.

• The pre-natal development of the auditory system is also interesting and is directly relevant to our story.
– Research indicates that infants, immediately after birth, preferentially recognize the sounds of their native language over others. The assumption is that similar activity-dependent tuning mechanisms work with speech signals perceived in the womb.

Post-natal environmental tuning

• The pre-natal tuning of neural connections using simulated activity can work quite well.
– A newborn colt or calf is essentially functional at birth.
– This is necessary because the herd is always on the move.
– Many animals, including people, do much of their development after birth, and activity-dependent mechanisms can exploit experience in the real world.

• In fact, such experience is absolutely necessary for normal development.

• As we saw, early experiments with kittens showed that there are fairly short critical periods during which animals deprived of visual input could lose forever their ability to see motion, vertical lines, etc.
– For a similar reason, if a human child has one weak eye, the doctor will sometimes place a patch over the stronger one, forcing the weaker eye to gain experience.

Freud’s Original Connectionist Model

Connectionist Model of Word Recognition (Rumelhart and McClelland)

Interactive Activation Reading Model

Modeling lexical access errors

• Semantic error

• Formal error (i.e. errors related by form)

• Mixed error (semantic + formal)

• Phonological access error

Phonological access error: Selection of incorrect phonemes

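[Figure: word nodes FOG, DOG, CAT, RAT, MAT connected to phoneme nodes f, r, d, k, m (Onsets), ae, o (Vowels), t, g (Codas), with a syllable frame (Syl) made of onset (On), vowel (Vo) and coda (Co) slots. Adapted from Gary Dell, “Producing words from pictures or from other words”.]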

Representing concepts using triangle nodes

[Figure: nodes RD5, gender, and female joined by a triangle node]

Connectionist Circuit for Gender(RD5) = female

Also shown as a triangle node

The ICSI/Berkeley Neural Theory of Language Project

The Binding Problem

Massively Parallel Brain

Unitary Conscious Experience

Many Variations and Proposals

Our focus: The Variable Binding Problem

ARDA AQUAINT TUTORIAL

Problem

• Binding problem
– In vision
  • You do not exchange the colors of the shapes below
– In behavior
  • Grasp motion depends on object to grasp
– In inference
  • Human(x) -> Mortal(x)
  • Must bind a variable to x

Automatic Inference

• Inference needed for many tasks
– Reference resolution
– General language understanding
– Planning

• Humans do this quickly and without conscious thought
– Automatically
– No real intuition of how we do it

SHRUTI

• SHRUTI does inference by connections between simple computation nodes

• Nodes are small groups of neurons

• Nodes firing in sync reference the same object

Dynamic representation of relational instances

“John gave Mary a book”

giver: John

recipient: Mary

given-object: a-book
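The binding idea can be illustrated with a toy sketch. It assumes each active node is reduced to a single phase slot in a repeating cycle, so that "firing in sync" simply means "same slot". The function names are hypothetical and this is not the SHRUTI implementation.

```python
# Toy model of temporal-synchrony binding (illustrative only).
# Each active node fires in one phase slot of a repeating cycle;
# a role and an entity that share a slot are treated as bound.

def assign_phases(bindings):
    """Give each role/filler pair its own phase slot; return node -> phase."""
    phase_of = {}
    for phase, (role, filler) in enumerate(bindings.items()):
        phase_of[role] = phase
        phase_of[filler] = phase
    return phase_of

def filler_of(role, phase_of, entities):
    """Read a binding back: find the entity firing in sync with the role."""
    return next(e for e in entities if phase_of[e] == phase_of[role])

give = {"giver": "John", "recipient": "Mary", "given-object": "a-book"}
phases = assign_phases(give)
entities = set(give.values())
print(filler_of("recipient", phases, entities))  # Mary
```

No role node stores a pointer to its filler here; the binding exists only as a shared phase, which is the point of the dynamic representation.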

[Figure: role nodes giver, recipient, given-object shown with entity nodes John, Mary, a-book]

Focal-cluster of an entity

John (+, ?)

• focal-clusters of motor schemas associated with John
• focal-clusters of lexical knowledge associated with John
• focal-clusters of perceptual schemas and sensory representations associated with John
• focal-clusters of other entities and categories semantically related to John
• episodic memories where John is one of the role-fillers

“John fell in the hallway”

[Figure: focal clusters for the predicate Fall (+, -, ?, fall-pat, fall-loc) and for the entities John (+, ?) and Hallway (+, ?)]

[Figure: firing trace with rows +:Fall, +:John, fall-pat, fall-loc, +:Hallway]

Encoding “slip => fall” in Shruti

[Figure: focal clusters SLIP (+, -, ?, slip-pat, slip-loc) and FALL (+, -, ?, fall-pat, fall-loc) linked through a mediator cluster (+, ?, r1, r2)]

Such rules are learned gradually via observations, by being told …

“John slipped in the hallway”

[Figure: the Slip and Fall focal clusters linked through the mediator (+, ?, r1, r2), with entity clusters John (+, ?) and Hallway (+, ?)]

→ “John fell in the hallway”

[Figure: firing trace with rows +:slip, +:John, slip-pat, +:Hallway, slip-loc, +:Fall, fall-pat, fall-loc]
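One way to picture the "slip => fall" inference is as phase copying: the mediator passes each antecedent role's firing phase on to the corresponding consequent role. The sketch below assumes discrete phase slots and a plain dictionary for the role correspondences (r1, r2 on the slide). All names are hypothetical, and learning of the rule is not modeled.

```python
# Toy rule propagation: consequent roles inherit the firing phase of the
# antecedent roles they correspond to (illustrative, not the SHRUTI code).

SLIP_IMPLIES_FALL = {"slip-pat": "fall-pat", "slip-loc": "fall-loc"}

def apply_rule(rule, phase_of):
    """Return a new phase table that also contains the consequent roles."""
    inferred = dict(phase_of)
    for antecedent_role, consequent_role in rule.items():
        if antecedent_role in phase_of:
            inferred[consequent_role] = phase_of[antecedent_role]
    return inferred

def bound_entity(role, phase_of, entities):
    """Find the entity firing in the same phase as the given role."""
    return next(e for e in entities if phase_of[e] == phase_of[role])

# "John slipped in the hallway": roles fire in sync with their fillers.
state = {"John": 0, "slip-pat": 0, "Hallway": 1, "slip-loc": 1}
state = apply_rule(SLIP_IMPLIES_FALL, state)
entities = ["John", "Hallway"]
print(bound_entity("fall-pat", state, entities),
      bound_entity("fall-loc", state, entities))  # John Hallway
```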

Encoding X-schema

http://www.iit.edu/~npr/DrJennifer/visual/retina.html

Rods and Cones in the Retina

Color Naming

© Stephen E. Palmer, 2002

Basic Color Terms (Berlin & Kay)

Criteria:

1. Single words -- not “light-blue” or “blue-green”

2. Frequently used -- not “mauve” or “cyan”

3. Refer primarily to colors -- not “lime” or “gold”

4. Apply to any object -- not “roan” or “blond”

The WCS Color Chips

• Basic color terms:
– Single word (not blue-green)
– Frequently used (not mauve)
– Refers primarily to colors (not lime)
– Applies to any object (not blonde)

FYI:

English has 11 basic color terms

Results of Kay’s Color Study

If you group languages by the number of basic color terms they have, then as the number of color terms increases, the additional terms specify focal colors.

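Color categories by stage (W = white, R = red, Y = yellow, G = green, Bu = blue, Bk = black):

Stage I: W or R or Y; Bk or G or Bu
Stage II: W; R or Y; Bk or G or Bu
Stage IIIa: W; R or Y; G or Bu; Bk
Stage IIIb: W; R; Y; Bk or G or Bu
Stage IV: W; R; Y; G or Bu; Bk
Stage V: W; R; Y; G; Bu; Bk
Stage VI: W; R; Y; G; Bu; Bk; Y+Bk (Brown)
Stage VII: W; R; Y; G; Bu; Bk; Y+Bk (Brown); R+W (Pink); R+Bu (Purple); R+Y (Orange); Bk+W (Grey)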

Ideas from Cognitive Linguistics

• Embodied Semantics (Lakoff, Johnson, Sweetser, Talmy)

• Radial categories (Rosch 1973, 1978; Lakoff 1985)
– mother: birth / adoptive / surrogate / genetic, …

• Profiling (Langacker 1989, 1991; cf. Fillmore XX)
– hypotenuse, buy/sell (Commercial Event frame)

• Metaphor and metonymy (Lakoff & Johnson 1980, …)
– ARGUMENT IS WAR, MORE IS UP
– The ham sandwich wants his check.

• Mental spaces (Fauconnier 1994)
– The girl with blue eyes in the painting really has green eyes.

• Conceptual blending (Fauconnier & Turner 2002, inter alia)
– workaholic, information highway, fake guns
– “Does the name Pavlov ring a bell?” (from a talk on ‘dognition’!)

Concepts are not categorical

Radial Structure of Mother

The radial structure of this category is defined with respect to the different models.

Central Case

• Stepmother
• Adoptive mother
• Birth mother
• Natural mother
• Foster mother
• Biological mother
• Surrogate mother
• Unwed mother
• Genetic mother

Language, Learning and Neural Modeling
www.icsi.berkeley.edu/AI

• Scientific Goal: Understand how people learn and use language

• Practical Goal: Build systems that analyze and produce language

• Approach: Embodied linguistic theories with advanced biologically-based computational methods

General and Domain Knowledge

• Conceptual Knowledge and Inference
– Embodied
– Language and Domain Independent
– Powerful General Inferences
– Ubiquitous in Language

• Domain Specific Frames and Ontologies
– FrameNet (www.icsi.berkeley.edu/framenet)

• Metaphor links domain specific to general
– E.g., France slipped into recession.

Image schemas

• Trajector / Landmark (asymmetric)
– The bike is near the house
– ? The house is near the bike

• Boundary / Bounded Region
– a bounded region has a closed boundary

• Topological Relations
– Separation, Contact, Overlap, Inclusion, Surround

• Orientation
– Vertical (up/down), Horizontal (left/right, front/back)
– Absolute (E, S, W, N)

[Figure labels: LM, TR; bounded region, boundary]

Learning System

We’ll look at the details next lecture

[Figure labels: dynamic relations (e.g. into); structured connectionist network (based on visual system)]

Simulation-based language understanding

Utterance: “Harry walked to the cafe.”

[Figure: components Utterance, Constructions, General Knowledge, Belief State, Analysis Process, Simulation Specification, Simulation (Cafe)]

Simulation Specification:
Schema   Trajector   Goal
walk     Harry       cafe

Simulation Semantics

• BASIC ASSUMPTION: SAME REPRESENTATION FOR PLANNING AND SIMULATIVE INFERENCE
– Evidence for common mechanisms for recognition and action (mirror neurons) in the F5 area (Rizzolatti et al. 1996, Gallese 1996, Buccino 2002) and from motor imagery (Jeannerod 1996)

• IMPLEMENTATION:
– x-schemas affect each other by enabling, disabling or modifying execution trajectories. Whenever the CONTROLLER schema makes a transition it may set, get, or modify state, leading to triggering or modification of other x-schemas. State is completely distributed (a graph marking) over the network.

• RESULT: INTERPRETATION IS IMAGINATIVE SIMULATION!

Active representations

• Many inferences about actions derive from what we know about executing them
• Representation based on stochastic Petri nets captures dynamic, parameterized nature of actions
• Models linguistic aspect

Walking:

• bound to a specific walker with a direction or goal
• consumes resources (e.g., energy)
• may have termination condition (e.g., walker at goal)
• ongoing, iterative action

walker=Harry

goal=home

energy

walker at goal
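The walking example can be sketched as a very small place/transition net. This is a simplification: firing here is deterministic, whereas the slide refers to stochastic Petri nets. The place names (`energy`, `distance_left`, `steps_taken`, `at_goal`) are made up for illustration.

```python
# Minimal place/transition net for an iterative "walk" action.
# Simplification: deterministic firing; the slide's nets are stochastic.

def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition["inputs"].items())

def fire(marking, transition):
    """Consume input tokens and produce output tokens; returns a new marking."""
    new = dict(marking)
    for p, n in transition["inputs"].items():
        new[p] -= n
    for p, n in transition["outputs"].items():
        new[p] = new.get(p, 0) + n
    return new

# One step consumes one unit of energy and one unit of remaining distance.
STEP = {"inputs": {"energy": 1, "distance_left": 1},
        "outputs": {"steps_taken": 1}}

def walk(energy, distance):
    marking = {"energy": energy, "distance_left": distance, "steps_taken": 0}
    while enabled(marking, STEP):          # ongoing, iterative action
        marking = fire(marking, STEP)
    # Termination condition: the walker is at the goal when no distance remains.
    marking["at_goal"] = int(marking["distance_left"] == 0)
    return marking

print(walk(energy=5, distance=3))
```

The state of the action is just the marking (token counts per place), which is what lets inferences such as "ran out of energy before reaching the goal" fall out of running the net.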

Learning Verb Meanings
David Bailey

A model of children learning their first verbs.
• Assumes parent labels child’s actions.
• Child knows parameters of action, associates with word.
• Program learns well enough to:
  1) Label novel actions correctly
  2) Obey commands using new words (simulation)
• System works across languages.
• Mechanisms are neurally plausible.

cow

apple ball yes

juice bead girl down no more

bottle truck baby woof yum go up this more

spoon hammer shoe daddy moo whee get out there bye

banana box eye mommy choo-choo uhoh sit in here hi

cookie horse door boy boom oh open on that no

food toys misc. people sound emotion action prep. demon. social

Words learned by most 2-year olds in a play school (Bloom 1993)

System Overview

Learning Two Senses of PUSH

Model merging based on Bayesian MDL

Training Results
David Bailey

English
• 165 training examples, 18 verbs
• Learns optimal number of word senses (21)
• 32 test examples: 78% recognition, 81% action
• All mistakes were close: lift ~ yank, etc.
• Learned some particle constructions (CXN), e.g., pull up

Farsi
• With identical settings, learned senses not in English

Task: Interpret simple discourse fragments/ blurbs

France fell into recession. Pulled out by Germany

US Economy on the verge of falling back into recession after moving forward on an anemic recovery.

Indian Government stumbling in implementing Liberalization plan.

Moving forward on all fronts, we are going to be ongoing and relentless as we tighten the net of justice.

The Government is taking bold new steps. We are loosening the stranglehold on business, slashing tariffs and removing obstacles to international trade.

Probabilistic inference

– Filtering
  • P(X_t | o_1…t, X_1…t)
  • Update the state based on the observation sequence and state set
– MAP Estimation
  • argmax_{h_1…h_n} P(X_t | o_1…t, X_1…t)
  • Return the best assignment of values to the hypothesis variables given the observation and states
– Smoothing
  • P(X_t-k | o_1…t, X_1…t)
  • Modify assumptions about previous states, given observation sequence and state set
– Projection/Prediction/Reachability
  • P(X_t+k | o_1…t, X_1…t)
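As a concrete illustration of the filtering query, here is a minimal forward-filtering step for a two-state hidden Markov model. The plain HMM stands in for the richer network the slides describe. The state names, observation names, and all probabilities are invented for the example.

```python
# Forward filtering in a two-state hidden Markov model (pure Python).
# All numbers below are made up for illustration.

def normalize(dist):
    total = sum(dist.values())
    return {k: v / total for k, v in dist.items()}

def filter_step(belief, observation, transition, emission):
    """One predict-then-update step: returns P(X_t | o_1..t)."""
    states = list(belief)
    # Predict: push the current belief through the transition model.
    predicted = {
        s2: sum(belief[s1] * transition[s1][s2] for s1 in states)
        for s2 in states
    }
    # Update: weight each state by how well it explains the observation.
    return normalize({s: predicted[s] * emission[s][observation] for s in states})

transition = {"growing": {"growing": 0.8, "shrinking": 0.2},
              "shrinking": {"growing": 0.3, "shrinking": 0.7}}
emission = {"growing": {"up": 0.9, "down": 0.1},
            "shrinking": {"up": 0.2, "down": 0.8}}

belief = {"growing": 0.5, "shrinking": 0.5}
for obs in ["down", "down", "up"]:
    belief = filter_step(belief, obs, transition, emission)
print(belief)
```

Prediction (projection) corresponds to applying only the predict half of `filter_step` for k further steps with no new observations.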

Metaphor Maps

• Static structures that project bindings from source domain f-struct to target domain Bayes net nodes by setting evidence on the target network.

• Different types of maps
– PMAPS project X-schema parameters to abstract domains
– OMAPS connect roles between source and target domain
– SMAPS connect schemas from source to target domains

• ASPECT is an invariant in projection.

Results

• Model was implemented and tested on discourse fragments from a database of 50 newspaper stories in international economics from standard sources such as WSJ, NYT, and the Economist.

• Results show that motion terms are often the most effective method to provide the following types of information about abstract plans and actions.
– Information about uncertain events and dynamic changes in goals and resources (sluggish, fall, off-track, no steam).
– Information about evaluations of policies and economic actors and communicative intent (strangle-hold, bleed).
– Communicating complex, context-sensitive and dynamic economic scenarios (stumble, slide, slippery slope).
– Communicating complex event structure and aspectual information (on the verge of, sidestep, giant leap, small steps, ready, set out, back on track).

• ALL THESE BINDINGS RESULT FROM REFLEX, AUTOMATIC INFERENCES PROVIDED BY X-SCHEMA BASED INFERENCES.

Models of Learning

• Hebbian ~ coincidence
• Recruitment ~ one trial
• Supervised ~ correction (backprop)
• Reinforcement ~ reward based
– delayed reward
• Unsupervised ~ similarity

Reinforcement Learning

• Basic idea:
– Receive feedback in the form of rewards
  • also called reward-based learning in psychology
– Agent’s utility is defined by the reward function
– Must learn to act so as to maximize expected utility
– Change the rewards, change the behavior

• Examples:
– Learning coordinated behavior/skills (x-schemas)
– Playing a game, reward at the end for winning / losing
– Vacuuming robot, reward for each piece of dirt picked up
– Automated taxi, reward for each passenger delivered

Markov Decision Processes

• Markov decision processes (MDPs)
– A set of states s ∈ S
– A set of actions a ∈ A
– A model T(s, a, s’) = P(s’ | s, a)
  • Probability that action a in state s leads to s’
– A reward function R(s, a, s’) (sometimes just R(s) for leaving a state or R(s’) for entering one)
– A start state (or distribution)
– Maybe a terminal state

• MDPs are the simplest case of reinforcement learning
– In general reinforcement learning, we don’t know the model or the reward function

MDP Solutions

• In deterministic single-agent search, want an optimal sequence of actions from start to a goal
• In an MDP we want an optimal policy π(s)
– A policy gives an action for each state
– Optimal policy maximizes expected utility (i.e. expected rewards) if followed
– Defines a reflex agent

Optimal policy when R(s, a, s’) = -0.04 for all non-terminals s
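When the model and rewards are known, an optimal policy can be computed by value iteration. The sketch below uses a made-up four-state MDP rather than the gridworld from the slide. The discount factor and all transition numbers are assumptions chosen only to make the example small.

```python
# Value iteration on a tiny MDP (illustrative; states, actions and numbers are made up).
# T[s][a] is a list of (probability, next_state, reward) triples.

GAMMA = 0.9  # discount factor (an assumption; the slides do not state one)

T = {
    "start": {"safe":  [(1.0, "mid", -0.04)],
              "risky": [(0.5, "goal", 1.0), (0.5, "pit", -1.0)]},
    "mid":   {"safe":  [(1.0, "goal", 1.0)],
              "risky": [(0.5, "goal", 1.0), (0.5, "pit", -1.0)]},
    "goal": {},  # terminal
    "pit": {},   # terminal
}

def q_value(V, s, a):
    """Expected reward plus discounted value of the successor state."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in T[s][a])

def value_iteration(iterations=50):
    V = {s: 0.0 for s in T}
    for _ in range(iterations):
        # Terminal states have no actions, so their value stays at the default 0.
        V = {s: max((q_value(V, s, a) for a in T[s]), default=0.0) for s in T}
    return V

def greedy_policy(V):
    """A policy gives one action per non-terminal state."""
    return {s: max(T[s], key=lambda a: q_value(V, s, a)) for s in T if T[s]}

V = value_iteration()
print(greedy_policy(V))
```

The resulting policy is a plain lookup table from state to action, which is what "defines a reflex agent" amounts to.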

Q-Learning

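• Learn Q*(s,a) values
– Receive a sample (s, a, s’, r)
– Consider your old estimate: Q(s, a)
– Consider your new sample estimate: sample = r + γ max_a’ Q(s’, a’)
– Nudge the old estimate towards the new sample: Q(s, a) ← (1 - α) Q(s, a) + α · sample
  (standard Q-learning update, with discount factor γ and learning rate α)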
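A minimal tabular sketch of this loop follows. It uses the usual Q-learning update, in which the old estimate is moved a fraction α toward r + γ · max over a' of Q(s', a'). The corridor environment, the constants, and the purely random behaviour policy are all assumptions made for illustration.

```python
import random

# Tabular Q-learning on a 1-D corridor (illustrative environment, made-up constants).
ALPHA, GAMMA = 0.5, 0.9        # learning rate and discount: assumed values
N_STATES, GOAL = 5, 4          # states 0..4, reward on reaching state 4
ACTIONS = (-1, +1)             # step left / step right

def step(s, a):
    """Hypothetical environment: deterministic move, reward 1 at the goal."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else 0.0)

def q_update(Q, s, a, s2, r):
    old = Q[(s, a)]                                         # old estimate
    sample = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)   # new sample estimate
    Q[(s, a)] = (1 - ALPHA) * old + ALPHA * sample          # nudge toward sample

def train(episodes=200, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            a = rng.choice(ACTIONS)      # purely random behaviour policy
            s2, r = step(s, a)
            q_update(Q, s, a, s2, r)
            s = s2
    return Q

Q = train()
# Greedy action in each non-goal state after training.
print([max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(GOAL)])
```

Because the update bootstraps from the best next action rather than the action actually taken, the learned values describe the greedy policy even though behaviour during training was random.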

Exploration / Exploitation

• Several schemes for forcing exploration
– Simplest: random actions (ε-greedy)
  • Every time step, flip a coin
  • With probability ε, act randomly
  • With probability 1-ε, act according to current policy (best q value for instance)
– Problems with random actions?
  • You do explore the space, but keep thrashing around once learning is done
  • One solution: lower ε over time
  • Another solution: exploration functions
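The coin-flip scheme and the "lower it over time" fix can be sketched in a few lines. The decay schedule below (geometric with a floor) is only one possible choice and is an assumption, as are the function names and the example Q-values.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)

def decayed_epsilon(step, start=1.0, floor=0.01, decay=0.99):
    """One possible schedule (an assumption): geometric decay with a floor."""
    return max(floor, start * decay ** step)

rng = random.Random(0)
q = {"left": 0.2, "right": 0.7}
early = [epsilon_greedy(q, decayed_epsilon(0), rng) for _ in range(1000)]
late = [epsilon_greedy(q, decayed_epsilon(1000), rng) for _ in range(1000)]
# Fraction of "right" choices: near chance early on, near 1 after decay.
print(early.count("right") / 1000, late.count("right") / 1000)
```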

Embodied Construction Grammar (ECG)

1. Linguistic Analysis

2. Computational Implementation
a. Test Grammars
b. Applied Projects – Question Answering

3. Map to Connectionist Models, Brain

4. Models of Grammar Acquisition

Embodied Construction Grammar

• Embodied representations
– active perceptual and motor schemas (image schemas, x-schemas, frames, etc.)
– situational and discourse context

• Construction Grammar
– Linguistic units relate form and meaning/function.
– Both constituency and (lexical) dependencies allowed.

• Constraint-based
– based on feature unification (as in LFG, HPSG)
– Diverse factors can flexibly interact.
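To make "feature unification" concrete, here is a minimal sketch over nested dictionaries. It omits shared (re-entrant) structure and variables, which full unification grammars support. The feature names are invented for the example and are not ECG notation.

```python
def unify(a, b):
    """Unify two feature structures given as nested dicts.
    Returns the merged structure, or None if atomic values clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is None:
                    return None          # clash somewhere below this key
                result[key] = merged
            else:
                result[key] = b_val      # feature only present in b
        return result
    return a if a == b else None         # atomic values must match exactly

subject = {"cat": "NP", "agr": {"person": 3, "number": "sg"}}
verb_needs = {"agr": {"number": "sg"}}
print(unify(subject, verb_needs))                     # compatible: merged structure
print(unify(subject, {"agr": {"number": "pl"}}))      # None (number clash)
```

Constraints from different sources simply contribute partial structures, and an analysis survives only if they all unify, which is one way diverse factors can interact.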

“Harry walked into the cafe.”

Phonology

Semantics

Pragmatics

Morphology

Syntax

Phonetics

UTTERANCE

ECG Structures

• Schemas
– image schemas, force-dynamic schemas, executing schemas, frames…

• Constructions
– lexical, grammatical, morphological, gestural…

• Maps
– metaphor, metonymy, mental space maps…

• Spaces
– discourse, hypothetical, counterfactual…

Embodied schemas

schema Container
  roles
    interior
    exterior
    portal
    boundary

[Figure labels: Interior, Exterior, Boundary, Portal]

schema Source-Path-Goal
  roles
    source
    path
    goal
    trajector

[Figure labels: Source, Path, Goal, Trajector]

These are abstractions over sensorimotor experiences.

[Notation callouts: schema name, role name]

Embodied constructions

construction HARRY
  form : /hEriy/
  meaning : Harry

construction CAFE
  form : /khaefej/
  meaning : Cafe

[Figure: ECG notation with Form and Meaning columns, pairing HARRY with Harry and CAFE with cafe]

Constructions have form and meaning poles that are subject to type constraints.
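One way to picture these structures in code is as plain records: a schema is a named set of roles, and a construction pairs a form with a meaning schema. The class and function names below are hypothetical and do not correspond to any actual ECG implementation. Only the schema, role, and form strings come from the slides.

```python
from dataclasses import dataclass, field

@dataclass
class Schema:
    name: str
    roles: dict = field(default_factory=dict)   # role name -> filler (None if unbound)

@dataclass
class Construction:
    name: str
    form: str          # phonological form, copied from the slide
    meaning: Schema    # meaning pole

def make_spg(trajector=None, source=None, goal=None):
    """Instantiate the Source-Path-Goal schema with optional role fillers."""
    return Schema("Source-Path-Goal",
                  {"source": source, "path": None, "goal": goal,
                   "trajector": trajector})

HARRY = Construction("HARRY", "/hEriy/", Schema("Harry"))
CAFE = Construction("CAFE", "/khaefej/", Schema("Cafe"))

# Bind the referents of two lexical constructions into one schema instance.
spg = make_spg(trajector=HARRY.meaning.name, goal=CAFE.meaning.name)
print(spg.roles)
```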

An analysis using THROW-TRANSITIVE

Simulation-based language understanding

Utterance: “Harry walked into the cafe.”

[Figure: components Utterance, Constructions, General Knowledge, Belief State, Analysis Process, Semantic Specification, Simulation (CAFE)]

construction WALKED
  form
    self_f.phon [wakt]
  meaning : Walk-Action
  constraints
    self_m.time before Context.speech-time
    self_m.aspect encapsulated

Simulation specification

The analysis process produces a simulation specification that

• includes image-schematic, motor control and conceptual structures

• provides parameters for a mental simulation

Competition-based analyzer finds the best analysis

• An analysis is made up of:
– A constructional tree
– A set of resolutions
– A semantic specification

The best fit has the highest combined score
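The selection step can be sketched as a simple argmax. The slides do not say how the component scores are combined, so the rule below (summing log-probabilities of the three parts) is purely an assumption, and the candidate analyses and their numbers are invented.

```python
import math

def combined_score(analysis):
    """Assumed combination rule: sum of log-probabilities of the three parts."""
    return sum(math.log(analysis[part])
               for part in ("tree", "resolutions", "semspec"))

# Hypothetical candidate analyses with made-up component scores.
candidates = [
    {"label": "walk-into-container", "tree": 0.6, "resolutions": 0.9, "semspec": 0.8},
    {"label": "walk-collide",        "tree": 0.7, "resolutions": 0.5, "semspec": 0.3},
]

best = max(candidates, key=combined_score)
print(best["label"])  # walk-into-container
```

Note that the second candidate has the better constructional tree on its own; it loses only once all three components are scored together.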

Summary: ECG

• Linguistic constructions are tied to a model of simulated action and perception

• Embedded in a theory of language processing
– Constrains theory to be usable
– Frees structures to be just structures, used in processing

• Precise, computationally usable formalism
– Practical computational applications, like MT and NLU
– Testing of functionality, e.g. language learning

• A shared theory and formalism for different cognitive mechanisms
– Constructions, metaphor, mental spaces, etc.

State of the Art: Natural Language Understanding

• Limited Commercial Speech Applications: transcription, simple response systems
• Statistical NLP for Restricted Tasks: tagging, parsing, information retrieval
• Template-based Understanding programs: expensive, brittle, inflexible, unnatural
• Essentially no NLU in QA, HCI systems
• ECG being applied in prototypes
