
How do neurons deal with uncertainty?

Sophie Deneve, Group for Neural Theory
Ecole Normale Supérieure, Paris, France.

Dealing with uncertainties: two approaches

State of the world or body → Neural representation → Behavioral response, with sensor noise, neural noise and effector noise entering at each stage.

Efficient encoding (information maximization): tuning curves, neural code.
Efficient decoding (ideal observer): signal processing.

s = x + sensor noise (x: sensory variable)
r = f(s) + neural noise
ŷ = g(x̂) (behavioral estimate)

Velocity perception as Bayesian inference: Weiss, Simoncelli and Adelson, 2002

Bayesian perception and action

Inferring 3D structures from 2D images: Knill and Richards, 1996
Bayesian integration for optimal motor strategies: Kording and Wolpert, 2004
Multisensory integration: Van Beers, Sittig and Gon, 1999; Ernst and Banks, 2002

P(V|S) = (1/Z) P(S|V) P(V)

Dealing with uncertainties: two approaches

State of the world or body → Internal beliefs → Behavioral strategies

Internal beliefs: inference and prediction (likelihood, prediction), based on models, probabilities, priors...
Behavioral strategies: decisions and explorations that maximize utility/reward.
Bayesian models span both stages.

Sensory variable: s = x + noise
Posterior probabilities: r = f(p(x|s), priors)
Optimal strategy: choose the action y maximizing the expected reward, max_y ∫ reward(y, x) p(x|s) dx

Two parts of the lecture

• Part 1: How can variables encoded by populations of noisy neurons be estimated? How can they be combined?

• Part 2: Are probability distributions and belief states encoded in neurons or neural populations?

Population coding

Georgopoulos 1986

Population codes

[Figure: tuning curves, i.e. average activity as a function of direction (deg); and the average pattern of activity across the population, plotted against preferred direction (deg), for a stimulus s.]

⟨r_i⟩ = f_i(x) = f(x − x_i), where x_i is the preferred direction of neuron i.

Poisson variability in cortex

[Figure: spike rasters for trials 1-4 with the same stimulus.]

Across trials, the variance of the spike count is approximately equal to the mean spike count.
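The mean-equals-variance signature above is easy to check by simulation; a minimal sketch (rates and trial counts are arbitrary demo values, not from the lecture):

```python
import numpy as np

# Fano-factor signature of Poisson variability: across repeated trials,
# the variance of the spike count roughly equals the mean spike count.
rng = np.random.default_rng(0)

rates = [2.0, 5.0, 10.0, 20.0]   # mean spike counts per trial (demo values)
n_trials = 200_000

for mean_count in rates:
    counts = rng.poisson(mean_count, size=n_trials)
    fano = counts.var() / counts.mean()
    print(f"mean={counts.mean():5.2f}  var={counts.var():6.2f}  Fano={fano:4.2f}")
```

The printed Fano factor stays near 1 across all rates, which is the variability assumed throughout the following slides.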

Noisy population codes

[Figure: noisy pattern of activity r plotted against preferred direction (deg). What are s and x?]

Poisson noise: p(r_i = k) = e^(−f_i(x)) f_i(x)^k / k!

Independent noise: p(r|x) = ∏_i p(r_i|x)

Population Vector

[Figure: noisy population activity against preferred direction (deg); each neuron votes for its preferred direction vector P_i. The population vector estimate x̂_pv is compared with maximum likelihood.]

x̂_pv = angle(Σ_i r_i P_i)

The population vector is generally suboptimal:

σ²_pv = ⟨(x̂_pv − x)²⟩ ≥ I(x)⁻¹

where I(x) is the Fisher information.

Fisher information: the maximum performance for an unbiased estimator.

[Figure: noisy population activity against preferred direction of motion (deg); maximum likelihood estimate x̂_ML with standard deviation σ_ML.]

x̂_ML = argmax_x p(r|x)

σ²_ML = I(x)⁻¹

Fisher Information

• For one neuron with Poisson noise: I(x) = f′(x)² / f(x)

• For n independent neurons: I(x) = Σ_i f_i′(x)² / f_i(x)

The more neurons, the better! Small variance is good! Large slope is good!

[Figure: the tuning curve f(x) (mean activity) and its derivative f′(x), plotted against direction (deg).]

Variance of the maximum likelihood estimate

• For a population of neurons with independent Poisson noise:

σ²_ML = 1 / Σ_i [ f_i′(x)² / f_i(x) ]

[Figure: noisy population activity against preferred direction of motion (deg).]
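The claim that ML decoding reaches the Cramér-Rao bound σ²_ML = 1/I(x) can be checked numerically. A sketch assuming Gaussian tuning curves with a small baseline; all parameter values (number of neurons, gain, widths) are illustrative choices, not from the lecture:

```python
import numpy as np

# ML decoding of a stimulus x from a population with independent Poisson
# noise, compared with the Cramer-Rao bound 1/I(x),
# I(x) = sum_i f_i'(x)^2 / f_i(x).
rng = np.random.default_rng(1)

centers = np.linspace(-100, 100, 41)       # preferred directions (deg)
gain, width = 20.0, 20.0                   # demo values

def f(x):
    """Mean spike counts of all neurons for stimulus x (Gaussian tuning)."""
    return gain * np.exp(-0.5 * ((x - centers) / width) ** 2) + 0.1

x_true = 0.0
grid = np.linspace(-50, 50, 1001)
F = np.stack([f(x) for x in grid])         # (grid points, neurons)
log_F = np.log(F)
sum_F = F.sum(axis=1)

estimates = []
for _ in range(2000):
    r = rng.poisson(f(x_true))             # one noisy trial
    loglik = log_F @ r - sum_F             # log p(r|x) up to a constant
    estimates.append(grid[np.argmax(loglik)])

# Fisher information at x_true (numerical derivative of the tuning curves)
dx = 1e-4
fp = (f(x_true + dx) - f(x_true - dx)) / (2 * dx)
I = np.sum(fp ** 2 / f(x_true))

print("empirical variance:", np.var(estimates))
print("Cramer-Rao bound  :", 1.0 / I)
```

The two printed numbers agree to within sampling error, illustrating that the ML estimator is (asymptotically) efficient for this code.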

Recurrent network

[Figure: a noisy population input is fed into a recurrent network, which relaxes to a smooth hill of activity; the estimate x̂_r is read from the position of the hill.]

If the network is a line attractor, and if the network parameters (the recurrent weights W and nonlinearity in τ dr/dt = −r + W·f(r)) are matched to the covariance of the neural noise and to the derivative of the tuning curve, we have:

σ_r = σ_ML

Line attractor networks

[Figure: the network activity plotted in the state space of three neurons (Neuron 1, Neuron 2, Neuron 3). The smooth hills of activity, one per stimulus value, form a one-dimensional stable manifold (the line attractor). A noisy population response is a point off the manifold; the recurrent dynamics project it onto the manifold, and the position of the resulting hill is the network estimate x̂_r.]

The quality of the estimate depends on:

• the covariance of the neural noise,
• the shape of the stable manifold,
• the direction of projection.

When the direction of projection is matched to the noise covariance and to the derivative of the tuning curve (network dynamics τ dr/dt = −r + W·f(r)), the network estimate is optimal: σ_r = σ_ML.

Cue integration

Visual capture: the ventriloquism effect.

Multi-sensory integration

[Figure: probability distributions over the position of the object, x.]

p(x | r_vis, r_aud) = (1/N) p(x | r_vis) p(x | r_aud)

Combining cues from several modalities

[Figure: p(x|r_vis) and p(x|r_aud) combine into p(x|r_vis, r_aud), plotted against the position of the object, x.]

Combining cues from several modalities: Gaussian distributions

[Figure: visual likelihood centered on x̂_vis with width σ_vis, auditory likelihood centered on x̂_aud with width σ_aud, and the combined estimate x̂_bi.]

x̂_bi = w_vis x̂_vis + w_aud x̂_aud, with w_vis = σ²_aud / (σ²_vis + σ²_aud) and w_aud = σ²_vis / (σ²_vis + σ²_aud)

Visual estimate x_vis, auditory estimate x_aud, bimodal estimate:

• Visual more reliable: visual capture.
• Auditory more reliable: auditory capture.

Analogy with the center of mass.
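The reliability-weighted rule above fits in a few lines; a minimal sketch (the positions and standard deviations are arbitrary demo values):

```python
import numpy as np

# Reliability-weighted cue combination for Gaussian likelihoods:
# x_bi = w_vis*x_vis + w_aud*x_aud, with weights given by the inverse
# variances; the combined variance satisfies 1/s_bi^2 = 1/s_vis^2 + 1/s_aud^2.
def combine(x_vis, sig_vis, x_aud, sig_aud):
    w_vis = sig_aud**2 / (sig_vis**2 + sig_aud**2)
    w_aud = sig_vis**2 / (sig_vis**2 + sig_aud**2)
    x_bi = w_vis * x_vis + w_aud * x_aud
    sig_bi = np.sqrt(1.0 / (1.0 / sig_vis**2 + 1.0 / sig_aud**2))
    return x_bi, sig_bi

# Vision more reliable: the combined estimate is captured by vision
# ("ventriloquism"), and it is more precise than either single cue.
x_bi, sig_bi = combine(x_vis=0.0, sig_vis=1.0, x_aud=10.0, sig_aud=3.0)
print(x_bi, sig_bi)   # x_bi = 1.0, sig_bi ≈ 0.95
```

With σ_vis = 1 and σ_aud = 3 the visual weight is 0.9, so a 10-unit audio-visual conflict only pulls the estimate 1 unit away from vision: the center-of-mass analogy in numbers.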

Ernst and Banks, 2002 (from Banks et al, Nature, 2002)

[Figure: stimuli with a visual width x_v and a haptic width x_t, judged bimodally. Left panel: discrimination threshold (STD) against visual noise level (%): the measured bimodal STD matches the prediction of the optimal model and lies below both the unimodal visual STD and the unimodal tactile STD. Right panel: visual weight against visual noise level (%): the measured visual weight decreases with visual noise, as predicted.]

Prior, likelihood and posterior

[Figure: neural response against preferred position (x).]

Likelihood: p(r|x)
Prior: p(x)
Posterior (Bayes' theorem): p(x|r) = p(r|x) p(x) / p(r)

Prior, likelihood and posterior

[Figure: neural response against preferred position (x).]

p(x|r) ∝ ∏_i p(r_i|x) = K ∏_i f_i(x)^(r_i) e^(−f_i(x))

With a flat prior (and a total rate Σ_i f_i(x) independent of x):

p(x|r) = K′ ∏_i f_i(x)^(r_i)

log p(x|r) = K″ + Σ_i r_i log f_i(x)
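The log-posterior formula above can be applied directly to decode a full posterior from spike counts; a sketch assuming Gaussian tuning curves with a small baseline (tuning parameters and gain are illustrative demo values):

```python
import numpy as np

# Posterior over x from a Poisson population response with a flat prior:
# log p(x|r) = const + sum_i r_i log f_i(x) - sum_i f_i(x)
# (the last term is the exact Poisson normalizer; it is ~constant in x
# for a dense array, as the slide assumes).
rng = np.random.default_rng(2)

centers = np.linspace(-45, 45, 31)
tuning_width = 15.0

def f(x):
    return 10.0 * np.exp(-0.5 * ((x - centers) / tuning_width) ** 2) + 1e-3

r = rng.poisson(f(0.0))                    # noisy response to x = 0

grid = np.linspace(-45, 45, 901)
log_post = np.array([r @ np.log(f(x)) - f(x).sum() for x in grid])
post = np.exp(log_post - log_post.max())
post /= post.sum()

x_hat = grid[np.argmax(post)]
post_std = np.sqrt(np.sum(post * (grid - np.sum(post * grid)) ** 2))
print("posterior peak:", x_hat, " posterior std:", post_std)
```

The posterior peaks near the true stimulus, and its width quantifies the certainty carried by the population response, which is the reading used in the next slide.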

Gain as encoding certainty

Visual input: p(x|r_vis). [Figure: population activity against preferred direction of motion (deg), with ML estimate x̂_ML^vis and width σ_ML^vis.] σ²_ML = σ²_1.

Auditory input: p(x|r_aud). [Figure: population activity with x̂_ML^aud and width σ_ML^aud.] σ²_ML = σ²_2.

Product: p(x|r_bi) ∝ p(x|r_vis) p(x|r_aud), yielding x̂_ML^bi with 1/σ²_bi = 1/σ²_1 + 1/σ²_2.

Integrating several noisy population codes

[Figure: three population codes: a visual layer (activity against preferred eye-centered position), an eye position layer (activity against preferred eye position), and an auditory layer (activity against preferred head-centered position). Head-centered position = eye-centered position + eye position.]

[Figure: the same three layers (visual, eye position, auditory), decoded by maximum likelihood.]

Multidirectional sensory prediction

[Figure: visual, eye position and auditory layers: activity in any two layers predicts the activity in the third.]

How do we do coordinate transforms?

x_tac = x_vis + x_eye

[Figure: visual, eye position and auditory layers.]

Not like this!

[Figure: simply adding the visual (eye-centered) layer and the eye position layer does not produce the head-centered layer.]

Look-up table

[Figure: a two-dimensional look-up table indexed by preferred eye-centered position and preferred eye position, driving the head-centered layer.]

Radial basis function map, iterative (IBF)

[Figure: a two-dimensional radial basis function layer connecting the visual (eye-centered), eye position and head-centered (auditory) layers.]

Bimodal visuo-tactile input

[Figure: visual, eye position and auditory layers; in the state space of three neurons, the network relaxes onto the attractor manifold, yielding the estimates x̂_vis, x̂_eye and x̂_aud.]

Bimodal visuo-tactile input, weak tactile input

[Figure: the same network with a weaker input in one modality.]

Unimodal visual input

[Figure: the same network driven by the visual layer alone.]

Result: IBF is an optimal Bayesian estimator.

Performance: the variance of the estimates is less than 4% worse than the variance of the best (Bayesian) estimator.

x̂_bi = w_vis x̂_vis + w_aud (x̂_aud − x̂_eye), remapping the auditory estimate into eye-centered coordinates, with σ_r = σ_ML.

The network (visual layer, tactile layer, eye position layer) accounts for Ernst and Banks, 2002 (from Banks et al, Nature, 2002).

[Figure: measured bimodal threshold (STD) against visual noise level (%) matches the prediction of the optimal model; the measured visual weight decreases with visual noise level, consistent with the unimodal visual and tactile STDs.]

Why does it work? Reason 1: Line attractor networks work for different gains

[Figure: population activity against preferred direction of motion (deg), with gain G; ML estimate x̂_ML.]

x̂_ML = argmax_x p(r|x)

With tuning curves G·f_i(x):

σ²_ML(G) = 1 / Σ_i [ G² f_i′(x)² / (G f_i(x)) ] = σ²_ML(G=1) / G

The network relaxes onto its attractor for any gain (τ dr/dt = −r + W·f(r)), and its estimate satisfies σ²_r = σ²_ML(G=1) / G.

Why does it work? Reason 2: For independent Poisson noise, summing the activities is equivalent to multiplying the posteriors

Visual input: p(x|r_vis), with variance σ²_1. Auditory input: p(x|r_aud), with variance σ²_2.

Product: p(x|r_bi) ∝ p(x|r_vis) p(x|r_aud), with 1/σ²_bi = 1/σ²_1 + 1/σ²_2, obtained by summing the two population responses.

[Figure: the summed population activity r_vis + r_aud (higher gain) yields the bimodal ML estimate x̂_ML^bi with width σ_ML^bi.]

Generalization: Implicit probabilistic encoding

• Linear combinations correspond to optimal cue integration as long as the distribution of neural noise belongs to the exponential family:

p(r|x) = Z(r) exp(h(x)·r)

We then have:

p(x | r_1 + r_2) = (1/Z) p(x|r_1) p(x|r_2)

Beck, Latham and Pouget, 2007
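The sums-equal-products identity can be verified numerically for a Poisson population code (a member of this exponential family when the summed tuning curve is constant in x, e.g. a dense circular array of von Mises tuning curves; all parameters are demo values):

```python
import numpy as np

# Check: for Poisson population codes with constant total rate,
# p(x | r1 + r2) equals the normalized product p(x|r1) p(x|r2).
centers = np.linspace(-180, 180, 72, endpoint=False)

def f(x):
    # von Mises tuning; the summed rate of the dense circular array is
    # independent of x, so log f_i(x) plays the role of h_i(x).
    return 5.0 * np.exp(2.0 * (np.cos(np.radians(x - centers)) - 1.0))

def posterior(r, grid):
    log_p = np.array([r @ np.log(f(x)) for x in grid])
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

rng = np.random.default_rng(3)
grid = np.linspace(-90, 90, 361)
r1 = rng.poisson(f(0.0))          # response to one cue
r2 = rng.poisson(f(10.0))         # response to a second, conflicting cue

p_sum = posterior(r1 + r2, grid)
p_prod = posterior(r1, grid) * posterior(r2, grid)
p_prod /= p_prod.sum()

print("max |p_sum - p_prod| =", np.abs(p_sum - p_prod).max())
```

The difference is at the level of floating-point rounding: adding the two spike-count vectors and decoding once gives exactly the product of the two single-cue posteriors.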

Structure of the iterative basis function network.

Multi-sensory representation: receptive fields in different modalities are roughly aligned in space (visual and tactile receptive fields in VIP). Is multi-sensory integration simply realized by the convergence of all sensory inputs on a common spatial map?

Frame of reference of a multi-sensory cell's receptive field: sensory alignment hypothesis.

Model prediction: partially shifting receptive fields.

Duhamel et al, 1997, 2001

The modality dominance influences the receptive field shift.

[Figure: visual-dominant versus tactile-dominant cells: the cell's estimate lies between the visual estimate and the tactile estimate, closer to the dominant modality.] (Avillac et al, 2001)

Visual and tactile receptive fields of a VIP cell.

VIP: Duhamel et al, 1997; Avillac et al, 2001. LIP, SC: Jay and Sparks, 1986; Stricanne et al, 1995.

Shift ratio as a function of eye-centered dominance (the confidence given to the visual modality)

[Figure: shift ratio (0 to 1) against eye-centered dominance, for visual input and for tactile (auditory) input.]

Head-centered: PMv, Graziano and Gross, 1998

Taking time into account

• Inference and predictions need to take place on short time scales.

• Events start and end unpredictably, objects move...

• We need to estimate the states of relevant variables on-line. There is no time to converge to an attractor.

• Spike-per-spike computations (and coding?)

Explicit encoding of probabilities by population codes

Convolutional codes (Zemel, Dayan, Pouget, Sahani): r_i = ∫ p(x|s) f_i(x) dx

Basis function coding (Anderson, Barber, Eliasmith): p(x|s) = Σ_i r_i f_i(x)

Time? Inference is hard. Experimental evidence is lacking.

It is not enough to store past input: it must be combined with new input.

[Animation: a belief P(guilty) over three suspects (Dupont, Durand, Smith) is updated as each new report arrives; likewise a belief P(search) over locations (London, Paris).]

Hidden Markov Model

[Diagram: hidden variable x_(t−dt) → x_t → x_(t+dt) causes the observations s_(t−1), s_t, s_(t+1).]

Inference can be performed recurrently:

p(x_(t+dt) | s_(0:t+dt)) = (1/Z) p(s_(t+dt) | x_(t+dt)) Σ_(x_t) p(x_(t+dt) | x_t) p(x_t | s_(0:t))

Observations are spikes, s_i^t = 0 (no spike) or 1 (spike):

p(s_t | x_t) = ∏_i p(s_i^t | x_t), with p(s_i^t = 1 | x_t) = f_i(x_t) dt

Example 1: Detecting the presence of a stimulus

Hidden Markov model with a binary hidden variable, x_t = 1 or 0 (stimulus present/absent).

p(x_(t+dt) = 0 or 1 | x_t = 1 or 0): probability that the stimulus appears/disappears.

f_i(1), f_i(0): firing rate of neuron i given that the stimulus is present/absent.
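The recursion above can be run directly for this detection example; a minimal sketch (all rates, the time step and the switching probabilities are arbitrary demo values):

```python
import numpy as np

# Recursive Bayesian filtering of a binary hidden variable (stimulus
# present/absent) observed through a Poisson spike train:
# p(x_{t+dt}|s_0..t+dt) ∝ p(s_{t+dt}|x) * sum_x' p(x|x') p(x'|s_0..t)
rng = np.random.default_rng(4)

dt = 0.001                      # time step (s)
r_on, r_off = 2.0, 2.0          # appearance/disappearance rates (Hz)
f1, f0 = 50.0, 10.0             # firing rate (Hz), stimulus on / off

# transition matrix T[i, j] = p(x_t = j | x_{t-dt} = i); state 0 = off
T = np.array([[1 - r_on * dt, r_on * dt],
              [r_off * dt, 1 - r_off * dt]])

# simulate: stimulus turns on halfway through
n = 2000
x = np.zeros(n, dtype=int)
x[n // 2:] = 1
spikes = rng.random(n) < np.where(x == 1, f1, f0) * dt

# filter
p = np.array([0.5, 0.5])        # [p(off), p(on)]
p_on = np.empty(n)
for t in range(n):
    p = T.T @ p                                       # prediction step
    lik = np.where(spikes[t], [f0 * dt, f1 * dt],     # observation step
                   [1 - f0 * dt, 1 - f1 * dt])
    p = p * lik
    p /= p.sum()
    p_on[t] = p[1]

print("mean P(on) before onset:", p_on[:n // 2].mean())
print("mean P(on) after onset :", p_on[n // 2:].mean())
```

The posterior probability of "stimulus present" stays low before onset and rises quickly once the input spikes arrive at the higher rate.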

Example 2: Tracking the state of the arm during movement

Hidden variable x_t: position, speed and acceleration of the arm.

p(x_t | x_(t−dt)): noisy dynamics of the arm.

p(s_i^t = 1 | x_t) = f_i(x_t) dt: tuning curve of neuron i.

Evidence for "explicit" encoding of probability in neural activity: Shadlen et al

Gold and Shadlen, 2004. Random-dot motion task: rightward motion → saccade right; leftward motion → saccade left.

Medio-temporal cortex (MT), lateral intra-parietal cortex (LIP)

[Figure: MT firing rate over time for the preferred and non-preferred direction; LIP firing rate ramps up over time, with a slope that increases with motion coherence.]

LIP/MT model (sequential probability test)

MT evidence is accumulated by LIP into the log odds

L_t = log [ p(right | s_(0:t)) / p(left | s_(0:t)) ]

(and its opposite for the other choice).

Response right > response left → saccade right; response left > response right → saccade left.


Evidence for "explicit" encoding of probability in neural activity: Shadlen et al

Go to the log (replace products by sums):

p(x | s_(0:t+dt)) = (1/Z) p(s_(t+dt) | x) p(x | s_(0:t))

Assume the direction of motion never changes during the trial. Then

L_(t+dt) = L_t + Σ_i w_i s_i^t

or, in the limit of a small temporal discretization step dt:

dL_t/dt = Σ_i w_i s_i(t)

with L_t = log [ p(x=1 | s_(0:t)) / p(x=0 | s_(0:t)) ] and w_i = log [ f_i(1) / f_i(0) ], the log ratio of neuron i's firing rate when x is 1 versus when x is 0.

Bayesian temporal integration in single neurons

[Diagram: hidden variable x_t with spike observations s_t; input neurons with rates f_i(0) and f_i(1).]

Rate of switching on: r_on = (1/dt) p(x_(t+dt) = 1 | x_t = 0)

Rate of switching off: r_off = (1/dt) p(x_(t+dt) = 0 | x_t = 1)

Bayesian temporal integration in single neurons

[Diagram: input spike trains s_i^t with synaptic weights w_i converge on the neuron; q_on and q_off denote the on/off transition rates.]

dL_t/dt = r_on (1 + e^(−L)) − r_off (1 + e^(L)) + Σ_i w_i s_i^t

[Figure: L_t over time.] Leak + synaptic input:

dL_t/dt = Φ(L_t) + Σ_i w_i s_i^t

Bayesian integration corresponds to leaky synaptic integration.
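This correspondence can be checked by comparing the exact recursive filter with a log-odds ODE integrated by Euler steps; a sketch with arbitrary demo parameters (here the constant no-spike evidence term, of size f1 − f0, is written explicitly rather than absorbed into the leak Φ):

```python
import numpy as np

# Exact two-state filter vs. the log-odds "leaky integrator" form.
rng = np.random.default_rng(5)

dt = 0.001                      # s
r_on, r_off = 1.0, 1.0          # transition rates (Hz)
f1, f0 = 40.0, 10.0             # input rates (Hz), stimulus on / off
w = np.log(f1 / f0)             # synaptic weight = log likelihood ratio

n = 3000
x = np.zeros(n, dtype=int)
x[1000:2000] = 1                # stimulus on for 1 s
spikes = rng.random(n) < np.where(x == 1, f1, f0) * dt

# exact recursive filter in the probability domain, read out as log odds
T = np.array([[1 - r_on * dt, r_on * dt],
              [r_off * dt, 1 - r_off * dt]])
p = np.array([0.5, 0.5])
L_exact = np.empty(n)
for t in range(n):
    p = T.T @ p
    p = p * ([f0 * dt, f1 * dt] if spikes[t] else [1 - f0 * dt, 1 - f1 * dt])
    p /= p.sum()
    L_exact[t] = np.log(p[1] / p[0])

# log-odds ODE: nonlinear leak + no-spike evidence + synaptic input
L = 0.0
L_ode = np.empty(n)
for t in range(n):
    L += dt * (r_on * (1 + np.exp(-L)) - r_off * (1 + np.exp(L)) - (f1 - f0))
    if spikes[t]:
        L += w                  # each input spike adds log(f1/f0)
    L_ode[t] = L

print("max |L_exact - L_ode| =", np.abs(L_exact - L_ode).max())
```

The two trajectories agree closely: between spikes the log odds decay nonlinearly (the leak set by the transition rates), and each spike adds a fixed synaptic weight.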

[Figure: log odd ratio L_t over time (panel A: slow dynamics; panel B: fast dynamics).]

Accumulation of evidence results in linear ramps followed by a saturation.

If the stimulus varies very slowly, the neuron is an integrator. For faster stimuli, the integration is leaky.

Importance of using the right temporal statistics.

[Figure: left/right motion evidence tracked under slow and fast switching between motion right and motion left.]

Generalization to multiple states

Multiple choice, sequences of discrete states, continuous variables. With L_k^t = log p(x_t = x_k | s_(0:t)):

dL_k/dt = Σ_l M_kl e^(L_l − L_k) + Σ_j W_kj s_j^t

where the feedforward weights W_kj encode the tuning p(s_j | x_t = x_k) and the lateral weights M_kl encode the transition probabilities p(x_(t+dt) = x_k | x_t = x_l).

[Figure: input layer connected to a recurrent layer through feedforward connections W and lateral connections M.]

Alternative approaches

• Linearization of the recurrent equations; recurrent network of leaky integrate-and-fire neurons (Rao, 2001).

• Membrane potentials are log probabilities; spike generation exponentiates (Rao, 2003).

• Firing rates are probabilities; multiplicative dendrites (Beck and Pouget, 2007).

• Generalization of line attractor networks (Deneve et al, 2007).

• Deterministic spike generation mechanism: spikes signal increases in probability (Deneve, 2007).

Optimal filtering of noisy sensory input in the presence of movement

[Figure: noisy spiking inputs tuned to arm position, arm speed and force (spike counts against preferred arm position, preferred arm speed and preferred force, over time), together with the efferent motor commands f(1), f(2), ..., f(t−1), drive an internal model of the arm state x_0 → x_1 → x_2.]

Network model

[Figure: a sensori-motor map receives position input, speed input and the efferent motor command. Feedforward connections W carry the inputs x_p (position), x_s (speed) and x_c (command); predictive (lateral) connections M link the map units.]

[Figure: spike counts against preferred arm position and preferred arm speed at successive times; the decoded position x̂_p(t) and speed x̂_s(t) track the arm.]

Tuning curves of sensorimotor neurons reflect the reliability and covariance of the different sensory and motor variables.

Prediction: partially shifting representation of position and speed.

[Figure: activity against speed (cm/sec) with the arm at 0 cm and at 20 cm.]

Individual cells shift their velocity tuning curves with position: the cell's selectivity is described by a 2D tuning curve. The population vector nevertheless points in the right direction (unbiased), even though individual tuning curves are shifting.

Spike generation

• Rate coding: information is in the spike count; spike times are random (Poisson).

• Deterministic spike generation rule: each spike signals a change in the belief state.

LIP/MT model: the log odds L_t = log [ p(right) / p(left) ] are reflected in the firing rate.

Noisy reports on a murder case? [Animation: P(guilty) over Dupont, Durand and Smith is updated as successive reports come in: Durand, Durand, Smith, Durand, Dupont.]

LIP/MT model: integrate and fire, or deterministic firing?

[Figure: synaptic input drives the log odds L_t = log[p(right)/p(left)] over time; output spikes O_t.]

Mechanism for spike generation in Bayesian neurons:

• Integrate the input (what I know): dL_t/dt = Φ(L_t) + Σ_i w_i s_i^t

• Integrate the output (what I told): dG_t/dt = Φ(G_t) + g_o O_t

• Fire whenever what I know, L_t, exceeds what I told in the past, G_t, by more than the quantization step g_o.

[Figure: L_t and G_t (log odds) over time; spikes are emitted whenever L_t − G_t crosses the threshold.]
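A minimal sketch of this spike-generation rule, reusing the binary-variable log-odds dynamics from earlier slides (all parameters, including the quantization step g_o and the choice to run the same leak on G, are demo assumptions):

```python
import numpy as np

# Bayesian spiking neuron sketch: L_t tracks the log odds of the hidden
# variable ("what I know"); G_t tracks what the output spikes have
# reported ("what I told"); a spike is fired when L_t - G_t > g_o.
rng = np.random.default_rng(6)

dt, n = 0.001, 3000
r_on, r_off = 1.0, 1.0
f1, f0 = 40.0, 10.0
w = np.log(f1 / f0)             # input synaptic weight
g_o = 1.0                       # log-odds value of one output spike (demo)

x = np.zeros(n, dtype=int)
x[1000:2000] = 1
spikes_in = rng.random(n) < np.where(x == 1, f1, f0) * dt

def leak(v):
    # prior dynamics of the hidden variable, including the constant
    # no-spike evidence term (f1 - f0)
    return r_on * (1 + np.exp(-v)) - r_off * (1 + np.exp(v)) - (f1 - f0)

L = G = 0.0
spikes_out = np.zeros(n, dtype=bool)
for t in range(n):
    L += dt * leak(L) + w * spikes_in[t]   # integrate input (what I know)
    G += dt * leak(G)                      # same dynamics on the output trace
    if L - G > g_o:                        # belief grew by one quantum
        G += g_o
        spikes_out[t] = True

on_rate = spikes_out[1000:2000].mean() / dt
off_rate = spikes_out[np.r_[0:1000, 2000:3000]].mean() / dt
print(f"output rate on: {on_rate:.1f} Hz, off: {off_rate:.1f} Hz")
```

The spike train is a deterministic function of the input, yet the output rate is much higher while the stimulus is on, when the belief keeps being pushed upward.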

[Figure: L_t and G_t over time.] The Bayesian neuron is an integrator with an adapting threshold, equivalently an integrate-and-fire neuron with an adaptive time constant and membrane potential V_t = L_t − G_t.

Neurons cannot all be integrators.

[Figure: log odds L_t over time at successive stages of a chain of Bayesian neurons.]

Spikes deterministically signal an increase in probability.

Generalization to continuous variables

[Figure: a population of Bayesian neurons tracks a continuous variable x over time, through feedforward connections W and lateral connections M.]

dL_k/dt = Σ_l M_kl e^(L_l − L_k) + Σ_j W_kj s_j^t

Threshold for spiking: the output log odds G_k follow the same dynamics, driven by the output spikes O^t.

Implication 1: Spikes signal increases in probability

A spike signals an increase in the probability of a binary variable. [Figure: P(x_t) over time; each increase triggers a spike transmitted through the synapse.]

Implication 2: Firing statistics are similar to Poisson, but these fluctuations purely reflect input noise.

[Figure: interspike interval (ISI) histogram; variance of the spike count against mean spike count follows the Poisson identity line.]

The spike train is a deterministic function of the input.

Implication 3: Firing precision depends on the proportion of meaningful fluctuations in the input.

[Figure: output spikes O_t over time; spike rasters across trials are reproducible when the input fluctuations are shared and meaningful, variable when they are noise.]

Conclusion

Information encoded in noisy population codes can be decoded using recurrent line attractor networks.

However, in order to combine cues and to integrate information over time, it is necessary to represent probability distributions, not only estimates of sensory and motor variables.

This representation could be implicit (i.e. gain encoding), but this implies strong limitations on the statistical problems that can be solved.

This representation could be more explicit. This seems to account for the integrate-and-fire dynamics of biological neurons.

But this puts into question the traditional concepts of "neural signal" and "neural noise". Spikes signaling deterministic changes in the belief state share properties with Poisson-distributed "rate models", but their fluctuations merely reflect fluctuations in the input.

This also implies whole new sets of predictions for the link between biophysics and behavior.

Towards the Bayesian Brain?

[Diagram: state of the world or body → internal beliefs (inference, prediction: models, probabilities, priors...) → behavioral strategies (decisions, explorations: maximize utility/reward). Bayesian models.]

Expectation-Maximization

Repeat:

• Expectation step: compute the expected value of the state given the observations and the current set of parameters.

• Maximization step: choose the parameters maximizing the probability of the observations given the expected value of the state.

[Diagram: hidden Markov model with hidden variable x_t and observations s_t.]
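As a generic illustration of these two alternating steps (run here on a simple latent-variable model, a two-component 1D Gaussian mixture, rather than the lecture's HMM; all numbers are demo values):

```python
import numpy as np

# Expectation-Maximization on a mixture of two 1D Gaussians.
rng = np.random.default_rng(7)

# data: two clusters with true means -2 and 3
data = np.concatenate([rng.normal(-2.0, 1.0, 500),
                       rng.normal(3.0, 1.0, 500)])

mu = np.array([-1.0, 1.0])      # initial parameter guesses
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for _ in range(50):
    # E step: expected assignment of each point to each component,
    # given the observations and the current parameters
    lik = pi * np.exp(-0.5 * ((data[:, None] - mu) / sigma) ** 2) / sigma
    resp = lik / lik.sum(axis=1, keepdims=True)
    # M step: parameters maximizing the expected data likelihood
    Nk = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((resp * (data[:, None] - mu) ** 2).sum(axis=0) / Nk)
    pi = Nk / len(data)

print(np.sort(mu))   # close to the true means (-2, 3)
```

The same E/M alternation applied to the HMM (with the forward-backward passes of the next slides playing the role of the E step) yields on-line learning of the transition rates and tuning weights.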

Learning

[Animation: the expected state E(x_t) over time is computed from a forward pass L_f and a backward pass L_b, combined into L_tot; the parameters are then updated and the procedure repeats.]

Dynamics at multiple time scales in spiking neurons

[Figure: fast time constant — inference and spiking (L_t and the output spikes O_t over time). Slow time constant — learning and adaptation by on-line EM: the weights w_i and the transition rates r_on, r_off drift slowly over hours.]

Learning in networks

[Figure: hidden causes h_1...h_4 (e.g. h(cuddly zebra) = 1, h(dangerous tiger) = 1) generate sensory inputs s_1...s_6 (e.g. s(striped patch) = 1). The hidden units compete through divisive inhibition.]

[Figure: p(h_2 = 1 | s) as a function of contrast.]