center for computational biology department of mathematical sciences montana state university...

Center for Computational Biology

Department of Mathematical Sciences

Montana State University

Neural Coding and Decoding

Albert E. Parker

Collaborators: Alexander Dimitrov, Tomas Gedeon, John P. Miller, Zane Aldworth

Problem: Determine a coding scheme: How does neural ensemble activity represent information about sensory stimuli?

Our Approach:

• Construct a model using probability and information theory.

• Optimize the model to cluster the neural responses, which gives an approximation of a coding scheme given the available data.

• Apply our method to the cricket cercal system.

Neural Coding and Decoding.

Goal: What conditions must a coding scheme satisfy?

Demands:

• An animal needs to recognize the same object on repeated exposures. Coding has to be deterministic at this level.

• The code must deal with uncertainties introduced by the environment and neural architecture. Coding is by necessity stochastic at this finer scale.

Major Problem: The search for a coding scheme requires large amounts of data.

How to determine a coding scheme?

Idea: Model a part of a neural system as a communication channel using Information Theory. This model enables us to:

• Meet the demands of a coding scheme:

  o Define a coding scheme as a relation between stimulus and neural response classes.

  o Construct a coding scheme that is stochastic on the finer scale yet almost deterministic on the classes.

• Deal with the major problem:

  o Use whatever quantity of data is available to construct coarse but optimally informative approximations of the coding scheme.

  o Refine the coding scheme as more data becomes available.

• Investigate the cricket cercal sensory system.

A Stochastic Map

[Diagram: input X -> Q(Y|X) -> output Y]

The relationship between X and Y is completely described by the conditional probability Q.

Realizations of X and Y in neural coding

[Figure: a stimulus sequence X=x and a response sequence Y=y, related by Q(Y=y|X=x).]
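A stochastic map on finite alphabets can be stored as a matrix of conditional probabilities. The sketch below is only an illustration under assumed values: the 3-stimulus, 4-response matrix `Q`, the prior `p_x`, and the helper names `sample_response` and `joint_distribution` are hypothetical and do not come from the slides.

```python
import numpy as np

# Hypothetical toy channel: 3 stimuli (rows) and 4 responses (columns).
# Q[x, y] = Q(Y=y | X=x); each row is a distribution over responses.
Q = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.3, 0.6],
])

def sample_response(Q, x, rng):
    """Draw one response index y with probability Q(Y=y | X=x)."""
    return int(rng.choice(Q.shape[1], p=Q[x]))

def joint_distribution(p_x, Q):
    """Combine a stimulus prior p(x) with the map: p(x, y) = p(x) Q(y|x)."""
    return p_x[:, None] * Q

rng = np.random.default_rng(0)
p_x = np.array([0.5, 0.3, 0.2])          # assumed stimulus prior
p_xy = joint_distribution(p_x, Q)        # joint p(X, Y)
responses = [sample_response(Q, 0, rng) for _ in range(5)]
```

Repeated presentations of the same stimulus (`x = 0` here) can give different responses, which is the stochastic behavior on the finer scale that the model is meant to capture.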

Determining Stimulus/Response Classes

Given a joint probability p(X,Y):

[Figure: axes are stimulus sequences (X) and response sequences (Y); regions are labeled 1 to 4.]

Stimulus and Response Classes

[Figure: axes are stimulus sequences (X) and response sequences (Y); the distinguishable stimulus/response classes are labeled 1 to 4.]

Information Theoretic Quantities

A quantizer or encoder, Q, relates the environmental stimulus, X, to the neural response, Y, through a process called quantization. In general, Q is a stochastic map

\[ Q(y|x): X \to Y \]

The reproduction space Y is a quantization of X. This can be repeated: let Yf be a reproduction of Y. So there is a quantizer

\[ q(y_f|y): Y \to Y_f \]

Use mutual information to measure the degree of dependence between X and Yf:

\[ I(X,Y_f) = \sum_{x,y,y_f} q(y_f|y)\, p(x,y) \log \frac{\sum_{y'} q(y_f|y')\, p(x,y')}{p(x) \sum_{y'} p(y')\, q(y_f|y')} \]

Use conditional entropy to measure the self-information of Yf given Y:

\[ H(Y_f|Y) = -\sum_{y,y_f} p(y)\, q(y_f|y) \log q(y_f|y) \]
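For discrete X, Y and Yf, both quantities can be evaluated directly from a joint table p(x,y) and a quantizer q(yf|y). The following is a minimal sketch, assuming base-2 logarithms (bits) and the array layout `p_xy[x, y]`, `q[f, y]`; the function names and the toy numbers are illustrative choices, not part of the original slides.

```python
import numpy as np

def mutual_information(p_xy, q):
    """I(X,Yf) in bits for joint p_xy[x, y] and quantizer q[f, y] = q(yf|y)."""
    p_xf = p_xy @ q.T                       # p(x, yf) = sum_y q(yf|y) p(x, y)
    p_x = p_xf.sum(axis=1, keepdims=True)   # p(x)
    p_f = p_xf.sum(axis=0, keepdims=True)   # p(yf) = sum_y p(y) q(yf|y)
    mask = p_xf > 0                         # convention: 0 log 0 = 0
    ratio = p_xf[mask] / (p_x @ p_f)[mask]
    return float(np.sum(p_xf[mask] * np.log2(ratio)))

def conditional_entropy(p_y, q):
    """H(Yf|Y) = -sum_{y,yf} p(y) q(yf|y) log2 q(yf|y), in bits."""
    mask = q > 0
    weighted = (q * p_y[None, :])[mask]
    return float(-np.sum(weighted * np.log2(q[mask])))

# Toy joint distribution: responses 0,1 follow stimulus 0; responses 2,3 follow stimulus 1.
p_xy = np.array([[0.25, 0.25, 0.00, 0.00],
                 [0.00, 0.00, 0.25, 0.25]])
p_y = p_xy.sum(axis=0)

q_hard = np.array([[1.0, 1.0, 0.0, 0.0],    # deterministic: two clean classes
                   [0.0, 0.0, 1.0, 1.0]])
q_flat = np.full((2, 4), 0.5)               # maximally uncertain quantizer

info_hard = mutual_information(p_xy, q_hard)
info_flat = mutual_information(p_xy, q_flat)
ent_hard = conditional_entropy(p_y, q_hard)
ent_flat = conditional_entropy(p_y, q_flat)
```

The two extremes show the trade-off between the quantities: the deterministic quantizer keeps all the information about X but has zero conditional entropy, while the flat quantizer has maximal conditional entropy and keeps no information.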

The Model

Problem: Determining a coding scheme between X and Y requires large amounts of data.

Idea: Determine the coding scheme between X and Yf, a clustering (reproduction) of Y, such that Yf preserves as much information (mutual information) with X as possible and the self-information (entropy) of Yf|Y is maximized. That is, we are searching for an optimal mapping (quantizer)

\[ q^*(y_f|y): Y \to Y_f \]

that satisfies these conditions.

Justification: Jaynes' maximum entropy principle, which states that of all the quantizers that satisfy a given set of constraints, one should choose the one that maximizes the entropy.

Equivalent Optimization Problems

Maximum entropy:

maximize F(q(yf|y)) = H(Yf|Y) constrained by I(X,Yf) ≥ Io. Io determines the informativeness of the reproduction.

Deterministic annealing (Rose, ’98):

maximize F(q(yf|y)) = H(Yf|Y) + β I(X,Yf). Small β favors maximum entropy; large β favors maximum I(X,Yf).

Augmented Lagrangian with Newton CG line search

Implicit solution:

\[ q(y_f|y) = \frac{e^{-\beta (\nabla_q D_I)_{y_f y} / p(y)}}{\sum_{y_f} e^{-\beta (\nabla_q D_I)_{y_f y} / p(y)}} \]

where D_I = I(X,Y) - I(X,Yf), so that -∇q D_I = ∇q I(X,Yf).

Simplex Algorithm:

maximize I(X,Yf) over vertices of constraint space
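The annealing formulation can be sketched as a fixed-point iteration on the stationarity condition of H(Yf|Y) + β I(X,Yf), where β denotes the annealing parameter. This is a minimal sketch under stated assumptions, not the authors' solver: the β schedule, the iteration count, the small random perturbation used to break symmetry, and the name `anneal_quantizer` are all hypothetical. It uses the fact that, up to terms that do not depend on yf, the gradient of I(X,Yf) with respect to q(yf|y) divided by p(y) equals sum_x p(x|y) log p(x|yf).

```python
import numpy as np

def anneal_quantizer(p_xy, n_classes, betas, n_iter=50, seed=0):
    """Fixed-point sketch for maximizing H(Yf|Y) + beta * I(X,Yf).

    p_xy[x, y] is the joint distribution; the returned q[f, y] = q(yf|y)
    has columns that sum to one.
    """
    rng = np.random.default_rng(seed)
    tiny = 1e-12                                    # guards log(0) and 0/0
    p_y = p_xy.sum(axis=0)
    p_x_given_y = p_xy / p_y                        # column y holds p(x|y)
    q = np.full((n_classes, p_xy.shape[1]), 1.0 / n_classes)
    for beta in betas:
        # Small random perturbation so the uniform solution can split.
        q = q * (1.0 + 0.01 * rng.random(q.shape))
        q /= q.sum(axis=0, keepdims=True)
        for _ in range(n_iter):
            p_xf = p_xy @ q.T                       # p(x, yf)
            p_x_given_f = p_xf / np.maximum(p_xf.sum(axis=0), tiny)
            # grad[f, y] = sum_x p(x|y) log p(x|yf)
            grad = np.log(p_x_given_f + tiny).T @ p_x_given_y
            logits = beta * grad
            logits -= logits.max(axis=0, keepdims=True)   # stable softmax
            q = np.exp(logits)
            q /= q.sum(axis=0, keepdims=True)
    return q

# Toy joint distribution: responses 0,1 favor stimulus 0; responses 2,3 favor stimulus 1.
p_xy = np.array([[0.20, 0.20, 0.05, 0.05],
                 [0.05, 0.05, 0.20, 0.20]])
q = anneal_quantizer(p_xy, n_classes=2, betas=[0.5, 1, 2, 4, 8, 16, 32])
labels = q.argmax(axis=0)    # hard class of each response after annealing
```

At small β the iteration relaxes to the uniform (maximum entropy) quantizer; as β grows the quantizer sharpens toward a nearly deterministic assignment. Which class index each group receives depends on the random perturbation.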

Application to synthetic data (p(X,Y) is known)

[Figure: random clusters]
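When p(X,Y) is known and the response alphabet is tiny, the vertices of the constraint space (the deterministic quantizers) can simply be enumerated. The sketch below is a brute-force stand-in for a vertex search, not the simplex algorithm itself; the toy joint distribution and the names `mutual_information` and `best_vertex` are assumptions made for illustration.

```python
import itertools
import numpy as np

def mutual_information(p_xy, q):
    """I(X,Yf) in bits for joint p_xy[x, y] and quantizer q[f, y]."""
    p_xf = p_xy @ q.T
    p_x = p_xf.sum(axis=1, keepdims=True)
    p_f = p_xf.sum(axis=0, keepdims=True)
    mask = p_xf > 0
    return float(np.sum(p_xf[mask] * np.log2(p_xf[mask] / (p_x @ p_f)[mask])))

def best_vertex(p_xy, n_classes):
    """Enumerate every deterministic quantizer and keep the most informative."""
    n_y = p_xy.shape[1]
    best_info, best_labels = -1.0, None
    for labels in itertools.product(range(n_classes), repeat=n_y):
        q = np.zeros((n_classes, n_y))
        q[list(labels), np.arange(n_y)] = 1.0       # one-hot column per response
        info = mutual_information(p_xy, q)
        if info > best_info + 1e-12:                # keep the first of any ties
            best_info, best_labels = info, labels
    return best_labels, best_info

# Synthetic joint distribution with two response clusters.
p_xy = np.array([[0.20, 0.20, 0.05, 0.05],
                 [0.05, 0.05, 0.20, 0.20]])
best_labels, best_info = best_vertex(p_xy, n_classes=2)
full_info = mutual_information(p_xy, np.eye(4))     # I(X,Y) with no quantization
```

The enumeration visits n_classes ** n_y quantizers, so it is only practical for very small examples; it is meant as a check on what an optimizer should find, not as a scalable method.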

Application to Real Data (the probabilities p(y) and p(x,y) are NOT known)

• p(x,y) cannot be estimated directly for rich stimulus sets: there is not enough data.

• I(X,Yf) = H(X) - H(X|Yf). Only H(X|Yf) depends on the quantizer q(yf|y), so an upper bound on H(X|Yf) produces a lower bound on I(X,Yf).

• The conditional entropy

\[ H(X|Y_f) = E_{y_f} H(X|y_f) \]

is bounded by a Gaussian:

\[ H(X|Y_f) \le H_G(X|Y_f) = E_{y_f} \frac{1}{2} \log\left( (2\pi e)^{|X|} \det C_{X|y_f} \right) \]

where |X| is the dimension of the stimulus space and C_{X|y_f} is the conditional covariance of the stimulus.

• The conditional covariance is

\[ C_{X|y_f} = E_{y|y_f} C_{X|y} + E_{y|y_f}\, \mu_{X|y} \mu_{X|y}^T - \left( E_{y|y_f}\, \mu_{X|y} \right) \left( E_{y|y_f}\, \mu_{X|y} \right)^T \]

which is written explicitly as a function of the quantizer (through p(y|yf)). We estimate the means \mu_{X|y} and the covariance matrices C_{X|y}.
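The Gaussian bound can be evaluated once per-response stimulus means and covariances have been estimated. The sketch below assumes natural logarithms (nats), that every class has positive probability, and hypothetical array layouts `mu[y, i]`, `cov[y, i, j]`, `q[f, y]`; the class covariance is combined with the law of total covariance, and the function name is an illustrative choice.

```python
import numpy as np

def gaussian_entropy_bound(p_y, mu, cov, q):
    """Upper bound H_G(X|Yf) in nats.

    p_y[y]       : response probabilities
    mu[y, i]     : mean stimulus given response y
    cov[y, i, j] : stimulus covariance given response y
    q[f, y]      : quantizer q(yf|y), columns sum to one
    """
    d = mu.shape[1]                                   # stimulus dimension |X|
    p_f = q @ p_y                                     # p(yf)
    p_y_given_f = q * p_y[None, :] / p_f[:, None]     # p(y|yf) by Bayes' rule
    second_moment = cov + mu[:, :, None] * mu[:, None, :]
    bound = 0.0
    for f in range(q.shape[0]):
        w = p_y_given_f[f]
        mean_f = w @ mu                               # E_{y|yf} of the means
        cov_f = np.einsum("y,yij->ij", w, second_moment) - np.outer(mean_f, mean_f)
        _, logdet = np.linalg.slogdet(cov_f)
        bound += p_f[f] * 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)
    return float(bound)

# One-dimensional toy example: two responses with different mean stimuli.
p_y = np.array([0.5, 0.5])
mu = np.array([[-1.0], [1.0]])
cov = np.array([[[1.0]], [[1.0]]])

h_split = gaussian_entropy_bound(p_y, mu, cov, np.eye(2))               # two classes
h_merged = gaussian_entropy_bound(p_y, mu, cov, np.array([[1.0, 1.0]]))  # one class
```

Merging the two responses into a single class raises the bound, because the class covariance then also carries the spread between the two means; a larger H_G(X|Yf) means a smaller lower bound on I(X,Yf).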

The Optimization Problem for Real Data

Maximum entropy: maximize F(q(yf|y)) = H(Yf|Y) constrained by

H(X) - HG(X|Yf) ≥ Io. Io determines the informativeness of the reproduction.

Modeling the cricket cercal sensory system as a communication channel

[Diagram: signal -> nervous system, viewed as a communication channel.]

Wind Stimulus and Neural Response in the cricket cercal system

Neural responses (over a 30-minute recording) caused by a white-noise wind stimulus. Time axis: T, ms.

Neural responses (these are all doublets) for a 12 ms window.

Some of the air current stimuli preceding one of the neural responses. Time is in ms; at T=0, the first spike occurs.

Quantization: A quantizer is any map f: Y -> Yf from Y to a reproduction space Yf with finitely many elements. Quantizers can be

• deterministic, y -> yf, or

• probabilistic, q(yf|y).

[Figure: the spaces X, Y and Yf, showing a quantization of Y into Yf and a refined quantization.]
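Both kinds of quantizer can be written as a matrix whose column y holds q(yf|y): a deterministic quantizer has one-hot columns, a probabilistic one has general probability columns. The sketch below is an illustration under assumed values; the helper names, the label lists, and the numbers in `q_prob` and `p_y` are hypothetical.

```python
import numpy as np

def deterministic_quantizer(labels, n_classes):
    """One-hot matrix with q[f, y] = 1 when response y is assigned to class f."""
    labels = np.asarray(labels)
    q = np.zeros((n_classes, labels.size))
    q[labels, np.arange(labels.size)] = 1.0
    return q

def is_quantizer(q):
    """Every column must be a probability distribution over the classes."""
    return bool(np.all(q >= 0) and np.allclose(q.sum(axis=0), 1.0))

def is_deterministic(q):
    """True when every entry is exactly 0 or 1."""
    return bool(np.all((q == 0) | (q == 1)))

q_det = deterministic_quantizer([0, 0, 1, 1], n_classes=2)
q_refined = deterministic_quantizer([0, 0, 1, 2], n_classes=3)   # class 1 split in two
q_prob = np.array([[0.9, 0.8, 0.3, 0.0],
                   [0.1, 0.2, 0.7, 1.0]])

p_y = np.array([0.4, 0.1, 0.3, 0.2])    # assumed response probabilities
p_f_det = q_det @ p_y                   # class probabilities p(yf)
p_f_prob = q_prob @ p_y
```

In this representation, refining a quantizer adds a row (a new class) and reassigns some responses to it, as `q_refined` does for the last response.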

Applying the algorithm to cricket sensory data.

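[Figure: the responses Y and two quantizations Yf, one with two classes (Class 1, Class 2) and one with three classes (Class 1, Class 2, Class 3).]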

Conclusions

We

• model a part of the neural system as a communication channel.

• define a coding scheme through relations between classes of stimulus/response pairs.

  - Coding is probabilistic on the individual elements of X and Y.

  - Coding is almost deterministic on the stimulus/response classes.

To recover such a coding scheme, we

• propose a new method to quantify neural spike trains.

  - Quantize the response patterns to a small finite space (Yf).

  - Use information theoretic measures to determine optimal quantizer for a fixed reproduction size.

  - Refine the coding scheme by increasing the reproduction size.

• present preliminary results with cricket sensory data.
