Center for Computational Biology, Department of Mathematical Sciences
Montana State University
Collaborators: Alexander Dimitrov, Tomas Gedeon, John P. Miller, Zane Aldworth
Neural Coding and Decoding
Albert E. Parker
Problem: Determine a coding scheme: How does neural ensemble activity represent information about sensory stimuli?
Our Approach:
• Construct a model using Probability and Information Theory.
• Optimize the model to cluster the neural responses, which gives an approximation of a coding scheme given the available data.
• Apply our method to the cricket cercal system.
Neural Coding and Decoding.
Goal: What conditions must a coding scheme satisfy?
Demands:
• An animal needs to recognize the same object on repeated exposures. Coding has to be deterministic at this level.
• The code must deal with uncertainties introduced by the environment and neural architecture. Coding is by necessity stochastic at this finer scale.
Major Problem: The search for a coding scheme requires large amounts of data
How to determine a coding scheme?
Idea: Model a part of a neural system as a communication channel using Information Theory. This model enables us to:
• Meet the demands of a coding scheme:
  o Define a coding scheme as a relation between stimulus and neural response classes.
  o Construct a coding scheme that is stochastic on the finer scale yet almost deterministic on the classes.
• Deal with the major problem:
  o Use whatever quantity of data is available to construct coarse but optimally informative approximations of the coding scheme.
  o Refine the coding scheme as more data becomes available.
• Investigate the cricket cercal sensory system.
A Stochastic Map
[Diagram: input X -> Q(Y|X) -> output Y]
The relationship between X and Y is completely described by the conditional probability Q.
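As a minimal numerical sketch of a stochastic map (the matrix values and the helper name `sample_response` are made up for illustration and are not from the talk), Q can be stored as a row-stochastic matrix whose row x holds Q(Y|X=x). Presenting the same stimulus repeatedly then yields different responses, drawn from that row:

```python
import numpy as np

# Hypothetical stochastic map Q(Y|X): 3 stimuli (rows) x 4 responses (columns).
# Each row is a probability distribution over responses; values are illustrative.
Q = np.array([
    [0.70, 0.20, 0.05, 0.05],
    [0.10, 0.60, 0.20, 0.10],
    [0.05, 0.05, 0.30, 0.60],
])

def sample_response(Q, x, rng):
    """Draw one response index y with probability Q[x, y]."""
    return int(rng.choice(Q.shape[1], p=Q[x]))

rng = np.random.default_rng(0)
# Five presentations of stimulus x = 0 give five (possibly different) responses.
samples = [sample_response(Q, 0, rng) for _ in range(5)]
```

A deterministic code is the special case in which every row of Q has a single entry equal to 1.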
Realizations of X and Y in neural coding
[Figure: a stimulus sequence X=x and a response sequence Y=y, related by Q(Y=y|X=x).]
Determining Stimulus/Response Classes
Given a joint probability p(X,Y):
[Figure: joint probability p(X,Y); horizontal axis X (stimulus sequences), vertical axis Y (response sequences); labels 1, 2, 3, 4.]
Stimulus and Response Classes
[Figure: distinguishable stimulus/response classes 1, 2, 3, 4; horizontal axis X (stimulus sequences), vertical axis Y (response sequences).]
Information Theoretic Quantities
A quantizer or encoder, Q, relates the environmental stimulus, X, to the neural response, Y, through a process called quantization. In general, Q is a stochastic map
$$Q(y|x): X \to Y$$
The reproduction space Y is a quantization of X. This can be repeated: let Yf be a reproduction of Y. So there is a quantizer
$$q(y_f|y): Y \to Y_f$$
Use mutual information to measure the degree of dependence between X and Yf:
$$I(X,Y_f) = \sum_{x,y,y_f} q(y_f|y)\, p(x,y) \log \frac{\sum_{y} q(y_f|y)\, p(x,y)}{p(x) \sum_{y} p(y)\, q(y_f|y)}$$
Use conditional entropy to measure the self-information of Yf given Y:
$$H(Y_f|Y) = -\sum_{y,y_f} p(y)\, q(y_f|y) \log q(y_f|y)$$
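The two quantities on this slide can be evaluated directly when the joint distribution is small enough to store as an array. The sketch below is an illustration, not the authors' code: the function names, the array layout (`p_xy[x, y]`, `q[y, f]`), and the toy distribution are all assumptions made here.

```python
import numpy as np

def mutual_information(p_xy, q):
    """I(X,Yf) in nats. p_xy[x, y] is the joint p(x, y); q[y, f] is q(yf|y)."""
    p_xf = p_xy @ q                          # p(x, yf) = sum_y q(yf|y) p(x, y)
    p_x = p_xy.sum(axis=1, keepdims=True)    # p(x), shape (n_x, 1)
    p_f = p_xf.sum(axis=0, keepdims=True)    # p(yf) = sum_y p(y) q(yf|y)
    mask = p_xf > 0                          # 0 * log 0 is treated as 0
    ratio = p_xf[mask] / (p_x @ p_f)[mask]
    return float(np.sum(p_xf[mask] * np.log(ratio)))

def conditional_entropy(p_y, q):
    """H(Yf|Y) in nats for response probabilities p_y[y] and quantizer q[y, f]."""
    joint = p_y[:, None] * q                 # p(y, yf)
    mask = q > 0
    return float(-np.sum(joint[mask] * np.log(q[mask])))

# Toy joint distribution: 2 stimuli, 4 responses (illustrative values).
p_xy = np.array([[0.20, 0.20, 0.05, 0.05],
                 [0.05, 0.05, 0.20, 0.20]])
p_y = p_xy.sum(axis=0)

q_identity = np.eye(4)                       # Yf = Y, no clustering
q_det = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
q_uniform = np.full((4, 2), 0.5)             # maximally random quantizer

i_full = mutual_information(p_xy, q_identity)
i_det = mutual_information(p_xy, q_det)
i_uniform = mutual_information(p_xy, q_uniform)
h_det = conditional_entropy(p_y, q_det)
h_uniform = conditional_entropy(p_y, q_uniform)
```

In this toy case the two-class deterministic quantizer keeps all of the information in Y about X, while the uniform quantizer keeps none but has the largest conditional entropy, which is the trade-off the model balances.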
The Model
Problem: To determine a coding scheme between X and Y requires large amounts of data.
Idea: Determine the coding scheme between X and Yf, a clustering (reproduction) of Y, such that Yf preserves as much information (mutual information) with X as possible and the self-information (entropy) of Yf|Y is maximized. That is, we are searching for an optimal mapping (quantizer)
$$q^*(y_f|y): Y \to Y_f$$
that satisfies these conditions.
Justification: Jaynes' maximum entropy principle, which states that of all the quantizers that satisfy a given set of constraints, choose the one that maximizes the entropy.
Equivalent Optimization Problems
Maximum entropy:
maximize F(q(yf|y)) = H(Yf|Y) constrained by I(X,Yf) ≥ Io. Io determines the informativeness of the reproduction.
Deterministic annealing (Rose, '98):
maximize F(q(yf|y)) = H(Yf|Y) + β I(X,Yf). Small β favors maximum entropy; large β favors maximum I(X,Yf).
Augmented Lagrangian with Newton CG line search
Implicit solution:
$$q(y_f|y) = \frac{e^{\beta (\nabla_q I)_{y_f y} / p(y)}}{\sum_{y_f} e^{\beta (\nabla_q I)_{y_f y} / p(y)}}$$
Simplex Algorithm:
maximize I(X,Yf) over vertices of constraint space
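One way to read the annealing formulation is as a fixed-point iteration at each value of β. The sketch below is an assumption-laden illustration rather than the authors' implementation: the function names, the β schedule, the iteration count, the perturbation size, and the toy distribution are all chosen here. It relies on the fact that, for F = H(Yf|Y) + β I(X,Yf), the gradient of I with respect to q(yf|y), divided by p(y), equals minus the Kullback-Leibler divergence between p(x|y) and p(x|yf), up to a term that depends only on y and cancels when q is normalized.

```python
import numpy as np

def class_conditionals(p_xy, q):
    """Return p(x|yf), shape (n_x, n_f), for the quantizer q[y, f] = q(yf|y)."""
    p_xf = p_xy @ q
    return p_xf / np.clip(p_xf.sum(axis=0, keepdims=True), 1e-300, None)

def fixed_point_step(p_xy, q, beta):
    """One update q(yf|y) proportional to exp(-beta * KL(p(x|y) || p(x|yf))).

    Assumes p_xy is strictly positive so that all logarithms are finite.
    """
    p_x_given_y = p_xy / p_xy.sum(axis=0, keepdims=True)       # (n_x, n_y)
    p_x_given_f = class_conditionals(p_xy, q)                  # (n_x, n_f)
    log_ratio = (np.log(p_x_given_y)[:, :, None]
                 - np.log(p_x_given_f)[:, None, :])            # (n_x, n_y, n_f)
    kl = np.einsum("xy,xyf->yf", p_x_given_y, log_ratio)       # kl[y, f]
    logits = -beta * kl
    logits -= logits.max(axis=1, keepdims=True)                # numerical stability
    q_new = np.exp(logits)
    return q_new / q_new.sum(axis=1, keepdims=True)

def anneal(p_xy, n_classes, betas, n_iter=200, seed=0):
    """Iterate the fixed-point map along an increasing schedule of beta."""
    rng = np.random.default_rng(seed)
    q = np.full((p_xy.shape[1], n_classes), 1.0 / n_classes)   # uniform start
    for beta in betas:
        # Small random perturbation so q can leave the uniform fixed point.
        q = q + 1e-3 * rng.random(q.shape)
        q /= q.sum(axis=1, keepdims=True)
        for _ in range(n_iter):
            q = fixed_point_step(p_xy, q, beta)
    return q

# Toy joint distribution: columns are p(x|y) for 6 responses; p(y) is uniform.
cond = np.array([
    [0.40, 0.45, 0.35, 0.10, 0.10, 0.10],
    [0.40, 0.35, 0.45, 0.10, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.40, 0.45, 0.35],
    [0.10, 0.10, 0.10, 0.40, 0.35, 0.45],
])
p_xy = cond / 6.0

q = anneal(p_xy, n_classes=2, betas=[0.5, 2.0, 5.0, 20.0, 50.0])
labels = q.argmax(axis=1)    # hard class assignment of each response
```

At small β the iteration stays near the uniform (maximum entropy) quantizer; as β grows, the quantizer sharpens toward a nearly deterministic assignment of the six responses to two classes.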
Random clusters
Application to synthetic data (p(X,Y) is known)
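Application to Real Data (the probabilities p(y) and p(x,y) are NOT known)
• p(x,y) cannot be estimated directly for rich stimulus sets: there is not enough data.
• I(X,Yf) = H(X) - H(X|Yf). Only H(X|Yf) depends on the quantizer q(yf|y), so an upper bound of H(X|Yf) produces a lower bound of I(X,Yf).
• The conditional entropy
$$H(X|Y_f) = E_{y_f} H(X|y_f)$$
is bounded by a Gaussian:
$$H(X|Y_f) = E_{y_f} H(X|y_f) \le H_G(X|Y_f) \equiv E_{y_f}\, \frac{1}{2} \log\left( (2\pi e)^{|X|} \det C_{X|y_f} \right)$$
where $C_{X|y_f}$ is the conditional covariance of the stimulus and |X| is the dimension of the stimulus.
• The conditional covariance is
$$C_{X|y_f} = E_{y|y_f}\left[ C_{X|y} + \mu_{X|y}\, \mu_{X|y}^T \right] - \left( E_{y|y_f}\, \mu_{X|y} \right) \left( E_{y|y_f}\, \mu_{X|y} \right)^T$$
where $E_{y|y_f}$ is the expectation with respect to p(y|yf), which is written explicitly as a function of the quantizer (through p(y|yf)). We estimate the means $\mu_{X|y}$ and the covariance matrices $C_{X|y}$.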
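The Gaussian bound can be sketched numerically as follows. This is an illustration under assumptions made here, not the authors' code: the function names, the array layout, and the synthetic samples are hypothetical, and the class covariance is assembled from per-response means and covariances by the law of total covariance.

```python
import numpy as np

def class_covariances(p_y, q, means, covs):
    """Stimulus covariance within each class yf.

    p_y[y]: response probabilities; q[y, f]: quantizer q(yf|y);
    means[y]: mean stimulus given y, shape (n_y, d);
    covs[y]: stimulus covariance given y, shape (n_y, d, d).
    """
    joint = p_y[:, None] * q                                  # p(y, yf)
    p_f = joint.sum(axis=0)                                   # p(yf)
    p_y_given_f = joint / p_f                                 # p(y|yf)
    second = covs + np.einsum("yi,yj->yij", means, means)     # E[x x^T | y]
    mean_f = np.einsum("yf,yi->fi", p_y_given_f, means)       # E[x | yf]
    second_f = np.einsum("yf,yij->fij", p_y_given_f, second)  # E[x x^T | yf]
    cov_f = second_f - np.einsum("fi,fj->fij", mean_f, mean_f)
    return p_f, cov_f

def gaussian_entropy_bound(p_y, q, means, covs):
    """H_G(X|Yf) = E_yf[ 0.5 * log((2 pi e)^d det C_{X|yf}) ], in nats."""
    p_f, cov_f = class_covariances(p_y, q, means, covs)
    d = means.shape[1]
    _, logdet = np.linalg.slogdet(cov_f)                      # one log-det per class
    return float(np.sum(p_f * 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)))

# Synthetic stand-in for data: 4 responses, 50 stimulus samples each, d = 3.
rng = np.random.default_rng(0)
offsets = np.array([[0, 0, 0], [0.5, 0, 0], [3, 3, 0], [3, 3.5, 0]], dtype=float)
samples = rng.normal(size=(4, 50, 3)) + offsets[:, None, :]
means = samples.mean(axis=1)
covs = np.stack([np.cov(s.T, bias=True) for s in samples])
p_y = np.full(4, 0.25)

q = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)  # merge {0,1} and {2,3}
p_f, cov_f = class_covariances(p_y, q, means, covs)
h_g = gaussian_entropy_bound(p_y, q, means, covs)
```

Because the class covariance depends on q only through p(y|yf), the per-response means and covariances are estimated once and reused for every candidate quantizer.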
The Optimization Problem for Real Data
Maximum entropy: maximize F(q(yf|y)) = H(Yf|Y) constrained by H(X) - HG(X|Yf) ≥ Io. Io determines the informativeness of the reproduction.
Modeling the cricket cercal sensory system as a communication channel
[Diagram: Signal -> Nervous system, modeled as a communication channel.]
Wind Stimulus and Neural Response in the cricket cercal system
[Figure, Y: neural responses (over a 30 minute recording) caused by white noise wind stimulus. Neural responses (these are all doublets) for a 12 ms window. Axis: T, ms.]
[Figure, X: some of the air current stimuli preceding one of the neural responses. Axis: time in ms. At T=0, the first spike occurs.]
Quantization:
A quantizer is any map f: Y -> Yf from Y to a reproduction space Yf with finitely many elements. Quantizers can be
• deterministic: y -> yf, or
• probabilistic: q(yf|y).
[Diagram labels: Y, Yf, refined.]
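The three kinds of quantizer named on this slide can be written down as small matrices. The sketch below is illustrative only: the helper names `deterministic_quantizer` and `refine`, the six-response example, and the softening weights are assumptions made here.

```python
import numpy as np

def deterministic_quantizer(labels, n_classes):
    """One-hot matrix q[y, f] = 1 if response y is assigned to class f, else 0."""
    q = np.zeros((len(labels), n_classes))
    q[np.arange(len(labels)), labels] = 1.0
    return q

def refine(labels, y_to_move, new_class):
    """Refined labeling: the responses listed in y_to_move get a new class."""
    refined = np.array(labels).copy()
    refined[list(y_to_move)] = new_class
    return refined

coarse = np.array([0, 0, 0, 1, 1, 1])                # 6 responses, 2 classes
fine = refine(coarse, y_to_move=[2], new_class=2)    # split class 0: 3 classes

q_coarse = deterministic_quantizer(coarse, 2)
q_fine = deterministic_quantizer(fine, 3)

# A probabilistic quantizer is any row-stochastic matrix;
# here, a softened version of the coarse deterministic one.
q_soft = 0.9 * q_coarse + 0.1 / 2

# Merging the new class back into class 0 recovers the coarse quantizer.
merge = np.array([[1, 0], [0, 1], [1, 0]], dtype=float)
q_merged = q_fine @ merge
```

Since the coarse classes are a function of the refined ones, a refinement of this kind cannot lose information about the stimulus.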
Applying the algorithm to cricket sensory data.
[Figure: responses Y and two reproductions Yf, one with Class 1 and Class 2, and one with Class 1, Class 2, and Class 3.]
Conclusions
We
• model a part of the neural system as a communication channel.
• define a coding scheme through relations between classes of stimulus/response pairs.
  - Coding is probabilistic on the individual elements of X and Y.
  - Coding is almost deterministic on the stimulus/response classes.
To recover such a coding scheme, we
• propose a new method to quantify neural spike trains.
  - Quantize the response patterns to a small finite space (Yf).
  - Use information theoretic measures to determine the optimal quantizer for a fixed reproduction size.
  - Refine the coding scheme by increasing the reproduction size.
• present preliminary results with cricket sensory data.