

Modelling and Control Issues Arising in the Quest for a Neural Decoder

Computation, Control, and Biological Systems Conference VIII,

July 30, 2003

Albert E. Parker

Complex Biological Systems Department of Mathematical Sciences

Center for Computational Biology

Montana State University

Collaborators: Tomas Gedeon, Alex Dimitrov, John Miller, and Zane Aldworth

Talk Outline

• The Neural Coding Problem
• A Clustering Problem
• The Dynamical System
• The Role of Bifurcation Theory
• A new algorithm to solve the Neural Coding Problem

The Neural Coding Problem

GOAL: To understand the neural code.

EASIER GOAL: We seek an answer to the question,

How does neural activity represent information about environmental stimuli?

“The little fly sitting in the fly’s brain trying to fly the fly”

inputs: stimuli X

outputs: neural responses Y

Looking for the dictionary to the neural code …

decoding

encoding

… but the dictionary is not deterministic!

Given a stimulus, an experimenter observes many different neural responses:

X → Y_i | X, i = 1, 2, 3, 4

Neural coding is stochastic!!

Similarly, neural decoding is stochastic:

Y → X_i | Y, i = 1, 2, …, 9

Probability Framework

X: environmental stimuli

Y: neural responses

decoder: P(X|Y)

encoder: P(Y|X)

The Neural Coding Problem: How to determine the encoder P(Y|X) or the decoder P(X|Y)?

Common Approaches: parametric estimations, linear methods

Difficulty: There is never enough data.

One Approach: Cluster the responses

X: Stimuli, L objects {x_i}

Y: Responses, K objects {y_i}

Y_N: Clustered Responses, N objects {y_Ni}

p(X,Y) links X and Y; q(Y_N|Y) maps Y to Y_N


P(Y|X)   P(Y_N|X)

P(X|Y)   P(X|Y_N)

• q(Y_N|Y) is a stochastic clustering of the responses.

• To address the insufficient data problem, one clusters the outputs Y into clusters Y_N so that the information that one can learn about X by observing Y_N, I(X;Y_N), is as close as possible to the mutual information I(X;Y).
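The quantity I(X;Y_N) can be computed directly from p(X,Y) and a clustering q(Y_N|Y). The following is a minimal sketch, not the speaker's code: the toy distribution `p_xy`, the clusterings `q_good` and `q_uniform`, and both function names are assumptions made for illustration. It shows that a clustering which merges responses with the same stimulus profile keeps all of I(X;Y), while the uniform clustering keeps none.

```python
import math

def mutual_information(joint):
    """I(A;B) in bits for a joint distribution stored as rows joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    total = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                total += p * math.log2(p / (pa[a] * pb[b]))
    return total

def cluster_joint(p_xy, q):
    """p(x, y_N) = sum_y p(x, y) q(y_N | y), with q[n][k] = q(y_N = n | y = k)."""
    n_y = len(q[0])
    return [[sum(p_xy[x][k] * q[n][k] for k in range(n_y))
             for n in range(len(q))]
            for x in range(len(p_xy))]

# Hypothetical toy joint distribution p(X, Y): 2 stimuli, 4 responses.
p_xy = [[0.20, 0.20, 0.05, 0.05],
        [0.05, 0.05, 0.20, 0.20]]

# Deterministic clustering: responses 0, 1 -> class 0; responses 2, 3 -> class 1.
q_good = [[1, 1, 0, 0],
          [0, 0, 1, 1]]

# Uniform clustering q = 1/N: every response is spread evenly over the classes.
q_uniform = [[0.5] * 4, [0.5] * 4]

i_xy = mutual_information(p_xy)
i_good = mutual_information(cluster_joint(p_xy, q_good))
i_uniform = mutual_information(cluster_joint(p_xy, q_uniform))
print(i_xy, i_good, i_uniform)
```

For any clustering, I(X;Y_N) cannot exceed I(X;Y), which is why the clustering problem is posed as getting as close to I(X;Y) as possible.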

Two optimization problems which use this approach

• Information Bottleneck Method (Tishby, Pereira, Bialek 1999)

$\min_q I(Y;Y_N)$ constrained by $I(X;Y_N) \ge I_0$

$\max_q \left(-I(Y;Y_N) + \beta I(X;Y_N)\right)$

• Information Distortion Method (Dimitrov and Miller 2001)

$\max_q H(Y_N|Y)$ constrained by $I(X;Y_N) \ge I_0$

$\max_q \left(H(Y_N|Y) + \beta I(X;Y_N)\right)$

In General: We have developed an approach to solve optimization problems of the form

$\max_q G(q)$ constrained by $D(q) \ge D_0$

or (using the method of Lagrange multipliers)

$\max_q F(q,\beta) = \max_q \left(G(q) + \beta D(q)\right)$

where

• $\beta \in [0,\infty)$.
• $\Delta$ is a subset of valid stochastic clusterings in $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to relabelling of the classes of $Y_N$.

Symmetry: invariance to relabelling of the clusters of Y_N

q(Y_N|Y): a clustering of the K objects {y_i} of Y into the N objects {y_Ni} of Y_N

Labelling the two classes shown as (class 1, class 2) or as (class 2, class 1) describes the same clustering, so G and D take the same value on both.
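The relabelling invariance can be checked numerically. This is a minimal sketch under stated assumptions: it uses the Information Distortion cost H(Y_N|Y) + βI(X;Y_N) as one instance of G + βD, and the toy `p_xy`, the clustering `q`, the value `beta=2.0`, and the function name are all hypothetical. It evaluates the cost for every permutation of the class labels.

```python
import math
from itertools import permutations

def info_distortion_cost(p_xy, q, beta):
    """F(q, beta) = H(Y_N|Y) + beta * I(X;Y_N) in bits; q[n][k] = q(y_N=n | y=k)."""
    n_x, n_y, n_c = len(p_xy), len(p_xy[0]), len(q)
    p_x = [sum(row) for row in p_xy]
    p_y = [sum(p_xy[x][k] for x in range(n_x)) for k in range(n_y)]
    # Conditional entropy H(Y_N | Y).
    h = -sum(p_y[k] * q[n][k] * math.log2(q[n][k])
             for n in range(n_c) for k in range(n_y) if q[n][k] > 0)
    # Joint p(x, y_N) and marginal p(y_N).
    p_xn = [[sum(p_xy[x][k] * q[n][k] for k in range(n_y)) for n in range(n_c)]
            for x in range(n_x)]
    p_n = [sum(p_xn[x][n] for x in range(n_x)) for n in range(n_c)]
    # Mutual information I(X; Y_N).
    i = sum(p_xn[x][n] * math.log2(p_xn[x][n] / (p_x[x] * p_n[n]))
            for x in range(n_x) for n in range(n_c) if p_xn[x][n] > 0)
    return h + beta * i

# Hypothetical toy problem: 2 stimuli, 4 responses, 3 classes.
p_xy = [[0.20, 0.20, 0.05, 0.05],
        [0.05, 0.05, 0.20, 0.20]]
q = [[0.7, 0.2, 0.1, 0.3],   # each column sums to 1
     [0.2, 0.5, 0.1, 0.3],
     [0.1, 0.3, 0.8, 0.4]]

# Relabelling the classes permutes the rows of q; the cost should not change.
costs = [info_distortion_cost(p_xy, [q[i] for i in perm], beta=2.0)
         for perm in permutations(range(3))]
print(max(costs) - min(costs))
```

All six relabellings give the same cost up to floating-point rounding, because every term is a sum over classes and does not depend on their order.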

An annealing algorithm to solve

$\max_q \left(G(q) + \beta D(q)\right)$

Let $q_0$ be the maximizer of $\max_q G(q)$, and let $\beta_0 = 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.

1. Perform $\beta$-step: Let $\beta_{k+1} = \beta_k + d_k$ where $d_k > 0$.

2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + \eta$ for some small perturbation $\eta$.

3. Optimization: solve $\max_q \left(G(q) + \beta_{k+1} D(q)\right)$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
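The three steps can be sketched in a few lines. This is an illustrative sketch only, with several assumptions that are not from the talk: the cost is the Information Distortion cost on a hypothetical toy `p_xy`, the optimizer is a simple finite-difference gradient ascent standing in for whatever solver is actually used, and the perturbation is applied to softmax parameters `theta` (which keep every column of q on the simplex) rather than to q itself.

```python
import math
import random

def cost(p_xy, q, beta):
    """F(q, beta) = H(Y_N|Y) + beta * I(X;Y_N) in nats; q[n][k] = q(y_N=n | y=k)."""
    n_x, n_y, n_c = len(p_xy), len(p_xy[0]), len(q)
    p_x = [sum(row) for row in p_xy]
    p_y = [sum(p_xy[x][k] for x in range(n_x)) for k in range(n_y)]
    h = -sum(p_y[k] * q[n][k] * math.log(q[n][k])
             for n in range(n_c) for k in range(n_y) if q[n][k] > 0)
    p_xn = [[sum(p_xy[x][k] * q[n][k] for k in range(n_y)) for n in range(n_c)]
            for x in range(n_x)]
    p_n = [sum(p_xn[x][n] for x in range(n_x)) for n in range(n_c)]
    i = sum(p_xn[x][n] * math.log(p_xn[x][n] / (p_x[x] * p_n[n]))
            for x in range(n_x) for n in range(n_c) if p_xn[x][n] > 0)
    return h + beta * i

def softmax_columns(theta):
    """Map unconstrained theta[n][k] to a stochastic clustering (columns sum to 1)."""
    n_c, n_y = len(theta), len(theta[0])
    q = [[0.0] * n_y for _ in range(n_c)]
    for k in range(n_y):
        m = max(theta[n][k] for n in range(n_c))
        e = [math.exp(theta[n][k] - m) for n in range(n_c)]
        for n in range(n_c):
            q[n][k] = e[n] / sum(e)
    return q

def maximize(p_xy, theta, beta, iters=100, h=1e-5):
    """Stand-in optimizer: finite-difference gradient ascent with step halving."""
    def f(t):
        return cost(p_xy, softmax_columns(t), beta)
    best, step = f(theta), 1.0
    rows, cols = len(theta), len(theta[0])
    for _ in range(iters):
        grad = [[0.0] * cols for _ in range(rows)]
        for n in range(rows):
            for k in range(cols):
                theta[n][k] += h
                up = f(theta)
                theta[n][k] -= 2 * h
                down = f(theta)
                theta[n][k] += h
                grad[n][k] = (up - down) / (2 * h)
        trial = [[theta[n][k] + step * grad[n][k] for k in range(cols)]
                 for n in range(rows)]
        value = f(trial)
        if value > best:          # accept only improving steps
            theta, best = trial, value
        else:
            step *= 0.5
    return theta, best

def anneal(p_xy, n_classes, beta_max=3.0, d_beta=0.5, eta=1e-2, seed=0):
    rng = random.Random(seed)
    n_y = len(p_xy[0])
    theta = [[0.0] * n_y for _ in range(n_classes)]   # q_0 uniform maximizes H(Y_N|Y)
    beta, path = 0.0, []
    while beta < beta_max:
        beta = min(beta + d_beta, beta_max)                        # 1. beta-step
        theta = [[t + rng.uniform(-eta, eta) for t in row]         # 2. perturbed guess
                 for row in theta]
        start = cost(p_xy, softmax_columns(theta), beta)
        theta, value = maximize(p_xy, theta, beta)                 # 3. optimization
        path.append((beta, start, value, softmax_columns(theta)))
    return path

# Hypothetical toy joint distribution p(X, Y): 2 stimuli, 4 responses.
p_xy = [[0.20, 0.20, 0.05, 0.05],
        [0.05, 0.05, 0.20, 0.20]]
path = anneal(p_xy, n_classes=2)
for beta, start, value, q in path:
    print(beta, round(value, 4), [[round(v, 3) for v in row] for row in q])
```

The perturbation in step 2 matters: started exactly at the uniform clustering, a gradient method has no reason to leave it, because the uniform clustering remains a critical point of the cost for every β.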

Application of the annealing method to the Information Distortion problem $\max_q \left(H(Y_N|Y) + \beta I(X;Y_N)\right)$ when p(X,Y) is defined by four Gaussian blobs

[Figure: p(X,Y) for 52 stimuli X and 52 responses Y]

[Figure: the clustering q(Y_N|Y) of the 52 responses Y into 4 clusters Y_N]

Evolution of the optimal clustering: Observed Bifurcations for the Four Blob problem:

We just saw the optimal clustering $q^*$ at some $\beta^* = \beta_{\max}$. What do the clusterings look like for $\beta < \beta_{\max}$?

• Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?

• What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?

• How many bifurcating solutions are there?

• What do the bifurcating branches look like? Are they subcritical or supercritical?

• What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem?

• Are there bifurcations after all of the classes have resolved?

[Figures, axis label q*: Conceptual Bifurcation Structure; Observed Bifurcations for the 4 Blob Problem]

Bifurcation theory in the presence of symmetries enables us to answer the questions previously posed …

Recall the Symmetries:

To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$.

q(Y_N|Y): a clustering of the K objects {y_i} of Y into the N objects {y_Ni} of Y_N

Swapping the labels class 1 and class 3 leaves $G(q) + \beta D(q)$ unchanged.

The symmetry group of all permutations on N symbols is $S_N$.

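Formulate a Dynamical System

Goal: To solve $\max_q \left(G(q) + \beta D(q)\right)$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to \infty$.

Method: Study the equilibria of the gradient flow

$$\begin{pmatrix}\dot q \\ \dot\lambda\end{pmatrix} = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta), \qquad \mathcal{L}(q,\lambda,\beta) := G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \Big(\sum_{z} q(z|y) - 1\Big)$$

• Equilibria of this system are possible solutions of the maximization problem (they satisfy the necessary conditions of constrained optimality).

• The Jacobian $\nabla^2_{q,\lambda}\mathcal{L}(q^*,\beta^*)$ of this flow is a Hessian and therefore symmetric, and so only bifurcations of equilibria can occur.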

Observed Bifurcation Structure

Group Structure

[Figure: the observed bifurcation structure (axis label q*) alongside the subgroup lattice of $S_4$: $S_4$ at the top, four copies of $S_3$, twelve copies of $S_2$, and the trivial group 1 at the bottom]

The Equivariant Branching Lemma shows that the bifurcation structure contains the branches …

[Figure: observed bifurcation structure (axis label q*) and group structure: $S_4$ at the top, with the labels (34,12), (24,13), and (23,14) below it]

The Smoller-Wasserman Theorem shows additional structure …

Theorem: There are exactly K/N bifurcations on the branch $(q_{1/N}, \beta)$ for the Information Distortion problem.

For the four blob problem (K = 52, N = 4), there are 13 bifurcations on the first branch.


Answers to the questions posed earlier:

• Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?
There are N-1 symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ for $M \le N$.

• What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?

• How many bifurcating solutions are there?
There are at least N from the first bifurcation, at least N-1 from the next one, etc.

• What do the bifurcating branches look like?
They are subcritical or supercritical depending on the sign of the bifurcation discriminator $\zeta(q^*, \beta^*, u_k)$.

• What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem?
No.

• Are there bifurcations after all of the classes have resolved?
In general, no.

[Figures, axis label q*: Conceptual Bifurcation Structure; Observed Bifurcations for the 4 Blob Problem]

Continuation techniques provide numerical confirmation of the theory

A closer look …

Bifurcation from $S_4$ to $S_3$ …

The bifurcation from $S_4$ to $S_3$ is subcritical …

(the theory predicted this since the bifurcation discriminator $\zeta(q_{1/4}, \beta^*, u) < 0$)

Additional structure!!

Conclusions …

We have a complete theoretical picture of how the clusterings evolve for any problem of the form

$\max_q \left(G(q) + \beta D(q)\right)$

subject to the assumptions stated earlier.

o When clustering to N classes, there are N-1 bifurcations.
o In general, there are only pitchfork and saddle-node bifurcations.
o We can determine whether pitchfork bifurcations are subcritical or supercritical (1st or 2nd order phase transitions).
o We know the explicit bifurcating directions.

SO WHAT??

There are theoretical consequences …

This yields a new and improved algorithm for solving the neural coding problem …

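A numerical algorithm to solve $\max_q \left(G(q) + \beta D(q)\right)$

Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.

1. Perform $\beta$-step: solve

$$\nabla^2_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)\begin{pmatrix}\partial_\beta q_k \\ \partial_\beta \lambda_k\end{pmatrix} = -\,\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)$$

for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$ where $d_k = \dfrac{s\,\mathrm{sgn}(\cos\theta)}{\left(\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1\right)^{1/2}}$.

2. The initial guess for $(q_{k+1}, \lambda_{k+1})$ at $\beta_{k+1}$ is $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)}) = (q_k, \lambda_k) + d_k\,(\partial_\beta q_k, \partial_\beta \lambda_k)$.

3. Optimization: solve $\max_q \left(G(q) + \beta_{k+1} D(q)\right)$ using pseudoarclength continuation to get the maximizer $q_{k+1}$ and the vector of Lagrange multipliers $\lambda_{k+1}$, using initial guess $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)})$.

4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\nabla^2_q \left[G(q_k) + \beta_k D(q_k)\right]$ and $\nabla^2_q \left[G(q_{k+1}) + \beta_{k+1} D(q_{k+1})\right]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where u is the bifurcating direction, and repeat step 3.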
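Two ingredients of this algorithm are easy to isolate: the step length $d_k$ from step 1 and the determinant sign test from step 4. The sketch below is illustrative only. `hessian_block` is a hypothetical 2×2 matrix chosen so that its determinant changes sign at β = 1.125; it is not derived from any G or D in the talk, and the function names are assumptions.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def step_size(s, dq, dlam, cos_theta):
    """d_k = s * sgn(cos theta) / sqrt(||dq||^2 + ||dlam||^2 + 1).

    cos_theta == 0 is treated as positive here (an assumption of this sketch).
    """
    norm_sq = sum(v * v for v in dq) + sum(v * v for v in dlam)
    return math.copysign(s, cos_theta) / math.sqrt(norm_sq + 1.0)

def hessian_block(beta):
    """Hypothetical Hessian block with determinant 2*beta - 2.25."""
    return [[1.0 - beta, 0.5],
            [0.5, -2.0]]

def first_sign_change(betas):
    """First pair of consecutive beta values whose determinants differ in sign."""
    for lo, hi in zip(betas, betas[1:]):
        if determinant(hessian_block(lo)) * determinant(hessian_block(hi)) < 0:
            return lo, hi
    return None

betas = [0.25 * k for k in range(9)]        # 0.0, 0.25, ..., 2.0
bracket = first_sign_change(betas)
print(bracket)                              # the interval containing beta = 1.125
print(step_size(0.1, [3.0], [4.0], -0.5))   # negative: the path direction reverses
```

A sign change only brackets the singular point between two consecutive values of β, so the resolution of the test is set by the step length.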

Application to cricket sensory data

E(X|Y_N): stimulus means conditioned on each of the classes

typical spike patterns

optimal quantizer