An exactly solvable maximum entropy model
Peter Latham, Gatsby Computational Neuroscience Unit, UCL
CNS, July 20, 2006

TRANSCRIPT

Page 1:

An exactly solvable maximum entropy model

Peter Latham
Gatsby Computational Neuroscience Unit, UCL

CNS, July 20, 2006

Page 2:

The neural coding problem

s → r1, r2, ..., rn   ?

Page 3:

The neural coding problem

s → r1, r2, ..., rn

P(s|r1, r2, ..., rn) = ?

Page 4:

The neural coding problem

s → r1, r2, ..., rn

P(r1, r2, ..., rn|s) → P(s|r1, r2, ..., rn) = ?

Page 5:

The neural coding problem

s → r1, r2, ..., rn

P(s|r1, r2, ..., rn) = P(r1, r2, ..., rn|s) P(s) / P(r1, r2, ..., rn)    (Bayes)

Page 6:

[figure: histogram of P(r|s) vs. response r, 0 to 20 spikes]

response: one neuron, spike count, 300 ms bins.

decent histogram: ~20 responses, ~200 trials/stimulus.

Page 7:

[figure: 2-D histogram of P(r1, r2|s) vs. r1 and r2]

200 trials is just not enough.

response: two neurons, spike count, 300 ms bins.

decent histogram: ~20^2 = 400 responses, ~4,000 trials/stimulus.

Page 8:

more realistic case ...

P(r1, r2, ..., r10|s) = ???

10-D histogram, ~10^13 responses.

time to collect: ~20,000,000 years/stimulus.
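The arithmetic behind these estimates fits in a few lines. A back-of-envelope sketch of my own (not from the talk; `samples_per_bin` and `seconds_per_trial` are assumed values):

```python
# Data requirements for a raw response histogram.
# Assumptions (mine, not the slides'): ~20 distinguishable spike counts per
# neuron in a 300 ms bin, ~10 samples per histogram bin, ~1 s per trial;
# the slide's 20,000,000 years corresponds to a few seconds per trial.

def histogram_cost(n_neurons, counts_per_neuron=20, samples_per_bin=10,
                   seconds_per_trial=1.0):
    """Return (response_bins, trials_per_stimulus, years_per_stimulus)."""
    bins = counts_per_neuron ** n_neurons
    trials = bins * samples_per_bin
    years = trials * seconds_per_trial / (3600 * 24 * 365)
    return bins, trials, years

print(histogram_cost(1))   # ~20 bins, ~200 trials
print(histogram_cost(2))   # ~400 bins, ~4,000 trials
print(histogram_cost(10))  # ~10^13 bins, millions of years per stimulus
```

The exponential blow-up in `counts_per_neuron ** n_neurons` is the whole problem: each added neuron multiplies the number of histogram bins by ~20.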

Page 9:

Clearly, an approximate approach is needed.

There are several possibilities:

1. Assume independence: p(r|s) = p(r1|s) p(r2|s) ...

2. Parametric models.

2a. Point process models.
2b. Gaussian approximation (for rates).
2c. Maximum entropy models.


Page 12:

Questions:

1. Are maximum entropy models useful for neural data? (I'm not sure.)

2. How do we assess model quality? (Not the way you might think.)

3. How tractable are these models? (Not very.)

Page 13:

The idea behind maximum entropy models:

1. Measure, from data, some aspect of a probability distribution.

2. Find the maximum entropy distribution consistent with that measurement.

Page 14:

An example (in 1-D, with no dependence on s).

1. estimate the mean response from data,

r̄ = (1/K) ∑_{k=1}^{K} r^(k)

Page 15:

An example (in 1-D, with no dependence on s).

2. find the maximum entropy distribution consistent with this,

Page 16:

An example (in 1-D, with no dependence on s).

2. find the maximum entropy distribution consistent with this,

∂/∂p(r) [ -∑r' p(r') log p(r') - λ0 ∑r' p(r') - λ ∑r' r' p(r') ] = 0

entropy    normalization (1)    mean (r̄)

Page 17:

An example (in 1-D, with no dependence on s).

2. find the maximum entropy distribution consistent with this,

∂/∂p(r) [ -∑r' p(r') log p(r') - λ0 ∑r' p(r') - λ ∑r' r' p(r') ] = 0

-1 - log p(r) - λ0 - λr

Page 18:

An example (in 1-D, with no dependence on s).

2. find the maximum entropy distribution consistent with this,

-1 - log p(r) - λ0 - λr = 0

=> p(r) = exp[-1 - λ0 - λr]

Page 19:

To find λ0 and λ,

p(r) = exp(-λr) / Z

∑r p(r) = 1  =>  Z = ∑r exp(-λr)

r̄ = ∑r r exp(-λr) / Z  determines λ
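Numerically, λ reduces to a one-dimensional root find, since the model mean is monotonic in λ. A minimal sketch of my own (the grid of spike counts and the bisection bracket are assumptions):

```python
# Fit p(r) = exp(-lam*r)/Z on r in {0,...,20} so its mean matches a
# measured mean r_bar; the model mean is monotonically decreasing in lam,
# so simple bisection works.
import math

R = range(21)  # assumed set of possible spike counts

def model_mean(lam):
    m = max(-lam * r for r in R)                 # subtract max for stability
    w = [math.exp(-lam * r - m) for r in R]
    return sum(r * wi for r, wi in zip(R, w)) / sum(w)

def fit_lambda(r_bar, lo=-50.0, hi=50.0, tol=1e-12):
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model_mean(mid) > r_bar:              # mean too big -> larger lam
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam = fit_lambda(3.0)
print(lam, model_mean(lam))  # fitted model mean matches the measured mean
```

λ0 never needs to be solved for separately: it is absorbed into the normalization Z.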

Page 20:

p(r) = exp(-λr - λ1 r²) / Z

Page 21:

p(r1, r2) = exp(-λ1 r1 - λ2 r2 - λ11 r1² - λ12 r1 r2 - λ22 r2²) / Z
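For two neurons with small response ranges, these λ's can be fit directly by moment matching. A sketch of my own, with made-up target moments computed from a toy full-support distribution:

```python
# Fit the 2-D pairwise maxent model p(r1,r2) ~ exp(-lam . f(r)) on
# r1, r2 in {0,1,2}, where f = (r1, r2, r1^2, r1*r2, r2^2), by gradient
# descent on the convex objective log Z + lam . target.
import math
from itertools import product

grid = list(product(range(3), repeat=2))

def feats(r):
    r1, r2 = r
    return (r1, r2, r1 * r1, r1 * r2, r2 * r2)

# toy "data" distribution (full support, so the fit is well posed)
weight = {r: 1.0 for r in grid}
weight[(1, 1)] += 3.0
weight[(2, 2)] += 1.0
W = sum(weight.values())
target = [sum(w * feats(r)[j] for r, w in weight.items()) / W
          for j in range(5)]

lam = [0.0] * 5
for _ in range(20000):
    w = [math.exp(-sum(l * f for l, f in zip(lam, feats(r)))) for r in grid]
    Z = sum(w)
    moms = [sum(wi * feats(r)[j] for wi, r in zip(w, grid)) / Z
            for j in range(5)]
    # gradient step: move lam until model moments equal target moments
    lam = [l + 0.1 * (m - t) for l, m, t in zip(lam, moms, target)]

print([round(m, 4) for m in moms])  # converges to the target moments
```

This brute-force sum over the response grid is exactly what stops scaling: for N binary neurons the grid has 2^N points, which is the tractability problem the rest of the talk circles back to.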

Page 22:

An aside:

Maximum entropy => Maximum likelihood

just another parametric model; it's just one that lives in the exponential family.

Page 23:

Assessing goodness of fit: KL distance.

D(p(r)||p(r|λ)) = ∑r p(r) log [p(r)/p(r|λ)]

= ∑r p(r) log p(r) - ∑r p(r) log p(r|λ)

Page 24:

Assessing goodness of fit: KL distance.

D(p(r)||p(r|λ)) = ∑r p(r) log [p(r)/p(r|λ)]

= ∑r p(r) log p(r) - ∑r p(r) log p(r|λ)

= -H(r) - ∑r p(r) log p(r|λ)

entropy

Page 25:

Our original example:

p(r|λ) = exp(-λr) / Z

compute  - ∑r p(r) log p(r|λ)

Page 26:

Our original example:

p(r|λ) = exp(-λr) / Z

- ∑r p(r) log p(r|λ) = λ ∑r p(r) r + log Z

= λ ∑r p(r|λ) r + log Z    (the model matches the data mean)

= - ∑r p(r|λ) log p(r|λ)

= H(r|λ)
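This identity is easy to check numerically. A sketch with a made-up "true" distribution (all numbers mine):

```python
# When the maxent model matches the data mean, the cross-entropy
# -sum_r p(r) log p(r|lam) equals the model entropy H(r|lam).
import math

R = range(5)
p_true = [0.1, 0.3, 0.3, 0.2, 0.1]   # arbitrary "true" distribution
r_bar = sum(r * p for r, p in zip(R, p_true))

def model(lam):
    w = [math.exp(-lam * r) for r in R]
    Z = sum(w)
    return [wi / Z for wi in w]

lo, hi = -50.0, 50.0                  # bisect: model mean decreasing in lam
while hi - lo > 1e-13:
    mid = 0.5 * (lo + hi)
    m = sum(r * qi for r, qi in zip(R, model(mid)))
    lo, hi = (mid, hi) if m > r_bar else (lo, mid)
q = model(0.5 * (lo + hi))

cross_entropy = -sum(p * math.log(qi) for p, qi in zip(p_true, q))
model_entropy = -sum(qi * math.log(qi) for qi in q)
print(cross_entropy, model_entropy)   # equal, as the algebra says
```

The point of the identity: the data only enter through the constrained moment, so once the moments match, p(r) and p(r|λ) are interchangeable inside the cross-entropy.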


Page 28:

Assessing goodness of fit: KL distance.

D(p(r)||p(r|λ)) = ∑r p(r) log [p(r)/p(r|λ)]

= ∑r p(r) log p(r) - ∑r p(r) log p(r|λ)

= -H(r) + H(r|λ)

entropy

entropy under the model

Page 29:

Assessing goodness of fit: KL distance.

D(p(r)||p(r|λ)) = model entropy - true entropy

Page 30:

We have a problem:

D(p(r)||p(r|λ)) = model entropy - true entropy

Although we might be able to compute the model entropy, there's no way in hell we can compute the true entropy.

Page 31:

"Solution": an exactly solvable model.

In the large N limit (N = number of neurons), we can compute the true entropy and the model entropy.

Page 32:

N neurons

[figure: spike raster, neuron vs. time]

Page 33:

N binary neurons

[figure: spike raster, neuron vs. time, binarized in each bin]

0110010111
1111011110
0001100100

Page 34:

ri = response of neuron i ∈ {-1, 1}

-1: 0 spikes in a bin
+1: 1 or more spikes in a bin

Page 35:

ri = response of neuron i = {-1, 1}

0110010111

r = (-1, 1, 1, -1, -1, 1, -1, 1, 1, 1)

Page 36:

Exactly solvable model (suppressing dependence on stimulus)

p(r) = ∑θ p(θ) ∏i p(ri|θ)

Page 37:

Exactly solvable model (suppressing dependence on stimulus)

p(r,θ) = p(θ) ∏i p(ri|θ)

Page 38:

Computing the true entropy, H(r)

H(r) – H(r|θ) = I(r;θ)

mutual information

Page 39:

Computing the true entropy, H(r)

H(r) – H(r|θ) = I(r;θ) ≤ H(θ)

H(r) = H(r|θ) + I(r;θ) ≤ H(r|θ) + H(θ)

H(r) = H(r|θ) + I(r;θ) ≥ H(r|θ)

(I(r;θ): mutual information; H(θ): entropy of p(θ))

Page 40:

Computing the true entropy, H(r)

H(r) – H(r|θ) = I(r;θ) ≤ H(θ)

H(r) = H(r|θ) + I(r;θ) ≤ H(r|θ) + H(θ)

H(r) = H(r|θ) + I(r;θ) ≥ H(r|θ)

H(r|θ) ≤ H(r) ≤ H(r|θ) + H(θ)
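For small N the sandwich bound can be verified by brute force. A sketch with toy numbers of my own:

```python
# Check H(r|theta) <= H(r) <= H(r|theta) + H(theta) for a small mixture
# p(r) = sum_theta p(theta) prod_i p(r_i|theta), r_i in {-1, +1}.
import math
from itertools import product

p_theta = [0.7, 0.3]        # p(theta), two states (made-up values)
q = [0.2, 0.8]              # P(r_i = +1 | theta)
N = 6

def H(ps):
    return -sum(p * math.log(p) for p in ps if p > 0)

def p_r_th(r, th):
    return math.prod(q[th] if ri == 1 else 1 - q[th] for ri in r)

states = list(product([-1, 1], repeat=N))
p_r = [sum(pt * p_r_th(r, th) for th, pt in enumerate(p_theta))
       for r in states]

H_r = H(p_r)
H_cond = sum(pt * H([p_r_th(r, th) for r in states])
             for th, pt in enumerate(p_theta))
print(H_cond, H_r, H_cond + H(p_theta))  # lower bound, H(r), upper bound
```

The upper and lower bounds differ by at most H(θ), which stays order(1) as N grows; that is the key to the whole construction.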

Page 41:

Why is this useful?

H(r|θ) = ∑θ p(θ) H(∏i p(ri|θ))

= ∑θ p(θ) ∑i H1(ri|θ)

H1(r|θ) = - ∑r p(r|θ) log p(r|θ)

Page 42:

Why is this useful?

H(r|θ) = ∑θ p(θ) H(∏i p(ri|θ))

= ∑θ p(θ) ∑i H1(ri|θ)

= N ∑θ p(θ) H1(r|θ)

H1(r|θ) = - ∑r p(r|θ) log p(r|θ)

Page 43:

Why is this useful?

H(r|θ) = ∑θ p(θ) H(∏i p(ri|θ))

= ∑θ p(θ) ∑i H1(ri|θ)

= N ∑θ p(θ) H1(r|θ)

H1(r|θ) = - ∑r p(r|θ) log p(r|θ)

only two terms in this sum!!!
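Because each term is a two-point binary entropy, H(r|θ) costs O(N) to evaluate, with no sum over 2^N states. A sketch (toy parameter values mine):

```python
# H(r|theta) = N * sum_theta p(theta) * H1(theta) for homogeneous binary
# neurons -- any N is cheap.
import math

def H1(p_plus):
    """Binary entropy: only two terms in the sum (r = -1 or +1)."""
    return -(p_plus * math.log(p_plus)
             + (1 - p_plus) * math.log(1 - p_plus))

p_theta = [0.7, 0.3]   # made-up p(theta)
q = [0.2, 0.8]         # P(r_i = +1 | theta)
N = 1000               # far beyond any brute-force 2^N sum

H_cond = N * sum(pt * H1(qi) for pt, qi in zip(p_theta, q))
print(H_cond)          # order(N), computed in order(1) per theta
```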

Page 44:

H(r|θ) ≤ H(r) ≤ H(r|θ) + H(θ)

H(r|θ): order(N) and easy to compute; H(θ): order(1)

In the large N limit,

H(r) ≈ H(r|θ)

Page 45:

Two maximum entropy models:

1. p1(r|h'), which captures first moments:

∑r ri p(r) = ∑r ri p1(r|h')

2. p2(r|h, J), which captures first and second moments:

∑r ri p(r) = ∑r ri p2(r|h, J)

∑r ri rj p(r) = ∑r ri rj p2(r|h, J)

Page 46:

p1(r|h') = exp(h' ∑i ri) / Z

p2(r|h, J) = exp(h ∑i ri + (J/2N) ∑ij ri rj) / Z

p(r) = ∑θ p(θ) ∏i p(ri|θ)

Page 47:

p1(r|h') = exp(h' ∑i ri) / Z  =>  all neurons have the same mean,

⟨ri⟩ ≡ ρ, independent of i.

ρ completely specifies p1(r|h').

Page 48:

p(r) = ∑θ p(θ) ∏i p(ri|θ)  =>  conditioned on θ, all neurons have the same mean,

⟨ri⟩(θ) ≡ ∑ ri p(ri|θ) ≡ ρ(θ).

ρ(θ) and p(θ) completely specify p(r).
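Exchangeability makes this concrete: p(r) depends on r only through the number of +1 responses, whose distribution is a mixture of binomials. A sketch with toy values of ρ(θ) and p(θ) (mine):

```python
# p(spike count k) = sum_theta p(theta) * Binom(k; N, q(theta)), where
# q(theta) = (1 + rho(theta)) / 2 is the per-neuron probability of +1.
import math

p_theta = [0.7, 0.3]    # made-up p(theta)
q = [0.2, 0.8]          # q(theta) = (1 + rho(theta)) / 2
N = 10

def p_count(k):
    c = math.comb(N, k)
    return sum(pt * c * qi ** k * (1 - qi) ** (N - k)
               for pt, qi in zip(p_theta, q))

dist = [p_count(k) for k in range(N + 1)]
print([round(p, 4) for p in dist])  # bimodal: one bump per theta value
```

So the full 2^N-dimensional distribution really is pinned down by ρ(θ) and p(θ) alone.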

Page 49:

[figure: three sketches of distributions over the mean response, on (-1, 1):
p1(r|h'): a single peak at ρ;
p(r): a peak at each ρ(θj), with weight p(θj);
p2(r|h, J): two peaks, at -ρ2 and ρ2, with weights p2- and p2+]

Page 50:

Conclusion #1:

The "pairwise" maximum entropy distribution (p2) does not do a very good job matching the true distribution.

Whether or not this is true for more complex distributions is not known.

Page 51:

A simple case: p(θ) consists of two terms

Page 52:

[figure: the same three sketches, for a p(θ) with two terms:
p1(r|h'): a single peak at ρ;
p(r): peaks at ρ(θ1) and ρ(θ2), with weights p(θ1) and p(θ2);
p2(r|h, J): two peaks, at -ρ2 and ρ2, with weights p2- and p2+]

Page 53:

[same figure as Page 52]

Three parameters: ρ, p(θ2), ρ2

Page 54:

In terms of more intuitive parameters:

ρ = (+1)×ντ + (-1)×(1 - ντ) = 2ντ – 1    (ν: firing rate; τ: bin size)

ρ2² – ρ² ≡ δ² = ⟨ri rj⟩ – ⟨ri⟩⟨rj⟩, i ≠ j

Page 55:

[figure: spike raster, neuron vs. time, with the bin size marked]

Page 56:

In terms of more intuitive parameters:

ρ = (+1)×ντ + (-1)×(1 - ντ) = 2ντ – 1    (ν: firing rate; τ: bin size)

ρ2² – ρ² ≡ δ² = ⟨ri rj⟩ – ⟨ri⟩⟨rj⟩, i ≠ j

Model parameters: ντ, δ, p(θ2).
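The reparameterization is one line of arithmetic. A sketch using the slide's 25 Hz / 20 ms values, treating ντ as the spike probability as the slide does (for a Poisson neuron the exact value would be 1 - exp(-ντ)):

```python
# rho = (+1)*nu*tau + (-1)*(1 - nu*tau) = 2*nu*tau - 1
nu, tau = 25.0, 0.020            # firing rate (Hz), bin size (s)
p_spike = nu * tau               # slide's P(>=1 spike in a bin)
rho = 2 * p_spike - 1
print(round(p_spike, 3), round(rho, 3))   # 0.5 0.0

nu = 2.0                         # the low-rate case from the later slides
print(round(2 * nu * tau - 1, 3))         # -0.92
```

At 25 Hz and 20 ms bins the neurons sit exactly at ρ = 0 (spike and no-spike equally likely); at 2 Hz they are strongly biased toward no spike.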

Page 57:

Goodness of fit:

D(p(r)||p1(r|h')) = H1 – H

D(p(r)||p2(r|h, J)) = H2 – H

The picture: [number line: 0 ... H ≤ H2 ≤ H1]

Page 58:

What's a good cost function?

For the independent distribution, p1(r|h'): H1 – H

[number line: 0 ... H ≤ H2 ≤ H1, with the gap from H to H1 marked]

Page 59:

What's a good cost function?

For the independent distribution, p1(r|h'): H1 – H

For the pairwise distribution, p2(r|h, J): H2 – H, or the ratio

(H1 – H2) / (H1 – H)

[number line: 0 ... H ≤ H2 ≤ H1, with the gaps H to H1 and H to H2 marked]

Page 60:

[plot: entropy relative to H1 vs. p(θ2), 0.0 to 1.0; curves for H and H2; τ = 20 ms, ν = 25 Hz, δ = 0.1]

Page 61:

[plot: same axes, zoomed to 0.990-1.000; τ = 20 ms, ν = 25 Hz, δ = 0.1]

H2 ≈ H; both are ≈ H1

(H1 – H2) / (H1 – H) ≈ 1

Page 62:

[plot: entropy relative to H1 vs. p(θ2); τ = 20 ms, ν = 2 Hz, δ = 0.05]

(H1 – H2) / (H1 – H) ≈ 0.2

Page 63:

[plot: entropy relative to H1 vs. p(θ2); curves for H, H2, H1; τ = 20 ms, ν = 2 Hz, δ = 0.1]

Page 64:

[plot: entropy relative to H1 vs. p(θ2); curves for H, H2, H1; τ = 20 ms, ν = 2 Hz, δ = 0.2]

Page 65:

A better approach to determining goodness of fit

What we're computing is

p(r|s,λm),    model m

Page 66:

A better approach to determining goodness of fit

What we’re computing is

p(r|s,λm)

=> p(s|r,λm) = p(r|s,λm) p(s) / normalization

Page 67:

A better approach to determining goodness of fit

What we should be comparing are posteriors:

p(s|r,λ1) and p(s|r,λ2)

It doesn't make sense to spend a huge amount of time and effort finding the ultimate model for p(r|s,λ) if that's not going to improve p(s|r,λ).
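A toy version of this comparison (all numbers mine, for illustration): decode two stimuli from the same response, once under a true mixture model and once under an independent model matched to its first moments.

```python
# Posteriors p(s|r) under two encoding models of the same data: the true
# mixture model and the moment-matched independent model can disagree.
import math

p_s = [0.5, 0.5]                     # prior over two stimuli
p_theta = [0.5, 0.5]                 # mixture weights, as in the model above
q = {0: [0.2, 0.8], 1: [0.6, 0.6]}  # P(r_i=+1 | s, theta), made up

def lik_true(r, s):
    # true mixture: p(r|s) = sum_theta p(theta) prod_i p(r_i | s, theta)
    return sum(pt * math.prod(qi if ri == 1 else 1 - qi for ri in r)
               for pt, qi in zip(p_theta, q[s]))

def lik_ind(r, s):
    # independent model matched to the true model's first moments
    qbar = sum(pt * qi for pt, qi in zip(p_theta, q[s]))
    return math.prod(qbar if ri == 1 else 1 - qbar for ri in r)

def posterior(lik, r):
    joint = [lik(r, s) * ps for s, ps in enumerate(p_s)]
    Z = sum(joint)
    return [j / Z for j in joint]

r = (1, 1, 1, 1, 1)                  # all five neurons fire
print(posterior(lik_true, r))        # favors s = 0
print(posterior(lik_ind, r))         # favors s = 1: the models disagree
```

On this response the two models put the posterior mass on different stimuli, which is exactly the failure mode that entropy-based comparisons of p(r|s,λ) alone can miss.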

Page 68:

Conclusions

1. Maximum entropy => Maximum likelihood.

2. For at least one maxent model, pairwise correlations don't match the true distribution very well.

3. One needs to be very careful about assessing models: compare posteriors!!!

4. For binary neurons, the pairwise maxent model is intractable.

5. Wherever possible, use point process models. After all, spike timing is irrelevant in the brain.