Symmetry Breaking Bifurcations of the Information Distortion Dissertation Defense April 8, 2003 Albert E. Parker III Complex Biological Systems Department of Mathematical Sciences Center for Computational Biology Montana State University


Symmetry Breaking Bifurcations of the Information Distortion

Dissertation Defense, April 8, 2003

Albert E. Parker III

Complex Biological Systems Department of Mathematical Sciences

Center for Computational Biology

Montana State University

Goal: Solve the Information Distortion Problem

The goal of my thesis is to solve the Information Distortion problem, an optimization problem of the form

$\max_{q \in \Delta} G(q)$ constrained by $D(q) \le D_0$

where

• $\Delta$ is a subset of $\mathbb{R}^n$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.

Problems of this form arise in the study of clustering problems or optimal source coding systems.

Goal: Another Formulation

Using the method of Lagrange multipliers, the goal of finding solutions of the optimization problem can be rephrased as finding stationary points of the problem

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

where $\beta \in [0,\infty)$ and

• $\Delta$ is a subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.

How: Determine the Bifurcation Structure

We have described the bifurcation structure of stationary points to any problem of the form

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

where $\beta \in [0,\infty)$ and

• $\Delta$ is a linear subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.

Thesis Topics

• The Data Clustering Problem
• The Neural Coding Problem
• Information Theory / Probability Theory
• Optimization Theory
• Dynamical Systems
• Bifurcation Theory with Symmetries
• Group Theory
• Continuation Techniques

Outline of this talk

• The Data Clustering Problem
• A Class of Optimization Problems
• Bifurcation with Symmetries
• Numerical Results

The Data Clustering Problem

• Data Classification: identifying all of the books printed in 2002 which address the martial art Kempo

• Data Compression: converting a bitmap file to a jpeg file

[Diagram: a clustering $q(Y_N|Y)$ maps Y (K objects $\{y_i\}$) to $Y_N$ (N objects $\{y_{N_i}\}$).]

A Symmetry: invariance to relabelling of the clusters of $Y_N$

[Diagram: the same clustering $q(Y_N|Y)$ shown twice, with the labels "class 1" and "class 2" interchanged.]

Requirements of a Clustering Method

• The original data is represented reasonably well by the clusters

– Choosing a cost function, D(Y,YN) , called a distortion function, rigorously defines what we mean by the “data is represented reasonably well”.

• Fast implementation

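Examples: optimizing at a distortion level $D(Y,Y_N) \le D_0$

• Deterministic Annealing (Rose 1998): a fast clustering algorithm

$\max_{q \in \Delta} H(Y_N|Y)$ constrained by $D(Y,Y_N) \le D_0$

• Rate Distortion Theory (Shannon ~1950): minimum informative compression

$\min_{q \in \Delta} I(Y;Y_N)$ constrained by $D(Y,Y_N) \le D_0$

where $\Delta := \{\, q(Y_N|Y) \;:\; \sum_{y_N \in Y_N} q(y_N|y) = 1 \text{ and } q(y_N|y) \ge 0 \;\; \forall y \in Y \,\} \subset \mathbb{R}^{NK}$.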

Inputs and Outputs and Clustered Outputs

• The Information Distortion method clusters the outputs Y into clusters $Y_N$ so that the information that one can learn about X by observing $Y_N$, $I(X;Y_N)$, is as close as possible to the mutual information $I(X;Y)$.
• The corresponding information distortion function is

$D_I(Y;Y_N) = I(X;Y) - I(X;Y_N)$
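The distortion function above can be evaluated directly from a joint distribution and a clustering. The following is a minimal sketch, not the author's code: the function names are hypothetical, `p_xy` is assumed to be an L × K array holding p(X,Y), and `q` is assumed to be an N × K array with `q[nu, k]` = q(y_N = nu | y_k). Information is measured in bits here, which is also an assumption.

```python
import numpy as np

def mutual_information(p_joint):
    """Mutual information (in bits) of a joint distribution given as a 2-D array."""
    p_joint = np.asarray(p_joint, dtype=float)
    p_a = p_joint.sum(axis=1, keepdims=True)   # marginal of the row variable
    p_b = p_joint.sum(axis=0, keepdims=True)   # marginal of the column variable
    mask = p_joint > 0                         # 0 * log(0) is treated as 0
    ratio = p_joint[mask] / (p_a @ p_b)[mask]
    return float(np.sum(p_joint[mask] * np.log2(ratio)))

def information_distortion(p_xy, q):
    """D_I = I(X;Y) - I(X;Y_N), with p(x, y_N) = sum_y p(x, y) q(y_N | y)."""
    p_xyn = np.asarray(p_xy, dtype=float) @ np.asarray(q, dtype=float).T
    return mutual_information(p_xy) - mutual_information(p_xyn)
```

Two sanity checks follow from the definition: a clustering that keeps every output in its own class loses no information (D_I = 0), and the uniform clustering q ≡ 1/N loses all of it (D_I = I(X;Y)).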

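[Diagram: inputs X (L objects $\{x_i\}$) and outputs Y (K objects $\{y_i\}$) are related by $p(X,Y)$; the clustering $q(Y_N|Y)$ maps the outputs Y to clusters $Y_N$ (N objects $\{y_{N_i}\}$).]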

Two optimization problems which use the information distortion function

• Information Distortion Method (Dimitrov and Miller 2001)

$\max_{q} H(Y_N|Y)$ constrained by $D_I(Y,Y_N) \le D_0$

$\max_{q} H(Y_N|Y) + \beta I(X;Y_N)$

• Information Bottleneck Method (Tishby, Pereira, Bialek 1999)

$\min_{q} I(Y;Y_N)$ constrained by $D_I(Y,Y_N) \le D_0$

$\max_{q} -I(Y;Y_N) + \beta I(X;Y_N)$

An annealing algorithm to solve

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

Let $q_0$ be the maximizer of $\max_q G(q)$, and let $\beta_0 = 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.

1. Perform $\beta$-step: let $\beta_{k+1} = \beta_k + d_k$ where $d_k > 0$.

2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + \eta$ for some small perturbation $\eta$.

3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
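The three annealing steps can be sketched for the Information Distortion cost H(Y_N|Y) + β I(X;Y_N). This is an illustration under stated assumptions, not the implementation used in the thesis: the slide does not name the inner optimizer, so a fixed-point iteration on the stationarity condition, q(ν|y) ∝ exp(β Σ_x p(x|y) ln p(x|ν)), is swapped in for step 3. Such an iteration finds stationary points and is not guaranteed to return a maximizer. All function names, the step size, the noise level, and the `EPS` guard are hypothetical choices; natural logarithms are used.

```python
import numpy as np

EPS = 1e-12  # assumed guard against log(0) and division by zero

def mutual_information_nats(p_joint):
    """Mutual information (in nats) of a joint distribution given as a 2-D array."""
    p_a = p_joint.sum(axis=1, keepdims=True)
    p_b = p_joint.sum(axis=0, keepdims=True)
    mask = p_joint > 0
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / (p_a @ p_b)[mask])))

def solve_at_beta(p_xy, q_init, beta, n_iter=200):
    """Assumed inner optimizer: iterate q(nu|y) ~ exp(beta * sum_x p(x|y) ln p(x|nu))."""
    p_y = p_xy.sum(axis=0)                     # shape (K,)
    p_x_given_y = p_xy / p_y                   # shape (L, K), column k is p(x | y_k)
    q = q_init.copy()                          # shape (N, K)
    for _ in range(n_iter):
        p_xnu = p_xy @ q.T                     # p(x, nu), shape (L, N)
        p_nu = p_xnu.sum(axis=0)               # p(nu), shape (N,)
        log_p_x_given_nu = np.log(p_xnu / (p_nu + EPS) + EPS)
        expo = beta * (log_p_x_given_nu.T @ p_x_given_y)   # shape (N, K)
        expo -= expo.max(axis=0, keepdims=True)            # stabilize the softmax
        q = np.exp(expo)
        q /= q.sum(axis=0, keepdims=True)      # enforce sum_nu q(nu|y) = 1
    return q

def anneal(p_xy, n_classes, beta_max=3.0, d_beta=0.25, noise=1e-3, seed=0):
    """Annealing loop: beta-step, perturbed initial guess, then optimization."""
    rng = np.random.default_rng(seed)
    n_outputs = p_xy.shape[1]
    q = np.full((n_classes, n_outputs), 1.0 / n_classes)  # q_0 = 1/N maximizes H(Y_N|Y)
    beta = 0.0
    path = [(beta, q.copy())]
    while beta < beta_max:
        beta = min(beta + d_beta, beta_max)                # step 1
        guess = q + noise * rng.random(q.shape)            # step 2: q_k + eta
        guess /= guess.sum(axis=0, keepdims=True)
        q = solve_at_beta(p_xy, guess, beta)               # step 3
        path.append((beta, q.copy()))
    return path
```

For example, `anneal(p_xy, 2)` on a toy joint distribution with two well-separated blocks returns a list of `(beta, q)` pairs. Without the perturbation in step 2 the iteration would stay at q ≡ 1/N for every β, because the uniform clustering remains a stationary point.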

Application of the annealing method to the Information Distortion problem $\max_q (H(Y_N|Y) + \beta I(X;Y_N))$ when $p(X,Y)$ is defined by four Gaussian blobs

[Figure: $p(X,Y)$ for inputs X (52 objects) and outputs Y (52 objects); the clustering $q(Y_N|Y)$ maps the 52 objects of Y to N objects of $Y_N$; plot of $I(X;Y_N) = D(q(Y_N|Y))$.]

Observed Bifurcations for the Four Blob problem:

We just saw the optimal clusterings $q^*$ at some $\beta^* = \beta_{\max}$. What do the clusterings look like for $\beta < \beta_{\max}$?

[Figure: bifurcations of $q^*(\beta)$. Left: Observed Bifurcations for the 4 Blob Problem. Right: Conceptual Bifurcation Structure, $q^*$ versus $\beta$, starting from $q_{1/N}$.]

Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?

What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?

How many bifurcating solutions are there?

What do the bifurcating branches look like? Are they subcritical or supercritical?

What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem?

Are there bifurcations after all of the classes have resolved?

[Figures: Conceptual Bifurcation Structure ($q^*$ versus $\beta$); Observed Bifurcations for the 4 Blob Problem.]

Bifurcations with symmetry

• To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$.

• The "obvious" symmetry is that $G(q) + \beta D(q)$ is invariant to relabelling of the N classes of $Y_N$.

• The symmetry group of all permutations on N symbols is $S_N$.
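The relabelling invariance can be checked numerically for the Information Distortion cost. The sketch below is illustrative only: `cost_F` is a hypothetical name, `q` is assumed to be an N × K array whose row ν holds q(y_N = ν | ·), and natural logarithms are used. Relabelling the classes corresponds to permuting the rows of `q`.

```python
import numpy as np

def cost_F(p_xy, q, beta):
    """F(q, beta) = H(Y_N|Y) + beta * I(X;Y_N), in nats, for q[nu, k] = q(nu | y_k)."""
    p_y = p_xy.sum(axis=0)                         # p(y), shape (K,)
    safe_q = np.where(q > 0, q, 1.0)               # so that 0 * log(0) contributes 0
    entropy = -np.sum(p_y * q * np.log(safe_q))    # H(Y_N | Y)
    p_xnu = p_xy @ q.T                             # p(x, nu), shape (L, N)
    indep = p_xnu.sum(axis=1, keepdims=True) @ p_xnu.sum(axis=0, keepdims=True)
    mask = p_xnu > 0
    info = np.sum(p_xnu[mask] * np.log(p_xnu[mask] / indep[mask]))  # I(X; Y_N)
    return float(entropy + beta * info)
```

Evaluating `cost_F(p_xy, q, beta)` and `cost_F(p_xy, q[[2, 1, 0, 3]], beta)` for N = 4 (labels 1 and 3 switched) should give the same value up to rounding, since both terms are sums over the classes.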

[Figure: an example relabelling that switches labels 1 and 3.]

Symmetry Breaking Bifurcations

[Figure, built up over several slides: branches of $q^*$ versus $\beta$ for N = 4. The initial solution $q_{1/N}$ is fixed by $S_N = S_4$; on the next branch $q^*$ is fixed by $S_{N-1} = S_3$; on the following branch $q^*$ is fixed by $S_{N-2} = S_2$. On a further branch, $q^*$ is fixed by $\langle (N\text{-cycle})^p \rangle = \langle (1324)^2 \rangle = \langle (12)(34) \rangle$.]

Existence Theorems for Bifurcating Branches

Given a bifurcation at a point fixed by $S_N$:

• Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1): there are N bifurcating branches, each of which has symmetry $S_{N-1}$.

• The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6): there are bifurcating branches which have symmetry $\langle (N\text{-cycle})^p \rangle$ for every prime $p \mid N$, $p < N$.

Given a bifurcation at a point fixed by $S_{N-1}$:

• Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1): gives N-1 bifurcating branches which have symmetry $S_{N-2}$.

• The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6): gives bifurcating branches which have symmetry $\langle (M\text{-cycle})^p \rangle$, with M = N-1, for every prime $p \mid N-1$, $p < N-1$.

When N = 4, N-1 = 3, there are no bifurcating branches given by the SW Theorem.
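The arithmetic condition in the Smoller-Wasserman statement (primes p that divide M with p < M) is easy to enumerate. The helper below is a small illustrative sketch; its name is hypothetical and it only lists the admissible primes, it does not construct the branches.

```python
def sw_primes(m):
    """Primes p with p | m and p < m, as required in the Smoller-Wasserman statement."""
    def is_prime(p):
        return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))
    return [p for p in range(2, m) if m % p == 0 and is_prime(p)]
```

For M = 4 the only admissible prime is 2, which matches the symmetry ⟨(4-cycle)²⟩; for M = 3 the list is empty, which is why no additional branches appear at the second bifurcation when N = 4.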

Bifurcation Structure corresponds with Group Structure

[Figure: a partial subgroup lattice for $S_4$ and the corresponding bifurcating directions given by the Equivariant Branching Lemma. The lattice shows $S_4$ at the top, four copies of $S_3$, twelve copies of $S_2$, and the trivial group 1 at the bottom. Each $S_3$ node carries a direction whose class blocks are an arrangement of $3v, -v, -v, -v$ followed by a final 0 block; each $S_2$ node carries a direction whose class blocks are an arrangement of $2v, -v, -v, 0$ followed by a final 0 block.]

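[Figure: a partial subgroup lattice for $S_4$ and the corresponding bifurcating directions given by the Smoller-Wasserman Theorem. The lattice shows $S_4$, $A_4$, the subgroups $\langle (12)(34) \rangle$, $\langle (13)(24) \rangle$, $\langle (14)(23) \rangle$, and $\langle (1324) \rangle$, with direction vectors built from blocks $\pm v$. Annotations: $\mathrm{Fix}(A_4) = \{0\}$ and $\mathrm{Fix}(\langle (1234) \rangle) = \{0\}$.]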

[Figure sequence: the Conceptual Bifurcation Structure ($q^*$ versus $\beta$) shown beside the Group Structure lattice of $S_4$.]

The Equivariant Branching Lemma shows that the bifurcation structure from $S_M$ to $S_{M-1}$ is … [lattice: $S_4$; four copies of $S_3$; twelve copies of $S_2$; 1]

The Smoller-Wasserman Theorem shows additional structure … 3 branches from the $S_4$ to $S_3$ bifurcation only. [lattice: $S_4$; $A_4$; $\langle (12)(34) \rangle$, $\langle (13)(24) \rangle$, $\langle (14)(23) \rangle$; $\langle (1324) \rangle$]

If we stay on a branch which is fixed by $S_M$, how many bifurcations are there?

Theorem: There are exactly K/N bifurcations on the branch $(q_{1/N}, \beta)$ for the Information Distortion problem.

There are 13 bifurcations on the first branch.

Bifurcation theory in the presence of symmetries enables us to answer the questions previously posed …


Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations? There are N-1 symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ for $M \le N$.

What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?

How many bifurcating solutions are there? There are at least N from the first bifurcation, at least N-1 from the next one, etc.

What do the bifurcating branches look like? They are subcritical or supercritical depending on the sign of the bifurcation discriminator $\zeta(q^*, \beta^*, u_k)$.

What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem? No.

Are there bifurcations after all of the classes have resolved? In general, no.

[Figures: Conceptual Bifurcation Structure ($q^*$ versus $\beta$); Observed Bifurcations for the 4 Blob Problem.]

We can explain the bifurcation structure of all problems of the form

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

where $\beta \in [0,\infty)$ and

• $\Delta$ is a subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D are invariant to relabelling of the classes of $Y_N$.
• The blocks of the Hessian $\Delta_q (G + \beta D)$ at bifurcation satisfy a set of generic conditions.

This class of problems includes the Information Distortion problem.

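[Diagram: singularity structure. Legend: $\Delta_{q,\lambda}\mathcal{L}$ is the constrained Hessian; $\Delta_q F$ is the unconstrained Hessian. Starting from "$\Delta_{q,\lambda}\mathcal{L}$ is singular", the diagram branches on whether $\Delta_q F$ is singular or nonsingular, on two cases for M relative to 1, and on whether $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular or nonsingular. The outcomes shown are: symmetry breaking bifurcation, impossible scenario, saddle-node bifurcation, impossible scenario, and non-generic. The branches are annotated with thesis chapters 4, 6, and 8.]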

Continuation techniques provide numerical confirmation of the theory

[Figure: previously observed bifurcation structure for the Four Blob problem.]

Equivariant Branching Lemma: Previous vs. Actual Bifurcation Structure

We used Continuation Techniques and the Theory of Bifurcations with Symmetries on the 4 Blob Problem using the Information Distortion method to get this picture.

[Figure: previous results and actual structure, $q^*$ versus $\beta$, with markers for singularity of $\Delta_q F$ and singularity of $\Delta_{q,\lambda}\mathcal{L}$.]

Smoller-Wasserman Theorem: there are bifurcating branches with symmetry $\langle (1324)^2 \rangle = \langle (12)(34) \rangle$

A closer look …

[Figure: bifurcation from $S_4$ to $S_3$.]

The bifurcation from $S_4$ to $S_3$ is subcritical (the theory predicted this since the bifurcation discriminator $\zeta(q_{1/4}, \beta^*, u) < 0$).

[Figure: bifurcation from $S_3$ to $S_2$.]

The bifurcation from $S_3$ to $S_2$ is subcritical …

[Figure: bifurcation from $S_2$ to $S_1$.]

The bifurcation from $S_2$ to $S_1$ …

What are these branches?

Conclusions

Theorem: In general, either symmetry breaking bifurcations or saddle-node bifurcations can occur.

Outline of proof: The Equivariant Branching Lemma, the Smoller-Wasserman Theorem, and the following singularity structure:

[Diagram: singularity structure. Starting from "$\Delta_{q,\lambda}\mathcal{L}$ is singular", the diagram branches on whether $\Delta_q F$ is singular or non-singular, on two cases for M relative to 1, and on whether $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular or non-singular. The outcomes shown are: symmetry breaking bifurcation, impossible scenario, saddle-node bifurcation, impossible scenario, and non-generic.]

Theorem: All symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ are pitchfork-like, and there exist M bifurcating branches, for which we have explicit directions.

Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*, \lambda^*, \beta^*) + (tu, 0, \beta(t))$ is

[Equation: the explicit expression for $\zeta(q^*, \beta^*, u_k)$; it involves third derivatives of $\mathcal{L}$ and a term with the factor $(M^2 - 3M + 3)$ and fourth derivatives of F in the direction v.]

If $\zeta(q^*, \beta^*, u_k) < 0$, then the branch is subcritical. If $\zeta(q^*, \beta^*, u_k) > 0$, then the branch is supercritical.

Theorem: Solutions of the optimization problem do not always persist from bifurcation.

Theorem: In general, bifurcations do not occur after all of the classes have resolved.

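A numerical algorithm to solve $\max_{q \in \Delta} (G(q) + \beta D(q))$

Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.

1. Perform $\beta$-step: solve

$\Delta_{q,\lambda}\mathcal{L}(q_k, \lambda_k, \beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k, \lambda_k, \beta_k)$

for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$ where $d_k = (\Delta s \, \mathrm{sgn}(\cos\theta)) / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.

2. The initial guess for $(q_{k+1}, \lambda_{k+1})$ at $\beta_{k+1}$ is $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)}) = (q_k, \lambda_k) + d_k (\partial_\beta q_k, \partial_\beta \lambda_k)$.

3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ using pseudoarclength continuation to get the maximizer $q_{k+1}$ and the vector of Lagrange multipliers $\lambda_{k+1}$, using initial guess $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)})$.

4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$ where u is a bifurcating direction and repeat step 3.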
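The bifurcation check in step 4 compares determinant signs of a Hessian block at consecutive continuation points. A minimal sketch of that comparison follows, assuming the two blocks are available as square NumPy arrays; the function name is hypothetical, and `numpy.linalg.slogdet` is used so that only the sign of the determinant is needed.

```python
import numpy as np

def determinant_sign_changed(block_prev, block_next):
    """Return True if det(block) changes sign (or vanishes) between two steps."""
    sign_prev, _ = np.linalg.slogdet(block_prev)   # sign is -1, 0, or +1 for real input
    sign_next, _ = np.linalg.slogdet(block_next)
    return bool(sign_prev * sign_next <= 0)
```

A design caveat: a sign test of this kind only registers an odd number of eigenvalues crossing zero between the two points, so an even-dimensional crossing inside one step would go unnoticed; smaller steps or an eigenvalue count would be needed to catch that case.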

Details …

• The Dynamical System

• Types of Singularities

• Continuation Techniques

• The Explicit Group of Symmetries

• Explicit Existence Theorems for bifurcating branches

A Class of Problems

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$

• G and D are sufficiently smooth in $\Delta$.

• G and D must be invariant under relabelling of the classes.

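The Dynamical System

Goal: To determine the bifurcation structure of solutions to $\max_{q \in \Delta} (G(q) + \beta D(q))$ for $\beta \in [0,\infty)$.

Method: Study the equilibria of the flow

$\begin{pmatrix} \dot{q} \\ \dot{\lambda} \end{pmatrix} = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta), \qquad \mathcal{L}(q,\lambda,\beta) := G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \Big( \sum_{y_N} q(y_N|y) - 1 \Big),$

where $\nabla_{q,\lambda}\mathcal{L} : \mathbb{R}^{n+K} \times \mathbb{R} \to \mathbb{R}^{n+K}$.

• The Jacobian with respect to q of the K constraints $\{\sum_{y_N} q(y_N|y) - 1\}$ is $J = (I_K \; I_K \; \dots \; I_K)$.

• If $w^T \Delta_q F(q^*,\beta) \, w < 0$ for every $w \in \ker J$, then $q^*(\beta)$ is a maximizer of $\max_{q \in \Delta} F(q,\beta)$.

• The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.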
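The second-order condition above (negative definiteness of the Hessian of F on ker J) can be tested numerically by projecting the Hessian onto a basis of ker J. This is a sketch under assumptions: the function name and the tolerance are hypothetical, N ≥ 2 is assumed, and `hessian_F` is assumed to be the symmetric NK × NK Hessian with q stacked class by class so that J = (I_K … I_K).

```python
import numpy as np

def is_constrained_maximizer(hessian_F, n_classes, n_outputs, tol=1e-10):
    """Check w^T H w < 0 for all nonzero w in ker J, where J = (I_K ... I_K)."""
    K, N = n_outputs, n_classes
    J = np.kron(np.ones((1, N)), np.eye(K))        # K x NK constraint Jacobian
    _, _, vt = np.linalg.svd(J)                    # J has rank K
    Z = vt[K:].T                                   # orthonormal basis of ker J
    reduced = Z.T @ hessian_F @ Z                  # Hessian restricted to ker J
    return bool(np.max(np.linalg.eigvalsh(reduced)) < -tol)
```

The projection matters because the unconstrained Hessian may be indefinite while its restriction to ker J is still negative definite; only the restricted matrix decides whether q* is a maximizer of the constrained problem.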

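• In our dynamical system

$\begin{pmatrix} \dot{q} \\ \dot{\lambda} \end{pmatrix} = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$

the Hessian

$\Delta_{q,\lambda}\mathcal{L}(q,\lambda,\beta) = \begin{pmatrix} \Delta_q F & J^T \\ J & 0 \end{pmatrix}$

determines the stability of equilibria and the location of bifurcation.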

Properties of the Dynamical System

[Diagram: singularity structure. Legend: $\Delta_{q,\lambda}\mathcal{L}$ is the constrained Hessian; $\Delta_q F$ is the unconstrained Hessian. The diagram branches on the singularity of $\Delta_{q,\lambda}\mathcal{L}$, of $\Delta_q F$, and of $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$, with outcomes: symmetry breaking bifurcation, impossible scenario, saddle-node bifurcation, impossible scenario, non-generic; annotated with thesis chapters 4, 6, and 8.]

The Dynamical System

How:

Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.

Use bifurcation theory with symmetries to understand bifurcations of the equilibria.

Investigating the Dynamical System

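Continuation

• A local maximum $q_k^*(\beta_k)$ of $\max_{q \in \Delta} F(q,\beta_k)$ is an equilibrium of the gradient flow $(\dot{q}, \dot{\lambda}) = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$.

• Initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in tangent direction $\partial_\beta q_k$, which is found by solving the matrix system

$\Delta_{q,\lambda}\mathcal{L}(q_k, \lambda_k, \beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k, \lambda_k, \beta_k)$

• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton's method.

• Parameter continuation follows the dashed (---) path, pseudoarclength continuation follows the dotted (…) path.

[Figure: the solution curve in $(q, \lambda, \beta)$ with the point $(q_k, \lambda_k, \beta_k)$, the tangent $(\partial_\beta q_k, \partial_\beta \lambda_k)$, the initial guess $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)}, \beta_{k+1}^{(0)})$, and the converged point $(q_{k+1}, \lambda_{k+1}, \beta_{k+1})$.]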

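The Groups

• Let $\mathcal{P}$ be the finite group of n × n "block" permutation matrices which represents the action of $S_N$ on q and $F(q,\beta)$. For example, if N = 3,

$\rho = \begin{pmatrix} 0 & I_K & 0 \\ I_K & 0 & 0 \\ 0 & 0 & I_K \end{pmatrix} \in \mathcal{P}$

permutes $q(Y_{N_1}|y)$ with $q(Y_{N_2}|y)$ for every y.

• $F(q,\beta)$ is $\mathcal{P}$-invariant means that for every $\rho \in \mathcal{P}$, $F(\rho q, \beta) = F(q,\beta)$.

• Let $\Gamma$ be the finite group of (n+K) × (n+K) block permutation matrices which represents the action of $S_N$ on $(q,\lambda)$ and $\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$:

$\Gamma := \left\{ \begin{pmatrix} \rho & 0_{n \times K} \\ 0_{K \times n} & I_K \end{pmatrix} \;:\; \rho \in \mathcal{P} \right\}$ (the Lagrange multipliers and constraints are fixed!)

• $\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$ is $\Gamma$-equivariant means that for every $\gamma \in \Gamma$, $\nabla_{q,\lambda}\mathcal{L}\big(\gamma (q,\lambda)^T, \beta\big) = \gamma \, \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$.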

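Notation and Definitions

• The symmetry of $(q,\lambda)$ is measured by its isotropy subgroup

$\Gamma_{(q,\lambda)} := \{\, \gamma \in \Gamma \;:\; \gamma (q,\lambda)^T = (q,\lambda)^T \,\}$

• An isotropy subgroup $\Sigma$ is a maximal isotropy subgroup of $\Gamma$ if there does not exist an isotropy subgroup $\Sigma'$ of $\Gamma$ such that $\Sigma \subsetneq \Sigma' \subsetneq \Gamma$.

• At bifurcation $(q^*, \lambda^*, \beta^*)$, the fixed point subspace of $\Gamma_{(q^*,\lambda^*)}$ is

$\mathrm{Fix}(\Gamma_{(q^*,\lambda^*)}) = \{\, w \in \ker \Delta_{q,\lambda}\mathcal{L}(q^*, \lambda^*, \beta^*) \;:\; \gamma w = w \;\; \forall \gamma \in \Gamma_{(q^*,\lambda^*)} \,\}$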

Equivariant Branching Lemma

One of the Existence Theorems we use to describe a bifurcation in the presence of symmetries is the Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1).

Idea: The bifurcation structure of local solutions is described by the isotropy subgroups $\Sigma$ of $\Gamma$ which have dim Fix($\Sigma$) = 1.

• System: $\dot{x} = r(x,\beta)$, with $r : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$.
• $r(x,\beta)$ is G-equivariant for some compact Lie group G.
• $r(0,0) = 0$ and $\partial_x r(0,0) = 0$.
• Fix(G) = {0}.
• Let H be an isotropy subgroup of G such that dim Fix(H) = 1.
• Assume $\partial_{x\beta} r(0,0)\, x_0 \ne 0$ (crossing condition).

Then there is a unique smooth solution branch $(t x_0, \beta(t))$ to r = 0 such that $x_0 \in \mathrm{Fix}(H)$ and the isotropy subgroup of each solution is H.

From bifurcation, the Equivariant Branching Lemma shows that the following solutions emerge:

A stationary point $q^*$ is M-uniform if there exists $1 \le M \le N$ and a K × 1 vector P such that $q(y_{N_i}|Y) = P$ for M and only M classes $\{y_{N_i}\}_{i=1}^{M}$ of $Y_N$. These M classes of $Y_N$ are unresolved classes. The classes of $Y_N$ that are not unresolved are called resolved.

The first equilibrium, $q^* \equiv 1/N$, is N-uniform.

Theorem: $q^*$ is M-uniform if and only if $q^*$ is fixed by $S_M$.
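The definition of M-uniform can be read as "M classes share the same K-vector". The helper below is an illustrative sketch only: its name and tolerance are hypothetical, `q` is assumed to be an N × K array with one row per class, and it returns the size of the largest group of numerically identical rows, which equals M when there is a single group of unresolved classes.

```python
import numpy as np

def uniformity(q, tol=1e-8):
    """Largest number of classes (rows of q) that share the same K-vector."""
    q = np.asarray(q, dtype=float)
    best = 1
    for row in q:
        same = np.all(np.abs(q - row) <= tol, axis=1)  # rows equal to this row
        best = max(best, int(same.sum()))
    return best
```

Under this reading the uniform clustering q ≡ 1/N with N = 4 gives 4, and a clustering in which one class has separated from three identical ones gives 3.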

Symmetry Breaking from SM to SM-1

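Theorem: $\dim \ker \Delta_q F(q^*,\beta) = M$ with basis vectors $\{v_i\}_{i=1}^{M}$, where the $\nu$th block of $v_i$ is

$[v_i]_\nu = \begin{cases} v & \text{if } \nu \text{ is the } i\text{th unresolved class} \\ 0 & \text{otherwise} \end{cases}$

Theorem: $\dim \ker \Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta) = M-1$ with basis vectors

$\begin{pmatrix} v_i \\ 0 \end{pmatrix} - \begin{pmatrix} v_M \\ 0 \end{pmatrix}, \quad i = 1, \dots, M-1$

Point: Since the bifurcating solutions whose existence is guaranteed by the EBL and the SW Theorem are tangential to $\ker \Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta)$, we know the explicit form of the bifurcating directions.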

Kernel of the Hessian at Symmetry Breaking Bifurcation

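Symmetry Breaking Bifurcation from M-uniform solutions

Assumptions:

• Let $q^*$ be M-uniform.
• Call the M identical blocks of $\Delta_q F(q^*,\beta)$: B. Call the other N-M blocks of $\Delta_q F(q^*,\beta)$: $\{R_i\}$. We assume that B has a single nullvector v and that $R_i$ is nonsingular for every i.
• If M < N, then $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is nonsingular.

Theorem: Let $(q^*, \lambda^*, \beta^*)$ be a singular point of the flow

$\begin{pmatrix} \dot{q} \\ \dot{\lambda} \end{pmatrix} = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$

such that $q^*$ is M-uniform. Then there exist M bifurcating (M-1)-uniform solutions $(q^*, \lambda^*, \beta^*) + (t u_k, 0, \beta(t))$, where

$[u_k]_\nu = \begin{cases} (M-1)\,v & \text{if } \nu \text{ is the } k\text{th unresolved class} \\ -v & \text{if } \nu \text{ is any other unresolved class} \\ 0 & \text{otherwise} \end{cases}$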
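The bifurcating directions in the theorem can be assembled block by block. The sketch below is an illustration with hypothetical names: `v` stands for the nullvector of the block B, `unresolved` lists the indices of the unresolved classes, and only the q-part of the direction is built (the trailing zero block for the Lagrange multipliers is omitted).

```python
import numpy as np

def bifurcating_direction(v, unresolved, k, n_classes):
    """q-part of u_k: (M-1)v in block k, -v in other unresolved blocks, 0 elsewhere."""
    v = np.asarray(v, dtype=float)
    M = len(unresolved)
    blocks = []
    for nu in range(n_classes):
        if nu == k:
            blocks.append((M - 1) * v)         # the class that breaks away
        elif nu in unresolved:
            blocks.append(-v)                  # the remaining unresolved classes
        else:
            blocks.append(np.zeros_like(v))    # resolved classes do not move
    return np.concatenate(blocks)
```

With four unresolved classes and k chosen as the third class, the result is (-v, -v, 3v, -v). The blocks always sum to zero, so the direction lies in ker J and the perturbed clustering still satisfies the normalization constraints to first order.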

[Diagram: singularity structure. Legend: $\Delta_{q,\lambda}\mathcal{L}$ is the constrained Hessian; $\Delta_q F$ is the unconstrained Hessian. The diagram branches on the singularity of $\Delta_{q,\lambda}\mathcal{L}$, of $\Delta_q F$, and of $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$, with outcomes: symmetry breaking bifurcation, impossible scenario, saddle-node bifurcation, impossible scenario, non-generic; annotated with thesis chapters 4, 6, and 8.]

Some of the bifurcating branches when N = 4 are given by the following isotropy subgroup lattice for $S_4$

[Figure: partial subgroup lattice for $S_4$ with $S_4$ at the top, four copies of $S_3$, twelve copies of $S_2$, and the trivial group 1. Each $S_3$ node carries a direction whose class blocks are an arrangement of $3v, -v, -v, -v$ followed by a final 0 block; each $S_2$ node carries a direction whose class blocks are an arrangement of $2v, -v, -v, 0$ followed by a final 0 block.]

For the 4 Blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches

isotropy group:   S4                   S3                  S2                 1
bif direction:    (-v,-v,3v,-v,0)^T    (-v,2v,0,-v,0)^T    (-v,0,0,v,0)^T     … No more bifs!

Smoller-Wasserman Theorem

The other Existence Theorem: Smoller-Wasserman Theorem (1985-6)

For variational problems where

$r(x,\beta) = \nabla_x f(x,\beta)$

there is a bifurcating solution tangential to Fix(H) for every maximal isotropy subgroup H, not only those with dim Fix(H) = 1.

• dim Fix(H) = 1 implies that H is a maximal isotropy subgroup.

The Smoller-Wasserman Theorem shows that (under the same assumptions as before) if M is composite, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order M in $\Gamma$ and every prime $p \mid M$, $p < M$. Furthermore,

$\dim(\mathrm{Fix}\,\langle \gamma^p \rangle) = p - 1$

Other branches

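Bifurcating branches from a 4-uniform solution are given by the following isotropy subgroup lattice for $S_4$

[Figure: lattice with $S_4$, $A_4$, the subgroups $\langle (12)(34) \rangle$, $\langle (13)(24) \rangle$, $\langle (14)(23) \rangle$, and $\langle (1324) \rangle$, with direction vectors built from blocks $\pm v$. Annotations: $\mathrm{Fix}(A_4) = \{0\}$ and $\mathrm{Fix}(\langle (1234) \rangle) = \{0\}$.]

Maximal isotropy subgroups for $S_4$

[Figure: $S_4$ with four copies of $S_3$, $A_4$, and the subgroups $\langle (12)(34) \rangle$, $\langle (13)(24) \rangle$, $\langle (14)(23) \rangle$.]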

Issues: SM

• The full lattice of subgroups of the group SM is not known for arbitrary M.

• The lattice of maximal subgroups of the group SM is not known for arbitrary M.

More about the Bifurcation Structure

Theorem: All symmetry breaking bifurcations from SM to SM-1 are pitchfork-like.

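Outline of proof: $\beta'(0) = 0$ since $\partial^2_{xx} r(0,0) = 0$.

Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*, \lambda^*, \beta^*) + (t u_k, 0, \beta(t))$ is

[Equation: the explicit expression for $\zeta(q^*, \beta^*, u_k)$; it involves third derivatives of $\mathcal{L}$ and a term with the factor $(M^2 - 3M + 3)$ and fourth derivatives of F in the direction v.]

If $\zeta(q^*, \beta^*, u_k) < 0$, then the branch is subcritical. If $\zeta(q^*, \beta^*, u_k) > 0$, then the branch is supercritical.

Theorem: Generically, bifurcations do not occur after all of the classes have resolved.

Theorem: If $\dim(\ker \Delta_{q,\lambda}\mathcal{L}(q^*, \lambda^*, \beta^*)) = 1$, and if a crossing condition is satisfied, then saddle-node bifurcation must occur.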