Symmetry Breaking Bifurcations of the Information Distortion
Dissertation Defense
April 8, 2003
Albert E. Parker III
Complex Biological Systems Department of Mathematical Sciences
Center for Computational Biology
Montana State University
Goal: Solve the Information Distortion Problem
The goal of my thesis is to solve the Information Distortion problem, an optimization problem of the form
$\max_{q \in \Delta} G(q)$ constrained by $D(q) \le D_0$
where
• $\Delta$ is a subset of $\mathbb{R}^n$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.
Problems of this form arise in the study of clustering problems or optimal source coding systems.
Goal: Another Formulation
Using the method of Lagrange multipliers, the goal of finding solutions of the optimization problem can be rephrased as finding stationary points of the problem
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$
where
• $\beta \in [0,\infty)$.
• $\Delta$ is a subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.
How: Determine the Bifurcation Structure
We have described the bifurcation structure of stationary points to any problem of the form
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$
where
• $\beta \in [0,\infty)$.
• $\Delta$ is a linear subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D have symmetry: they are invariant to some group action.
Thesis Topics
• The Data Clustering Problem
• The Neural Coding Problem
• Information Theory / Probability Theory
• Optimization Theory
• Dynamical Systems
• Bifurcation Theory with Symmetries
• Group Theory
• Continuation Techniques
Outline of this talk
• The Data Clustering Problem
• A Class of Optimization Problems
• Bifurcation with Symmetries
• Numerical Results
The Data Clustering Problem
• Data Classification: identifying all of the books printed in 2002
which address the martial art Kempo
• Data Compression: converting a bitmap file to a jpeg file
[Diagram: a clustering $q(Y_N|Y)$ maps the K objects $\{y_i\}$ of Y to the N objects $\{y_{N_i}\}$ of $Y_N$.]
A Symmetry: invariance to relabelling of the clusters of YN
[Diagram: the clustering $q(Y_N|Y)$ from the K objects $\{y_i\}$ of Y to the N objects $\{y_{N_i}\}$ of $Y_N$, with the clusters labelled class 1 and class 2.]
[Diagram: the same clustering $q(Y_N|Y)$ with the labels swapped, so that the clusters read class 2 and class 1.]
Requirements of a Clustering Method
• The original data is represented reasonably well by the clusters
– Choosing a cost function, D(Y,YN) , called a distortion function, rigorously defines what we mean by the “data is represented reasonably well”.
• Fast implementation
Examples optimizing at a distortion level $D(Y,Y_N) \le D_0$
• Deterministic Annealing (Rose 1998): a fast clustering algorithm
$\max_{q \in \Delta} H(Y_N|Y)$ constrained by $D(Y,Y_N) \le D_0$
• Rate Distortion Theory (Shannon ~1950): minimum informative compression
$\min_{q \in \Delta} I(Y;Y_N)$ constrained by $D(Y,Y_N) \le D_0$
where
$\Delta := \{ q(Y_N|Y) : \sum_{y_N \in Y_N} q(y_N|y) = 1 \ \forall y \in Y \} \subset \mathbb{R}^{NK}$
Inputs and Outputs and Clustered Outputs
• The Information Distortion method clusters the outputs Y into clusters $Y_N$ so that the information that one can learn about X by observing $Y_N$, $I(X;Y_N)$, is as close as possible to the mutual information $I(X;Y)$.
• The corresponding information distortion function is
$D_I(Y;Y_N) = I(X;Y) - I(X;Y_N)$
[Diagram: inputs X (L objects $\{x_i\}$) and outputs Y (K objects $\{y_i\}$) are related by p(X,Y); the clustering $q(Y_N|Y)$ maps Y to the clusters $Y_N$ (N objects $\{y_{N_i}\}$).]
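The information distortion function above can be evaluated directly from a joint distribution and a clustering. The following is a minimal sketch, assuming p(X,Y) is stored as an L x K array and $q(Y_N|Y)$ as an N x K array whose columns sum to 1; the function names and the toy data are hypothetical and not taken from the thesis.

```python
import numpy as np

def mutual_information(p_joint):
    """Mutual information I(A;B) in bits for a joint distribution p[a, b]."""
    p = np.asarray(p_joint, dtype=float)
    pa = p.sum(axis=1, keepdims=True)   # marginal of A, shape (|A|, 1)
    pb = p.sum(axis=0, keepdims=True)   # marginal of B, shape (1, |B|)
    mask = p > 0                        # skip zero cells (0 log 0 = 0)
    return float(np.sum(p[mask] * np.log2(p[mask] / (pa @ pb)[mask])))

def information_distortion(p_xy, q):
    """D_I = I(X;Y) - I(X;Y_N), with q[nu, k] = q(y_N = nu | y = k)."""
    p_xyn = p_xy @ q.T                  # p(x, y_N) = sum_y p(x, y) q(y_N | y)
    return mutual_information(p_xy) - mutual_information(p_xyn)

# Hypothetical toy data: 4 inputs and 4 outputs, perfectly correlated.
p_xy = np.eye(4) / 4
q_hard = np.array([[1, 1, 0, 0], [0, 0, 1, 1]], dtype=float)  # N = 2 clusters
q_uniform = np.full((2, 4), 0.5)                              # q = 1/N

print(information_distortion(p_xy, q_hard))
print(information_distortion(p_xy, q_uniform))
```

In this toy case the uniform clustering discards all of the information about X, while the hard clustering into two classes keeps one of the two bits.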
Two optimization problems which use the information distortion function
• Information Distortion Method (Dimitrov and Miller 2001)
$\max_{q \in \Delta} H(Y_N|Y)$ constrained by $D_I(Y,Y_N) \le D_0$
$\max_{q \in \Delta} H(Y_N|Y) + \beta I(X;Y_N)$
• Information Bottleneck Method (Tishby, Pereira, Bialek 1999)
$\min_{q \in \Delta} I(Y;Y_N)$ constrained by $D_I(Y,Y_N) \le D_0$
$\max_{q \in \Delta} -I(Y;Y_N) + \beta I(X;Y_N)$
An annealing algorithm to solve
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$
Let $q_0$ be the maximizer of $\max_q G(q)$, and let $\beta_0 = 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.
1. Perform $\beta$-step: Let $\beta_{k+1} = \beta_k + d_k$ where $d_k > 0$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + \eta$ for some small perturbation $\eta$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
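The three steps above can be sketched on a toy problem. This is an illustration only, under loud assumptions: the objective is a hypothetical unconstrained scalar function with $G(q) = -q^4/4 - q^2/2$ and $D(q) = q^2/2$ (not the Information Distortion problem), the inner optimizer is plain gradient ascent, and all names and default parameters are made up for the example.

```python
import numpy as np

def anneal(grad_F, q0, beta_max, d_beta=0.1, eta=1e-3, lr=0.05, iters=2000, seed=0):
    """Annealing loop: beta-step, perturbed initial guess, local maximization."""
    rng = np.random.default_rng(seed)
    q, beta = np.asarray(q0, dtype=float), 0.0
    path = [(beta, q.copy())]
    while beta < beta_max - 1e-12:
        beta = min(beta + d_beta, beta_max)          # step 1: beta-step with d_k > 0
        q = q + eta * rng.standard_normal(q.shape)   # step 2: q_k plus a small perturbation
        for _ in range(iters):                       # step 3: gradient ascent on F(., beta)
            q = q + lr * grad_F(q, beta)
        path.append((beta, q.copy()))
    return path

def grad_F(q, beta):
    """Gradient of the toy F(q, beta) = -q^4/4 + (beta - 1) q^2 / 2."""
    return -q**3 + (beta - 1.0) * q

path = anneal(grad_F, q0=np.zeros(1), beta_max=2.0)
for beta, q in path[::5]:
    print(round(beta, 2), q)
```

For this toy F the maximizer q = 0 loses stability at $\beta = 1$ and the perturbation in step 2 lets the iteration move onto one of the two branches $q = \pm\sqrt{\beta - 1}$, which is the role the perturbation plays in the annealing algorithm.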
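Application of the annealing method to the Information Distortion problem $\max_{q \in \Delta} (H(Y_N|Y) + \beta I(X;Y_N))$ when p(X,Y) is defined by four Gaussian blobs
[Figure: the joint distribution p(X,Y) of inputs X (52 objects) and outputs Y (52 objects), and the clustering $q(Y_N|Y)$ of the 52 objects of Y into N objects of $Y_N$, with $I(X;Y_N) = D(q(Y_N|Y))$.]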
Observed Bifurcations for the Four Blob problem:
We just saw the optimal clusterings $q^*$ at some $\beta^* = \beta_{\max}$. What do the clusterings look like for $\beta < \beta_{\max}$?
[Figure: bifurcations of $q^*(\beta)$.]
[Figure: two panels, "Observed Bifurcations for the 4 Blob Problem" and "Conceptual Bifurcation Structure", with the branch $q^* = 1/N$ marked.]
Questions:
Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?
What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?
How many bifurcating solutions are there?
What do the bifurcating branches look like? Are they subcritical or supercritical ?
What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem?
Are there bifurcations after all of the classes have resolved ?
Bifurcations with symmetry
• To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$.
• The “obvious” symmetry is that $G(q) + \beta D(q)$ is invariant to relabelling of the N classes of $Y_N$.
• The symmetry group of all permutations on N symbols is $S_N$.
[Figure: an example relabelling that switches labels 1 and 3.]
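The relabelling invariance can be checked numerically for the Information Distortion cost $H(Y_N|Y) + \beta I(X;Y_N)$. The sketch below assumes $q(Y_N|Y)$ is stored as an N x K array (one row per class) and uses random, hypothetical data; relabelling the classes is then a permutation of the rows.

```python
import numpy as np

def mutual_information(p_joint):
    """Mutual information in bits for a joint distribution p[a, b]."""
    p = np.asarray(p_joint, dtype=float)
    pa = p.sum(axis=1, keepdims=True)
    pb = p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / (pa @ pb)[mask])))

def conditional_entropy(p_y, q):
    """H(Y_N|Y) = -sum_y p(y) sum_nu q(nu|y) log2 q(nu|y)."""
    safe = np.where(q > 0, q, 1.0)      # log2(1) = 0 handles q = 0 entries
    return float(-np.sum(p_y * q * np.log2(safe)))

def F(p_xy, q, beta):
    """F(q, beta) = H(Y_N|Y) + beta * I(X;Y_N)."""
    p_y = p_xy.sum(axis=0)
    return conditional_entropy(p_y, q) + beta * mutual_information(p_xy @ q.T)

rng = np.random.default_rng(1)
p_xy = rng.random((5, 6))
p_xy /= p_xy.sum()                      # hypothetical joint distribution p(X,Y)
q = rng.random((3, 6))
q /= q.sum(axis=0)                      # N = 3 classes, each column sums to 1

q_relabelled = q[[2, 1, 0]]             # switch labels 1 and 3
print(F(p_xy, q, 1.5), F(p_xy, q_relabelled, 1.5))
```

Both printed values agree up to rounding, since each term of F is a sum over classes and does not depend on the order in which the classes are listed.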
Symmetry Breaking Bifurcations
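[Figure: bifurcation diagram for N = 4 with the branches annotated by their symmetry: $q_{1/N}$ is fixed by $S_N = S_4$; $q^*$ is fixed by $S_{N-1} = S_3$; $q^*$ is fixed by $S_{N-2} = S_2$.]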
Existence Theorems for Bifurcating Branches
Given a bifurcation at a point fixed by $S_N$:
• Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1)
– There are N bifurcating branches, each of which has symmetry $S_{N-1}$.
• The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6)
– There are bifurcating branches which have symmetry $\langle (N\text{-cycle})^p \rangle$ for every prime $p \mid N$, $p < N$.
Given a bifurcation at a point fixed by $S_{N-1}$:
• Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1)
– Gives N-1 bifurcating branches which have symmetry $S_{N-2}$.
• The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6)
– Gives bifurcating branches which have symmetry $\langle (M\text{-cycle})^p \rangle$, with M = N-1, for every prime $p \mid N-1$, $p < N-1$.
When N = 4, N-1 = 3 is prime, so no prime p < 3 divides it and the Smoller-Wasserman Theorem gives no bifurcating branches.
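The divisibility condition in the Smoller-Wasserman statements is easy to enumerate. The helper below is a hypothetical illustration (its name is not from the thesis): it lists the primes p with $p \mid M$ and $p < M$, which are the exponents for which a branch with symmetry $\langle (M\text{-cycle})^p \rangle$ is asserted.

```python
def smoller_wasserman_primes(M):
    """Primes p that divide M with p < M."""
    def is_prime(p):
        return p >= 2 and all(p % d for d in range(2, int(p**0.5) + 1))
    return [p for p in range(2, M) if M % p == 0 and is_prime(p)]

for M in (4, 3, 6):
    print(M, smoller_wasserman_primes(M))
```

For M = 4 the only such prime is 2, and for M = 3 the list is empty, in line with the remark that no additional branches arise at a point fixed by $S_3$.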
[Figure: a partial subgroup lattice for $S_4$ and the corresponding bifurcating directions given by the Equivariant Branching Lemma. $S_4$ sits above four copies of $S_3$, which sit above twelve copies of $S_2$, which sit above the trivial group 1. Each copy of $S_3$ is paired with a direction whose four blocks are a permutation of $(3v, -v, -v, -v)$, followed by 0; each copy of $S_2$ is paired with a direction whose four blocks are a permutation of $(2v, -v, -v, 0)$, followed by 0.]
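[Figure: a partial subgroup lattice for $S_4$ and the corresponding bifurcating directions given by the Smoller-Wasserman Theorem. The lattice shows $S_4$, $A_4$, three subgroups labelled (12),(34); (13),(24); and (14),(23), and the 4-cycle (1324), with directions built from blocks $\pm v$. Annotations: $\mathrm{Fix}(A_4) = \{0\}$ and $\mathrm{Fix}(\langle (1234) \rangle) = \{0\}$.]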
[Figure: the conceptual bifurcation structure of $q^*$ beside the group structure, the lattice $S_4$, four copies of $S_3$, twelve copies of $S_2$, and 1.]
The Equivariant Branching Lemma shows that the bifurcation structure from $S_M$ to $S_{M-1}$ is …
[Figure: the conceptual bifurcation structure of $q^*$ beside the group structure, the lattice containing $S_4$, $A_4$, the subgroups labelled (12),(34); (13),(24); (14),(23), and the 4-cycle (1324).]
The Smoller-Wasserman Theorem shows additional structure … 3 branches from the $S_4$ to $S_3$ bifurcation only.
If we stay on a branch which is fixed by $S_M$, how many bifurcations are there?
Theorem: There are exactly K/N bifurcations on the branch $(q_{1/N}, \beta)$ for the Information Distortion problem.
For the 4 Blob problem (K = 52, N = 4) there are 13 bifurcations on the first branch.
Bifurcation theory in the presence of symmetries
enables us to answer the questions previously posed …
Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations? There are N-1 symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ for $M \le N$.
What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?
How many bifurcating solutions are there? There are at least N from the first bifurcation, at least N-1 from the next one, etc.
What do the bifurcating branches look like? They are subcritical or supercritical depending on the sign of the bifurcation discriminator $\zeta(q^*,\beta^*,u_k)$.
What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem? No.
Are there bifurcations after all of the classes have resolved? In general, no.
We can explain the bifurcation structure of all problems of the form
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$
where
• $\beta \in [0,\infty)$.
• $\Delta$ is a subset of $\mathbb{R}^{NK}$.
• G and D are sufficiently smooth in $\Delta$.
• G and D are invariant to relabelling of the classes of $Y_N$.
• The blocks of the Hessian $\Delta_q (G + \beta D)$ at bifurcation satisfy a set of generic conditions.
This class of problems includes the Information Distortion problem.
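[Diagram: a decision tree classifying singular points, where $\Delta_{q,\lambda}\mathcal{L}$ is the constrained Hessian and $\Delta_q F$ is the unconstrained Hessian. Starting from "$\Delta_{q,\lambda}\mathcal{L}$ is singular", the tree branches on whether $\Delta_q F$ is singular or nonsingular, on conditions comparing M with 1, and on whether $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ is singular or nonsingular. The leaves are: symmetry breaking bifurcation, impossible scenario, saddle-node bifurcation, impossible scenario, non-generic. The branches are annotated with chapters 4, 6 and 8 of the thesis.]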
Equivariant Branching Lemma: Previous vs. Actual Bifurcation Structure
We used Continuation Techniques and the Theory of Bifurcations with Symmetries on the 4 Blob Problem using the Information Distortion method to get this picture.
[Figure: previous results compared with the actual structure; markers distinguish singularities of F from singularities of $\mathcal{L}$.]
The bifurcation from $S_4$ to $S_3$ is subcritical (the theory predicted this since the bifurcation discriminator $\zeta(q_{1/4},\beta^*,u) < 0$).
Conclusions
Theorem: In general, either symmetry breaking bifurcations or saddle-node bifurcations can occur.
Outline of proof: The Equivariant Branching Lemma, the Smoller-Wasserman Theorem, and the following singularity structure:
[Diagram: the decision tree on the singularity of the constrained Hessian $\Delta_{q,\lambda}\mathcal{L}$, the singularity of the unconstrained Hessian $\Delta_q F$, conditions comparing M with 1, and the singularity of $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$, with leaves: symmetry breaking bifurcation, impossible scenario, saddle-node bifurcation, impossible scenario, non-generic.]
Theorem: All symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ are pitchfork-like, and there exist M bifurcating branches, for which we have explicit directions.
Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*,\lambda^*,\beta^*) + (tu, 0, \beta(t))$ is $\zeta(q^*,\beta^*,u_k)$.
[Formula for $\zeta$ garbled in extraction. Recoverable fragments: a factor 3, the coefficient $(M^2 - 3M + 3)$, a term in F acting on $[v, v, v, v]$, and a double sum over indices r, s.]
If $\zeta(q^*,\beta^*,u_k) < 0$, then the branch is subcritical. If $\zeta(q^*,\beta^*,u_k) > 0$, then the branch is supercritical.
Theorem: Solutions of the optimization problem do not always persist from bifurcation.
Theorem: In general, bifurcations do not occur after all of the classes have resolved.
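A numerical algorithm to solve $\max_{q \in \Delta} (G(q) + \beta D(q))$
Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.
1. Perform $\beta$-step: solve
$\Delta_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)$
for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$ where $d_k = (\Delta s \, \mathrm{sgn}(\cos\theta)) / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $(q_{k+1},\lambda_{k+1})$ at $\beta_{k+1}$ is $(q_{k+1}^{(0)},\lambda_{k+1}^{(0)}) = (q_k,\lambda_k) + d_k (\partial_\beta q_k, \partial_\beta \lambda_k)$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ using pseudoarclength continuation to get the maximizer $q_{k+1}$ and the vector of Lagrange multipliers $\lambda_{k+1}$, using initial guess $(q_{k+1}^{(0)},\lambda_{k+1}^{(0)})$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$ where u is a bifurcating direction and repeat step 3.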
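Steps 1 and 4 of the algorithm above reduce to a linear solve and a determinant sign comparison. The sketch below shows one possible way to code them with NumPy. It assumes the Hessian and the $\beta$-derivative of the gradient are supplied as dense arrays, and it reads the factor $\mathrm{sgn}(\cos\theta)$ as the sign of the inner product between successive tangent vectors, which is an assumption of this example. Function names are hypothetical.

```python
import numpy as np

def beta_step(hess_L, dbeta_grad_L, n, delta_s, prev_tangent=None):
    """Step 1: solve hess_L @ t = -dbeta_grad_L for t = (d_beta q, d_beta lambda)
    and return t together with the step length d_k."""
    t = np.linalg.solve(hess_L, -np.asarray(dbeta_grad_L, dtype=float))
    dq, dlam = t[:n], t[n:]
    tangent = np.append(t, 1.0)         # tangent direction in (q, lambda, beta)
    sign = 1.0
    if prev_tangent is not None and tangent @ prev_tangent < 0:
        sign = -1.0                     # keep moving the same way along the branch
    d_k = delta_s * sign / np.sqrt(dq @ dq + dlam @ dlam + 1.0)
    return t, d_k

def bifurcation_detected(block_prev, block_next):
    """Step 4: compare the determinant signs of one Hessian block at steps k and k+1."""
    sign_prev, _ = np.linalg.slogdet(block_prev)
    sign_next, _ = np.linalg.slogdet(block_next)
    return bool(sign_prev != sign_next)

# Hypothetical 2 x 2 example with n = 1 and K = 1.
H = np.array([[-2.0, 0.0], [0.0, -1.0]])
t, d_k = beta_step(H, np.array([2.0, 0.0]), n=1, delta_s=0.5)
print(t, d_k)
print(bifurcation_detected(np.array([[-1.0]]), np.array([[0.5]])))
```

Using `slogdet` rather than `det` avoids overflow or underflow of the determinant for large blocks, since only its sign is needed.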
Details …
• The Dynamical System
• Types of Singularities
• Continuation Techniques
• The Explicit Group of Symmetries
• Explicit Existence Theorems for bifurcating branches
A Class of Problems
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$
• G and D are sufficiently smooth in $\Delta$.
• G and D must be invariant under relabelling of the classes.
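The Dynamical System
Goal: To determine the bifurcation structure of solutions to $\max_{q \in \Delta} (G(q) + \beta D(q))$ for $\beta \in [0,\infty)$.
Method: Study the equilibria of the flow
$(\dot q, \dot\lambda) = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$, where $\mathcal{L}(q,\lambda,\beta) := G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \left( \sum_z q(z|y) - 1 \right)$ and $\nabla_{q,\lambda}\mathcal{L}(\cdot,\cdot,\beta) : \mathbb{R}^{n+K} \to \mathbb{R}^{n+K}$.
• The Jacobian with respect to q of the K constraints $\{\sum_{Y_N} q(Y_N|y) - 1\}$ is $J = (I_K \ I_K \ \dots \ I_K)$.
• If $w^T \Delta_q F(q^*,\beta)\, w < 0$ for every $w \in \ker J$, then $q^*(\beta)$ is a maximizer of the constrained problem.
• The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.
• In our dynamical system the Hessian
$\Delta_{q,\lambda}\mathcal{L}(q,\lambda,\beta) = \begin{pmatrix} \Delta_q F & J^T \\ J & 0 \end{pmatrix}$
determines the stability of equilibria and the location of bifurcation.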
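The maximizer test on $\ker J$ and the bordered form of the constrained Hessian can be written down in a few lines. This is a minimal sketch assuming the unconstrained Hessian is available as a dense symmetric NK x NK array; the function names and the small numerical example are hypothetical.

```python
import numpy as np

def constraint_jacobian(N, K):
    """J = (I_K I_K ... I_K), the K x NK Jacobian of the normalization constraints."""
    return np.hstack([np.eye(K)] * N)

def bordered_hessian(hess_F, J):
    """Constrained Hessian [[hess_F, J^T], [J, 0]] of the Lagrangian."""
    K = J.shape[0]
    return np.block([[hess_F, J.T], [J, np.zeros((K, K))]])

def is_constrained_maximizer(hess_F, J, tol=1e-10):
    """True if w^T hess_F w < 0 for every nonzero w in ker J."""
    _, s, vt = np.linalg.svd(J)
    rank = int(np.sum(s > tol))
    Z = vt[rank:].T                     # orthonormal basis of ker J
    reduced = Z.T @ hess_F @ Z          # Hessian restricted to ker J
    return bool(np.all(np.linalg.eigvalsh(reduced) < -tol))

# Hypothetical example with N = 2 classes and K = 1 object.
J = constraint_jacobian(2, 1)
H_indefinite = np.array([[1.0, 0.0], [0.0, -3.0]])
print(is_constrained_maximizer(H_indefinite, J))
print(bordered_hessian(H_indefinite, J))
```

The example Hessian is indefinite on the whole space but negative definite on $\ker J$, which is spanned by $(1,-1)$, so the point still passes the constrained test. This is why the condition is stated on $\ker J$ rather than on all of $\mathbb{R}^{NK}$.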
Properties of the Dynamical System
[Diagram: the decision tree relating the singularity of the constrained Hessian $\Delta_{q,\lambda}\mathcal{L}$, the singularity of the unconstrained Hessian $\Delta_q F$, conditions comparing M with 1, and the singularity of $B \sum_{i=1}^{N-M} R_i^{-1} + M I_K$ to the outcomes: symmetry breaking bifurcation, impossible scenario, saddle-node bifurcation, impossible scenario, non-generic. Branches are annotated with chapters 4, 6 and 8.]
The Dynamical System
How:
Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.
Use bifurcation theory with symmetries to understand bifurcations of the equilibria.
Investigating the Dynamical System
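Continuation
• A local maximum $q_k^*(\beta_k)$ of the optimization problem is an equilibrium of the gradient flow $(\dot q, \dot\lambda) = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$.
• Initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in tangent direction $\partial_\beta q_k$, which is found by solving the matrix system
$\Delta_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)$
• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton’s method.
• Parameter continuation follows the dashed (---) path, pseudoarclength continuation follows the dotted (…) path.
[Figure: a solution branch in the $(q, \beta)$ plane showing the point $(q_k, \beta_k)$, the tangent direction at $(q_k, \lambda_k, \beta_k)$, the initial guesses $(q_{k+1}^{(0)}, \beta_{k+1}^{(0)})$, and the converged point $(q_{k+1}, \beta_{k+1})$.]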
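The Groups
• Let P be the finite group of n × n “block” permutation matrices which represents the action of $S_N$ on q and $F(q,\beta)$. For example, if N=3,
$\rho = \begin{pmatrix} 0 & I_K & 0 \\ I_K & 0 & 0 \\ 0 & 0 & I_K \end{pmatrix}$
permutes $q(Y_{N_1}|y)$ with $q(Y_{N_2}|y)$ for every y.
• $F(q,\beta)$ is P-invariant means that for every $\rho \in P$, $F(\rho q,\beta) = F(q,\beta)$.
• Let $\Gamma$ be the finite group of (n+K) × (n+K) block permutation matrices which represents the action of $S_N$ on $(q,\lambda)$ and $\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$:
$\Gamma := \left\{ \begin{pmatrix} \rho & 0_{n \times K} \\ 0_{K \times n} & I_K \end{pmatrix} \,\middle|\, \rho \in P \right\}$
The Lagrange multipliers and constraints are fixed!
• $\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$ is $\Gamma$-equivariant means that for every $\gamma \in \Gamma$, $\nabla_{q,\lambda}\mathcal{L}(\gamma (q,\lambda),\beta) = \gamma \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$.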
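The block permutation matrices can be built with a Kronecker product. The sketch below is illustrative only: it assumes q is stored as N stacked blocks of length K, and the function names are hypothetical. It also checks that such a matrix leaves the constraint Jacobian $J = (I_K \ \dots \ I_K)$ unchanged, which is the sense in which the constraints are fixed.

```python
import numpy as np

def block_permutation(perm, K):
    """n x n matrix rho with (rho q) block i equal to q block perm[i]."""
    N = len(perm)
    P = np.zeros((N, N))
    P[np.arange(N), perm] = 1.0
    return np.kron(P, np.eye(K))        # replace each 1 by I_K and each 0 by a zero block

def gamma_element(rho, K):
    """(n+K) x (n+K) matrix acting as rho on q and as the identity on lambda."""
    n = rho.shape[0]
    return np.block([[rho, np.zeros((n, K))], [np.zeros((K, n)), np.eye(K)]])

N, K = 3, 2
rho = block_permutation([1, 0, 2], K)   # swap the first two classes
q = np.arange(N * K, dtype=float)       # blocks [0, 1], [2, 3], [4, 5]
lam = np.array([7.0, 8.0])
J = np.hstack([np.eye(K)] * N)

print(rho @ q)
print(gamma_element(rho, K) @ np.concatenate([q, lam]))
print(np.allclose(J @ rho, J))
```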
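Notation and Definitions
• The symmetry of $(q,\lambda)$ is measured by its isotropy subgroup
$\Sigma_{q,\lambda} := \{ \gamma \in \Gamma \mid \gamma (q,\lambda) = (q,\lambda) \}$
• An isotropy subgroup $\Sigma$ is a maximal isotropy subgroup of $\Gamma$ if there does not exist an isotropy subgroup $\Sigma'$ of $\Gamma$ such that $\Sigma \subsetneq \Sigma' \subsetneq \Gamma$.
• At bifurcation $(q^*,\lambda^*,\beta^*)$, the fixed point subspace of $\Sigma_{q^*,\lambda^*}$ is
$\mathrm{Fix}(\Sigma_{q^*,\lambda^*}) = \{ w \in \ker \Delta_{q,\lambda}\mathcal{L}(q^*,\lambda^*,\beta^*) \mid \gamma w = w \text{ for every } \gamma \in \Sigma_{q^*,\lambda^*} \}$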
Equivariant Branching Lemma
One of the Existence Theorems we use to describe a bifurcation in the presence of symmetries is the Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1).
Idea: The bifurcation structure of local solutions is described by the isotropy subgroups $\Sigma$ of $\Gamma$ which have $\dim \mathrm{Fix}(\Sigma) = 1$.
• System: $\dot x = r(x,\beta)$, $r : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$.
• $r(x,\beta)$ is G-equivariant for some compact Lie Group G.
• $r(0,0) = 0$, $\partial_x r(0,0) = 0$.
• Fix(G) = {0}.
• Let H be an isotropy subgroup of G such that dim Fix(H) = 1.
• Assume $\partial_{x\beta} r(0,0) \neq 0$ (crossing condition).
Then there is a unique smooth solution branch $(t x_0, \beta(t))$ to r = 0 such that $x_0 \in \mathrm{Fix}(H)$ and the isotropy subgroup of each solution is H.
From bifurcation, the Equivariant Branching Lemma shows that the following solutions emerge:
A stationary point $q^*$ is M-uniform if there exists $1 \le M \le N$ and a K x 1 vector P such that $q(y_{N_i}|Y) = P$ for M and only M classes, $\{y_{N_i}\}_{i=1}^{M}$, of $Y_N$. These M classes of $Y_N$ are unresolved classes. The classes of $Y_N$ that are not unresolved are called resolved.
The first equilibrium, $q^* \equiv 1/N$, is N-uniform.
Theorem: $q^*$ is M-uniform if and only if $q^*$ is fixed by $S_M$.
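Symmetry Breaking from $S_M$ to $S_{M-1}$
Kernel of the Hessian at Symmetry Breaking Bifurcation
Theorem: $\dim \ker \Delta_q F(q^*,\beta) = M$ with basis vectors $\{v_i\}_{i=1}^{M}$, where $[v_i]_\nu = v$ if $\nu$ is the $i$-th unresolved class and $[v_i]_\nu = 0$ otherwise.
Theorem: $\dim \ker \Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta) = M-1$ with basis vectors $\left\{ \begin{pmatrix} v_i \\ 0 \end{pmatrix} - \begin{pmatrix} v_M \\ 0 \end{pmatrix} \right\}_{i=1}^{M-1}$.
Point: Since the bifurcating solutions whose existence is guaranteed by the EBL and the SW Theorem are tangential to $\ker \Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta)$, we know the explicit form of the bifurcating directions.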
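Symmetry Breaking Bifurcation from M-uniform solutions
Assumptions:
• Let $q^*$ be M-uniform.
• Call the M identical blocks of the Hessian $\Delta_q F(q^*,\beta)$: B. Call the other N-M blocks of $\Delta_q F(q^*,\beta)$: $\{R_i\}$. We assume that B has a single nullvector v and that $R_i$ is nonsingular for every i.
• If M<N, then $B \sum_i R_i^{-1} + M I_K$ is nonsingular.
Theorem: Let $(q^*,\lambda^*,\beta^*)$ be a singular point of the flow $(\dot q, \dot\lambda) = \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$ such that $q^*$ is M-uniform. Then there exist M bifurcating (M-1)-uniform solutions $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$, where
$[u_k]_\nu = \begin{cases} (M-1)\,v & \text{if } \nu \text{ is the } k\text{-th unresolved class} \\ -v & \text{if } \nu \text{ is any other unresolved class} \\ 0 & \text{otherwise} \end{cases}$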
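The bifurcating directions $u_k$ in the theorem can be assembled mechanically from the nullvector v. The following sketch assumes v is given as a length-K array and that the unresolved classes are listed by index; the function name and the sample nullvector are hypothetical. It also checks that $u_k$ lies in $\ker J$, so that moving along it preserves the normalization constraints to first order.

```python
import numpy as np

def bifurcating_direction(v, N, unresolved, k):
    """q-part of u_k: (M-1)v on the k-th unresolved class, -v on the other
    unresolved classes, and 0 on the resolved classes."""
    v = np.asarray(v, dtype=float)
    M = len(unresolved)
    u = np.zeros((N, v.size))
    for nu in unresolved:
        u[nu] = (M - 1) * v if nu == unresolved[k] else -v
    return u.ravel()

v = np.array([1.0, -1.0])               # hypothetical nullvector of the block B, K = 2
N = 4
J = np.hstack([np.eye(v.size)] * N)     # J = (I_K I_K I_K I_K)

u_from_S4 = bifurcating_direction(v, N, unresolved=[0, 1, 2, 3], k=2)
u_from_S3 = bifurcating_direction(v, N, unresolved=[0, 1, 3], k=1)
print(u_from_S4.reshape(N, -1))         # blocks (-v, -v, 3v, -v)
print(u_from_S3.reshape(N, -1))         # blocks (-v, 2v, 0, -v)
print(J @ u_from_S4, J @ u_from_S3)
```

The two examples have the same block pattern as the directions $(-v,-v,3v,-v)$ and $(-v,2v,0,-v)$ reported for the 4 Blob problem, before the trailing 0 for the Lagrange multipliers is appended.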
Some of the bifurcating branches when N = 4 are given by the following isotropy subgroup lattice for $S_4$
[Figure: the partial subgroup lattice for $S_4$ with $S_4$ above four copies of $S_3$, twelve copies of $S_2$, and the trivial group 1. Each copy of $S_3$ is paired with a direction whose four blocks are a permutation of $(3v, -v, -v, -v)$, followed by 0; each copy of $S_2$ is paired with a direction whose four blocks are a permutation of $(2v, -v, -v, 0)$, followed by 0.]
For the 4 Blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches
isotropy group | $S_4$ | $S_3$ | $S_2$ | 1
bif direction | $(-v,-v,3v,-v,0)^T$ | $(-v,2v,0,-v,0)^T$ | $(-v,0,0,v,0)^T$ | … No more bifs!
Smoller-Wasserman Theorem
The other Existence Theorem:
Smoller-Wasserman Theorem (1985-6)
For variational problems where
$r(x,\beta) = \nabla_x f(x,\beta)$
there is a bifurcating solution tangential to Fix(H) for every maximal isotropy subgroup H, not only those with dim Fix(H) = 1.
• dim Fix(H) = 1 implies that H is a maximal isotropy subgroup.
The Smoller-Wasserman Theorem shows that (under the same assumptions as before) if M is composite, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order M in $\Gamma$ and every prime $p \mid M$, $p < M$. Furthermore,
$\dim(\mathrm{Fix}\langle \gamma^p \rangle) = p - 1$
Other branches
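Bifurcating branches from a 4-uniform solution are given by the following isotropy subgroup lattice for $S_4$
[Figure: the lattice showing $S_4$, $A_4$, three subgroups labelled (12),(34); (13),(24); and (14),(23), and the 4-cycle (1324), with directions built from blocks $\pm v$. Annotations: $\mathrm{Fix}(A_4) = \{0\}$ and $\mathrm{Fix}(\langle (1234) \rangle) = \{0\}$.]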
Issues: SM
• The full lattice of subgroups of the group SM is not known for arbitrary M.
• The lattice of maximal subgroups of the group SM is not known for arbitrary M.
More about the Bifurcation Structure
Theorem: All symmetry breaking bifurcations from SM to SM-1 are pitchfork-like.
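Outline of proof: $\beta'(0) = 0$ since $\partial^2_{xx} r(0,0) = 0$.
Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$ is $\zeta(q^*,\beta^*,u_k)$.
[Formula for $\zeta$ garbled in extraction. Recoverable fragments: a factor 3, the coefficient $(M^2 - 3M + 3)$, a term in F acting on $[v, v, v, v]$, and a double sum over indices r, s.]
If $\zeta(q^*,\beta^*,u_k) < 0$, then the branch is subcritical. If $\zeta(q^*,\beta^*,u_k) > 0$, then the branch is supercritical.
Theorem: Generically, bifurcations do not occur after all of the classes have resolved.
Theorem: If $\dim(\ker \Delta_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta)) = 1$, and if a crossing condition is satisfied, then saddle-node bifurcation must occur.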