
Adjoint Orbits, Principal Components, and Neural Nets

2. Some facts about Lie groups and examples
3. Examples of adjoint orbits and a distance measure
4. Descent equations on adjoint orbits
5. Properties of the double bracket equation
6. Smoothed versions of the double bracket equation
7. The principal component extractor
8. The performance of subspace filters
9. Variations on a theme

Where We Are

9:30 - 10:45   Part 1. Examples and Mathematical Background
10:45 - 11:15  Coffee break
11:15 - 12:30  Part 2. Principal Components, Neural Nets, and Automata
12:30 - 14:30  Lunch
14:30 - 15:45  Part 3. Precise and Approximate Representation of Numbers
15:45 - 16:15  Coffee break
16:15 - 17:30  Part 4. Quantum Computation

The Adjoint Orbit Theory and Some Applications

1. Some facts about Lie groups and examples
2. Examples of adjoint orbits and a distance measure
3. Descent equations on adjoint orbits
4. Properties of the double bracket equation
5. Smoothed versions of the double bracket equation
6. Loops and deck transformations

Some Background

By a Lie group $G$ we understand a group with a topology such that multiplication and inversion are continuous. (In this setting continuous implies differentiable.) We say that a group acts on a differentiable manifold $X$ via $\phi$ if $\phi: G \times X \to X$ is differentiable and $\phi(G_2 G_1, x) = \phi(G_2, \phi(G_1, x))$. The group of orthogonal matrices $SO(n)$ acts on the $(n-1)$-dimensional sphere via the action $\phi(\Theta, x) = \Theta x$.
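A quick numerical sanity check of this action (a sketch, not from the slides; the dimension and random seed are arbitrary choices):

```python
import numpy as np

# An orthogonal matrix maps unit vectors to unit vectors, so
# phi(Theta, x) = Theta @ x really is an action on the sphere S^(n-1).
rng = np.random.default_rng(0)
n = 3
Theta, _ = np.linalg.qr(rng.standard_normal((n, n)))  # orthogonal factor
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                       # a point on the unit sphere
y = Theta @ x                                # the action phi(Theta, x)
print(np.isclose(np.linalg.norm(y), 1.0))    # True: y stays on the sphere
```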

More Mathematics Background

Associated with every Lie group is a Lie algebra $L$, which may be thought of as describing how $G$ looks in a small set around the identity. Abstractly, a Lie algebra is a vector space with a bilinear mapping $[\cdot,\cdot]: L \times L \to L$ such that
$[L_1, L_2] = -[L_2, L_1]$
$[L_1, [L_2, L_3]] + [L_2, [L_3, L_1]] + [L_3, [L_1, L_2]] = 0$
The Lie algebra associated with the real orthogonal group is the set of skew-symmetric matrices of the same dimension. The bilinear operation is given by $[\Omega_1, \Omega_2] = \Omega_1\Omega_2 - \Omega_2\Omega_1$.
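A small illustration of these axioms for skew-symmetric matrices with the commutator as the bracket (the dimension and random matrices are arbitrary choices):

```python
import numpy as np

# Check that so(n) is closed under the commutator and that the bracket
# satisfies the Jacobi identity.
def bracket(a, b):
    return a @ b - b @ a

rng = np.random.default_rng(1)
def skew(n):
    a = rng.standard_normal((n, n))
    return a - a.T

o1, o2, o3 = skew(4), skew(4), skew(4)
c = bracket(o1, o2)
print(np.allclose(c, -c.T))        # closure: the bracket is skew again
jacobi = (bracket(o1, bracket(o2, o3))
          + bracket(o2, bracket(o3, o1))
          + bracket(o3, bracket(o1, o2)))
print(np.allclose(jacobi, 0))      # Jacobi identity holds
```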

A Little More Mathematics Background

Let $\Theta$ be an orthogonal matrix and let $Q$ be a symmetric matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. The formula $\Theta^T Q \Theta$ defines a group action on $\mathrm{Sym}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. The set of orthogonal matrices is of dimension $n(n-1)/2$ and the space $\mathrm{Sym}(\Lambda)$ is of dimension $n(n+1)/2$. This action is basic to a lot of Matlab! The action of the group of unitary matrices on the space of skew-Hermitian matrices via $(U, H) \mapsto U^\dagger H U$ can be thought of as generalizing this action. It is an example of a group acting on its own Lie algebra. This is an adjoint action.
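The following sketch (illustrative code, not from the slides) checks the basic invariant of this action: conjugation by an orthogonal matrix preserves symmetry and the spectrum, so it moves $Q$ around within $\mathrm{Sym}(\lambda_1, \ldots, \lambda_n)$:

```python
import numpy as np

# Conjugation Theta.T @ Q @ Theta keeps Q symmetric and leaves its
# eigenvalues fixed, so the orbit of Q is a set of isospectral matrices.
rng = np.random.default_rng(2)
n = 4
q = rng.standard_normal((n, n))
Q = q + q.T                                   # a symmetric matrix
Theta, _ = np.linalg.qr(rng.standard_normal((n, n)))
H = Theta.T @ Q @ Theta
print(np.allclose(H, H.T))                    # still symmetric
print(np.allclose(np.linalg.eigvalsh(H),
                  np.linalg.eigvalsh(Q)))     # same spectrum
```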

Still More Mathematics Background

Consider Lie algebras whose elements are $n$ by $n$ matrices and Lie groups whose elements are nonsingular $n$ by $n$ matrices. The mapping $\exp: L \mapsto e^L$ sends the Lie algebra into the group of invertible matrices. The identity $P^{-1} e^L P = e^{P^{-1} L P}$ defines the adjoint action. If $\phi: G \times X \to X$ is a group action then there is an equivalence relation on $X$ defined by $x \sim y$ if $y = \phi(G_1, x)$ for some $G_1 \in G$. Sets of equivalent points are called orbits. The subset $H \subset G$ such that $\phi(H, x_0) = x_0$ forms a subgroup called the isotropy group at $x_0$.
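The identity defining the adjoint action is easy to test numerically; the sketch below uses scipy.linalg.expm with a random (generically invertible) $P$, both arbitrary choices:

```python
import numpy as np
from scipy.linalg import expm

# Numerical check of P^{-1} e^L P = e^{P^{-1} L P}.
rng = np.random.default_rng(3)
L = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))      # random matrices are generically invertible
lhs = np.linalg.inv(P) @ expm(L) @ P
rhs = expm(np.linalg.inv(P) @ L @ P)
print(np.allclose(lhs, rhs))         # True up to rounding
```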

The Last, for now, Mathematics Background

Any $L_1 \in L$ defines, via $[L_1, \cdot]: L \to L$, a linear transformation on a finite dimensional space. It is often written $\mathrm{ad}_{L_1}(\cdot)$. $\mathrm{ad}_{L_1}(\mathrm{ad}_{L_2}(\cdot)) = [L_1, [L_2, \cdot]]$ defines a linear transformation on $L$ as well. The sum of the eigenvalues of this map defines what is called the Killing form $\kappa(L_1, L_2)$ on $L$. For semisimple compact groups such as the orthogonal or special unitary group, the Killing form is negative definite and proportional to the more familiar $\mathrm{tr}(\Omega_1\Omega_2)$. The Killing form on $G$ defines a metric on the adjoint orbit called the normal metric.
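A sketch of that computation on $so(n)$: represent $\mathrm{ad}_L$ as a matrix in the standard basis $E_{ij} - E_{ji}$ ($i < j$) and take the trace of the product. The proportionality constant $n-2$ used in the final comment is the standard value for $so(n)$, quoted here as the number the check should reproduce:

```python
import numpy as np

# Killing form kappa(O1, O2) = tr(ad_O1 ad_O2) on so(n), compared with
# tr(O1 O2); the ratio should come out as the constant n - 2.
def ad_on_so(L):
    n = L.shape[0]
    idx = [(i, j) for i in range(n) for j in range(i + 1, n)]
    M = np.zeros((len(idx), len(idx)))
    for col, (i, j) in enumerate(idx):
        B = np.zeros((n, n))
        B[i, j], B[j, i] = 1.0, -1.0          # basis element E_ij - E_ji
        C = L @ B - B @ L                     # [L, B], again skew-symmetric
        M[:, col] = [C[p, q] for (p, q) in idx]
    return M

rng = np.random.default_rng(4)
n = 5
a = rng.standard_normal((n, n)); O1 = a - a.T
b = rng.standard_normal((n, n)); O2 = b - b.T
kappa = np.trace(ad_on_so(O1) @ ad_on_so(O2))
print(kappa / np.trace(O1 @ O2))              # approximately n - 2 = 3
```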

Getting a Feel for the Normal Metric

Explanation: Consider perturbing $\Theta$ via $\Theta \mapsto \Theta(I + \Omega)$. Linearizing the equation $\Theta^T Q \Theta = H$ we get
$H\Omega + \Omega^T H = [H, \Omega] = dH$
Thus
$\Omega = \mathrm{ad}_H^{-1}(dH)$
If $H$ is diagonal then
$\omega_{ij} = \frac{dh_{ij}}{\lambda_i - \lambda_j}$
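The entrywise formula can be checked directly: start from a diagonal $H$ with distinct eigenvalues, pick a tangent perturbation $dH$, invert $\mathrm{ad}_H$ entrywise, and confirm $[H, \Omega] = dH$ (illustrative code; the eigenvalues are arbitrary):

```python
import numpy as np

# For diagonal H, ([H, Omega])_ij = (lambda_i - lambda_j) * omega_ij, so
# omega_ij = dh_ij / (lambda_i - lambda_j) inverts ad_H off the diagonal.
rng = np.random.default_rng(5)
lam = np.array([4.0, 2.0, 1.0])                # distinct eigenvalues
H = np.diag(lam)
a = rng.standard_normal((3, 3))
dH = a + a.T
np.fill_diagonal(dH, 0.0)                      # tangent direction at H
gaps = lam[:, None] - lam[None, :]             # lambda_i - lambda_j
Omega = np.where(np.eye(3) > 0, 0.0,
                 dH / np.where(gaps == 0, 1.0, gaps))
print(np.allclose(Omega, -Omega.T))            # Omega is skew-symmetric
print(np.allclose(H @ Omega - Omega @ H, dH))  # [H, Omega] recovers dH
```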

Steepest Descent on an Adjoint Orbit

Let $Q = Q^T$ and $N = N^T$ be symmetric matrices and let $\Theta$ be orthogonal. Consider the function $\mathrm{tr}(\Theta^T Q \Theta N)$ thought of as a function on the orthogonal matrices. Relative to the Killing metric on the orthogonal group, the gradient descent flow for minimizing this function is
$\dot\Theta = [\Theta^T Q \Theta, N]\,\Theta$
If we let $\Theta^T Q \Theta = H$ then the derivative of $H$ can be expressed as
$\dot H = [H, [H, N]]$
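A minimal forward-Euler sketch of the double bracket equation (the step size, horizon, and the matrices $Q$ and $N$ are illustrative choices; a careful implementation would use an adaptive ODE solver):

```python
import numpy as np

# Integrate Hdot = [H, [H, N]] from H(0) = Q. H stays (approximately) on
# the adjoint orbit of Q, becomes diagonal, and its diagonal carries the
# eigenvalues of Q; with this sign the flow increases tr(HN), so the
# entries come out ordered like the diagonal of N.
def bracket(a, b):
    return a @ b - b @ a

rng = np.random.default_rng(6)
n = 4
N = np.diag(np.arange(n, 0, -1.0))     # diag(4, 3, 2, 1)
q = rng.standard_normal((n, n))
Q = q + q.T
H = Q.copy()
dt = 1e-4
for _ in range(500_000):
    H = H + dt * bracket(H, bracket(H, N))
print(np.round(H, 4))                              # essentially diagonal
print(np.round(np.linalg.eigvalsh(Q)[::-1], 4))    # spectrum of Q, descending
```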


A Descent Equation on an Adjoint Orbit

Let $Q = Q^T$ and $N = N^T$ be symmetric matrices and let $\psi(H)$ be a real valued function on $\mathrm{Sym}(\Lambda)$. What is the gradient of $\psi(H)$? The gradient on a Riemannian space is $G^{-1}\,d\psi$. On $\mathrm{Sym}(\Lambda)$ the inverse of the Riemannian metric is given by $[H, [H, \cdot]]$, and so the descent equation is
$\dot H = -[H, [H, d\psi(H)]]$
Thus for $\psi(H) = \mathrm{tr}(HN)$ we have $\dot H = -[H, [H, N]]$. If $N$ is diagonal then $\mathrm{tr}(HN)$ achieves its minimum when $H$ is diagonal and similarly ordered with $-N$.

A Descent Equation with Multiple Equilibria

If $\psi(H) = -\mathrm{tr}(\mathrm{diag}(H)\,H)$ then
$\dot H = [H, [H, 2\,\mathrm{diag}(H)]]$
Let $Q = Q^T$ and $N = N^T$ be diagonal matrices with distinct eigenvalues. The descent equation is again
$\dot H = -[H, [H, d\psi(H)]]$
so for $\psi(H) = \mathrm{tr}(HN)$ we recover $\dot H = -[H, [H, N]]$, while for $\psi(H) = -\mathrm{tr}(\mathrm{diag}(H)\,H)$ we get $\dot H = 2[H, [H, \mathrm{diag}(H)]]$. Every ordering of the eigenvalues along the diagonal gives an equilibrium, hence the multiple equilibria of the title.
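The same kind of Euler sketch shows the multiple equilibria: the flow still diagonalizes $H$, but which ordering appears depends on the initial condition (all numerical choices are illustrative):

```python
import numpy as np

# Integrate Hdot = [H, [H, 2 diag(H)]]. The limit is diagonal, with the
# eigenvalues of the starting matrix in an order set by the start point.
def bracket(a, b):
    return a @ b - b @ a

rng = np.random.default_rng(7)
q = rng.standard_normal((4, 4))
H = q + q.T
dt = 1e-4
for _ in range(500_000):
    H = H + dt * bracket(H, bracket(H, 2.0 * np.diag(np.diag(H))))
print(np.round(np.diag(H), 4))              # eigenvalues, in some order
print(np.round(np.linalg.eigvalsh(H), 4))   # compare: the sorted spectrum
```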

A Descent Equation with Smoothing Added

Consider replacing the system $\dot H = [H, [H, N]]$ with
$\dot H = [H, q(D)P], \qquad p(D)P = [H, N]$
Here $D = d/dt$. This smooths the signals but does not alter the equilibrium points. Stability is unaffected if $q/p$ is a positive real function.
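One concrete smoothed variant, under the assumed choice $p(D) = \tau D + 1$ and $q(D) = 1$ (so $q/p$ is positive real for $\tau > 0$): the filter state $P$ relaxes toward $[H, N]$ while $H$ follows $\dot H = [H, P]$:

```python
import numpy as np

# Smoothed double bracket flow with a first-order lag filter:
#   tau * Pdot = [H, N] - P      (the smoothing filter)
#   Hdot = [H, P]
# As tau -> 0 this recovers Hdot = [H, [H, N]]; the equilibria coincide.
def bracket(a, b):
    return a @ b - b @ a

rng = np.random.default_rng(8)
n = 4
N = np.diag(np.arange(n, 0, -1.0))
q = rng.standard_normal((n, n))
H = q + q.T
P = np.zeros((n, n))
dt, tau = 1e-4, 0.1
for _ in range(500_000):
    H = H + dt * bracket(H, P)
    P = P + (dt / tau) * (bracket(H, N) - P)
print(np.round(H, 3))   # again settles at a diagonal matrix
```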


The Double Bracket Flow for Analog Computation

Principal Components in $R^n$: learning without a teacher is sometimes approached by finding principal components.
$\dot W = x(t)\,x^T(t) - \text{forgetting term}$
$\Theta^T(t)\,W(t)\,\Theta(t) = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$
Columns of $\Theta$ are "components". The principal components are assembled in a hidden layer (see the sketch after the figure below).

[Figure: two-layer network; inputs $x_1, \ldots, x_n$ feed a hidden layer through weights $w$, and outputs $y_1, \ldots, y_m$ are formed with weights $v$.]
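A discrete-time sketch of the $W$ equation, with an assumed exponential forgetting term $\mu W$ (the slide leaves the forgetting term unspecified, so this choice is mine):

```python
import numpy as np

# Wdot = x x^T - mu * W. Averaged over the input, W relaxes to
# (covariance of x) / mu, so diagonalizing W as on the slide above
# recovers the principal components.
rng = np.random.default_rng(9)
n, mu, dt = 4, 1.0, 1e-2
C = np.diag([5.0, 2.0, 1.0, 0.1])          # covariance of the example input
W = np.zeros((n, n))
for _ in range(200_000):
    x = rng.multivariate_normal(np.zeros(n), C)
    W = W + dt * (np.outer(x, x) - mu * W)
print(np.round(np.linalg.eigvalsh(W), 2))  # near the eigenvalues of C / mu,
                                           # up to sampling noise
```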

Adaptive Subspace Filtering

[Figure: adaptive subspace filter schematic; a signal over time $t$ enters the filter, and the output is described by its power versus frequency.]

Let u be a vector of inputs, and let N be a diagonal "editing" matrix that selects energy levels that are desirable. An adaptive subspace filter with input u and output y can be realized by implementing the equations below.

Some Equations

$\frac{dQ}{dt} = u\,u^T - (1 - \mathrm{tr}(Q))\,Q$

$\frac{d\Theta}{dt} = [\Theta Q \Theta^T, N]\,\Theta$

$y = \Theta\,\Theta^T u$
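A discrete-time sketch of these three equations. All numeric choices (step size, input statistics, the editing matrix $N$) are mine, and the output equation is read with $\Theta$ standing for the selected rows $\Lambda\Theta$ introduced on the next slide, so that $y$ is the projection of $u$ onto the learned subspace rather than the identity map:

```python
import numpy as np

# Q tracks the input covariance, the double bracket term steers Theta so
# that Theta Q Theta^T becomes diagonal, and Lambda selects the k rows
# holding the dominant components; y is u filtered to that subspace.
def bracket(a, b):
    return a @ b - b @ a

rng = np.random.default_rng(10)
n, k = 4, 2
N = np.diag([0.0, 0.0, 1.0, 1.0])       # "editing": suppress the last two slots
Lam = np.eye(n)[:k]                     # Lambda = [I_k 0], keeps the first k rows
C = np.diag([0.12, 0.05, 0.02, 0.005])  # true input covariance, top-2 dominant
Q = 1e-3 * np.eye(n)
Theta = np.linalg.qr(rng.standard_normal((n, n)))[0]
dt = 1e-3
for _ in range(300_000):
    u = rng.multivariate_normal(np.zeros(n), C)
    Q = Q + dt * (np.outer(u, u) - (1.0 - np.trace(Q)) * Q)
    Theta = Theta + dt * bracket(Theta @ Q @ Theta.T, N) @ Theta
    y = (Lam @ Theta).T @ ((Lam @ Theta) @ u)   # output: u projected to subspace
# the learned projector should match projection onto the top-2 directions of C
print(np.round((Lam @ Theta).T @ (Lam @ Theta), 2))
```

Note the input here is scaled so that the trace term in the $Q$ equation settles at a stable equilibrium; that scaling is part of the illustration, not of the slide.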

Neural Nets as Flows on Grassmann Manifolds

Denote by $G(n,k)$ the space of $k$-planes in $n$-space. This space is a differentiable manifold that can be parameterized by the set of all $k$ by $n$ matrices of rank $k$. Adaptive subspace filters steer the weights so as to define a particular element of this space. Thus $\Lambda\Theta$ defines such a point if $\Lambda$ looks like
$\Lambda = \begin{bmatrix} 1 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 & \cdots & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & \cdots & 1 & \cdots & 0 \end{bmatrix} = [\,I_k \;\; 0\,]$
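A tiny illustration of this parameterization (dimensions are arbitrary choices):

```python
import numpy as np

# Lambda @ Theta is a k x n matrix of rank k, i.e. a concrete
# representative of a k-plane in n-space: a point of G(n, k).
n, k = 5, 2
rng = np.random.default_rng(11)
Theta = np.linalg.qr(rng.standard_normal((n, n)))[0]
Lam = np.hstack([np.eye(k), np.zeros((k, n - k))])   # Lambda = [I_k 0]
plane = Lam @ Theta                                  # first k rows of Theta
print(np.linalg.matrix_rank(plane))                  # k, as required
```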

Summary of Part 2

1. We have given some mathematical background necessary to work with flows on adjoint orbits and indicated some applications.

2. We have defined flows that will stabilize at invariant subspaces corresponding to the principal components of a vector process. These flows can be interpreted as flows that learn without a teacher.

3. We have argued that in spite of its limitations, steepest descent is usually the first choice in algorithm design.

4. We have interpreted a basic neural network algorithm as a flow in a Grassmann manifold generated by a steepest descent tracking algorithm.

A Few References

M. W. Berry et al., "Matrices, Vector Spaces, and Information Retrieval," SIAM Review, vol. 41, no. 2, 1999.

R. W. Brockett, "Dynamical Systems That Learn Subspaces," in Mathematical System Theory: The Influence of R. E. Kalman (A. C. Antoulas, ed.), Springer-Verlag, Berlin, 1991, pp. 579-592.

R. W. Brockett, "An Estimation Theoretic Basis for the Design of Sorting and Classification Networks," in Neural Networks (R. Mammone and Y. Zeevi, eds.), Academic Press, 1991, pp. 23-41.