the orbits of stars - university of california, berkeley

3The Orbits of Stars

In this chapter we examine the orbits of individual stars in gravitationalfields such as those found in stellar systems. Thus we ask the questions,“What kinds of orbits are possible in a spherically symmetric, or an axiallysymmetric potential? How are these orbits modified if we distort the poten-tial into a bar-like form?” We shall obtain analytic results for the simplerpotentials, and use these results to develop an intuitive understanding of howstars move in more general potentials.

In §§3.1 to 3.3 we examine orbits of growing complexity in force fieldsof decreasing symmetry. The less symmetrical a potential is the less likelyit is that we can obtain analytic results, so in §3.4 we review techniques forintegrating orbits in both a given gravitational field, and the gravitationalfield of a system of orbiting masses. Even numerically integrated orbits ingravitational fields of low symmetry often display a high degree of regularityin their phase-space structures. In §3.5 we study this structure using ana-lytic models, and develop analytic tools of considerable power, including theidea of adiabatic invariance, which we apply to some astronomical problemsin §3.6. In §3.7 we develop Hamiltonian perturbation theory, and use it tostudy the phenomenon of orbital resonance and the role it plays in generat-ing orbital chaos. In §3.8 we draw on techniques developed throughout thechapter to understand how elliptical galaxies are affected by the existence ofcentral stellar cusps and massive black holes at their centers.

All of the work in this chapter is based on a fundamental approximation:

3.1 Orbits in spherical potentials 143

although galaxies are composed of stars, we shall neglect the forces fromindividual stars and consider only the large-scale forces from the overallmass distribution, which is made up of thousands of millions of stars. Inother words, we assume that the gravitational fields of galaxies are smooth,neglecting small-scale irregularities due to individual stars or larger objectslike globular clusters or molecular clouds. As we saw in §1.2, the gravitationalfields of galaxies are sufficiently smooth that these irregularities can affectthe orbits of stars only after many crossing times.

Since we are dealing only with gravitational forces, the trajectory ofa star in a given field does not depend on its mass. Hence, we examinethe dynamics of a particle of unit mass, and quantities such as momentum,angular momentum, and energy, and functions such as the Lagrangian andHamiltonian, are normally written per unit mass.

3.1 Orbits in static spherical potentials

We first consider orbits in a static, spherically symmetric gravitational field.Such fields are appropriate for globular clusters, which are usually nearlyspherical, but, more important, the results we obtain provide an indispens-able guide to the behavior of orbits in more general fields.

The motion of a star in a centrally directed gravitational field is greatlysimplified by the familiar law of conservation of angular momentum (seeAppendix D.1). Thus if

r = rer (3.1)

denotes the position vector of the star with respect to the center, and theradial acceleration is

g = g(r)er, (3.2)

the equation of motion of the star is

d2r

dt2= g(r)er . (3.3)

If we remember that the cross product of any vector with itself is zero, wehave

d

dt

(r×

dr

dt

)=

dr

dt×

dr

dt+ r ×

d2r

dt2= g(r)r × er = 0. (3.4)

Equation (3.4) says that r × r is some constant vector, say L:

r ×dr

dt= L. (3.5)

Of course, L is simply the angular momentum per unit mass, a vectorperpendicular to the plane defined by the star’s instantaneous position and

144 Chapter 3: The Orbits of Stars

velocity vectors. Since this vector is constant, we conclude that the starmoves in a plane, the orbital plane. This finding greatly simplifies thedetermination of the star’s orbit, for now that we have established that thestar moves in a plane, we may simply use plane polar coordinates (r,ψ) inwhich the center of attraction is at r = 0 and ψ is the azimuthal angle in theorbital plane. In terms of these coordinates, the Lagrangian per unit mass(Appendix D.3) is

L = 12

[r2 + (rψ)2

]− Φ(r), (3.6)

where Φ is the gravitational potential and g(r) = −dΦ/dr. The equationsof motion are

0 =d

dt

∂L∂r

−∂L∂r

= r − rψ2 +dΦ

dr, (3.7a)

0 =d

dt

∂L∂ψ

−∂L∂ψ

=d

dt

(r2ψ

). (3.7b)

The second of these equations implies that

r2ψ = constant ≡ L. (3.8)

It is not hard to show that L is actually the length of the vector r × r,and hence that (3.8) is just a restatement of the conservation of angularmomentum. Geometrically, L is equal to twice the rate at which the radiusvector sweeps out area.

To proceed further we use equation (3.8) to replace time t by angle ψas the independent variable in equation (3.7a). Since (3.8) implies

d

dt=

L

r2

d

dψ, (3.9)

equation (3.7a) becomes

L2

r2

d

dψ

(1

r2

dr

dψ

)−

L2

r3= −

dΦ

dr. (3.10)

This equation can be simplified by the substitution

u ≡1

r, (3.11a)

which puts (3.10) into the form

d2u

dψ2+ u =

1

L2u2

dΦ

dr

(1/u

). (3.11b)


The solutions of this equation are of two types: along unbound orbits r →∞ and hence u → 0, while on bound orbits r and u oscillate between finitelimits. Thus each bound orbit is associated with a periodic solution of thisequation. We give several analytic examples later in this section, but ingeneral the solutions of equation (3.11b) must be obtained numerically.

Some additional insight is gained by deriving a “radial energy” equationfrom equation (3.11b) in much the same way as we derive the conservation ofkinetic plus potential energy in Appendix D; we multiply (3.11b) by du/dψand integrate over ψ to obtain

(du

dψ

)2

+2Φ

L2+ u2 = constant ≡

2E

L2, (3.12)

where we have used the relation dΦ/dr = −u2(dΦ/du).This result can also be derived using Hamiltonians (Appendix D.4).

From (3.6) we have that the momenta are pr = ∂L/∂r = r and pψ =∂L/∂ψ = r2ψ, so with equation (D.50) we find that the Hamiltonian perunit mass is

H(r, pr, pψ) = pr r + pψψ − L

= 12

(p2

r +p2

ψ

r2

)+ Φ(r)

= 12

(dr

dt

)2

+ 12

(rdψ

dt

)2

+ Φ(r).

(3.13)

When we multiply (3.12) by L2/2 and exploit (3.9), we find that the constantE in equation (3.12) is simply the numerical value of the Hamiltonian, whichwe refer to as the energy of that orbit.

For bound orbits the equation du/dψ = 0 or, from equation (3.12)

u2 +2[Φ(1/u) − E]

L2= 0 (3.14)

will normally have two roots u1 and u2 between which the star oscillatesradially as it revolves in ψ (see Problem 3.7). Thus the orbit is confinedbetween an inner radius r1 = u−1

1 , known as the pericenter distance, andan outer radius r2 = u−1

2 , called the apocenter distance. The pericenterand apocenter are equal for a circular orbit. When the apocenter is nearlyequal to the pericenter, we say that the orbit has small eccentricity, whileif the apocenter is much larger than the pericenter, the eccentricity is said tobe near unity. The term “eccentricity” also has a mathematical definition,but only for Kepler orbits—see equation (3.25a).


Figure 3.1 A typical orbit in aspherical potential (the isochrone,eq. 2.47) forms a rosette.

The radial period Tr is the time required for the star to travel fromapocenter to pericenter and back. To determine Tr we use equation (3.8) toeliminate ψ from equation (3.13). We find

(dr

dt

)2

= 2(E − Φ) −L2

r2, (3.15)

which may be rewritten

dr

dt= ±

√2[E − Φ(r)] −

L2

r2. (3.16)

The two possible signs arise because the star moves alternately in and out.Comparing (3.16) with (3.14) we see that r = 0 at the pericenter and apocen-ter distances r1 and r2, as of course it must. From equation (3.16) it followsthat the radial period is

Tr = 2

∫ r2

r1

dr√2[E − Φ(r)] − L2/r2

. (3.17)

In traveling from pericenter to apocenter and back, the azimuthal angleψ increases by an amount

∆ψ = 2

∫ r2

r1

dψ

drdr = 2

∫ r2

r1

L

r2

dt

drdr. (3.18a)

Substituting for dt/dr from (3.16) this becomes

∆ψ = 2L

∫ r2

r1

dr

r2√

2[E − Φ(r)] − L2/r2. (3.18b)


The azimuthal period is

Tψ =2π

|∆ψ|Tr; (3.19)

in other words, the mean angular speed of the particle is 2π/Tψ. In general∆ψ/2π will not be a rational number. Hence the orbit will not be closed: atypical orbit resembles a rosette and eventually passes close to every pointin the annulus between the circles of radii r1 and r2 (see Figure 3.1 andProblem 3.13). There are, however, two and only two potentials in which allbound orbits are closed.

(a) Spherical harmonic oscillator We call a potential of the form

Φ(r) = 12Ω2r2 + constant (3.20)

a spherical harmonic oscillator potential. As we saw in §2.2.2b, this potentialis generated by a homogeneous sphere of matter. Equation (3.11b) could besolved analytically in this case, but it is simpler to use Cartesian coordinates(x, y) defined by x = r cosψ, y = r sinψ. In these coordinates, the equationsof motion are simply

x = −Ω2x ; y = −Ω2y, (3.21a)

with solutions

x = X cos(Ωt + εx) ; y = Y cos(Ωt + εy), (3.21b)

where X , Y , εx, and εy are arbitrary constants. Every orbit is closed since theperiods of the oscillations in x and y are identical. The orbits form ellipsescentered on the center of attraction. The azimuthal period is Tψ = 2π/Ωbecause this is the time required for the star to return to its original azimuth.During this time, the particle completes two in-and-out cycles, so the radialperiod is

Tr = 12Tψ =

π

Ω. (3.22)

(b) Kepler potential When the star is acted on by an inverse-squarefield g(r) = −GM/r2 due to a point mass M , the corresponding potentialis Φ = −GM/r = −GMu. Motion in this potential is often called Keplermotion. Equation (3.11b) becomes

d2u

dψ2+ u =

GM

L2, (3.23)

the general solution of which is

u(ψ) = C cos(ψ − ψ0) +GM

L2, (3.24)


where C > 0 and ψ0 are arbitrary constants. Defining the orbit’s eccentri-city by

e ≡CL2

GM(3.25a)

and its semi-major axis by

a ≡L2

GM(1 − e2), (3.25b)

equation (3.24) may be rewritten

r(ψ) =a(1 − e2)

1 + e cos(ψ − ψ0). (3.26)

An orbit for which e ≥ 1 is unbound, since r → ∞ as (ψ − ψ0) →± cos−1(−1/e). We discuss unbound orbits in §3.1d below. Bound orbitshave e < 1 and along them r is a periodic function of ψ with period 2π, sothe star returns to its original radial coordinate after exactly one revolutionin ψ. Thus bound Kepler orbits are closed, and one may show that theyform ellipses with the attracting center at one focus. The pericenter andapocenter distances are

r1 = a(1 − e) and r2 = a(1 + e). (3.27)

In many applications, equation (3.26) for r along a bound Kepler orbitis less convenient than the parameterization

r = a(1 − e cos η), (3.28a)

where the parameter η is called the eccentric anomaly to distinguish itfrom the true anomaly, ψ − ψ0. By equating the right sides of equations(3.26) and (3.28a) and using the identity cos θ = (1− tan2 1

2θ)/(1+tan2 12θ),

it is straightforward to show that the true and eccentric anomalies are relatedby √

1 − e tan 12 (ψ − ψ0) =

√1 + e tan 1

2η. (3.29)

Equation (3.326) gives alternative relations between the two anomalies.Taking t = 0 to occur at pericenter passage, from L = r2ψ we have

t =

∫ ψ

ψ0

dψ

ψ=

∫dψ

r2

L=

a2

L

∫ η

0dη

dψ

dη(1 − e cosη)2. (3.30)

Evaluating dψ/dη from (3.29), integrating, and using trigonometrical iden-tities to simplify the result, we obtain finally

t =a2

L

√1 − e2 (η − e sin η) =

Tr

2π(η − e sin η), (3.28b)


where the second equality follows because the bracket on the right increasesby 2π over an orbital period. This is called Kepler’s equation, and thequantity 2πt/Tr is sometimes called the mean anomaly. Hence

Tr = Tψ =a2

L

√1 − e2 = 2π

√a3

GM, (3.31)

where the second equality uses (3.25b).From (3.12) the energy per unit mass of a particle on a Kepler orbit is

E = −GM

2a. (3.32)

To unbind the particle, we must add the binding energy −E.The study of motion in nearly Kepler potentials is central to the dy-

namics of planetary systems (Murray & Dermott 1999).We have shown that a star on a Kepler orbit completes a radial oscil-

lation in the time required for ψ to increase by ∆ψ = 2π, whereas a starthat orbits in a harmonic-oscillator potential has already completed a radialoscillation by the time ψ has increased by ∆ψ = π. Since galaxies are moreextended than point masses, and less extended than homogeneous spheres,a typical star in a spherical galaxy completes a radial oscillation after its an-gular coordinate has increased by an amount that lies somewhere in betweenthese two extremes; π < ∆ψ < 2π (cf. Problem 3.17). Thus, we expect a starto oscillate from its apocenter through its pericenter and back in a shortertime than is required for one complete azimuthal cycle about the galacticcenter.

It is sometimes useful to consider that an orbit in a non-Kepler forcefield forms an approximate ellipse, though one that precesses by ψp =∆ψ−2π in the time needed for one radial oscillation. For the orbit shown inFigure 3.1, and most galactic orbits, this precession is in the sense oppositeto the rotation of the star itself. The angular velocity Ωp of the rotatingframe in which the ellipse appears closed is

Ωp =ψp

Tr=

∆ψ − 2π

Tr. (3.33)

Hence we say that Ωp is the precession rate of the ellipse. The concept ofclosed orbits in a rotating frame of reference is crucial to the theory of spiralstructure—see §6.2.1, particularly Figure 6.12.

(c) Isochrone potential The harmonic oscillator and Kepler potentialsare both generated by mass distributions that are qualitatively different fromthe mass distributions of galaxies. The only known potential that could begenerated by a realistic stellar system for which all orbits are analytic is theisochrone potential of equation (2.47) (Henon 1959).


Box 3.1: Timing the local group

The nearest giant spiral galaxy is the Sb galaxy M31, at a distance ofabout (740 ± 40) kpc (BM §7.4.1). Our galaxy and M31 are by far thetwo largest members of the Local Group of galaxies. Beyond these, thenext nearest prominent galaxies are in the Sculptor and M81 groups, ata distance of 3 Mpc. Thus the Local Group is an isolated system.

The line-of-sight velocity of the center of M31 relative to the centerof the Galaxy is −125 km s−1 (for a solar circular speed v0 = 220 km s−1,eq. 1.8); it is negative because the two galaxies are approaching one an-other. It seems that gravity has halted and reversed the original motionof M31 away from the Galaxy. Since M31 and the Galaxy are by far themost luminous members of the Local Group, we can treat them as anisolated system of two point masses, and estimate their total mass (Kahn& Woltjer 1959; Wilkinson & Evans 1999). Moreover, the original Hub-ble recession corresponded to an orbit of zero angular momentum, so weexpect the angular momentum of the current orbit to be negligible. Thuswe assume that the eccentricity e = 1.

We may now apply equations (3.28) for a Kepler orbit. Taking thelog of both equations, differentiating with respect to η, and taking theratio, we obtain

d ln r

d ln t=

t

r

dr

dt=

e sin η(η − e sin η)

(1 − e cos η)2. (1)

We set e = 1, and require that r = 740 kpc, dr/dt = −125 km s−1, andt = 13.7 Gyr, the current age of the universe (eq. 1.77). Inserting theseconstraints in (1) gives a nonlinear equation for η, which is easily solvednumerically to yield η = 4.29. Then equations (3.28) yield a = 524 kpcand Tr = 16.6 Gyr, and equation (3.31) finally yields M = 4.6×1012 M"for the total mass of M31 and the Galaxy. The uncertainty in this result,assuming that our model is correct, is probably about a factor of 1.5.

This calculation assumes that the vacuum-energy density ρΛ is zero.Inclusion of non-zero ρΛ is simple (Problem 3.5); with parameters fromequations (1.52) and (1.73), the required mass M increases by 15%.

The luminosity of the Galaxy in the R band is 3×1010 L" (Table 1.2)and M31 is about 1.5 times as luminous (BM Table 4.3); thus, if ourmass estimate is correct, the mass-to-light ratio for the Local Group isΥV ( 60Υ". This is far larger than expected for any normal stellarpopulation, and the total mass is far larger than the masses within theouter edges of the disks of these galaxies, as measured by circular-speedcurves. Thus the Kahn–Woltjer timing argument provided the first directevidence that most of the mass of the Local Group is composed of darkmatter. For a review see Peebles (1996).


Box 3.2: The eccentricity vector for Kepler orbits

The orbit of a test particle in the Kepler potential can also be found usingvector methods. Since the angular momentum per unit mass L = r × vis constant in any central field g(r), with the equation of motion (3.3)and the vector identity (B.9) we have

d

dt(v × L) =

dv

dt× L = g(r)er × (r × v) = g(r) [(er · v)r − rv] . (1)

The time derivative of the unit radial vector is

der

dt=

d

dt

(r

r

)=

v

r−

r · vr3

r =1

r2[rv − (er · v)r] . (2)

Comparing equations (1) and (2) we have

d

dt(v × L) = −g(r)r2 der

dt. (3)

If and only if the field is Kepler, g(r) = −GM/r2, this equation can beintegrated to yield

v × L = GM(er + e), (4)

where e is a vector constant, or integral of motion (see §3.1.1). Takingthe dot product of L with equation (4), we find that e · L = 0, so e liesin the orbital plane. Taking the dot product of r with equation (4) andusing the vector identity (B.8), we have

L2 = GM(r + e · r). (5)

If we now define ψ to be an azimuthal angle in the orbital plane, with eat azimuth ψ0, then e · r = er cos(ψ − ψ0), where e = |e|, and equation(5) can be rewritten

r =L2

GM

1

1 + e cos(ψ − ψ0), (6)

which is the same as equations (3.25b) and (3.26) for a Kepler orbit if weidentify e with the eccentricity. It is therefore natural to call the vectorconstant e the eccentricity vector, also sometimes called the Laplaceor Runge–Lenz vector. The eccentricity vector has length equal to theeccentricity and points from the central mass towards the pericenter. Thedirection of the eccentricity vector is called the line of apsides.

Orbits in other central fields have integrals of motion analogous tothe scalar eccentricity, but they do not have vector integrals analogous tothe eccentricity vector, because orbits in non-Kepler potentials are notclosed.


It is convenient to define an auxiliary variable s by

s ≡ −GM

bΦ= 1 +

√1 +

r2

b2. (3.34)

Solving this equation for r, we find that

r2

b2= s2

(1 −

2

s

)(s ≥ 2). (3.35)

Given this one-to-one relationship between s and r, we may employ s as aradial coordinate in place of r. The integrals (3.17) and (3.18b) for Tr and∆ψ both involve the infinitesimal quantity

dI ≡dr√

2(E − Φ) − L2/r2. (3.36)

When we use equation (3.35) to eliminate r from this expression, we find

dI =b(s − 1)ds√

2Es2 − 2(2E − GM/b)s − 4GM/b − L2/b2. (3.37)

As the star moves from pericenter r1 to apocenter r2, s varies from thesmaller root s1 of the quadratic expression in the denominator of equation(3.37) to the larger root s2. Thus, combining equations (3.17) and (3.37),the radial period is

Tr =2b√−2E

∫ s2

s1

ds(s − 1)√

(s2 − s)(s − s1)=

2πb√−2E

[12 (s1 + s2) − 1

], (3.38)

where we have assumed E < 0 since we are dealing with bound orbits. Butfrom the denominator of equation (3.37) it follows that the roots s1 and s2

obey

s1 + s2 = 2 −GM

Eb, (3.39a)

and so the radial period

Tr =2πGM

(−2E)3/2, (3.39b)

exactly as in the Kepler case (the limit of the isochrone as b → 0). Note thatTr depends on the energy E but not on the angular momentum L—it is thisunique property that gives the isochrone its name.

Equation (3.18b), for the increment ∆ψ in azimuthal angle per cycle inthe radial direction, yields

∆ψ = 2L

∫ s2

s1

dI

r2=

2L

b√−2E

∫ s2

s1

ds(s − 1)

s(s − 2)√

(s2 − s)(s − s1)

= π sgn(L)

(1 +

|L|√L2 + 4GMb

),

(3.40)


where sgn(L) = ±1 depending on the sign of L. From this expression we seethat

π < |∆ψ| < 2π. (3.41)

The only orbits for which |∆ψ| approaches the value 2π characteristic ofKepler motion are those with L2 ) 4GMb. Such orbits never approach thecore r ∼< b of the potential, and hence always move in a near-Kepler field.In the opposite limit, L2 + 4GMb, |∆ψ| → π; physically this implies thatlow angular-momentum orbits fly straight through the core of the potential.In fact, the behavior |∆ψ| → π as L → 0 is characteristic of any sphericalpotential that is not strongly singular at r = 0—see Problem 3.19.

Inserting equations (3.39b) and (3.40) into equation (3.19), we have thatthe azimuthal period of an isochrone orbit is

Tψ =4πGM

(−2E)3/2

√L2 + 4GMb

|L| +√

L2 + 4GMb. (3.42)

(d) Hyperbolic encounters In Chapter 7 we shall find that the dynami-cal evolution of globular clusters is largely driven by gravitational encountersbetween stars. These encounters are described by unbound Kepler orbits.

Let (xM ,vM ) and (xm,vm) be the positions and velocities of two pointmasses M and m, respectively; let r = xM − xm and V = r. Then theseparation vector r obeys equation (D.33),

(mM

M + m

)r = −

GMm

r2er or µr = −

G(M + m)µ

r2er. (3.43)

This is the equation of motion of a fictitious particle, called the reduced par-ticle, which has mass µ = Mm/(M + m) and travels in the Kepler potentialof a fixed body of mass M + m (see Appendix D.1). If ∆vm and ∆vM arethe changes in the velocities of m and M during the encounter, we have

∆vM − ∆vm = ∆V. (3.44a)

Furthermore, since the velocity of the center of mass of the two bodies isunaffected by the encounter (eq. D.19), we also have

M∆vM + m∆vm = 0. (3.44b)

Eliminating ∆vm between equations (3.44) we obtain ∆vM as

∆vM =m

M + m∆V. (3.45)


Figure 3.2 The motion of the re-duced particle during a hyperbolicencounter.

We now evaluate ∆V.Let the component of the initial separation vector that is perpendicular

to the initial velocity vector V0 = V(t = −∞) have length b (see Figure 3.2),the impact parameter of the encounter. Then the conserved angular mo-mentum per unit mass associated with the motion of the reduced particleis

L = bV0. (3.46)

Equation (3.24), which relates the radius and azimuthal angle of a particlein a Kepler orbit, reads in this case,

1

r= C cos(ψ − ψ0) +

G(M + m)

b2V 20

, (3.47)

where the angle ψ is shown in Figure 3.2. The constants C and ψ0 aredetermined by the initial conditions. Differentiating equation (3.47) withrespect to time, we obtain

dr

dt= Cr2ψ sin(ψ − ψ0)

= CbV0 sin(ψ − ψ0),(3.48)

where the second line follows because r2ψ = L. If we define the directionψ = 0 to point towards the particle as t → −∞, we find on evaluatingequation (3.48) at t = −∞,

−V0 = CbV0 sin(−ψ0). (3.49a)

On the other hand, evaluating equation (3.47) at this time we have

0 = C cosψ0 +G(M + m)

b2V 20

. (3.49b)

Eliminating C between these equations, we obtain

tanψ0 = −bV 2

0

G(M + m). (3.50)


But from either (3.47) or (3.48) we see that the point of closest approachis reached when ψ = ψ0. Since the orbit is symmetrical about this point,the angle through which the reduced particle’s velocity is deflected is θdefl =2ψ0−π (see Figure 3.2). It proves useful to define the 90 deflection radiusas the impact parameter at which θdefl = 90:

b90 ≡G(M + m)

V 20

. (3.51)

Thus

θdefl = 2 tan−1

(G(M + m)

bV 20

)= 2 tan−1(b90/b). (3.52)

By conservation of energy, the relative speed after the encounter equals theinitial speed V0. Hence the components ∆V‖ and ∆V⊥ of ∆V parallel andperpendicular to the original relative velocity vector V0 are given by

|∆V⊥| = V0 sin θdefl = V0| sin 2ψ0| =2V0| tanψ0|1 + tan2 ψ0

=2V0(b/b90)

1 + b2/b290

, (3.53a)

|∆V‖| = V0(1 − cos θdefl) = V0(1 + cos 2ψ0) =2V0

1 + tan2 ψ0

=2V0

1 + b2/b290

. (3.53b)

∆V‖ always points in the direction opposite to V0. By equation (3.45) weobtain the components of ∆vM as

|∆vM⊥| =2mV0

M + m

b/b90

1 + b2/b290

, (3.54a)

|∆vM‖| =2mV0

M + m

1

1 + b2/b290

. (3.54b)

∆vM‖ always points in the direction opposite to V0. Notice that in the limitof large impact parameter b, |∆vM⊥| = 2Gm/(bV0), which agrees with thedetermination of the same quantity in equation (1.30).

3.1.1 Constants and integrals of the motion

Any stellar orbit traces a path in the six-dimensional space for which the co-ordinates are the position and velocity x,v. This space is called phasespace.1 A constant of motion in a given force field is any function

1 In statistical mechanics phase space usually refers to position-momentum spacerather than position-velocity space. Since all bodies have the same acceleration in a givengravitational field, mass is irrelevant, and position-velocity space is more convenient.


C(x,v; t) of the phase-space coordinates and time that is constant alongstellar orbits; that is, if the position and velocity along an orbit are given byx(t) and v(t) = dx/dt,

C[x(t1),v(t1); t1] = C[x(t2),v(t2); t2] (3.55)

for any t1 and t2.An integral of motion I(x,v) is any function of the phase-space co-

ordinates alone that is constant along an orbit:

I[x(t1),v(t1)] = I[x(t2),v(t2)]. (3.56)

While every integral is a constant of the motion, the converse is nottrue. For example, on a circular orbit in a spherical potential the azimuthalcoordinate ψ satisfies ψ = Ωt + ψ0, where Ω is the star’s constant angularspeed and ψ0 is its azimuth at t = 0. Hence C(ψ, t) ≡ t− ψ/Ω is a constantof the motion, but it is not an integral because it depends on time as well asthe phase-space coordinates.

Any orbit in any force field always has six independent constants of mo-tion. Indeed, since the initial phase-space coordinates (x0,v0) ≡ [x(0),v(0)]can always be determined from [x(t),v(t)] by integrating the equations ofmotion backward, (x0,v0) can be regarded as six constants of motion.

By contrast, orbits can have from zero to five integrals of motion. Incertain important cases, a few of these integrals can be written down easily:in any static potential Φ(x), the Hamiltonian H(x,v) = 1

2v2 + Φ is anintegral of motion. If a potential Φ(R, z, t) is axisymmetric about the zaxis, the z-component of the angular momentum is an integral, and in aspherical potential Φ(r, t) the three components of the angular-momentumvector L = x × v constitute three integrals of motion. However, we shallfind in §3.2 that even when integrals exist, analytic expressions for them areoften not available.

These concepts and their significance for the geometry of orbits in phasespace are nicely illustrated by the example of motion in a spherically sym-metric potential. In this case the Hamiltonian H and the three componentsof the angular momentum per unit mass L = x×v constitute four integrals.However, we shall find it more convenient to use |L| and the two independentcomponents of the unit vector n = L/|L| as integrals in place of L. We haveseen that n defines the orbital plane within which the position vector r andthe velocity vector v must lie. Hence we conclude that the two independentcomponents of n restrict the star’s phase point to a four-dimensional regionof phase space. Furthermore, the equations H(x,v) = E and |L(x,v)| = L,where L is a constant, restrict the phase point to that two-dimensional sur-face in this four-dimensional region on which vr = ±

√2[E − Φ(r)] − L2/r2

and vψ = L/r. In §3.5.1 we shall see that this surface is a torus and that thesign ambiguity in vr is analogous to the sign ambiguity in the z-coordinate


of a point on the sphere r2 = 1 when one specifies the point through its xand y coordinates. Thus, given E, L, and n, the star’s position and velocity(up to its sign) can be specified by two quantities, for example r and ψ.

Is there a fifth integral of motion in a spherical potential? To study thisquestion, we examine motion in the potential

Φ(r) = −GM

(1

r+

a

r2

). (3.57)

For this potential, equation (3.11b) becomes

d2u

dψ2+

(1 −

2GMa

L2

)u =

GM

L2, (3.58)

the general solution of which is

u = C cos

(ψ − ψ0

K

)+

GMK2

L2, (3.59a)

where

K ≡(

1 −2GMa

L2

)−1/2

. (3.59b)

Hence

ψ0 = ψ − K Arccos

[1

C

(1

r−

GMK2

L2

)], (3.60)

where t = Arccosx is the multiple-valued solution of x = cos t, and C canbe expressed in terms of E and L by

E = 12

C2L2

K2− 1

2

(GMK

L

)2

. (3.61)

If in equations (3.59b), (3.60) and (3.61) we replace E by H(x,v) and L by|L(x,v)| = |x × v|, the quantity ψ0 becomes a function of the phase-spacecoordinates which is constant as the particle moves along its orbit. Hence ψ0

is a fifth integral of motion. (Since the function Arccosx is multiple-valued,a judicious choice of solution is necessary to avoid discontinuous jumps inψ0.) Now suppose that we know the numerical values of E, L, ψ0, and theradial coordinate r. Since we have four numbers—three integrals and onecoordinate—it is natural to ask how we might use these numbers to determinethe azimuthal coordinate ψ. We rewrite equation (3.60) in the form

ψ = ψ0 ± K cos−1

[1

C

(1

r−

GMK2

L2

)]+ 2nKπ, (3.62)


where cos−1(x) is defined to be the value of Arccos (x) that lies between 0and π, and n is an arbitrary integer. If K is irrational—as nearly all realnumbers are—then by a suitable choice of the integer n, we can make ψmodulo 2π approximate any given number as closely as we please. Thusfor any values of E and L, and any value of r between the pericenter andapocenter for the given E and L, an orbit that is known to have a givenvalue of the integral ψ0 can have an azimuthal angle as close as we please toany number between 0 and 2π.

On the other hand, if K is rational these problems do not arise. Thesimplest and most important case is that of the Kepler potential, when a = 0and K = 1. Equation (3.62) now becomes

ψ = ψ0 ± cos−1

[1

C

(1

r−

GM

L2

)]+ 2nπ, (3.63)

which yields only two values of ψ modulo 2π for given E, L and r.These arguments can be restated geometrically. The phase space has

six dimensions. The equation H(x,v) = E confines the orbit to a five-dimensional subspace. The vector equation L(x,v) = constant adds threefurther constraints, thereby restricting the orbit to a two-dimensional surface.Through the equation ψ0(x,v) = constant the fifth integral confines the orbitto a one-dimensional curve on this surface. Figure 3.1 can be regarded asa projection of this curve. In the Kepler case K = 1, the curve closes onitself, and hence does not cover the two-dimensional surface H = constant ,L = constant . But when K is irrational, the curve is endless and denselycovers the surface of constant H and L.

We can make an even stronger statement. Consider any volume of phasespace, of any shape or size. Then if K is irrational, the fraction of the timethat an orbit with given values of H and L spends in that volume does notdepend on the value that ψ0 takes on this orbit.

Integrals like ψ0 for irrational K that do not affect the phase-space dis-tribution of an orbit, are called non-isolating integrals. All other integralsare called isolating integrals. The examples of isolating integrals that wehave encountered so far, namely, H , L, and the function ψ0 when K = 1,all confine stars to a five-dimensional region in phase space. However, therecan also be isolating integrals that restrict the orbit to a six-dimensionalsubspace of phase space—see §3.7.3. Isolating integrals are of great practicaland theoretical importance, whereas non-isolating integrals are of essentiallyno value for galactic dynamics.

3.2 Orbits in axisymmetric potentials 159

3.2 Orbits in axisymmetric potentials

Few galaxies are even approximately spherical, but many approximate figuresof revolution. Thus in this section we begin to explore the types of orbitsthat are possible in many real galaxies. As in Chapter 2, we shall usuallyemploy a cylindrical coordinate system (R,φ, z) with origin at the galacticcenter, and shall align the z axis with the galaxy’s symmetry axis.

Stars whose motions are confined to the equatorial plane of an axisym-metric galaxy have no way of perceiving that the potential in which theymove is not spherically symmetric. Therefore their orbits will be identicalwith those we discussed in the last section; the radial coordinate R of a staron such an orbit oscillates between fixed extrema as the star revolves aroundthe center, and the orbit again forms a rosette figure.

3.2.1 Motion in the meridional plane

The situation is much more complex and interesting for stars whose motionscarry them out of the equatorial plane of the system. The study of suchgeneral orbits in axisymmetric galaxies can be reduced to a two-dimensionalproblem by exploiting the conservation of the z-component of angular mo-mentum of any star. Let the potential, which we assume to be symmetricabout the plane z = 0, be Φ(R, z). Then the motion is governed by theLagrangian

L = 12

[R2 +

(Rφ)2

+ z2]− Φ(R, z). (3.64)

The momenta are

pR = R ; pφ = R2φ ; pz = z, (3.65)

so the Hamiltonian is

H = 12

(p2

R +p2

φ

R2+ p2

z

)+ Φ(R, z). (3.66)

From Hamilton’s equations (D.54) we find that the equations of motion are

pR = R =p2

φ

R3−

∂Φ

∂R, (3.67a)

pφ =d

dt

(R2φ

)= 0, (3.67b)

pz = z = −∂Φ

∂z. (3.67c)

Equation (3.67b) expresses conservation of the component of angular mo-mentum about the z axis, pφ = Lz (a constant), while equations (3.67a) and(3.67c) describe the coupled oscillations of the star in the R and z-directions.


Figure 3.3 Level contours of the effective potential of equation (3.70) when v0 = 1,Lz = 0.2. Contours are shown for Φeff = −1, −0.5, 0, 0.5, 1, 1.5, 2, 3, 5. The axis ratiois q = 0.9 in the left panel and q = 0.5 in the right.

After replacing pφ in (3.67a) by its numerical value Lz, the first and last ofequations (3.67) can be written

R = −∂Φeff

∂R; z = −

∂Φeff

∂z, (3.68a)

where

Φeff ≡ Φ(R, z) +L2

z

2R2(3.68b)

is called the effective potential. Thus the three-dimensional motion ofa star in an axisymmetric potential Φ(R, z) can be reduced to the two-dimensional motion of the star in the (R, z) plane (the meridional plane)under the Hamiltonian

Heff = 12 (p2

R + p2z) + Φeff(R, z). (3.69)

Notice that Heff differs from the full Hamiltonian (3.66) only in the substi-tution of the constant Lz for the azimuthal momentum pφ. Consequently,the numerical value of Heff is simply the orbit’s total energy E. The dif-ference E − Φeff is the kinetic energy of motion in the (R, z) plane, equalto 1

2 (p2R + p2

z). Since kinetic energy is non-negative, the orbit is restrictedto the area in the meridional plane satisfying the inequality E ≥ Φeff . Thecurve bounding this area is called the zero-velocity curve, since the orbitcan only reach this curve if its velocity in the (R, z) plane is instantaneouslyzero.

Figure 3.3 shows contour plots of the effective potential

Φeff = 12v2

0 ln

(R2 +

z2

q2

)+

L2z

2R2, (3.70)


Figure 3.4 Two orbits in the potential of equation (3.70) with q = 0.9. Both orbits areat energy E = −0.8 and angular momentum Lz = 0.2, and we assume v0 = 1.

for v0 = 1, Lz = 0.2 and axial ratios q = 0.9 and 0.5. This resembles theeffective potential experienced by a star in an oblate spheroidal galaxy thathas a constant circular speed v0 (§2.3.2). Notice that Φeff rises very steeplynear the z axis, as if the axis of symmetry were protected by a centrifugalbarrier.

The minimum in Φeff has a simple physical significance. The minimumoccurs where

0 =∂Φeff

∂R=

∂Φ

∂R−

L2z

R3; 0 =

∂Φeff

∂z. (3.71)

The second of these conditions is satisfied anywhere in the equatorial planez = 0 on account of the assumed symmetry of Φ about this place, and thefirst is satisfied at the guiding-center radius Rg where

(∂Φ

∂R

)

(Rg,0)

=L2

z

R3g

= Rgφ2. (3.72)

This is simply the condition for a circular orbit with angular speed φ. Thusthe minimum of Φeff occurs at the radius at which a circular orbit has angularmomentum Lz, and the value of Φeff at the minimum is the energy of thiscircular orbit.

Unless the gravitational potential Φ is of some special form, equations(3.68a) cannot be solved analytically. However, we may follow the evolutionof R(t) and z(t) by integrating the equations of motion numerically, startingfrom a variety of initial conditions. Figure 3.4 shows the result of two suchintegrations for the potential (3.69) with q = 0.9 (see Richstone 1982). Theorbits shown are of stars of the same energy and angular momentum, yet theylook quite different in real space, and hence the stars on these orbits mustmove through different regions of phase space. Is this because the equationsof motion admit a third isolating integral I(R, z, pR, pz) in addition to E andLz?


3.2.2 Surfaces of section

The phase space associated with the motion we are considering has fourdimensions, R, z, pR, and pz, and the four-dimensional motion of the phase-space point of an individual star is too complicated to visualize. Nonetheless,we can determine whether orbits in the (R, z) plane admit an additionalisolating integral by use of a simple graphical device. Since the HamiltonianHeff(R, z, pR, pz) is constant, we could plot the motion of the representativepoint in a three-dimensional reduced phase space, say (R, z, pR), and thenpz would be determined (to within a sign) by the known value E of Heff .However, even three-dimensional spaces are difficult to draw, so we simplyshow the points where the star crosses some plane in the reduced phase space,say the plane z = 0; these points are called consequents. To remove thesign ambiguity in pz, we plot the (R, pR) coordinates only when pz > 0. Inother words, we plot the values of R and pR every time the star crosses theequator going upward. Such plots were first used by Poincare and are calledsurfaces of section.2 The key feature of the surface of section is that, eventhough it is only two-dimensional, no two distinct orbits at the same energycan occupy the same point. Also, any orbit is restricted to an area in thesurface of section defined by the constraint Heff ≥ 1

2 R2 + Φeff ; the curvebounding this area is often called the zero-velocity curve of the surface ofsection, since it can only be reached by an orbit with pz = 0.

Figure 3.5 shows the (R, pR) surface of section at the energy of the orbitsof Figure 3.4: the full curve is the zero-velocity curve, while the dots showthe consequents generated by the orbit in the left panel of Figure 3.4. Thecross near the center of the surface of section, at (R = 0.26, pR = 0), is thesingle consequent of the shell orbit, in which the trajectory of the star isrestricted to a two-dimensional surface. The shell orbit is the limit of orbitssuch as those shown in Figure 3.4 in which the distance between the innerand outer boundaries of the orbit shrinks to zero.

In Figure 3.5 the consequents of the orbit of the left panel of Figure 3.4appear to lie on a smooth curve, called the invariant curve of the orbit. Theexistence of the invariant curve implies that some isolating integral I is re-spected by this orbit. The curve arises because the equation I = constant re-stricts motion in the two-dimensional surface of section to a one-dimensionalcurve (or perhaps to a finite number of discrete points in exceptional cases).It is often found that for realistic galactic potentials, orbits do admit an in-tegral of this type. Since I is in addition to the two classical integrals H andpφ, it is called the third integral. In general there is no analytic expressionfor I as a function of the phase-space variables, so it is called a non-classicalintegral.

2 A surface of section is defined by some arbitrarily chosen condition, here z = 0, pz >0. Good judgment must be used in the choice of this condition lest some important orbitsnever satisfy it, and hence do not appear on the surface of section.


Figure 3.5 Points generated by the orbit of the left panel of Figure 3.4 in the (R, pR)surface of section. If the total angular momentum L of the orbit were conserved, the pointswould fall on the dashed curve. The full curve is the zero-velocity curve at the energy ofthis orbit. The × marks the consequent of the shell orbit.

Figure 3.6 The total angular momentum is almost constant along the orbit shown in theleft panel of Figure 3.5. For clarity L(t) is plotted only at the beginning and end of a longintegration.

We may form an intuitive picture of the nature of the third integral byconsidering two special cases. If the potential Φ is spherical, we know that thetotal angular momentum |L| is an integral. This suggests that for a nearlyspherical potential—this one has axis ratio q = 0.9—the third integral maybe approximated by |L|. The dashed curve in Figure 3.5 shows the curve


on which the points generated by the orbit of the left panel of Figure 3.4would lie if the third integral were |L|, and Figure 3.6 shows the actual timeevolution of |L| along that orbit—notice that although |L| oscillates rapidly,its mean value does not change even over hundreds of orbital times. Fromthese two figures we see that |L| is an approximately conserved quantity,even for orbits in potentials that are significantly flattened. We may thinkof these orbits as approximately planar and with more or less fixed peri- andapocenter radii. The approximate orbital planes have a fixed inclination tothe z axis but precess about this axis, at a rate that gradually tends to zeroas the potential becomes more and more nearly spherical.

The second special case is when the potential is separable in R and z:

Φ(R, z) = ΦR(R) + Φz(z). (3.73)

Then the third integral can be taken to be the energy of vertical motion

Hz = 12p2

z + Φz(z). (3.74)

Along nearly circular orbits in a thin disk, the potential is approximatelyseparable, so equation (3.74) provides a useful expression for the third inte-gral. In §3.6.2b we discuss a more sophisticated approximation to the thirdintegral for orbits in thin disks.

3.2.3 Nearly circular orbits: epicycles and the velocity ellipsoid

In disk galaxies many stars are on nearly circular orbits, so it is useful toderive approximate solutions to equations (3.68a) that are valid for suchorbits. We define

x ≡ R − Rg, (3.75)

where Rg(Lz) is the guiding-center radius for an orbit of angular momentumLz (eq. 3.72). Thus (x, z) = (0, 0) are the coordinates in the meridional planeof the minimum in Φeff . When we expand Φeff in a Taylor series about thispoint, we obtain

Φeff = Φeff(Rg, 0)+ 12

(∂2Φeff

∂R2

)

(Rg,0)

x2+ 12

(∂2Φeff

∂z2

)

(Rg,0)

z2+O(xz2). (3.76)

Note that the term that is proportional to xz vanishes because Φeff is assumedto be symmetric about z = 0. The equations of motion (3.68a) become verysimple in the epicycle approximation in which we neglect all terms in Φeff

of order xz2 or higher powers of x and z. We define two new quantities by

κ2(Rg) ≡(∂2Φeff

∂R2

)

(Rg,0)

; ν2(Rg) ≡(∂2Φeff

∂z2

)

(Rg,0)

, (3.77)


for then equations (3.68a) become

x = −κ2x, (3.78a)

z = −ν2z. (3.78b)

According to these equations, x and z evolve like the displacements of twoharmonic oscillators, with frequencies κ and ν, respectively. The two frequen-cies κ and ν are called the epicycle or radial frequency and the verticalfrequency. If we substitute from equation (3.68b) for Φeff we obtain3

κ2(Rg) =

(∂2Φ

∂R2

)

(Rg,0)

+3L2

z

R4g

=

(∂2Φ

∂R2

)

(Rg,0)

+3

Rg

(∂Φ

∂R

)

(Rg,0)

, (3.79a)

ν2(Rg) =

(∂2Φ

∂z2

)

(Rg,0)

. (3.79b)

Since the circular frequency is given by

Ω2(R) =1

R

(∂Φ

∂R

)

(R,0)

=L2

z

R4, (3.79c)

equation (3.79a) may be written

κ2(Rg) =

(R

dΩ2

dR+ 4Ω2

)

Rg

. (3.80)

Note that the radial and azimuthal periods (eqs. 3.17 and 3.19) are simply

Tr =2π

κ; Tψ =

2π

Ω. (3.81)

Very near the center of a galaxy, where the circular speed rises approx-imately linearly with radius, Ω is nearly constant and κ ( 2Ω. Elsewhere Ωdeclines with radius, though rarely faster than the Kepler falloff, Ω ∝ R−3/2,which yields κ = Ω. Thus, in general,

Ω ∼< κ ∼< 2Ω. (3.82)

Using equations (3.19) and (3.81), it is easy to show that this range is con-sistent with the range of ∆ψ given by equation (3.41) for the isochronepotential.

3 The formula for the ratio κ2/Ω2 from equations (3.79) was already known to Newton;see Proposition 45 of his Principia.


It is useful to define two functions

A(R) ≡ 12

(vc

R−

dvc

dR

)= − 1

2RdΩ

dR,

B(R) ≡ − 12

(vc

R+

dvc

dR

)= −

(Ω + 1

2RdΩ

dR

),

(3.83)

where vc(R) = RΩ(R) is the circular speed at radius R. These functions arerelated to the circular and epicycle frequencies by

Ω = A − B ; κ2 = −4B(A − B) = −4BΩ. (3.84)

The values taken by A and B at the solar radius can be measured directlyfrom the kinematics of stars in the solar neighborhood (BM §10.3.3) andare called the Oort constants.4 Taking values for these constants fromTable 1.2, we find that the epicycle frequency at the Sun is κ0 = (37 ±3) km s−1 kpc−1, and that the ratio κ0/Ω0 at the Sun is

κ0

Ω0= 2

√−B

A − B= 1.35 ± 0.05. (3.85)

Consequently the Sun makes about 1.3 oscillations in the radial direction inthe time it takes to complete an orbit around the galactic center. Hence itsorbit does not close on itself in an inertial frame, but forms a rosette figurelike those discussed above for stars in spherically symmetric potentials.

The equations of motion (3.78) lead to two integrals, namely, the one-dimensional Hamiltonians

HR ≡ 12 (x2 + κ2x2) ; Hz ≡ 1

2 (z2 + ν2z2) (3.86)

of the two oscillators. Thus if the star’s orbit is sufficiently nearly circularthat our truncation of the series for Φeff (eq. 3.76) is justified, then the orbitadmits three integrals of motion: HR, Hz , and pφ. These are all isolatingintegrals.

From equations (3.75), (3.77), (3.78), and (3.86) we see that the Hamil-tonian of such a star is made up of three parts:

H = HR(R, pR) + Hz(z, pz) + Φeff(Rg, 0). (3.87)

4 Jan Hendrik Oort (1900–1992) was Director of Leiden Observatory in the Nether-lands from 1945 to 1970. In 1927 Oort confirmed Bertil Lindblad’s hypothesis of galacticrotation with an analysis of the motions of nearby stars that established the mathematicalframework for studying Galactic rotation. With his student H. van de Hulst, he predictedthe 21-cm line of neutral hydrogen. Oort also established the Netherlands as a worldleader in radio astronomy, and showed that many comets originate in a cloud surroundingthe Sun at a distance ∼ 0.1 pc, now called the Oort cloud.


Thus the three integrals of motion can equally be chosen as (HR, Hz, pφ) or(H, Hz , pφ), and in the latter case Hz, which is a classical integral, is playingthe role of the third integral.

We now investigate what the ratios of the frequencies κ, Ω and ν tellus about the properties of the Galaxy. At most points in a typical galacticdisk (including the solar neighborhood) vc ( constant , and from (3.80) it iseasy to show that in this case κ2 = 2Ω2. In cylindrical coordinates Poisson’sequation for an axisymmetric galaxy reads

4πGρ =1

R

∂

∂R

(R∂Φ

∂R

)+∂2Φ

∂z2

(1

R

dv2c

dR+ ν2,

(3.88)

where in the second line we have approximated the right side by its valuein the equatorial plane and used equation (3.79b). If the mass distributionwere spherical, we would have Ω2 ( GM/R3 = 4

3πGρ, where M is the massand ρ is the mean density within the sphere of radius R about the galacticcenter. From the plot of the circular speed of an exponential disk shown inFigure 2.17, we know that this relation is not far from correct even for a flatdisk. Hence, at a typical point in a galaxy such as the Milky Way

ν2

κ2( 3

2ρ/ρ. (3.89)

That is, the ratio ν2/κ2 is a measure of the degree to which the galactic mate-rial is concentrated towards the plane, and will be significantly greater thanunity for a disk galaxy. From Table 1.1 we shall see that ρ ( 0.1M" pc−3,so the vertical period of small oscillations is 2π/ν ( 87 Myr. For vc =220 km s−1 and R0 = 8 kpc (Table 1.2) we find ρ = 0.039M" pc−3. Equa-tion (3.89) then yields ν/κ ( 2.0.

From equation (3.88) it is clear that we expect Φeff ∝ z2 only for valuesof z small enough that ρdisk(z) ( constant , i.e., for z + 300 pc at R0. Forstars that do not rise above this height, equation (3.78b) yields

z = Z cos(νt + ζ), (3.90)

where Z and ζ are arbitrary constants. However, the orbits of the majorityof disk stars carry these stars further above the plane than 300 pc (Prob-lem 4.23). Therefore the epicycle approximation does not provide a reliableguide to the motion of the majority of disk stars in the direction perpendicu-lar to the disk. The great value of this approximation lies rather in its abilityto describe the motions of stars in the disk plane. So far we have describedonly the radial component of this motion, so we now turn to the azimuthalmotion.


Equation (3.78a), which governs the radial motion, has the general so-lution

x(t) = X cos(κt + α), (3.91)

where X ≥ 0 and α are arbitrary constants. Now let Ωg = Lz/R2g be

the angular speed of the circular orbit with angular momentum Lz. Sincepφ = Lz is constant, we have

φ =pφ

R2=

Lz

R2g

(1 +

x

Rg

)−2

( Ωg

(1 −

2x

Rg

).

(3.92)

Substituting for x from (3.91) and integrating, we obtain

φ = Ωgt + φ0 − γX

Rgsin(κt + α), (3.93a)

where

γ ≡2Ωg

κ= −

κ

2B, (3.93b)

where the second equality is derived using (3.84). The nature of the mo-tion described by these equations can be clarified by erecting Cartesian axes(x, y, z) with origin at the guiding center, (R,φ) = (Rg, Ωgt + φ0). Thex and z coordinates have already been defined, and the y coordinate is per-pendicular to both and points in the direction of rotation.5 To first order inthe small parameter X/Rg we have

y = −γX sin(κt + α)

≡ −Y sin(κt + α).(3.94)

Equations (3.91) and (3.94) are the complete solution for an equatorial orbitin the epicycle approximation. The motion in the z-direction is independentof the motion in x and y. In the (x, y) plane the star moves on an ellipsecalled the epicycle around the guiding center (see Figure 3.7). The lengthsof the semi-axes of the epicycle are in the ratio

X

Y= γ−1. (3.95)

For a harmonic oscillator potential X/Y = 1 and for a Kepler potentialX/Y = 1

2 ; the inequality (3.82) shows that in most galactic potentials

5 In applications to the Milky Way, which rotates clockwise when viewed from thenorth Galactic pole, either ez is directed towards the south Galactic pole, or (x, y, z) is aleft-handed coordinate system; we make the second choice in this book.


Figure 3.7 An elliptical Kepler orbit(dashed curve) is well approximatedby the superposition of motion atangular frequency κ around a smallellipse with axis ratio 1

2 , and motionof the ellipse’s center in the oppositesense at angular frequency Ω arounda circle (dotted curve).

Y > X , so the epicycle is elongated in the tangential direction.6 Fromequation (3.85), X/Y ( 0.7 in the solar neighborhood. The motion aroundthe epicycle is in the opposite sense to the rotation of the guiding centeraround the galactic center, and the period of the epicycle motion is 2π/κ,while the period of the guiding-center motion is 2π/Ωg.

Consider the motion of a star on an epicyclic orbit, as viewed by anastronomer who sits at the guiding center of the star’s orbit. At differenttimes in the orbit the astronomer’s distance measurements range from a max-imum value Y down to X . Since by equation (3.95), X/Y = κ/(2Ωg), thesemeasurements yield important information about the galactic potential. Ofcourse, the epicycle period is much longer than an astronomer’s lifetime, sowe cannot in practice measure the distance to a given star as it moves aroundits epicycle. Moreover, in general we do not know the location of the guidingcenter of any given star. But we can measure vR and vφ(R0) − vc(R0) for agroup of stars, each of which has its own guiding-center radius Rg, as theypass near the Sun at radius R0. We now show that from these measurementswe can determine the ratio 2Ω/κ. We have

vφ(R0) − vc(R0) = R0(φ− Ω0) = R0(φ− Ωg + Ωg − Ω0)

( R0

[(φ − Ωg) −

(dΩ

dR

)

Rg

x

].

(3.96a)

With equation (3.92) this becomes

vφ(R0) − vc(R0) ( −R0x

(2Ω

R+

dΩ

dR

)

Rg

. (3.96b)

6 Epicycles were invented by the Greek astronomer Hipparchus (190–120 BC) to de-scribe the motion of the planets about the Sun. Hipparchus also measured the distanceto the Moon and discovered the precession of the Earth’s spin axis. Epicycles—the firstknown perturbation expansion—were not very successful, largely because Hipparchus usedcircular epicycles with X/Y = 1. If only he had used epicycles with the proper axis ratioX/Y = 1

2 !


If we evaluate the coefficient of the small quantity x at R0 rather than Rg,we introduce an additional error in vφ(R0) which is of order x2 and thereforenegligible. Making this approximation we find

vφ(R0) − vc(R0) ( −x

(2Ω + R

dΩ

dR

)

R0

. (3.96c)

Finally using equations (3.83) to introduce Oort’s constants, we obtain

vφ(R0) − vc(R0) ( 2Bx =κ

γx =

κ

γX cos(κt + α). (3.97)

Averaging over the phases α of stars near the Sun, we find

[vφ − vc(R0)]2 =κ2X2

2γ2= 2B2X2. (3.98)

Similarly, we may neglect the dependence of κ on Rg to obtain with equation(3.84)

v2R = 1

2κ2X2 = −2B(A − B)X2. (3.99)

Taking the ratio of the last two equations we have

[vφ − vc(R0)]2

v2R

(−B

A − B= −

B

Ω0=

κ20

4Ω20

= γ−2 ( 0.46. (3.100)

In §4.4.3 we shall re-derive this equation from a rather different point of viewand compare its predictions with observational data.

Note that the ratio in equation (3.100) is the inverse of the ratio of themean-square azimuthal and radial velocities relative to the guiding center:by (3.95)

y2

x2=

12 (κY )2

12 (κX)2

= γ2. (3.101)

This counter-intuitive result arises because one measure of the rms tangentialvelocity (eq. 3.101) is taken with respect to the guiding center of a single star,while the other (eq. 3.100) is taken with respect to the circular speed at thestar’s instantaneous radius.

This analysis also leads to an alternative expression for the integral ofmotion HR defined in equation (3.86). Eliminating x using equation (3.97),we have

HR = 12 x2 + 1

2γ2[vφ(R0) − vc(R0)]

2. (3.102)

3.3 Orbits in planar non-axisymmetric potentials 171

3.3 Orbits in planar non-axisymmetric potentials

Many, possibly most, galaxies have non-axisymmetric structures. These areevident near the centers of many disk galaxies, where one finds a luminousstellar bar—the Milky Way possesses just such a bar (BM §10.3). Non-axisymmetry is harder to detect in an elliptical galaxy, but we believe thatmany elliptical galaxies, especially the more luminous ones, are triaxial ratherthan axisymmetric (BM §4.2). Evidently we need to understand how starsorbit in a non-axisymmetric potential if we are to model galaxies successfully.

We start with the simplest possible problem, namely, planar motion ina non-rotating potential.7 Towards the end of this section we generalizethe discussion to two-dimensional motion in potentials whose figures rotatesteadily, and in the next section we show how an understanding of two-dimensional motion can be exploited in problems involving three-dimensionalpotentials.

3.3.1 Two-dimensional non-rotating potential

Consider the logarithmic potential (cf. §2.3.2)

ΦL(x, y) = 12v2

0 ln

(R2

c + x2 +y2

q2

)(0 < q ≤ 1). (3.103)

This potential has the following useful properties:(i) The equipotentials have constant axial ratio q, so the influence of the

non-axisymmetry is similar at all radii. Since q ≤ 1, the y axis is theminor axis.

(ii) For R =√

x2 + y2 + Rc, we may expand ΦL in powers of R/Rc andfind

ΦL(x, y) (v20

2R2c

(x2 +

y2

q2

)+ constant (R + Rc), (3.104)

which is just the potential of the two-dimensional harmonic oscillator.In §2.5 we saw that gravitational potentials of this form are generatedby homogeneous ellipsoids. Thus for R ∼< Rc, ΦL approximates thepotential of a homogeneous density distribution.

(iii) For R ) Rc and q = 1, ΦL ( v20 ln R, which yields a circular speed

vc ( v0 that is nearly constant. Thus the radial component of the forcegenerated by ΦL with q ( 1 is consistent with the flat circular-speedcurves of many disk galaxies.

The simplest orbits in ΦL are those that are confined to R + Rc; when ΦL

is of the form (3.104), the orbit is the sum of independent harmonic motions

7 This problem is equivalent to that of motion in the meridional plane of an axisym-metric potential when Lz = 0.


Figure 3.8 Two orbits of a com-mon energy in the potential ΦL

of equation (3.103) when v0 = 1,q = 0.9 and Rc = 0.14: top, a boxorbit; bottom, a loop orbit. Theclosed parent of the loop orbit is alsoshown. The energy, E = −0.337, isthat of the isopotential surface thatcuts the long axis at x = 5Rc.

parallel to the x and y axes. The frequencies of these motions are ωx = v0/Rc

and ωy = v0/qRc, and unless these frequencies are commensurable (i.e.,unless ωx/ωy = n/m for some integers n and m), the star eventually passesclose to every point inside a rectangular box. These orbits are thereforeknown as box orbits.8 Such orbits have no particular sense of circulationabout the center and thus their time-averaged angular momentum is zero.They respect two integrals of the motion, which we may take to be theHamiltonians of the independent oscillations parallel to the coordinate axes,

Hx = 12v2

x + 12v2

0x2

R2c

; Hy = 12v2

y + 12v2

0y2

q2R2c. (3.105)

To investigate orbits at larger radii R ∼> Rc, we must use numericalintegrations. Two examples are shown in Figure 3.8. Neither orbit fills theelliptical zero-velocity curve ΦL = E, so both orbits must respect a secondintegral in addition to the energy. The upper orbit is still called a box orbitbecause it can be thought of as a distorted form of a box orbit in the two-dimensional harmonic oscillator. Within the core the orbit’s envelope runsapproximately parallel to the long axis of the potential, while for R ) Rc

the envelope approximately follows curves of constant azimuth or radius.In the lower orbit of Figure 3.8, the star circulates in a fixed sense about

the center of the potential, while oscillating in radius. Orbits of this typeare called loop orbits. Any star launched from R ) Rc in the tangentialdirection with a speed of order v0 will follow a loop orbit. If the star islaunched at speed ∼ v0 at a large angle to the tangential direction, theannulus occupied by the orbit will be wide, while if the launch angle is small,the annulus is narrow. This dependence is analogous to the way in which

8 The curve traced by a box orbit is sometimes called a Lissajous figure and is easilydisplayed on an oscilloscope.


Figure 3.9 The (x, x) surface of section formed by orbits in ΦL of the same energy as theorbits depicted in Figure 3.8. The isopotential surface of this energy cuts the long axis atx = 0.7. The curves marked 4 and 1 correspond to the box and loop orbits shown in thetop and bottom panels of Figure 3.8.

the thickness of the rosette formed by an orbit of given energy in a planaraxisymmetric potential depends on its angular momentum. This analogysuggests that stars on loop orbits in ΦL may respect an integral that is somesort of generalization of the angular momentum pφ.

We may investigate these orbits further by generating a surface of sec-tion. Figure 3.9 is the surface of section y = 0, y > 0 generated by orbitsin ΦL of the same energy as the orbits shown in Figure 3.8. The boundarycurve in this figure arises from the energy constraint

12 x2 + ΦL(x, 0) ≤ 1

2 (x2 + y2) + ΦL(x, 0) = Hy=0. (3.106)

Each closed curve in this figure corresponds to a different orbit. All theseorbits respect an integral I2 in addition to the energy because each orbit isconfined to a curve.

There are two types of closed curve in Figure 3.9, corresponding tothe two basic types of orbit that we have identified. The lower panel ofFigure 3.8 shows the spatial form of the loop orbit that generates the curvemarked 1 in Figure 3.9. At a given energy there is a whole family of suchorbits that differ in the width of the elliptical annuli within which they areconfined—see Figure 3.10. The unique orbit of this family that circulates in


Figure 3.10 A selection of loop (top row) and box (bottom row) orbits in the potentialΦL(q = 0.9, Rc = 0.14) at the energy of Figures 3.8 and 3.9.

an anti-clockwise sense and closes on itself after one revolution is the closedloop orbit, which is also shown at the bottom of Figure 3.8. In the surface ofsection this orbit generates the single point 3. Orbits with non-zero annularwidths generate the curves that loop around the point 3. Naturally, thereare loop orbits that circulate in a clockwise sense in addition to the anti-clockwise orbits; in the surface of section their representative curves looparound the point 2.

The second type of closed curve in Figure 3.9 corresponds to box orbits.The box orbit shown at the top of Figure 3.8 generates the curve marked 4.All the curves in the surface of section that are symmetric about the origin,rather than centered on one of the points 2 or 3, correspond to box orbits.These orbits differ from loop orbits in two major ways: (i) in the courseof time a star on any of them passes arbitrarily close to the center of thepotential (in the surface of section their curves cross x = 0), and (ii) stars onthese orbits have no unique sense of rotation about the center (in the surfaceof section their curves are symmetric about x = 0). The outermost curvein Figure 3.9 (the zero-velocity curve) corresponds to the orbit on whichy = y = 0; on this orbit the star simply oscillates back and forth along thex axis. We call this the closed long-axis orbit. The curves interior tothis bounding curve that also center on the origin correspond to less and lesselongated box orbits. The bottom row of Figure 3.10 shows this progressionfrom left to right. Notice the strong resemblance of the most eccentric loop


Figure 3.11 The appearance of the surface of section Figure 3.9 if orbits conserved (a)angular momentum (eq. 3.107; dashed curves), or (b) Hx (eq. 3.105; inner dotted curves),or (c) H′

x (eq. 3.108; outer dot-dashed curves).

orbit in the top right panel to the least elongated box orbit shown belowit. The big difference between these orbits is that the loop orbit has a fixedsense of circulation about the center, while the box orbit does not.

It is instructive to compare the curves of Figure 3.9 with the curvesgenerated by the integrals that we encountered earlier in this chapter. Forexample, if the angular momentum pφ were an integral, the curves on thesurface of section y = 0, y > 0 would be given by the relation

(pφ)y=0 = xy = x√

2[E − ΦL(x, 0)] − x2. (3.107)

These curves are shown as dashed curves in Figure 3.11. They resemble thecurves in Figure 3.9 near the closed loop orbits 2 and 3, thus supporting oursuspicion that the integral respected by loop orbits is some generalization ofangular momentum. However, the dashed curves do not reproduce the curvesgenerated by box orbits. If the extra integral were the Hamiltonian Hx ofthe x-component of motion in the harmonic potential (3.105), the curves inFigure 3.9 would be the dotted ellipses near the center of Figure 3.11. Theyresemble the curves in Figure 3.9 that are generated by the box orbits only inthat they are symmetrical about the x axis. Figure 3.11 shows that a betterapproximation to the invariant curves of box orbits is provided by contours


of constantH ′

x ≡ 12 x2 + Φ(x, 0). (3.108)

H ′x may be thought of as the Hamiltonian associated with motion parallel to

the potential’s long axis. In a sense the integrals respected by box and looporbits are analogous to H ′

x and pφ, respectively.Figures 3.8 and 3.9 suggest an intimate connection between closed or-

bits and families of non-closed orbits. We say that the clockwise closedloop orbit is the parent of the family of clockwise loop orbits. Similarly, theclosed long-axis orbit y = 0 is the parent of the box orbits.

The closed orbits that are the parents of orbit families are all stable,since members of their families that are initially close to them remain closeat all times. In fact, we may think of any member of the family as engagedin stable oscillations about the parent closed orbit. A simple example of thisstate of affairs is provided by orbits in an axisymmetric potential. In a two-dimensional axisymmetric potential there are only two stable closed orbitsat each energy—the clockwise and the anti-clockwise circular orbits.9 Allother orbits, having non-zero eccentricity, belong to families whose parentsare these two orbits. The epicycle frequency (3.80) is simply the frequencyof small oscillations around the parent closed orbit.

The relationship between stable closed orbits and families of non-closedorbits enables us to trace the evolution of the orbital structure of a potentialas the energy of the orbits or the shape of the potential is altered, simply bytracing the evolution of the stable closed orbits. For example, consider howthe orbital structure supported by ΦL (eq. 3.103) evolves as we pass from theaxisymmetric potential that is obtained when q = 1 to the barred potentialsthat are obtained when q < 1. When q = 1, pφ is an integral, so the surfaceof section is qualitatively similar to the dashed curves in Figure 3.11. Theonly stable closed orbits are circular, and all orbits are loop orbits. Whenwe make q slightly smaller than unity, the long-axis orbit becomes stableand parents a family of elongated box orbits that oscillate about the axialorbit. As q is diminished more and more below unity, a larger and largerportion of phase space comes to be occupied by box rather than loop orbits.Comparison of Figures 3.9 and 3.12 shows that this evolution manifests itselfin the surface of section by the growth of the band of box orbits that runsaround the outside of Figure 3.12 at the expense of the two bull’s-eyes in thatfigure that are associated with the loop orbits. In real space the closed looporbits become more and more elongated, with the result that less and lessepicyclic motion needs to be added to one of these closed orbits to fill in thehole at its center and thus terminate the sequence of loop orbits. The erosionof the bull’s-eyes in the surface of section is associated with this process.

The appearance of the surface of section also depends on the energyof its orbits. Figure 3.13 shows a surface of section for motion in ΦL(q =

9 Special potentials such as the Kepler potential, in which all orbits are closed, mustbe excepted from this statement.


Figure 3.12 When the potential ΦL is made more strongly barred by diminishing q, theproportion of orbits that are boxes grows at the expense of the loops: the figure shows thesame surface of section as Figure 3.9 but for q = 0.8 rather than q = 0.9.

0.9, Rc = 0.14) at a lower energy than that of Figure 3.9. The changes in thesurface of section are closely related to changes in the size and shape of thebox and loop orbits. Box orbits that reach radii much greater than the coreradius Rc have rather narrow waists (see Figure 3.10), and closed loop orbitsof the same energy are nearly circular. If we consider box orbits and closedloop orbits of progressively smaller dimensions, the waists of the box orbitsbecome steadily less narrow, and the closed orbits become progressively moreeccentric as the dimensions of the orbits approach Rc. Eventually, at anenergy Ec, the closed loop orbit degenerates into a line parallel to the shortaxis of the potential. Loop orbits do not exist at energies less than Ec. AtE < Ec, all orbits are box orbits. The absence of loop orbits at E < Ec

is not unexpected since we saw above (eq. 3.105) that when x2 + y2 + R2c ,

the potential is essentially that of the two-dimensional harmonic oscillator,none of whose orbits are loops. At these energies the only closed orbits arethe short- and the long-axis closed orbits, and we expect both of these orbitsto be stable. In fact, the short-axis orbit becomes unstable at the energyEc at which the loop orbits first appear. One says that the stable short-axisorbit of the low-energy regime bifurcates into the stable clockwise and anti-clockwise loop orbits at Ec. Stable closed orbits often appear in pairs likethis.


Figure 3.13 At low energies in a barred potential a large fraction of all orbits are boxes:the figure shows the same surface of section as Figure 3.9 but for the energy whose iso-potential surface cuts the x axis at x = 0.35 rather than at x = 0.7 as in Figure 3.9.

Many two-dimensional barred potentials have orbital structures thatresemble that of ΦL. In particular:(i) Most orbits in these potentials respect a second integral in addition to

energy.(ii) The majority of orbits in these potentials can be classified as either loop

orbits or box orbits. The loop orbits have a fixed sense of rotation andnever carry the star near the center, while the box orbits have no fixedsense of rotation and allow the star to pass arbitrarily close to the center.

(iii) When the axial ratio of the isopotential curves is close to unity, most ofthe phase space is filled with loop orbits, but as the axial ratio changesaway from unity, box orbits fill a bigger fraction of phase space.

Although these properties are fairly general, in §3.7.3 we shall see that certainbarred potentials have considerably more complex orbital structures.

3.3.2 Two-dimensional rotating potential

The figures of many non-axisymmetric galaxies rotate with respect to in-ertial space, so we now study orbits in rotating potentials. Let the frameof reference in which the potential Φ is static rotate steadily at angular ve-locity Ωb, often called the pattern speed. In this frame the velocity is x


and the corresponding velocity in an inertial frame is x + Ωb × x. Thus theLagrangian is

L = 12

∣∣x + Ωb × x∣∣2 − Φ(x). (3.109)

Consequently, the momentum is

p =∂L∂x

= x + Ωb × x, (3.110)

which is just the momentum in the underlying inertial frame. The Hamil-tonian is

HJ = p · x − L= p · (p − Ωb × x) − 1

2p2 + Φ

= 12p2 + Φ − Ωb · (x × p),

(3.111)

where we have used the vector identity (B.8). Since p coincides with themomentum in an inertial frame, x × p = L is the angular momentum and12p2 + Φ is the Hamiltonian H that governs the motion in the inertial frame.Hence, (3.111) can be written

HJ = H − Ωb · L. (3.112)

Since Φ(x) is constant in the rotating frame, HJ has no explicit time de-pendence, and its derivative along any orbit dHJ/dt = ∂HJ/∂t vanishes(eq. D.56). Thus HJ is an integral, called the Jacobi integral: in a ro-tating non-axisymmetric potential, neither H nor L is conserved, but thecombination H − Ωb · L is conserved. From (3.111) it is easy to show thatthe constant value of HJ may be written as

EJ = 12 |x|

2 + Φ − 12 |Ωb × x|2

= 12 |x|

2 + Φeff ,(3.113)

where the effective potential

Φeff(x) ≡ Φ(x) − 12 |Ωb × x|2

= Φ(x) − 12

[|Ωb|2|x|2 − (Ωb · x)2

].

(3.114)

In deriving the second line we have used the identity (B.10). The effectivepotential is the sum of the gravitational potential and a repulsive centrifugalpotential. For Ωb = Ωbez, this additional term is simply − 1

2Ω2R2 incylindrical coordinates.

With equation (3.111) Hamilton’s equations become

p = −∂HJ

∂x= −∇Φ − Ωb × p

x =∂HJ

∂p= p− Ωb × x,

(3.115)


Figure 3.14 Contours of constant effective potential Φeff when the potential is given byequation (3.103) with v0 = 1, q = 0.8, Rc = 0.1, and Ωb = 1. The point marked L3 is aminimum of Φeff , while those marked L4 and L5 are maxima. Φeff has saddle points atL1 and L2.

where we have used the identity (B.40). Eliminating p between these equa-tions we find

x = −∇Φ − 2Ωb × x − Ωb × (Ωb × x)

= −∇Φ − 2Ωb × x + |Ωb|2x − Ωb(Ωb · x).(3.116)

Here −2Ωb × x is known as the Coriolis force and −Ωb × (Ωb × x) is thecentrifugal force. Taking the gradient of the last line of equation (3.114),we see that (3.116) can be written in the simpler form

x = −∇Φeff − 2Ωb × x. (3.117)

The surface Φeff = EJ is often called the zero-velocity surface. Allregions in which Φeff > EJ are forbidden to the star. Thus, although thesolution of the differential equations for the orbit in a rotating potentialmay be difficult, we can at least define forbidden regions into which the starcannot penetrate.

Figure 3.14 shows contours of Φeff for the potential ΦL of equation(3.103). Φeff is characterized by five stationary points, marked L1 to L5,


at which ∇Φeff = 0. These points are sometimes called Lagrange pointsafter similar points in the restricted three-body problem (Figure 8.6). Thecentral stationary point L3 in Figure 3.14 is a minimum of the potentialand is surrounded by a region in which the centrifugal potential − 1

2Ω2bR2

makes only a small contribution to Φeff . At each of the four points L1,L2, L4, and L5, it is possible for a star to travel on a circular orbit whileappearing to be stationary in the rotating frame, because the gravitationaland centrifugal forces precisely balance. Such orbits are said to corotatewith the potential. The stationary points L1 and L2 on the x axis (the longaxis of the potential) are saddle points, while the stationary points L4 andL5 along the y axis are maxima of the effective potential. Stars with valuesof EJ smaller than the value Φc taken by Φeff at L1 and L2 cannot movefrom the center of the potential to infinity, or indeed anywhere outside theinner equipotential contour that runs through L1 and L2. By contrast, astar for which EJ exceeds Φc, or any star that is initially outside the contourthrough L1 and L2, can in principle escape to infinity. However, it cannotbe assumed that a star of the latter class will necessarily escape, becausethe Coriolis force prevents stars from accelerating steadily in the directionof −∇Φeff .

We now consider motion near each of the Lagrange points L1 to L5.These are stationary points of Φeff , so when we expand Φeff around one ofthese points xL = (xL, yL) in powers of (x − xL) and (y − yL), we have

Φeff(x, y) = Φeff(xL, yL) + 12

(∂2Φeff

∂x2

)

xL

(x − xL)2

+

(∂2Φeff

∂x∂y

)

xL

(x − xL)(y − yL) + 12

(∂2Φeff

∂y2

)

xL

(y − yL)2 + · · · .

(3.118)Furthermore, for any bar-like potential whose principal axes lie along thecoordinate axes, ∂2Φeff/∂x∂y = 0 at xL by symmetry. Hence, if we retainonly quadratic terms in equation (3.118) and define

ξ ≡ x − xL ; η ≡ y − yL, (3.119)

and

Φxx ≡(∂2Φeff

∂x2

)

xL

; Φyy ≡(∂2Φeff

∂y2

)

xL

, (3.120)

the equations of motion (3.117) become for a star near xL,

ξ = 2Ωbη − Φxxξ ; η = −2Ωbξ − Φyyη. (3.121)

This is a pair of linear differential equations with constant coefficients. Thegeneral solution can be found by substituting ξ = X exp(λt), η = Y exp(λt),


where X , Y , and λ are complex constants. With these substitutions, equa-tions (3.121) become

(λ2 + Φxx)X − 2λΩbY = 0 ; 2λΩbX + (λ2 + Φyy)Y = 0. (3.122)

These simultaneous equations have a non-trivial solution for X and Y onlyif the determinant ∣∣∣∣

λ2 + Φxx −2λΩb

2λΩb λ2 + Φyy

∣∣∣∣ = 0. (3.123)

Thus we require

λ4 + λ2(Φxx + Φyy + 4Ω2

b

)+ ΦxxΦyy = 0. (3.124)

This is the characteristic equation for λ. It has four roots, which may beeither real or complex. If λ is a root, −λ is also a root, so if there is anyroot that has non-zero real part Re(λ) = γ, the general solution to equations(3.121) will contain terms that cause |ξ| and |η| to grow exponentially in time;|ξ| ∝ exp(|γ|t) and |η| ∝ exp(|γ|t). Under these circumstances essentially allorbits rapidly flee from the Lagrange point, and the approximation on whichequations (3.121) rest breaks down. In this case the Lagrange point is saidto be unstable.

When all the roots of equation (3.124) are pure imaginary, say λ = ±iαor ±iβ, with 0 ≤ α ≤ β real, the general solution to equations (3.121) is

ξ = X1 cos(αt + φ1) + X2 cos(βt + φ2),

η = Y1 sin(αt + φ1) + Y2 sin(βt + φ2),(3.125)

and the Lagrange point is stable, since the perturbations ξ and η oscillaterather than growing. Substituting these equations into the differential equa-tions (3.121), we find that X1 and Y1 and X2 and Y2 are related by

Y1 =Φxx − α2

2ΩbαX1 =

2Ωbα

Φyy − α2X1, (3.126a)

and

Y2 =Φxx − β2

2ΩbβX2 =

2Ωbβ

Φyy − β2X2. (3.126b)

The following three conditions are necessary and sufficient for both roots λ2

of the quadratic equation (3.124) in λ2 to be real and negative, and hencefor the Lagrange point to be stable:

(i)

(ii)

(iii)

λ21λ

22 = ΦxxΦyy > 0,

λ21 + λ2

2 = −(Φxx + Φyy + 4Ω2

b

)< 0,

λ2 real ⇒ (Φxx + Φyy + 4Ω2b)

2 > 4ΦxxΦyy.

(3.127)


At saddle points of Φeff such as L1 and L2, Φxx and Φyy have oppositesigns, so these Lagrange points violate condition (i) and are always unsta-ble. At a minimum of Φeff , such as L3, Φxx and Φyy are both positive, soconditions (i) and (ii) are satisfied. Condition (iii) is also satisfied becauseit can be rewritten in the form

(Φxx − Φyy)2 + 8Ω2b (Φxx + Φyy) + 16Ω4

b > 0, (3.128)

which is satisfied whenever both Φxx and Φyy are positive. Hence L3 isstable.

For future use we note that when Φxx and Φyy are positive, we mayassume Φxx < Φyy (since the x axis is the major axis of the potential) andwe have already assumed that α < β, so we can show from (3.124) that

α2 < Φxx < Φyy < β2. (3.129)

Also, when Ω2b → 0, α2 tends to Φxx, and β2 tends to Φyy.

The stability of the Lagrange points at maxima of Φeff , such as L4 andL5, depends on the details of the potential. For the potential ΦL of equation(3.103) we have

Φeff = 12v2

0 ln

(R2

c + x2 +y2

q2

)− 1

2Ω2b(x

2 + y2), (3.130)

so L4 and L5 occur at (0,±yL), where

yL ≡

√v20

Ω2b

− q2R2c , (3.131)

and we see that L4, L5 are present only if Ωb < v0/(qRc). Differentiatingthe effective potential again we find

Φxx(0, yL) = −Ω2b(1 − q2)

Φyy(0, yL) = −2Ω2b

[1 − q2

(ΩbRc

v0

)2 ].

(3.132)

Hence ΦxxΦyy is positive if the Lagrange points exist, and stability condition(i) of (3.127) is satisfied. Deciding whether the other stability conditions holdis tedious in the general case, but straightforward in the limit of negligiblecore radius, ΩbRc/v0 + 1 (which applies, for example, to Figure 3.14). ThenΦxx+Φyy +4Ω2

b = Ω2b(1+q2), so condition (ii) is satisfied. A straightforward

calculation shows that condition (iii) holds—and thus that L4 and L5 arestable—providing q2 >

√32 − 5 ( (0.810)2. For future use we note that for

small Rc, and to leading order in the ellipticity ε = 1 − q, we have

α2 = 2εΩ2b = −Φxx ; β2 = 2(1 − 2ε)Ω2

b = 2Ω2b + O(ε). (3.133)


Equations (3.125) describing the motion about a stable Lagrange pointshow that each orbit is a superposition of motion at frequencies α and βaround two ellipses. The shapes of these ellipses and the sense of the star’smotion on them are determined by equations (3.126). For example, in thecase of small Rc and ε, the α-ellipse around the point L4 is highly elongatedin the x- or ξ-direction (the tangential direction), while the β-ellipse hasY2 = −X2/

√2. The star therefore moves around the β-ellipse in the sense

opposite to that of the rotation of the potential. The β-ellipse is simply thefamiliar epicycle from §3.2.3, while the α-ellipse represents a slow tangentialwallowing in the weak non-axisymmetric component of ΦL.

Now consider motion about the central Lagrange point L3. From equa-tions (3.126) and the inequality (3.129), it follows that Y1/X1 > 0. Thusthe star’s motion around the α-ellipse has the same sense as the rotationof the potential; such an orbit is said to be prograde or direct. WhenΩ2

b + |Φxx|, it is straightforward to show from equations (3.124) and (3.126)that X1 ) Y1 and hence that this prograde motion runs almost parallelto the long axis of the potential—this is the long-axis orbit familiar to usfrom our study of non-rotating bars. Conversely the star moves around theβ-ellipse in the sense opposite to that of the rotation of the potential (the mo-tion is retrograde), and |X2| < |Y2|. When Ω2

b/|Φxx| is small, the β-ellipsegoes over into the short-axis orbit of a non-rotating potential. A generalprograde orbit around L3 is made up of motion on the β-ellipse around aguiding center that moves around the α-ellipse, and conversely for retrogradeorbits.

We now turn to a numerical study of orbits in rotating potentials thatare not confined to the vicinity of a Lagrange point. We adopt the logarith-mic potential (3.103) with q = 0.8, Rc = 0.03, v0 = 1, and Ωb = 1. Thischoice places the corotation annulus near RCR = 30Rc. The Jacobi integral(eq. 3.112) now plays the role that energy played in our similar investigationof orbits in non-rotating potentials, and by a slight abuse of language weshall refer to its value EJ as the “energy”. At radii R ∼< Rc the two impor-tant sequences of stable closed orbits in the non-rotating case are the long-and the short-axis orbits. Figure 3.15 confirms the prediction of our analytictreatment that in the presence of rotation these become oval in shape. Or-bits of both sequences are stable and therefore parent families of non-closedorbits.

Consider now the evolution of the orbital structure as we leave the coreregion. At an energy E1, similar to that at which loop orbits first appeared inthe non-rotating case, pairs of prograde orbits like those shown in Figure 3.16appear. Only one member of the pair is stable. When it first appears,the stable orbit is highly elongated parallel to the short axis, but as theenergy is increased it becomes more round. Eventually the decrease in theelongation of this orbit with increasing energy is reversed, the orbit againbecomes highly elongated parallel to the short axis and finally disappears


Figure 3.15 In the near-harmonic core of a rotating potential, the closed orbits areelongated ellipses. Stars on the orbits shown as full curves circulate about the center inthe same sense as the potential’s figure rotates. On the dashed orbits, stars circulate inthe opposite sense. The x axis is the long axis of the potential.

Figure 3.16 Closed orbits at two energies higher than those shown in Figure 3.15. Justoutside the potential’s near-harmonic core there are at each energy two prograde closedorbits aligned parallel to the potential’s short axis. One of these orbits (the less elongated)is stable, while the other is unstable.


Figure 3.17 Near the energy at which the orbit pairs shown in Figure 3.16 appear, theclosed long-axis orbits develop ears. Panel (a) shows orbits at energies just below andabove this transition. Panel (b) shows the evolution of the closed long-axis orbits athigher energies. Notice that in panel (a) the x- and y-scales are different. The smallestorbit in panel (b) is the larger of the two orbits in panel (a).

along with its unstable companion orbit at an energy E2.10 In the notationof Contopoulos & Papayannopoulos (1980) these stable orbits are said tobelong to the sequence x2, while their unstable companions are of thesequence x3.

The sequence of long-axis orbits (often called the sequence x1) suffersa significant transition near E2. On the low-energy side of the transitionthe long-axis orbits are extremely elongated and lens shaped (smaller orbitin Figure 3.17a). On the high-energy side the orbits are self-intersecting(larger orbit in Figure 3.17a). As the energy continues to increase, the or-bit’s ears become first more prominent and then less prominent, vanishingto form a cusped orbit (Figure 3.17b). At still higher energies the orbitsbecome approximately elliptical (largest orbit in Figure 3.17b), first growingrounder and then adopt progressively more complex shapes as they approach

10 In the theory of weak bars, the energies E1 and E2 at which these prograde or-bits appear and disappear are associated with the first and second inner Lindblad radii,respectively (eq. 3.150).


Figure 3.18 A plot of the Jacobi constant EJ of closed orbits in ΦL(q = 0.8, Rc =0.03,Ωb = 1) against the value of y at which the orbit cuts the potential’s short axis. Thedotted curve shows the relation Φeff (0, y) = EJ. The families of orbits x1–x4 are marked.

the corotation region in which the Lagrange points L1, L2, L4, and L5 arelocated.

In the vicinity of the corotation annulus, there are important sequencesof closed orbits on which stars move around one of the Lagrange points L4

or L5, rather than about the center.Essentially all closed orbits that carry stars well outside the corotation

region are nearly circular. In fact, the potential’s figure spins much morerapidly than these stars circulate on their orbits, so the non-axisymmetricforces on such stars tend to be averaged out. One finds that at large radiiprograde orbits tend to align with the bar, while retrograde orbits alignperpendicular to the bar.

These results are summarized in Figure 3.18. In this figure we plotagainst the value of EJ for each closed orbit the distance y at which it crossesthe short axis of the potential. Each sequence of closed orbits generates acontinuous curve in this diagram known as the characteristic curve of thatsequence.

The stable closed orbits we have described are all associated with sub-stantial families of non-closed orbits. Figure 3.19 shows two of these. Asin the non-rotating case, a star on one of these non-closed orbits may beconsidered to be executing stable oscillations about one of the fundamentalclosed orbits. In potentials of the form (3.103) essentially all orbits belongto one of these families. This is not always true, however, as we explain in§3.7.

It is important to distinguish between orbits that enhance the elongationof the potential and those that oppose it. The overall mass distribution of a


Figure 3.19 Two non-closed orbits of a common energy in the rotating potential ΦL.

galaxy must be elongated in the same sense as the potential, which suggeststhat most stars are on orbits on which they spend the majority of their timenearer to the potential’s long axis than to its short axis. Interior to thecorotation radius, the only orbits that satisfy this criterion are orbits of thefamily parented by the long-axis orbits, which therefore must be the mostheavily populated orbits in any bar that is confined by its own gravity. Theshapes of these orbits range from butterfly-like at radii comparable to thecore radius Rc, to nearly rectangular between Rc and the inner Lindbladradius (see below), to oval between this radius and corotation.

To an observer in an inertial frame of reference, stars on orbits belongingto the long-axis family circulate about the center of the potential in the samesense as the potential rotates. One part of the circulation seen by such anobserver is due to the rotation of the frame of reference in which the potentialis static. A second component of circulation is due to the mean streamingmotion of such stars when referred to the rotating frame of the potential.Both components of circulation diminish towards zero if the angular velocityof the potential is reduced to zero. Near corotation the dominant componentarises from the rotation of the frame of reference of the potential, while atsmall radii the more important component is the mean streaming motion ofthe stars through the rotating frame of reference.

3.3.3 Weak bars

Before we leave the subject of orbits in planar non-axisymmetric potentials,we derive an analytic description of loop orbits in weak bars.

(a) Lindblad resonances We assume that the figure of the potentialrotates at some steady pattern speed Ωb, and we seek to represent a generalloop orbit as a superposition of the circular motion of a guiding center andsmall oscillations around this guiding center. Hence our treatment of orbits


in weak non-axisymmetric potentials will be closely related to the epicycletheory of nearly circular orbits in an axisymmetric potential (§3.2.3).

Let (R,ϕ) be polar coordinates in the frame that rotates with the po-tential, such that the line ϕ = 0 coincides with the long axis of the potential.Then the Lagrangian is

L = 12 R2 + 1

2 [R(ϕ+ Ωb)]2 − Φ(R,ϕ), (3.134)

so the equations of motion are

R = R(ϕ+ Ωb)2 −∂Φ

∂R, (3.135a)

d

dt[R2(ϕ+ Ωb)] = −

∂Φ

∂ϕ. (3.135b)

Since we assume that the bar is weak, we may write

Φ(R,ϕ) = Φ0(R) + Φ1(R,ϕ), (3.136)

where |Φ1/Φ0| + 1. We divide R and ϕ into zeroth- and first-order parts

R(t) = R0 + R1(t) ; ϕ(t) = ϕ0(t) + ϕ1(t) (3.137)

by substituting these expressions into equation (3.135) and requiring thatthe zeroth-order terms should sum to zero. Thus

R0 (ϕ0 + Ωb)2 =

(dΦ0

dR

)

R0

and ϕ0 = constant. (3.138)

This is the usual equation for centrifugal equilibrium at R0. If we defineΩ0 ≡ Ω(R0), where

Ω(R) ≡ ±√

1

R

dΦ0

dR(3.139)

is the circular frequency at R in the potential Φ0, equation (3.138) for theangular speed of the guiding center (R0,ϕ0) becomes

ϕ0 = Ω0 − Ωb, (3.140)

where Ω0 > 0 for prograde orbits and Ω0 < 0 for retrograde ones. We choosethe origin of time such that

ϕ0(t) = (Ω0 − Ωb)t. (3.141)

The first-order terms in the equations of motion (3.135) now yield

R1 +

(d2Φ0

dR2− Ω2

)

R0

R1 − 2R0Ω0ϕ1 = −(∂Φ1

∂R

)

R0

, (3.142a)


ϕ1 + 2Ω0R1

R0= −

1

R20

(∂Φ1

∂ϕ

)

R0

. (3.142b)

To proceed further we must choose a specific form of Φ1; we set

Φ1(R,ϕ) = Φb(R) cos(mϕ), (3.143)

where m is a positive integer, since any potential that is an even function ofϕ can be expanded as a sum of terms of this form. In practice we are mostlyconcerned with the case m = 2 since the potential is then barred. If ϕ = 0is to coincide with the long axis of the potential, we must have Φb < 0.

So far we have assumed only that the angular velocity ϕ1 is small, notthat ϕ1 is itself small. Allowing for large excursions in ϕ1 will be importantwhen we consider what happens at resonances in part (b) of this section, butfor the moment we assume that ϕ1 + 1 and hence that ϕ(t) always remainsclose to (Ω0 − Ωb)t. With this assumption we may replace ϕ by ϕ0 in theexpressions for ∂Φ1/∂R and ∂Φ1/∂ϕ to yield

R1 +

(d2Φ0

dR2− Ω2

)

R0

R1 − 2R0Ω0ϕ1 = −(

dΦb

dR

)

R0

cos [m(Ω0 − Ωb)t] ,

(3.144a)

ϕ1 + 2Ω0R1

R0=

mΦb(R0)

R20

sin [m(Ω0 − Ωb)t] . (3.144b)

Integrating the second of these equations, we obtain

ϕ1 = −2Ω0R1

R0−

Φb(R0)

R20(Ω0 − Ωb)

cos [m(Ω0 − Ωb)t] + constant. (3.145)

We now eliminate ϕ1 from equation (3.144a) to find

R1 + κ20R1 = −

[dΦb

dR+

2ΩΦb

R(Ω − Ωb)

]

R0

cos [m(Ω0 − Ωb)t] + constant,

(3.146a)where

κ20 ≡

(d2Φ0

dR2+ 3Ω2

)

R0

=

(R

dΩ2

dR+ 4Ω2

)

R0

(3.146b)

is the usual epicycle frequency (eq. 3.80). The constant in equation (3.146a)is unimportant since it can be absorbed by a shift R1 → R1 + constant .

Equation (3.146a) is the equation of motion of a harmonic oscillator ofnatural frequency κ0 that is driven at frequency m(Ω0 − Ωb). The generalsolution to this equation is

R1(t) = C1 cos(κ0t + α) −[dΦb

dR+

2ΩΦb

R(Ω − Ωb)

]

R0

cos [m(Ω0 − Ωb)t]

∆,

(3.147a)


where C1 and α are arbitrary constants, and

∆ ≡ κ20 − m2(Ω0 − Ωb)2. (3.147b)

If we use equation (3.141) to eliminate t from equation (3.147a), we find

R1(ϕ0) = C1 cos

(κ0ϕ0

Ω0 − Ωb+ α

)+ C2 cos(mϕ0), (3.148a)

where

C2 ≡ −1

∆

[dΦb

dR+

2ΩΦb

R(Ω − Ωb)

]

R0

. (3.148b)

If C1 = 0, R1(ϕ0) becomes periodic in ϕ0 with period 2π/m, and thus theorbit that corresponds to C1 = 0 is a closed loop orbit. The orbits withC1 /= 0 are the non-closed loop orbits that are parented by this closed looporbit. In the following we set C1 = 0 so that we may study the closed looporbits.

The right side of equation (3.148a) for R1 becomes singular at a numberof values of R0:(i) Corotation resonance. When

Ω0 = Ωb, (3.149)

ϕ0 = 0, and the guiding center corotates with the potential.(ii) Lindblad resonances. When

m(Ω0 − Ωb) = ±κ0, (3.150)

the star encounters successive crests of the potential at a frequency thatcoincides with the frequency of its natural radial oscillations. Radiiat which such resonances occur are called Lindblad radii after theSwedish astronomer Bertil Lindblad (1895–1965). The plus sign in equa-tion (3.150) corresponds to the case in which the star overtakes thepotential, encountering its crests at the resonant frequency κ0; this iscalled an inner Lindblad resonance. In the case of a minus sign, thecrests of the potential sweep by the more slowly rotating star, and R0

is said to be the radius of the outer Lindblad resonance.

There is a simple connection between these two types of resonance. A circularorbit has two natural frequencies. If the star is displaced radially, it oscillatesat the epicycle frequency κ0. On the other hand, if the star is displacedazimuthally in such a way that it is still on a circular orbit, then it willcontinue on a circular orbit displaced from the original one. Thus the staris neutrally stable to displacements of this form; in other words, its naturalazimuthal frequency is zero. The two types of resonance arise when the


Figure 3.20 The full curves are the characteristic curves of the prograde (upper) andretrograde (lower) circular orbits in the isochrone potential (2.47) when a rotating frameof reference is employed. The dashed curve shows the relation Φeff (0, y) = EJ, and the dotsmark the positions of the Lindblad resonances when a small non-axisymmetric componentis added to the potential.

forcing frequency seen by the star, m(Ω0 − Ωb), equals one of the naturalfrequencies ±κ0 and 0.

Figure 6.11 shows plots of Ω, Ω + 12κ and Ω− 1

2κ for two circular-speedcurves typical of galaxies. A galaxy may have 0, 1, 2, or more Lindbladresonances. The Lindblad and corotation resonances play a central role inthe study of bars and spiral structure, and we shall encounter them againChapter 6.

From equation (3.148a) it follows that for m = 2 the closed loop orbitis aligned with the bar whenever C2 > 0, and is aligned perpendicular tothe bar when C2 < 0. When R0 passes through a Lindblad or corotationresonance, the sign of C2, and therefore the orientation of the closed looporbits, changes.

It is interesting to relate the results of this analytic treatment to theorbital structure of a strong bar that we obtained numerically in the lastsubsection. In this connection it is helpful to compare Figure 3.18, whichshows data for a barred potential, with Figure 3.20, which describes orbitsin an axisymmetric potential viewed from a rotating frame. The full curvesin Figure 3.20 show the relationship between the Jacobi constant EJ and theradii of prograde and retrograde circular orbits in the isochrone potential(2.47). As in Figure 3.18, the dotted curve marks the relation Φeff(0, y) =EJ. There are no orbits in the region to the right of this curve, whichtouches the curve of the prograde circular orbits at the corotation resonance,marked CR in the figure. If in the given frame we were to add a small


non-axisymmetric component to the potential, the orbits marked by largedots would lie at the Lindblad resonances (from right to left, the first andsecond inner Lindblad resonances and the outer Lindblad resonance markedOL). We call the radius of the first inner Lindblad resonance11 RIL1, andsimilarly RIL2, ROL, and RCR for the radii of the other Lindblad resonancesand of corotation. Equations (3.148) with C1 = 0 describe nearly circularorbits in a weakly barred potential. Comparing Figure 3.20 with Figure 3.18,we see that nearly circular retrograde orbits belong to the family x4. Nearlycircular prograde orbits belong to different families depending on their radius.Orbits that lie within RIL1 belong to the family x1. In the radius rangeRIL1 < R < RIL2 the families x2 and x3 exist and contain orbits that aremore circular than those of x1. We identify the orbits described by (3.148)with orbits of the family x2 as indicated in Figure 3.20, since the family x3

is unstable. In the radius range R > RIL2, equations (3.148) with C1 = 0describe orbits of the family x1. Thus equations (3.148) describe only thefamilies of orbits in a barred potential that are parented by a nearly circularorbit. However, when the non-axisymmetric component of the potential isvery weak, most of phase space is occupied by such orbits. As the non-axisymmetry of the potential becomes stronger, families of orbits that arenot described by equations (3.148) become more important.

(b) Orbits trapped at resonance When R0 approaches the radius of ei-ther a Lindblad resonance or the corotation resonance, the value of R1 thatis predicted by equations (3.148) becomes large, and our linearized treat-ment of the equations of motion breaks down. However, one can modify theanalysis to cope with these resonances. We now discuss the necessary modi-fications for the case of the corotation resonance. The case of the Lindbladresonances is described in Goldreich & Tremaine (1981).

The appropriate modification is suggested by our investigation of orbitsnear the Lagrange points L4 and L5 in the potential ΦL (eq. 3.103), whenthe radius is large compared to the core radius Rc and the ellipticity ε = 1−qapproaches zero. In this limit the non-axisymmetric part of the potential isproportional to ε, so we have an example of a weak bar when ε → 0. Wefound in §3.3.2 that a star’s orbit was a superposition of motion at frequenciesα and β around two ellipses. In the limit ε→ 0, the β-ellipse represents thefamiliar epicyclic motion and will not be considered further. The α-ellipse ishighly elongated in the azimuthal direction, with axis ratio |Y1/X1| =

√2ε,

and its frequency is small, α =√

2εΩb.These results suggest we consider the approximation in which R1, R1,

and ϕ1 are small but ϕ1 is not. Specifically, if the bar strength Φ1 is pro-portional to some small parameter that we may call ε, we assume that ϕ1

is of order unity, R1 is of order ε1/2, and the time derivative of any quan-tity is smaller than that quantity by of order ε1/2. Let us place the guiding

11 Also called the inner inner Lindblad resonance.


Box 3.3: The donkey effect

An orbiting particle that is subject to weak tangential forces can exhibitunusual behavior. To illustrate this, suppose that the particle has massm and is in a circular orbit of radius r, with angular speed φ = Ω(r)given by rΩ2(r) = dΦ/dr (eq. 3.7a). Now let us imagine that the particleexperiences a small force, F , directed parallel to its velocity vector. Sincethe force is small, the particle remains on a circular orbit, which slowlychanges in radius in response to the force. To determine the rate ofchange of radius, we note that the angular momentum is L(r) = mr2Ωand the torque is N = rF = L. Thus

r =dr

dLL =

F/m

2Ω + r dΩ/dr= −

F

2mB; (1)

where B(r) = −Ω− 12r dΩ/dr is the function defined by equation (3.83).

The azimuthal angle accelerates at a rate

φ =dΩ

drr = −

2Ar

r, (2)

where A(r) = − 12rdΩ/dr (eq. 3.83). Combining these results,

rφ =A

mBF. (3)

This acceleration in azimuthal angle can be contrasted to the accelerationof a free particle under the same force, x = F/m. Thus the particlebehaves as if it had an inertial mass mB/A, which is negative whenever

−2 <d ln Ω

d ln R< 0. (4)

Almost all galactic potentials satisfy this inequality. Thus the orbitingparticle behaves as if it had negative inertial mass, accelerating in theopposite direction to the applied force.

There are many examples of this phenomenon in galactic dynamics,which has come to be called the donkey effect: to quote Lynden–Bell& Kalnajs (1972), who introduced the term, “in azimuth stars behavelike donkeys, slowing down when pulled forwards and speeding up whenheld back.”

The simplest example of the donkey effect is an Earth satellite sub-jected to atmospheric drag: the satellite sinks gradually into a lower orbitwith a larger circular speed and shorter orbital period, so the drag forcespeeds up the angular passage of the satellite across the sky.


center at L5 [Ω(R0) = Ωb; ϕ0 = π/2] and use equation (3.146b) to write theequations of motion (3.142) as

R1 +(κ2

0 − 4Ω20

)R1 − 2R0Ω0ϕ1 = −

∂Φ1

∂R, (3.151a)

ϕ1 + 2Ω0R1

R0= −

1

R20

∂Φ1

∂ϕ. (3.151b)

According to our ordering, the terms on the left side of the first line are oforder ε3/2, ε1/2, and ε1/2, respectively, while the term on the right side isof order ε. All the terms on the second line are of order ε. Hence we maysimplify the first line by keeping only the terms of order ε1/2:

(κ2

0 − 4Ω20

)R1 − 2R0Ω0ϕ1 = 0. (3.152)

Substituting equation (3.152) into equation (3.151b) to eliminate R1, we find

ϕ1

(κ2

0

κ20 − 4Ω2

0

)= −

1

R20

∂Φ1

∂ϕ

∣∣∣∣(R0,ϕ0+ϕ1)

. (3.153)

Substituting from equation (3.143) for Φ1 we obtain with m = 2

ϕ1 = −2Φb

R20

(4Ω2

0 − κ20

κ20

)sin [2(ϕ0 + ϕ1)] . (3.154)

By inequality (3.82) we have that 4Ω20 > κ2

0. Also we have Φb < 0 andϕ0 = π/2, and so equation (3.154) becomes

d2ψ

dt2= −p2 sinψ, (3.155a)

where

ψ ≡ 2ϕ1 and p2 ≡4

R20

|Φb(R0)|4Ω2

0 − κ20

κ20

. (3.155b)

Equation (3.155a) is simply the equation of a pendulum. Notice thatthe singularity in R1 that appeared at corotation in equations (3.148) hasdisappeared in this more careful analysis. Notice also the interesting fact thatthe stable equilibrium point of the pendulum, ϕ1 = 0, is at the maximum,not the minimum, of the potential Φ1 (Box 3.3). If the integral of motion

Ep = 12 ψ

2 − p2 cosψ (3.156)

is less than p2, the star oscillates slowly or librates about the Lagrangepoint, whereas if Ep > p2, the star is not trapped by the bar but circulates


about the center of the galaxy. For small-amplitude librations, the librationfrequency is p, consistent with our assumption that the oscillation frequencyis of order ε1/2 when Φb is of order ε. Large-amplitude librations of this kindmay account for the rings of material often seen in barred galaxies (page 538).

We may obtain the shape of the orbit from equation (3.152) by usingequation (3.156) to eliminate ϕ1 = 1

2 ψ:

R1 = −2R0Ω0ϕ1

4Ω20 − κ2

0

= ±21/2R0Ω0

4Ω20 − κ2

0

√Ep + p2 cos(2ϕ1). (3.157)

We leave as an exercise the demonstration that when Ep ) p2, equation(3.157) describes the same orbits as are obtained from (3.148a) with C1 = 0and Ω /= Ωb.

The analysis of this subsection complements the analysis of motion nearthe Lagrange points in §3.3.2. The earlier analysis is valid for small oscil-lations around a Lagrange point of an arbitrary two-dimensional rotatingpotential, while the present analysis is valid for excursions of any amplitudein azimuth around the Lagrange points L4 and L5, but only if the potentialis nearly axisymmetric.

3.4 Numerical orbit integration

In most stellar systems, orbits cannot be computed analytically, so effectivealgorithms for numerical orbit integration are among the most importanttools for stellar dynamics. The orbit-integration problems we have to addressvary in complexity from following a single particle in a given, smooth galacticpotential, to tens of thousands of interacting stars in a globular cluster, tobillions of dark-matter particles in a simulation of cosmological clustering.In each of these cases, the dynamics is that of a Hamiltonian system: with Nparticles there are 3N coordinates that form the components of a vector q(t),and 3N components of the corresponding momentum p(t). These vectorssatisfy Hamilton’s equations,

q =∂H

∂p; p = −

∂H

∂q, (3.158)

which can be written asdw

dt= f(w, t), (3.159)

where w ≡ (q,p) and f ≡ (∂H/∂p,−∂H/∂q). For simplicity we shall as-sume in this section that the Hamiltonian has the form H(q,p) = 1

2p2+Φ(q),although many of our results can be applied to more general Hamiltonians.Given a phase-space position w at time t, and a timestep h, we require an

3.4 Numerical orbit integration 197

algorithm—an integrator—that generates a new position w′ that approxi-mates the true position at time t′ = t+h. Formally, the problem to be solvedis the same whether we are following the motion of a single star in a givenpotential, or the motion of 1010 particles under their mutual gravitationalattraction.

The best integrator to use for a given problem is determined by severalfactors:• How smooth is the potential? The exploration of orbits in an analytic

model of a galaxy potential places fewer demands on the integrator thanfollowing orbits in an open cluster, where the stars are buffeted by closeencounters with their neighbors.

• How cheaply can we evaluate the gravitational field? At one extreme,evaluating the field by direct summation in simulations of globular clus-ter with ∼> 105 particles requires O(N 2) operations, and thus is quiteexpensive compared to the O(N) cost of orbit integrations. At theother extreme, tree codes, spherical-harmonic expansions, or particle-mesh codes require O(N ln N) operations and thus are comparable incost to the integration. So the integrator used in an N-body simulationof a star cluster should make the best possible use of each expensive butaccurate force evaluation, while in a cosmological simulation it is betterto use a simple integrator and evaluate the field more frequently.

• How much memory is available? The most accurate integrators use theposition and velocity of a particle at several previous timesteps to helppredict its future position. When simulating a star cluster, the numberof particles is small enough (N ∼< 105) that plenty of memory should beavailable to store this information. In a simulation of galaxy dynamicsor a cosmological simulation, however, it is important to use as manyparticles as possible, so memory is an important constraint. Thus forsuch simulations the optimal integrator predicts the future phase-spaceposition using only the current position and gravitational field.

• How long will the integration run? The answer can range from a fewcrossing times for the simulation of a galaxy merger to 105 crossingtimes in the core of a globular cluster. Long integrations require thatthe integrator does not introduce any systematic drift in the energy orother integrals of motion.

Useful references include Press et al. (1986), Hairer, Lubich, & Wanner(2002), and Aarseth (2003).

3.4.1 Symplectic integrators

(a) Modified Euler integrator Let us replace the original HamiltonianH(q,p) = 1

2p2 + Φ(q) by the time-dependent Hamiltonian

Hh(q,p, t) = 12p2 + Φ(q)δh(t), where δh(t) ≡ h

∞∑

j=−∞

δ(t − jh) (3.160)


is an infinite series of delta functions (Appendix C.1). Averaged over a timeinterval that is long compared to h, 〈Hh〉 ( H , so the trajectories determinedby Hh should approach those determined by H as h → 0.

Hamilton’s equations for Hh read

q =∂Hh

∂p= p ; p = −

∂Hh

∂q= −∇Φ(q)δh(t). (3.161)

We now integrate these equations from t = −ε to t = h−ε, where 0 < ε + h.Let the system have coordinates (q,p) at time t = −ε, and first ask forits coordinates (q,p) at t = +ε. During this short interval q changes by anegligible amount, and p suffers a kick governed by the second of equations(3.161). Integrating this equation from t = −ε to ε is trivial since q is fixed,and we find

q = q ; p = p − h∇Φ(q); (3.162a)

this is called a kick step because the momentum changes but the positiondoes not. Next, between t = +ε and t = h− ε, the value of the delta functionis zero, so the system has constant momentum, and Hamilton’s equationsyield for the coordinates at t = h − ε

q′ = q + hp ; p′ = p; (3.162b)

this is called a drift step because the position changes but the momentumdoes not. Combining these results, we find that over a timestep h startingat t = −ε the Hamiltonian Hh generates a map (q,p) → (q′,p′) given by

p′ = p− h∇Φ(q) ; q′ = q + hp′. (3.163a)

Similarly, starting at t = +ε yields the map

q′ = q + hp ; p′ = p− h∇Φ(q′). (3.163b)

These maps define the “kick-drift” or “drift-kick” modified-Euler inte-grator. The performance of this integrator in a simple galactic potential isshown in Figure 3.21.

The map induced by any Hamiltonian is a canonical or symplectic map(page 803), so it can be derived from a generating function. It is simpleto confirm using equations (D.93) that the generating function S(q,p′) =q ·p′+ 1

2hp′2+hΦ(q) yields the kick-drift modified-Euler integrator (3.163a).According to the modified-Euler integrator, the position after timestep

h isq′ = q + hp′ = q + hp− h2

∇Φ(q), (3.164)

while the exact result may be written as a Taylor series,

q′ = q + hq(t = 0) + 12h2q(t = 0) + O(h3) = q + hp − 1

2h2∇Φ(q) + O(h3).

(3.165)


Figure 3.21 Fractional energy error as a function of time for several integrators, followinga particle orbiting in the logarithmic potential Φ(r) = ln r. The orbit is moderately ec-centric (apocenter twice as big as pericenter). The timesteps are fixed, and chosen so thatthere are 300 evaluations of the force or its derivatives per period for all of the integrators.The integrators shown are kick-drift modified-Euler (3.163a), leapfrog (3.166a), Runge–Kutta (3.168), and Hermite (3.172a–d). Note that (i) over moderate time intervals, theerrors are smallest for the fourth-order integrators (Runge–Kutta and Hermite), interme-diate for the second-order integrator (leapfrog), and largest for the first-order integrator(modified-Euler); (ii) the energy error of the symplectic integrators does not grow withtime.

The error after a single step of the modified-Euler integrator is seen to beO(h2), so it is said to be a first-order integrator.

Since the mappings (3.163) are derived from the Hamiltonian (3.160),they are symplectic, so either flavor of the modified-Euler integrator is asymplectic integrator. Symplectic integrators conserve phase-space vol-ume and Poincare invariants (Appendix D.4.2). Consequently, if the inte-grator is used to advance a series of particles that initially lie on a closedcurve in the (qi, pi) phase plane, the curve onto which it moves the parti-cles has the same line integral

∮pidqi around it as the original curve. This

conservation property turns out to constrain the allowed motions in phasespace so strongly that the usual tendency of numerical orbit integrations todrift in energy (sometimes called numerical dissipation, even through theenergy can either decay or grow) is absent in symplectic integrators (Hairer,


Lubich, & Wanner 2002).

Leapfrog integrator By alternating kick and drift steps in more elab-orate sequences, we can construct higher-order integrators (Yoshida 1993);these are automatically symplectic since they are the composition of maps(the kick and drift steps) that are symplectic. The simplest and most widelyused of these is the leapfrog or Verlet integrator in which we drift for 1

2h,kick for h and then drift for 1

2h:

q1/2 = q + 12hp ; p′ = p− h∇Φ(q1/2) ; q′ = q + 1

2hp′. (3.166a)

This algorithm is sometimes called “drift-kick-drift” leapfrog; an equallygood form is “kick-drift-kick” leapfrog:

p1/2 = p− 12h∇Φ(q) ; q′ = q + hp1/2 ; p′ = p − 1

2h∇Φ(q′). (3.166b)

Drift-kick-drift leapfrog can also be derived by considering motion in theHamiltonian (3.160) from t = − 1

2h to t = 12h.

The leapfrog integrator has many appealing features: (i) In contrastto the modified-Euler integrator, it is second- rather than first-order accu-rate, in that the error in phase-space position after a single timestep is O(h3)(Problem 3.26). (ii) Leapfrog is time reversible in the sense that if leapfrogadvances the system from (q,p) to (q′,p′) in a given time, it will also ad-vance it from (q′,−p′) to (q,−p) in the same time. Time-reversibility isa constraint on the phase-space flow that, like symplecticity, suppresses nu-merical dissipation, since dissipation is not a time-reversible phenomenon(Roberts & Quispel 1992; Hairer, Lubich, & Wanner 2002). (iii) A sequenceof n leapfrog steps can be regarded as a drift step for 1

2h, then n kick-driftsteps of the modified-Euler integrator, then a drift step for − 1

2h; thus ifn ) 1 the leapfrog integrator requires negligibly more work than the samenumber of steps of the modified-Euler integrator. (iv) Leapfrog also needsno storage of previous timesteps, so is economical of memory.

Because of all these advantages, most codes for simulating collisionlessstellar systems use the leapfrog integrator. Time-reversible, symplectic inte-grators of fourth and higher orders, derived by combining multiple kick anddrift steps, are described in Problem 3.27 and Yoshida (1993).

One serious limitation of symplectic integrators is that they work wellonly with fixed timesteps, for the following reason. Consider an integratorwith fixed timestep h that maps phase-space coordinates w to w′ = W(w, h).The integrator is symplectic if the function W satisfies the symplectic con-dition (D.78), which involves the Jacobian matrix gαβ = ∂Wα/∂wβ. Nowsuppose that the timestep is varied, by choosing it to be some function h(w)

of location in phase space, so w′ = W[w, h(w)] ≡ W(w). The Jacobian

matrix of W is not equal to the Jacobian matrix of W, and in general willnot satisfy the symplectic condition; in words, a symplectic integrator withfixed timestep is generally no longer symplectic once the timestep is varied.


Fortunately, the geometric constraints on phase-space flow imposed bytime-reversibility are also strong, so the leapfrog integrator retains its goodbehavior if the timestep is adjusted in a time-reversible manner, even thoughthe resulting integrator is no longer symplectic. Here is one way to do this:suppose that the appropriate timestep h is given by some function τ(w) ofthe phase-space coordinates. Then we modify equations (3.166a) to

q1/2 = q + 12hp ; p1/2 = p − 1

2h∇Φ(q1/2),

t′ = t + 12 (h + h′),

p′ = p1/2 − 12h′

∇Φ(q1/2) ; q′ = q1/2 + 12h′p′.

(3.167)

Here h′ is determined from h by solving the equation u(h, h′) = τ(q1/2,p1/2),where τ(q,p) is the desired timestep at (q,p) and u(h, h′) is any symmetricfunction of h and h′ such that u(h, h) = h; for example, u(h, h′) = 1

2 (h + h′)or u(h, h′) = 2hh′/(h + h′).

3.4.2 Runge–Kutta and Bulirsch–Stoer integrators

To follow the motion of particles in a given smooth gravitational potentialΦ(q) for up to a few hundred crossing times, the fourth-order Runge–Kuttaintegrator provides reliable transportation. The algorithm is

k1 = hf(w, t) ; k2 = hf(w + 12k1, t + 1

2h),

k3 = hf(w + 12k2, t + 1

2h) ; k4 = hf(w + k3, t + h),

w′ = w + 16 (k1 + 2k2 + 2k3 + k4) ; t′ = t + h.

(3.168)The Runge–Kutta integrator is neither symplectic nor reversible, and it re-quires considerably more memory than the leapfrog integrator because mem-ory has to be allocated to k1, . . . ,k4. However, it is easy to use and providesfourth-order accuracy.

The Bulirsch–Stoer integrator is used for the same purposes as theRunge–Kutta integrator; although more complicated to code, it often sur-passes the Runge–Kutta integrator in performance. The idea behind thisintegrator is to estimate w(t + h) from w(t) using first one step of length h,then two steps of length h/2, then four steps of length h/4, etc., up to 2K

steps of length h/2K for some predetermined number K. Then one extrap-olates this sequence of results to the coordinates that would be obtained inthe limit K → ∞. Like the Runge–Kutta integrator, this integrator achievesspeed and accuracy at the cost of the memory required to hold intermediateresults. Like all high-order integrators, the Runge–Kutta and Bulirsch–Stoerintegrators work best when following motion in smooth gravitational fields.


3.4.3 Multistep predictor-corrector integrators

We now discuss more complex integrators that are widely used in simulationsof star clusters. We have a trajectory that has arrived at some phase-spaceposition w0 at time t0, and we wish to predict its position w1 at t1. The gen-eral idea is to assume that the trajectory w(t) is a polynomial function of timewpoly(t), called the interpolating polynomial. The interpolating polyno-mial is determined by fitting to some combination of the present positionw0, the past positions, w−1,w−2, . . . at times t−1, t−2, . . ., and the presentand past phase-space velocities, which are known through wj = f(wj , tj).There is no requirement that f is derived from Hamilton’s equations, so thesemethods can be applied to any first-order differential equations; on the otherhand they are not symplectic.

If the interpolating polynomial has order k, then the error after a smalltime interval h is given by the first term in the Taylor series for w(t) notrepresented in the polynomial, which is O(hk+1). Thus the order of theintegrator is k.12

The Adams–Bashforth multistep integrator takes wpoly to be theunique kth-order polynomial that passes through w0 at t0 and through thek points (t−k+1, w−k+1), . . . , (t0, w0).

Explicit formulae for the Adams–Bashforth integrators are easy to findby computer algebra; however, the formulae are too cumbersome to writehere except in the special case of equal timesteps, tj+1 − tj = h for all j.Then the first few Adams–Bashforth integrators are

w1 = w0 + h

w0 (k = 1)32w0 − 1

2w−1 (k = 2)2312w0 − 4

3w−1 + 512 w−2 (k = 3)

5524w0 − 59

24w−1 + 3724w−2 − 3

8w−3 (k = 4).

(3.169)

The case k = 1 is called Euler’s integrator.The Adams–Moulton integrator differs from Adams–Bashforth only

in that it computes the interpolating polynomial from the position w0 andthe phase-space velocities w−k+2, . . . , w1. For equal timesteps, the first fewAdams–Moulton integrators are

w1 = w0 + h

w1 (k = 1)12w1 + 1

2w0 (k = 2)512w1 + 2

3w0 − 112w−1 (k = 3)

38w1 + 19

24w0 − 524w−1 + 1

24w−2 (k = 4).

(3.170)

12 Unfortunately, the term “order” is used both for the highest power retained in theTaylor series for w(t), tk, and the dependence of the one-step error on the timestep, hk+1;fortunately, both orders are the same.


Since w1 is determined by the unknown phase-space position w1 throughw1 = f(w1, t1), equations (3.170) are nonlinear equations for w1 that mustbe solved iteratively. The Adams–Moulton integrator is therefore said to beimplicit, in contrast to Adams–Bashforth, which is explicit.

The strength of the Adams–Moulton integrator is that it determinesw1 by interpolating the phase-space velocities, rather than by extrapolatingthem, as with Adams–Bashforth. This feature makes it a more reliable andstable integrator; the cost is that a nonlinear equation must be solved atevery timestep.

In practice the Adams–Bashforth and Adams–Moulton integrators areused together as a predictor-corrector integrator. Adams–Bashforth isused to generate a preliminary value w1 (the prediction or P step), whichis then used to generate w1 = f(w1, t1) (the evaluation or E step), which isused in the Adams–Moulton integrator (the corrector or C step). This three-step sequence is abbreviated as PEC. In principle one can then iterate theAdams–Moulton integrator to convergence through the sequence PECEC· · ·;however, this is not cost-effective, since the Adams–Moulton formula, evenif solved exactly, is only an approximate representation of the differentialequation we are trying to solve. Thus one usually stops with PEC (stopthe iteration after evaluating w1 twice) or PECE (stop the iteration afterevaluating w1 twice).

When these methods are used in orbit integrations, the equations ofmotion usually have the form x = v, v = −∇Φ(x, t). In this case it is bestto apply the integrator only to the second equation, and to generate thenew position x1 by analytically integrating the interpolating polynomial forv(t)—this gives a formula for x1 that is more accurate by one power of h.

Analytic estimates (Makino 1991) suggest that the one-step error in theAdams–Bashforth–Moulton predictor-corrector integrator is smaller than theerror in the Adams–Bashforth integrator by a factor of 5 for k = 2, 9 fork = 3, 13 for k = 4, etc. These analytic results, or the difference between thepredicted and corrected values of w1, can be used to determine the longesttimestep that is compatible with a prescribed target accuracy—see §3.4.5.

Because multistep integrators require information from the present timeand k − 1 past times, a separate startup integrator, such as Runge–Kutta,must be used to generate the first k− 1 timesteps. Multistep integrators arenot economical of memory because they store the coefficients of the entireinterpolating polynomial rather than just the present phase-space position.

3.4.4 Multivalue integrators

By differentiating the equations of motion w = f(w) with respect to time, weobtain an expression for w, which involves second derivatives of the poten-tial, ∂2Φ/∂qi∂qj. If our Poisson solver delivers reliable values for these secondderivatives, it can be advantageous to use w or even higher time derivativesof w to determine the interpolating polynomial wpoly(t). Algorithms that


employ the second and higher derivatives of w are called multivalue inte-grators.

In the simplest case we set wpoly(t) to the kth-order polynomial thatmatches w and its first k time derivatives at t0; this provides k+1 constraintsfor the k + 1 polynomial coefficients and corresponds to predicting w(t) byits Taylor series expansion around t0. A more satisfactory approach is todetermine wpoly(t) from the values taken by w, w, w, etc., at both t0 andt1. Specifically, for even k only, we make wpoly(t) the kth-order polynomialthat matches w at t0 and its first 1

2k time derivatives at both t0 and t1—onceagain this provides 1 + 2× 1

2k = k + 1 constraints and hence determines thek + 1 coefficients of the interpolating polynomial. The first few integratorsof this type are

w1 = w0 +

12h(w0 + w1) (k = 2)12h(w0 + w1) + 1

12h2(w0 − w1) (k = 4)12h(w0 + w1) + 1

10h2(w0 − w1)

+ 1120h3(

...w0 +

...w1) (k = 6).

(3.171)

Like the Adams–Moulton integrator, all of these integrators are implicit, andin fact the first of these formulae is the same as the second-order Adams–Moulton integrator in equation (3.170). Because these integrators employinformation from only t0 and t1, there are two significant simplificationscompared to multistep integrators: no separate startup procedure is needed,and the formulae look the same even if the timestep is variable.

Multivalue integrators are sometimes called Obreshkov (or Obrechkoff)or Hermite integrators, the latter name arising because they are based onHermite interpolation, which finds a polynomial that fits specified values ofa function and its derivatives (Butcher 1987).

Makino & Aarseth (1992) and Makino (2001) recommend a fourth-ordermultivalue predictor-corrector integrator for star-cluster simulations. Theirpredictor is a single-step, second-order multivalue integrator, that is, a Tay-lor series including terms of order h2. Writing dv/dt = g, where g is thegravitational field, their predicted velocity is

vp,1 = v0 + hg0 + 12h2g0. (3.172a)

The predicted position is obtained by analytically integrating the interpolat-ing polynomial for v,

xp,1 = x0 + hv0 + 12h2g0 + 1

6h3g0. (3.172b)

The predicted position and velocity are used to compute the gravitationalfield and its time derivative at time t1, g1 and g1. These are used to correctthe velocity using the fourth-order formula (3.171):

v1 = v0 + 12h(g0 + g1) + 1

12h2(g0 − g1); (3.172c)


in words, v1 is determined by the fourth-order interpolating polynomialvpoly(t) that satisfies the five constraints vpoly(t0) = v0, vpoly(ti) = gi,vpoly(ti) = gi for i = 0, 1.

To compute the corrected position, the most accurate procedure is tointegrate analytically the interpolating polynomial for v, which yields:

x1 = x0 + hv0 + 120h2(7g0 + 3g1) + 1

60h3(3g0 − 2g1). (3.172d)

The performance of this integrator, often simply called the Hermite integra-tor, is illustrated in Figure 3.21.

3.4.5 Adaptive timesteps

Except for the simplest problems, any integrator should have an adap-tive timestep, that is, an automatic procedure that continually adjuststhe timestep to achieve some target level of accuracy. Choosing the righttimestep is one of the most challenging tasks in designing a numerical in-tegration scheme. Many sophisticated procedures are described in publiclyavailable integration packages and numerical analysis textbooks. Here weoutline a simple approach.

Let us assume that our goal is that the error in w after some short time τshould be less than ε|w0|, where ε + 1 and w0 is some reference phase-spaceposition. We first move from w to w2 by taking two timesteps of lengthh + τ . Then we return to w and take one step of length 2h to reach w1.Suppose that the correct position after an interval 2h is w′, and that ourintegrator has order k. Then the errors in w1 and w2 may be written

w1 − w′ ( (2h)k+1E ; w2 − w′ ( 2hk+1E, (3.173)

where E is an unknown error vector. Subtracting these equations to eliminatew′, we find E ( (w1 − w2)/[2(2k − 1)hk+1]. Now if we advance for a timeτ , using n ≡ τ/h′ timesteps of length h′, the error will be

∆ = nh′k+1E = (w1 − w2)

τh′k

2(2k − 1)hk+1. (3.174)

Our goal that |∆| ∼< ε|w0| will be satisfied if

h′ < hmax ≡(

2(2k − 1)h

τ

ε|w0||w1 − w2|

)1/k

h. (3.175)

If we are using a predictor-corrector scheme, a similar analysis can beused to deduce hmax from the difference of the phase-space positions returnedby the predictor and the corrector, without repeating the entire predictor-corrector sequence.


3.4.6 Individual timesteps

The density in many stellar systems varies by several orders of magnitudebetween the center and the outer parts, and as a result the crossing time oforbits near the center is much smaller than the crossing time in the outerenvelope. For example, in a typical globular cluster the crossing time at thecenter is ∼< 1 Myr, while the crossing time near the tidal radius is ∼ 100 Myr.Consequently, the timestep that can be safely used to integrate the orbits ofstars is much smaller at the center than the edge. It is extremely inefficientto integrate all of the cluster stars with the shortest timestep needed for anystar, so integrators must allow individual timesteps for each star.

If the integrator employs an interpolating polynomial, the introductionof individual timesteps is in principle fairly straightforward. To advance agiven particle, one uses the most recent interpolating polynomials of all theother particles to predict their locations at whatever times the integratorrequires, and then evaluates the forces between the given particle and theother particles.

This procedure makes sense if the Poisson solver uses direct summation(§2.9.1). However, with other Poisson solvers there is a much more efficientapproach. Suppose, for example, that we are using a tree code (§2.9.2).Then before a single force can be evaluated, all particles have to be sortedinto a tree. Once that has been done, it is comparatively inexpensive toevaluate large numbers of forces; hence to minimize the computational workdone by the Poisson solver, it is important to evaluate the forces on manyparticles simultaneously. A block timestep scheme makes this possiblewhilst allowing different timesteps for different particles, by quantizing thetimesteps. We now describe how one version of this scheme works with theleapfrog integrator.

We assign each particle to one of K + 1 classes, such that particles inclass k are to be advanced with timestep hk ≡ 2kh for k = 0, 1, 2, . . . , K.Thus h is the shortest timestep (class 0) and 2Kh is the longest (class K).The Poisson solver is used to evaluate the gravitational field at the initialtime t0, and each particle is kicked by the impulse − 1

2hk∇Φ, correspondingto the first part of the kick-drift-kick leapfrog step (3.166b). In Figure 3.22the filled semicircles on the left edge of the diagram symbolize these kicks;they are larger at the top of the diagram to indicate that the strength of thekicks increases as 2k. Then every particle is drifted through time h, and thePoisson solver is used only to find the forces on the particles in class 0, sothese particles can be kicked by −h∇Φ, which is the sum of the kicks at theend of their first leapfrog step and the start of their second.

Next we drift all particles through h a second time, and use the Poissonsolver to find the forces on the particles in both class 0 and class 1. Theparticles of class 0 are kicked by −h∇Φ, and the particles of class 1 arekicked by −h1∇Φ = −2h∇Φ. After an interval 3h the particles in class 0are kicked, after 4h the particles in classes 0, 1 and 2 are kicked, etc. This


Figure 3.22 Schematic of the block timestep scheme, for a system with 5 classes ofparticles, having timestep h (class k = 0), 2h, . . . , 16h (class k = K = 4). The particlesare integrated for a total time of 16h. Each filled circle or half-circle marks the timeat which particles in a given class are kicked. Each vertical bar marks a time at whichparticles in a class are paused in their drift step, without being kicked, in order to calculatetheir contribution to the kick given to particles in lower classes. The kicks at the startand end of the integration, t = 0 and t = 16h, are half as strong as the other kicks, andso are denoted by half-circles.

process continues until all particles are due for a kick, after a time hK = 2Kh.The final kick for particles in class k is − 1

2hk∇Φ, which completes 2K−k

leapfrog steps for each particle. At this point it is prudent to reconsiderhow the particles are assigned to classes in case some need smaller or largertimesteps.

A slightly different block timestep scheme works well with a particle-mesh Poisson solver (§2.9.3) when parts of the computational domain arecovered by finer meshes than others, with each level of refinement being bya factor of two in the number of mesh points per unit length (Knebe, Green,& Binney 2001). Then particles are assigned timesteps according to thefineness of the mesh they are in: particles in the finest mesh have timestep∆t = h, while particles in the next coarser mesh have ∆t = 2h, and so on.Particles on the finest mesh are drifted through time 1

2h before the densityis determined on this mesh, and the Poisson solver is invoked to determinethe forces on this mesh. Then the particles on this mesh are kicked throughtime h and drifted through time 1

2h. Then the same drift-kick-drift sequenceis used to advance particles on the next coarser mesh through time 2h. Nowthese particles are ahead in time of the particles on the finest mesh. Thissituation is remedied by again advancing the particles on the finest meshby h with the drift-kick-drift sequence. Once the particles on the two finest


meshes have been advanced through time 2h, we are ready to advance by∆t = 4h the particles that are the next coarser mesh, followed by a repeatof the operations that were used to advance the particles on the two finestmeshes by 2h. The key point about this algorithm is that at each levelk, particles are first advanced ahead of particles on the next coarser mesh,and then the latter particles jump ahead of the particles on level k so thenext time the particles on level k are advanced, they are catching up withthe particles of the coarser mesh. Errors arising from moving particles in agravitational field from the surroundings that is out-of-date are substantiallycanceled by errors arising from moving particles in an ambient field that hasrun ahead of itself.

3.4.7 Regularization

In any simulation of a star cluster, sooner or later two particles will sufferan encounter having a very small impact parameter. In the limiting case inwhich the impact parameter is exactly zero (a collision orbit), the equationof motion for the distance r between the two particles is (eq. D.33)

r = −GM/r2, (3.176)

where M is the sum of the masses of the two particles. This equation is sin-gular at r = 0, and a conscientious integrator will attempt to deal with thesingularity by taking smaller and smaller timesteps as r diminishes, therebybringing the entire N-body integration grinding to a halt. Even in a near-collision orbit, the integration through pericenter will be painfully slow. Thisproblem is circumvented by transforming to a coordinate system in whichthe two-body problem has no singularity—this procedure is called regular-ization (Stiefel & Schiefele 1971; Mikkola 1997; Heggie & Hut 2003; Aarseth2003). Standard integrators can then be used to solve the equations of mo-tion in the regularized coordinates.

(a) Burdet–Heggie regularization The simplest approach to regular-ization is time transformation. We write the equations of motion for thetwo-body problem as

r = −GMr

r3+ g, (3.177)

where g is the gravitational field from the other N − 2 bodies in the simula-tion, and change to a fictitious time τ that is defined by

dt = r dτ. (3.178)

Denoting derivatives with respect to τ by a prime we find

r =dτ

dt

dr

dτ=

1

rr′ ; r =

dτ

dt

d

dτ

1

rr′ =

1

r2r′′ −

r′

r3r′. (3.179)


Figure 3.23 Fractional energy error from integrating one pericenter passage of a highlyeccentric orbit in a Keplerian potential, as a function of the number of force evaluations.The orbit has semi-major axis a = 1 and eccentricity e = 0.99, and is integrated fromr = 1, r < 0 to r = 1, r > 0. Curves labeled by “RK” are followed using a fourth-order Runge–Kutta integrator (3.168) with adaptive timestep control as described byPress et al. (1986). The curve labeled “U” for “unregularized” is integrated in Cartesiancoordinates, the curve “BH” uses Burdet–Heggie regularization, and the curve “KS” usesKustaanheimo–Stiefel regularization. The curve labeled “U,LF” is followed in Cartesiancoordinates using a leapfrog integrator with timestep proportional to radius (eq. 3.167).The horizontal axis is the number of force evaluations used in the integration.

Substituting these results into the equation of motion, we obtain

r′′ =r′

rr′ − GM

r

r+ r2g. (3.180)

The eccentricity vector e (eq. 4 of Box 3.2) helps us to simplify this equation.We have

e = v × (r × v) − GM er

= |r′|2r

r2−

r′

rr′ − GM

r

r,

(3.181)

where we have used v = r = r′/r and the vector identity (B.9). Thusequation (3.180) can be written

r′′ = |r′|2r

r2− 2GM

r

r− e + r2g. (3.182)


The energy of the two-body orbit is

E2 = 12v2 −

GM

r=

|r′|2

2r2−

GM

r, (3.183)

so we arrive at the regularized equation of motion

r′′ − 2E2r = −e + r2g, (3.184)

in which the singularity at the origin has disappeared. This must be supple-mented by equations for the rates of change of E2, e, and t with fictitioustime τ ,

E′2 = g · r′ ; e′ = 2r(r′ · g) − r′(r · g) − g(r · r′) ; t′ = r. (3.185)

When the external field g vanishes, the energy E2 and eccentricity vector eare constants, the equation of motion (3.184) is that of a harmonic oscillatorthat is subject to a constant force −e, and the fictitious time τ is proportionalto the eccentric anomaly (Problem 3.29).

Figure 3.23 shows the fractional energy error that arises in the integra-tion of one pericenter passage of an orbit in a Kepler potential with eccen-tricity e = 0.99. The error is plotted as a function of the number of forceevaluations; this is the correct economic model if force evaluations dominatethe computational cost, as is true for N-body integrations with N ) 1. Notethat even with ∼> 1000 force evaluations per orbit, a fourth-order Runge–Kutta integrator with adaptive timestep is sometimes unable to follow theorbit. Using the same integrator, Burdet–Heggie regularization reduces theenergy error by almost five orders of magnitude.

This figure also shows the energy error that arises when integrating thesame orbit using leapfrog with adaptive timestep (eq. 3.167) in unregularizedcoordinates. Even though leapfrog is only second-order, it achieves an accu-racy that substantially exceeds that of the fourth-order Runge–Kutta inte-grator in unregularized coordinates, and approaches the accuracy of Burdet–Heggie regularization. Thus a time-symmetric leapfrog integrator providesmuch of the advantage of regularization without coordinate or time trans-formations.

(b) Kustaanheimo–Stiefel (KS) regularization An alternative reg-ularization procedure, which involves the transformation of the coordinatesin addition to time, can be derived using the symmetry group of the Keplerproblem, the theory of quaternions and spinors, or several other methods(Stiefel & Schiefele 1971; Yoshida 1982; Heggie & Hut 2003). Once againwe use the fictitious time τ defined by equation (3.178). We also define afour-vector u = (u1, u2, u3, u4) that is related to the position r = (x, y, z) by

u21 = 1

2 (x + r) cos2 ψ

u24 = 1

2 (x + r) sin2 ψ

u2 =yu1 + zu4

x + r

u3 =zu1 − yu4

x + r,

(3.186)

3.5 Angle-action variables 211

where ψ is an arbitrary parameter. The inverse relations are

x = u21 − u2

2 − u23 + u2

4 ; y = 2(u1u2 − u3u4) ; z = 2(u1u3 + u2u4). (3.187)

Note that r = u21 + u2

2 + u23 + u2

4. Let Φe be the potential that generates theexternal field g = −∇Φe. Then in terms of the new variables the equationof motion (3.177) reads

u′′ − 12Eu = − 1

4

∂

∂u

(|u|2Φe

),

E = 12v2 −

GM

r+ Φe = 2

|u′|2

|u|2−

GM

|u|2+ Φe,

E′ = |u|2∂Φe

∂t; t′ = |u|2,

(3.188)

When the external force vanishes, the first of equations (3.188) is the equationof motion for a four-dimensional harmonic oscillator.

Figure 3.23 shows the fractional energy error that arises in the integra-tion of an orbit with eccentricity e = 0.99 using KS regularization. Usingthe same integrator, the energy error is more than an order of magnitudesmaller than the error using Burdet–Heggie regularization.

3.5 Angle-action variables

In §3.1 we introduced the concept of an integral of motion and we saw thatevery spherical potential admitted at least four integrals Ii, namely, theHamiltonian and the three components of angular momentum. Later wefound that orbits in flattened axisymmetric potentials frequently admit threeintegrals, the classical integrals H and pφ, and the non-classical third inte-gral. Finally in §3.3 we found that many orbits in planar non-axisymmetricpotentials admitted a non-classical integral in addition to the Hamiltonian.

In this section we explore the advantages of using integrals as coordi-nates for phase space. Since elementary Newtonian or Lagrangian mechanicsrestricts our choice of coordinates to ones that are rarely integrals, we workin the more general framework of Hamiltonian mechanics (Appendix D). Fordefiniteness, we shall assume that there are three independent coordinates(so phase space is six-dimensional) and that we have three analytic isolatingintegrals Ii(x,v). We shall focus on a particular set of canonical coordinates,called angle-action variables; the three momenta are integrals, called “ac-tions”, and the conjugate coordinates are called “angles”. An orbit fortunateenough to possess angle-action variables is called a regular orbit.

We start with a number of general results that apply to any systemof angle-action variables. Then in a series of subsections we obtain explicit


expressions for these variables in terms of ordinary phase-space coordinatesfor spherical potentials, flattened axisymmetric potentials and planar, non-axisymmetric potentials. The section ends with a description of how ac-tions enable us to solve problems in which the gravitational potential evolvesslowly.

Angle-action variables cannot be defined for many potentials of practicalimportance for galactic dynamics. Nonetheless, the conceptual framework ofangle-action variables proves extremely useful for understanding the complexphenomena that arise in potentials that do not admit them.

The discussion below is heuristic and non-rigorous; for a precise andelegant account see Arnold (1989).

3.5.1 Orbital tori

Let us denote the angle-action variables by (θ,J). We assume that themomenta J = (J1, J2, J3) are integrals of motion. Then Hamilton’s equations(D.54) for the motion of the Ji read

0 = Ji = −∂H

∂θi. (3.189)

Therefore, the Hamiltonian must be independent of the coordinates θ, thatis H = H(J). Consequently, we can trivially solve Hamilton’s equations forthe θi as functions of time:

θi =∂H

∂Ji≡ Ωi(J), a constant ⇒ θi(t) = θi(0) + Ωit. (3.190)

So everything lies at our feet if we can install three integrals of motion asthe momenta of a system of canonical coordinates.13

We restrict our attention to bound orbits. In this case, the Cartesiancoordinates xi cannot increase without limit as the θi do (eq. 3.190). Fromthis we infer that the xi are periodic functions of the θi. We can scale θi sothat x returns to its original value after θi has increased by 2π. Then we canexpand x in a Fourier series (Appendix B.4)

x(θ,J) =∑

n

Xn(J)ein·θ, (3.191)

where the sum is over all vectors n with integer components. When we elim-inate the θi using equation (3.190), we find that the spatial coordinates are

13 To be able to use the Ji as a set of momenta, they must satisfy the canonical com-mutation relations (D.71), so we require [Ji, Jj ] = 0; functions satisfying this conditionare said to be in involution. For example, the components of angular momenta are notin involution: [Lx, Ly] = Lz , etc.


Figure 3.24 Two closed paths on atorus that cannot be deformed intoone another, nor contracted to singlepoints.

Fourier series in time, in which every frequency is a sum of integer multiplesof the three fundamental frequencies Ωi(J) that are defined by equa-tion (3.190). Such a time series is said to be conditionally periodic orquasiperiodic.14 For example, in spherical potentials (§3.1) the periods Tr

and Tψ are inverses of such fundamental frequencies: Ti = 2π/Ωi. The thirdfundamental frequency is zero because the orbital plane is fixed in space—see§3.5.2.

An orbit is said to be resonant when its fundamental frequencies satisfya relation of the form n · Ω = 0 for some integer triple n /= 0. Usually thisimplies that two of the frequencies are commensurable, that is the ratioΩi/Ωj is a rational number (−nj/ni).

Consider the three-surface (i.e., volume) of fixed J and varying θ. Thisis a cube of side-length 2π, and points on opposite sides must be identifiedsince we have seen that incrementing, say, θ1 by 2π while leaving θ2, θ3 fixedbrings one back to the same point in phase space. A cube with faces identifiedin this way is called a three-torus by analogy with the connection between arectangle and a two-torus: if we sew together opposite edges of a rectangularsheet of rubber, we generate the doughnut-shaped inner tube of a bicycletire.

We shall find that these three-tori are in many respects identical withorbits, so it is important to have a good scheme for labeling them. The bestset of labels proves to be the Poincare invariants (Appendix D.4.2)

J ′i ≡

1

2π

∫ ∫dq · dp =

1

2π

∫ ∫ ∑

j=1,3

dqjdpj, (3.192)

where the integral is over any surface that is bounded by the path γi on whichθi increases from 0 to 2π while everything else is held constant (Figure 3.24).Since angle-action variables are canonical, dq · dp = dθ · dJ (eq. D.84), so

J ′i =

1

2π

∫ ∫

interior of γi

dθ · dJ =1

2π

∫ ∫

interior of γi

dθidJi. (3.193)

14 Observers of binary stars use the term quasiperiod more loosely. Our usage is equiv-alent to what is meant by a quasicrystal: a structure whose Fourier transform is discrete,but in which there are more fundamental frequencies than independent variables (in ourcase one, t, in a quasicrystal three x, y, z).


Box 3.4: Angle-action variables as polar coordinates

The figure shows the intersection with a coordinate plane of some ofthe nested orbital tori of a two-dimensional harmonic oscillator. Thecoordinates qi, pi have been scaled such that the tori appear as circles.The values of the action Ji on successive tori are chosen to be 0, 1, 2 . . .(in some suitable units), so, by equation (3.192), the areas inside suc-cessive tori are 0, 2π, 4π, . . .. Hence, the radii r = (q2

i + p2i )

1/2 of suc-cessive circles are

√2 × (0, 1,

√2,√

3, . . .). In general the radius ofthe circle associated with the toruson which Ji takes the value J ′ isr =

√2J ′. In this plane, the angle

variable θi is closely analogous tothe usual azimuthal angle. Hence,angle-action variables are closelyanalogous to plane polar coordi-nates, the major difference beingthat coordinate circles are labelednot by radius but by

√2 times the

area they enclose. The generatingfunction for the transformationfrom (θi, Ji) to (qi, pi) is given inProblem 3.31.

As Box 3.4 explains, angle-action variables are a kind of polar coordinatesfor phase space, and have a coordinate singularity within the domain ofintegration. We must exclude this from the domain of integration beforewe use Green’s theorem to convert the surface integral in (3.193) into a lineintegral. The value of our surface integral is unchanged by excluding thispoint, but when we use Green’s theorem (eq. B.61) on the original domainless the excluded point, we obtain two line integrals, one along the curve γi

and one along the boundary that surrounds the excluded point—along thissecond boundary, Ji takes some definite value, Jc

i , say, and θi takes all valuesin the range (0, 2π). Thus we have

J ′i =

1

2π

(∮

γi

Jidθi −∮

Ji=Jci

Jidθi

)= Ji − Jc

i . (3.194)

This equation shows that the label J ′i defined by equation (3.192) will be

identical with our original action coordinate Ji providing we set Ji = 0 at thecoordinate singularity that marks the center of the angle-action coordinatesystem. We shall henceforth assume that this choice has been made.

In practical applications we often evaluate the integral of equation(3.192) using phase-space coordinates that have no singularity within the


domain of integration. Then we can replace the surface integral with a lineintegral that is easier to evaluate:

Ji =1

2π

∮

γi

p · dq. (3.195)

(a) Time-averages theorem In Chapter 4 we shall make extensive usea result that we can now prove.

Time averages theorem When a regular orbit is non-resonant, the averagetime that the phase point of a star on that orbit spends in any region D ofits torus is proportional to the integral V (D) =

∫D d3θ.

Proof: Let fD be the function such that fD(θ) = 1 when the point θ lies inD, and is zero otherwise. We may expand fD in a Fourier series (cf. eq. 3.191)

fD(θ) =∞∑

n=−∞Fn exp(in · θ). (3.196)

Now ∫

torusd3θ fD(θ) =

∫

Dd3θ = V (D). (3.197a)

With equation (3.196) we therefore have

V (D) =

∫

torusd3θ fD(θ) =

∞∑

n=−∞Fn

3∏

k=1

∫ 2π

0dψ exp(inkψ) = (2π)3F0.

(3.197b)On the other hand, the fraction of the interval (0, T ) during which the star’sphase point lies in D is

τT (D) =1

T

∫ T

0dt fD[θ(t)], (3.198)

where θ(t) is the position of the star’s phase point at time t. With equations(3.190) and (3.196), equation (3.198) becomes

τT (D) =1

T

∑

n

ein·θ(0)

∫ T

0dt Fnei(n·Ω)t

= F0 +1

T

∑

n (=0

ein·θ(0)Fnei(n·Ω)T − 1

in · Ω.

(3.199)

Thus

limT→∞

τT (D) = F0 =V (D)

(2π)3, (3.200)


Figure 3.25 The action space of an axisymmetric potential. Two constant-energy surfacesare shown for the spherical isochrone potential (2.47). The surfaces H = −0.5(GM/2b)and H = −0.03(GM/2b) are shown (eq. 3.226) with the axes all scaled to length 5

√GMb.

which completes the proof.5

Note that if the orbit is resonant, n ·Ω vanishes for some n /= 0 and thesecond equality in (3.199) becomes invalid, so the theorem cannot be proved.In fact, if Ωi : Ωj = m : n, say, then by equations (3.190) I4 ≡ nθi − mθj

becomes an isolating integral that confines the star to a spiral on the torus.We shall see below that motion in a spherical potential provides an importantexample of this phenomenon.

(b) Action space In Chapter 4 we shall develop the idea that galaxiesare made up of orbits, and we shall find it helpful to think of whole orbitsas single points in an abstract space. Any isolating integrals can serve as co-ordinates for such a representation, but the most advantageous coordinatesare the actions. We define action space to be the imaginary space whoseCartesian coordinates are the actions. Figure 3.25 shows the action space ofan axisymmetric potential, when the actions can be taken to be generaliza-tions of the actions for spherical potentials listed in Table 3.1 below. Pointson the axes represent orbits for which only one of the integrals (3.192) isnon-zero. These are the closed orbits. The origin represents the orbit of astar that just sits at the center of the potential. In each octant, surfaces ofconstant energy are approximate planes; by equation (3.190) the local nor-mal to this surface is parallel to the vector Ω. Every point in the positivequadrant Jr, Jϑ ≥ 0, all the way to infinity, represents a bound orbit.

A region R3 in action space represents a group of orbits. Let the volumeof R3 be V3. The volume of six-dimensional phase space occupied by theorbits is

V6 =

∫

R6

d3xd3v, (3.201)


where R6 is the region of phase space visited by stars on the orbits of R3.Since the coordinate set (J, θ) is canonical, d3xd3v = d3Jd3θ (see eq. D.81)and thus

V6 =

∫

R6

d3Jd3θ. (3.202)

But for any orbit the angle variables cover the range (0, 2π), so we mayimmediately integrate over the angles to find

V6 = (2π)3∫

R3

d3J = (2π)3V3. (3.203)

Thus the volume of a region of action space is directly proportional to thevolume of phase space occupied by its orbits.

(c) Hamilton–Jacobi equation The transformation between any twosets of canonical coordinates can be effected with a generating function (Ap-pendix D.4.6). Let S(q,J) be the (unknown) generating function of thetransformation between angle-action variables and ordinary phase space co-ordinates (q,p) such as q = x, p = v. Then (eq. D.93)

θ =∂S

∂J; p =

∂S

∂q, (3.204)

where p and θ are now to be considered functions of q and J. We can useS(q,J) to eliminate p from the usual Hamiltonian function H(q,p) and thusexpress H as a function

H(q,

∂S

∂q(q,J)

).

of (q,J). By moving along an orbit, we can vary the qi while holding constantthe Ji. As we vary the qi in this way, H must remain constant at the energyE of the orbit in question. This suggests that we investigate the partialdifferential equation

H(q,

∂S

∂q

)= E at fixed J. (3.205)

If we can solve this Hamilton–Jacobi equation, our solution should con-tain some arbitrary constants Ki—we shall see below that we usually solvethe equation by the method of separation of variables (e.g., §2.4) and theconstants are separation constants. We identify the Ki with functions of theactions as follows. Eliminating p from equation (3.195) we have

Ji =1

2π

∮

γi

∂S

∂q· dq =

∆S(K)

2π. (3.206)


This equation states that Ji is proportional to the increment in the generatingfunction when one passes once around the torus on the ith path—S, like themagnetic scalar potential around a current-carrying wire, is a multivaluedfunction. The increment in S, and therefore Ji, depends on the integrationconstants that appear in S, so these are functions of the actions.

Once the Hamilton–Jacobi equation has been solved and the integralsin (3.206) have been evaluated, S becomes a known function S(q,J) andwe can henceforth use equations (3.204) to transform between angle-actionvariables and ordinary phase-space coordinates. In particular, we can inte-grate orbits trivially by transforming the initial conditions into angle-actionvariables, incrementing the angles, and transforming back to ordinary phase-space coordinates.

Let us see how this process works in a simple example. The Hamiltonianof a two-dimensional harmonic oscillator is

H(x,p) = 12 (p2

x + p2y + ω2

xx2 + ω2yy2). (3.207)

Substituting in px = ∂S/∂x, py = ∂S/∂y (eq. 3.204), the Hamilton–Jacobiequation reads

(∂S

∂x

)2+(∂S

∂y

)2+ ω2

xx2 + ω2yy2 = 2E, (3.208)

where S is a function of x, y and J. We solve this partial differential equationby the method of separation of variables.15 We write S(x, y,J) = Sx(x,J) +Sy(y,J) and rearrange the equation to

(∂Sx

∂x

)2+ ω2

xx2 = 2E −(∂Sy

∂y

)2− ω2

yy2. (3.209)

The left side does not depend on y and the right side does not depend onx. Consequently, each side can only be a function of J, which we call K2(J)because it is evidently non-negative:

K2 ≡(∂Sx

∂x

)2+ ω2

xx2. (3.210)

It follows that

Sx(x,J) = K

∫ x

dx′ ε

√1 −

ω2xx′2

K2,

15 When this method is applied in quantum mechanics and in potential theory (e.g.,§2.4) one usually assumes that the dependent variable is a product of functions of onevariable, rather than a sum of such functions as here.


where ε is chosen to be ±1 so that Sx(x,J) increases continuously along apath over the orbital torus. Changing the variable of integration, we have

Sx(x,J) =K2

ωx

∫dψ′ sin2 ψ′ where x = −

K

ωxcosψ

=K2

2ωx

(ψ − 1

2 sin 2ψ).

(3.211)

Moreover, px = ∂S/∂x = εK√

1 − ω2xx2/K2 = K sinψ, so both x and px

return to their old values when ψ is incremented by 2π. We infer thatincrementing ψ by 2π carries us around the path γx that is associated withJx through equation (3.192). Thus equation (3.206) now yields

Jx =∆S

2π=

∆Sx

2π. (3.212)

Equation (3.211) tells us that when ψ is incremented by 2π, Sx increases byK2π/ωx. Hence,

Jx(x, px) =K2

2ωx=

p2x + ω2

xx2

2ωx, (3.213)

where the last equality follows from (3.210) with ∂Sx/∂x replaced by px.The solution for Jy(y, py) proceeds analogously and yields

Jy(y, py) =2E − K2

2ωy=

p2y + ω2

yy2

2ωy. (3.214)

Comparing with equation (3.207), we find that

H(J) = ωxJx + ωyJy. (3.215)

Notice from (3.215) that Ωx ≡ ∂H/∂Jx = ωx and similarly for Ωy.Finally we determine the angle variables from the second of equations

(3.204). The obvious procedure is to eliminate both K and ψ from equation(3.211) in favor of Jx and x. In practice it is expedient to leave ψ in andtreat it as a function of Jx and x:

Sx(x,J) = Jx(ψ − 12 sin 2ψ) where cosψ = −

√ωx

2Jxx, (3.216)

so

θx =∂S

∂Jx= ψ − 1

2 sin 2ψ + Jx(1 − cos 2ψ)∂ψ

∂Jx

= ψ − 12 sin 2ψ + sin2 ψ cotψ

= ψ.

(3.217)

Thus the variable ψ that we introduced for convenience in doing an integralis, in fact, the angle variable conjugate to Jx. Problem 3.33 explains analternative, and sometimes simpler, route to the angle variables.


3.5.2 Angle-action variables for spherical potentials

We now derive angle-action variables for any spherical potential. These areuseful not only for strictly spherical systems, but also for axisymmetric disks,and serve as the starting point for perturbative analyses of mildly asphericalpotentials. To minimize confusion between ordinary spherical polar coordi-nates and angle variables, in this section we reserve ϑ for the usual polarangle, and continue to use θi for the variable conjugate to Ji.

The Hamilton–Jacobi equation (3.205) for the potential Φ(r) is

E = 12 |∇S|2 + Φ(r)

= 12

[(∂S

∂r

)2+(1

r

∂S

∂ϑ

)2+( 1

r sinϑ

∂S

∂φ

)2]

+ Φ(r),(3.218)

where we have used equation (B.38) for the gradient operator in sphericalpolar coordinates. We write the generating function as S(x,J) = Sr(r,J) +Sϑ(ϑ,J) + Sφ(φ,J) and solve (3.218) by separation of variables. With thehelp of equation (3.204) we find

L2z =

(∂Sφ

∂φ

)2

= p2φ, (3.219a)

L2 −L2

z

sin2 ϑ=

(∂Sϑ

∂ϑ

)2

= p2ϑ, (3.219b)

2E − 2Φ(r) −L2

r2=

(∂Sr

∂r

)2

= p2r. (3.219c)

Here we have introduced two separation constants, L and Lz. We assumethat L > 0 and choose the sign of Lz so that Lz = pφ; with these conven-tions L and Lz prove to be the magnitude and z-component of the angular-momentum vector (Problem 3.20). Taking the square root of each equationand integrating, we obtain a formula for S:

S(x,J) =

∫ φ

0dφLz +

∫ ϑ

π/2dϑ εϑ

√L2 −

L2z

sin2 ϑ

+

∫ r

rmin

dr εr

√2E − 2Φ(r) −

L2

r2,

(3.220)

where εϑ and εr are chosen to be ±1 such that the integrals in which theyappear increase monotonically along a path over the orbital torus. Thelower limits of these integrals specify some point on the orbital torus, andare arbitrary. It is convenient to take rmin to be the orbit’s pericenter radius.

To obtain the actions from equation (3.206) we have to evaluate thechange in S as we go round the orbital torus along curves on which only one


of the coordinates is incremented. The case of changing φ is easy: on therelevant curve, φ increases by 2π, so (3.220) states that ∆S = 2πLz and

Jφ = Lz. (3.221)

We call Jφ the azimuthal action. Consider next the case of changing ϑ.Let ϑmin be the smallest value that ϑ attains on the orbit, given by

sinϑmin =|Lz|L

, (3.222)

ϑmin ≤ π/2. Then we start at π/2, where the integrand peaks, and integrateto π−ϑmin, where it vanishes. We have now integrated over a quarter periodof the integrand, so the whole integral is four times the value from this leg,

Jϑ =2

π

∫ π−ϑmin

π/2dϑ

√L2 −

L2z

sin2 ϑ= L − |Lz|. (3.223)

We call Jϑ the latitudinal action. The evaluation of Jr from equations(3.206) and (3.220) proceeds similarly and yields

Jr =1

π

∫ rmax

rmin

dr

√2E − 2Φ(r) −

L2

r2, (3.224)

where rmax is the radius of the apocenter—rmin and rmax are the two rootsof the radical—and Jr is the radial action.

An important example is that of the isochrone potential (2.47), whichencompasses both the Kepler and spherical harmonic potentials as limitingcases. One finds that (Problem 3.41)

Jr =GM√−2E

− 12

(L + 1

2

√L2 − 4GMb

). (3.225)

If we rewrite this expression as an equation for the Hamiltonian HI = E asa function of the actions, we obtain

HI(J) = −(GM)2

2[Jr + 1

2

(L +

√L2 + 4GMb

)]2 (L = Jθ + |Jφ|). (3.226a)

Differentiating this expression with respect to the actions, we find the fre-quencies (eq. 3.190):

Ωr =(GM)2

[Jr + 1

2 (L +√

L2 + 4GMb )]3

Ωϑ = 12

(1 +

L√L2 + 4GMb

)Ωr ; Ωφ = sgn(Jφ)Ωϑ.

(3.226b)


It is straightforward to check that these results are consistent with the ra-dial and azimuthal periods determined in §3.1c.16 In the limit b → 0, theisochrone potential becomes the Kepler potential and all three frequenciesbecome equal. The corresponding results for the spherical harmonic oscilla-tor are obtained by examining the limit b → ∞ (Problem 3.36).

Jθ and Jφ occur in equations (3.226) only in the combination L = Jθ +|Jφ|, and in fact, the Hamiltonian for any spherical potential is a functionH(Jr, L). Therefore we elevate L to the status of an action by making thecanonical transformation that is defined by the generating function (eq. D.93)

S′ = θφJ1 + θϑ(J2 − |J1|) + θrJ3, (3.227)

where (J1, J2, J3) are new angle-action coordinates. Differentiating with re-spect to the old angles, we discover the connection between the new and oldactions:

Jφ =∂S′

∂θφ= J1 ⇒ J1 = Lz,

Jϑ =∂S′

∂θϑ= J2 − |J1| ⇒ J2 = Jϑ + |Jφ| = L,

Jr =∂S′

∂θr= J3.

(3.228)

Thus the new action J2 is L as desired. Differentiating S ′ with respect tothe new actions we find that the new angles are

θ1 = θφ − sgn(J1)θϑ ; θ2 = θϑ ; θ3 = θr. (3.229)

Equation (3.224) can be regarded as an implicit equation for the Hamil-tonian H(J) = E in terms of J3 = Jr and J2 = L. Since J1 does not appearin this equation, the Hamiltonian of any spherical potential must be of theform H(J2, J3). Thus Ω1 = ∂H/∂J1 = 0 for all spherical potentials, andthe corresponding angle θ1 is an integral of motion. In §3.1 we saw that anyspherical potential admits four isolating integrals. Here we have recoveredthis result from a different point of view: three of the integrals are the actions(J1, J2, J3), and the fourth is the angle θ1.

From Figure 3.26 we see that for orbits with Lz > 0 the inclinationof the orbital plane i = 1

2π − ϑmin, while when Lz < 0, i = 12π + ϑmin.

Combining these equations with (3.222) we find that

i = cos−1(Lz/L) = cos−1(J1/J2). (3.230)

16 A minor difference is that in the analysis of §3.1c, the angular momentum L couldhave either sign. Here L = |L| is always non-negative, while Lz can have either sign.


line of nodes

star Figure 3.26 Angles defined by anorbit. The orbit is confined to aplane whose normal makes an an-gle i, the inclination, with the zaxis. The orbital plane intersects thexy plane along the line of nodes.The ascending node is the node atwhich z > 0, and the angle Ω is thelongitude of the ascending node. El-ementary trigonometry shows thatu = sin−1(cot i cotϑ) = φ − Ω andthat cosϑ = sin i sinψ, were ψ is theangle between the line of nodes andthe radius vector to the star.

We now obtain explicit expressions for the angle variables of any spher-ical potential by evaluating ∂S/∂Ji = θi, where S is derived from equa-tion (3.220) by replacing E with H(J2, J3), L with J2, and Lz with J1. Wehave

S = φJ1 +

∫ ϑ

π/2dϑ εϑ

√

J22 −

J21

sin2 ϑ+

∫ r

rmin

dr εr

√2H(J2, J3) − 2Φ(r) −

J22

r2.

(3.231)Figure 3.26 helps us to interpret our final result. It depicts the star after ithas passed the line of nodes, moving upward. At this instant, ϑ < 0, and wemust choose εϑ = −1 to make the first integral of equation (3.231) increasing.We therefore specialize to this case, and using (3.230) find

θ1 =∂S

∂J1= φ+ sgn(J1)

∫ ϑ

π/2

dϑ

sinϑ√

sin2 ϑ sec2 i − 1

= φ− u,

(3.232a)

where17

sin u ≡ cot i cotϑ. (3.232b)

Figure 3.26 demonstrates that the new variable u is actually φ−Ω and thusthat θ1 = Ω, the longitude of the ascending node.18 Thus θ1 is constantbecause the line of nodes is fixed. If the potential were not spherical, but

17 This follows because

d[sin−1(cot i cotϑ)] = − csc2 ϑ cot i dϑ/p

1 − cot2 i cot2 ϑ

= sgn(cos i)/(sin ϑp

sin2 ϑ sec2 i − 1).(3.233)

18 Equation (3.232b) has two solutions in (0, 2π) and care must be taken to choose thecorrect solution.


Table 3.1 Angle-action variables in a spherical potential

actions Jφ = Lz ; Jϑ = L − |Lz| ; Jr

angles θφ = Ω + sgn(Lz)θϑ ; θϑ ; θr

Hamiltonian H(Jϑ + |Jφ|, Jr)frequencies Ωφ = sgn(Lz)Ωϑ ; Ωϑ ; Ωr

actions J1 = Lz ; J2 = L ; J3 = Jr

angles θ1 = Ω ; θ2 = θϑ ; θ3 = θr

Hamiltonian H(J2, J3)frequencies Ω1 = 0 ; Ω2 = Ωϑ ; Ω3 = Ωr

actions Ja = Lz ; Jb = L ; Jc = Jr + Langles θa = Ω ; θb = θϑ − θr ; θc = θr

Hamiltonian H(Jb, Jc − Jb)frequencies Ωa = 0 ; Ωb = Ωϑ − Ωr ; Ωc = Ωr

notes: The Delaunay variables (Ja, Jb, Jc) are defined in Appendix E.When possible, actions and angles are expressed in terms of the total an-gular momentum L, the z-component of angular momentum Lz , the radialaction Jr , and the longitude of the ascending node Ω (Figure 3.26). Unfor-tunately, Ω is also used for the frequency corresponding to a given action,but in this case it is always accompanied by a subscript. The Hamiltonianis H(L, Jr).

merely axisymmetric, θ1 would not be constant and the orbital plane wouldprecess.

Next we differentiate equation (3.231) to obtain θ3. Only the third term,which is equal to Sr, depends on J3. Thus we have

θ3 =(∂Sr

∂J3

)

J2

=(∂Sr

∂H

)

J2

( ∂H

∂J3

)

J2

=(∂Sr

∂H

)

J2

Ω3, (3.234)

where the last step follows from equation (3.190). Similarly,

θ2 =( ∂S

∂J2

)

J3

=(∂Sϑ

∂J2

)

J3

+(∂Sr

∂H

)

J2

( ∂H

∂J2

)

J3

+(∂Sr

∂J2

)

H. (3.235)

We eliminate ∂Sr/∂H using equation (3.234),

θ2 =(∂Sϑ

∂J2

)

J3

+Ω2

Ω3θ3 +

(∂Sr

∂J2

)

H. (3.236)

From equation (3.231) with εϑ = −1, it is straightforward to show that(∂Sϑ

∂J2

)

J1

= sin−1

(cosϑ

sin i

). (3.237)

Now let ψ be the angle measured in the orbital plane from the line of nodesto the current position of the star. From Figure 3.26 it is easy to see thatcosϑ = sin i sinψ; thus (∂Sϑ

∂J2

)

J1

= ψ. (3.238)


The other two partial derivatives in equations (3.234) and (3.235) canonly be evaluated once Φ(r) has been chosen. In the case of the isochronepotential (2.47), we have

(∂Sr

∂H

)

J2

=

∫ r

rmin

dI ;(∂Sr

∂J2

)

H= −J2

∫ r

rmin

dI

r2(3.239)

where dI is defined by (3.36). Hence the integrals to be performed are justindefinite versions of the definite integrals that yielded Tr and ∆ψ in §3.1c.The final answers are most conveniently expressed in terms of an auxiliaryvariable η that is defined by (cf. eqs. 3.28, 3.32 and 3.34)

s = 2 +c

b(1 − e cos η) where

c ≡GM

−2H− b,

e2 ≡ 1 −J2

2

GMc

(1 +

b

c

),

s ≡ 1 +√

1 + r2/b2.

(3.240)

Then one has19

θ3 = η −ec

c + bsin η

θ2 = ψ +Ω2

Ω3θ3 − tan−1

(√1 + e

1 − etan(1

2η)

)

−1√

1 + 4GMb/J22

tan−1

(√1 + e + 2b/c

1 − e + 2b/ctan(1

2η)

).

(3.241)

Thus in the case of the isochrone potential we can analytically evaluate allthree angle variables from ordinary phase-space coordinates (x,v).

To summarize, in an arbitrary spherical potential two of the actionscan be taken to be the total angular momentum L and its z-componentLz, and one angle can be taken to be the longitude of the ascending nodeΩ. The remaining action and angles can easily be determined by numericalevaluation of the integral (3.224) and integrals analogous to those of equa-tion (3.239). In the isochrone potential, all angle-action variables can beobtained analytically from the ordinary phase-space coordinates (x,v). Theanalytic relations among angle-action variables in spherical potentials aresummarized in Table 3.1.

19 In numerical work, care must be taken to ensure that the branch of the inversetrigonometric functions is chosen so that the angle variables increase continuously.


3.5.3 Angle-action variables for flattened axisymmetric potentials

In §3.2 we used numerical integrations to show that most orbits in flattenedaxisymmetric potentials admit three isolating integrals, only two of whichwere identified analytically. Now we take up the challenge of identifyingthe missing “third integral” analytically, and deriving angle-action variablesfrom it and the classical integrals. It proves possible to do this only forspecial potentials, and we start by examining the potentials for which wehave obtained action integrals for clues as to what a promising potentialmight be.

(a) Stackel potentials In §3.3 we remarked that box orbits in a pla-nar non-rotating bar potential resemble Lissajous figures generated by two-dimensional harmonic motion, while loop orbits have many features in com-mon with orbits in axisymmetric potentials. Let us examine these parallelsmore closely. The orbits of a two-dimensional harmonic oscillator admit twoindependent isolating integrals, Hx = 1

2 (p2x + ω2

xx2) and Hy = 12 (p2

y + ω2yy2)

(eq. 3.207). At each point in the portion of the (x, y) plane visited by theorbit, the particle can have one of four momentum vectors. These mo-menta arise from the ambiguity in the signs of px and py when we aregiven only Ex and Ey , the values of Hx and Hy: px(x) = ±

√2Ex − ω2

xx2;

py(y) = ±√

2Ey − ω2yy2. The boundaries of the orbit are the lines on which

px = 0 or py = 0.Consider now planar orbits in a axisymmetric potential Φ(r). These

orbits fill annuli. At each point in the allowed annulus two momenta arepossible: pr(r) = ±

√2(E − Φ) − L2

z/r2, pφ = Lz. The boundaries of theorbit are the curves on which pr = 0.

These examples have a number of important points in common:(i) The boundaries of orbits are found by equating to zero one canonical

momentum in a coordinate system that reflects the symmetry of thepotential.

(ii) The momenta in this privileged coordinate system can be written asfunctions of only one variable: px(x) and py(y) in the case of the har-monic oscillator; and pr(r) and pφ = Lz (which depends on neithercoordinate) in the case of motion in a axisymmetric potential.

(iii) These expressions for the momenta are found by splitting the Hamilton–Jacobi equation H −E = 0 (eq. 3.205) into two parts, each of which is afunction of only one coordinate and its conjugate momentum. In the caseof the harmonic oscillator, 0 = H −E = 1

2 |p|2 + 1

2 (ω2xx2 + ω2

yy2)−E =

Hx(px, x) + Hy(py, y) − E. In the case of motion in an axisymmetricpotential, 0 = r2(H − E) = r2

[12p2

r + Φ(r)]− r2E + 1

2p2φ.

The first of these observations suggests that we look for a curvilinear coordi-nate system whose coordinate curves run parallel to the edges of numericallyintegrated orbits, such as those plotted in Figure 3.4. Figure 3.27 shows that


Figure 3.27 The boundaries of orbits in the meridional plane approximately coincidewith the coordinate curves of a system of spheroidal coordinates. The dotted lines arethe coordinate curves of the system defined by (3.242) and the full curves show the sameorbits as Figure 3.4.

the (u, v) coordinate system defined by

R = ∆ sinh u sin v ; z = ∆ coshu cos v (3.242)

achieves this goal to high accuracy: the orbits of Figure 3.4 can be approxi-mately bounded top and bottom by curves of constant v and right and leftby curves of constant u.20

Now that we have chosen a coordinate system, item (iii) above suggeststhat we next write the Hamiltonian function in terms of u, v, and theirconjugate momenta. The first step is to write the Lagrangian as a functionof the new coordinates and their time derivatives. By an analysis that closelyparallels the derivation of equations (2.99) we may show that

|x|2 = ∆2(sinh2 u + sin2 v

) (u2 + v2

)+ ∆2 sinh2 u sin2 v φ2, (3.243)

and the Lagrangian is

L = 12∆2

[(sinh2 u + sin2 v

) (u2 + v2

)+ sinh2 u sin2 v φ2

]−Φ(u, v). (3.244)

The momenta are (eq. D.49)

pu = ∆2(sinh2 u + sin2 v

)u ; pv = ∆2

(sinh2 u + sin2 v

)v

pφ = ∆2 sinh2u sin2 v φ,(3.245)

20 Note that prolate spheroidal coordinates are used to fit the boundaries of orbits inoblate potentials.


so the Hamiltonian is

H(u, v, pu, pv, pφ) =p2

u + p2v

2∆2(sinh2 u + sin2 v)+

p2φ

2∆2 sinh2 u sin2 v+ Φ(u, v).

(3.246)Since H has no explicit dependence on time, it is equal to some constantE. Likewise, since H is independent of φ, the azimuthal momentum pφ isconstant at some value Lz.

The examples of motion in harmonic and circular potentials suggest thatwe seek a form of Φ(u, v) that will enable us to split a multiple of the equationH(u, v, pu, pv, Lz) = E into a part involving only u and pu and a part thatinvolves only v and pv. Evidently we require that (sinh2 u + sin2 v)Φ be ofthe form U(u) − V (v), i.e., that21

Φ(u, v) =U(u) − V (v)

sinh2 u + sin2 v, (3.247)

for then we may rewrite H = E as

2∆2[E sinh2 u − U(u)

]−p2

u−L2

z

sinh2 u=

L2z

sin2 v+p2

v −2∆2[E sin2 v + V (v)

].

(3.248)It can be shown that potentials of the form (3.247) are generated by bodiesresembling real galaxies (see Problems 2.6 and 2.14), so there are interestingphysical systems for which (3.248) is approximately valid. Potentials of thisform are called Stackel potentials after the German mathematician P.Stackel.22 Our treatment of these potentials will be restricted; much moredetail, including the generalization to triaxial potentials, can be found in deZeeuw (1985).

If the analogy with the harmonic oscillator holds, pu will be a functiononly of u, and similarly for pv. Under these circumstances, the left side ofequation (3.248) does not depend on v, and the right side does not dependon u, so both sides must equal some constant, say 2∆2I3. Hence we wouldthen have

pu = ±√

2∆2[E sinh2 u − I3 − U(u)

]−

L2z

sinh2 u, (3.249a)

pv = ±√

2∆2[E sin2 v + I3 + V (v)

]−

L2z

sin2 v. (3.249b)

21 The denominator of equation (3.247) vanishes when u = 0, v = 0. However, wemay avoid an unphysical singularity in Φ at this point by choosing U and V such thatU(0) = V (0).

22 Stackel showed that the only coordinate system in which the Hamilton–Jacobi equa-tion for H = 1

2p2+Φ(x) separates is confocal ellipsoidal coordinates. The usual Cartesian,spherical and cylindrical coordinate systems are limiting cases of these coordinates, as isthe (u, v,φ) system.


It is a straightforward exercise to show that the analogy with the harmonicoscillator does hold, by direct time differentiation of both sides of equa-tions (3.249), followed by elimination of u and pu with Hamilton’s equations(Problem 3.37). Thus the quantity I3 defined by equations (3.249) is an in-tegral. Moreover, we can display I3 as an explicit function of the phase-spacecoordinates by eliminating E between equations (3.249) (Problem 3.39).

Equations (3.249) enable us to obtain expressions for the actions Ju andJv in terms of the integrals E, I3 and Lz, the last of which is equal to Jφ asin the spherical case. Specifically

Ju =1

π

∫ umax

umin

du

√2∆2

[E sinh2 u − I3 − U(u)

]−

L2z

sinh2 u,

Jv =1

π

∫ vmax

vmin

dv

√2∆2

[E sin2 v + I3 + V (v)

]−

L2z

sin2 v,

Jφ = Lz,

(3.250)

where umin and umax are the smallest and largest values of u at which theintegrand vanishes, and similarly for vmin and vmax.

As in the spherical case, we obtain expressions for the angle variables bydifferentiating the generating function S(u, v,φ, Ju, Jv, Jφ) of the canonicaltransformation between angle-action variables and the (u, v,φ) system. Wetake S to be the sum of three parts Su, Sv and Sφ, each of which depends ononly one of the three coordinate variables. The gradient of Su with respect tou is just pu, so Su is just the indefinite integral with respect to u of (3.249a).After evaluating Sv and Sφ analogously, we use the chain rule to differentiateS =

∑i Si with respect to the actions (cf. the derivation of eq. 3.234):

θu =∂S

∂Ju=∑

i=u,v

(∂Si

∂HΩu +

∂Si

∂I3

∂I3

∂Ju

),

θv =∑

i=u,v

(∂Si

∂HΩv +

∂Si

∂I3

∂I3

∂Jv

),

θφ =∑

i=u,v

(∂Si

∂HΩφ +

∂Si

∂I3

∂I3

∂Lz

)+ φ.

(3.251)

The partial derivatives in these expressions are all one-dimensional integralsthat must in general be done numerically.

The condition (3.247) that must be satisfied by an axisymmetric Stackelpotential is very restrictive because it requires that a function of two variablescan be written in terms of two functions of one variable. Most potentials thatadmit a third integral do not satisfy this condition. In particular the logarith-mic potential ΦL (2.71a) that motivated our discussion is not of Stackel form:we can find a system of spheroidal coordinates that approximately bounds


Figure 3.28 Plots of the effective potentials Ueff (left) and Veff (right) that are definedby equations (3.252) and (3.253) for ∆ = 0.6a3 and Lz = 0.05a3

√W . Curves are shown

for I3 = −0.1W (full) and I3 = 0.1W (dotted).

any given orbit, but in general different orbits require different coordinatesystems.

As an example of the use of equations (3.249) we investigate the shapesthey predict for orbits in the potential obtained by choosing in (3.247)

U(u) = −W sinh u tan−1

(∆ sinh u

a3

)

V (v) = W sin v tanh−1

(∆ sin v

a3

),

(3.252)

where W , ∆, and a3 are constants.23 An orbit of specified E and I3 canexplore all values of u and v for which equations (3.249) predict positivep2

u and p2v. This they will do providing E is larger than the largest of the

“effective potentials”

Ueff(u) ≡L2

z

2∆2 sinh4 u+

I3 + U(u)

sinh2 u, (3.253a)

Veff(v) ≡L2

z

2∆2 sin4 v−

I3 + V (v)

sin2 v. (3.253b)

Figure 3.28 shows these potentials for two values of I3 and all other param-eters fixed. Consider the case in which the energy takes the value −0.453W

23 With these choices for U and V , the potential (3.247) becomes the potential of theperfect prolate spheroid introduced in Problem 2.14.


Figure 3.29 Surface of section atE = −0.5W and Lz = 0.05a3

√W

constructed from equations (3.249)and (3.252) with ∆ = 0.6a3.

(dashed horizontal line). Then for I3 = 0.1W (dotted curves), only a singlevalue of u (u = 1) is permitted, so the orbit is confined to a segment of anellipse in the meridional plane—this is a shell orbit. By contrast all values of|v| larger than the intersection of the dashed and dotted curves in the rightpanel are permitted: these start at |v| = 0.059π. Consequently, the orbitcovers much of the ellipse u = 1 (which in three dimensions is a spheroid).

Consider now the case in which I3 = −0.1W (full curves in Figure 3.28).Now a wide range is permitted in u (0.17 < u < 1.48) and a smaller range inv (|v| > 0.27π). Physically, lowering I3 transfers some of the available energyfrom motion perpendicular to the potential’s equatorial plane into the star’sradial oscillation.

In §3.2.2 we detected the existence of non-classical integrals by plottingsurfaces of section. It is interesting to see how I3 structures surfaces ofsection. If we were to plot the (u, pu) surface of section, the consequents of agiven orbit (definite values of E, Lz, I3) would lie on the curve in the (u, pu)plane whose equation is (3.249a). This equation is manifestly independentof v, so the surface of section would look the same regardless of whether itwas for v = 0, v = 0.1, or whatever. To get the structure of the (R, pR)surfaces of section that we plotted in §3.2.2, for each allowed value of u weget p(u) from (3.249a) and p(v) from (3.249b) with v = π/2, and then obtain(R, pR) from the (u, v, pu, pv) coordinates by inverting the transformations(2.96) and (3.245). Figure 3.29 shows a surface of section generated in thisway.

In §3.2.1 we saw that motion in the meridional plane is governed bya Hamiltonian H(R, z, pR, pz) in which Lz occurs as a parameter and thephase space is four-dimensional. In this space the orbital tori are ordinarytwo-dimensional doughnuts, and a surface of section is simply a cross-sectionthrough a nested sequence of such tori: each invariant curve marks the in-tersection of a two-dimensional doughnut with the two-dimensional surfaceof section.

(b) Epicycle approximation In §3.2.3 we used the epicycle approxima-


tion to obtain solutions to the equations of motion that are approximatelyvalid for nearly circular orbits in an axisymmetric potential. Here we ob-tain the corresponding approximate angle-action variables. In cylindricalcoordinates the Hamilton–Jacobi equation (3.205) is

12

( ∂S

∂R

)2+

1

2R2

(∂S

∂φ

)2+ 1

2

(∂S

∂z

)2+ Φ(R, z) = E. (3.254)

As in equation (2.75a) we assume that Φ is of the form ΦR(R) + Φz(z); theradial dependence of Φz(z) is suppressed because the radial motion is smallin the epicycle approximation. We further assume that S is of the formS(J, R,φ, z) = SR(J, R) + Sφ(J,φ) + Sz(J, z). Now we use the method ofseparation of variables to split equation (3.254) up into three parts:

Ez = 12

(∂Sz

∂z

)2+ Φz(z) ; L2

z =(∂Sφ

∂φ

)2

E − Ez = 12

(∂SR

∂R

)2+ ΦR(R) +

L2z

2R2,

(3.255)

where Ez and Lz are the two constants of separation. The first equation ofthis set leads immediately to an integral for Sz(z)

Sz(z) =

∫ z

0dz′ εz

√2[Ez − Φz(z′)], (3.256)

where εz is chosen to be ±1 such that the integral increases monotonicallyalong the path. If, as in §3.2.3, we assume that Φz = 1

2ν2z2, where ν is a

constant, then our equation for Sz becomes essentially the same as the firstof equations (3.211), and by analogy with equations (3.213) and (3.216), wehave

Jz =Ez

ν; z = −

√2Jz

νcos θz. (3.257)

The second of equations (3.255) trivially yields

Sφ(J,φ) = Lzφ, (3.258)

and it immediately follows that Jφ = Lz. The last of equations (3.255) yields

2(E − Ez) =(∂SR

∂R

)2+ 2Φeff(R), (3.259a)

where (cf. eq. 3.68b)

Φeff(R) ≡ ΦR(R) +J2

φ

2R2. (3.259b)


The epicycle approximation involves expanding Φeff about its minimum,which occurs at the radius Rg(Jφ) of the circular orbit of angular momentumJφ; with x defined by R = Rg + x, the expansion is Φ(R) = Ec(Jφ)+ 1

2κ2x2,

where Ec(Jφ) is the energy of the circular orbit of angular momentum Jφ

and κ is the epicycle frequency defined in equation (3.77). Inserting thisexpansion into (3.259a) and defining ER ≡ E − Ez − Ec, we have

2ER =(∂SR

∂R

)2+ κ2x2, (3.260)

which is the same as equation (3.210) with K2 replaced by 2ER, x by R, andωx by κ. It follows from equations (3.213), (3.216) and (3.217) that

JR =ER

κ; SR(J, R) = JR(θR − 1

2 sin 2θR) ; R = Rg −√

2JR

κcos θR.

(3.261)The last of these equations is equivalent to equation (3.91) if we set θR =κt + α and X = −(2JR/κ)1/2.

Finally, we find an expression for θφ. With equations (3.258) and (3.261)we have

θφ =∂S

∂Jφ=

∂Sφ

∂Jφ+∂SR

∂Jφ= φ+ JR(1 − cos 2θR)

∂θR

∂Jφ

= φ+ 2JR sin2 θR∂θR

∂Jφ.

(3.262)

The derivative of θR has to be taken at constant JR, Jz, R,φ, and z. Wedifferentiate the last of equations (3.261) bearing in mind that both Rg andκ are functions of Jφ:

0 =dRg

dJφ+

1

2κ

dκ

dJφ

√2JR

κcos θR +

√2JR

κsin θR

∂θR

∂Jφ. (3.263)

By differentiating R2gΩg = Jφ with respect to Rg we may show with equation

(3.80) thatdRg

dJφ=

γ

κRg, (3.264)

where γ = 2Ωg/κ is defined by equation (3.93b). Inserting this relation into(3.263) and using the result to eliminate ∂θR/∂Jφ from (3.262), we havefinally

θφ = φ−γ

Rg

√2JR

κsin θR −

JR

2

d lnκ

dJφsin 2θR. (3.265)

This expression should be compared with equation (3.93a). If we set θφ =Ωgt+φ0, θR = κt+α+π, and X = (2JR/κ)1/2 as before, the only difference


Figure 3.30 The boundaries ofloop and box orbits in barred po-tentials approximately coincide withthe curves of a system of spheroidalcoordinates. The figure shows twoorbits in the potential ΦL of equa-tion (3.103), and a number of curveson which the coordinates u and vdefined by equations (3.267) are con-stant.

between the two equations is the presence of a term proportional to sin 2θR inequation (3.265). For nearly circular orbits, this term is smaller than the termproportional to sin θR by

√JR/Jφ and represents a correction to equation

(3.92) that makes the (θR, θφ, JR, Jφ) coordinates canonical (Dehnen 1999a).It is worth noting that when JR /= 0, the frequency associated with φ

is not the circular frequency, Ωg. To see this, recall that the HamiltonianH = ER + Ec + Ez , and ER = κJR, while dEc/dJφ = Ωg, so

Ωφ =∂H

∂Jφ=

dκ

dJφJR + Ωg. (3.266)

3.5.4 Angle-action variables for a non-rotating bar

The (u, v) coordinate system that allowed us to recover angle-action variablesfor flattened axisymmetric potentials enables us to do the same for a planar,non-rotating bar. This fact is remarkable, because we saw in §3.3 that thesesystems support two completely different types of orbit, loops and boxes.Figure 3.30 makes it plausible that the (u, v) system can provide analyticsolutions for both loops and boxes, by showing that the orbits plotted inFigure 3.8 have boundaries that may be approximated by curves of constant uand v (cf. the discussion on page 226). We can explore this idea quantitativelyby defining

x = ∆ sinh u sin v ; y = ∆ cosh u cos v (3.267)

and then replacing R by x and z by y in the formulae of the previous subsec-tion. Further setting φ = Lz = 0 we find by analogy with equations (3.249)that

pu = ±∆ sinh u√

2[E − Ueff(u)] ; pv = ±∆ sin v√

2[E − Veff(v)](3.268a)


Figure 3.31 The effective potentials defined by equations (3.268b) when U and V aregiven by equations (3.252). The curves are for I2 = 1, 0.25, 0.01, 0,−0.01,−0.25 and −1,with the largest values coming on top in the left panel and on the bottom in the rightpanel. The thick curves are for I2 = 0.

where

Ueff(u) =I2 + U(u)

sinh2 u; Veff(v) = −

I2 + V (v)

sin2 v. (3.268b)

Here U and V are connected to the gravitational potential by equation(3.247) as before and I2 is the constant of separation analogous to I3.

An orbit of specified E and I2 is confined to values of u and v at whichboth E ≥ Ueff and E ≥ Veff . Figure 3.31 shows the effective potentials asfunctions of their coordinates for several values of I2 when U and V arechosen to be the functions specified by equations (3.252). In each panel thethick curve is for I2 = 0, with curves for I2 > 0 lying above this in the leftpanel, and below it on the right. Since the curves of Ueff have minima onlywhen I2 > 0, there is a lower limit on the star’s u coordinate only in thiscase. Consequently, stars with I2 ≤ 0 can reach the center, while stars withI2 > 0 cannot reach the center. This suggests that when I2 ≤ 0 the orbit isa box orbit, while when I2 > 0 it is a loop orbit. Comparison of the rightand left panels confirms this conjecture by showing that when I2 > 0 (uppercurves on left and lower curves on right), the minimum value of Ueff is greaterthan the maximum of Veff . Hence when I2 > 0 the condition E > Veff(v)imposes no constraint on v and the boundaries of the orbit are the ellipsesu = umin and u = umax on which E = Ueff . When I2 ≤ 0, by contrast, thecurves on the right tend to ∞ as v → 0, so sufficiently small values of v areexcluded and the boundaries of the orbit are the ellipse u = umax on whichE = Ueff(u) and the hyperbola |v| = vmin on which E = Veff(v).


Figure 3.32 The (u, pu), v = 0 surface of section for motion at E = −0.25 in theStackel potential defined by equations (3.247) and (3.252) with ∆ = 0.6 and a3 = 1.Each curve is a contour of constant I2 (eqs. 3.268). The invariant curves of box orbits(I2 = −0.6,−0.4, . . .) run round the outside of the figure, while the bull’s-eyes at right arethe invariant curves of anti-clockwise loop orbits. Temporarily suspending the conventionthat loops always have u > 0, we show the invariant curves of clockwise loops as thebull’s-eyes at left.

Figure 3.32 shows the (u, pu) surface of section, which is in practicenothing more than a contour plot of the integral I2(E, u, pu) with E fixed(eq. 3.268a). Each contour shows the curve in which an orbital torus is slicedby the surface of section. As in Figure 3.9, for example, there are two differenttypes of contour, namely those generated by the tori of loop orbits (whichcome in pairs, because there are both clockwise and anti-clockwise circulatingloops), and those generated by the tori of box orbits, which envelop all thetori of the loop orbits.

3.5.5 Summary

We have made a considerable investment in the theory of angle-action vari-ables, which is repaid by the power of these variables in investigations of awide variety of dynamical problems. This power arises from the followingfeatures:(i) Angle-action variables are canonical. In particular, the phase-space vol-

ume d3θd3J is the same as the phase-space volume d3qd3p for any otherset of canonical variables (q,p), including the usual Cartesian coordi-nates (x,v).

(ii) Every set of angle-action variables (θ,J) is associated with a Hamil-tonian24 H(J), and orbits in this Hamiltonian have the simple form

24 If a given set of angle-action variables is associated with H(J), then it is also as-

3.6 Slowly varying potentials 237

J = constant , θ = Ωt + constant . A Hamiltonian that admits angle-action variables is said to be integrable. The simplicity of angle-actionvariables makes them indispensable for investigating motion in non-integrable Hamiltonians by using perturbation theory. This techniquewill be used to explore chaotic orbits in §3.7, and the stability of stellarsystems in Chapter 5.

(iii) In the next section we shall see that actions are usually invariant duringslow changes in the Hamiltonian.

3.6 Slowly varying potentials

So far we have been concerned with motion in potentials that are time-independent in either an inertial or a rotating frame. It is sometimes nec-essary to consider how stars move in potentials that are time-dependent.The nature of the problem posed by a time-varying potential depends onthe speed with which the potential evolves. In this section we shall confineourselves to potentials that evolve slowly, in which case angle-action vari-ables enable us to predict how a stellar system will respond to changes inthe gravitational field that confines it. Such changes occur when:(i) Encounters between the individual stars at the core of a dense stellar

system (such as a globular cluster or galaxy center) cause the core toevolve on a timescale of order the relaxation time (1.38), which is muchlonger than the orbital times of individual stars (§7.5).

(ii) Stars of galaxies and globular clusters lose substantial quantities of massas they gradually evolve and shed their envelopes into interstellar orintergalactic space (Box 7.2).

(iii) Gas settles into the equatorial plane of a pre-existing dark halo to forma spiral galaxy. In this case the orbits of the halo’s dark-matter particleswill undergo a slow evolution as the gravitational potential of the diskgains in strength.

Potential variations that are slow compared to a typical orbital frequencyare called adiabatic. We now show that the actions of stars are constantduring such adiabatic changes of the potential. For this reason actions areoften called adiabatic invariants.

3.6.1 Adiabatic invariance of actions

Suppose we have a sequence of potentials Φλ(x) that depend continuouslyon the parameter λ. For each fixed λ we assume that angle-action variablescould be constructed for Φλ. That is, we assume that at all times phase spaceis filled by arrays of nested tori on which the phase points of individual stars

sociated with H(J) ≡ f [H(J)], where f is any differentiable function. Thus, a set ofangle-action variables is associated with infinitely many Hamiltonians.


travel. We consider what happens when λ is changed from its initial value,say λ = λ0, to a new value λ1. After this change has occurred, each star’sphase point will start to move on a torus of the set that belongs to Φλ1 . Ingeneral, two stellar phase points that started out on the same torus of Φλ0 willend up on two different tori of Φλ1 . But if λ is changed very slowly comparedto all the characteristic times 2π/Ωk associated with motion on each torus,all phase points that are initially on a given torus of Φλ0 will be equallyaffected by the variation of λ. This statement follows from the time averagestheorem of §3.5.1a, which shows that all stars spend the same fraction oftheir time in each portion of the torus; hence, all stars are affected by slowchanges in Φλ in the same way. Thus all phase points that start on the sametorus of Φλ0 will end on a single torus of Φλ1 . Said in other language, anytwo stars that are initially on a common orbit (but at different phases) willstill be on a common orbit after the slow variation of λ is complete.

Suppose the variation of λ starts at time t = 0 and is complete by timeτ , and let Ht be the time-evolution operator defined in equation (D.55).Then we have just seen that Hτ , which is a canonical map (see AppendixD.4.4), maps tori of Φλ0 onto tori of Φλ1 . These facts guarantee that actionsare adiabatically invariant, for the following reason. Choose three closedcurves γi, on any torus M of Φλ0 that through the integrals (3.195) generatethe actions Ji of this torus. Then, since Hτ is the endpoint of a continuousdeformation of phase space into itself, the images Hτ (γi) of these curves aresuitable curves along which to evaluate the actions J ′

i of Hτ (M), the torusto which M is mapped by Hτ . But by a corollary to the Poincare invarianttheorem (Appendix D.4.2), we have that if γ is any closed curve and Hτ (γ)is its image under the canonical map Hτ , then

∮

Hτ (γ)p · dq =

∮

γp · dq. (3.269)

Hence J ′i = Ji, and the actions of stars do not change if the potential evolves

sufficiently slowly.It should be stressed that any action Ji with fundamental frequency

Ωi = 0 is not an adiabatic invariant. For example, in a spherical potential,J2 and J3 are normally adiabatic invariants, but J1 is not (Table 3.1).

3.6.2 Applications

We illustrate these ideas with a number of simple examples. Other ap-plications of adiabatic invariants will be found in Binney & May (1986),Lichtenberg & Lieberman (1992), and §4.6.1.

(a) Harmonic oscillator We first consider the one-dimensional harmonicoscillator whose potential is

Φ = 12ω

2x2. (3.270)


Figure 3.33 Checking the invari-ance of the action (3.271) when thenatural frequency of a harmonic os-cillator is varied according to equa-tion (3.277). ∆J is the rms changein the action on integrating the os-cillator’s equation of motion fromt = −20T to t = 20T , using eightequally spaced phases. The rms

change in J declines approximatelyas ∆J ∝ exp(−2.8ω0T ).

By equation (3.213) the action is

J =1

2ω

[p2 + (ωx)2

]=

H

ω, (3.271)

where H(x, p) = 12p2 + 1

2ω2x2. The general solution of the equations of

motion is x(t) = X cos(ωt + φ). In terms of the amplitude of oscillation Xwe have

J = 12ωX2. (3.272)

Now suppose that the oscillator’s spring is slowly stiffened by a factor s2 > 1,so the natural frequency increases to

ω′ = sω. (3.273)

By the adiabatic invariance of J , the new amplitude X ′ satisfies

12ω

′X ′2 = J = 12ωX2. (3.274)

Thus the amplitude is diminished to

X ′ =X√s, (3.275)

while the energy, E = ωJ , has increased to25

E′ = ω′J = sωJ = sE. (3.276)

25 The simplest proof of this result uses quantum mechanics. The energy of a harmonicoscillator is E = (n + 1

2 )hω where n is an integer. When ω is slowly varied, n cannotchange discontinuously and hence must remain constant. Therefore E/ω = E ′/ω′. Ofcourse, for galaxies n is rather large.


Figure 3.34 The envelope of an orbit in the effective potential (3.70) with q = 0.5 (lightcurve) is well modeled by equation (3.279) (heavy curves).

We now ask how rapidly we can change the frequency ω without de-stroying the invariance of J . Let ω vary with time according to

ω(t) = π√

3 + erf(t/T ) . (3.277)

Thus the frequency changes from ω = ω0 ≡√

2π at t + −T to ω = 2π =√2ω0 at t ) T . In Figure 3.33 we show the results of numerically integrating

the oscillator’s equation of motion with ω(t) given by equation (3.277). Weplot the rms difference ∆J between the initial and final values of J for eightdifferent phases of the oscillator at t = −20T . For ω0T ∼> 2, J changes byless than half a percent, and for ω0T ∼> 4, J changes by less than 3 × 10−5.We conclude that the potential does not have to change very slowly for J tobe well conserved. In fact, one can show that the fractional change in J is ingeneral less than exp(−ωT ) for ωT ) 1 (Lichtenberg & Lieberman 1992).

(b) Eccentric orbits in a disk Consider the shapes shown in Figure 3.4of the orbits in the meridional plane of an axisymmetric galaxy. On page 167we remarked that disk stars in the solar neighborhood oscillate perpendicularto the galactic plane considerably more rapidly than they oscillate in theradial direction. Therefore, if we take the radial coordinate R(t) of a diskstar to be a known function of time, we may consider the equation of motion(3.67c) of the z-coordinate to describe motion in a slowly varying potential.If the amplitude of the z-oscillations is small, we may expand ∂Φ/∂z aboutz = 0 to find

z ( −ω2z where ω(t) ≡(∂2Φ

∂z2

)1/2

[R(t),0]

≡√

Φzz[R(t), 0]. (3.278)

If the action integral of this harmonic oscillator is conserved, we expect theamplitude Z(R) to satisfy (see eqs. 3.273 and 3.275)

Z(R) = Z(R0)

(Φzz(R0, 0)

Φzz(R, 0)

)1/4

. (3.279)

Figure 3.34 compares the prediction of (3.279) with the true shape of anorbit in the effective potential (3.70). Evidently the behavior of such orbitscan be accurately understood in terms of adiabatic invariants.

(c) Transient perturbations Consider the motion of a star on a loop


orbit in a slowly varying planar potential Φ(R,φ). The relevant action is

Jφ =1

2π

∫ 2π

0dφ pφ. (3.280)

We now conduct the following thought experiment. Initially the potential Φis axisymmetric. Then pφ = Lz is an integral, and we can trivially evaluatethe integral in (3.280) to obtain Jφ = Lz. We now slowly distort the potentialin some arbitrary fashion into a new axisymmetric configuration. At theend of this operation, the azimuthal action, being adiabatically invariant,still has value Jφ and is again equal to the angular momentum Lz. Thusthe star finishes the experiment with the same angular momentum withwhich it started,26 even though its instantaneous angular momentum, pφ,was changing during most of the experiment. Of course, if the potentialremains axisymmetric throughout, pφ remains an integral at all times and isexactly conserved no matter how rapidly the potential is varied.

A closely related example is a slowly varying external perturbation ofa stellar system, perhaps from the gravitational field of an object passingat a low angular velocity. If the passage is slow enough, the actions areadiabatically invariant, so the distribution of actions in the perturbed systemwill be unchanged by the encounter. In other words, adiabatic encounters,even strong ones, have no lasting effect on a stellar system (§8.2c).

(d) Slow growth of a central black hole As our final application ofthe adiabatic invariance of actions, we consider the evolution of the orbit ofa star near the center of a spherical galaxy, as a massive black hole growsby slowly accreting matter (Goodman & Binney 1984). A more completetreatment of the problem is given in §4.6.1d. We assume that prior to theformation of the hole, the density of material interior to the orbit can betaken to be a constant, so the potential is that of the spherical harmonicoscillator. It is then easy to show that the star’s Hamiltonian can be written(Problem 3.36)

H = ΩrJr + ΩφJφ = 2ΩJr + ΩJφ, (3.281)

where Ω = Ωφ = 12Ωr is the circular frequency, and Jφ = L is the magnitude

of the angular-momentum vector. The radii rmin and rmax of peri- andapocenter are the roots of

0 =J2

φ

2r2+ 1

2Ω2r2 − H ⇒ 0 = r4 −2H

Ω2r2 +

J2φ

Ω2. (3.282)

26 This statement does not apply for stars that switch from loop to box orbits and backagain as the potential is varied (Binney & Spergel 1983; Evans & Collett 1994). Thesestars will generally be on highly eccentric orbits initially.


Hence, the axis ratio of the orbit is

qH =rmin

rmax=

(H/Ω2 −

[(H/Ω2)2 − (Jφ/Ω)2

]1/2

H/Ω2 +[(H/Ω2)2 − (Jφ/Ω)2

]1/2

)1/2

=

(2Jr + Jφ − 2[Jr(Jr + Jφ)]1/2

2Jr + Jφ + 2[Jr(Jr + Jφ)]1/2

)1/2

.

(3.283)

Multiplying top and bottom of the fraction by the top, this last expressionreduces to

qH =1

Jφ

[2Jr + Jφ − 2

√Jr(Jr + Jφ)

]. (3.284)

When the hole has become sufficiently massive, the Hamiltonian may betaken to be that for Kepler motion (eq. E.6) and the orbit becomes an ellipsewith the black hole at the focus rather than the center of the ellipse. A similarcalculation yields for the axis ratio of this ellipse

qK =

[1 −

(rmax − rmin

rmax + rmin

)2]1/2

=Jφ

Jr + Jφ. (3.285)

When Jr/Jφ is eliminated between equations (3.284) and (3.285), we find

qK =4qH

(1 + qH)2. (3.286)

For example, if qH = 0.5 is the original axis ratio, the final one is qK = 0.889,and if initially qH = 0.75, then finally qK = 0.980. Physically, an elongatedellipse that is centered on the black hole distorts into a much rounder orbitwith the black hole at one focus.

For any orbit in a spherical potential the mean-square radial speed is

v2r =

Ωr

π

∫ π/Ωr

0dt v2

r =Ωr

π

∫ rmax

rmin

dr vr = ΩrJr. (3.287a)

Similarly, the mean-square tangential speed is

v2t =

Ωφ

2π

∫ 2π/Ωφ

0dt (Rφ)2 =

Ωφ

2π

∫ 2π

0dφ pφ = ΩφJφ. (3.287b)

Since the actions do not change as the hole grows, the change in the ratio ofthe mean-square speeds is given by

(v2

r/v2t

)K(

v2r/v2

t

)H

=(Ωr/Ωφ)K(Ωr/Ωφ)H

=1

2. (3.288)

Consequently, the growth of the black hole increases the star’s tangentialvelocity much more than it does the radial velocity, irrespective of the originaleccentricity of the orbit. In §4.6.1a we shall investigate the implications ofthis result for measurements of the stellar velocity dispersion near a blackhole, and show how the growth of the black hole enhances the density ofstars in its vicinity.

3.7 Perturbations and chaos 243

3.7 Perturbations and chaos

Analytic solutions to a star’s equations of motion exist for only a few simplepotentials Φ(x). If we want to know how stars will move in a more complexpotential, for example one estimated from observational data, two strategiesare open to us: either solve the equations of motion numerically, or obtainan approximate analytic solution by invoking perturbation theory, whichinvolves expressing the given potential as a sum of a potential for whichwe can solve the equations of motion analytically and a (one hopes) smalladditional term.

Even in the age of fast, cheap and convenient numerical computation,perturbative solutions to the equations of motion are useful in two ways.First, they can be used to investigate the stability of stellar systems (§5.3).Second, they give physical insight into the dynamics of orbits. We startthis section by developing perturbation theory and sketching some of itsastronomical applications; then we describe the phenomenon of orbital chaos,and show that Hamiltonian perturbation theory helps us to understand thephysics of this phenomenon.

3.7.1 Hamiltonian perturbation theory

In §3.3.3 we derived approximate orbits in the potential of a weak bar, bytreating the potential as a superposition of a small non-axisymmetric poten-tial and a much larger axisymmetric one. Our approach involved writing theorbit x(t) as a sum of two parts, one of which described the circular orbitof a guiding center, while the other described epicyclic motion. We workeddirectly with the equations of motion. Angle-action variables enable us todevelop a more powerful perturbative scheme, in which we work with scalarfunctions rather than coordinates, and think of the orbit as a torus in phasespace rather than a time-ordered series of points along a trajectory. For moredetail see Lichtenberg & Lieberman (1992).

Let H0 be an integrable Hamiltonian, and consider the one-parameterfamily of Hamiltonians

Hβ ≡ H0 + βh, (3.289)

where β + 1 and h is a Hamiltonian with gradients that are comparablein magnitude to those of H0. Let (θβ ,Jβ) be angle-action variables forHβ. These coordinates are related to the angle-action variables of H0 by acanonical transformation. As β → 0 the generating function S (AppendixD.4.6) of this transformation will tend to the generating function of theidentity transformation, so we may write

S(θβ ,J0) = θβ · J0 + sβ(θβ ,J0), (3.290)

where sβ is O(β), and (eq. D.94)

Jβ =∂S

∂θβ= J0 +

∂sβ

∂θβ; θ0 = θβ +

∂sβ

∂J0. (3.291)


Substituting these equations into (3.289), we have

Hβ(Jβ) = H0(J0) + βh(θ0,J0)

= H0(Jβ −

∂sβ

∂θβ

)+ βh

(θβ +

∂sβ

∂J0,Jβ −

∂sβ

∂θβ

)

= H0(Jβ) − Ω0(Jβ) ·∂sβ

∂θβ+ βh(θβ,Jβ) + O(β2),

(3.292)

where Ω0 is the derivative of H0 with respect to its argument. We nextexpand h and sβ as Fourier series in the periodic angle variables (AppendixB.4):

h(θβ ,Jβ) =∑

n

hn(Jβ) ein·θβ

; sβ(θβ ,J0) = i∑

n

sβn(J0) ein·θβ

, (3.293)

where n = (n1, n2, n3) is a triple of integers. Substituting these expressionsinto (3.292) we find

Hβ(Jβ) = H0(Jβ) + βh0 +∑

n (=0

(βhn + n · Ω0sβ

n

)ein·θβ

+ O(β2). (3.294)

In this equation Ω0 and hn are functions of Jβ , while sn is a function of J0,but to the required order in β, J0 can be replaced by Jβ .

Since the left side of equation (3.294) does not depend on θβ , on theright the coefficient of exp(in · θβ) must vanish for all n /= 0. Hence theFourier coefficients of S are given by

sβn(J) = −

βhn(J)

n · Ω0(J)+ O(β2) (n /= 0). (3.295)

The O(β) part of equation (3.295) defines the generating function of acanonical transformation. Let (θ′,J′) be the images of (θ0,J0) under thistransformation. Then we have shown that

Hβ(Jβ) = H ′(J′) + β2h′(θ′,J′), (3.296a)

whereH ′(J′) ≡ H0(J′) + βh0(J

′) (3.296b)

and h′ is a function involving second derivatives of H0 and first derivativesof h.

The analysis we have developed can be used to approximate orbits ina given potential. As we saw in §3.2.2, if we know an integral other thanthe Hamiltonian of a system with two degrees of freedom, we can calculatethe curve in a surface of section on which the consequents of a numerically


Figure 3.35 A surface of section for orbits in a flatted isochrone potential. The densitydistribution generating the potential has axis ratio q = 0.7. The points are the consequentsof numerically calculated orbits. The dotted curves show the orbital tori for the sphericalisochrone potential that have the same actions as the numerically integrated orbits. Thefull curves show the result of using first-order Hamiltonian perturbation theory to deformthese tori.

integrated orbit should lie. Since J′ differs from the true action by onlyO(β2) it should provide an approximate integral of motion, and it is inter-esting to compare the invariant curve that it yields with the consequents ofa numerically integrated orbit. Figure 3.35 is a surface of section for orbitsin a flattened isochrone potential. The density distribution that generatesthis potential is obtained by replacing r by

√R2 + z2/q2 and M by M/q

in equations (2.48b) and (2.49). The axis ratio q has been set equal to 0.7.The dots show the consequents of numerically integrated orbits. The dottedcurves show the corresponding invariant curves for the spherical isochrone.The full curves show the results of applying first-order perturbation theory tothe spherical isochrone to obtain better approximations to invariant curves.

The full curves in Figure 3.35 fit the numerical consequents much betterthan the dotted curves, but the fit is not perfect. An obvious strategy for sys-tematically improving our approximation to the true angle-action variablesis to use our existing machinery to derive from (3.296a) a second canonicaltransformation that would enable us to write H as a sum of a HamiltonianH ′′(J′′) that is a function of new actions J′′ and a yet smaller perturbationβ4h′′. After we have performed k transformations, the angle-dependent partof Hβ will be of order β2k

. In practice this procedure is unlikely to workbecause after each application the “unperturbed” frequencies of the orbitchange from Ω′ = ∂H ′/∂J′ to Ω′′ = ∂H ′′/∂J′′, and sooner or later we willfind that n · Ω′′ is very close to zero for some n, with the consequence thatthe corresponding term in the generating function (3.295) becomes large.


This is the problem of small divisors. Fortunately, in many applicationsthe coefficients hn in the numerators of (3.295) decline sufficiently quickly as

|n| increases that for most orbits |β2khn/n · Ω| is small for all n.

Box 3.5 outlines how the so-called KAM theory enables one to over-come the problem of small divisors for most tori, and for them constructa convergent series of canonical transformations that yield the angle-actionvariables of Hβ to arbitrarily high accuracy for sufficiently small β.

3.7.2 Trapping by resonances

Figure 3.36, like Figure 3.35, is a surface of section for motion in a flattenedisochrone potential, but the axis ratio of the mass distribution that gener-ates the potential is now q = 0.4 rather than q = 0.7. The consequentsof two orbits are shown together with the approximations to the invariantcurves of these orbits that one obtains from the angle-action variables of thespherical isochrone potential with (full curves) and without (dotted curves)first-order perturbation theory. The inner full invariant curve is not very farremoved from the inner loop of orbital consequents, but the outer full invari-ant curve does not even have the same shape as the crescent of consequentsthat is generated by the second orbit. The deviation between the outer fullinvariant curve and the consequents is an example of resonant trapping,a phenomenon intimately connected with the problem of small divisors thatwas described above.

To understand this connection, consider how the frequencies of orbitsin the flattened isochrone potential are changed by first-order perturbationtheory. We obtain the new frequencies by differentiating equation (3.296b)with respect to the actions. Figure 3.37 shows the resulting ratio Ωr/Ωϑ as afunction of Jϑ at the energy of Figure 3.36. Whereas Ωr > Ωϑ for all unper-turbed orbits, for some perturbed orbit the resonant condition Ωr − Ωϑ = 0is satisfied. Consequently, if we attempt to use equation (3.295) to refine thetori that generate the full curves in Figure 3.36, small divisors will lead tolarge distortions in the neighborhood of the resonant torus. These distortionswill be unphysical, but they are symptomatic of a real physical effect, namelya complete change in the way in which orbital tori are embedded in phasespace. The numerical consequents in Figure 3.36, which mark cross-sectionsthrough two tori, one before and one after the change in the embedding,make the change apparent: one torus encloses the shell orbit whose singleconsequent lies along pR = 0, while the other torus encloses the resonantorbit whose single consequent lies near (R, pR) = (2.2, 0.38).

Small divisors are important physically because they indicate that aperturbation is acting with one sign for a long time. If the effects of aperturbation can accumulate for long enough, they can become important,even if the perturbation is weak. So if N · Ω is small for some N, then theterm hN in the Hamiltonian can have big effects even if it is very small.


Box 3.5: KAM theory

Over the period 1954–1967 Kolmogorov, Arnold and Moser demonstratedthat, notwithstanding the problem of small divisors, convergent pertur-bation series can be constructed for Hamiltonians of the form (3.289).The key ideas are (i) to focus on a single invariant torus rather than acomplete foliation of phase space by invariant tori, and (ii) to determineat the outset the frequencies Ω of the torus to be constructed (Lichten-berg & Lieberman 1992). In particular, we specify that the frequencyratios are far from resonances in the sense that |n · Ω| > α|n|−γ for alln and some fixed, non-negative numbers α and γ. We map an invarianttorus of H0 with frequencies Ω into an invariant torus of Hβ by meansof the generating function

S(θβ ,J0) = θβ · (J0 + j) + sβ(θβ,J0). (1)

This differs from (3.290) by the addition of a term θβ · j, where j is aconstant of order β. Proceeding in strict analogy with the derivation ofequations (3.295) and (3.296b), we find that if the Fourier coefficients ofsβ are chosen to be

sβn = −

βhn

n · Ω(n /= 0), (2)

then we obtain a canonical transformation to new coordinates (θ′,J′) interms of which Hβ takes the form (3.296a) with

H ′(J′) = H0(J′) + βh0(J′) − j · Ω. (3)

We now choose the parameter j such that the frequencies of H ′ are stillthe old frequencies Ω, which were far from any resonance. That is, wechoose j to be the solution of

β∂h0

∂Jj= j ·

∂Ω

∂Jj=∑

i

ji ·∂2H0

∂Ji∂Jj. (4)

This linear algebraic equation will be soluble provided the matrix ofsecond derivatives of H0 is non-degenerate. With j the solution to thisequation, the problem posed by Hβ in the (θ′,J′) coordinates differs fromour original problem only in that the perturbation is now O(β2). Conse-quently, a further canonical transformation will reduce the perturbationto O(β4) and so on indefinitely. From the condition |n ·Ω| > α|n|−γ onemay show that the series of transformations converges.

We now use this idea to obtain an analytic model of orbits near res-onances. Our working will be a generalization of the discussion of orbitaltrapping at Lindblad resonances in §3.3.3b. For definiteness we shall assume


Figure 3.36 The same as Figure 3.35 except that the density distribution generating thepotential now has axis ratio q = 0.4.

Figure 3.37 The ratio of the fre-quencies in first-order perturbationtheory for a star that moves in aflattened isochrone potential.

that there are three actions and three angles. The resonance of H0 is char-acterized by the equation N · Ω = 0, and (θ,J) are angle-action variablesfor the unperturbed Hamiltonian. Then in the neighborhood of the resonantorbit the linear combination of angle variables φs ≡ N · θ will evolve slowly,and we start by transforming to a set of angle-action variables that includesthe slow angle φs. To do so, we introduce new action variables Is, If1, andIf2 through the generating function

S = (N · θ)Is + θ1If1 + θ2If2. (3.297)

Then (eq. D.93)

φs =∂S

∂Is= N · θ

φf1 = θ1φf2 = θ2

J1 =∂S

∂θ1= N1Is + If1

J2 = N2Is + If2

J3 = N3Is.

(3.298)


Since the old actions are functions only of the new ones, H0 does not acquireany angle dependence when we make the canonical transformation, and theHamiltonian is of the form

H(φ, I) = H0(I) + β∑

n

hn(I)ein·φ, (3.299)

where it is to be understood that H0 is a different function of I than it wasof J and similarly for the dependence on I of hn. We now argue that anyterm in the sum that contains either of the fast angles φf1 and φf2 has anegligible effect on the dynamics—these terms give rise to forces that rapidlyaverage to zero. We therefore drop all terms except those with indices thatare multiples of n = ±(1, 0, 0), including n = 0. Then our approximateHamiltonian reduces to

H(φ, I) = H0(I) + β∑

k

hk(I) eikφs . (3.300)

Hamilton’s equations now read

Is = −iβ∑

k

khk(I) eikφs ; φs = Ωs + β∑

k

∂hk

∂Iseikφs

If1 = 0 ; If2 = 0,

(3.301)

where Ωs ≡ ∂H0/∂Is. So If1 and If2 are two constants of motion and wehave reduced our problem to one of motion in the (φs, Is) plane. EliminatingI between equations (3.298) and (3.301) we find that although all the oldactions vary, two linear combinations of them are constant:

N2J1 − N1J2 = constant ; N3J2 − N2J3 = constant. (3.302)

We next take the time derivative of the equation of motion (3.301) for φs.We note that Ωs, but not its derivative with respect to Is, is small becauseit vanishes on the resonant torus. Dropping all terms smaller than O(β),

φs (∂Ωs

∂IsIs = −iβ

∂Ωs

∂Is

∑

k

khkeikφs . (3.303)

If we define

V (φs) ≡ β∂Ωs

∂Is

∑

k

hk(I) eikφs , (3.304)

where I is evaluated on the resonant torus, then we can rewrite (3.303) as

φs = −dV

dφs. (3.305)


This is the equation of motion of an oscillator. If V were proportional to φ2s ,

the oscillator would be harmonic. In general it is an anharmonic oscillator,such as a pendulum, for which V ∝ cosφs. The oscillator’s energy invariantis

Ep ≡ 12 φ

2s + V (φs). (3.306)

V is a periodic function of φs, so it will have some maximum value Vmax,and if Ep > Vmax, φs circulates because equation (3.306) does not permit φs

to vanish. In this case the orbit is not resonantly trapped and the torus islike the ones shown in Figure 3.36 from first-order perturbation theory. IfEp < Vmax, the angle variable is confined to the range in which V ≤ Ep;the orbit has been trapped by the resonance. On trapped orbits φs librateswith an amplitude that can be of order unity, and at a frequency of order√β, while Is oscillates with an amplitude that cannot be bigger than order√β. Such orbits generate the kind of torus that is delineated by the crescent

of numerical consequents in Figure 3.36. We obtain an explicit expressionfor the resonantly induced change ∆Is by integrating the equation of motion(3.301) for Is:

∆Is = −(∂Ωs

∂Is

)−1∫

dφs∂V/∂φs

φs

= ±(∂Ωs

∂Is

)−1√2[Ep − V (φs)] ,

(3.307)

where (3.306) has been used to eliminate φs.The full curve in Figure 3.38 shows the result of applying this model

of a resonantly trapped orbit to the data depicted in Figure 3.36. Since themodel successfully reproduces the gross form of the invariant curve on whichthe consequents of the trapped orbit lie, we infer that the model has capturedthe essential physics of resonant trapping. The discrepancies between the fullcurve and the numerical consequents are attributable to the approximationsinherent in the model.

Levitation We now describe one example of an astronomical phenomenathat may be caused by resonant trapping of stellar orbits. Other examplesare discussed by Tremaine & Yu (2000). In our discussion we shall employJr, Jϑ and Jφ to denote the actions of a mildly non-spherical potential thatare the natural extensions of the corresponding actions for spherical systemsthat were introduced in §3.5.2.

The disk of the Milky Way seems to be a composite of two chemicallydistinct disks, namely the thin disk, to which the Sun belongs, and a thicker,more metal-poor disk (page 13). Sridhar & Touma (1996) have suggestedthat resonant trapping of the orbits of disk stars may have converted theGalaxy’s original thin disk into the thick disk. The theory of hierarchi-cal galaxy formation described in Chapter 9 predicts that the Galaxy wasoriginally dominated by collisionless dark matter, which is not highly con-centrated towards the plane. Consequently, the frequency Ωϑ at which a


Figure 3.38 Perturbation theory applied to resonant trapping in the flattened isochronepotential. The points are the consequents shown in Figure 3.36, while the full curves inthat figure are shown dotted here. The full curve shows the result of using (3.307) tomodel the resonantly trapped orbit.

star oscillated perpendicular to the plane was originally smaller than thefrequency Ωr of radial oscillations—see equation (3.82). As more and morebaryonic material accumulated near the Galaxy’s equatorial plane, the ratioΩϑ/Ωr rose slowly from a value less than unity to its present value. For starssuch as the Sun that are on nearly circular orbits within the plane, Ωr andΩϑ are equal to the current epicycle and vertical frequencies κ and ν, respec-tively, so now Ωϑ/Ωr ( 2 (page 167). It follows that the resonant conditionΩr = Ωϑ has at some stage been satisfied for many stars that formed whenthe inner Galaxy was dark-matter dominated.

Let us ask what happens to a star in the disk as the disk slowly growsand Ωϑ/Ωr slowly increases. At any energy, the first stars to satisfy theresonant condition Ωr = Ωϑ will have been those with the largest values ofΩϑ, that is, stars that orbit close to the plane, and have Jϑ ( 0. In an(R, pR) surface of section, such orbits lie near the zero-velocity curve thatbounds the figure (§3.2.2) because Jϑ increases as one moves in towards thecentral fixed point on pR = 0. Hence, the resonant condition will first havebeen satisfied on the zero-velocity curve, and it is here that the resonantisland seen in Figure 3.38 first emerged as the potential flattened. As massaccumulated in the disk, the island moved inwards, and, depending on thevalues of E and Lz, finally disappeared near the central fixed point.


Figure 3.39 A surface of section for motion in ΦL (eq. 3.103) with q = 0.6.

When the advancing edge of a resonant island reaches the star’s phase-space location, there are two possibilities: either (a) the star is trapped by theresonance and its phase-space point subsequently moves within the island,or (b) its phase-space point suddenly jumps to the other side of the island.Which of (a) or (b) occurs in a particular case depends on the precise phaseof the star’s orbit at which the edge reaches it. In practice it is most useful todiscard phase information and to consider that either (a) or (b) occurs withappropriate probabilities Pa and Pb = 1 − Pa. The value of Pa depends onthe speed with which the island is growing relative to the speed with whichits center is moving (Problem 3.43); it is zero if the island is shrinking.

We have seen that the resonant island associated with Ωr = Ωϑ firstemerged on the zero-velocity curve, which in a thin disk is highly populatedby stars. Most of these stars were trapped as the island grew. They thenmoved with the island as the latter moved in towards the central fixed point.The stars were finally released as the island shrank somewhere near thatpoint. The net effect of the island’s transitory existence is to convert radialaction to latitudinal action, thereby shifting stars from eccentric, planarorbits to rather circular but highly inclined ones. Hence, a hot thin diskcould have been transformed into a thick disk.


Figure 3.40 The appearance in realspace of a banana orbit (top) anda fish orbit (bottom). In the upperpanel the cross marks the center ofthe potential. Resonant box orbitsof these types are responsible forthe chains of islands in Figure 3.39.The banana orbits generate the outerchain of four islands, and the fishorbits the chain of six islands furtherin.

3.7.3 From order to chaos

Figure 3.39 is a surface of section for motion in the planar barred potentialΦL that is defined by equation (3.103) with q = 0.6 and Rc = 0.14. Itshould be compared with Figures 3.9 and 3.12, which are surfaces of sectionfor motion in ΦL for more nearly spherical cases, with q = 0.9 and 0.8. InFigure 3.39 one sees not only the invariant curves of loop and box orbits thatfill the other two figures, but also a number of “islands”: a set of four largeislands occupies much of the outer region, while a set of six islands of varyingsizes is seen further in. In the light of our discussion of resonant trapping,it is natural to refer to the orbits that generate these islands as resonantlytrapped box orbits. Figure 3.40 shows what these orbits look like in realspace. We see that the outer islands are generated by “banana” orbits inwhich the x- and y-oscillations are trapped in a Ωx:Ωy = 1:2 resonance (thestar oscillates through one cycle left to right while oscillating through twocycles up and down). Similarly, the inner chain of six islands is associatedwith a “fish” orbit that satisfies the resonance condition Ωx:Ωy = 2:3.

The islands in Figure 3.39 can be thought of as orbits in some underlyingintegrable Hamiltonian H0 that are trapped by a resonance arising from aperturbation. This concept lacks precision because we do not know what H0

actually is. In particular, Hamiltonians of the form Hq(x,v) = 12v2 + ΦL(x)

are probably not integrable for any value of the axis ratio q other thanunity. Therefore, we cannot simply assume that H0 = H0.8, say. On theother hand, Figure 3.12, which shows the surface of section for q = 0.8,contains no resonant islands—all orbits are either boxes or loops—which weknow from our study of Stackel potentials in §3.5.4 is compatible with anintegrable potential. So we can define an integrable Hamiltonian H0 thatdiffers very little from H0.8 as follows. On each of the invariant tori that


appears in Figure 3.12 we set H0 = H0.8, and at a general phase-space pointwe obtain the value of H0 by a suitable interpolation scheme from nearbypoints at which H0 = H0.8.

The procedure we have just described for defining H0 (and thus the per-turbation h = H −H0) suffers from the defect that it is arbitrary: why startfrom the invariant tori of H0.8 rather than H0.81 or some other Hamiltonian?A numerical procedure that might be considered less arbitrary has been de-scribed by Kaasalainen & Binney (1994). In any event, it is worth bearing inmind in the discussion that follows that H0 and h are not uniquely defined,and one really ought to demonstrate that for a given H the islands that arepredicted by perturbation theory are reasonably independent of H0. As faras we know, no such demonstration is available.

If we accept that the island chains in Figure 3.39 arise from box or-bits that are resonantly trapped by some perturbation h on a Stackel-likeHamiltonian H0, two questions arise. First, “are box orbits trapped aroundresonances other than the 1:2 and 2:3 resonances that generate the bananaand fish orbits of Figure 3.40?” Certainly infinitely many resonances areavailable to trap orbits because as one moves along the sequence of boxorbits from thin ones to fat ones, the period of the y-oscillations is steadilygrowing in parallel with their amplitude, while the period of the x-oscillationsis diminishing for the same reason.27 In fact, the transition to loop orbitscan be associated with resonant trapping by the 1:1 resonance, so betweenthe banana orbits and the loop orbits there is not only the 2:3 resonance thatgenerates the fishes, but also the 4:5, 5:6, . . . , resonances. In the potentialΦL on which our example is based, the width of the region in phase spacein which orbits are trapped by the m:n resonance diminishes rapidly with|m + n| and the higher-order resonances are hard to trace in the surface ofsection—but the 4:5 resonance can be seen in Figure 3.39.

The second question is “do resonances occur within resonant islands?”Consider the case of the banana orbits shown in Figure 3.40 as an example.Motion along this orbit is quasiperiodic with two independent frequencies.One independent frequency Ωb is associated with motion along the bow-shaped closed orbit that runs through the heart of the banana, while theother is the frequency of libration Ωl about this closed orbit. The librationfrequency decreases as one proceeds along the sequence of banana orbits fromthin ones to fat ones, so infinitely many resonant conditions Ωb:Ωl = m:nwill be satisfied within an island of banana orbits. In the case of ΦL thereis no evidence that any of these resonances traps orbits, but in another casewe might expect trapping to occur also within families of resonantly trappedorbits.

This discussion is rather disquieting because it implies that the degree towhich resonant trapping causes the regular structure of phase space inheritedfrom the underlying integral potential H0 to break up into islands depends

27 The period of a nonlinear oscillator almost always increases with amplitude.


Figure 3.41 Surface of section for motion in the potential ΦN of equation (3.309) withRe = 3. The inner region has been blanked out and is shown in expanded form inFigure 3.42.

on the detailed structure of the perturbation h. Since we have no unique wayof defining h we cannot compute its Fourier coefficients and cannot predicthow important islands will be.

We illustrate this point by examining motion in a potential that is closelyrelated to ΦL in which resonant trapping is much more important (Binney1982). In polar coordinates equation (3.103) for ΦL reads

ΦL(R,φ) = 12v2

0 ln[R2

c + 12R2(q−2 + 1) − 1

2R2(q−2 − 1) cos 2φ]. (3.308)

The potential

ΦN(R,φ) = 12v2

0 ln

[R2

c + 12R2(q−2 + 1)−1

2R2(q−2 − 1) cos 2φ

−R3

Recos 2φ

],

(3.309)

where Re is a constant, differs from ΦL only by the addition of (R3/Re) cos 2φto the logarithm’s argument. For R + Re this term is unimportant, but asR grows it makes the isopotential curves more elongated. Let us set Re = 3,Rc = 0.14, and q = 0.9, and study the surface of section generated by orbits


Figure 3.42 The inner part of the surface of section shown in Figure 3.41—the chain ofeight islands around the edge is the innermost chain in Figure 3.41. In the gap betweenthis chain and the bull’s-eyes are the consequents of two irregular orbits.

in ΦN that is most nearly equivalent to the surface of section for ΦL withthe same values of Rc and q that is shown in Figure 3.9. Figure 3.41 showsthe outer part of this surface of section. Unlike Figure 3.9 it shows severalchains of islands generated by resonantly trapped box orbits. The individualislands are smaller than those in Figure 3.39, and the regions of untrappedorbits between chains of islands are very thin. Figure 3.42 shows the innerpart of the same surface of section. In the gap between the region of regularbox orbits that is shown in Figure 3.41 and the two bull’s-eyes associatedwith loop orbits, there is an irregular fuzz of consequents. These consequentsbelong to just two orbits but they do not lie on smooth curves; they appear tobe randomly scattered over a two-dimensional region. Since the gap withinwhich these consequents fall lies just on the boundary of the loop-dominatedregion, we know that it contains infinitely many resonant box orbits. Hence,it is natural to conclude that the breakdown of orbital regularity, which therandom scattering of consequents betrays, is somehow caused by more thanone resonance simultaneously trying to trap an individual orbit. One saysthat the orbits have been made irregular by resonance overlap.

(a) Irregular orbits We now consider in more detail orbits whose con-sequents in a surface of section do not lie on a smooth curve, but appearto be irregularly sprinkled through a two-dimensional region. If we take the


Figure 3.43 Two orbits from the surface of section of Figure 3.42. The left orbit is notquasiperiodic, while the right one is.

Figure 3.44 Trapped and circulating orbits in a phase plane. The homoclinic orbit,shown by the heavy curve, divides the trapped orbits, which form a chain of islands, fromthe circulating orbits, whose consequents lie on the wavy lines at top and bottom.

Fourier transform of the time dependence of some coordinate, for examplex(t), along such an orbit, we will find that the orbit is not quasiperiodic; theFourier transform X(ω) (eq. B.69) has contributions from frequencies thatare not integer linear combinations of two or three fundamental frequen-cies. Figure 3.43 shows the appearance in real space of an orbit that is notquasiperiodic (left) and one that is (right). The lack of quasiperiodicity givesthe orbit a scruffy, irregular appearance, so orbits that are not quasiperiodicare called chaotic or irregular orbits.

There are generally some irregular orbits at the edge of a family ofresonantly trapped orbits. Figure 3.44 is a sketch of a surface of sectionthrough such a region of phase-space when all orbits are quasiperiodic. Theislands formed by the trapped orbits touch at their pointed ends and there areinvariant curves of orbits that circulate rather than librate coming right up tothese points. The points at which the islands touch are called hyperbolicfixed points and the invariant curves that pass through these points aregenerated by homoclinic orbits. In the presence of irregular orbits, the


islands of trapped orbits do not quite touch and the invariant curves ofthe circulating orbits do not reach right into the hyperbolic fixed point.Consequently there is space between the resonant islands and the region ofthe circulating orbits. Irregular orbits fill this space.

A typical irregular orbit alternates periods when it is resonantly trappedwith periods of circulation. Consequently, if one Fourier transforms x(t) overan appropriate time interval, the orbit may appear quasiperiodic, but thefundamental frequencies that would be obtained from the transform by themethod of Box 3.6 would depend on the time interval chosen for Fouriertransformation.

If the islands in a chain are individually small, it can be very hardto decide whether an orbit is librating or circulating, or doing both on anirregular pattern.

When it is available, a surface of section is the most effective way ofdiagnosing the presence of resonantly trapped and irregular orbits. Unfor-tunately, surfaces of section can be used to study three-dimensional orbitsonly when an analytic integral other than the Hamiltonian is known, as inthe case of orbits in an axisymmetric potential (§3.2). Two other methodsare available to detect irregular orbits when a surface of section cannot beused.

(b) Frequency analysis By numerically integrating the equations ofmotion from some initial conditions, we obtain time series x(t), y(t), etc.,for each of the phase-space coordinates. If the orbit is regular, these timeseries are equivalent to those obtained by substituting θ = θ0 + Ωt in theFourier expansions (3.191) of the coordinates. Hence, the frequencies Ωi

may be obtained by Fourier transforming the time series and identifying thevarious linear combinations n·Ω of the fundamental frequencies that occur inthe Fourier transform (Box 3.6; Binney & Spergel 1982). If a single systemof angle-action variables covers the entire phase space (as in the case ofStackel potentials), the actions Ji of the orbit that one obtains from a giveninitial condition w are continuous functions J(w) of w, so the frequenciesΩi = ∂H/∂Ji are also continuous functions of w. Consequently, if we chooseinitial conditions wα at the nodes of some regular two-dimensional grid inphase space, the frequencies will vary smoothly from point to point on thegrid. If, by contrast, resonant trapping is important, the actions of orbits willsometimes change discontinuously between adjacent grid points, because oneorbit will be trapped, while the next is not. Discontinuities in J give rise todiscontinuities in Ω. Moreover, the resonance that is entrapping orbits willbe apparent from the ratios ra ≡ Ω2/Ω1 and rb ≡ Ω3/Ω1. Hence a valuableway of probing the structure of phase space is to plot a dot at (ra, rb) foreach orbit obtained by integrating from a regular grid of initial conditionswα (Laskar 1990; Dumas & Laskar 1993).

Figure 3.45 shows an example of such a plot of frequency ratios. Theorbits plotted were integrated in the potential

Φ(x) = 12 ln[x2 + (y/0.9)2 + (z/0.7)2 + 0.1]. (3.310)


Box 3.6: Numerical determinationof orbital frequencies

The determination of orbital frequencies Ωi from a numerically integratedorbit is not entirely straightforward because (i) the orbit is integratedfor only a finite time interval (0, T ), and (ii) the function x(t) is sampledonly at discrete times t0 = 0, . . . , tK−1 = T , which we shall assume to beequally spaced. Let ∆ = ti+1−ti. Then a “line” Xeiωt in x(t) contributesto the discrete Fourier transform (Appendix G) an amount

xp = XK−1∑

k=0

eik∆(ω−ωp)

= Xeiαu sinπu

sin(πu/K),

where

ωp ≡2πp

K∆,

u ≡ K∆(ω − ωp)/(2π),

α ≡ π(K − 1)/K.

(1)

|xp| is large whenever the sine in the denominator vanishes, which occurswhen ωp ( ω + 2πm/∆, where m is any integer. Thus peaks can ariseat frequencies far from ω; a peak in |xp| that is due to a spectral linefar removed from ω is called an alias of the line. Near to a peak wecan make the approximation sin(πu/K) ( πu/K, so |xp| declines withdistance u from the peak only as u−1.

Orbital frequencies can be estimated by fitting equation (1) to thedata and thus determining ω. The main difficulty with this procedureis confusion between spectral lines—this confusion can arise either be-cause two lines are nearby, or because a line has a nearby alias. Oneway to reduce this confusion is to ensure a steeper falloff than u−1 bymultiplying the original time sequence by a “window” function w(t) thatgoes smoothly to zero at the beginning and end of the integration period(Press et al. 1986; Laskar 1990). Alternatively, one can identify peaks inthe second difference of the spectrum, defined by x′′

p = xp+1+ xp−1−2xp.One can show that for u/K + 1 the contribution to x′′

p of a line is

x′′p =

2XK

π

eiαu sinπu

u(u2 − 1), (2)

which falls off as u−3. The frequency, etc., of the line can be estimatedfrom the ratio of the x′′

p on either side of the line’s frequency.

Ωi was defined to be the non-zero frequency with the largest amplitude in thespectrum of the ith coordinate, and 10 000 orbits were obtained by droppingparticles from a grid of points on the surface Φ(x) = 0.5. Above and tothe right of the center of the figure, the points are organized into regularranks that reproduce the grid of initial conditions in slightly distorted form.


Figure 3.45 The ratios of orbital frequencies for orbits integrated in a three-dimensionalnon-rotating bar potential.

We infer that resonant trapping is unimportant in the phase-space regionsampled by these initial conditions. Running through the ranks we see severaldepopulated lines, while both within the ranks and beyond other lines areconspicuously heavily populated: orbits that have been resonantly trappedproduce points that lie along these lines. The integers ni in the relevantresonant condition n · Ω = 0 are indicated for some of the lines.

In some parts of Figure 3.45, for example the lower left region, the grid ofinitial conditions has become essentially untraceable. The disappearance ofthe grid indicates that irregular motion is important. In fact, the frequenciesΩi are not well defined for an irregular orbit, because its time series, x(t),y(t), etc., are not quasiperiodic. When software designed to extract thefrequencies of regular orbits is used on a time series that is not quasiperiodic,the frequencies returned vary erratically from one initial condition to the nextand the resulting points in the plane of frequency ratios scatter irregularly.

(c) Liapunov exponents If we integrate Hamilton’s equations for sometime t, we obtain a mapping Ht of phase space onto itself. Let Ht map thephase space point w0 into the point wt. Points near w0 will be mappedto points that lie near wt, and if we confine our attention to a sufficientlysmall region around w0, we may approximate Ht by a linear map of theneighborhood of w0 into a neighborhood of wt. We now determine this map.Let w′

0 be a point near w0, and δw(t) = Htw′0 − Htw0 be the difference


between the phase-space coordinates of the points reached by integratingHamilton’s equations for time t from the initial conditions w′

0 and w0. Thenthe equations of motion of the components of δw are

˙δx =(∂H

∂v

)

w′t

−(∂H

∂v

)

wt

(( ∂2H

∂w∂v

)

wt

· δw

˙δv = −(∂H

∂x

)

w′t

+(∂H

∂x

)

wt

( −( ∂2H

∂w∂x

)

wt

· δw,

(3.311)

where the approximate equality in each line involves approximating the firstderivatives of H by the leading terms in their Taylor series expansions. Equa-tions (3.311) are of the form

dδw

dt= Mt · δw where Mt ≡

∂2H

∂x∂v

∂2H

∂v∂v

−∂2H

∂x∂x−∂2H

∂v∂x

. (3.312)

For any initial vector δw0 these equations are solved by δwt = Ut · δw0,where Ut is the matrix that solves

dUt

dt= Mt · Ut. (3.313)

We integrate this set of ordinary coupled linear differential equations fromU0 = I in parallel with Hamilton’s equations of motion for the orbit. Thenwe are in possession of the matrix Ut that describes the desired linear mapof a neighborhood of w0 into a neighborhood of wt. We perform a “singular-value decomposition” of Ut (Press et al. 1986), that is we write it as a productUt = R2 · S · R1 of two orthogonal matrices Ri and a diagonal matrix S.28

Ut conserves phase-space volume (page 803), so it never maps any vectorto zero and the diagonal elements of S are all non-zero. In fact they are allpositive because Ut evolves continuously from the identity, and their productis unity. A useful measure of the amount by which Ut shears phase space isthe magnitude s of the largest element of S. The Liapunov exponent ofthe orbit along which (3.313) has been integrated is defined to be

λ = limt→∞

ln s

t. (3.314)

28 Any linear transformation of an N-dimensional vector space can be decomposed intoa rotation, a rescaling in N perpendicular directions, and another rotation. R1 rotatesaxes to the frame in which the coordinate directions coincide with the scaling directions.S effects the rescaling. R2 first rotates the coordinate directions back to their old valuesand then effects whatever overall rotation is required.


Since the scaling s is dimensionless, the Liapunov exponent λ has dimensionsof a frequency. In practice one avoids integrating (3.313) for long timesbecause numerical difficulties would be encountered once the ratio of thelargest and smallest numbers on the diagonal of S became large. Instead oneintegrates along the orbit for some time t1 to obtain a value s1, and thensets Ut back to the identity and continues integrating for a further time t2 toobtain s2, after which Ut is again set to the identity before the integrationis continued. After N such steps one estimates λ from

λ (∑N

i ln si∑Ni ti

. (3.315)

Using this procedure one finds that along a regular orbit λ → 0, while alongan irregular orbit λ is non-zero.

Angle-action variables enable us to understand why λ is zero for a regularorbit. A point near w0 will have angles and actions that differ from thoseof w0 by small amounts δθi, δJi. The action differences are invariant as wemove along the orbit, while the angle differences increase linearly in time dueto differences in the frequencies Ωi of the orbits on which our initial pointand w0 lie. Consequently, the scalings si associated with angle differencesincrease linearly in time, and, by (3.314), the Liapunov exponent is λ =limt→∞ t−1 ln t = 0.

If the Liapunov exponent of an orbit is non-zero, the largest scaling fac-tor s must increase exponentially in time. Thus in this case initially neigh-boring orbits diverge exponentially in time. It should be noted, however,that this exponential divergence holds only so long as the orbits remain closein phase space: the definition of the Liapunov exponent is in terms of thelinearized equations for orbital perturbations. The approximations involvedin deriving these equations will soon be violated if the solutions to the equa-tions are exponentially growing. Hence, we cannot conclude from the factthat an orbit’s Liapunov exponent is non-zero that an initially neighboringorbit will necessarily stray far from the original orbit.

3.8 Orbits in elliptical galaxies

Elliptical galaxies nearly always have cusps in their central density profilesin which ρ ∼ r−α with 0.3 ∼< α ∼< 2 (BM §4.3.1). Black holes with masses∼ 0.2% of the mass of the visible galaxy are believed to reside at the centersof these cusps (§1.1.6 and BM §11.2.2). Further out the mass distributionsof many elliptical galaxies are thought to be triaxial (BM §4.3.3). Thesefeatures make the orbital dynamics of elliptical dynamics especially rich,and illustrate aspects of galaxy dynamics that we have already discussed inthis chapter (Merritt & Fridman 1996; Merritt & Valluri 1999).

3.8 Orbits in elliptical galaxies 263

3.8.1 The perfect ellipsoid

A useful basic model of the orbital dynamics of a triaxial elliptical galaxy isprovided by extensions to three dimensions of the two-dimensional Stackelpotentials of §3.5.4 (de Zeeuw 1985). The simplest three-dimensional systemthat generates a Stackel potential through Poisson’s equation is the perfectellipsoid, in which the density is given by

ρ(x) =ρ0

(1 + m2)2where m2 ≡

x2 + (y/q1)2 + (z/q2)2

a20

. (3.316)

In this formula q1 and q2 are the axis ratios of the ellipsoidal surfaces ofconstant density, and a0 is a scale length. At radii significantly smaller thana0, the density is approximately constant, while at r ) a0 the density fallsoff ∝ r−4. Since these asymptotic forms differ from those characteristic ofelliptical galaxies, we have to expect the orbital structures of real galaxies todiffer in detail from that of the perfect ellipsoid, but nevertheless the modelexhibits much of the orbital structure seen in real elliptical galaxies.

By an analysis similar to that used in §3.5.4 to explore the potential ofa planar bar, one can show that the perfect ellipsoid supports four types oforbit. Figure 3.46 depicts an orbit of each type. At top left we have a boxorbit. The key feature of a box orbit is that it touches the isopotential surfacefor its energy at its eight corners. Consequently, the star comes to rest foran instant at these points; a box orbit is conveniently generated numericallyby releasing a star from rest on the equipotential surface. The potential’slongest axis emerges from the orbit’s convex face. The other three orbits areall tube orbits: stars on these orbits circulate in a fixed sense around thehole through the orbit’s center, and are never at rest. The most importanttube orbits are the short-axis loops shown at top right, which circulate aroundthe potential’s shortest axis. These orbits are mildly distorted versions of theorbits that dominate the phase space of a flattened axisymmetric potential.The tube orbits at the bottom of Figure 3.46 are called outer (left) and innerlong-axis tube orbits, and circulate around the longest axis of the potential.Tube orbits around the intermediate axis are unstable. All these orbits canbe quantified by a single system of angle-action coordinates (Jλ, Jµ, Jν) thatare generalizations of the angle-action coordinates for spherical potentials(Jr, Jϑ, Jφ) of Table 3.1 (de Zeeuw 1985).

3.8.2 Dynamical effects of cusps

The most important differences between a real galactic potential and thebest-fitting Stackel potential are at small radii. Box orbits, which alone pen-etrate to arbitrarily small radii, are be most affected by these differences. Thebox orbits of a given energy form a two-parameter family: the parameterscan be taken to be an orbit’s axis ratios. Resonant relations n · Ω = 0 be-tween the fundamental frequencies of an orbit are satisfied at various points


Figure 3.46 Orbits in a non-rotating triaxial potential. Clockwise from top left: (a) boxorbit; (b) short-axis tube orbit; (c) inner long-axis tube orbit; (d) outer long-axis tubeorbit. From Statler (1987), by permission of the AAS.

in parameter space, but in a Stackel potential none of these resonances trapsother orbits. We expect perturbations to cause some resonances to becometrapping. Hence it is no surprise to find that in potentials generated byslightly cusped mass distributions, significant numbers of orbits are trappedby resonances. (In Figure 3.45 we have already encountered extensive reso-nant trapping of box orbits in a triaxial potential that differs from a Stackelpotential.)


A regular orbit on which the three angle variables satisfy the conditionn · Ω = 0 is a two-dimensional object since its three actions are fixed, andone of its angles is determined by the other two. Consequently, the orbitoccupies a surface in real space. A generic resonantly trapped orbit is a three-dimensional structure because it has a finite libration amplitude around theresonant orbit. In practice the amplitude of the libration is usually small,with the result that the orbit forms a sheet of small but finite thicknessaround the resonant orbit. It is found that stable resonant box orbits arecentrophobic, that is, they avoid the galactic center (Merritt & Valluri1999).

Steepening the cusp in the galaxy’s central density profile enhances thedifference between the galactic potential and the best-fitting Stackel modeland thus the importance of resonances. More and more resonances overlap(§3.7.3) and the fraction of irregular orbits increases.

The existence of large numbers of irregular orbits in elliptical galaxiesis likely to have important but imperfectly understood astronomical impli-cations because irregular orbits display a kind of creep or diffusion. To un-derstand this phenomenon, imagine that there is a clean distinction betweenregular and irregular regions of 2N -dimensional phase space. The regularregion is occupied by regular orbits and is strictly off-limits to any irregu-lar orbit, while the irregular region is off-limits to regular orbits. However,while each regular orbit is strictly confined to its N -dimensional torus andnever trespasses on the territory of a different regular orbit, over time anirregular orbit explores at least some of the irregular region of phase space.In fact, the principal barrier to an irregular orbit’s ability to wander is wallsformed by regular orbits. In the case N = 2 of two-dimensional motion, theenergetically accessible part of phase space is three-dimensional, while thewalls formed by regular orbits are two-dimensional. Hence such a wall cancompletely bound some portion of irregular phase space, and forever excludean irregular orbit from part of irregular phase space. In the case N = 3that is relevant for elliptical galaxies, the energetically accessible region ofphase space is five-dimensional while the wall formed by a regular orbit isthree-dimensional. Since the boundary of a five-dimensional volume is a four-dimensional region, it is clear that no regular orbit can divide the irregularregion of phase space into two. Hence, it is believed that given enough timean irregular orbit with N ≥ 3 degrees of freedom will eventually visit everypart of the irregular region of phase space.

The process by which irregular orbits wander through phase space iscalled Arnold diffusion and is inadequately understood. Physically, itprobably involves repeated trapping by a multitude of high-order resonances.In elliptical galaxies and the bars of barred disk galaxies, the rate of Arnolddiffusion may be comparable to the Hubble time and could be a major factorin determining the rate of galactic evolution.

If the timescale associated with Arnold diffusion were short enough,galaxy models would need to include only one irregular orbit. The phase-


space density firr contributed by this orbit would be the same at all points onthe energy hypersurface H(x,v) = E except in the regular region of phasespace, where firr would vanish.29 It is not yet clear how galaxy modelingis best done when the timescale for Arnold diffusion is comparable to theHubble time.

3.8.3 Dynamical effects of black holes

Introducing even a small black hole at the center of a triaxial galaxy thathas a largely regular phase space destroys much of that regularity. There isa simple physical explanation of this phenomenon (Gerhard & Binney 1985;Merritt & Quinlan 1998).

Consider a star on the box orbit shown at top left in Figure 3.46. Eachcrossing time the star passes through the orbit’s waist on an approximatelyrectilinear trajectory, and is deflected through some angle θdefl by the blackhole’s gravitational field. If M is the mass of the hole, and v and b are,respectively, the speed and the distance from the galactic center at whichthe star would have passed the waist had the hole not deflected it, then fromequation (3.52) we have that

θdefl = 2 tan−1

(GM

bv2

). (3.317)

The speed v will be similar for all passages, but the impact parameter b willspan a wide range of values over a series of passages. For any value of M ,no matter how small, there is a chance that b will be small enough for thestar to be scattered onto a significantly different box orbit.

The tensor virial theorem (§4.8.3) requires that the velocity dispersionbe larger parallel to the longest axis of a triaxial system than in the per-pendicular directions. Repeated scattering of stars by a nuclear black holewill tend to make the velocity dispersion isotropic, and thus undermine theorbital support for the triaxiality of the potential. If the potential loses itstriaxiality, angular momentum will become a conserved quantity, and everystar will have a non-zero pericentric distance. Hence stars will no longer beexposed to the risk of coming arbitrarily close to the black hole, and starswill disappear from the black hole’s menu.

Let us assume that the distribution of a star’s crossing points is uniformwithin the waist and calculate the expectation value of the smallest valuetaken by r in N passages. Let the area of the waist be πR2. Then theprobability of there being n crossing points in a circle of radius r is given bythe Poisson distribution (Appendix B.8) as

P (n|r) =(Nr2/R2)n

n!e−Nr2/R2

. (3.318)

29 See Hafner et al. (2000) for a method of exploiting the uniformity of firr in galaxymodeling.


The probability that the closest passage lies in (r, r + dr) is the probabilitythat there are zero passages inside r and a non-zero number of passages inthe surrounding annulus, has area 2πrdr. Thus this probability is

dP =(1 − e−2Nrdr/R2)

e−Nr2/R2

(2Nrdr

R2e−Nr2/R2

. (3.319)

The required expectation value of r1 is now easily calculated:

〈r1〉 =

∫dr

2Nr2

R2e−Nr2/R2

=

√π

N

R

2. (3.320)

From equation (3.317) the deflection that corresponds to 〈r1〉 is

θdefl,max = 2 tan−1

(2√

NGM√πv2R

)

. (3.321)

Two empirical correlations between galactic parameters enable us toestimate θdefl,max for a star that reaches maximum radius Rmax in an el-liptical galaxy with measured line-of-sight velocity dispersion σ‖. First wetake the black hole’s mass M from the empirical relation (1.27). In thegalaxy’s lifetime τ we have N ( σ‖τ/2Rmax, and we relate Rmax to Dn, thediameter within which the mean surface brightness of an elliptical galaxy is20.75 magarcsec−2 in the B band: Dn is correlated with σ‖ such that (BMeq. 4.43)

Dn = 5.2( σ‖

200 km s−1

)1.33kpc. (3.322)

With these relations, (3.321) becomes

θdefl,max ( 2 tan−1

[

0.08D3/2

n

R3/2max

Rmax

R

σ2‖

v2

( σ‖200 km s−1

)0.5( τ

10 Gyr

)1/2]

.

(3.323)For the moderately luminous elliptical galaxies that are of interest here, Dn

is comparable to, or slightly larger than, the effective radius (Dressler etal. 1987), and thus similar to the half-mass radius rh = 1.3Re for the R1/4

profile. Thus for the majority of stars Dn/Rmax ( 1. From Figure 3.46we estimate Rmax/R ( 10. To estimate the ratio σ‖/v we deduce fromequations (2.66) and (2.67) that for a Hernquist model with scale radius athe potential drop ∆Φ = Φ(a) − Φ(0) between rh = 2.41a and the center is0.71GMgal/a, so v2 = 2∆Φ = 1.4GMgal/a. From Figure 4.4 we see that σ‖ (0.2√

GMgal/a, so (σ‖/v)2 ( 35. Inserting these values into equation (3.323)we find θdefl,max ( 2.6. Scattering by such a small angle will probably notundermine a galaxy’s triaxiality, but stars with smaller apocenter distancesRmax will be deflected through significant angles, so it is likely that the blackhole will erode triaxiality in the galaxy’s inner parts (Norman, May, & vanAlbada 1985; Merritt & Quinlan 1998).

268 Problems

Problems

3.1 [1] Show that the radial velocity along a Kepler orbit is

r =GMe

Lsin(ψ − ψ0), (3.324)

where L is the angular momentum. By considering this expression in the limit r → ∞show that the eccentricity e of an unbound Kepler orbit is related to its speed at infinityby

e2 = 1 +

„Lv∞GM

«2

. (3.325)

3.2 [1] Show that for a Kepler orbit the eccentric anomaly η and the true anomaly ψ−ψ0

are related by

cos(ψ − ψ0) =cos η − e

1 − e cos η; sin(ψ − ψ0) =

p1 − e2

sin η

1 − e cos η. (3.326)

3.3 [1] Show that the energy of a circular orbit in the isochrone potential (2.47) is E =−GM/(2a), where a =

√b2 + r2. Let the angular momentum of this orbit be Lc(E).

Show that

Lc =√

GMb“x−1/2 − x1/2

”, where x ≡ −

2Eb

GM. (3.327)

3.4 [1] Prove that if a homogeneous sphere of a pressureless fluid with density ρ is releasedfrom rest, it will collapse to a point in time tff = 1

4

p3π/(2Gρ). The time tff is called the

free-fall time of a system of density ρ.

3.5 [3] Generalize the timing argument in Box 3.1 to a universe with non-zero vacuum-energy density. Evaluate the required mass of the Local Group for a universe of aget0 = 13.7Gyr with (a) ΩΛ0 = 0; (b) ΩΛ0 = 0.76, h7 = 1.05. Hints: the energy densityin radiation can be neglected. The solution requires evaluation of an integral similar to(1.62).

3.6 [1] A star orbiting in a spherical potential suffers an arbitrary instantaneous velocitychange while it is at pericenter. Show that the pericenter distance of the ensuing orbitcannot be larger than the initial pericenter distance.

3.7 [2] In a spherically symmetric system, the apocenter and pericenter distances are givenby the roots of equation (3.14). Show that if E < 0 and the potential Φ(r) is generatedby a non-negative density distribution, this equation has either no root, a repeated root,or two roots (Contopoulos 1954). Thus there is at most one apocenter and pericenter fora given energy and angular momentum. Hint: take the second derivative of E − Φ withrespect to u = 1/r and use Poisson’s equation.

3.8 [1] Prove that circular orbits in a given potential are unstable if the angular momentumper unit mass on a circular orbit decreases outward. Hint: evaluate the epicycle frequency.

3.9 [2] Compute the time-averaged moments of the radius, 〈rn〉, in a Kepler orbit ofsemi-major axis a and eccentricity e, for n = 1, 2 and n = −1,−2,−3.

3.10 [2] ∆ψ denotes the increment in azimuthal angle during one complete radial cycleof an orbit.

(a) Show that in the potential (3.57)

∆ψ =2πL

p−2Erarp

, (3.328)

where ra and rp are the apo- and pericentric radii of an orbit of energy E and angular

momentum L. Hint: by contour integration one can show that for A > 1,R π/2−π/2 dθ/(A +

sin θ) = π/√

A2 − 1.

Problems 269

(b) Prove in the epicycle approximation that along orbits in a potential with circularfrequency Ω(R),

∆ψ = 2π

„4 +

d lnΩ2

d lnR

«−1/2

. (3.329)

(c) Show that the exact expression (3.328) reduces for orbits of small eccentricity to (3.329).

3.11 [1] For what spherically symmetric potential is a possible trajectory r = aebψ?

3.12 [2] Prove that the mean-square velocity is on a bound orbit in a spherical potentialΦ(r) is

〈v2〉 =

firdΦ

dr

fl, (3.330)

where 〈·〉 denotes a time average.

3.13 [2] Let r(s) be a plane curve depending on the parameter s. Then the curvature is

K =|r′ × r′′||r′|3

, (3.331)

where r′ ≡ dr/ds. The local radius of curvature is K−1. Prove that the curvature of anorbit with energy E and angular momentum L in the spherical potential Φ(r) is

K =L dΦ/dr

23/2r[E − Φ(r)]3/2. (3.332)

Hence prove that no orbit in any spherical mass distribution can have an inflection point(in contrast to the cover illustration of Goldstein, Safko, & Poole 2002).

3.14 [1] Show that in a spherical potential the vertical and circular frequencies ν and Ω(eqs. 3.79) are equal.

3.15 [1] Prove that at any point in an axisymmetric system at which the local densityis negligible, the epicycle, vertical, and circular frequencies κ, ν, and Ω (eqs. 3.79) arerelated by κ2 + ν2 = 2Ω2.

3.16 [1] Using the epicycle approximation, prove that the azimuthal angle ∆ψ betweensuccessive pericenters lies in the range π ≤ ∆ψ ≤ 2π in the gravitational field arising fromany spherical mass distribution in which the density decreases outwards.

3.17 [3] The goal of this problem is to prove the results of Problem 3.16 without usingthe epicycle approximation (Contopoulos 1954).

(a) Using the notation of §3.1, show that

E −Φ−L2

2r2= (u1 − u)(u − u2)

˘12L2 +Φ[u, u1, u2]

¯, (3.333)

where u1 = 1/r1 and u2 = 1/r2 are the reciprocals of the pericenter and apocenterdistances of the orbit respectively, u = 1/r, and

Φ[u, u1, u2] =1

u1 − u2

»Φ(u1) −Φ(u)

u1 − u−

Φ(u) −Φ(u2)

u − u2

–. (3.334)

This expression is a second-order divided difference of the potential Φ regarded as a func-tion of u, and a variant of the mean-value theorem of calculus shows that Φ[u, u1, u2] =12Φ

′′(u) where u is some value of u in the interval (u1, u2). Then use the hint in Prob-lem 3.7 and equation (3.18b) to deduce that ∆ψ ≤ 2π when the potential Φ is generatedby a non-negative, spherically symmetric density distribution.

270 Problems

(b) A lower bound on ∆ψ can be obtained from working in a similar manner with thefunction

χ(ω) =2ωΦ

L, where ω ≡

L

r2. (3.335)

Show that2ωE

L− χ(ω) − ω2 = (ω1 − ω)(ω − ω2) 1 + χ[ω,ω1,ω2] , (3.336)

where ω1 = L/r21, ω2 = L/r2

2 and χ[ω,ω1,ω2] is a second-order divided difference of χ(ω).Now deduce that ∆ψ ≥ π for any potential in which the circular frequency Ω(r) decreasesoutwards.

3.18 [1] Let Φ(R, z) be the Galactic potential. At the solar location, (R, z) = (R0, 0),prove that

∂2Φ

∂z2= 4πGρ0 + 2(A2 − B2), (3.337)

where ρ0 is the density in the solar neighborhood and A and B are the Oort constants.Hint: use equation (2.73).

3.19 [3] Consider an attractive power-law potential, Φ(r) = Crα, where −1 ≤ α ≤ 2 andC > 0 for α > 0, C < 0 for α < 0. Prove that the ratio of radial and azimuthal periods is

Tr

Tψ=

8<

:

1/√

2 + α for nearly circular orbits1/2, for α > 0

1/(2 + α), for α < 0for nearly radial orbits.

(3.338)

What do these results imply for harmonic and Kepler potentials?Hint: depending on the sign of α use a different approximation in the radical for vr . For

b > 0,R∞1 dx/(x

pxb − 1) = π/b (see Touma & Tremaine 1997).

3.20 [1] Show that in spherical polar coordinates the Lagrangian for motion in the poten-tial Φ(x) is

L = 12 [r2 + (rθ)2 + (r sin θφ)2] − Φ(x). (3.339)

Hence show that the momenta pθ and pφ are related to the the magnitude and z-componentof the angular-momentum vector L by

pφ = Lz ; p2θ = L2 −

L2z

sin2 θ. (3.340)

3.21 [3] Plot a (y, y), (x = 0, x > 0) surface of section for motion in the potential ΦL ofequation (3.103) when q = 0.9 and E = −0.337. Qualitatively relate the structure of thissurface of section to the structure of the (x, x) surface of section shown in Figure 3.9.

3.22 [3] Sketch the structure of the (x, x), (y = 0, y > 0) surface of section for motionat energy E in a Kepler potential when (a) the (x, y) coordinates are inertial, and (b)the coordinates rotate at 0.75 times the circular frequency Ω at the energy E. Hint: seeBinney, Gerhard, & Hut (1985).

3.23 [3] The Earth is flattened at the poles by its spin. Consequently orbits in its potentialdo not conserve total angular momentum. Many satellites are launched in inclined, nearlycircular orbits only a few hundred kilometers above the Earth’s surface, and their orbitsmust remain nearly circular, or they will enter the atmosphere and be destroyed. Why dothe orbits remain nearly circular?

3.24 [2] Let e1 and e2 be unit vectors in an inertial coordinate system centered on theSun, with e1 pointing away from the Galactic center (towards . = 180, b = 0) and e2

pointing towards . = 270, b = 90. The mean velocity field v(x) relative to the LocalStandard of Rest can be expanded in a Taylor series,

vi =2X

j=1

Hijxj + O(x2). (3.341)

Problems 271

(a) Assuming that the Galaxy is stationary and axisymmetric, evaluate the matrix H interms of the Oort constants A and B.

(b) What is the matrix H in a rotating frame, that is, if e1 continues to point to the centerof the Galaxy as the Sun orbits around it?

(c) In a homogeneous, isotropic universe, there is an analogous 3 × 3 matrix H that de-scribes the relative velocity v between two fundamental observers separated by x. Evaluatethis matrix in terms of the Hubble constant.

3.25 [3] Consider two point masses m1 and m2 > m1 that travel in a circular orbit abouttheir center of mass under their mutual attraction. (a) Show that the Lagrange point L4

of this system forms an equilateral triangle with the two masses. (b) Show that motionnear L4 is stable if m1/(m1 + m2) < 0.03852. (c) Are the Lagrange points L1, L2, L3

stable? See Valtonen & Karttunen (2006).

3.26 [2] Show that the leapfrog integrator (3.166a) is second-order accurate, in the sensethat the errors in q and p after a timestep h are O(h3).

3.27 [2] Forest & Ruth (1990) have devised a symplectic, time-reversible, fourth-orderintegrator of timestep h by taking three successive drift-kick-drift leapfrog steps of lengthah, bh, and ah where 2a + b = 1. Find a and b. Hint: a and b need not both be positive.

3.28 [2] Confirm the formulae for the Adams–Bashforth, Adams–Moulton, and Hermiteintegrators in equations (3.169), (3.170), and (3.171), and derive the next higher orderintegrator of each type. You may find it helpful to use computer algebra.

3.29 [1] Prove that the fictitious time τ in Burdet–Heggie regularization is related to theeccentric anomaly η by τ = (Tr/2πa)η + constant , if the motion is bound (E2 < 0) andthe external field g = 0.

3.30 [1] We wish to integrate numerically the motions of N particles with positions xi,velocities vi, and masses mi. The particles interact only by gravitational forces (the gravi-tational N-body problem). We are considering using several possible integrators: modified-Euler, leapfrog, or fourth-order Runge–Kutta. Which of these will conserve the total mo-mentum

PNi=1 mivi? Which will conserve the total angular momentum

PNi=1 mixi ×vi?

Assume that all particles are advanced with the same timestep, and that forces are calcu-lated exactly. You may solve the problem either analytically or numerically.

3.31 [2] Show that the generating function of the canonical transformation from angle-action variables (θi, Ji) to the variables (qi, pi) discussed in Box 3.4 is

S(q, J) = ∓ 12 qp

2J − q2 ± J cos−1„

q√

2J

«. (3.342)

3.32 [1] Let ε(R) and .(R) be the specific energy and angular momentum of a circularorbit of radius R in the equatorial plane of an axisymmetric potential.

(a) Prove that

d.

dR=

Rκ2

2Ω;

dε

dR= 1

2Rκ2, (3.343)

where Ω and κ are the circular and epicycle frequencies.

(b) The energy of a circular orbit as a function of angular momentum is ε(.). Show thatdε/d. = Ω in two ways, first from the results of part (a) and then using angle-actionvariables.

3.33 [2] The angle variables θi conjugate to the actions Ji can be implicitly defined by thecoupled differential equations dwα/dθi = [wα, Ji], where wα is any ordinary phase-space

272 Problems

coordinate. Using this result, show that the angle variable for the harmonic oscillator,H = 1

2 (p2 + ω2q2), may be written

θ(x, p) = − tan−1„

p

ωq

«. (3.344)

Hint: the action is J = H/ω.

3.34 [2] Consider motion for Lz = 0 in the Stackel potential (3.247).

(a) Express I3 as a function of u, v, pu, and pv.

(b) Show that H cos2 v + I3 = 12 (p2

v/∆2) − V .

(c) Show that [H, I3] = 0.

(d) Hence show that Ju and Jv are in involution, that is [Ju, Jv] = 0. Hint: if f(a, b)is any differentiable function of two variables, and A is any differentiable function of thephase-space variables, then [A, f ] = [A,a](∂f/∂a) + [A, b](∂f/∂b).

3.35 [2] A particle moves in a one-dimensional potential well Φ(x). In angle-action vari-ables, the Hamiltonian has the form H(J) = cJ4/3 where c is a constant. Find Φ(x).

3.36 [2] Obtain the Hamiltonian and fundamental frequencies as functions of the actionsfor the three-dimensional harmonic oscillator by examining the limit b → ∞ of equations(3.226).

3.37 [2] For motion in a potential of the form (3.247), obtain

pu =2E sinhu cosh u − dU/du

sinh2 u + sin2 v+

L2z cosh u

∆2 sinh3 u(sinh2 u + sin2 v), (3.345)

where (u, v) are the prolate spheroidal coordinates defined by equations (3.242), by (a)differentiating equation (3.249a) with respect to t and then using u = ∂H/∂pu, and (b)from pu = −∂H/∂u.

3.38 [2] For the coordinates defined by equation (3.267), show that the integral definedby equations (3.268) can be written

I2 =sinh2 u[ 12 (p2

v/∆2) − V ] − sin2 v[ 12 (p2u/∆2) + U ]

sinh2 u + sin2 v. (3.346)

Show that in the limit ∆ → 0, u → ∞ we have ∆ sinh u → ∆ cosh u → R and v → π/2−φ,where R and φ are the usual polar coordinates. Hence show that in this limit 2∆2I2 → L2

z.

3.39 [2] Show that the third integral of an axisymmetric Stackel potential can be takento be

I3(u, v, pu, pv, pφ) =1

sinh2 u + sin2 v×

»sinh2 u

„p2

v

2∆2− V

«− sin2 v

„p2

u

2∆2+ U

«–+

p2φ

2∆2

„1

sin2 v−

1

sinh2 u

«.

(3.347)

Hint: generalize the work of Problem 3.38.

3.40 [1] Show that when orbital frequencies are incommensurable, adiabatic invarianceof actions implies that closed orbits remain closed when the potential is adiabaticallydeformed. An initially circular orbit in a spherical potential Φ does not remain closedwhen Φ is squashed along any line that is not parallel to the orbit’s original angular-momentum vector. Why does this statement remain true no matter how slowly Φ issquashed?

Problems 273

3.41 [2] From equations (3.39b) and (3.190), show that the radial action Jr of an orbitin the isochrone potential (2.47) is related to the energy E and angular momentum L ofthis orbit by

Jr =√

GMbhx− 1

2 − f(L)i

, (3.348)

where x ≡ −2Eb/(GM) and f is some function. Use equation (3.327) to show thatf(L) = (

√l2 + 1 − l)−1 =

√l2 + 1 + l, where l ≡ |L|/(2

√GMb), and hence show that the

isochrone Hamiltonian can be written in the form (3.226a).

3.42 [2] Angle-action variables are also useful in general relativity. For example, therelativistic analog to the Hamilton–Jacobi equation (3.218) for motion in the point-masspotential Φ(r) = −GM/r is

E2

1 + 1

4 rS/r

1 − 14 rS/r

!2

= c4 +c2

(1 + 14 rS/r)4

»“∂S

∂r

”2+“1

r

∂S

∂ϑ

”2+“ 1

r sinϑ

∂S

∂φ

”2–

, (3.349)

where rS ≡ 2GM/c2 is the Schwarzschild radius, the energy per unit mass E includesthe rest-mass energy c2, and the equations are written in the isotropic metric, i.e., ds2 atany point is proportional to its Euclidean form (Landau & Lifshitz 1999).

(a) Show that the Hamiltonian can be written in the form

H(pr , pϑ, pφ) =1 − 1

4 rS/r

1 + 14 rS/r

s

c4 +c2p2

(1 + 14 rS/r)4

, (3.350)

where p2 = p2r + p2

ϑ/r2 + p2φ/(r sinϑ)2.

(b) For systems in which relativistic effects are weak, show that the Hamiltonian can bewritten in the form

H = c2 + HKep + Hgr + O(c−4), (3.351)

where HKep = 12p2 − GM/r is the usual Kepler Hamiltonian and

Hgr =1

c2

„G2M2

2r2−

p4

8−

3GMp2

2r

«. (3.352)

(c) To investigate the long-term effects of relativistic corrections on a Kepler orbit, we mayaverage Hgr over an unperturbed Kepler orbit. Show that this average may be written

〈Hgr〉 =G2M2

c2a2

„158 −

3√

1 − e2

«, (3.353)

where a and e are the semi-major axis and eccentricity. Hint: use the results of Prob-lem 3.9.

(d) Show that relativistic corrections cause the argument of pericenter ω to precess by anamount

∆ω =6πGM

c2a(1 − e2)(3.354)

per orbit. Hint: convert 〈Hgr〉 to angle-action variables using Table E.1 and use Hamilton’sequations.

3.43 [2] The Hamiltonian H(x, p; λ), where λ is a parameter, supports a family of resonantorbits. In the (x1, p1) surface of section, the family’s chain of islands is bounded byorbits with actions J1 ≡ (2π)−1

Hdx1 p1 = J±(λ), where J+ > J−. Let λ increase

sufficiently slowly for the actions of non-resonant orbits to be conserved, and assume thatJ ′+ > J ′

− > 0, where a prime denotes differentiation with respect to λ. Show that, as λgrows, an orbit of unknown phase and action slightly larger than J+ will be captured bythe resonance with probability Pc = 1−J ′

−/J ′+. Hint: exploit conservation of phase-space

volume as expressed by equation (4.10).

the orbits of stars - university of california, berkeley

Documents