
Statistical Mechanics

Source: pages.physics.cornell.edu/~sethna/teaching/562_S05/Info/McCoy


Contents

Chapter 1. Ergodicity and the Microcanonical Ensemble

1. From Hamiltonian Mechanics to Statistical Mechanics

2. Two Theorems From Dynamical Systems Theory

3. The Microcanonical Ensemble and the Ergodic Hypothesis

4. Density Operators in Quantum Mechanics

5. Discussion


CHAPTER 1

Ergodicity and the Microcanonical Ensemble

The pressure in an ideal gas, recall, is proportional to the average kinetic energy per molecule. Since pressure may be understood as an average over billions upon billions of microscopic collisions, this simple relationship illustrates how statistical techniques may be used to suppress information about what each individual molecule is doing in order to extract information about what the molecules do on average as a whole. Our first task, as we examine the foundations of statistical mechanics, is to understand why this suppression is necessary and how exactly it is to be accomplished. We must, therefore, begin by considering the laws of microscopic dynamics. In physics, there are two choices here: the laws of classical mechanics and the laws of quantum mechanics. Remarkably, the choice is not important; in either case, detailed solutions to the dynamical equations are completely unnecessary. We will consider both cases, but follow the classical route through Hamiltonian mechanics first, as this provides the clearest introduction to the structure of statistical mechanics. In this section, we will review the essential elements of Hamiltonian mechanics and discuss the need for, and basic elements of, a probabilistic framework for describing macroscopic systems.

1. From Hamiltonian Mechanics to Statistical Mechanics

Newton’s second law for a particle of mass m,

F_total = m q̈,

is a second-order ordinary differential equation. Therefore, given the instantaneous values of the particle’s position q and momentum p = m q̇ at some time t = 0, the particle’s subsequent motion is uniquely determined for all t > 0. For this reason, the state of a classical system consisting of n configurational degrees of freedom can


be thought of as a point (q1, . . . , qn, p1, . . . , pn) in a 2n-dimensional space called the phase space of the system. As the state evolves in time, this point will trace out in phase space a trajectory defined by the tangent vector,

(1) v(t) = (q̇1(t), . . . , q̇n(t), ṗ1(t), . . . , ṗn(t)).

A Hamiltonian system evolves according to the canonical equations of motion,

(2) q̇i = ∂H(q, p, t)/∂pi,

(3) ṗi = −∂H(q, p, t)/∂qi,

where the function

H(q, p, t) = H(q1, . . . , qn, p1, . . . , pn, t)

is called the Hamiltonian of the system. These equations represent the full content of Newtonian mechanics. Note that exactly one trajectory passes through each point in the phase space; the classical picture is completely deterministic.

Example (single particle dynamics). Find the canonical equations of motion for a single particle of mass m in an external potential V(q).

Solution. The Hamiltonian for this system is simply

H(q, p) = p²/2m + V(q),

which we recognize as the sum of the kinetic and potential energies. This leads to the following dynamical equations:

q̇ = p/m,

ṗ = −∂V(q)/∂q.

A system of many interacting particles is treated similarly, though the potential term becomes much more complicated.

We see, therefore, that the first canonical equation (2) generalizes the relationship between velocity and momentum (in a more complicated system, the i-th momentum may depend on several of the qi and q̇i). Similarly, the second canonical equation (3) generalizes the rule that force may be expressed as a gradient of an energy function.
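As a concrete illustration (not part of the original text), the canonical equations can be integrated numerically. The sketch below uses a harmonic-oscillator Hamiltonian H = p²/2m + kq²/2 and the symplectic Euler method; all names and parameter values are illustrative choices. Determinism means the trajectory is fixed by (q0, p0), and the energy should stay approximately constant along it:

```python
# Hamilton's equations for a 1D harmonic oscillator:
#   H(q, p) = p^2/(2m) + (1/2) k q^2
#   qdot =  dH/dp = p/m
#   pdot = -dH/dq = -k q
# Integrated with symplectic Euler, which respects the phase-space
# structure better than the ordinary Euler method.

def integrate(q0, p0, m=1.0, k=1.0, dt=1e-3, steps=10_000):
    q, p = q0, p0
    for _ in range(steps):
        p -= k * q * dt          # pdot = -dH/dq
        q += (p / m) * dt        # qdot =  dH/dp
    return q, p

def energy(q, p, m=1.0, k=1.0):
    return p * p / (2 * m) + 0.5 * k * q * q

q0, p0 = 1.0, 0.0
qT, pT = integrate(q0, p0)
# Energy is (approximately) conserved along the trajectory.
print(energy(q0, p0), energy(qT, pT))
```

A symplectic integrator is chosen deliberately: it preserves the phase-space structure that becomes central later in the chapter.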


In a Hamiltonian system, the time dependence of any function of the momenta and coordinates

f = f(q1, . . . , qn, p1, . . . , pn, t)

can be written,

(4) df/dt = {f, H} + ∂f/∂t,

where {f, H} is the Poisson bracket of the function f and the Hamiltonian. The Poisson bracket of two functions f1 and f2 with respect to a set of canonical variables is defined as

(5) {f1, f2} = Σ_{j=1}^{n} ( ∂f1/∂qj ∂f2/∂pj − ∂f1/∂pj ∂f2/∂qj ).

The Poisson bracket is important in Hamiltonian dynamics because it is independent of how the various coordinates and momenta are defined; that is, {u, v} takes the same value for any set of canonical variables q and p. Furthermore, the canonical equations of motion can be re-written in the following form,

(6) q̇i = {qi, H},

(7) ṗi = {pi, H}.

This is known as the Poisson bracket formulation of classical mechanics. It is impor-

tant to recognize that very similar expressions arise in quantum mechanics (we’ll

look at these in Section 4). Indeed, every classical expression involving Poisson

brackets has a quantum analogue employing commutators. This elegant correspon-

dence principle, first pointed out by Dirac, has deep significance for the relationship

between classical and quantum physics. It also provides our first glimpse of why

statistical mechanics transcends the details of the microscopic equations of motion.

For now, we return to the classical route into the heart of statistical mechanics...
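To make the definition (5) concrete, here is a small numerical check (an illustration, not from the text): the brackets {q, H} and {p, H} are computed by central finite differences for an arbitrarily chosen quartic-potential Hamiltonian and compared against the canonical equations (6) and (7).

```python
# Numerical Poisson bracket for one degree of freedom,
#   {f, g} = (df/dq)(dg/dp) - (df/dp)(dg/dq),
# using central finite differences. The Hamiltonian below
# (quartic potential, m = 2) is an arbitrary illustrative choice.

def poisson_bracket(f, g, q, p, h=1e-6):
    dfdq = (f(q + h, p) - f(q - h, p)) / (2 * h)
    dfdp = (f(q, p + h) - f(q, p - h)) / (2 * h)
    dgdq = (g(q + h, p) - g(q - h, p)) / (2 * h)
    dgdp = (g(q, p + h) - g(q, p - h)) / (2 * h)
    return dfdq * dgdp - dfdp * dgdq

m = 2.0
def H(q, p):
    return p * p / (2 * m) + q ** 4   # V(q) = q^4

q, p = 0.7, -1.3
# {q, H} should reproduce qdot = p/m, and
# {p, H} should reproduce pdot = -V'(q) = -4 q^3.
print(poisson_bracket(lambda q, p: q, H, q, p))   # ~ p/m
print(poisson_bracket(lambda q, p: p, H, q, p))   # ~ -4*q**3
```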

Examining a physical system from the classical mechanical point of view, one

first constructs the canonical equations of motion and then integrates these from

known initial conditions to determine the phase trajectory. If the system of in-

terest involves a macroscopic number of particles, this approach condemns one to

numerical computations involving matrices of bewildering size. Yet system size is

not the major obstacle: The canonical equations of motion are in general nonlin-

ear and, as a result, small changes in system parameters or initial conditions may


lead to large changes in system behavior. In particular, neighboring trajectories in

many nonlinear systems diverge from one another at an exponential rate, a phe-

nomenon known as sensitive dependence on initial conditions or, more popularly,

as the butterfly effect, the idea being that a flap of a butterfly’s wings may make

the difference between sunny skies and snow two weeks later. Systems exhibiting

sensitive dependence on initial conditions are said to be chaotic. Calculations of

chaotic trajectories are intolerant of even infinitesimal errors, such as those aris-

ing from finite precision and uncertainties in the state of the system. Therefore,

setting aside the impractical integration problem of calculating a high-dimensional

phase trajectory, our necessarily incomplete knowledge of initial conditions in a

macroscopic system seriously compromises our ability to predict future evolution.
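Sensitive dependence is easy to exhibit numerically. In the sketch below (illustrative throughout; the Chirikov standard map stands in for a generic chaotic Hamiltonian system, and the parameter values are arbitrary), two initial conditions separated by 10⁻⁹ are iterated and their separation is tracked:

```python
import math

# Sensitive dependence on initial conditions, illustrated with the
# Chirikov standard map (an area-preserving, Hamiltonian-like map):
#   p'     = p + K sin(theta)   (mod 2*pi)
#   theta' = theta + p'         (mod 2*pi)
# For large K the dynamics are chaotic and nearby points separate
# roughly exponentially until the separation saturates.

TWO_PI = 2 * math.pi

def step(theta, p, K=6.0):
    p = (p + K * math.sin(theta)) % TWO_PI
    theta = (theta + p) % TWO_PI
    return theta, p

def separation_after(n, d0=1e-9, K=6.0):
    a = (1.0, 1.0)
    b = (1.0 + d0, 1.0)              # perturbed initial condition
    for _ in range(n):
        a = step(*a, K)
        b = step(*b, K)
    # distance on the torus (shortest way around in each coordinate)
    dth = min(abs(a[0] - b[0]), TWO_PI - abs(a[0] - b[0]))
    dp = min(abs(a[1] - b[1]), TWO_PI - abs(a[1] - b[1]))
    return math.hypot(dth, dp)

# An initial offset of 1e-9 is amplified by many orders of magnitude
# after a few dozen iterations.
print(separation_after(5), separation_after(30))
```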

Though the prospects for dealing directly with the phase trajectories of a macro-

scopic system of particles seem hopeless, it is not the case that we must discard

all knowledge of the microscopic physics of the system. There are many macro-

scopic phenomena which cannot be understood from a purely macroscopic point

of view. What is combustion? What determines whether a solid will be a metal

or an insulator? What are the energy sources in stellar and galactic cores? These

questions are best dealt with by appealing to various microscopic details. On the

other hand, given the success of the laws of thermodynamics, it is evident that

macroscopic systems exhibit a collective regularity where the exact details of each

particle’s motion and state are nonessential. This suggests that we may envision

the time evolution of macroscopic quantities in a Hamiltonian system as some sort

of average over all of the microscopic states consistent with available macroscopic

knowledge and constraints. For this reason, one abandons the mechanical approach

of computing the exact time evolution from a single point in phase space in favor

of a statistical approach employing averages over an entire ensemble of points in

phase space. This is accomplished as follows:

Consider a large collection of identical copies of the system, distributed in phase

space according to a known distribution function,

ρ(q,p, t) = ρ(q1, . . . , q3N , p1, . . . , p3N , t),


where

(8) ∫ ρ(q,p, t) dq dp = 1 for all t.

ρ(q,p, t) is the density in phase space of the points representing the ensemble,

normalized according to (8), and may be interpreted as describing the probability

of finding the system in various different microscopic states. Once ρ(q,p, t) is

specified, we can compute the probabilities of different values of any quantity f

which is a function of the canonical variables. We can also compute the mean value

〈f〉 of any such function f by averaging over the probabilities of different values,

(9) 〈f(t)〉 = ∫ f(q,p) ρ(q,p, t) dp dq.

Thus, instead of following the time evolution of a single system through many

different microscopic states, we consider at a single time an ensemble of copies of

the system distributed into these states according to probability of occupancy. This

shift is one of the cornerstones of statistical mechanics.
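A minimal numerical version of the ensemble average (9) (an illustration, not from the text): draw many phase points from an assumed distribution ρ — here a unit Gaussian in the momentum of a single particle with m = 1, an arbitrary choice — and average an observable over the copies. For this ρ the exact value of 〈p²/2m〉 is 1/2.

```python
import random

# Ensemble average in the sense of equation (9): instead of following
# one system in time, sample many phase points from rho and average.
# rho here is a standard Gaussian in p (illustrative choice, m = 1),
# for which the exact average kinetic energy <p^2/2> is 1/2.

random.seed(0)

def ensemble_average(f, n_copies=200_000):
    total = 0.0
    for _ in range(n_copies):
        p = random.gauss(0.0, 1.0)   # sample one member of the ensemble
        total += f(p)
    return total / n_copies

kinetic = ensemble_average(lambda p: 0.5 * p * p)
print(kinetic)   # close to the exact value 0.5
```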

Exercise 1.1. Derive equation (4). HINT: Use the chain rule

df/dt = Σi (∂f/∂qi)(dqi/dt) + Σi (∂f/∂pi)(dpi/dt) + ∂f/∂t.

Exercise 1.2. Show that H(q,p, t) is a constant of the motion if and only if

it does not depend explicitly on time.

Exercise 1.3. Show that the canonical equations of motion can be re-written in the following form,

(10) q̇i = {qi, H},

(11) ṗi = {pi, H}.

This is known as the Poisson bracket formulation of classical mechanics.

Exercise 1.4. Compute the following Poisson brackets:

(1) {qi, qj}

(2) {qi, pj}

Are your results in any way familiar, given your knowledge of quantum mechanics? If so, how do the interpretations of these results differ from their quantum mechanical analogues?


Exercise 1.5. Show that the canonical equations of motion can be written in the symplectic form,

ẋ = M ∂H(q, p, t)/∂x,

where x = (q, p) (what’s M in this expression?)

2. Two Theorems From Dynamical Systems Theory

One is often interested in general qualitative questions about a system’s dy-

namics, such as the existence of stable equilibria or oscillations. In discussing such

questions, mathematicians often speak of the flow of a dynamical system: Any

autonomous system of ordinary differential equations can be written in the form

(12) ẋ = f(x)

(changes of variables may be required if the equations involve second-order and

higher derivatives). If we interpret a general system of differential equations (12)

as representing a fluid in which the fluid velocity at each point x is given by the

vector f(x), then we may envision any particular point x0 as flowing along the

trajectory φ(x0) defined by the velocity field. More precisely, we define

φt(x0) = φ(x0, t),

where φ(x0, t) is a point on the trajectory φ(x0) passing through the initial condition

x0; φt maps the starting point x0 to its location after moving with the flow for a

time t. It is important to note that φt defines a map on the entire phase space —

we may envision the entire phase space flowing according to the velocity field defined

by (12). Indeed, we shall see in this section that this fluid metaphor is especially

appropriate in statistical mechanics.

The notion of the flow of a dynamical system very naturally accommodates a shift

towards considering how whole regions of phase space participate in the dynamics,

a shift away from the language of initial conditions and trajectories. This shift is

what enables mathematicians to state and prove general theorems about dynamical

systems. It also turns out that this shift provides the natural setting for several of

the central concepts of statistical mechanics. In the previous section, we motivated

a statistical framework in which, rather than follow the time evolution of a single

system, we consider at a single time an ensemble of copies of that system distributed


in phase space according to probability of occupancy. The main player in this new

framework is the distribution function ρ(q,p, t) describing the ensemble. ρ allows

us to take into account which states in phase space a system is

likely to occupy1. In this section, we examine how the ensemble interacts with the

flow defined by a set of canonical equations. It turns out that, in a Hamiltonian

system, the time evolution of ρ has several interesting properties, which are the

subject of two important theorems from dynamical systems theory.

We begin with a simple calculation of the rate of change of ρ. We know from

(4), which describes the time evolution of any function of the canonical variables q

and p, that

(13) dρ/dt = ∂ρ/∂t + {ρ, H}.

However, we also know from local conservation of probability that ρ must satisfy a

continuity equation,

(14) ∂ρ/∂t + ∇·(ρv) = 0,

where

∇ = ( ∂/∂q1, . . . , ∂/∂qn, ∂/∂p1, . . . , ∂/∂pn )

is the gradient operator in phase space and v is defined in (1). Applying the chain rule, we see that

(15) ∇·(ρv) = {ρ, H} + ρ(∇·v).

Since ∇·v = Σi ( ∂q̇i/∂qi + ∂ṗi/∂pi ) = Σi ( ∂²H/∂qi∂pi − ∂²H/∂pi∂qi ) = 0 for a Hamiltonian system, (13) and (14) are equal and therefore

(16) ∂ρ/∂t + {ρ, H} = 0

and

(17) dρ/dt = 0.

This result is known as Liouville’s theorem. The partial derivative term in (16) expresses the change in ρ due to elapsed time dt, while the (∇ρ)·v = {ρ, H} term expresses the change in ρ due to motion along the vector field a distance v dt.

1Mathematicians include this as part of a more general approach, called measurable dynamics,

which we need not go into here.


Thus, Liouville’s theorem tells us that the local probability density — as seen by

an observer moving with the flow in phase space — is constant in time; that is, ρ is

constant along phase trajectories. The theorem can also be interpreted as stating

that, in a Hamiltonian system, phase space volumes are conserved by the flow or,

equivalently, that ρ moves in phase space like an incompressible fluid.
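Liouville's theorem can also be checked numerically. In the sketch below (illustrative; the pendulum Hamiltonian H = p²/2 − cos q, the integration time, and all names are arbitrary choices), the Jacobian determinant of the time-T flow map is estimated by finite differences; volume preservation means it should come out very close to 1:

```python
import math

# Numerical check of phase-space volume preservation for the pendulum,
#   H(q, p) = p^2/2 - cos(q)   =>   qdot = p,  pdot = -sin(q).
# We build the time-T flow map phi_T with classical 4th-order
# Runge-Kutta, then estimate det(d phi_T / d(q0, p0)) by central
# finite differences. Liouville's theorem predicts det = 1.

def rhs(q, p):
    return p, -math.sin(q)            # qdot = dH/dp, pdot = -dH/dq

def flow(q, p, T=3.0, steps=3000):
    dt = T / steps
    for _ in range(steps):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + dt/2*k1q, p + dt/2*k1p)
        k3q, k3p = rhs(q + dt/2*k2q, p + dt/2*k2p)
        k4q, k4p = rhs(q + dt*k3q, p + dt*k3p)
        q += dt/6 * (k1q + 2*k2q + 2*k3q + k4q)
        p += dt/6 * (k1p + 2*k2p + 2*k3p + k4p)
    return q, p

def jacobian_det(q, p, h=1e-5):
    qQ, pQ = flow(q + h, p); qq, pq = flow(q - h, p)
    qP, pP = flow(q, p + h); qp, pp = flow(q, p - h)
    a = (qQ - qq) / (2*h); c = (pQ - pq) / (2*h)   # d(q_T, p_T)/dq_0
    b = (qP - qp) / (2*h); d = (pP - pp) / (2*h)   # d(q_T, p_T)/dp_0
    return a * d - b * c

print(jacobian_det(0.8, 0.3))   # ~ 1.0
```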

From the incompressible fluid analogy, we see that while Hamiltonian systems

can exhibit chaotic dynamics, they cannot have any attractors! Liouville’s theorem

has other important consequences when combined with system constraints, such

as conservation laws. Conservation laws constrain the flow to lie on families of

hypersurfaces in phase space. These surfaces are bounded and invariant under the

flow:

(18) φt(X) = X

for each hypersurface X defined by a conservation law. When volume-preserving

flows are restricted to bounded, invariant regions of phase space, a surprising result

emerges: Let X be a bounded region of phase space which is invariant under a

volume-preserving flow. Take any region S which occupies a finite fraction of the

total volume in X (this specifically excludes what mathematicians call sets of mea-

sure zero: sets with no volume). Then any randomly selected initial condition x in

S generates a trajectory φt(x) which returns to S infinitely often — this is known

as the Poincare recurrence theorem.

In order to understand where this theorem comes from and what it means, we

consider how the region S moves under the flow. Define a function f which maps

S along the flow for a time T ,

f(S) = φT (S).

Subsequent iterations of this time-T map produce a sequence of subsets of X, f²(S) = φ_{2T}(S), f³(S) = φ_{3T}(S), and so on, all with finite volume in X. Each iteration takes a bite out of X and so, if we iterate enough times, eventually we must exhaust all of the volume in X. As a result, two of these subsets must intersect; i.e. there must exist integers i and j, with i > j, such that f^i(S) ∩ f^j(S) is non-empty. This implies that f^{i−j}(S) ∩ S is also non-empty. S must fold back on itself repeatedly

under this time-T flow map. By considering small subsets of S, which must also have


this property, we can convince ourselves that a randomly selected point in S does

indeed return to S infinitely often (for a precise proof of the theorem, see references

at end of chapter). The Poincare recurrence theorem as stated implies that almost

every initial condition x0 in the bounded region X generates a trajectory which

returns arbitrarily close to x0 infinitely many times. This recurrence property is

truly remarkable when you consider the bewildering array of nonlinear Hamiltonian

systems to which it may be applied. Indeed, the Poincare recurrence theorem is

considered the first great theorem of modern dynamics; we will have more to say

about its role in statistical mechanics later on.
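Recurrence can also be watched directly. The sketch below (all choices illustrative) iterates the area-preserving standard map on the bounded torus and counts how often an orbit started inside a small box S comes back to S; the count keeps growing, as the theorem demands.

```python
import math

# Poincare recurrence, numerically: the standard map on the torus is
# area-preserving and the torus is bounded, so an orbit started in a
# small region S should return to S again and again. We count returns.
# K and the box size are illustrative parameter choices.

TWO_PI = 2 * math.pi

def step(theta, p, K=6.0):
    p = (p + K * math.sin(theta)) % TWO_PI
    theta = (theta + p) % TWO_PI
    return theta, p

def count_returns(theta0, p0, side=0.3, iterations=200_000):
    """Count visits to the box S = [theta0, theta0+side) x [p0, p0+side)."""
    theta, p = theta0 + side / 2, p0 + side / 2   # start inside S
    returns = 0
    for _ in range(iterations):
        theta, p = step(theta, p)
        if theta0 <= theta < theta0 + side and p0 <= p < p0 + side:
            returns += 1
    return returns

print(count_returns(2.0, 2.0))   # many returns, growing with iterations
```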

3. The Microcanonical Ensemble and the Ergodic Hypothesis

As discussed earlier, the role of the ensemble in statistical mechanics is to

provide a probabilistic method of extracting important information about a macro-

scopic system. Naturally, the choice of a particular ensemble depends on the phys-

ical problem of interest but quite often one is interested in equilibrium properties

of a physical system. In this special case of equilibrium statistical mechanics, we

expect that ensemble averages (9) do not depend explicitly on time. This implies

(19) ∂ρ(q,p, t)/∂t = 0.

An ensemble satisfying (19) is said to be stationary. Note that a stationary

ensemble satisfying Liouville’s Theorem (17) has a vanishing Poisson Bracket with

the Hamiltonian,

(20) {ρ(q,p), H(q,p)} = 0.

Since {qi, pj} = δij, no function of q or p alone will satisfy (20). The general

solution for a stationary ensemble therefore has the form

ρ(q,p) = ρ(H(q,p)).

The Hamiltonian plays an important role in determining the form of the distribution

function.

The simplest example of a stationary ensemble is the microcanonical ensem-

ble, in which the probabilities are uniformly distributed across the hypersurfaces


in phase space defined by energy conservation:

(21) ρ(q,p) = constant.

The microcanonical ensemble is one of the cornerstones of equilibrium statistical

mechanics. Accordingly, many introductory textbooks begin with the assumption

that this ensemble is valid, that all probabilities are equal a priori. It is not difficult,

however, to construct low-dimensional examples for which this ensemble is clearly

not valid. Why then, beyond the demonstrable success of statistical mechanics

as a physical theory, do we believe in a priori equal probabilities? The answer

comes again from dynamical systems theory. In this section, we expose some of

the dynamical machinery underlying the microcanonical ensemble. In particular,

we will introduce what physicists call the ergodic hypothesis and discuss how this

hypothesis places statistical mechanics on firm ground.

Recall from the preceding section that conservation of energy constrains the

flow to lie on families of hypersurfaces in phase space; for what follows, we consider

one such surface X. Recall also that X is bounded and is invariant (18). Since the

flow is Hamiltonian, we know that almost every point is recurrent. What we don’t

know is how “intertwined” the phase trajectories are. That is, it’s conceivable that

X can be broken up into a number of different invariant subspaces X1, X2,. . . ,

φt(Xi) = Xi for all i,

without violating the Poincare recurrence theorem. When this partitioning into

invariant subspaces is not possible, the flow is said to be ergodic. More precisely,

the flow is ergodic if and only if the only invariant subspaces of X occupy a volume

equal to that of X or else occupy zero volume. This means that in an ergodic flow

almost every trajectory wanders almost everywhere on X. Note how much stronger

this is than the recurrence property. The Poincare recurrence theorem is always

true for a general Hamiltonian system but we are not given ergodicity a priori. In

physics, we assume that the Hamiltonian systems used to describe the macroscopic

physical world have ergodic flows2; this is known as the ergodic hypothesis.

2Remember that we’re talking about enormous systems involving over 10^23 particles here

and not low-dimensional systems such as those used to describe rigid body motion. More about

this later.


In order to better understand the consequences of the ergodic hypothesis and

how it connects with the microcanonical ensemble, we need two more theorems

from dynamical systems. These will be stated without proof; interested readers

can refer to the end of the chapter for recommendations on where to find a more

precise treatment of what follows. The first theorem we need states very simply

that a flow is ergodic if and only if the only functions which are invariant under the

flow are constants,

(22) φt is ergodic ⇐⇒ whenever f(φt(x)) = f(x), f = constant.

This is plausible. Ergodic flows, recall, have trajectories wandering almost every-

where in X. And so, having a function be constant on some trajectory means that

it must be constant almost everywhere in X. This result leads us straight to the

microcanonical ensemble. From Liouville’s theorem, we know that the distribution

function ρ is invariant under the flow. When we add the ergodic hypothesis, (22)

tells us right away that ρ must be a constant on each energy surface X. Thus, the

ergodic hypothesis replaces the need to accept on faith a separate assumption of a

priori equal probabilities.

Next we want to consider how functions of the dynamical variables behave when

averaged along trajectories in a Hamiltonian flow (and we mean any flow here, not

just an ergodic one). Does the following have a well-defined limit:

lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt ?

The answer is yes. The “Birkhoff pointwise ergodic theorem” states that, for almost all x, these time averages converge to something:

lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt = f*(x).

The limit may depend on x, which is why f∗ is written above as a function of x

(and is why mathematicians call this a “pointwise” theorem), but the limit almost

always exists3 — this is the kind of thing a physicist is usually willing to take on

faith. This theorem also, however, makes some interesting statements about the

3We have to say “for almost all x” because, though the limit exists for a randomly selected

x, there may be a set of measure zero for which convergence fails and we want to be careful


limiting function f∗(x). First, this limiting function is invariant under the flow,

(23) f*(φt(x)) = f*(x).

Even more surprising, the ensemble average of the limiting function simply equals the ensemble average of the original function f,

(24) ∫ f*(x) ρ(x) dx = ∫ f(x) ρ(x) dx;

somehow time averaging under the integral sign doesn’t affect the value of the

ensemble average!

The Birkhoff theorem does not assume that the flow is ergodic. When we combine this theorem with the ergodic hypothesis of statistical mechanics, we get a major result: Begin with any function of the dynamical variables f(x). We’re interested in the long time average of this function,

lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt = f*(x).

We know from (23) that the limiting function f∗(x) is invariant under the flow.

Since the flow is ergodic, (22) implies that f∗(x) is a constant (almost everywhere).

We can actually compute this constant by integrating over the ensemble,

∫ f* ρ(x) dx = f* ∫ ρ(x) dx = f*.

However, we know from (24) the term on the left is equal to the ensemble average

of the original function f . Therefore,

(25) lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt = ∫ f(x) ρ(x) dx;

statistical averaging over the entire ensemble at fixed time is equivalent to time-

averaging a single member of the ensemble. This consequence of the ergodic hy-

pothesis is the justification for replacing macroscopic averages over computed tra-

jectories with an ensemble theory. When we compute the pressure in an ideal gas using kinetic theory, for example, we ignore time evolution and consider only what a typical gas molecule is doing on average. This works precisely because of the

equivalence of ensemble averaging and time averaging. Indeed, all of equilibrium

statistical mechanics may be understood in terms of this result. It gives physicists

great confidence that the ergodic hypothesis has not led them astray. Furthermore,

to the extent that all measurements in the lab are time averages, ergodicity and


the microcanonical ensemble firmly ground macroscopic measurements in the mi-

croscopic dynamics of the system being investigated. No experiment to date has

shaken our confidence in the foundations of statistical mechanics.
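The equivalence (25) of time and ensemble averages can be tested numerically on a strongly chaotic area-preserving map, used here as a stand-in for an ergodic flow (the map, observable, and parameters below are all illustrative choices). The time average of cos θ along a single long orbit should approach the uniform phase-space average, which is exactly 0:

```python
import math

# Time average vs. ensemble average, equation (25): iterate one long
# orbit of a strongly chaotic standard map (K = 9, illustrative) and
# time-average the observable f(theta, p) = cos(theta). The average of
# cos(theta) over the uniform measure on the torus is exactly 0.

TWO_PI = 2 * math.pi

def step(theta, p, K=9.0):
    p = (p + K * math.sin(theta)) % TWO_PI
    theta = (theta + p) % TWO_PI
    return theta, p

def time_average(f, theta0=1.0, p0=1.0, iterations=500_000):
    theta, p = theta0, p0
    total = 0.0
    for _ in range(iterations):
        theta, p = step(theta, p)
        total += f(theta, p)
    return total / iterations

f = lambda theta, p: math.cos(theta)
print(time_average(f))   # small, near the ensemble average of 0
```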


4. Density Operators in Quantum Mechanics

In classical physics, the state of a system at some fixed time t is uniquely

defined by specifying the values of all of the generalized coordinates qi(t) and mo-

menta pi(t). In quantum mechanics, however, the Heisenberg uncertainty principle

prohibits simultaneous measurements of position and momentum to arbitrary pre-

cision. We might therefore anticipate some revisions in our approach. It turns

out, however, that the classical ensemble theory developed above carries over into

quantum mechanics with hardly any revision at all. Most of the necessary alterations

are built directly into the edifice of quantum mechanics and all we need is to find

suitable quantum mechanical replacements for the density function ρ(q, p) and Li-

ouville’s Theorem. Understanding this is the goal of this section. Readers who are

unfamiliar with Dirac notation and the basic concepts of quantum mechanics are

referred to the references at the end of the chapter.

The uncertainty principle renders the concept of phase space meaningless in

quantum mechanics. The quantum state of a physical system is instead repre-

sented by a state vector, |ψ〉, belonging to an abstract vector space called the

state space of the system. The use of an abstract vector space stems from the

important role that superposition of states plays in quantum mechanics — lin-

ear combinations of states provide new states and, conversely, quantum states can

always be decomposed into linear combinations of other states. The connection

between these abstract vectors and experimental results is supplied by the formal-

ism of linear algebra, by operators and their eigenvalues. Dynamical variables,

such as position and energy, are represented by self-adjoint linear operators on the

state space and the result of any measurement made on the system is always rep-

resented by the eigenvalues of the appropriate operator (that is, the eigenvectors

of an observable physical quantity form a basis for the entire state space). This

use of operators and eigenvalues directly encodes many of the distinct hallmarks of

quantum mechanical systems: Discretization, such as that of angular momentum or


energy observed in numerous experiments, simply points to an operator

with a discrete spectrum of eigenvalues. And wherever the order in which several

different measurements are made may affect the results obtained, the associated

quantum operators do not commute.

In quantum mechanics, the time evolution of the state vector is described by

Schrodinger’s equation,

(26) iℏ ∂|ψ(t)〉/∂t = H(t) |ψ(t)〉,

where H(t) is the Hamiltonian operator for the system; this evolution law replaces

the canonical equations of classical mechanics.

Exercise 1.6 (single particle dynamics). Write down, using wavefunctions

ψ(q, t), Schrodinger’s equation for a single particle of mass m in an external po-

tential V (q).

Solution. Recall, that the classical Hamiltonian for this system is simply

H(q, p) =p2

2m+ V (q).

We transform this into a quantum operator by replacing q and p with the appropriate

quantum operators: q is the position operator and

p =~i

∂q

is the momentum operator for a wavefunction ψ(q, t). Then, Schrodinger’s equation

(26) becomes the following partial differential equation,

(27) \quad i\hbar \frac{\partial}{\partial t} \psi(q, t) = \left( -\frac{\hbar^2}{2m} \nabla^2 + V(q) \right) \psi(q, t).
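Although the text proceeds analytically, equation (27) is also a convenient starting point for numerics. The sketch below (an illustration, not part of the original development) discretizes the corresponding time-independent problem Hψ = Eψ on a grid, choosing the harmonic potential V(q) = ½mω²q² as an example and units with ℏ = m = ω = 1, and recovers the familiar spectrum:

```python
import numpy as np

# Discretize H = -(hbar^2/2m) d^2/dq^2 + V(q) on a uniform grid, using a
# second-order finite difference for the Laplacian. Units hbar = m = w = 1
# and the harmonic potential are illustrative choices, not from the text.
hbar = m = w = 1.0
n, L = 1000, 20.0
q = np.linspace(-L / 2, L / 2, n)
dq = q[1] - q[0]

# Tridiagonal Laplacian (Dirichlet boundaries at the box edges).
lap = (np.diag(np.full(n - 1, 1.0), -1)
       - 2.0 * np.eye(n)
       + np.diag(np.full(n - 1, 1.0), 1)) / dq**2
H = -(hbar**2 / (2 * m)) * lap + np.diag(0.5 * m * w**2 * q**2)

E, psi = np.linalg.eigh(H)   # columns of psi are the grid eigenfunctions
print(E[:3])                 # approximately hbar*w*(k + 1/2): 0.5, 1.5, 2.5
```

The eigenvalue problem here has dimension equal to the number of grid points; for a many-body wavefunction the grid grows exponentially with particle number, which is exactly the hopelessness of direct solution noted in the text.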

Schrödinger’s equation has a number of nice properties. First, as a linear

equation, it directly expresses the principle of superposition built into the vector

structure of the state space — linear combinations of solutions to (26) provide new

solutions. In addition, it can be shown that the norm of a state vector, 〈ψ(t)|ψ(t)〉, is invariant in time; this turns out to have a nice interpretation in terms of local conservation of probability. On the other hand, the Schrödinger equation is not easy

to solve directly. Even a system as simple as the one-dimensional harmonic os-

cillator requires great dexterity. For a macroscopic system, (26) generates either

an enormous eigenvalue problem or a high-dimensional partial differential equation


(consider the generalization of (27) to a many-body system). Either way, we see

that direct solution is hopeless. The situation is essentially identical with that of

macroscopic classical mechanics — the mathematics and, more importantly, our

lack of information about the microscopic state (quantum numbers, in this case)

necessitate a statistical approach.

We would like to find a quantum mechanical entity that replaces the classical

probability density ρ(q,p), which uses probabilities to represent our ignorance of

the true state of the system. Unfortunately, the usual interpretation of quantum

mechanics already employs probabilities on a deeper level: If the measurement of

some physical quantity A in this system is made a large number of times (i.e. on

a large ensemble of identically prepared systems), the average of all the results

obtained is given by the expectation value

(28) 〈A〉 = 〈ψ|A|ψ〉,

provided the quantum state |ψ(t)〉 is properly normalized to satisfy 〈ψ(t)|ψ(t)〉 = 1.

In order to understand the consequences of this, we introduce a basis of eigenstates

for the operator A. Let |ai〉 be the eigenvector corresponding to the eigenvalue ai.

Since the |ai〉 form a basis, we can expand the identity operator as follows,

(29) \quad 1 = \sum_i |a_i\rangle \langle a_i|.

Inserting this operator into (28) twice, we obtain

(30) \quad \langle A \rangle = \sum_i a_i \left| \langle a_i | \psi \rangle \right|^2.

Comparing this result to the definition of the expectation value,

(31) \quad \langle A \rangle = \sum_i a_i \, p(a_i),

we see that |〈ai|ψ〉|² must be interpreted as representing the probability p(ai) of obtaining ai as the result of the measurement. This probabilistic framework replaces

the classical notion of a dynamical variable having a definite value. While the ex-

pectation value of A is a definite quantity, particular measurements are indefinite

— in quantum mechanics we can only talk about the probabilities of different outcomes of an experiment.

Now we can introduce an ensemble. Instead of considering

a single state |ψ〉, let pk represent the probability of the system being in a quantum


state represented by the normalized state vector |ψk〉. If the system is actually in

state |ψk〉, then the probability of measuring ai is simply |〈ai|ψk〉|². If, however,

we are uncertain about the true state then we have to average over the ensemble.

In this case, the total probability of measuring ai is given by

(32) \quad p(a_i) = \sum_k p_k \left| \langle a_i | \psi_k \rangle \right|^2 = \langle a_i | \left( \sum_k |\psi_k\rangle p_k \langle \psi_k| \right) | a_i \rangle.

The object in parentheses in this last expression,

(33) \quad \rho = \sum_k |\psi_k\rangle p_k \langle \psi_k|,

is known as the density operator. Equation (33) turns out to be exactly what we're looking for: the quantum mechanical operator corresponding to the classical density function ρ(q, p). Recall that the classical density satisfies the following properties:

(1) Non-negativity of probabilities: ρ(q, p) must be non-negative for all points in the phase space.

(2) Normalization of probabilities: \int \rho(q, p) \, dq \, dp = 1.

(3) Expectation values: The average value of a dynamical variable A(q, p) across the entire ensemble represented by ρ(q, p) is given by

\langle A \rangle = \int A(q, p) \, \rho(q, p) \, dq \, dp.

These properties carry over into the quantum mechanical setting, with appropriate

modification (see exercises). In particular, it can be shown that

\langle A \rangle = \operatorname{trace}\{A\rho\}.

Apart from traces over a density operator replacing integration over the classical

ensemble, the statistical description of a complex quantum system is essentially no

different than that of a complex classical system. The time evolution of the density

operator ρ will be given by a quantum version of Liouville’s Theorem and will lead

to the same notions of a microcanonical ensemble and ergodicity.
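These properties are easy to verify numerically for a small ensemble. In the sketch below (states, weights, and the observable are all invented for illustration), ρ is assembled directly from (33) and checked against the quantum versions of properties (1)-(3):

```python
import numpy as np

rng = np.random.default_rng(0)

# A small mixed ensemble: normalized state vectors |psi_k> with weights p_k
# in a d-dimensional state space (all values invented for illustration).
d, K = 4, 3
psis = rng.normal(size=(K, d)) + 1j * rng.normal(size=(K, d))
psis /= np.linalg.norm(psis, axis=1, keepdims=True)
p = np.array([0.5, 0.3, 0.2])

# Density operator rho = sum_k |psi_k> p_k <psi_k|, as in (33).
rho = sum(pk * np.outer(v, v.conj()) for pk, v in zip(p, psis))

# A random self-adjoint observable A.
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A = X + X.conj().T

evals = np.linalg.eigvalsh(rho)
print(np.all(evals > -1e-12))                    # (1) non-negative spectrum
print(np.isclose(np.trace(rho).real, 1.0))       # (2) trace{rho} = 1
ensemble_avg = sum(pk * (v.conj() @ A @ v).real for pk, v in zip(p, psis))
print(np.isclose(np.trace(A @ rho).real, ensemble_avg))  # (3) <A> = trace{A rho}
```

The last check compares trace{Aρ} against the ensemble average computed state by state, which is the content of Exercise 1.9 below.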

First, we derive the quantum evolution law for ρ. Using the product rule, we can write

(34) \quad i\hbar \frac{\partial \rho}{\partial t} = \sum_k i\hbar \left[ \left( \frac{\partial}{\partial t}|\psi_k\rangle \right) p_k \langle \psi_k| + |\psi_k\rangle p_k \left( \frac{\partial}{\partial t}\langle \psi_k| \right) \right].


Substituting the Schrödinger equation, this reduces to

(35) \quad i\hbar \frac{\partial \rho}{\partial t} = \sum_k \left[ \left( H|\psi_k\rangle \right) p_k \langle \psi_k| - |\psi_k\rangle p_k \left( \langle \psi_k|H \right) \right] = H\rho - \rho H.

Thus,

(36) \quad \frac{\partial \rho}{\partial t} = -\frac{1}{i\hbar}\,[\rho, H],

where [ρ, H] = ρH − Hρ is called the commutator of ρ and H. Note the striking resemblance between (36) and Liouville's Theorem — the commutator of the density and Hamiltonian operators has replaced the classical Poisson bracket of the density and Hamiltonian functions, but the expressions are otherwise identical. This is a

special case of a correspondence first pointed out by Dirac:

classical Poisson bracket, \{u, v\} \quad \longrightarrow \quad quantum commutator, \frac{1}{i\hbar}\,[u, v].
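Equation (36) can be checked numerically for a small matrix model (the Hamiltonian and initial state below are random and purely illustrative; ℏ = 1): evolving the state exactly and differencing ρ(t) reproduces the commutator.

```python
import numpy as np

rng = np.random.default_rng(1)
hbar = 1.0                                  # illustrative units
d = 4

# A random self-adjoint Hamiltonian and a random pure-state rho(0).
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = X + X.conj().T
v = rng.normal(size=d) + 1j * rng.normal(size=d)
v /= np.linalg.norm(v)
rho0 = np.outer(v, v.conj())

# Exact evolution rho(t) = U rho(0) U^dagger, with U = exp(-i H t / hbar)
# built from the eigen-decomposition of H.
E, S = np.linalg.eigh(H)
def rho_t(t):
    U = S @ np.diag(np.exp(-1j * E * t / hbar)) @ S.conj().T
    return U @ rho0 @ U.conj().T

# Compare a centered difference for d rho / dt at t = 0 against
# -(1/i hbar)[rho, H], i.e. equation (36).
eps = 1e-6
lhs = (rho_t(eps) - rho_t(-eps)) / (2 * eps)
rhs = -(rho0 @ H - H @ rho0) / (1j * hbar)
print(np.allclose(lhs, rhs, atol=1e-6))     # the two sides agree
```

Note also that trace{ρ(t)} stays equal to 1 under this evolution, the quantum analogue of the conservation of classical probability.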

As in the classical setting, a stationary ρ should be independent of time, which by (36) requires [ρ, H] = 0; for an equilibrium quantum system, ρ must therefore be a function of the Hamiltonian, ρ(H). The simplest choice is again a uniform distribution,

(37) \quad \rho = \sum_k |\psi_k\rangle \frac{1}{n} \langle \psi_k|,

where n is the number of states |ψk〉 in the ensemble. This is the quantum microcanonical ensemble. It is essentially the same as the classical one, except discrete.

The same statistical principles apply; we simply switch to a discrete formalism, with traces over operators replacing integrals over the classical phase space.
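A toy example may make the discrete ensemble concrete (the model is invented for illustration, not from the text): take three independent two-state spins si = ±1 with energy E = −(s1 + s2 + s3); the microcanonical ρ of (37) is then the normalized projector onto all microstates sharing a chosen energy.

```python
import numpy as np

# Enumerate the 8 microstates of three spins s_i = +1 or -1, with toy
# energy E(s) = -(s1 + s2 + s3) (an invented model for illustration).
spins = [(s0, s1, s2) for s0 in (1, -1) for s1 in (1, -1) for s2 in (1, -1)]
E_target = -1                           # the shell with one spin flipped
shell = [s for s in spins if -sum(s) == E_target]
n = len(shell)                          # number of states in the shell

# Each microstate is a basis vector of an 8-dimensional space; rho from
# (37) is the projector onto the energy shell divided by n.
dim = len(spins)
rho = np.zeros((dim, dim))
for s in shell:
    e = np.zeros(dim)
    e[spins.index(s)] = 1.0
    rho += np.outer(e, e) / n

print(n)                                # 3 states on this energy shell
print(np.isclose(np.trace(rho), 1.0))   # normalized, as required
```

Every microstate on the shell carries weight 1/n, the equal a priori probabilities of the microcanonical ensemble in discrete form.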

Exercise 1.7. Show that the eigenvalues of the density operator are non-

negative.

Solution. Let ρ′ represent any eigenvalue of ρ and let |ρ′〉 be the associated eigenvector. Then

\sum_k |\psi_k\rangle p_k \langle \psi_k | \rho' \rangle = \rho |\rho'\rangle = \rho' |\rho'\rangle.

Multiplying on the left by 〈ρ′|, we obtain

\sum_k p_k \left| \langle \psi_k | \rho' \rangle \right|^2 = \rho' \langle \rho' | \rho' \rangle.


It follows, since the pk are non-negative and 〈ρ′|ρ′〉 is positive, that ρ′ cannot be negative. Since the eigenvalues of ρ play the role in the quantum setting that the values of ρ(q, p) play in the classical setting, this result mirrors property (1) above.

Exercise 1.8. Show that the matrix representation of ρ in any basis satisfies

(38) \quad \operatorname{trace}\{\rho\} = 1.

Solution. Consider a basis of eigenstates |ai〉 of the operator A. The matrix

elements ρij = 〈ai|ρ|aj〉 are the representation of ρ in this basis. Then,

\operatorname{trace}\{\rho\} = \sum_i \langle a_i | \rho | a_i \rangle = \sum_i \sum_k p_k \left| \langle \psi_k | a_i \rangle \right|^2 = \sum_k p_k \left( \sum_i \left| \langle \psi_k | a_i \rangle \right|^2 \right) = \sum_k p_k = 1.

Since the trace is invariant under a change of basis, this result holds for any basis.

The condition trace{ρ} = 1 should be compared to the normalization property (2) above.

Exercise 1.9. Show that, in a quantum ensemble represented by the operator

ρ, the expectation value of an operator A satisfies

(39) \quad \langle A \rangle = \operatorname{trace}\{A\rho\}.

Solution.

\langle A \rangle = \sum_k p_k \langle \psi_k | A | \psi_k \rangle = \sum_{k,i} p_k \langle \psi_k | a_i \rangle \langle a_i | A | \psi_k \rangle
= \sum_{i,k} \langle a_i | A | \psi_k \rangle p_k \langle \psi_k | a_i \rangle = \sum_i \langle a_i | A \rho | a_i \rangle = \operatorname{trace}\{A\rho\}.

This result should be compared to the classical definition of expectation value, prop-

erty (3) above.

5. Discussion


One important feature of Hamiltonian dynamics is the equal status given to

coordinates and momenta as independent variables, as this allows for a great deal of

freedom in selecting which quantities to designate as coordinates and momenta (the


qi and pi are often called generalized coordinates and momenta). Any set of variables satisfying the canonical equations (2-3) is called a set of canonical variables.

One may transform between different sets of canonical variables; these changes of

variables are called canonical transformations. Note that while the form of the Hamiltonian depends on how the chosen set of canonical variables is defined, the form of the canonical equations is by definition invariant under canonical transformations.

Hamiltonian systems have a great deal of additional structure. The quantity,

(40) \quad \oint_\gamma \mathbf{p} \cdot d\mathbf{q} = \sum_{i=1}^{n} \oint_\gamma p_i \, dq_i,

known as Poincaré's integral invariant, is independent of time if the evolution of

the closed path γ follows the flow in phase space. The left-hand side of (40) is also

known as the symplectic area. This result can be generalized if we extend our phase

space by adding a dimension for the time t. Let Γ1 be a closed curve in phase space

(at fixed time) and consider the tube of trajectories in the extended phase space

passing through points on Γ1. If Γ2 is another closed curve in phase space enclosing

the same tube of trajectories, then

(41) \quad \oint_{\Gamma_1} \left( \mathbf{p} \cdot d\mathbf{q} - H \, dt \right) = \oint_{\Gamma_2} \left( \mathbf{p} \cdot d\mathbf{q} - H \, dt \right).

This result, that the integral ∮(p · dq − H dt) takes the same value on any two paths around the same tube of trajectories, is called the Poincaré–Cartan integral theorem. Note that if both paths are taken at fixed time, then (41) simply reduces to (40).
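The invariance of (40) can be seen numerically in the simplest setting (an illustrative sketch, not from the text): for H = (p² + q²)/2, with unit mass and frequency, the phase flow is a rigid rotation of the (q, p) plane, so a closed loop can be transported exactly and its loop integral recomputed.

```python
import numpy as np

# A closed loop gamma in the (q, p) plane (this circle and its offset
# are arbitrary illustrative choices).
theta = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
q0 = 1.0 + 0.3 * np.cos(theta)
p0 = 0.5 + 0.3 * np.sin(theta)

def loop_integral(q, p):
    # Trapezoidal evaluation of  oint p dq  around the closed polygon.
    return np.sum(0.5 * (p + np.roll(p, -1)) * (np.roll(q, -1) - q))

def flow(q, p, t):
    # Exact flow of H = (p^2 + q^2)/2: qdot = p, pdot = -q, a rotation by t.
    return q * np.cos(t) + p * np.sin(t), p * np.cos(t) - q * np.sin(t)

A0 = loop_integral(q0, p0)
q1, p1 = flow(q0, p0, 1.7)                     # evolve every point of the loop
print(np.isclose(loop_integral(q1, p1), A0))   # symplectic area unchanged
```

The same check with a non-Hamiltonian flow (say, one with friction) would fail: the loop integral shrinks, which is one way to see what the symplectic structure forbids.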

Structure of this sort, as well as the presence of additional invariant quantities,

greatly constrains the flow in phase space and one may wonder whether this struc-

ture is compatible with the ergodic hypothesis and the microcanonical ensemble.

The most extreme illustration of the conflict is the special case of integrable Hamil-

tonian systems. A time-independent Hamiltonian system is said to be integrable

if it has n independent global constants of the motion (one of which is the Hamiltonian itself), no two of which have a non-zero Poisson bracket. The existence of n invariants confines the phase trajectories to an n-dimensional subspace


(recall that the entire phase space is 2n-dimensional; this is a significant reduction

of dimension). The independence of these invariants guarantees that none can be

expressed as a function of the others. The last condition, that no two of the invariants have a non-zero Poisson bracket, restricts the topology of the manifold to which the trajectories are confined — it must be an n-dimensional torus. A canonical

transformation to what are known as action-angle variables, for which

I_i = \frac{1}{2\pi} \oint_{\gamma_i} \mathbf{p} \cdot d\mathbf{q}

provides the canonical momenta and the angle θi around the loop γi provides the canonical coordinates, simplifies the description immensely: each Ii is a constant of the motion, and the conjugate angle θi advances at a uniform frequency ωi = ∂H/∂Ii, generating trajectories which spiral uniformly around the surface of the n-torus. For most choices

of the Ii, a single trajectory will fill up the entire torus; this is called quasi-periodic motion. The microcanonical ensemble, for which the trajectories wander ergodically on a (2n − 1)-dimensional energy surface, captures none of this structure. On one hand, highly structured Hamiltonian systems appear to exist in

Nature, the premier example being our solar system. On the other hand, we have the remarkable success of statistical mechanics (and its underlying hypotheses

of ergodicity and equal a priori probabilities) in providing a foundation for thermo-

dynamics and condensed matter physics. This success remains a mystery.
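The distinction between quasi-periodic filling and closed orbits is easy to visualize numerically (a sketch with arbitrarily chosen frequencies): sample θi(t) = ωi t mod 2π on a 2-torus and count how much of a coarse grid the trajectory visits.

```python
import numpy as np

def coverage(w1, w2, t_max=20000.0, bins=20):
    # Fraction of the cells of a bins x bins grid on the 2-torus visited
    # by the trajectory theta_i(t) = w_i * t (mod 2 pi).
    t = np.linspace(0.0, t_max, 400_000)
    th1 = (w1 * t) % (2.0 * np.pi)
    th2 = (w2 * t) % (2.0 * np.pi)
    hist, _, _ = np.histogram2d(
        th1, th2, bins=bins,
        range=[[0.0, 2.0 * np.pi], [0.0, 2.0 * np.pi]])
    return np.mean(hist > 0)

cov_irrational = coverage(1.0, np.sqrt(2.0))   # incommensurate frequencies
cov_rational = coverage(1.0, 2.0)              # commensurate: a closed orbit
print(cov_irrational)   # near 1: the winding is dense on the torus
print(cov_rational)     # well below 1: the orbit is a closed curve
```

Even the dense winding, however, stays on its n-torus: it never explores the rest of the (2n − 1)-dimensional energy surface, which is exactly the tension with ergodicity described above.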