
STATISTICAL MECHANICS

Lecture Notes L3 Physique Fondamentale

Université Paris-Sud

H.J. Hilhorst

2016-2017

Laboratoire de Physique Théorique, bâtiment 210

CNRS and Université Paris-Sud, 91405 Orsay Cedex, France


PREFACE

These Notes were written for the benefit of the students of the Statistical Mechanics I course at level L3 of the “Fundamental Physics” track. They can in no way replace attending the course, participating in the problem sessions, and reading books.

The choice of subjects has been chiefly determined by the requirement of coherence with the French version of the course. Several important subjects left out will be dealt with in the Statistical Mechanics II course or in advanced courses at the M1 and M2 level.

These Notes should be considered as an incomplete draft. Unavoidably, there must be errors and misprints that have escaped my attention. Criticism from colleagues and students is welcome.

H.J. Hilhorst

November 2016


TABLE OF CONTENTS

I. FROM MECHANICS TO STATISTICAL MECHANICS

II. MICROCANONICAL ENSEMBLE

III. CANONICAL ENSEMBLE

IV. HARMONIC OSCILLATOR

V. GRAND-CANONICAL ENSEMBLE

VI. STATISTICAL MECHANICS OF QUANTUM SYSTEMS

VII. IDEAL QUANTUM GASES

VIII. IDEAL FERMI GAS

IX. IDEAL BOSE GAS

X. PHOTON GAS

XI. CORRELATION FUNCTIONS

XII. SYSTEMS OF CHARGED PARTICLES

XIII. CHEMICAL EQUILIBRIUM

Appendix A. PROBABILITY DISTRIBUTIONS

Appendix B. MISCELLANEOUS MATHEMATICS


I. FROM MECHANICS TO STATISTICAL MECHANICS

I.1. Reminders about classical mechanics

I.1.1. Single particle

The motion of a classical particle of mass m is given by Newton’s law

F = ma. (1)

Here a = v̇ = r̈ is the particle’s acceleration and F is the sum of all forces acting on it. These include

– forces due to other particles;
– forces due to external electric, magnetic, and/or gravitational fields;
– forces due to the walls, if any, between which the particle is enclosed.

We will set p = mv. In general F will be a function of r, p, and t. The equation of motion is then equivalent to the pair of equations

ṙ(t) = p(t)/m,    ṗ(t) = F(r(t), p(t), t),   (2)

which are of first order in the derivatives with respect to time. If at some arbitrary initial instant one knows

r(t0) = r0, p(t0) = p0, (3)

then one can calculate, in principle, r(t) and p(t) at every later instant t > t_0. In order to convince yourself of this, discretize time into small intervals (“time steps”) ∆t and write, neglecting terms of order (∆t)²,

r(t_0 + ∆t) = r(t_0) + ∆t p(t_0)/m,    p(t_0 + ∆t) = p(t_0) + ∆t F(r(t_0), p(t_0), t_0).   (4)

These equations express the position and momentum at time t_0 + ∆t in terms of those at t_0. By iterating them one obtains the solution (r(t); p(t)) for all t > t_0. This solution may be visualized as a curve (a “trajectory”) parametrized by t in a space of dimension six. This space is called the particle’s phase space.
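The iteration in Eq. (4) is trivial to carry out on a computer. Below is a minimal Python sketch for a single particle in one dimension; the force law F(x) = −kx and the parameter values are illustrative choices, not part of the notes.

    # Minimal Euler iteration of Eqs. (4) for one particle in one dimension.
    # Illustrative force law (an assumption for this sketch): F(x) = -k*x.
    m, k = 1.0, 1.0        # mass and "spring constant" in arbitrary units
    dt = 1.0e-4            # time step Delta t
    nsteps = 200_000

    def energy(x, p):
        return 0.5 * p**2 / m + 0.5 * k * x**2

    x, p = 1.0, 0.0        # initial data r(t0), p(t0)
    E_initial = energy(x, p)
    for _ in range(nsteps):
        # Eq. (4): both updates use the values at the current instant t0.
        x, p = x + dt * p / m, p + dt * (-k * x)

    print("relative energy drift:", energy(x, p) / E_initial - 1.0)

The small but nonzero energy drift reflects the O(∆t²) error made at each step.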

Exercise. Show that you may also find the trajectory at all earlier times t < t0.

I.1.2. N particles

Let N particles, all of mass m, be numbered by i = 1, 2, . . . , N . Newton’s equations read

Fi = mai, i = 1, 2, . . . , N, (5)

where a_i = v̇_i = r̈_i is the acceleration of the ith particle. Since it is not our purpose to treat the most general case, let us suppose that the forces at work are due only to


Figure 1: The Lennard-Jones ‘6-12’ potential between neutral atoms.

(i) An external potential V_ext(r_i) acting on each of the particles i = 1, 2, . . . , N. This potential may represent, for example, a gravitational field

V_ext(r_i) = m g r_{i3}   (6)

where r_i = (r_{i1}, r_{i2}, r_{i3}) and g is the acceleration of gravity, or the effects of, for example, impenetrable walls,

V_ext = 0 if 0 ≤ r_{i1}, r_{i2}, r_{i3} ≤ L,    V_ext = ∞ elsewhere,   (7)

corresponding to particles enclosed in a cubic container of edge length L.

(ii) An interaction potential between the particles. More specifically we will suppose that there is only a spherically symmetric pair potential V_int(|r_i − r_j|) between them.

Remark. This potential applies to spherically symmetric particles. For example, the potential V_int(r) between two atoms of a noble gas at distance r = |r| has in good approximation the shape indicated in Fig. 1. For argon (Ar) we have σ = 3.4 × 10⁻¹⁰ m and ε = 1.65 × 10⁻²¹ J.
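As a small illustration of the scales involved, the sketch below evaluates the standard Lennard-Jones ‘6-12’ form V_int(r) = 4ε[(σ/r)¹² − (σ/r)⁶] with the argon parameters quoted above; the script itself is only an illustration, not part of the notes.

    import numpy as np

    # Lennard-Jones '6-12' pair potential with the argon parameters of the Remark.
    sigma = 3.4e-10    # m
    eps   = 1.65e-21   # J

    def v_lj(r):
        x6 = (sigma / r) ** 6
        return 4.0 * eps * (x6 * x6 - x6)

    for r in np.linspace(0.9 * sigma, 3.0 * sigma, 7):
        print(f"r = {r:.2e} m   V = {v_lj(r):+.2e} J")

    # The minimum sits at r = 2**(1/6)*sigma with depth -eps:
    r_min = 2.0 ** (1.0 / 6.0) * sigma
    print("minimum at", r_min, "m, depth", v_lj(r_min), "J")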

Suppose we know the r_i and p_i for all i = 1, 2, . . . , N. The kinetic and potential energy of this system are then given by

E_kin = ∑_{i=1}^N p_i²/(2m),   (8)

E_pot = ∑_{1≤i<j≤N} V_int(|r_i − r_j|) + ∑_{i=1}^N V_ext(r_i),   (9)

so that its total energy H becomes

H = H(r_1, r_2, . . . , r_N ; p_1, p_2, . . . , p_N) = E_kin + E_pot,   (10)

where the arguments are abbreviated as r^N = (r_1, . . . , r_N) and p^N = (p_1, . . . , p_N), and the full set of arguments as Γ = (r^N; p^N).


The force on the ith particle is equal to minus the gradient of the total potential energy with respect to r_i, which gives

F_i = −∂E_pot/∂r_i,    ∂/∂r_i ≡ (∂/∂r_{i1}, ∂/∂r_{i2}, ∂/∂r_{i3}).   (11)

Exercise. Show that

∂V_int(|r_i − r_j|)/∂r_i = r̂_ij V′_int(|r_i − r_j|),   (12)

in which V′_int(r) ≡ dV_int/dr and r̂_ij = (r_i − r_j)/|r_i − r_j|. Define the external force F_ext(r_i) and show that

F_i = −∑_{j=1, j≠i}^N r̂_ij V′_int(r_ij) + F_ext(r_i),   (13)

where r_ij = |r_i − r_j|.

The equations of motion for the N -particle system can now be written as

ṙ_i(t) = p_i(t)/m,    ṗ_i(t) = −∂E_pot(r_1(t), r_2(t), . . . , r_N(t))/∂r_i   (14)

for i = 1, 2, . . . , N. This is a closed system of 6N coupled first order ordinary differential equations. For a given set of initial values r_i(0), p_i(0) they have a unique solution r_i(t), p_i(t), which represents a trajectory in the 6N-dimensional phase space of the N-particle system.

One easily verifies that Eq. (14) is equivalent to

ṙ_i = ∂H/∂p_i ,    ṗ_i = −∂H/∂r_i ,   (15)

which are called the Hamilton or Hamilton-Jacobi equations of the particle system. In this context H is called the Hamiltonian.

The advantage of this slightly more abstract form of the equations of motion is that it remains valid in more general cases, in particular when r_i and p_i are replaced with an arbitrary pair of “canonically conjugate” coordinates.1 Furthermore, Eqs. (15) are easy to use in formal proofs, some of which will follow.

Exercise. Argue why two trajectories in phase space cannot intersect. Can there be closed loop trajectories?

Exercise. The Hamiltonian of a classical one-dimensional oscillator of mass m and angular frequency ω is

H = p²/(2m) + (1/2) m ω² x².   (16)

1. The definition of “canonically conjugate” is given in your Classical Mechanics course; here it suffices to know that r_i and p_i satisfy the condition of being canonically conjugate.


Figure 2: A trajectory in phase space, confined to a surface of constant energy.

Check that the Hamilton equations for this oscillator are the usual equations of motion that you know. Find the trajectories in phase space.
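For this oscillator the Hamilton equations give ẋ = p/m, ṗ = −mω²x, solved by x(t) = A cos(ωt + φ), p(t) = −mωA sin(ωt + φ); each trajectory is the ellipse p²/(2m) + mω²x²/2 = E. A minimal numerical check of this statement (the parameter values are arbitrary):

    import numpy as np

    # Phase-space trajectory of the 1D harmonic oscillator, Eq. (16).
    # Analytic solution of the Hamilton equations: x(t) = A cos(wt), p(t) = -m w A sin(wt).
    m, w, A = 1.0, 2.0, 0.5          # illustrative values
    t = np.linspace(0.0, 2 * np.pi / w, 1000)
    x = A * np.cos(w * t)
    p = -m * w * A * np.sin(w * t)

    # Every point lies on the constant-energy ellipse H(x, p) = E:
    H = p**2 / (2 * m) + 0.5 * m * w**2 * x**2
    E = 0.5 * m * w**2 * A**2
    print("max |H - E| along the trajectory:", np.max(np.abs(H - E)))   # ~ 1e-16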

In order to discuss the consequences of Eqs. (15) more easily we will set

r^N(t) = (r_1(t), . . . , r_N(t)),    p^N(t) = (p_1(t), . . . , p_N(t)),   (17)

and

Γ(t) = (r^N(t); p^N(t)).   (18)

Now Γ(t) is a trajectory in a 6N-dimensional phase space. For a given Γ(0) the Hamilton-Jacobi equations fully determine Γ(t) at all later (and earlier) times. This is a purely mechanical problem.

Remark. In classical mechanics the notation (qi,pi) instead of (ri,pi) is usual.

I.2. Ensembles: general

I.2.1 Probability distributions in phase space

The number of gas molecules in a cm³ is of the order of 10²³, and it is of course in practice possible neither to determine the initial r_i(0) and p_i(0) nor, even if we knew them, to calculate all the r_i(t) and p_i(t). Moreover, we would not be interested in so much detail.

We will use the terminology

microstate = point Γ in phase space

Since we can never know the microstate of a system, we now pass from simple mechanics to statistical mechanics by introducing the concept of a probability density P(Γ) = P(r^N; p^N) for the system to be in microstate Γ = (r^N; p^N). We will call

macrostate = probability distribution P (Γ) in phase space


The probability distributions that we are going to use will always depend on a few parameters, such as the volume, the temperature, etc. of the system; so the macrostate will depend on these parameters. We should evidently have the normalization

∫dΓ P(Γ) ≡ ∫dr^N dp^N P(r^N; p^N) = 1,   (19)

where dr^N ≡ dr_1 dr_2 . . . dr_N and dp^N = dp_1 dp_2 . . . dp_N. The actual choice of P(Γ) will be motivated by the experimental conditions: the way the system has been prepared, the absence or presence of contacts with its environment, etc. In general an initial P(Γ) will evolve in time and we will write it as P(Γ, t). For a physical system in equilibrium P(Γ) is independent of time. We will soon see examples.

I.2.2. Link with experiment

Let A be a measurable physical quantity (such as the pressure or the temperature of a gas) whose value in microstate Γ is equal to A(Γ). Suppose we measure A at some instant of time t. Such a measurement necessarily takes a short but finite time τ. The true experimental result obtained will be the value of A(Γ) averaged along the segment of the system’s trajectory between the two times t ± τ/2,

A_exper = (1/τ) ∫_{t−τ/2}^{t+τ/2} dt′ A(Γ(t′)).   (20)

Although τ is short at the macroscopic (human) scale, it will be a very long time at the microscopic scale. The essential hypothesis now is that this segment of a trajectory passes through a sequence of points of phase space that are sufficiently representative for the full region of phase space in which P(Γ) is sensibly different from zero. The average along the trajectory may therefore be replaced with the phase space average

⟨A⟩ = ∫dΓ P(Γ) A(Γ).   (21)

We consider this average as the theoretical prediction for the result of the experimental measurement.

We will henceforth consider the equivalence between (21) and (20) as established. This is called the ergodic hypothesis. Systems that can be proven to satisfy this hypothesis are called ergodic; however, for the vast majority of Hamiltonians of interest such proofs do not exist.
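For the one-dimensional oscillator of the exercise above the equivalence of (20) and (21) can be checked directly, since its trajectory covers the whole constant-energy curve. The sketch below (illustrative parameters; the microcanonical shell density is used for P(Γ)) compares the time average of x² along a trajectory with the average of x² over a thin energy shell; both tend to E/(mω²).

    import numpy as np

    rng = np.random.default_rng(0)

    # Time average vs phase-space average of x^2 for the 1D harmonic oscillator.
    m, w = 1.0, 1.3
    E, dE = 1.0, 0.01

    # Time average along the exact trajectory of energy E.
    A = np.sqrt(2 * E / (m * w**2))                 # amplitude for energy E
    t = np.linspace(0.0, 1000.0, 2_000_001)         # many periods
    time_avg = np.mean((A * np.cos(w * t))**2)

    # Phase-space average over the shell E < H < E + dE (uniform density on the shell).
    x = rng.uniform(-1.5 * A, 1.5 * A, 2_000_000)
    p = rng.uniform(-1.5 * m * w * A, 1.5 * m * w * A, 2_000_000)
    H = p**2 / (2 * m) + 0.5 * m * w**2 * x**2
    in_shell = (H > E) & (H < E + dE)
    shell_avg = np.mean(x[in_shell]**2)

    print("time average  :", time_avg)     # -> E/(m w^2)
    print("shell average :", shell_avg)    # -> the same value, up to sampling noise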

I.2.3. Energy conservation

Can a trajectory go everywhere in phase space? The answer is “no”, for the simple reason that – if the system is isolated from its environment (as we will suppose), and if the Hamiltonian does not depend explicitly on the time t – the total energy E of the system remains constant. Now the equation

H(Γ) = E   (22)

represents a hypersurface of dimension 6N − 1 in phase space. Hence every trajectory is confined to one of these hypersurfaces. This is also expressed by saying that H is an invariant.

One should wonder whether there are perhaps other invariants. These would then divide every hypersurface H = cst into smaller sectors to which the trajectories would be confined. In general, except for special cases that one recognizes as such, it is supposed that H is the only invariant.


The proof that H is an invariant takes a single line,

dH/dt = ∑_{i=1}^N [ ∂H/∂r_i · ṙ_i + ∂H/∂p_i · ṗ_i ] = 0,   (23)

where the second equality results from the equations of motion (15).

I.2.4. Ensembles

We may consider a plain density ρ(Γ) in phase space such that ρ(Γ) ≥ 0, but which is not normalized. One passes easily from ρ(Γ) to a probability density P(Γ) by means of the relation

P(Γ) = ρ(Γ) / ∫dΓ ρ(Γ).   (24)

One may represent any positive function ρ(Γ) on phase space as a collection of points such that the larger ρ(Γ) is, the denser the points are locally. At this stage of our discussion the total number of points has no significance. Then, abandoning the original interpretation of a probability law for a single physical system, we imagine that each of these points corresponds to a different system. Whence the name ensemble for ρ(Γ). This alternative interpretation is often useful. The phase space average (21) will also be referred to as the ensemble average.

Remark. The ensemble formalism that we are about to develop does not exclude the study of a precise microstate. If you know the microstate Γ_0 of the system of interest to you, then you can express this knowledge by setting

ρ(Γ) = δ(Γ− Γ0). (25)

Remark. The phase space of a classical physical system is a continuum: there is an uncountable infinity of microstates Γ. However, quantum mechanically the energy levels and eigenstates of a particle (harmonic oscillator, particle in a box) are discrete and countable. As a consequence, so is the phase space of a quantum system: it is discrete. This observation does not play a role right away – for the moment we will study only classical systems – but we have to keep it in mind.

Concerning ensembles, a more limited question is the following one. Suppose we have let the system relax to equilibrium. We then expect it to be described by a stationary (= time-independent) density, which we will denote by ρ(r^N; p^N). Can we find such a density? See section I.4.

I.3. The flow in phase space

Let there be given at a certain time t_0 an ensemble of points (microstates) in phase space, described by a density ρ(Γ, t_0). When the time evolution is “turned on”, each of these microstates starts moving along a trajectory determined by the Hamilton equations. At every later instant of time t > t_0 we therefore have a new set of microstates, described by a density ρ(Γ, t). The relation between ρ(Γ, t) and ρ(Γ, t_0) is, evidently, completely determined by the Hamilton equations. Question: how can we obtain it explicitly?

However, even before invoking these equations, we may observe that every point in the ensemble present at t_0 will still be present at time t, and no new point will have appeared. It


follows that the density ρ is conserved, that is, satisfies the continuity equation

∂ρ/∂t + div(ρV) = 0,   (26)

where V is the 6N-dimensional velocity in phase space (note: only 3-dimensional vectors will be printed in boldface). For later use we write Eq. (26) as

∂ρ/∂t + V · grad ρ + ρ div V = 0.   (27)

These equations require the following explanation. If at some instant t a microstate of the ensemble is at Γ = (r_1, . . . , r_N; p_1, . . . , p_N), then it has at that moment along its trajectory the velocity

V(Γ) = dΓ/dt = (ṙ_1, . . . , ṙ_N; ṗ_1, . . . , ṗ_N).   (28)

The current density in phase space is therefore equal to ρV. Furthermore, in the continuity equation the operators “div” and “grad” are the 6N-dimensional generalizations of those that you know,

grad ρ = ( ∂ρ/∂r_1, . . . , ∂ρ/∂r_N ; ∂ρ/∂p_1, . . . , ∂ρ/∂p_N ),   (29)

div V = ∑_{i=1}^N ∑_{α=1}^3 [ ∂ṙ_{iα}/∂r_{iα} + ∂ṗ_{iα}/∂p_{iα} ].   (30)

If now we employ the Hamilton-Jacobi equations (15) we may write (28) as

V = ( ∂H/∂p_1, . . . , ∂H/∂p_N ; −∂H/∂r_1, . . . , −∂H/∂r_N ).   (31)

With the help of definition (30) one sees from (31) that

divV = 0, (32)

which means that the flow in phase space is “incompressible”, this term being borrowed from the flow of a real three-dimensional fluid. From (32) and (27) we obtain another equation that expresses this incompressibility, namely

∂ρ/∂t + V · grad ρ ≡ Dρ/Dt = 0.   (33)

The symbol D/Dt defined here is the hydrodynamic2 derivative: the equation says that the density stays constant for an observer traveling along with the moving points. Equations (32) and (33) are equivalent formulations of what is called the Liouville theorem.

2. It has many other names: material derivative, convective derivative, . . .


If in Eq. (33) we substitute for V the explicit expression (31), we find that the density ρ(Γ, t) obeys

∂ρ/∂t = ∑_{i=1}^N [ ∂H/∂r_i · ∂ρ/∂p_i − ∂H/∂p_i · ∂ρ/∂r_i ],   (34)

which is called the Liouville equation. This answers the question asked at the beginning of this section.

Summary. Let the Hamiltonian of an N-particle system be given. Eqs. (15) then show how to calculate the time evolution of an individual system; and similarly, Eq. (34) shows how to calculate the time evolution of an ensemble of systems.

Eq. (34) is linear in ρ. Therefore the Liouville equation may be written succinctly as

∂ρ/∂t = Lρ,   (35)

where L, called the Liouville operator, is a very complicated but linear operator acting in the space of all functions ρ(Γ). Suppose that at some initial instant of time t = 0 we know ρ(Γ, 0). The linearity of Eq. (35) then allows us to write the solution ρ(Γ, t) at any later time t > 0 as

ρ(Γ, t) = e^{Lt} ρ(Γ, 0).   (36)

This solution, however, is only formal: it is impossible, in general, to calculate the operator e^{Lt}, and no practical conclusion can be drawn directly from (36). Nevertheless, Eq. (35) is the starting point for the statistical mechanics of (classical) time dependent systems. In this introductory course, however, we will be concerned almost exclusively with systems at equilibrium: their properties do not change with time.

Exercise. Show that the invariance of the energy, expressed by (23), may also be written as

V · grad H = 0.   (37)

Show furthermore that

|V| = |grad H|.   (38)

Interpret both relations.

I.4. Stationary ensembles

Stationarity

In general an ensemble ρ(Γ, t) depends on time and therefore any average ⟨A⟩ calculated in that ensemble will also depend on time. We know experimentally that an isolated system will tend to an equilibrium state in which its macroscopic properties have stationary (= time-independent) values. This is why we are interested in stationary ensembles ρ(Γ).

It is not hard to find such ensembles. Whenever we can write ρ as some function φ of H(Γ),

ρ(Γ) = φ(H(Γ)),   (39)

then upon substituting it in the Liouville equation (34) we find that ∂ρ/∂t = 0 whatever φ; or more rapidly,

∂ρ/∂t = (dφ/dH) (dH/dt) = 0,   (40)


where the second equality sign is due to (23). If the system were to have still other invariants than H, we might of course let φ depend on those, too.

It might seem unsatisfactory that there is an arbitrary function φ here. Now, as we will see, statistical mechanics is at the basis of thermodynamics, and it will appear – however surprising this may seem at first – that any choice of φ leads to the same thermodynamics. This is why we say that all stationary ensembles are equivalent. Different choices of φ, however, do have consequences when we ask questions that go beyond simple thermodynamics (and which may, for example, concern fluctuations). The choice of the appropriate ensemble will be dictated by experimental considerations. All this will become clear step by step in the following chapters.

In the remainder of this section we address various points that concern all ensembles.

• Partition function

In stationary ensembles the key role is played by a quantity Z, called the partition function, and defined as

Z = ∫dΓ ρ(Γ).   (41)

You might think that this quantity, already encountered in Eq. (24), is nothing but a simple normalization constant without great interest. A normalization constant indeed it is, but its interest goes far beyond what is clear at this stage.

• Three important ensembles

We discuss in the following chapters of these Notes the three most common statistical ensembles, namely

– the microcanonical ensemble (index “m”);
– the canonical ensemble (index “c”);
– the grand-canonical ensemble (index “g”).

• Walls

We call system the part of nature that we select for our investigation. We call wall the separation between a system and its surroundings. The nature of the wall plays a role in choosing the ensemble.

A wall is said to be (completely) isolating if it does not allow for the exchange of either heat or particles. We call it adiabatic if we want to emphasize that it does not allow for the exchange of heat. A wall is said to be diathermal if it allows for the exchange of heat. It is called permeable if it allows for the exchange of one or more species of particles (and necessarily for the energy that comes along with them). Finally, we may consider a wall able to slide without friction, which is then called a piston.

The same terminology applies to walls that separate two subsystems of a total system.

• Intensive and extensive variables

When you double a thermodynamic system by putting a copy of it next to it, the volume, number of particles, energy, . . . of the total system will also double; they are called extensive variables. However, the pressure, density, . . . will remain the same; they are called intensive variables.

The ratio of two extensive variables is intensive. In particular, if N is the number of particles in a system and X_k is another extensive variable, then x_k ≡ X_k/N is intensive: it represents


the amount of X_k per particle; similarly X_k/V represents the amount of X_k per unit of volume.

A consequence is the following. Let N and X_1, X_2, . . . be extensive variables. Let Y be another extensive variable that may be expressed in terms of them, and set y = Y/N. Then

Y(N, X_1, X_2, . . .)/N = y(x_1, x_2, . . .),   (42)

that is, an intensive variable can depend only on ratios of extensive variables.

Thermodynamics is valid only in the limit of a system that is very large with respect to the microscopic scales of length, energy, etc. This is referred to as the thermodynamic limit. Theoretically it may be taken, for example, by letting N → ∞ while keeping all intensive variables fixed. The above relations between intensive and extensive variables hold, strictly speaking, only in the thermodynamic limit. In statistical mechanics it will often be tacitly understood that the limit N → ∞ has been, or is to be, taken. [*** To be elaborated in class.]

Exercise. What contradictions would arise if, for example, the energy were superextensive? Subextensive? This shows that extensivity is a necessary condition for thermodynamic stability.


II. MICROCANONICAL ENSEMBLE

II.1. The ensemble

II.1.1. Boltzmann’s postulate

The microcanonical ensemble is based on Boltzmann’s postulate:

If an isolated macroscopic system has an energy in the interval (E, E + δE) and is in equilibrium, then, in the absence of any further information, all its microstates having an energy in that interval have the same probability.

The mathematical expression of this postulate leads us directly to the definition of the microcanonical ensemble, namely

ρ_m(Γ) = C if E < H(Γ) < E + δE,    ρ_m(Γ) = 0 otherwise,   (1)

in which C is a constant.

The simplest case occurs when the system is composed of N identical particles enclosed in a volume V. One then calls the triplet

(E, V, N)   (2)

the independent variables of the microcanonical ensemble. These are the control parameters that an experimentalist can vary. The other ensembles that we will consider are characterized by other triplets of independent variables.

The constant C

Its dimension [C] is the inverse of [dΓ] (why?) and hence of [dr_i dp_i]^N. One easily finds, therefore, that [C] = (J s)^{−3N} = [action]^{−3N}. Its definition: C is the number of microstates in the ensemble per unit of volume in phase space. This value might seem without importance, since the constant C drops out of the probability density P(Γ) defined by (24). By convention, and for good reasons, one chooses C such that ρ_m(Γ)dΓ is equal to the number of quantum mechanical microstates that correspond3 to the classical phase space volume dΓ. Hence

∫_{E<H(Γ)<E+δE} dΓ ρ_m(Γ) = {number of microstates of the corresponding quantum system having an energy between E and E + δE}.   (3)

In this classical discussion we therefore fix C by appealing to the underlying quantum mechanics. We state here without proof that this leads to

C = 1/(N! h^{3N}),   (4)

where h is Planck’s constant (remember that [h] = action),

h = 6.63 × 10⁻³⁴ J s.   (5)

3. In a sense that we will not elaborate upon here.


The presence of the factor 1/h^{3N} may be made plausible for the case of free particles (done in class). The appearance of the factor 1/N! in Eq. (4) is

• on the one hand, necessary to guarantee the extensivity of the thermodynamic quantities, as we will soon see;

• on the other hand, a consequence of quantum mechanics, which says that the exchange of two identical particles (in a many-particle system) does not lead to a new microstate, the two particles being indistinguishable (or: indiscernible). However, our classical equations (to be precise, the integral ∫dΓ) count two such configurations as distinct microstates. The factor 1/N! corrects this redundancy.

Remarks.

(1) For a system with s different species of particles with numbers N_1, . . . , N_s the factor 1/N! should obviously be replaced with 1/(N_1! . . . N_s!).

(2) We will encounter physical systems in which it is impossible for two particles to exchange positions. In such cases the factor 1/N! is absent. The best known example is a crystal lattice, where each particle is bound by elastic forces to a well-defined lattice site.

II.1.2. Partition function and entropy

In what follows we denote by an index “m” all quantities defined in the microcanonical ensemble (for certain quantities this index will be suppressed later on). There is, first of all, the microcanonical partition function,

Z_m(E, V, N) = ∫dΓ ρ_m(Γ) = (1/(N! h^{3N})) ∫_{(E,δE)} dΓ,   (6)

where (E, δE) is shorthand notation for the shell E < H(Γ) < E + δE. Let us now introduce the auxiliary quantity

Φ(E) = (1/(N! h^{3N})) ∫_{H(Γ)<E} dΓ   (7)

(where for convenience we have suppressed the variables V and N), which represents the (dimensionless) volume of phase space corresponding to a system energy at most equal to E. Then obviously

Z_m(E, V, N) = δE Φ′(E),   (8)

where Φ′ ≡ (∂Φ/∂E)_{N,V} is the density of states,4 that is, the number of energy levels per unit interval on the energy axis.

With each partition function one associates a thermodynamic potential, called that way because one deduces all other thermodynamic quantities from it by differentiations. The thermodynamic potential associated with the microcanonical partition function is the (microcanonical) entropy defined by

Z_m = e^{k_B^{−1} S_m}    or    S_m(E, V, N) = k_B log Z_m(E, V, N),   (9)

4. Often the symbol ρ(E) is used for this quantity; if so, it should not be confused with the ensembles ρ(Γ). Furthermore, Ω(E, V, N) is an alternative notation for Z_m(E, V, N).


Figure 3: Microcanonical ensemble.

where kB is Boltzmann’s constant,

k_B = 1.38 × 10⁻²³ J K⁻¹.   (10)

This entropy is a function on the three-dimensional space of coordinates (E, V, N). It is not clear at this point – but will soon become so – that this S_m is the same thing as the thermodynamic entropy S.

II.2. Examples: ideal gas and harmonic oscillator

II.2.1. Partition function and entropy of the ideal gas

Is it possible to calculate a microcanonical partition function? Yes, but only for certain model systems.

Let us consider an ideal (= noninteracting, V_int(r) = 0) gas5 of N particles enclosed in a volume V and having a total energy E. The microcanonical partition function of this system may now be calculated as follows. It is convenient to start from Eq. (8) and to begin by calculating Φ(E),

Φ(E) = (1/(N! h^{3N})) ∫_{r_i∈V} dr_1 . . . dr_N ∫_{∑_i p_i² < 2mE} dp_1 . . . dp_N
     = (V^N/(N! h^{3N})) (2mE)^{3N/2} π^{3N/2} / Γ(3N/2 + 1),   (11)

where Γ(x) is the Gamma function and where we have used the formula for the volume of a d-dimensional sphere. We see that

Φ′(E) = (3N/(2E)) Φ(E).   (12)

5. More rarely called a perfect gas.


Upon applying Stirling’s formula,

N! = N^N e^{−N} √(2πN) [1 + O(N^{−1})],    N → ∞,   (13)

to the factorial and the Gamma function in (11) we obtain

Φ(E) = (e^{5N/2}/(π N √6)) (V/N)^N (mE/(3πℏ²N))^{3N/2} [1 + O(N^{−1})].   (14)

When substituting (14) in (8) and (8) in (9) we obtain an explicit expression for the microcanonical entropy of the ideal gas,

S_m(E, V, N) = N k_B [ 5/2 + log( (V/N)(mE/(3πℏ²N))^{3/2} ) ] + O(log N) + O(log(N δE/E)).   (15)

In the thermodynamic limit the O terms may be ignored; Eq. (15) is then called the Sackur-Tetrode formula (1912). Note that the entropy is obviously extensive and that the RHS of Eq. (15) has the structure of Eq. (I.42),

S_m(E, V, N) = N s_m(ε, v),   (16)

where we have set ε = E/N and v = V/N.

Remark. In Eq. (11) the logarithm of neither V^N nor N! is extensive, but log(V^N/N!) is. This shows once more the need for the factor 1/N! when all particles are free to move through the whole volume. Furthermore, the logarithm of neither (2mE)^{3N/2} nor Γ(3N/2 + 1) is extensive, but the logarithm of their ratio is.

II.2.2. Harmonic oscillator. Will be done in the problem session.

II.3. Identification of thermodynamics and statistical mechanics

After these examples we continue to develop the general theory.

Thermodynamics deals with the behavior of macroscopic bodies (gases, liquids, magnets, . . . ) (and of radiation) when subjected to temperature changes and external forces. Thermodynamics is a self-contained theory. It relates directly observable quantities to each other and to new more abstract quantities (such as entropy, internal energy, . . . ). Thermodynamics describes matter without referring to its microscopic structure, and without making any assumptions about it. A certain thermodynamic phenomenon (for example, melting) may result from molecular rearrangements, but it will never be possible to identify such microscopic causes from thermodynamics alone. Thermodynamics was developed principally on the basis of the experimental and theoretical study of gases (motivated by their application to heat engines). It was then gradually realized that thermodynamic theory applies in the same way6 to a whole range of other systems, whence its great value as a general macroscopic theory. In these Notes we will develop a large collection of statistical mechanical expressions. We will then see that thermodynamics results from statistical mechanics if one is willing to identify certain of these expressions with well-known thermodynamic quantities.

6. In many cases by just making the proper substitution of variables.


Recall now the following from your thermodynamics course. Let there be a gas of N particles7 enclosed in a volume V and having a total (“internal”) energy8 E. These are three parameters that we may control in an experiment. Thermodynamics says that the triplet (E, V, N) completely characterizes the state of this system. Well then, how about this system’s temperature, pressure, and so on? Thermodynamics asserts that there is a state function S(E, V, N) called entropy from which all these may be derived by simple differentiations9. Expressed differently, if its variables undergo changes dE, dV, and dN, then the entropy change dS is given by

dS = (1/T) dE + (p/T) dV − (µ/T) dN,   (17)

in which T, p, and µ are the temperature, the pressure, and the chemical potential of the system, respectively. It follows that

1/T = (∂S/∂E)_{V,N},    p/T = (∂S/∂V)_{E,N},    −µ/T = (∂S/∂N)_{E,V}.   (18)

Returning to statistical mechanics we consider the derivatives of S_m(E, V, N) along the three main axes in EVN space. By analogy to Eqs. (18) we now define the (microcanonical) temperature T_m, pressure p_m, and chemical potential µ_m according to the three formulas

1/T_m ≡ ∂S_m(E, V, N)/∂E,   (19)

p_m/T_m ≡ ∂S_m(E, V, N)/∂V,   (20)

−µ_m/T_m ≡ ∂S_m(E, V, N)/∂N.   (21)

If the three independent variables undergo simultaneously infinitesimal changes dN, dV, and dE, then the resulting entropy change dS_m will be given by

dS_m = (1/T_m) dE + (p_m/T_m) dV − (µ_m/T_m) dN,   (22)

which is the exact microcanonical analog of the thermodynamical relation (17).

We may now continue and find the microcanonical equivalents of all other thermodynamic quantities of interest. Let us take the heat capacity as an example.

In thermodynamics one defines the heat capacity10 at constant volume C_V by C_V = (∂E/∂T)_{N,V}. The microcanonical analog is to define

C_{V,m} ≡ ∂E/∂T_m = (∂T_m/∂E)^{−1},   (23)

7. In thermodynamics the quantity of matter may be expressed in terms of its mass rather than in the “number of particles”, if one wishes to avoid all reference to microscopic structure.
8. In thermodynamics a very common symbol for this quantity is U.
9. This is why, on the basis of a superficial analogy to electrostatics, the entropy is called a thermodynamic potential.
10. The specific heat c_V is the intensive counterpart of the heat capacity, c_V = C_V/N.


the derivatives being carried out at constant N and V. Using that T_m = 1/(∂S_m/∂E) we get from (23), after some rewriting, the expression for C_{V,m} entirely in terms of derivatives of the entropy S_m(E, V, N),

C_{V,m} = − (∂S_m/∂E)² / (∂²S_m/∂E²) = − (1/T_m²) (∂²S_m/∂E²)^{−1}.   (24)

We note that as a consequence of Eq. (23) and the fact that C_{V,m} > 0 we also have that ∂T_m/∂E > 0 and that S_m is a concave function of E.
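The rewriting leading to (24) is just the chain rule applied to definition (19); spelled out (this intermediate step is not written in the notes, but it is elementary):

    \frac{\partial T_m}{\partial E}
      = \frac{\partial}{\partial E}\left(\frac{\partial S_m}{\partial E}\right)^{-1}
      = -\left(\frac{\partial S_m}{\partial E}\right)^{-2}\frac{\partial^2 S_m}{\partial E^2},
    \qquad
    C_{V,m} = \left(\frac{\partial T_m}{\partial E}\right)^{-1}
      = -\left(\frac{\partial S_m}{\partial E}\right)^{2}\left(\frac{\partial^2 S_m}{\partial E^2}\right)^{-1}
      = -\frac{1}{T_m^2}\left(\frac{\partial^2 S_m}{\partial E^2}\right)^{-1}.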

II.4. Example: the ideal gas, again

II.4.1. Thermodynamics of the ideal gas

In section II.2 we have calculated the partition function Z_m(E, V, N) of the ideal gas and found its entropy S_m(E, V, N). We will now proceed to the thermodynamics of the ideal gas, based on the relation established in section II.3.

By differentiating Eq. (15) successively with respect to each of its three variables one finds straightforwardly

∂S_m/∂E = 3Nk_B/(2E),   (25)

∂S_m/∂V = Nk_B/V,   (26)

∂S_m/∂N = k_B log[ (V/N)(mE/(3πℏ²N))^{3/2} ].   (27)

By comparing Eqs. (19) and (25) we find an explicit expression for the microcanonical temperature,

E = (3/2) N k_B T_m ;   (28)

and by comparing Eqs. (20) and (26) we obtain the ideal gas law

pmV = NkBTm . (29)

Both relations are familiar from thermodynamics, and this shows the consistency of our identification of k_B log Z_m with the thermodynamic entropy.
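The identification can also be verified numerically: differentiate the Sackur-Tetrode expression (15) by finite differences and compare with (28) and (29). A sketch (the argon numbers are again only illustrative):

    import numpy as np

    # Finite-difference check of Eqs. (19)-(20) and (28)-(29) for the ideal gas.
    kB, hbar = 1.380649e-23, 1.054572e-34
    m = 39.95 * 1.66054e-27          # argon (illustrative)
    N = 1.0e20
    V = 1.0e-6                       # m^3
    E = 1.5 * N * kB * 300.0         # energy corresponding to T = 300 K via (28)

    def S(E, V, N):
        return N * kB * (2.5 + np.log((V / N) * (m * E / (3 * np.pi * hbar**2 * N)) ** 1.5))

    dE, dV = 1e-6 * E, 1e-6 * V
    T_m = 1.0 / ((S(E + dE, V, N) - S(E - dE, V, N)) / (2 * dE))      # Eq. (19)
    p_m = T_m * (S(E, V + dV, N) - S(E, V - dV, N)) / (2 * dV)        # Eq. (20)

    print("T_m =", T_m, "K            (should be ~300 K, Eq. (28))")
    print("p_m V/(N kB T_m) =", p_m * V / (N * kB * T_m), " (should be ~1, Eq. (29))")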

Exercise. Check that Eqs. (25)-(29) all have both members either extensive or intensive.

Exercise. Check that for an ideal gas Eq. (23) leads to C_{V,m} = (3/2) N k_B, another formula well-known from thermodynamics.

Exercise. Show that

s_m(ε, v) = k_B [ 5/2 + log(v/λ³) ] = k_B [ 5/2 − log(nλ³) ],   (30)


where λ ≡ h/(2πmk_BT)^{1/2} is the de Broglie wavelength11 of a thermal particle of mass m, we have as before v = V/N, and in the second expression we have introduced the particle density n = v⁻¹ = N/V. The dimensionless combination nλ³ will appear again in other problems.

II.4.2. Limits on the validity

There are two physical effects that set a limit to the range of validity of Eqs. (15) and (30).

1. Interactions.

In real (= nonideal) gases there are interactions between the atoms or molecules. When these interactions are short-ranged, as we assume throughout, they will play a role only when particles are close to one another. We may therefore rightfully expect that the expressions that we found above for the ideal gas are also valid for sufficiently dilute real gases. When the typical distance between particles becomes of the order of the range of the interaction, corrections to the ideal gas expressions should be expected. There exist several methods of calculating such corrections perturbatively, most of them developed in the first half of the 20th century. We will see an example in Sec. V.5.

Remark. It is clear that for systems with long-range interactions we should reason differently. Examples are (i) plasmas (= fluids of which all or an important fraction of the particles are electrically charged) and (ii) systems with only gravitational interactions. We will leave such systems aside for the moment.

2. Quantum effects.

The real world is quantum mechanical and this is a good reason for considering a quantum mechanical ideal gas. When we do this, in later chapters, it will turn out to be essential to distinguish between bosons and fermions. We restrict ourselves here to the following remarks.

We have argued that in the quantum limit Z_m should reduce to the count of the number of microstates in a narrow energy interval. The result of such a count can only be a positive integer (1, 2, 3, . . .). Therefore we have the inequality S_m = k_B log Z_m ≥ 0. Obviously Eqs. (15) and (30) violate this inequality: when T_m becomes small, or n large, these expressions for the entropy become negative. The reason is that they have been obtained as integrals on a phase space approximated as a continuum, instead of the actual space of discrete microstates. We may expect the classical approximation to be a good one as long as it gives a positive entropy. This leads to

nλ³ ≪ 1   (31)

as the condition of validity for the classical expressions. Again, they appear to certainly hold in the dilute limit. [We may write nλ³ = (h/(∆p ∆x))³, although with an interpretation of ∆p and ∆x slightly different from the one in quantum mechanics. This suggests that quantum mechanics comes in when the uncertainty relations are in danger of being violated. Discussed in class.]

II.5. Microcanonical systems in contact

Study here Appendix B.

11. Recall that λ = 2π/k and that in quantum mechanics p = ℏk, whence λ = h/p.


Exercise. The entropy of two independent (= noninteracting) systems is the sum of their individual entropies. Show this for S_m (easy). Why is it plausible that this additivity remains true for two systems in contact only via short-range interactions?

II.5.1. Thermal contact

Equilibrium condition. We consider an isolated system composed of two subsystems 1 and 2 that are separated by an adiabatic wall. The systems are therefore microcanonical. We will denote their parameters by E_1, V_1, N_1 and E_2, V_2, N_2 and their partition functions by Z_{m,1} and Z_{m,2}, respectively.

Let us now replace the adiabatic wall by a diathermal one. The two systems will start exchanging energy until a new equilibrium is reached. Our purpose here is to study this new equilibrium.

We will assume that the phase spaces of the individual systems have not changed,12 that is, we neglect any interactions between these systems. This will certainly be a good approximation for interaction forces that are short-ranged. We now ask what the properties are of this new equilibrium state.

Let us set E = E_1 + E_2, V = V_1 + V_2, and N = N_1 + N_2. The partition function Z_{m,12}(E, V, N) of the total system involves a sum on all ways of distributing the total energy between its two components and is given by

Z_{m,12}(E, V, N) = δE ∫dE_1 Φ′_1(E_1, V_1, N_1) ∫dE_2 Φ′_2(E_2, V_2, N_2) δ(E − E_1 − E_2)
 = δE ∫dE_1 Φ′_1(E_1, V_1, N_1) Φ′_2(E − E_1, V_2, N_2)
 = ∫ (dE_1/δE) Z_{m,1}(E_1, V_1, N_1) Z_{m,2}(E − E_1, V_2, N_2)
 = ∫ (dE_1/δE) exp[ k_B^{−1} S_{m,1}(E_1, V_1, N_1) + k_B^{−1} S_{m,2}(E − E_1, V_2, N_2) ]
 ≡ ∫ (dE_1/δE) exp[ k_B^{−1} S_{m,12}(E, V, N; E_1) ],   (32)

where S_{m,12}(E, V, N; E_1) is the entropy13 for a given arbitrary partitioning of the energy into portions E_1 and E − E_1 for the subsystems 1 and 2, respectively. The integral on E_1 is strongly dominated by the value of the integrand in its maximum, that is, by values E_1 ≈ E_1^max. It follows – as we will show in detail in section II.5.2 – that

Z_{m,12}(E, V, N) = exp[ k_B^{−1} S_{m,1}(E_1^max, V_1, N_1) + k_B^{−1} S_{m,2}(E − E_1^max, V_2, N_2) + O(log N) ],   (33)

in which the sum of the two entropy terms equals k_B^{−1} S_{m,12}(E, V, N) = k_B^{−1} S_{m,12}(E, V, N; E_1^max), and the maximum value E_1^max is the solution of

∂S_{m,12}(E, V, N; E_1)/∂E_1 = 0,   (34)

whence

∂S_{m,1}/∂E_1 (E_1^max, V_1, N_1) = ∂S_{m,2}/∂E_2 (E − E_1^max, V_2, N_2).   (35)

12. In quantum mechanical language: that the energy levels of the two systems have not changed.
13. C. Texier calls this quantity the entropie réduite. I am not aware of a specific English term for it.


Invoking now definition (19) of the microcanonical temperature T_m we see that (35) is equivalent to

T_{m,1} = T_{m,2} ,   (36)

which is the equilibrium condition that we know from thermodynamics (and daily practice!). This confirms the consistency of interpreting in the microcanonical ensemble the quantity k_B log Z_m as the entropy. It shows, moreover, that the equilibrium state of the total system is determined by the condition that its entropy be maximized as a function of E_1, which after the contact is established becomes a free parameter.

II.5.2. Proof of Eq. (35)

The exact reasoning goes as follows. Write ν_1 = N_1/N and ν_2 = N_2/N. By the “thermodynamic limit” we will mean here the limit N → ∞ with all the following ratios fixed: ν_1, ν_2, v_1 = V_1/N_1, v_2 = V_2/N_2, and ε = E/N. We let ε_1 = E_1/N. Using (32) we may then write

e^{k_B^{−1} S_{m,12}(E, V, N; E_1)} = e^{N g(ε_1)}   (37)

where

g(ε_1) = k_B^{−1} ν_1 s_{m,1}(ε_1/ν_1, v_1) + k_B^{−1} ν_2 s_{m,2}((ε − ε_1)/ν_2, v_2).   (38)

The quantity S_{m,12} defined here is the entropy of the total system constrained to have its energy partitioned into two quantities E_1 and E − E_1; after establishment of the thermal contact E_1 becomes a free parameter. Eq. (32) may then be rewritten as

Z_{m,12}(E, V, N) = (N/δE) ∫dε_1 e^{N g(ε_1)},   (39)

which is an integral exactly of the form discussed in Appendix B. We therefore know without any further calculation that

Z_{m,12}(E, V, N) = e^{N g(ε_1^max) + O(log N)},   (40)

where ε_1^max = E_1^max/N is the value of its argument for which g(ε_1) is maximal. The equation dg/dε_1 (ε_1^max) = 0, when the original variables are restored, is identical to Eq. (35). Furthermore S_{m,12}(E, V, N) = S_{m,12}(E, V, N; E_1^max).

II.5.3. Fluctuations

The quantity S_{m,12}(E, V, N; E_1) is the entropy of the total system subject to the condition that the energies of the two subsystems be E_1 and E − E_1. In order that there be an E_1^max where this entropy is maximal (rather than just stationary) it is needed that

∂²S_{m,12}/∂E_1² |_{E_1 = E_1^max} < 0.   (41)

This condition translates into CV > 0, that is, the heat capacity should be positive.

From the properties of the steepest descent integral it follows that the range of energies around E_1^max that contribute effectively to the result is determined by ⟨(ε_1 − ε_1^max)²⟩^{1/2} = 1/√(N |g″(ε_1^max)|), whence ⟨(E_1 − E_1^max)²⟩^{1/2} = √(N/|g″(ε_1^max)|). Hence this rms deviation is neither extensive nor intensive; it is negligibly small on the scale of the total energy E_1 but very large on the scale of the energy per particle ε_1.

There is something more to be learned. The functions T_{m,1}(E_1) and T_{m,2}(E − E_1) are increasing and decreasing, respectively, with E_1. Since their point of intersection determines the equilibrium of the total system, we see that the system with the initially higher temperature furnishes energy to the one having the initially lower temperature: heat flows from high to low temperature (draw a figure!).

II.5.4. The piston: a sliding wall

Suppose now that the two systems 1 and 2 are separated by a piston: an adiabatic wall that may slide without friction (which is certainly an idealization). This piston is initially blocked and the two microcanonical subsystems are characterized by the triplets (E_1, V_1, N_1) and (E_2, V_2, N_2) as before. At some point we will

– release the blocking;
– make the wall diathermal.14

The piston will then relax to a new equilibrium position: this may be considered as an “exchange of volume” between the two subsystems; and furthermore there will be exchange of energy. Therefore E_1 and V_1 become free parameters. Our purpose is again to study the new equilibrium state. Let

S_{m,12}(E, V, N; E_1, V_1) = S_{m,1}(E_1, V_1, N_1) + S_{m,2}(E − E_1, V − V_1, N − N_1)   (42)

be its total entropy under the constraint of a fixed partitioning of the energy and the volume between the two subsystems determined by the pair (E_1, V_1) of free variables. The partition function Z_{m,12} of the total system may again be expressed in terms of those of its subsystems, now as a convolution with respect to E_1 and to V_1. We will not write these down explicitly. The equilibrium conditions may be derived as for the previous case and we find

∂S_{m,12}(E, V, N; E_1, V_1)/∂E_1 = 0,    ∂S_{m,12}(E, V, N; E_1, V_1)/∂V_1 = 0.   (43)

Its solution is (E_1, V_1) = (E_1^max, V_1^max). The equilibrium condition therefore finally takes the form

T_{m,1} = T_{m,2} ,    p_{m,1} = p_{m,2} .   (44)

Concavity in the variable V_1 at the point (E_1^max, V_1^max) means

∂²S_{m,2}/∂V² < 0   ⟺   ∂p_{m,2}/∂V < 0.   (45)

The second one of these relations says that when the volume increases (at constant energy and particle numbers), the pressure goes down. Again we see that the equilibrium state is the one that maximizes the total entropy.

Exercise. Verify by a method analogous to the one employed above that the system with the initially higher pressure will increase its volume at the cost of the volume of the system with the initially lower pressure.

14. A different problem arises if we keep the wall adiabatic. That problem is more difficult and we leave it aside.


II.5.5. Permeable wall

Consider again the same two subsystems, initially described by the triplets (E_1, V_1, N_1) and (E_2, V_2, N_2). Let us now make the wall (now immobile, again) permeable for particles, which we take to be of the same kind in the two subsystems. Let

S_{m,12}(E, V, N; E_1, N_1) = S_{m,1}(E_1, V_1, N_1) + S_{m,2}(E − E_1, V − V_1, N − N_1)   (46)

be its total entropy under the constraint of a fixed partitioning of the energy and the number of particles between the two subsystems determined by the pair (E_1, N_1) of free variables. The equilibrium conditions may be derived as for the previous case and we find

∂S_{m,12}(E, V, N; E_1, N_1)/∂E_1 = 0,    ∂S_{m,12}(E, V, N; E_1, N_1)/∂N_1 = 0.   (47)

Its solution is (E_1, N_1) = (E_1^max, N_1^max). The equilibrium condition therefore finally takes the form

T_{m,1} = T_{m,2} ,    µ_{m,1} = µ_{m,2} .   (48)

Exercise. Verify by methods analogous to the ones used above that the system with the lower chemical potential gains particles at the cost of the system with the higher one.

Example. All particles initially in one half of a box.

II.5.6. Arbitrary free variable X

We consider the general case of an arbitrary extensive state variable X in a microcanonical system of energy E. Let S_m(E; X_1) be the entropy subject to the condition of X taking the value X_1; that is, S_m(E; X_1) = k_B log Z_m(E; X_1), where Z_m(E; X_1) dX_1 is the volume of phase space (the number of microstates) having X = X_1. If X_1 is unconstrained, the probability distribution of X_1 then is

P(X_1) = Z_m(E; X_1) / Z_m(E).   (49)

Since X is extensive, the probability distribution P(X_1) will be strongly peaked about its maximum, which occurs for the value X_1 = X_1^max that solves

∂S_m(E; X_1)/∂X_1 |_{X_1 = X_1^max} = 0.   (50)

Let ∆X_1 = X_1 − X_1^max. Then the rms deviation of X_1 from its equilibrium value is ⟨∆X_1²⟩^{1/2}, and this quantity is ∝ √N for the same reasons as above. That is, for large N the rms deviation is negligible on the scale of the variable itself; put otherwise, in the thermodynamic limit the random variable X_1 takes the sure (= nonrandom) value X_1^max. This shows how determinism at the macroscopic scale arises out of randomness in the microscopic world.

Finally the entropy of the system is given by S_m(E) = S_m(E; X_1^max) up to a subextensive correction that is O(log N).

Remark. We call X and ∂S/∂X a pair of conjugate variables. The three pairs that we have encountered are E and 1/T_m, V and p_m/T_m, and N and −µ_m/T_m. Loosely speaking one says


that T is conjugate to E, p to V, and µ to N. The first one of each pair may be considered as a “generalized force” capable of changing the second one.

II.5.7. Irreversible increase of entropy

We have found here a general rule: when in an isolated system a constraint X = X_1 is released, then the system will tend to a macrostate where X has the value X = X_1^max that maximizes the constraint-dependent entropy S_m(X). Put more concisely,

When in an isolated system in equilibrium a constraint is released, the system evolves towards a new equilibrium state that maximizes the entropy (subject to the remaining constraints).

The fact that the entropy does not decrease implies that at the macroscopic (thermodynamic) level the laws of nature are not time reversal invariant: they tell us that certain phenomena occur irreversibly. This is in contrast to the microscopic (Newton, Hamilton-Jacobi) equations from which we started. In our argument above we have inferred this irreversibility only by comparing an initial equilibrium state (with a constraint) to a final equilibrium state (obtained when waiting long enough after release of the constraint). We have not derived the equations of motion that would describe how the initial state evolves into the final one. Such equations should be irreversible, and much literature has been devoted to the problem of how it is possible, if at all, to derive irreversible macroscopic equations from reversible microscopic ones: in the process a symmetry gets lost.

The macroscopic irreversibility is sometimes referred to as the arrow of time.

II.5.8. Gibbs’ paradox

Let the same two subsystems as before now contain gases at the same temperature and pressure. What happens if the wall is removed? They mix! Or do they? Two cases must be distinguished:

(a) 1 and 2 contained particles of the same kind;
(b) 1 and 2 contained particles of different kinds.

In case (b) the gases start mixing. This is an irreversible process and is accompanied by an increase ∆S of entropy (the increase is called the “entropy of mixing”). It may be calculated with the aid of the Sackur-Tetrode formula applied to the initial and to the final state (in the latter case, consider the volume as filled by two separate gases and add their entropies).

Gibbs’ paradox arises when we argue that in case (a) the gases also start mixing. False! When we put the wall back in place, then because of the indistinguishability of the particles we will again be in the initial state; and since the entropy can only increase, the state with the wall removed has ∆S = 0 compared to the initial state. In case (a), too, the Sackur-Tetrode formula gives the correct answer for ∆S.

This formula takes the quantum indiscernibility of identical particles correctly into account. The term “paradox” dates from the time (end of the 19th century, before the advent of quantum theory) when in a purely classical context this was a serious problem.
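Carrying out the calculation suggested above for case (b), two different gases at equal temperature and pressure initially occupying volumes V_1 and V_2, one finds ∆S = k_B[N_1 log(V/V_1) + N_2 log(V/V_2)] with V = V_1 + V_2, while the same bookkeeping gives ∆S = 0 in case (a). A small sketch (the 50/50 split is an arbitrary illustrative choice):

    import numpy as np

    # Entropy of mixing from the Sackur-Tetrode formula (cases (a) and (b) above).
    kB = 1.380649e-23
    N1 = N2 = 1.0e22
    V1 = V2 = 1.0e-4          # m^3
    V = V1 + V2

    # Case (b): different kinds of particles -> each gas expands from V_i to V.
    dS_b = kB * (N1 * np.log(V / V1) + N2 * np.log(V / V2))

    # Case (a): identical particles -> the combined gas has the same density as before,
    # and its Sackur-Tetrode entropy equals the sum of the two initial entropies.
    dS_a = 0.0

    print("Delta S, different gases :", dS_b, "J/K   (= (N1+N2) kB ln 2 here)")
    print("Delta S, identical gases :", dS_a)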


II.6. More about entropy

II.6.1. Gibbs entropy

We will reason now on the basis of a discrete state space of microstates i = 1, 2, . . . , M. Suppose a system may be in a microstate i with probability p_i. Around 1875 Gibbs defined the entropy S_G of this system as

S_G = −k_B ∑_i p_i log p_i .   (51)

For p_i = 0 it is understood that p_i log p_i = lim_{ε→0} ε log ε = 0. It follows that S_G ≥ 0.

Relation to the microcanonical entropy. The microcanonical partition function Z_m is a sum on all microstates in a given small energy interval (E, E + δE), all with unit weight. If there are M such microstates i, and we assign equal probability M⁻¹ to each of them, then (51) becomes S_G = −k_B ∑_{i=1}^M M⁻¹ log M⁻¹ = k_B log M. This is exactly equal to what we defined as S_m. We will return to Gibbs’ expression for the entropy when studying the canonical ensemble.

II.6.2. Shannon entropy

If one sets k_B = 1, the entropy of a bit (= two-state variable) as used in computers is −log(1/2) = log 2. Let p_i be the probability that a sequence of M bits be equal to the specific sequence i. Introducing a prefactor 1/log 2 we then define the entropy of this random sequence as

S_S = −(1/log 2) ∑_i p_i log p_i = −∑_i p_i log₂ p_i ,   (52)

which is called the Shannon entropy. So normalized, the entropy of a single bit that is in either of its two states with probability one half is equal to unity. Shannon (1948) used expression (52) to develop a theory of the information content of a stream of symbols. Since that time the Shannon entropy has played a role in many interdisciplinary applications.

In information theory this entropy is interpreted as a quantification of the “lack of information.” To see this, consider a variable that may be in any one of a set of states i. If we know its exact state, say i = i_0, then p_i = δ_{ii_0} and (52) gives S_S = 0: no information is lacking. In the opposite case, the best we can do is to consider all states as having the same probability p_i = 1/M, which gives S_S = (log M)/(log 2) = log₂ M. It may be shown that this is the maximum value that S_S can take: all information is lacking.
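A direct transcription of (52), useful for small experiments with probability distributions (a minimal sketch; the example distributions are arbitrary):

    import numpy as np

    def shannon_entropy(p):
        """Shannon entropy, Eq. (52): -sum_i p_i log2 p_i, with zero terms for p_i = 0."""
        p = np.asarray(p, dtype=float)
        nz = p[p > 0]
        return -np.sum(nz * np.log2(nz))

    print(shannon_entropy([0.5, 0.5]))            # a single fair bit: 1.0
    print(shannon_entropy([1.0, 0.0]))            # state known exactly: 0.0
    print(shannon_entropy(np.full(8, 1 / 8)))     # 8 equally likely states: log2(8) = 3.0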

Exercise. Show that the Gibbs or Shannon entropy is nonnegative; and that for two independent systems it is additive.

II.6.3. Origin of SG

Let ν = 1, 2, . . . , n be single-particle states (as opposed to microstates) for structurelessparticles. Hence ν is a point (r,v) in the six-dimensional single-particle phase space. When Nindependent particles are distributed over these states with the constraints that there be exactlyNν in state ν, then, in Boltzmann’s notation, the number of ways W to realize this distributionis

    W = \frac{N!}{\prod_ν N_ν!} .    (53)


The Boltzmann entropy of this distribution is S_B(N_1, N_2, . . .) = k_B \log W. Upon using Stirling’s formula for the factorials and setting p_ν = N_ν/N we get

    S_B = -N k_B \sum_ν p_ν \log p_ν ,    (54)

where subextensive terms have been neglected. Eq. (51) may be seen as the generalization of (54) to microstates i and a probability distribution p_i in an N-particle phase space. Since each of the W realizations is a microstate, we may also consider this S_B as an entropy of the same type as S_m (in the sense that it is based on counting microstates), but with the constraints N_ν (ν = 1, 2, . . . , n) instead of (E, V, N).

Note. Eq. (51) may be used also when there are correlations between the particles due to interaction terms in their Hamiltonian.

II.6.4. The Maxwell-Boltzmann distribution

Suppose we consider N independent particles in an external potential V_ext(r). Let the system be isolated, hence microcanonical, with a total energy E = Nε.

We imagine the six-dimensional single-particle phase space (r, v) partitioned into volume elements labeled by an index ν, every element large enough to contain many particles but small enough so that the energy of a particle in that volume element is practically constant and given by ε_ν = \frac{1}{2}mv_ν^2 + V_ext(r_ν).

In Eq. (54) the quantity p_ν is the fraction of particles, out of a set of N independent particles, that are in the single-particle state ν. Boltzmann maximized the entropy (54) by allowing the p_ν to vary subject only to the conditions

    \sum_ν p_ν = 1,    \sum_ν p_ν ε_ν = ε.    (55)

Introducing Lagrange multipliers a and b (see Appendix C) we are led to maximize

    F(\{p_ν\}) = -\sum_ν p_ν \log p_ν - a\Big(\sum_ν p_ν - 1\Big) - b\Big(\sum_ν ε_ν p_ν - ε\Big).    (56)

From ∂F/∂p_ν = 0 we get \log p_ν + 1 + a + bε_ν = 0, whence

    p_ν = e^{-1-a-bε_ν} .    (57)

The constants a and b follow from the two conditions (55). Returning to the original coordinates and setting C = e^{-1-a} we get the Maxwell-Boltzmann distribution

    p(r, v) = C e^{-\frac{1}{2}bmv^2 - bV_ext(r)},    (58)

which is the joint distribution of the velocity and the position coordinates of a single particle in a gas of N noninteracting ones. For V_ext = 0 it is the well-known Maxwell distribution of the particle velocities. It appears that b has the interpretation of 1/k_BT (see next chapter).

II.6.5. Continuum limit

Let p(x) be a probability distribution of a continuous variable x, taken here one-dimensional. Can we associate an entropy with this distribution just as we did with the distribution p_i?


Suppose we partition the x axis into intervals ∆x that we label by an index i. Then p_i = p(x_i)∆x is the probability content of the ith interval, supposed to be located around x_i. This leads to an entropy

    S = -k_B \sum_i [p(x_i)∆x] \log[p(x_i)∆x]
      = -k_B \int dx\, p(x) \log p(x) - k_B \log ∆x,    (59)

where in the second line we have taken the limit ∆x → 0 and now see that this leads to a diverging additive constant -k_B \log ∆x. The first term in (59) is often retained as the definition of the entropy associated with the distribution of a continuous variable. As long as we are interested in differences of such entropies, the infinite constant cancels out and plays no role. Of course, if x is a physical variable and we know how to treat it quantum mechanically, this will provide a physical value for ∆x.
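A numerical illustration of Eq. (59) (a sketch; the standard Gaussian and the bin widths are arbitrary choices, and k_B = 1): the discrete entropy of a binned continuous distribution equals the integral term minus log ∆x, up to discretization corrections.

```python
import numpy as np

def binned_entropy(dx, L=20.0):
    """-sum_i p_i log p_i for a standard Gaussian discretized in bins of width dx."""
    x = np.arange(-L, L, dx)
    p_i = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi) * dx   # probability content of bin i
    p_i = p_i[p_i > 0]
    return -np.sum(p_i * np.log(p_i))

h = 0.5 * np.log(2 * np.pi * np.e)    # the integral term of (59) for a standard Gaussian
for dx in (0.5, 0.1, 0.02):
    print(dx, binned_entropy(dx), h - np.log(dx))   # the two values agree better and better
```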

II.6.6. Time dependent phenomena and H theorem

It is not evident, whether in thermodynamics or in statistical mechanics, how the entropy should be defined for a time-dependent system. However, if one accepts the “\sum p \log p” formula, then time-dependent probabilities may be substituted:

    S(t) = -k_B \sum_i p_i(t) \log p_i(t).    (60)

The time evolution of S(t) depends on the time evolution equations that one has derived or postulated for the p_i(t). For several types of equations^{15} one may show that in isolated systems

    \frac{dS(t)}{dt} ≥ 0,    (61)

with the equality sign holding only in equilibrium. Such a relation is called a Boltzmann H theorem. However, time-dependent phenomena are outside the scope of this course.

15Among which first of all the Boltzmann equation (see elsewhere).


III. CANONICAL ENSEMBLE

III.1. The ensemble

The microcanonical ensemble applies to systems whose energy E is strictly constant, that is, to systems that are perfectly isolated from their environment.

The canonical ensemble applies to systems that are in thermal contact with their environment. That is to say, they may release or absorb energy (heat). We will consider the idealized case of an environment having an infinite heat capacity, so that its temperature T cannot change. Such an idealized system is called a thermostat.

The energy E of a system in contact with a thermostat will fluctuate, even at equilibrium, around some average value ⟨E⟩ determined by the properties of the system itself and the temperature T of the thermostat; the latter will also be the temperature of the system. This is why the canonical ensemble is governed by a new independent parameter, the temperature T, which replaces the energy E of the microcanonical ensemble.

The canonical ensemble is defined by the following choice of the density ρ(Γ),

    ρ_c(Γ) = \frac{1}{N! h^{3N}} e^{-βH},    (1)

in which the exponential is called the Boltzmann factor. Here β ≡ 1/k_BT is a parameter about which we make no a priori assumptions. The canonical distribution may be viewed as a sum of juxtaposed microcanonical distributions, the one on (E, E + δE) being weighted by the factor e^{-βE}. The canonical partition function Z_c (for a classical system) is therefore

    Z_c(T, V, N) = \frac{1}{N! h^{3N}} \int dΓ\, e^{-βH},    (2)

in which the integration is over all of phase space. It depends on the triplet of independent variables

    (T, V, N).    (3)

We may relate Z_c to Z_m by rewriting (2) as

    Z_c(T, V, N) = \int \frac{dE}{δE}\, e^{-βE} Z_m(E, V, N) = \int dE\, e^{-βE} Φ'(E),    (4)

which is the Laplace transform (see Appendix D) of the density of states Φ'(E). For a quantum mechanical system the phase space integral in (2) is of course replaced with a sum on all eigenstates of the system. If ℓ is such a state and if we denote its energy by E_ℓ, then

    Z_c = \sum_ℓ e^{-βE_ℓ} = \mathrm{tr}\, e^{-βH}.    (5)

Calculating this sum requires knowing the energy levels, but does not require knowing the corresponding quantum mechanical states. Note that in the sum (5) the same energy level will occur as many times as it is degenerate. It follows that the probability for the system to be in eigenstate ℓ is

    P_c(ℓ) = \frac{1}{Z_c} e^{-βE_ℓ} .    (6)


Figure 4: Canonical ensemble.

III.2. Helmholtz free energy

III.2.1. Legendre transformation between Sm and Fc

The thermodynamic potential associated with Z_c is the (canonical) free energy^{16} F_c defined by

    Z_c = e^{-βF_c}  ⟺  F_c(T, V, N) = -k_BT \log Z_c(T, V, N).    (7)

If F_c is extensive as we expect it to be, then we must have

    F_c(T, V, N) = N f(T, v),    N → ∞,    (8)

where as before v = V/N.

Can we establish a relation between this canonical thermodynamic potential and the microcanonical one? We will do this now. For similar reasons as before, the integrand of the E-integral in (4) is again strongly peaked when N is large. Let the maximum occur at E = E_c; this will then also be the average value of E in the canonical ensemble. Using that Z_m = \exp(k_B^{-1}S_m) we then have from (4) (for simplicity not indicating the dependence on V and N)

    Z_c(T) = \int \frac{dE}{δE}\, e^{-βE} Z_m(E) = \max_E e^{-βE + k_B^{-1}S_m(E)} = e^{-βE_c + k_B^{-1}S_m(E_c)},    (9)

16It is also called the Helmholtz free energy. Another symbol in use is A instead of F .


where in the second step we have neglected all subextensive terms in the exponential. It follows that

    F_c = E_c - T S_m .    (10)

This relation between F_c and S_m is called a Legendre transformation. It is a kind of transformation that makes sense especially for concave (or convex) functions, and we know that S_m(E) is concave. Indeed, the relation F = E - TS is well-known in thermodynamics.^{17}

The value E_c(T, V, N) is the solution of

    \frac{∂}{∂E}\Big[-βE + k_B^{-1}S_m(E)\Big]\Big|_{E=E_c} = 0,    (11)

from which we see directly that the parameter β ≡ 1/k_BT of the canonical ensemble is given by

    T = T_m .    (12)

When its natural variables T, V, N undergo variations dT, dV, and dN, the resulting infinitesimal change in F_c is

    dF_c = \underbrace{\Big(\frac{∂F_c}{∂T}\Big)_{V,N}}_{-S_c} dT + \underbrace{\Big(\frac{∂F_c}{∂V}\Big)_{T,N}}_{-p_c} dV + \underbrace{\Big(\frac{∂F_c}{∂N}\Big)_{T,V}}_{µ_c} dN.    (13)

In this equation we have given the three partial derivatives that appear the names -S_c, -p_c, and µ_c. To show that they actually do correspond to the thermodynamic entropy, pressure, and chemical potential we do not, this time, have to appeal to thermodynamics. In fact, the three derivatives may be calculated directly in terms of the microcanonical quantities that we know from chapter II, and we find that when (12) is satisfied

    S_c = S_m ,    p_c = p_m ,    µ_c = µ_m .    (14)

This shows that the canonical and microcanonical ensembles lead to the same thermodynamics.

Proof. (Done in class.)

III.2.2. Other thermodynamic quantities

It is easy to show (done in class) that the average energy of a system described by the canonical ensemble may be obtained as

    ⟨E⟩_c = -\frac{∂ \log Z_c}{∂β}.    (15)

Because of the peaked nature of the energy distribution we must have ⟨E⟩_c = E_c. One shows furthermore that, with ∆E ≡ E - E_c,

    ⟨∆E^2⟩_c ≡ ⟨E^2⟩_c - ⟨E⟩_c^2 = \frac{∂^2 \log Z_c}{∂β^2}.    (16)

17Another frequently used notation is F = U − TS.


Since \log Z_c is extensive, this implies that the typical deviation ∆E from the average ⟨E⟩_c is proportional to N^{1/2}. Hence the relative fluctuations ∆E/⟨E⟩_c of the total energy are vanishingly small in the thermodynamic limit.

Remark. Relation (16) is not surprising from the following perspective. If in the last line of Eq. (4) we consider Φ'(E) as an (unnormalized) probability distribution on the E axis, then Z_c(T, V, N) is its generating function; here -β plays the role of ik in (A.26). It follows that \log Z_c(T, V, N) is the cumulant generating function of the energy variable, whence (16).

But we also have

    C_{V,c} = \frac{∂⟨E⟩_c}{∂T} = -k_Bβ^2 \frac{∂⟨E⟩_c}{∂β} = k_Bβ^2 \frac{∂^2 \log Z_c}{∂β^2}.    (17)

Comparison of (17) and (16) establishes a proportionality between the heat capacity C_{V,c} and the fluctuations of the energy,

    C_{V,c} = k_Bβ^2 ⟨∆E^2⟩_c .    (18)

By the way, notice the conversion between T and β derivatives,

    \frac{∂}{∂T} = -k_Bβ^2 \frac{∂}{∂β},    \frac{∂}{∂β} = -k_BT^2 \frac{∂}{∂T}.    (19)

III.3. Derivation of the canonical ensemble

We consider a microcanonical system T of N particles and volume V, and a subsystem S of N_1 particles and volume V_1 that may exchange heat with T. We will now do something different from what we did in section II.5.1, where we let N and N_1 tend to infinity in a fixed ratio. Here we let N → ∞ while keeping N_1 fixed (which does not exclude that at a later stage we may let N_1 become very large). We will show that T then acts as a thermostat for S and that S is described by the canonical ensemble. To do so we write the microcanonical partition function of the total system as

    Z_m(E, V, N) = \int \frac{dE_1}{δE}\, \overbrace{Z_m(E_1, V_1, N_1)}^{e^{O(N_1)}}\, \overbrace{Z_m(E - E_1, V - V_1, N - N_1)}^{e^{O(N)}}

                 = e^{k_B^{-1} \overbrace{S_m(E, V - V_1, N - N_1)}^{O(N)}} \int \frac{dE_1}{δE}\, Z_m(E_1, V_1, N_1)\, e^{\overbrace{-k_B^{-1}E_1 ∂S_m/∂E}^{O(N_1)} + O(N_1^2/N)}

                 = e^{k_B^{-1} S_m(E, V - V_1, N - N_1)} \underbrace{\int \frac{dE_1}{δE}\, Z_m(E_1, V_1, N_1)\, e^{-β_m E_1}}_{Z_c(T_m, V_1, N_1)}    (20)

where in the last line contributions of O(N_1^2/N) in the exponent have been neglected; and where we see that the parameter β = 1/k_BT of the canonical ensemble appears to have the value β_m = 1/k_BT_m = k_B^{-1} ∂S_m/∂E of the thermostat.

Remarks. The interaction Hamiltonian between the system S and the thermostat T has been neglected. We may now let N_1 → ∞ if we wish (but need not).


III.4. Example: the ideal gas

We consider N noninteracting particles of mass m in a volume V. The Hamiltonian is

    H = \sum_{i=1}^{N} \frac{p_i^2}{2m}.    (21)

The canonical partition function becomes

    Z_c(T, V, N) = \frac{1}{N! h^{3N}} \int dΓ\, e^{-βH(Γ)}
                 = \frac{V^N}{N! h^{3N}} \int dp_1 \ldots \int dp_N\, e^{-β(p_1^2 + \ldots + p_N^2)/2m}
                 = \frac{V^N}{N!} \Big(\frac{2πm}{βh^2}\Big)^{3N/2}
                 = \frac{1}{N!} \Big(\frac{V}{λ^3}\Big)^N    (22)

with as before λ = h/\sqrt{2πmk_BT}. Note that V/λ^3 is just the single-particle canonical partition function. Use Stirling's formula and neglect nonextensive terms. Then (22) leads to

    F_c(T, V, N) = Nk_BT\big[\log(nλ^3) - 1\big].    (23)

Of course the same reservations concerning its validity hold as in the microcanonical case (section II.4.2). We derive from (23) that for the ideal gas

    E_c = -\frac{∂ \log Z_c(T, V, N)}{∂β} = \frac{3}{2} Nk_BT ,

    S_c = -\Big(\frac{∂F_c(T, V, N)}{∂T}\Big)_{V,N} = -Nk_B\Big[\log(nλ^3) - \frac{5}{2}\Big],

    p_c = -\Big(\frac{∂F_c(T, V, N)}{∂V}\Big)_{T,N} = \frac{Nk_BT}{V} = nk_BT .    (24)

All these relations agree with the microcanonical ones and with thermodynamics. All further thermodynamic quantities may be obtained. We present again the example of the heat capacity C_V = ∂E_c/∂T, which is related to the thermodynamic potential F_c(T, V, N) by

    C_V = -k_Bβ^2 \frac{∂^2}{∂β^2}(βF_c) ,    (25)

and which you may compare to the microcanonical expression (II.24). The indices “m” and “c” indicate only how the expression for a quantity has been derived, but do not alter its value; to keep things simple we have dropped the index “c” on C_V.
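To put numbers on these formulas, the sketch below evaluates λ, the entropy per particle, and the pressure for an argon-like gas at room conditions (the numerical values are illustrative assumptions, not data from the notes). The smallness of nλ^3 confirms that the classical treatment is justified.

```python
import numpy as np

kB = 1.380649e-23      # J/K
h  = 6.62607015e-34    # J s
m  = 6.63e-26          # kg, roughly one argon atom (assumed)
T  = 300.0             # K
n  = 2.45e25           # m^-3, roughly 1 atm / (kB T) (assumed)

lam = h / np.sqrt(2 * np.pi * m * kB * T)          # thermal wavelength
S_per_particle = -kB * (np.log(n * lam**3) - 2.5)  # entropy per particle, Eq. (24)
p = n * kB * T                                     # pressure, Eq. (24)
print(n * lam**3, S_per_particle / kB, p)          # n lam^3 << 1: classical regime
```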


III.5. Independent particles

Suppose we have a system of N classical identical noninteracting particles. Let their Hamiltonian be

    H = \sum_{j=1}^{N} H_j ,    H_j = h(r_j, p_j, . . .),    (26)

where the dots in h stand for any further internal degrees of freedom that particle j may have. We may distinguish two cases.

Case (a): each particle is confined to the neighborhood of a specific position in space (a “site”). The particles are distinguishable. This happens, for example, for an atom at a fixed position in a crystal. The partition function should then be defined without the factor 1/N! and we get,

    Z_c(T, V, N) = \frac{1}{h^{3N}} \int dr_1 \ldots dr_N\, dp_1 \ldots dp_N\, e^{-βH}
                 = \Big[\frac{1}{h^3} \int dr\, dp \ldots e^{-βh(r,p,\ldots)}\Big]^N ≡ z^N    (27)

in which z is called the single-particle (canonical) partition function. The free energy becomes

    F_c(T, V, N) = -Nk_BT \log z.    (28)

Z_c is thus the product of N single-particle partition functions and F_c is the sum of N single-particle free energies; clearly, F_c is extensive if z is intensive.

Case (b): each particle may move through the whole volume. The particles are indistinguishable. The canonical partition function then is, with the same expression for z as before,

    Z_c(T, V, N) = \frac{z^N}{N!} .    (29)

The free energy becomes

    F_c(T, V, N) = -Nk_BT \log\frac{z}{N} - Nk_BT    (30)

and in order that it be properly extensive we must have that z ∼ N in the thermodynamic limit; this will be true exactly because the integration on the volume will cause a factor V to appear in z. The ideal gas of the previous section is the simplest example of this case; we will see another one in the next section.

Quantum mechanics.

Case (a) follows straightforwardly from the underlying quantum mechanics. If λ labels the eigenstates of the single-particle Hamiltonian h in (26) and ε_λ stands for their energies, then z = \sum_λ \exp(-βε_λ) and Z_c is the product of N factors z.

Case (b) arises from the underlying quantum mechanics in a more complicated way. The fundamental reason is that identical particles that occupy the same volume are indistinguishable and have to be treated according to quantum statistics. The quantum case will be dealt with in detail in chapters VII and VIII.


III.6. Example: independent particles in a gravitational field

This is an example of case (b) of the previous section. Let N noninteracting particles of coordinates r_j = (x_j, y_j, z_j), momenta p_j, and mass m be placed in a gravitational field of acceleration g, in equilibrium at a temperature T. This is an approximate model for the earth's atmosphere.

We study a system in which we confine the particles to a semi-infinite vertical column of cross section A = L^2. To be specific, we let 0 < x_j, y_j < L and z_j > 0. The system Hamiltonian then is

    H = \sum_{j=1}^{N} \frac{p_j^2}{2m} + \sum_{j=1}^{N} mgz_j ,    (31)

which is of the form (26). We easily calculate

    z = \frac{1}{h^3} \int dp_1\, e^{-βp_1^2/2m} \int_0^L dx_1 \int_0^L dy_1 \int_0^∞ dz_1\, e^{-βmgz_1}
      = A \times \Big(\frac{\sqrt{2πmk_BT}}{h}\Big)^3 \times \frac{1}{βmg} ,    (32)

whence, see (29),

    Z_c(T, A, N) = \frac{1}{N!} \Big(\frac{\sqrt{2πmk_BT}}{h}\Big)^{3N} \Big(\frac{A}{βmg}\Big)^N .    (33)

In the thermodynamic limit N → ∞ the free energy F_c = -k_BT \log Z_c becomes

    F_c(T, A, N) = Nk_BT\big[\log(n_2λ^2 · λβmg) - 1\big],    (34)

where we have used Stirling's formula and taken the limit N → ∞ at fixed n_2 ≡ N/L^2 = N/A. Interpretation: n_2 is the average number of particles contained in a column of unit cross section above the earth's surface.

Exercise.
a. Derive the probability distribution p(z_1) of the height z_1 of particle 1 as a marginal distribution of the single-particle Boltzmann distribution. Calculate the average height ⟨z_1⟩.
b. Interpret the factors n_2λ^2 and λβmg that occur inside the logarithm in (34).

Exercise.
a. Calculate the average particle density ⟨n(z)⟩ at height z above the earth's surface.
b. Interpret the quantity -(∂F_c(T, A, N)/∂A)_{T,N}. Show that it may be cast in the form

    -\Big(\frac{∂F_c}{∂A}\Big)_{T,N} = \int_0^∞ dz\, n(z) k_BT .    (35)

Let p(z) stand for the atmospheric pressure at altitude z. Argue that (if the hypotheses of these exercises are right) the ideal gas law holds locally in the atmosphere. Find an explicit expression for p(z) (called the barometric height formula).


Exercise. Estimate the thickness of the earth's atmosphere. Some data: m_{N_2} = 4.65 × 10^{-26} kg, g = 9.8 m s^{-2}, T = -50 °C, k_BT = 1 eV ⇔ T = 11600 K.
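The order of magnitude asked for follows from the average height ⟨z_1⟩ = k_BT/(mg) of the barometric distribution; a minimal sketch with the data quoted in the exercise (the result, a few kilometres, is only indicative):

```python
kB = 1.380649e-23     # J/K
m_N2 = 4.65e-26       # kg, from the exercise
g = 9.8               # m/s^2
T = 273.15 - 50.0     # K  (T = -50 C)

scale_height = kB * T / (m_N2 * g)    # average height <z> of the barometric distribution
print(scale_height / 1e3, "km")       # of the order of a few kilometres
```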

III.7. Example: one-dimensional gas of hard rods

(Done in class)

III.8. Magnetism

Introduction. Analogy of (p, V ) to (M,B). Elaborate.

III.8.1. General

Let \vec S_1, \vec S_2, . . . , \vec S_N be a collection of N spins, such that \vec S_j ≡ (S_j^x, S_j^y, S_j^z) is attached to site j of a crystal lattice. Let H_0(\vec S_1, \vec S_2, . . . , \vec S_N) be the system Hamiltonian; it describes the interaction between the spins. Typically, there will be an interaction between spin pairs on nearest-neighbor sites and weaker interactions between spins further apart. When the crystal is placed in an external magnetic field \vec B, the total Hamiltonian becomes

    H = H_0 - \underbrace{γµ_B \sum_{j=1}^{N} \vec B · \vec S_j}_{\vec B · \vec M}    (36)

in which the gyromagnetic factor γ and the Bohr magneton µ_B are the quantities you know from quantum mechanics, and \vec M is the total magnetization. It is always possible to rotate the coordinate system such that we get \vec B = (0, 0, B) and the magnetic part of the Hamiltonian becomes -γµ_B B \sum_j S_j^z.

Since spin is a quantum phenomenon, the partition function associated with H involves a sum on its discrete eigenstates, which in the general case are impossible to find (this would require diagonalizing H, i.e., solving the Schrödinger equation H|ψ⟩ = ε|ψ⟩). The crystal has, nevertheless, a canonical partition function^{18} Z_c(T, B, N) and a free energy F_c(T, B, N) = -k_BT \log Z_c(T, B, N) that derives from it.

Replacement of (V, -p) by (B, M). We interrupt here the general development in order to first study a simple special case.

III.8.2. Example 1: Paramagnetic crystal

Let us take for the \vec S_j spins 1/2 that are independent, i.e., H_0 = 0. The z component S_i^z of each spin may take the values ±\frac{1}{2}ħ. We will choose new dimensionless units such that these z components are represented by the variable s_i = ±1, and we will denote the associated magnetic moment by µs_i (we have µ = \frac{1}{2}γµ_Bħ).^{19} When placed in a magnetic field B the energy of spin

18We are discussing here the magnetic partition function, tacitly supposing that the spins are not coupled to the other degrees of freedom of the crystal.

19Do not confuse this µ with the chemical potential, denoted elsewhere by the same symbol.


i will be -µBs_i. The Hamiltonian of this system of independent spins therefore is

    H = -µB \sum_{i=1}^{N} s_i .    (37)

A microstate of the system may be specified by s ≡ (s_1, s_2, . . . , s_N) (how many microstates are there?). It is easy to show that each microstate is an eigenstate of H, so the diagonalization problem that renders the general case difficult has disappeared. Hence the Hamiltonian (37) is to magnetism what the ideal gas Hamiltonian is to fluids.

A. Canonical calculation.

The canonical partition function at inverse temperature β is

    Z_c(T, B, N) = \sum_s e^{-βH}.    (38)

Its calculation is very easy and runs as follows.

    Z_c(T, B, N) = \sum_{s_1=±1} \sum_{s_2=±1} \ldots \sum_{s_N=±1} e^{βµB\sum_{i=1}^{N} s_i}
                 = \Big[\sum_{s_1=±1} e^{βµBs_1}\Big]^N
                 = \big(\underbrace{2\cosh(βµB)}_{z}\big)^N ,    (39)

in which now z = 2\cosh(βµB) is the single-particle partition function. The free energy is

    F_c(T, B, N) = -Nk_BT \log z = -Nk_BT \log\big(2\cosh βµB\big).    (40)

The average energy in the canonical ensemble is now obtained as

    ⟨E⟩_c = -\frac{∂ \log Z_c}{∂β} = \ldots = -NµB \tanh βµB.    (41)

Another quantity of great interest is the total magnetization M = µ\sum_{i=1}^{N} s_i, or equivalently the magnetization per spin m ≡ M/N. It so happens that in this system the total energy and the magnetization are the same variable, up to the constant factor -B; this is a peculiarity of this oversimplified model and not the general situation. Using the result (41) we therefore get

    ⟨m⟩_c = µ \tanh βµB    (42)

whence the magnetic susceptibility

    χ(B) ≡ \frac{∂⟨m⟩_c}{∂B} = βµ^2\big(1 - \tanh^2 βµB\big).    (43)

The zero-field susceptibility is given by χ(0) = βµ^2 and diverges for T → 0.


Exercise. Take the limit T → 0 at fixed B ≠ 0 and explain your finding intuitively.

B. Microcanonical calculation.

It is instructive to see now how we would deal with this system in the microcanonical ensemble. The microcanonical partition function is Z_m(E, B, N), in which the energy has a prescribed value, say E. The microcanonical temperature T_m is then calculated according to (see the problem session)

    \frac{1}{T_m} = \frac{∂S_m}{∂E} = \ldots = -\frac{k_B}{2µB} \log\frac{1 + ε/µB}{1 - ε/µB} = -\frac{k_B}{µB}\, \mathrm{artanh}\frac{ε}{µB} ,    (44)

where ε ≡ E/N is the energy per spin.

You now see that if we identify E = 〈E〉c and Tm = T , equations (41) and (44) are identical.

Exercise. Calculate the entropies S_c and S_m of this system and show that they are identical.

III.8.3. General (continued)

Magnetization and magnetic susceptibility

Having obtained the canonical partition function (or equivalently, the free energy) we easily find the average value of the total magnetization M = µ\sum_{i=1}^{N} s_i by differentiating with respect to B,

    ⟨M⟩ = k_BT \frac{∂ \log Z_c}{∂B} = -\frac{∂F_c}{∂B} .    (45)

Differentiating \log Z_c once more in the same way gives

    (k_BT)^2 \frac{∂^2 \log Z_c}{∂B^2} = \ldots = ⟨M^2⟩ - ⟨M⟩^2 ≡ ⟨∆M^2⟩,    (46)

a relation to which we will return in a moment.

The magnetic susceptibility

    χ ≡ \frac{1}{N} \frac{∂⟨M⟩}{∂B}    (47)

measures how easily the system responds to a change in the magnetic field. Note that χ has been defined such as to be intensive. From the two preceding equations we have

    χ = \frac{k_BT}{N} \frac{∂^2 \log Z_c}{∂B^2} .    (48)

Upon comparing relations (46) and (48) we find that

    ⟨∆M^2⟩ = Nk_BTχ.    (49)


It is worth reflecting a moment on the interpretation of this equation. It establishes a relation between the fluctuations of a physical quantity (in casu the magnetization) in a state of equilibrium and the way this equilibrium is modified when one of the system parameters (in casu B) is changed. Note that we have encountered a similar relation earlier, viz. Eq. (18).

Remark. Nowadays some people refer to equations of the type (18) and (49) as fluctuation-dissipation relations. This name, however, is traditionally reserved for the time-dependent extensions of these relations, not discussed in this course, and whose proofs are highly nontrivial.

III.8.4. Example 2: Ising model

Let us consider a system of electrons j, each one carrying a spin \vec S_j = (S_j^x, S_j^y, S_j^z) and attached to an atom occupying a site of a crystal. Heisenberg (1928) supposed that in a real crystal the interaction between the electron spins is in good approximation limited to nearest-neighbor electrons. A general expression for the energy of such a spin system then is

    H_{Heis} = -\sum_{⟨i,j⟩} \sum_α J_α S_i^α S_j^α ,    (50)

known today as the (anisotropic) Heisenberg Hamiltonian, and in which ⟨i, j⟩ denotes a pair of nearest-neighbor sites. Here the J_α are so-called exchange energies that depend on the specifics of the crystal; they must be calculated quantum mechanically.^{20} In statistical physics these energies are usually supposed to be given constants.

In case all J_α are equal, we speak of the isotropic Heisenberg interaction; in case J_z = 0, we speak of an “XY” interaction; and in case J_x = J_y = 0 we speak of an Ising interaction.

Consider now a crystal of N spins with an Ising interaction J_z between nearest neighbors. We introduce the spin variables s_i = ±1 with i = 1, 2, . . . , N as in section III.8.2 and set J = (ħ/2)^2 J_z. Then the Hamiltonian of the electron spin system may be written as

    H_{Ising} = -J \sum_{⟨i,j⟩} s_i s_j .    (51)

The Ising Hamiltonian has the special property that all “operators” commute and its spins may therefore be treated as classical (although discrete) variables.

The Ising model has been studied for almost a century now, which is why it is mentioned here. It will not be discussed any further in this course, however.

III.9. Energy distribution in the canonical ensemble

What is the probability density p(E) that a system described by the canonical ensemble at a temperature T has an energy E? We have seen above [Eqs. (15) and (16)] that the mean ⟨E⟩_c and variance ⟨E^2⟩_c - ⟨E⟩_c^2 of E follow from the partition function Z_c. If we knew that p(E) was a Gaussian, then it would be fixed completely by its mean and variance. We will now show that indeed p(E) is a Gaussian.

20This spin-spin interaction originates from the electrostatic Coulomb interaction between the electron charges, combined with Pauli's exclusion principle. The magnetic interaction between two electron magnetic moments γµ_B\vec S_i and γµ_B\vec S_j is very much weaker.


We start from Eq. (4) and refine the analysis that led us to Eq. (9). Upon expanding the microcanonical entropy S_m(E) about E = E_c in powers of E - E_c we get

    Z_c(T) = \underbrace{e^{-βE_c + k_B^{-1}S_m(E_c)}}_{\mathrm{independent\ of\ }E} \int \frac{dE}{δE} \exp\Big(\frac{1}{2}k_B^{-1}(E - E_c)^2 \frac{∂^2S_m(E)}{∂E^2}\Big|_{E_c} + \frac{1}{6}k_B^{-1}(E - E_c)^3 \frac{∂^3S_m(E)}{∂E^3}\Big|_{E_c} + \ldots\Big)    (52)

We will now do this properly. We are interested in p(E) in the limit N → ∞ and will therefore make all N dependence under the integral sign in (52) explicit. The kth E-derivative of k_B^{-1}S_m is proportional to N^{1-k} (why?) and we will write it for this occasion as N^{1-k}s^{(k)}. Let us scale the energy difference with respect to E_c as ∆E ≡ E - E_c = N^{1/2}ε. Then we may rewrite (52) as

    Z_c(T) = \mathrm{cst} \times \frac{N^{1/2}}{δE} \int dε\, \exp\Big(\frac{1}{2}ε^2 s^{(2)} + \frac{1}{6}N^{-1/2}ε^3 s^{(3)} + O(N^{-1})\Big)    (53)

in which “cst” stands for an expression independent of E. It is now clear that for N → ∞ the third and higher order terms in the exponent vanish and that the integrand draws its contribution from ε values that are typically of order N^0. The expression under the integral tends towards

    \exp\Big(\frac{1}{2}ε^2 s^{(2)}\Big) = \exp\Big(\frac{1}{2}k_B^{-1}(E - E_c)^2 \frac{∂^2S_m(E)}{∂E^2}\Big|_{E_c}\Big)    (54)

where we have restored the original variables. Eq. (54) represents, up to a normalization factor, the distribution of the energy E in the canonical ensemble.

The method applied here is a special case of our general discussion in Appendix B.

Exercise. Find p(E) including its normalization. Using (54) and your knowledge of the microcanonical ensemble, show that the fluctuations of the energy are given by

    ⟨∆E^2⟩_c = k_BT_m^2 C_{V,m}.    (55)

Compare to (18) and comment.

III.10. Marginal laws of the canonical distribution

Maxwell distribution for the velocity of a particle.
Barometric height formula.
Joint distribution for N or k particle positions.

III.11. General pair of conjugate variables

Let H(Γ) = H_0(Γ) - φX(Γ), in which H_0 is an “unperturbed” (but otherwise arbitrary) Hamiltonian, and φ a control parameter that couples to a physical quantity X(Γ), also arbitrary in the present discussion. In practice X will be the sum of some intensive quantity x over the volume of the system, and will therefore be extensive.

Let Z_c(T, V, φ, N) be the canonical partition function of this system. It is easy to show that the average of X is given by

    ⟨X⟩_c = k_BT \frac{∂ \log Z_c}{∂φ}.    (56)


Figure 5: Energy distribution in the canonical ensemble.

The “susceptibility” χ_φ is the per-particle increase of this average for a unit increase of the control parameter, that is

    χ_φ ≡ \frac{1}{N} \frac{∂⟨X⟩_c}{∂φ} .    (57)

One derives almost directly

    ⟨∆X^2⟩_c = Nk_BTχ_φ .    (58)

This generalizes earlier relations of this type found for X equal to the energy and the magnetization.

III.12. Canonical systems in contact

III.12.1. Equilibrium condition

Consider two canonical systems, 1 and 2, at equilibrium at a temperature T, and characterized by variables V_1, N_1, . . . and V_2, N_2, . . ., respectively. We may see these variables as “constraints.” When the two systems are brought into contact such as to be able to exchange volume (by means of a piston), the constraints V_1 and V_2 are released and replaced with the single constraint V = V_1 + V_2. When they are allowed to exchange particles (through a permeable wall) the constraints N_1 and N_2 are released and replaced by the single constraint N = N_1 + N_2. And so on.

In the general discussion below we will consider extensive constraints X_1 and X_2 of arbitrary type and let X = X_1 + X_2. Before the contact is established the systems have free energies F_1(T, X_1) and F_2(T, X_2) and the partition function of the combined system is^{21}

    Z_{c,1}(T, X_1)\, Z_{c,2}(T, X_2) = e^{-βF_1(T,X_1) - βF_2(T,X_2)}.    (59)

After release of the constraint (and once the new equilibrium has been established) the partition

21Any possible interaction terms between the two systems are assumed to be negligible.


function is

    Z_{c,12}(T, X) = \sum_{X_1} e^{-βF_1(T,X_1) - βF_2(T,X-X_1)} = \sum_{X_1} e^{-βF_{12}(T,X;X_1)}    (60)

(if X_1 may vary continuously, we will have an integral instead of a sum). Here F_{12}(T, X; X_1) is the free energy when the two systems are constrained to have the values X_1 and X_2 = X - X_1. The total partition function is a sum on all possible values of the constraint X_1.

We apply once more the usual argument: F_{12} is extensive, that is, it is of the form F_{12} = Nf(T, X/N; X_1/N). Hence the sum on X_1 will have a sharp (and effectively Gaussian) peak occurring at a maximum X_1 = X_1^{max}, which is the solution of

    \frac{∂F_{12}(T, X; X_1)}{∂X_1} = 0.    (61)

This equation is the condition that determines the new equilibrium. The free energy F_{12}(T, X) of the combined system in this equilibrium will be given by

    Z_{c,12}(T, X) = e^{-β\overbrace{F_{12}(T,X;X_1^{max})}^{F_{12}(T,X)} + o(N)}    (62)

The equilibrium condition (61) becomes a relation between the derivatives of F_1 and F_2, namely

    \frac{∂F_1(T, X_1^{max})}{∂X_1} = \frac{∂F_2(T, X - X_1^{max})}{∂X_2} .    (63)

This condition may easily be worked out more explicitly for specific cases. We thus have

    X = V :  p_{c,1} = p_{c,2} ,
    X = N :  µ_{c,1} = µ_{c,2} ,
    etc.    (64)

We draw the following important conclusion:

When a constraint is released in a system in equilibrium in contact with a heat bath, the system evolves towards a new equilibrium state that minimizes the free energy (subject to the remaining constraints).

Note the similarity and the differences with what you know about the microcanonical ensemble (section II.5.6-7).

III.12.2. Fluctuations (once more)

In order that (61) correspond to a maximum of the integrand we need the “stability” condition

    \frac{∂^2F_{12}}{∂X_1^2}\Big|_{X_1^{max}} > 0  ⇒  \frac{∂^2F_{12}(T, X)}{∂X^2} > 0.    (65)


[***] In case X = V this leads to

    \frac{∂^2F_{12}}{∂V^2} = -\Big(\frac{∂p}{∂V}\Big)_{N,T} = \frac{1}{Vκ_T} > 0,    (66)

in which κ_T is the isothermal compressibility. As before, the (effectively Gaussian) sum on X_1 has a width determined by the second-order term in the expansion

    F_{12}(T, X; X_1) = F_{12}(T, X) + \frac{1}{2}(X_1 - X_1^{max})^2 \frac{∂^2F_{12}(T, X)}{∂X^2} + \ldots ,    (67)

which shows that the width is

    ⟨∆X^2⟩_c = \frac{k_BT}{∂^2F_{12}/∂X^2} .    (68)

Exercise. Consider the two systems in contact through a movable wall (piston) and let X = V. Argue why small (large) fluctuations of the two subvolumes V_1 and V_2 go hand in hand with a low (high) compressibility.

Write down the probability distribution p(X_1) of the variable X_1 when the combined system is in equilibrium.

III.13. The Gibbs or Shannon entropy

In section II.6 we exhibited a definition of the “Gibbs” or “Shannon” entropy (S_G or S_S, respectively) not based on thermodynamics but on the probabilities p_i of a set of microstates i. Working with a discrete set of microstates, we then showed the relation of S_G to the microcanonical entropy.

Let E_ℓ be the energy of microstate ℓ. In the canonical ensemble we have that p_ℓ = e^{-βE_ℓ}/Z_c and we may try to see what we get when we substitute this in the expression for the Gibbs or Shannon entropy:

    S_G = -k_B \sum_ℓ p_ℓ \log p_ℓ    (69)
        = -k_B \sum_ℓ \frac{e^{-βE_ℓ}}{Z_c} \log \frac{e^{-βE_ℓ}}{Z_c}
        = \ldots
        = \frac{1}{T} \underbrace{\sum_ℓ \frac{E_ℓ\, e^{-βE_ℓ}}{Z_c}}_{⟨E⟩_c} + k_B \underbrace{\log Z_c}_{-βF_c} \underbrace{\sum_ℓ \frac{e^{-βE_ℓ}}{Z_c}}_{1}    (70)

so that

    F_c = ⟨E⟩_c - TS_G .    (71)

It follows that the Gibbs-Shannon expression (69) for the entropy, when combined with the canonical microstate probabilities, yields the canonical entropy: S_G = S_c = S.

Expression (69), although not directly necessary for statistical thermodynamics, provides an important complementary view of the concept of entropy.


IV. HARMONIC OSCILLATOR

Applications. Discussed in class.

IV.1. The classical harmonic oscillator

The Hamiltonian of a classical one-dimensional oscillator of mass m and angular frequency ω is

    H(x, p) = \frac{p^2}{2m} + \frac{1}{2}mω^2x^2.    (1)

The calculation of its canonical partition function is straightforward. We have a single particle (N = 1) and the volume V plays no role (why?). Hence

    Z_c(T) = \frac{1}{h} \int_{-∞}^{∞} dx \int_{-∞}^{∞} dp\, e^{-βH(x,p)}
           = \frac{1}{h} \sqrt{\frac{2πm}{β}} \sqrt{\frac{2π}{βmω^2}}
           = \frac{2πk_BT}{hω}.    (2)

We deduce from it the free energy,

    F_c(T) = -k_BT \log Z_c(T) = -k_BT \log\frac{2πk_BT}{hω},    (3)

and the energy

    ⟨E⟩_c = -\frac{∂ \log Z_c(T)}{∂β} = -\frac{∂}{∂β}\log\frac{2π}{βhω} = k_BT .    (4)

From the above and the thermodynamic relation F_c = ⟨E⟩_c - TS_c one finds for the entropy of the oscillator

    S_c = k_B\Big(\log\frac{2πk_BT}{hω} + 1\Big).    (5)

We observe that, according to (5), at sufficiently low temperature this entropy will become negative. As discussed in class, this is a sign that the classical calculation is no longer valid.

The heat capacity of this single oscillator, which we will continue to denote by C_V, is given by

    C_V = \frac{∂⟨E⟩_c}{∂T} = k_B .    (6)

One may also calculate the average kinetic and potential energy, which are given by

    ⟨\tfrac{1}{2}mv^2⟩_c = \tfrac{1}{2}k_BT,    ⟨\tfrac{1}{2}mω^2x^2⟩_c = \tfrac{1}{2}k_BT.    (7)

The quantity 〈E〉c of Eq. (4) is their sum.


Figure 6: Energy of the harmonic oscillator.

Figure 7: Specific heat of the harmonic oscillator.

Remark. Eqs. (7) are an example of what is called equipartition (which is short for: equipartition of the energy between the degrees of freedom of the system). The equipartition theorem says that an energy equal to \tfrac{1}{2}k_BT is stored in every degree of freedom of a classical system that occurs quadratically in the Hamiltonian. One consequence is the well-known expression

    ⟨\tfrac{1}{2}mv^2⟩ = \tfrac{3}{2}k_BT    (8)

for the average kinetic energy of a real-world (three-dimensional) particle.

More generally, as an important but approximate rule of thumb, an energy of the order of the thermal energy k_BT is stored in every classical degree of freedom.

Exercise. Calculate the average kinetic and potential energy of a particle in a gravitational field, that is, described by the Hamiltonian

    H = \frac{1}{2m}\big(p_1^2 + p_2^2 + p_3^2\big) + mgx_3 ,    x_3 > 0.    (9)


Make any useful comment.

Three-dimensional harmonic oscillator

Its Hamiltonian is

    H(r, p) = \frac{p^2}{2m} + \frac{1}{2}mω^2r^2.    (10)

It is equivalent to three decoupled one-dimensional oscillators and therefore its canonical partition function is the cube of the partition function found above. As a consequence the free energy and its derivatives acquire a factor 3 with respect to those found above.

IV.2. The quantum harmonic oscillator

The quantum harmonic oscillator is described by Hamiltonian (1), but in which p and x are operators. This Hamiltonian may be rewritten [see your Quantum Mechanics course] as

    H(x, p) = \frac{p^2}{2m} + \frac{1}{2}mω^2x^2
            = -\frac{ħ^2}{2m}\frac{∂^2}{∂x^2} + \frac{1}{2}mω^2x^2
            = \ldots
            = ħω\big(a^†a + \tfrac{1}{2}\big),    (11)

in which a^† and a are the usual creation and destruction operators, respectively. The last line shows that H has the energy levels

    ε_n = (n + \tfrac{1}{2})ħω,    n = 0, 1, 2, . . . .    (12)

The canonical partition function therefore is

    Z_c(T) = \sum_{n=0}^{∞} e^{-βε_n} = \frac{e^{-\frac{1}{2}βħω}}{1 - e^{-βħω}}    (13)

[which is also equal to 1/(2\sinh\tfrac{1}{2}βħω)]. It follows that

    ⟨E⟩_c = -\frac{∂ \log Z_c(T)}{∂β} = \ldots = \frac{ħω}{2}\,\frac{1 + e^{-βħω}}{1 - e^{-βħω}}    (14)

(which is also equal to \tfrac{ħω}{2}\coth\tfrac{βħω}{2}). At high temperature the quantum curve approaches the classical curve asymptotically. At low temperature ⟨E⟩_c tends to the ground state energy, which is nonzero in quantum mechanics. See Fig. 6.

For the specific heat one finds

    C_V = \frac{∂⟨E⟩_c}{∂T} = k_Bβ^2ħ^2ω^2 \frac{e^{βħω}}{(e^{βħω} - 1)^2}.    (15)


We may now consider the two limits

    T → ∞ or β → 0 :    C_V ≃ k_B,
    T → 0 or β → ∞ :    C_V ≃ k_B\Big(\frac{ħω}{k_BT}\Big)^2 e^{-ħω/k_BT}.    (16)

The exponential suppression of C_V for T → 0 is characteristic of an energy spectrum in which a gap separates the ground state from the excited states. One says that for k_BT ≲ ħω the degrees of freedom of the oscillator are “frozen.” This is a quantum effect. See Fig. 7.
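Eq. (15) and its two limits (16) are easy to tabulate (a minimal sketch in units where ħω = k_B = 1, an arbitrary choice):

```python
import numpy as np

def C_V(T, hw=1.0, kB=1.0):
    """Heat capacity of one quantum oscillator, Eq. (15)."""
    x = hw / (kB * T)                          # x = beta * hbar * omega
    return kB * x**2 * np.exp(x) / np.expm1(x)**2

for T in (0.05, 0.2, 0.5, 1.0, 5.0):
    print(T, C_V(T))    # "frozen" (exponentially small) at low T, tends to kB at high T
```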

IV.3. The harmonic crystal

IV.3.1. The classical crystal

We consider a crystal of N atoms described by the Hamiltonian

    H = \sum_{i=1}^{N} \frac{p_i^2}{2m} + V(r_1, . . . , r_N).    (17)

Let V be such that it is minimal when the atoms are located at the lattice positions r_i = r_i^0 (positions of mechanical equilibrium). We know experimentally that this is what happens. For small deviations of the r_i = (r_{ix}, r_{iy}, r_{iz}) from their equilibrium values the potential energy becomes, with V_0 = V(r_1^0, . . . , r_N^0),

    V(r_1, . . . , r_N) = V_0 + \frac{1}{2} \sum_{i,i'=1}^{N} \frac{∂^2V}{∂r_i∂r_{i'}}\Big|_0 : (r_i - r_i^0)(r_{i'} - r_{i'}^0) + \ldots .    (18)

In order to simplify notation we set ι = (i, α) and introduce coordinates q_ι = r_{iα} - r_{iα}^0 and momenta p_ι = m\dot q_ι. Upon neglecting the higher-order terms in Eq. (18) we obtain the crystal Hamiltonian in the harmonic approximation,

    H = \sum_{ι=1}^{3N} \frac{p_ι^2}{2m} + \frac{1}{2}m \sum_{ι,ι'=1}^{3N} Ω^2_{ιι'} q_ι q_{ι'} ,    (19)

in which the 3N × 3N matrix Ω^2 is given by mΩ^2_{ιι'} = [∂^2V/∂r_{iα}∂r_{i'α'}]|_0. Eq. (19) is a Hamiltonian of 3N coupled harmonic oscillators. The matrix may be diagonalized by linearly transforming from the p_ι, q_ι to new variables p_κ, q_κ, where the index κ = (k, λ) consists of a wavevector k and a polarization index λ.^{22} We then get

    H = \sum_{k,λ} \Big[\frac{p_{kλ}^2}{2m} + \frac{1}{2}mω_{kλ}^2 q_{kλ}^2\Big].    (20)

The wavevector k may take any of N discrete values in a finite region of k space called the first Brillouin zone (see your course on solid state physics); the polarization index λ may take three values (for one longitudinal and two transverse polarizations). Eq. (20) shows therefore that there are 3N independent vibrational modes.

IV.3.2. The quantum crystal

22Do not confuse this λ with a wavelength, often indicated by the same symbol.


The Hamiltonian has the same form as in Eq. (17), but in which p_i = (ħ/i)∂/∂r_i and r_i are operators. The Hamiltonian is diagonalized by the same transformation as in the classical case. In the quantum case this leads to

    H = \sum_{k,λ} ħω_{kλ}\big(a^†_{kλ}a_{kλ} + \tfrac{1}{2}\big),    (21)

which is a sum of 3N independent quantum oscillators. The energy levels of the crystal are

    \sum_{k,λ} (n_{kλ} + \tfrac{1}{2})ħω_{kλ}    with n_{kλ} = 0, 1, 2, . . .    (22)

One says that the system contains n_{kλ} phonons (= quanta of vibrational energy) of type (k, λ). We return to the simplified notation κ = (k, λ), where κ runs through 3N distinct values.

The canonical partition function is

    Z_c(T) = \prod_κ \frac{e^{-\frac{1}{2}βħω_κ}}{1 - e^{-βħω_κ}}.    (23)

The average energy is given by

    ⟨E⟩_c = \sum_κ \frac{ħω_κ}{2}\,\frac{1 + e^{-βħω_κ}}{1 - e^{-βħω_κ}}    (24)

and the heat capacity by

    C_V = k_B \sum_κ (βħω_κ)^2 \frac{e^{βħω_κ}}{(e^{βħω_κ} - 1)^2}.    (25)

For T → ∞ one sees that C_V ≃ 3Nk_B, as for the classical crystal. For T → 0 things are slightly more complicated.

One expects that when N is large, the frequencies ω_κ constitute a quasi-continuum on the frequency axis. Let N(ω)dω be the number of frequencies in the interval [ω, ω + dω]; we may write this as

    N(ω) = \sum_κ δ(ω - ω_κ)    (26)

[to see this, integrate both sides between ω_0 and ω_0 + dω; by the way, what is the physical dimension of N(ω)?].

Different types of crystals will have different spectra, hence different functions N(ω). We may then write (25) as

    C_V = k_B \int_0^∞ dω\, N(ω)(βħω)^2 \frac{e^{βħω}}{(e^{βħω} - 1)^2}.    (27)

We must of course have

    \int_0^∞ dω\, N(ω) = 3N.    (28)

The true problem is how to find the spectrum and hence N(ω). The details of the heat capacity depend on the details of N(ω), but there are some general features. For T → 0 the integral (27)


Figure 8: Debye density of frequencies.

draws its main contribution from the low frequencies. We may then use the fact that for ω → 0 we have

    N(ω) ≃ C_1ω^2 + \ldots ,    (29)

(no proof given here) in which the dots stand for higher order terms in ω. Substitution of (29) in (27) gives

    C_V = k_B \int_0^∞ dω\, [C_1ω^2 + \ldots](βħω)^2 \frac{e^{βħω}}{(e^{βħω} - 1)^2}
        ≃ k_B C_1 \cdot \frac{2^5}{4}(βħ)^{-3} \int_0^∞ dx\, \frac{x^4}{\sinh^2 x}
        = k_B \frac{4π^4C_1}{15}\Big(\frac{k_BT}{ħ}\Big)^3 ,    (T → 0).    (30)

In going from the first to the second line we set x = βħω/2 and in passing to the third line we used that the integral on x equals π^4/30. The important conclusion that appears here is that the phonon specific heat goes to zero as ∼ T^3 when T → 0, whatever the crystal may be.

We note that C_1 must have the dimension of a time cubed in order that everything be dimensionally consistent; however, a more elegant rewriting of (30) is possible, as we will see now.

The Debye model. Debye proposed in 1912 to model a crystal by introducing a cutoff frequency ω_D and taking

    N(ω) = C_1ω^2  for ω < ω_D ,      N(ω) = 0  for ω > ω_D .    (31)

Substituting this in the normalization condition (28) yields C_1 = 9N/ω_D^3 and hence (30) becomes

    C_V ≃ Nk_B \frac{12π^4}{5}\Big(\frac{k_BT}{ħω_D}\Big)^3 = Nk_B \frac{12π^4}{5}\Big(\frac{T}{T_D}\Big)^3 ,    T → 0,    (32)


where in the last line we have introduced the Debye temperature defined by ħω_D = k_BT_D. The T^3 behavior of the phonon specific heat is indeed observed in experiments. By fitting the data to expression (32) one may characterize each crystal by its Debye temperature. We have, for example, T_D ≈ 170 K for gold, T_D ≈ 215 K for silver, and T_D ≈ 105 K for lead. Put simply, for each crystal this is the temperature below which there are almost no lattice vibrations.

Exercise. Show that the dot terms in (30) lead to corrections having higher powers in T/T_D as T → 0.

Note. We will see that the specific heat of the electrons is linear in T as T → 0. Hence it dominates the phonon specific heat at low enough temperature.

We will return to the subject of harmonic oscillators in the chapter about photons (= quanta of oscillation of the electromagnetic field).

Exercise. The quantization of (20), which leads to (21), is possible only if the pairs (p_ι, q_ι), when viewed as quantum mechanical operators, satisfy the appropriate commutation relations. State these relations and explain why they are satisfied.


V. GRAND-CANONICAL ENSEMBLE

V.1. The ensemble

V.1.1. Definition

The grand-canonical ensemble is not, strictly speaking, an ensemble of the type (39), but a slight generalization of it. We will now denote an N particle microstate by Γ_N = (r^N, p^N). The grand-canonical ensemble is a density ρ_g(N, Γ_N) on both N and Γ_N. It depends on a parameter µ and is defined by

    ρ_g(N, Γ_N) = \frac{1}{N! h^{3N}} e^{βµN - βH}.    (1)

The expression on the RHS is of course equal to e^{βµN}ρ_c(Γ_N). The grand-canonical distribution may be viewed as a sum of juxtaposed weighted canonical distributions. The associated grand-canonical partition function Z_g is therefore

    Z_g(T, V, µ) = \sum_{N=0}^{∞} \int dΓ_N\, \frac{1}{N! h^{3N}} e^{βµN - βH},    (2)

which depends on the triplet of independent variables

    (T, V, µ).    (3)

Eq. (2) may be rewritten in various ways, in particular as

    Z_g(T, V, µ) = \sum_{N=0}^{∞} e^{βµN} Z_c(T, V, N)    (4)
                 = \sum_{N=0}^{∞} e^{βµN - βF_c(T,V,N)}.    (5)

The first line of this equation shows that Z_g is the generating function of the Z_c(T, V, N) for N = 0, 1, 2, . . .. Passing from Z_c to its generating function is not very different from doing a Laplace transform, albeit a discrete one (sum on N instead of an integral).

The quantity e^{βµ} ≡ z is called the fugacity of the system^{23}. A high (low) fugacity or chemical potential corresponds to a high (low) tendency for the presence of particles. The chemical potential µ can be positive, negative, or zero; it is measurable only indirectly.

V.1.2. Grand-canonical averages

The probability distribution P_g(N, Γ_N) associated with the grand-canonical ensemble is

    P_g(N, Γ_N) = \frac{ρ_g(N, Γ_N)}{Z_g}.    (6)

It is normalized to unity,

    \sum_{N=0}^{∞} \int dΓ_N\, P_g(N, Γ_N) = 1.    (7)

23In chemical thermodynamics it is rather called the activity or activity coefficient.


Let A(N, Γ_N) be an observable. Its grand-canonical average is

    ⟨A⟩_g = \sum_{N=0}^{∞} \int dΓ_N\, A(N, Γ_N) P_g(N, Γ_N)
          = \frac{1}{Z_g} \sum_{N=0}^{∞} \frac{1}{N! h^{3N}} e^{βµN} \int dΓ_N\, A(N, Γ_N)\, e^{-βH(N,Γ_N)},    (8)

in which H(N, Γ_N) is the N particle Hamiltonian. Hence ⟨A⟩_g is the result of an average that also runs over a varying number of particles.

There may be several reasons for wanting to work in the grand-canonical ensemble. Among these:

(i) it allows for the description of the physical phenomenon of a fluctuating particle number, for example (but not only) in the presence of permeable walls;

(ii) for certain types of problems, usually easy to recognize, the thermodynamics is derived most easily in the grand-canonical ensemble;

(iii) the grand-canonical ensemble provides a suitable framework for setting up a theory for nonideal gases, which is one of the final purposes of statistical mechanics.

V.2. The grand-canonical potential

V.2.1. Legendre transformation between Fc and Jg

The thermodynamic potential associated with Z_g is the grand-canonical potential (or just grand potential) J_g defined by^{24}

    J_g(T, V, µ) = -k_BT \log Z_g(T, V, µ).    (9)

If J_g is extensive, as we expect it to be, then we must have

    J_g(T, V, µ) = V j(T, µ),    V → ∞;    (10)

notice that since N is no longer an independent variable, we have used here proportionality to V as the criterion for extensivity.

Eq. (5) shows that Z_g is obtained as the sum over the exponential of an extensive expression. The sum is therefore strongly peaked and, for reasons discussed several times before, will be equal to the value of the summand taken in the peak. Therefore, if this peak occurs for N = N_g(T, V, µ), we have

    Z_g(µ) = \max_N e^{βµN - βF_c(N)} = e^{βµN_g - βF_c(N_g)},    (11)

where subextensive terms in the exponential have been neglected and where, for simplicity, we have not indicated the dependence on V and T. It follows that

    J_g = F_c - µN_g .    (12)

24There is no general agreement on the symbols. Christophe Texier uses Ξ instead of Z_g. It is common to use Ω instead of J_g. Many authors use z = e^{βµ} for the fugacity.


This relation between J_g and F_c is again a Legendre transformation; it is the consequence of the “Laplace” transformation (4) and the fact that the thermodynamic limit is being taken. Indeed, the relation J = F - µN is well-known in thermodynamics.

The value N_g(T, V, µ) is the solution of ∂[µN - F_c(N)]/∂N|_{N_g} = 0, from which we see directly that the parameter µ of the grand-canonical ensemble has the interpretation

    µ = µ_c .    (13)

When its natural variables T, V, µ undergo variations dT, dV, and dµ, the resulting infinitesimal change in J_g is

    dJ_g = \underbrace{\Big(\frac{∂J_g}{∂T}\Big)_{V,µ}}_{-S_g} dT + \underbrace{\Big(\frac{∂J_g}{∂V}\Big)_{T,µ}}_{-p_g} dV + \underbrace{\Big(\frac{∂J_g}{∂µ}\Big)_{T,V}}_{-N_g} dµ.    (14)

The three new partial derivatives that appear here may be calculated directly in terms of known canonical quantities and we find (reasoning in a similar way as in section III.2.1) that, when (13) is satisfied,

    S_g = S_c ,    p_g = p_c .    (15)

Hence the grand-canonical ensemble leads to the same thermodynamics as the canonical (and the microcanonical) ensemble; we may suppress the indices m, c, and g on the thermodynamic quantities: these refer only to how they have been calculated.

Remark. If we compare the definition of the pressure p_g in (14) to (10), we see that differentiation of J_g with respect to V actually amounts to a simple division. It follows that the grand-canonical potential satisfies

    J_g = -pV.    (16)

V.2.2. Other thermodynamic quantities

It is easy to show that the average number of particles of a system described by the grand-canonical ensemble may be obtained as

    ⟨N⟩_g = k_BT \frac{∂ \log Z_g}{∂µ}.    (17)

Because of the peaked nature of the distribution of N we must have ⟨N⟩_g = N_g and in fact (17) is just another way of expressing the third partial derivative in (14).

Note. In spite of T, V, µ being the independent variables, one often sets β = 1/k_BT (as always) and α = -µ/k_BT, and considers α and β as independent. Eq. (17) then takes the form

    ⟨N⟩_g = -\frac{∂ \log Z_g}{∂α}.    (18)

One shows furthermore that

    \frac{∂^2 \log Z_g}{∂α^2} = ⟨N^2⟩_g - ⟨N⟩_g^2 ≡ σ_N^2 .    (19)


Since \log Z_g is extensive (proportional to V), this shows that the typical deviation ∆N = N - ⟨N⟩_g of the particle number from its average scales as V^{1/2}. The relative fluctuation ∆N/⟨N⟩_g ∼ V^{-1/2} of the particle number is therefore very small and negligible in the thermodynamic limit.

Exercise. Relate the fluctuations (19) in the number of particles to the isothermal compressibility κ_T ≡ -\frac{1}{V}\frac{∂V}{∂p}. Discuss beforehand in what sense you expect ⟨∆N^2⟩_g to vary with κ_T.

You may check for yourself that the average energy in the grand-canonical ensemble is obtained as

    ⟨E⟩_g = -\frac{∂ \log Z_g}{∂β}\Big|_{α\ \mathrm{fixed}}.    (20)

[Remark on mixed derivatives with respect to α and β.]

V.3. Derivation of the grand-canonical ensemble

The grand-canonical ensemble applies to any subsystem (whether finite or thermodynamically large) of a canonical or microcanonical system with which it can exchange both energy and particles. The subsystem may be separated from the total system by permeable walls or simply by an imaginary closed surface. Formal derivations of the grand-canonical ensemble go along the same lines as the derivation of the canonical from the microcanonical one; we will not do that here.

V.4. The ideal gas

Recall that the canonical partition function of the ideal gas, given by (III.22), reads

    Z_c(T, V, N) = \frac{1}{N!}\Big(\frac{V}{λ^3}\Big)^N .    (21)

Substituting this in (4) and doing the sum we find

    Z_g(T, V, µ) = \exp\Big(e^{βµ}\frac{V}{λ^3}\Big)    (22)

whence

    J_g(T, V, µ) = -k_BT\, e^{βµ}\frac{V}{λ^3}.    (23)

It follows immediately that the grand-canonical pressure and average particle number of the ideal gas are given by

    p_g = -\frac{∂J_g}{∂V} = k_BT\frac{e^{βµ}}{λ^3},    (24)

    N_g = -\frac{∂J_g}{∂µ} = e^{βµ}\frac{V}{λ^3}.    (25)

For the particle density this gives n_g = N_g/V = e^{βµ}/λ^3, which rises and falls together with the chemical potential, as expected. When µ is eliminated from equations (24) and (25) we find that

    p_g = n_g k_BT ,    (26)


which is again the equation of state of the ideal gas, expressed in terms of the usual physical quantities, but now derived within the framework of the grand-canonical ensemble.

To obtain the average grand-canonical energy we may proceed by calculating

    ⟨E⟩_g = -\frac{∂ \log Z_g}{∂β}\Big|_{α\ \mathrm{fixed}} = -\frac{∂}{∂β}\, e^{-α}\frac{V}{λ^3} = -e^{-α}V \frac{∂}{∂β}\Big(\frac{2πm}{βh^2}\Big)^{3/2} = \frac{3}{2}N_g k_BT ,    (27)

in agreement with what the other ensembles gave.

Exercise. Rederive the Sackur-Tetrode formula (I.15) for the entropy within the framework of the grand-canonical ensemble.

V.5. Nonideal gases: the virial expansion

We have studied ideal gases in three ensembles. Real gases may be close to ideal, but differences will necessarily become more apparent as the density goes up. Suppose that for a nonideal gas we may write the equation of state as

    \frac{p}{k_BT} = n + B_2n^2 + B_3n^3 + \ldots ,    (28)

in which the first term is the ideal gas expression and the higher-order terms account for deviations from the ideal gas law as the density goes up. The series (28) is called a virial expansion; the coefficients B_ℓ, which depend on temperature, are the virial coefficients. They may of course be determined from experimental data and will depend on the kind of gas under consideration.

Question: For a nonideal gas with a given interparticle potential U(|r_i - r_j|), can we determine the coefficients of the correction terms? If so, the measured equation of state will tell us something about the interparticle potential and vice versa.

Answer: Yes, this can be done, and the grand-canonical ensemble is a very suitable framework for the theory.

Let us consider the N particle Hamiltonian

    H(N, Γ_N) = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \overbrace{\sum_{1≤i<i'≤N} U(r_{ii'})}^{U_N} ,    (29)

where r_{ii'} ≡ |r_i - r_{i'}|. Let in this section Q_N denote the configurational partition function of the N particle system,

    Q_N(T, V) = \int dr_1 \ldots dr_N\, e^{-βU_N} ,    (30)

with Q_0 ≡ 1. [Briefly discuss here Q_1 and Q_2.] Since Q_N involves interacting particles, there is no hope of ever being able to calculate it exactly.

We will set $\zeta = e^{\beta\mu}/\lambda^3$. The grand-canonical partition function for the system described by Hamiltonian (29) then takes the form
\[
Z_g(T,V,\mu) = \sum_{N=0}^{\infty}\frac{\zeta^N}{N!}\,Q_N(T,V). \tag{31}
\]


Let us now expand its logarithm as a series in powers of ζ. This gives

\[
\log Z_g(T,V,\mu) = Q_1\zeta + \tfrac12\!\left(Q_2-Q_1^2\right)\zeta^2 + \tfrac16\!\left(Q_3-3Q_1Q_2+2Q_1^3\right)\zeta^3 + \ldots \tag{32}
\]
\[
= V\sum_{\ell=1}^{\infty} b_\ell(T,V)\,\zeta^\ell, \tag{33}
\]
where the second line defines the $b_\ell(T,V)$. You may check easily that $Q_1 = V$ whence $b_1 = 1$. Notice that $Q_N$ involves an N-fold volume integral: we expect it to be proportional to $V^N$ in the thermodynamic limit. However, since we know that $\log Z_g \sim V$, there must occur cancellations of all nonlinear powers of V in the expressions for the $b_\ell$. Or put differently, we expect that

\[
\lim_{V\to\infty} b_\ell(T,V) \equiv b_\ell(T) \tag{34}
\]
exists.

From (33) we have, if we use that $\log Z_g = pV/k_BT$ and divide both members by V,
\[
\frac{p}{k_BT} = \sum_{\ell=1}^{\infty} b_\ell(T,V)\,\zeta^\ell. \tag{35}
\]
Consider now the series for the grand-canonical particle density $n = \langle N\rangle_g/V$ (we set again $-\beta\mu \equiv \alpha$) that comes along with (33),

\[
n = -\frac{1}{V}\,\frac{\partial\log Z_g}{\partial\alpha} = \frac{\zeta}{V}\,\frac{\partial\log Z_g}{\partial\zeta}
= \sum_{\ell=1}^{\infty}\ell\, b_\ell(T,V)\,\zeta^\ell. \tag{36}
\]
Upon eliminating ζ from Eqs. (35) and (36) we find for $p/k_BT$ a series of exactly the type (28), in which, in the thermodynamic limit, the coefficients are given by
\[
B_2 = -b_2(T), \qquad B_3 = 4b_2^2(T) - 2b_3(T), \ \ldots \tag{37}
\]

We now investigate the second virial coefficient B2. From its definition we have

\[
b_2(T) = \lim_{V\to\infty}\frac{1}{2V}\left[\int_V\!\!\int_V d\mathbf{r}_1\,d\mathbf{r}_2\; e^{-\beta U(r_{12})} - V^2\right]
= \lim_{V\to\infty}\frac{1}{2V}\int_V\!\!\int_V d\mathbf{r}_1\,d\mathbf{r}_2\left[e^{-\beta U(r_{12})}-1\right] \tag{38}
\]

whence

\[
B_2(T) = -\tfrac12\int_0^{\infty} 4\pi r^2\,dr\left[e^{-\beta U(r)}-1\right]. \tag{39}
\]
If needed, this integral is easily evaluated numerically. Virial coefficients have been measured and tabulated by experimentalists for many real gases. Knowing the precise equation of state of a gas is of immense importance for a multitude of industrial purposes.
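[As an aside, not part of the original text: here is a minimal numerical sketch of how (39) may be evaluated. The Lennard-Jones potential and its parameter values are illustrative assumptions, roughly argon-like, and are not taken from these notes.]

```python
import numpy as np
from scipy.integrate import quad

kB = 1.380649e-23  # J/K

def B2(T, U, r_min=1e-12, r_max=5e-9):
    """Second virial coefficient, Eq. (39): B2 = -1/2 * int 4 pi r^2 [exp(-U/kBT) - 1] dr."""
    beta = 1.0 / (kB * T)
    integrand = lambda r: 4.0 * np.pi * r**2 * (np.exp(-beta * U(r)) - 1.0)
    val, _ = quad(integrand, r_min, r_max, limit=200)
    return -0.5 * val   # result in m^3 per particle

# Illustrative Lennard-Jones parameters (assumed values, ballpark for argon)
eps   = 120.0 * kB    # well depth [J]
sigma = 3.4e-10       # length scale [m]

def U_LJ(r):
    x = (sigma / r) ** 6
    return 4.0 * eps * (x**2 - x)

for T in (100.0, 300.0, 1000.0):
    print(f"T = {T:6.1f} K   B2 = {B2(T, U_LJ):+.3e} m^3")
```

At low temperature the attractive well dominates and the sketch returns a negative $B_2$; at high temperature the repulsive core wins and $B_2$ becomes positive, in line with the exercise below.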

Exercise. Discuss the sign of B2 for attractive and repulsive potentials.


Exercise. Derive the expression for B3 given in (37).

Exercise. Calculate the second virial coefficient for a d-dimensional gas of hard spheres of diameter a, for d = 3, 2, 1. Compare with the exercise about the one-dimensional hard rods that we did in chapter III.

Epilog to this section. Calculating the consequences of interactions between particles is one of the fundamental and very difficult questions of statistical physics. It has no all-encompassing answer.

In many approaches to this problem there is a “zeroth order” Hamiltonian that one knows how to treat exactly, and an interaction term that leads to a perturbation series. The virial expansion is but one implementation of this general scheme; here the zeroth order system is the ideal gas. In other cases the zeroth order system might be, for example, the ground state, or the harmonic crystal.

Most gases are known to have critical points (recall the typical phase diagram and what is meant by a critical point). It so happens that the critical point is not close to any zeroth order Hamiltonian. The study of critical phenomena therefore presents specific theoretical difficulties and has developed into a scientific domain of its own. We must refer you to other courses to learn more about this.


VI. STATISTICAL MECHANICS OF QUANTUM SYSTEMS

In chapter I we showed how for a classical system you can go from the basic equations of motion – the Newton or Hamilton-Jacobi equations – to the Liouville equation for an ensemble ρ in classical phase space.

In this chapter we will start all over again and do the same thing for a quantum system, starting from the Schrödinger equation. It will appear that the classical ensemble ρ is replaced with a density matrix or density operator, also called ρ, and the classical Liouville equation of motion for ρ by its quantum mechanical counterpart, called the von Neumann equation. The stationary solutions of this equation are the quantum analogs of the classical ensembles.

Our dealings with quantum systems in earlier chapters (in particular in chapter IV: “Harmonic oscillators”) anticipated the results of the present chapter.

The material of this chapter may be found in part in C. Texier, Chapitre 3, sections II.A and II.B. You may also consult B. Diu et al., Physique Statistique (ed. Hermann, Paris, 1989), Complements I.H and V.E.

VI.1. Reminders about quantum mechanics

Let us consider the Schrödinger equation
\[
i\hbar\,\frac{\partial\Psi}{\partial t} = H\Psi, \tag{1}
\]
where $\Psi(\overbrace{\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_N}^{\mathbf{r}^N};t)$ is the wave function^25 of a many-body system of N particles.

The number $|\Psi(\mathbf{r}^N,t)|^2 d\mathbf{r}^N$ denotes the probability of finding particle i between $\mathbf{r}_i$ and $\mathbf{r}_i+d\mathbf{r}_i$ for all i = 1, 2, . . . , N. The corresponding ket will be denoted^26 |Ψ). It is a vector in the Hilbert space of the N particle system. In the present context we call |Ψ) a pure state of the system. All states that you have encountered in an introductory course on quantum mechanics were pure states. Ψ need not be an eigenstate of any operator whatsoever. The number N may be very large, for example of the order of 10²³. A macroscopic system is described by a wave function in the same way as a microscopic one.

The Hamiltonian is an operator in the space of pure states. In one of the simplest cases it might be of the form
\[
H = \frac{1}{2m}\sum_{i=1}^{N}\mathbf{p}_i^2 + \sum_{1\le i<i'\le N} U(|\mathbf{r}_i-\mathbf{r}_{i'}|),
\qquad \mathbf{p}_i = \frac{\hbar}{i}\,\frac{\partial}{\partial\mathbf{r}_i}\,. \tag{2}
\]
The quantum mechanical scalar product is
\[
(\Psi_1|\Psi_2) = \int d\mathbf{r}^N\,\Psi_1^*(\mathbf{r}^N)\,\Psi_2(\mathbf{r}^N) \tag{3}
\]

25 If needed, the spin is easily incorporated in the same formalism.
26 We avoid the notation |Ψ⟩ since we are using angular brackets to define ensemble averages.


and the normalization condition

\[
(\Psi|\Psi) = \int d\mathbf{r}^N\,|\Psi(\mathbf{r}^N,t)|^2 = 1. \tag{4}
\]
Let n be an arbitrary index (no need to specify whether it is discrete or continuous, or partly both; it may take an infinity of values) and let $\phi_n(\mathbf{r}^N)$ be a time independent orthonormal basis, that is,
\[
(\phi_m|\phi_n) = \delta_{mn}\,. \tag{5}
\]
An arbitrary pure state is a superposition of the $\phi_n$ with time dependent coefficients $c_n(t)$,
\[
\big|\Psi(t)\big) = \sum_n c_n(t)\,\big|\phi_n\big), \tag{6}
\]
where $c_n(t) = \big(\phi_n\big|\Psi(t)\big)$ and where for convenience we have suppressed the spatial coordinates $\mathbf{r}^N$. The normalization (4) of Ψ(t) implies that
\[
\sum_n |c_n(t)|^2 = 1. \tag{7}
\]
The time evolution of the $c_n(t)$ is determined by the Schrödinger equation (1). Upon substituting (6) in (1) and then left-multiplying by $(\phi_n|$ one obtains

\[
i\hbar\,\frac{dc_n(t)}{dt} = \sum_m H_{nm}\,c_m(t) \tag{8}
\]
in which
\[
H_{nm} = (\phi_n|H|\phi_m) \tag{9}
\]
is the matrix representation of the Hamiltonian in the basis $\phi_n(\mathbf{r}^N)$.

Let A be an operator corresponding to a physical observable. Quantum mechanics tells us that $\bar A \equiv (\Psi|A|\Psi)$ is the expectation value of A that will be observed when the system is in pure state Ψ.

At this point we will choose the φn to be eigenstates of A,

Aφn = Anφn , (10)

where $A_n$ is one of the eigenvalues of A. It then follows that in the pure state Ψ(t) the expectation value of A may be expressed as
\[
\bar A(t) = \big(\Psi(t)\big|A\big|\Psi(t)\big) = \sum_n\sum_m A_m\,c_n^*(t)\,c_m(t)\,(\phi_n|\phi_m)
= \sum_n A_n\,|c_n(t)|^2. \tag{11}
\]


Here the $A_n$ are the possible outcomes of a measurement of A performed on the system and $|c_n(t)|^2$ is the quantum mechanical probability for the system to be in eigenstate n.

VI.2. The density operator / density matrix

Suppose now we do not know the exact pure state of the system. Suppose all we know is that it is in one of a set of pure states $\Psi_k$, where k is an index. In analogy to what precedes we set
\[
\Psi_k(t) = \sum_n c_n^k(t)\,\phi_n \tag{12}
\]
and will write the expectation value of a physical observable A in pure state $\Psi_k$ as
\[
\bar A^{(k)} = (\Psi_k|A|\Psi_k) = \sum_n |c_n^k(t)|^2\,A_n\,. \tag{13}
\]

Let now $P_k$ be the (classical = nonquantum) probability that the system is in $\Psi_k$. It will be convenient to work with unnormalized weights $p_k$ from which the $P_k$ may be derived according to
\[
P_k = \frac{p_k}{Z}\,, \qquad Z = \sum_k p_k\,. \tag{14}
\]
The collection of weights $p_k$ constitutes again an ensemble; it is an ensemble of pure states. The expectation value (the average) of observable A in this ensemble will be denoted by 〈A〉. It is obviously given by
\[
\langle A\rangle = \sum_k P_k\,\bar A^{(k)}. \tag{15}
\]
This formula expresses the essence of what we want to convey in this chapter. However, we are now going to rewrite it in several other useful ways. We have

\[
\langle A\rangle = Z^{-1}\sum_k p_k\,\bar A^{(k)} = Z^{-1}\sum_k p_k\,(\Psi_k|A|\Psi_k). \tag{16}
\]
Upon inserting the complete basis of states $\sum_n |\phi_n)(\phi_n|$ before and after the operator A we get
\[
\langle A\rangle = Z^{-1}\sum_n\sum_m \underbrace{\sum_k(\phi_m|\Psi_k)\,p_k\,(\Psi_k|\phi_n)}_{\rho_{mn}}\;\underbrace{(\phi_n|A|\phi_m)}_{A_{nm}} \tag{17}
\]
in which $\rho_{mn}$ is the mn matrix element of the density operator ρ defined by
\[
\rho = \sum_k |\Psi_k)\,p_k\,(\Psi_k|. \tag{18}
\]


Hence

\[
\langle A\rangle = \frac{1}{Z}\sum_n\sum_m \rho_{nm}A_{mn}
= \frac{1}{Z}\,\mathrm{tr}\,\rho A
= \frac{\mathrm{tr}\,\rho A}{\mathrm{tr}\,\rho}\,. \tag{19}
\]
Remarks.
1. The factor 1/Z is sometimes incorporated in the definition of ρ. In that case one has tr ρ = 1 and 〈A〉 = tr ρA.
2. The density operator ρ is an operator in Hilbert space in the same way as H, A, B, . . . It is time dependent (why?). As any operator, it may be expressed as a matrix once a basis has been chosen.
3. The “collection of pure states” with which we started is also called a statistical mixture of pure states or a mixed state. One might be tempted to speak of an “ensemble of pure states”; however, it is the density operator ρ, rather than this collection of pure states, that is the counterpart of the classical ensembles. In particular, ρ determines all the expectation values of the physical observables.
4. Different statistical mixtures may lead to the same density operator.

VI.3. The von Neumann equation

The time dependence of the density operator ρ is determined by the Schrödinger equation, and we will now establish how.

We have
\[
\rho_{mn}(t) = \sum_k p_k\,c_m^{k*}(t)\,c_n^{k}(t). \tag{20}
\]
Employing Eq. (8) for the time evolution of the $c_n(t)$ and its complex conjugate we obtain
\[
i\hbar\,\frac{d}{dt}\rho_{mn}(t) = \ldots
= \sum_\ell\big(H_{m\ell}\,\rho_{\ell n}(t) - \rho_{m\ell}(t)\,H_{\ell n}\big) \tag{21}
\]
or, in operator notation,
\[
i\hbar\,\frac{d\rho}{dt} = \big[H,\rho\big]. \tag{22}
\]
Eq. (22) is called the von Neumann or Liouville–von Neumann equation; it is the quantum counterpart of the classical Liouville equation.

One advantage of the formulas where the density operator appears is that it is not necessary to specify a basis. For example, tr ρ is a scalar whose value is independent of the basis selected to evaluate it.


VI.4. Stationary density operators

A density operator is stationary if dρ/dt = 0, that is, if
\[
\big[H,\rho\big] = 0. \tag{23}
\]
A sufficient condition for (23) to hold is that we have ρ = f(H), where f is an arbitrary function. In that case ρ is diagonal in the basis of eigenfunctions of H. Let us denote these by $\psi_n$ and their eigenvalues by $E_n$,

Hψn = Enψn . (24)

Using these as a basis we find

\[
Z = \mathrm{tr}\,\rho = \sum_n(\psi_n|\rho|\psi_n) = \sum_n\rho_{nn} \tag{25}
\]
and
\[
\langle A\rangle = \frac{\mathrm{tr}\,\rho A}{\mathrm{tr}\,\rho}
= \frac{\sum_n(\psi_n|\rho A|\psi_n)}{\sum_n(\psi_n|\rho|\psi_n)}
= \frac{\sum_n\rho_{nn}(\psi_n|A|\psi_n)}{\sum_n\rho_{nn}}\,. \tag{26}
\]

Interpretation

In thermal equilibrium the expectation value of an observable A of a quantum system is obtained by
(1) calculating it in an eigenstate $\psi_n$ of the system Hamiltonian;
(2) averaging the result with weight $\rho_{nn}$ over all eigenstates.

Different choices of f lead to different weights $\rho_{nn}$. These $\rho_{nn}$ are the analogs of the classical ensembles. In several exercises we have already carried out steps (1) and (2). They appear here as a logical consequence within the framework of the density operator formalism.

The stationary ensembles usually considered in quantum mechanics are the same as those used in classical mechanics. In particular
\[
\rho_{mn} = \begin{cases}\delta_{mn} & \text{if } E < E_n < E+\delta E,\\ 0 & \text{otherwise,}\end{cases}
\qquad\text{microcanonical ensemble,}
\]
\[
\rho_{mn} = \delta_{mn}\,e^{-\beta E_n}, \qquad\text{canonical ensemble.} \tag{27}
\]

In the latter case one may also write

ρ = e−βH. (28)

In this formalism the canonical partition function and the free energy become

\[
Z_c = \mathrm{tr}\,e^{-\beta H} = \sum_n e^{-\beta E_n} \tag{29}
\]


and
\[
F_c = -k_BT\log Z_c = -k_BT\log\,\mathrm{tr}\,e^{-\beta H}, \tag{30}
\]
respectively, which are relations already encountered before.

Let now $H_N$ be the Hamiltonian of an N particle system and let $\rho_N = \exp(-\beta H_N)$.

Then the grand-canonical partition function is

\[
Z_g(\mu,V,T) = \sum_{N=0}^{\infty} e^{\beta\mu N}\,Z_c(T,V,N)
= \sum_{N=0}^{\infty} e^{\beta\mu N}\,\mathrm{tr}\,\rho_N
= \sum_{N=0}^{\infty}\mathrm{tr}\,e^{\beta(\mu N - H_N)}. \tag{31}
\]
In earlier chapters we have calculated quantum mechanical partition functions for a number of simple systems (spins, harmonic oscillators). The present formalism adds nothing new to that; it simply confirms that we did it the correct way.

Remark. From the above it is clear that in equilibrium statistical mechanics the off-diagonal elements ($\rho_{mn}$ with m ≠ n) of the density matrix play no role. They do play a role, however, in time-dependent phenomena, and in such cases the present formalism is the indispensable starting point.

Example. We consider a spin ½ whose Hilbert space is spanned by the two kets |+〉 and |−〉. An arbitrary state may be written as α|+〉 + β|−〉, where α and β are complex coefficients. The probability distribution $p_k$ discussed above becomes, in this example, a distribution p(α, β). [To be completed.]
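[As a small numerical illustration, not part of the original text: the sketch below builds the canonical density operator ρ = e^{−βH} for a single spin ½ and evaluates 〈A〉 = tr ρA / tr ρ as in Eqs. (19) and (26)–(28). The Zeeman Hamiltonian H = −µ_B B σ_z and the field value are assumptions chosen only for illustration.]

```python
import numpy as np
from scipy.linalg import expm

kB  = 1.380649e-23   # J/K
muB = 9.274e-24      # Bohr magneton, J/T

# Pauli matrices in the basis {|+>, |->}
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
sx = np.array([[0.0, 1.0], [1.0, 0.0]])

def average(A, H, T):
    """<A> = tr(rho A)/tr(rho) with rho = exp(-beta H), cf. Eqs. (19), (28)."""
    rho = expm(-H / (kB * T))
    return (np.trace(rho @ A) / np.trace(rho)).real

B = 1.0                 # magnetic field in tesla (illustrative)
H = -muB * B * sz       # assumed Zeeman Hamiltonian

for T in (0.5, 1.0, 5.0):
    m = average(sz, H, T)
    print(f"T = {T:4.1f} K   <sigma_z> = {m:+.4f}   "
          f"tanh(muB*B/kB*T) = {np.tanh(muB*B/(kB*T)):+.4f}")
    # <sigma_x> vanishes here because rho is diagonal in the sigma_z basis
```

The printed averages reproduce the familiar result 〈σ_z〉 = tanh(βµ_B B), which is the check that the trace formula does what steps (1) and (2) above prescribe.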


VII. IDEAL QUANTUM GASES

By an ideal gas we mean a system of particles without interaction. The following observations, made earlier for a classical system, remain true in quantum mechanics.
1. An ideal gas which is not at equilibrium will never be able to reach equilibrium. This is because equilibration is due to the exchange of energy and momentum between the particles.
2. Ideal gases do not exist: there are always interactions, even if they may be weak, that guarantee that a real gas will tend to equilibrium. Nevertheless, for several problems of interest in physics an ideal gas may be a good approximation to a real gas.

VII.1. The ideal gas Hamiltonian and its eigenfunctions

Let us consider N identical noninteracting particles of spin S enclosed in a cubic volume $V = L^3$. We will write $\mathbf{r}_j = (x_j,y_j,z_j) \in V$ for the position of the jth particle and $s_j = -S, -S+1, \ldots, S$ for its spin state. The N particle Hamiltonian is spin independent and given by
\[
H = \frac{1}{2m}\sum_{j=1}^{N}\mathbf{p}_j^2
= -\frac{\hbar^2}{2m}\sum_{j=1}^{N}\left(\frac{\partial^2}{\partial x_j^2}+\frac{\partial^2}{\partial y_j^2}+\frac{\partial^2}{\partial z_j^2}\right)
= \sum_{j=1}^{N} H_j\,, \tag{1}
\]
in which the $H_j$ are single-particle Hamiltonians. We adopt periodic boundary conditions, that is, we identify the positions $\mathbf{r}$ and $\mathbf{r} + (m_1,m_2,m_3)L$ where $m_1, m_2$, and $m_3$ are arbitrary integers.

The single-particle eigenkets and eigenvalues of $H_j$ depend on a wavevector $\mathbf{k}$ and a spin index σ and are given by
\[
|\mathbf{k}\sigma) \tag{2}
\]
and
\[
\varepsilon_{\mathbf{k}\sigma} = \frac{\hbar^2k^2}{2m}\,, \tag{3}
\]
respectively, in which
\[
\mathbf{k} = \frac{2\pi}{L}\,\mathbf{n} = \frac{2\pi}{L}(n_x,n_y,n_z) \tag{4}
\]

where nx, ny, and nz are arbitrary integers, and

σ = −S,−S + 1, . . . , S. (5)

The |kσ) are orthonormal; they are a special choice for the single-particle basis states that were called |λ) in your Quantum Mechanics course. As a consequence, the labels correspond to single-particle energies. Note that the energy $\varepsilon_{\mathbf{k}\sigma}$ is spin independent here. In position-spin representation we may write
\[
\psi_{\mathbf{k}\sigma}(\mathbf{r}_j,s_j) = (\mathbf{r}_js_j|\mathbf{k}\sigma) = \frac{1}{\sqrt V}\,e^{i\mathbf{k}\cdot\mathbf{r}_j}\,\delta_{\sigma,s_j}\,. \tag{6}
\]
Suppose now that from the infinite set of possible labels (kσ) we select N specific ones, namely $(\mathbf{k}_1,\sigma_1), (\mathbf{k}_2,\sigma_2), \ldots, (\mathbf{k}_N,\sigma_N)$. The N particle product states that may be constructed from the single-particle states with these labels are
\[
\Psi_{\mathbf{k}_1\sigma_1\ldots\mathbf{k}_N\sigma_N}(\mathbf{r}_1,s_1,\ldots,\mathbf{r}_N,s_N)
= \psi_{\mathbf{k}_1\sigma_1}(\mathbf{r}_1,s_1)\,\psi_{\mathbf{k}_2\sigma_2}(\mathbf{r}_2,s_2)\cdots\psi_{\mathbf{k}_N\sigma_N}(\mathbf{r}_N,s_N) \tag{7}
\]
in which $\mathbf{k}_j$ is the wavevector of particle j. They are also eigenfunctions of H with eigenvalues
\[
E_{\mathbf{k}_1\sigma_1\ldots\mathbf{k}_N\sigma_N} = \varepsilon_{\mathbf{k}_1\sigma_1} + \varepsilon_{\mathbf{k}_2\sigma_2} + \ldots + \varepsilon_{\mathbf{k}_N\sigma_N}\,. \tag{8}
\]

Even though every function of type (7) is mathematically an eigenfunction of H, not every such function is allowed to describe a physical system. Two cases have to be distinguished.

I. Bosons. The particles are bosons (have an integer spin). In that case, out of all mathematically possible product states (7) with the selected labels the only allowable one is the symmetric linear combination
\[
\Psi^{\rm symm}_{\mathbf{k}_1\sigma_1\ldots\mathbf{k}_N\sigma_N}
= \sum_P \psi_{\mathbf{k}_1\sigma_1}(\mathbf{r}_{P1},s_{P1})\,\psi_{\mathbf{k}_2\sigma_2}(\mathbf{r}_{P2},s_{P2})\cdots\psi_{\mathbf{k}_N\sigma_N}(\mathbf{r}_{PN},s_{PN}), \tag{9}
\]
where P is a permutation of the indices 1, 2, . . . , N and where we ignore a normalization factor which in the present discussion is of no importance. The lower index of $\Psi^{\rm symm}$ is a collection of N labels $(\mathbf{k}_j,\sigma_j)$ that need not be all different. Note that a permutation of two labels does not give a new state. The particles are said to obey Bose-Einstein (BE) statistics.

II. Fermions. The particles are fermions (have half-integer spin). In that case, out of all mathematically possible product states (7) with the selected labels the only allowable one is the antisymmetric linear combination
\[
\Psi^{\rm antisymm}_{\mathbf{k}_1\sigma_1\ldots\mathbf{k}_N\sigma_N}
= \sum_P (-1)^P\,\psi_{\mathbf{k}_1\sigma_1}(\mathbf{r}_{P1},s_{P1})\,\psi_{\mathbf{k}_2\sigma_2}(\mathbf{r}_{P2},s_{P2})\cdots\psi_{\mathbf{k}_N\sigma_N}(\mathbf{r}_{PN},s_{PN})
= \begin{vmatrix}
\psi_{\mathbf{k}_1\sigma_1}(\mathbf{r}_1,s_1) & \ldots & \psi_{\mathbf{k}_N\sigma_N}(\mathbf{r}_1,s_1)\\
\psi_{\mathbf{k}_1\sigma_1}(\mathbf{r}_2,s_2) & \ldots & \psi_{\mathbf{k}_N\sigma_N}(\mathbf{r}_2,s_2)\\
\vdots & & \vdots\\
\psi_{\mathbf{k}_1\sigma_1}(\mathbf{r}_N,s_N) & \ldots & \psi_{\mathbf{k}_N\sigma_N}(\mathbf{r}_N,s_N)
\end{vmatrix} \tag{10}
\]
in which $(-1)^P$ stands for the signature of the permutation, and where the second line is a rewriting of the first one as a Slater determinant. In order for this wavefunction $\Psi^{\rm antisymm}$ to be nonzero its lower index must be a collection of labels $(\mathbf{k}_j,\sigma_j)$ that are all different. This is called the Pauli exclusion principle: no two particles can be in the same state. The particles are said to obey Fermi-Dirac (FD) statistics.

VII.2. Characterization with the aid of occupation numbers

The wavevectors $\mathbf{k}$ constitute an infinite three-dimensional lattice each of whose sites represents 2S + 1 single-particle states. A microstate of the quantum N particle system may be characterized
• either by listing the N pairs $(\mathbf{k}_j,\sigma_j)$;
• or by specifying for each single-particle state (k, σ) the number $n_{\mathbf{k}\sigma}$ of particles in that state; we call $n_{\mathbf{k}\sigma}$ the occupation number.
In the latter case the list of occupation numbers is infinite and such that
\[
\sum_{\mathbf{k},\sigma} n_{\mathbf{k}\sigma} = N \tag{11}
\]
in which we have the possible values
\[
\text{for BE: } n_{\mathbf{k}\sigma} = 0,1,2,\ldots, \qquad \text{for FD: } n_{\mathbf{k}\sigma} = 0,1. \tag{12}
\]

This alternative but equivalent representation will greatly facilitate the calculations below.

VII.3. Important remark

One often encounters quantum systems of N independent particles which are not described by the ideal gas Hamiltonian (1). Examples are
• N independent particles in an external potential;
• N interacting particles whose Hamiltonian, after diagonalization, takes the form of a sum of independent single-particle Hamiltonians. These are then called quasi-particles.
What changes in such cases? The only changes are
(i) the label (k, σ) is replaced with a more general label λ that contains all relevant quantum numbers. The plane waves $\psi_{\mathbf{k}\sigma}$ are replaced with more general wave functions $\psi_\lambda$, but as we have seen the nature of the eigenfunctions of the Hamiltonian does not have any consequences for the thermodynamics;
(ii) the expression $\varepsilon_{\mathbf{k}\sigma} = \hbar^2k^2/2m$, which is characteristic of free particles, is replaced with a more general expression $\varepsilon_\lambda$.

VII.4. Partition function of a system of independent identical quantum particles

VII.4.1. Arbitrary single-particle energies

We are looking for the quantum analog of the classical (= nonquantum) equation of state $pV = Nk_BT$. We summarize the relevant formulas. The canonical partition function of a system of N independent quantum particles is

\[
Z_c(T,V,N) = \sum^{\rm BE,FD}_{\mathbf{k}_1\sigma_1,\ldots,\mathbf{k}_N\sigma_N} e^{-\beta E_{\mathbf{k}_1\sigma_1,\ldots,\mathbf{k}_N\sigma_N}}, \tag{13}
\]
in which the summation must take into account the particle statistics, either BE or FD. The total energy of the N particle system is given by
\[
E_{\mathbf{k}_1\sigma_1,\ldots,\mathbf{k}_N\sigma_N} = \sum_{j=1}^{N}\varepsilon_{\mathbf{k}_j\sigma_j} \quad\text{[which is (8)]}\quad
= \sum_{\mathbf{k},\sigma} n_{\mathbf{k}\sigma}\,\varepsilon_{\mathbf{k}\sigma}\,. \tag{14}
\]

In (13) we may alternatively sum on the nkσ, which gives

\[
Z_c(T,V,N) = \sum^{\rm BE,FD}_{\{n_{\mathbf{k}\sigma}\}}\delta_{N,\,\sum_{\mathbf{k}''\sigma''} n_{\mathbf{k}''\sigma''}}\;
e^{-\beta\sum_{\mathbf{k}'\sigma'} n_{\mathbf{k}'\sigma'}\varepsilon_{\mathbf{k}'\sigma'}}. \tag{15}
\]
The delta function constraint in this sum is inconvenient. A way to lift it is to pass to the grand-canonical partition function,

\[
Z_g(T,V,\mu) = \sum_{N=0}^{\infty} e^{\beta\mu N} Z_c(N,V,T)
= \sum^{\rm BE,FD}_{\{n_{\mathbf{k}\sigma}\}} e^{\beta\sum_{\mathbf{k}',\sigma'}(\mu-\varepsilon_{\mathbf{k}'\sigma'})\,n_{\mathbf{k}'\sigma'}}
= \prod_{\mathbf{k},\sigma}\underbrace{\left[\sum^{\rm BE,FD}_{n_{\mathbf{k}\sigma}} e^{\beta(\mu-\varepsilon_{\mathbf{k}\sigma})\,n_{\mathbf{k}\sigma}}\right]}_{z_{\mathbf{k}\sigma}}. \tag{16}
\]
The third line of this equation makes $Z_g$ appear as the product of grand-canonical partition functions $z_{\mathbf{k}\sigma}$ of a single energy level $\varepsilon_{\mathbf{k}\sigma}$ that may be populated by a variable number $n_{\mathbf{k}\sigma}$ of particles.

Remark. A simpler example of the mathematical operation that has taken place above is the following,
\[
\sum_{n_1,n_2} e^{\sum_{j=1,2} a_j n_j} = \prod_{i=1,2}\left[\sum_{n_i} e^{a_i n_i}\right].
\]
Check this.
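[A quick numerical check of this factorization, not part of the original text; the values of $a_1, a_2$ and the truncation of the sums at $n_{\max}$ are assumptions made only for this finite test.]

```python
import numpy as np
from itertools import product

a = np.array([-0.7, -1.3])   # arbitrary illustrative values of a_1, a_2 (negative so the sums converge)
n_max = 60                   # truncation of each occupation number

lhs = sum(np.exp(a[0]*n1 + a[1]*n2)
          for n1, n2 in product(range(n_max + 1), repeat=2))
rhs = np.prod([sum(np.exp(aj*n) for n in range(n_max + 1)) for aj in a])

print(lhs, rhs)   # the double sum and the product of single sums agree to machine precision
```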

We now need to specify whether the particles are bosons (BE) or fermions (FD). This leads to the two distinct explicit expressions
\[
Z^{\rm BE}_g(T,V,\mu) = \prod_{\mathbf{k},\sigma}\left[1 - e^{\beta(\mu-\varepsilon_{\mathbf{k}\sigma})}\right]^{-1},
\qquad
Z^{\rm FD}_g(T,V,\mu) = \prod_{\mathbf{k},\sigma}\left[1 + e^{\beta(\mu-\varepsilon_{\mathbf{k}\sigma})}\right]. \tag{17}
\]


We will suppose for simplicity that the single-particle energies do not depend on the spin variable and write G = 2S + 1 for the spin degeneracy. We then have from (17)
\[
\log Z^{\rm BE,FD}_g(T,V,\mu) = \mp\,G\sum_{\mathbf{k}}\log\!\left[1 \mp e^{\beta(\mu-\varepsilon_{\mathbf{k}})}\right]. \tag{18}
\]
In the limit V → ∞ the wavevectors in k space get closer and closer and we may replace the sum in (18) by an integral according to
\[
\sum_{\mathbf{k}} \;\to\; \frac{V}{(2\pi)^3}\int_{\mathbb{R}^3} d\mathbf{k}\,. \tag{19}
\]

Recall that (2π)3/V is the volume of the cubic unit cell of the lattice in k space.

VII.4.2. Density expansion for an ideal gas of free particles

So far everything is valid for arbitrary single-particle energies $\varepsilon_{\mathbf{k}\sigma}$. We will now restrict ourselves to expression (3), valid for an ideal gas of free particles. Upon substituting (3) in (18) and using (19) we get
\[
\log Z^{\rm BE,FD}_g(T,V,\mu) = \mp\,G\,\frac{V}{8\pi^3}\int d\mathbf{k}\,\log\!\left[1 \mp e^{\beta\mu-\beta\hbar^2k^2/2m}\right]. \tag{20}
\]
We now use in (20) the identity
\[
\mp\log(1 \mp e^{x}) = \sum_{\ell=1}^{\infty}(\pm)^{\ell-1}\,\frac{e^{x\ell}}{\ell}\,, \qquad x<0, \tag{21}
\]

and obtain

\[
\log Z^{\rm BE,FD}_g(T,V,\mu) = G\,\frac{V}{8\pi^3}\sum_{\ell=1}^{\infty}(\pm)^{\ell-1}\,\frac{e^{\beta\mu\ell}}{\ell}\int d\mathbf{k}\;e^{-\beta\ell\hbar^2k^2/2m}
= G\,\frac{V}{\lambda^3}\sum_{\ell=1}^{\infty}(\pm)^{\ell-1}\,\frac{e^{\beta\mu\ell}}{\ell^{5/2}}\,, \qquad \mu<0, \tag{22}
\]
in which
\[
\beta = \frac{1}{k_BT}\,, \qquad \lambda = \frac{h}{\sqrt{2\pi mk_BT}}\,. \tag{23}
\]
Recall that λ is the de Broglie wavelength of a typical particle of mass m in a gas at temperature T. Thermodynamics tells us that the pressure p of a gas is related to the grand-canonical partition function by $\log Z_g = \beta pV$. Hence (22) leads us to
\[
p^{\rm BE,FD}(\mu,T) = G\,\frac{k_BT}{\lambda^3}\sum_{\ell=1}^{\infty}(\pm)^{\ell-1}\,\frac{e^{\beta\mu\ell}}{\ell^{5/2}}\,. \tag{24}
\]


The chemical potential cannot be measured directly and for that reason we rather express p in terms of the density,
\[
n = \frac{\langle N\rangle_g}{V} = \left(\frac{\partial p}{\partial\mu}\right)_{V,T}. \tag{25}
\]
From (24) and (25) we get
\[
n = \frac{G}{\lambda^3}\sum_{\ell=1}^{\infty}(\pm)^{\ell-1}\,\frac{e^{\beta\mu\ell}}{\ell^{3/2}}\,. \tag{26}
\]
We solve (26) for $e^{\beta\mu}$ and substitute the result in (24). From here on we make an expansion for low density n, useful only (as we will see in a moment) when the quantum effects are weak. To this end we set
\[
e^{\beta\mu} = a_1 n + a_2 n^2 + \ldots \tag{27}
\]

and find the coefficients ai successively,

\[
a_1 = \frac{\lambda^3}{G}\,,\qquad
a_2 = \mp\,\frac{a_1^2}{2^{3/2}}\,,\qquad
a_3 = \left(\frac14-\frac{1}{3^{3/2}}\right)a_1^3\,, \tag{28}
\]

and so on.
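[A symbolic sketch of this inversion, not part of the original text: it reverts the series (26) to recover the coefficients of (27)–(28). The reduced variables $x = e^{\beta\mu}$, $y = n\lambda^3/G$ and the coefficient names are our own; the BE sign is taken, the FD case follows by flipping the signs.]

```python
import sympy as sp

x, y = sp.symbols('x y')
pm = +1   # +1 for BE, -1 for FD

# From Eq. (26):  y = x + (pm) x^2 / 2^{3/2} + x^3 / 3^{3/2} + O(x^4)
series_y = x + pm*x**2/2**sp.Rational(3, 2) + x**3/3**sp.Rational(3, 2)

# Invert the series: write x = c1 y + c2 y^2 + c3 y^3 and solve order by order in y.
c1, c2, c3 = sp.symbols('c1 c2 c3')
x_of_y = c1*y + c2*y**2 + c3*y**3
eq = sp.series(series_y.subs(x, x_of_y) - y, y, 0, 4).removeO()
sols = sp.solve([eq.coeff(y, k) for k in (1, 2, 3)], [c1, c2, c3], dict=True)[0]
print(sols)
# -> c1 = 1, c2 = -1/2^{3/2}, c3 = 1/4 - 1/3^{3/2}  (BE sign),
#    i.e. a1 = lambda^3/G, a2 = -(+/-) a1^2/2^{3/2}, a3 = (1/4 - 3^{-3/2}) a1^3, as in (28).
```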

Substitution of (27)-(28) in (24) leads to

\[
p^{\rm BE,FD} = nk_BT\left[1 \mp 2^{-5/2}\,\frac{n\lambda^3}{G} + \left(\frac18-\frac{2}{3^{5/2}}\right)\left(\frac{n\lambda^3}{G}\right)^2 + \ldots\right]
= nk_BT\left[1 \mp 0.1768\,\frac{n\lambda^3}{G} - 0.0033\left(\frac{n\lambda^3}{G}\right)^2 + \ldots\right], \tag{29}
\]
which is the desired equation of state, albeit in the form of a series expansion. Several comments are in order now.

Exercise. Check the expressions for $a_1$, $a_2$, and $a_3$ given in Eq. (28). Check that you find (29).

Exercise. Show that $pV = \tfrac23 E$ for both bosons and fermions.

Exercise. Show that the density of single-particle energy levels ρ(ε) is given by $\rho(\varepsilon) = VA\,\varepsilon^{1/2}$ and find the expression for the constant A.
Answer: $A = \frac{G}{4\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}$.

VII.5. Comments


(1) The above expansion is analogous to the virial expansion.^27 Even though the particles have no interaction, the effects of quantum statistics are those of an interaction.

(2) The quantum effects are important when $n\lambda^3 \gtrsim 1$ or, equivalently,
\[
\frac{\lambda^3}{v} \gtrsim 1, \tag{30}
\]
where $v \equiv 1/n$ is the volume per particle. This condition may be interpreted as follows. A single particle has quantum behavior on length scales less than its thermal de Broglie wavelength λ. The other particles that notice this quantum character are those within the volume λ³, and this number is equal to $\lambda^3/v = n\lambda^3$.

(3) The sign ∓ is such that the ‘−’ is for bosons and the ‘+’ for fermions. The Pauli principle says (we simplify) that fermions ‘repel’ each other. The pressure of a gas of fermions is therefore larger than that of a classical gas. Bosons, on the contrary, may ‘pile up’ in the same single-particle state without limit. The pressure of a gas of bosons is therefore less than that of a classical gas.

(4) One sometimes says that classical particles obey Maxwell-Boltzmann (MB) statistics. This is the statistics that prevails when $\langle n_{\mathbf{k}\sigma}\rangle \ll 1$.

(5) From what precedes it is clear that quantum effects may be enhanced in the following ways.
(i) By lowering the temperature T. Usually you have to be as low as a few degrees Kelvin or less.
(ii) By increasing the particle density n. This means considering solids or liquids rather than gases.
(iii) By working with particles of light mass m, such as electrons or He atoms.

Exercise. The quantum effects disappear also in the limit G → ∞. Interpret this result.

Terminology. When the quantum effects are dominant, one speaks of a strongly degenerate quantum system. The above expansion in powers of $n\lambda^3$ is useful for weakly degenerate quantum systems.
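[To develop a feeling for the degeneracy parameter $n\lambda^3$, here is a small numerical sketch, not part of the original text. The two sets of inputs are ballpark assumptions (a dilute noble gas at room conditions and a cold, dilute atomic cloud), not data taken from these notes.]

```python
import numpy as np

h  = 6.62607015e-34   # J s
kB = 1.380649e-23     # J/K
u  = 1.66053907e-27   # atomic mass unit, kg

def degeneracy_parameter(n, m, T):
    """n * lambda^3 with lambda = h / sqrt(2 pi m kB T), Eq. (23)."""
    lam = h / np.sqrt(2.0 * np.pi * m * kB * T)
    return n * lam**3

cases = [  # (label, number density [1/m^3], mass [kg], temperature [K]) -- illustrative values
    ("argon gas, 300 K, 1 atm",   2.45e25, 40.0 * u, 300.0),
    ("rubidium-87 cloud, 100 nK", 1.0e19,  87.0 * u, 1e-7),
]
for label, n, m, T in cases:
    print(f"{label:28s}  n*lambda^3 = {degeneracy_parameter(n, m, T):.3e}")
```

The first case comes out many orders of magnitude below 1 (weakly degenerate, essentially classical); the second is of order unity (strongly degenerate), which is the regime of the following chapters.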

VII.6. Averages of occupation numbers

We return here to Eqs. (18) and (19), which is the point where we were before doing the low density expansion. In the grand-canonical ensemble the average total number of particles is given by
\[
\langle N\rangle_g = \frac{1}{\beta}\,\frac{\partial\log Z^{\rm BE,FD}_g}{\partial\mu}\,. \tag{31}
\]
More in detail one may ask about the average number $\langle n_{\mathbf{k}\sigma}\rangle$ of particles in a specific single-particle state (k, σ). Since (16) tells us that $\log Z_g = \sum_{\mathbf{k},\sigma}\log z_{\mathbf{k}\sigma}$, this average is

27 Compare Eqs. (24) and (26) above to Eqs. (35) and (36) of Chapter V.5. In both cases the latter equation has an extra factor ℓ in the summand.


Figure 9: Average occupation number of a single-particle level of energy εkσ in Bose-Einstein, Fermi-Dirac, and Maxwell-Boltzmann statistics.

obtained as

\[
\langle n_{\mathbf{k}\sigma}\rangle = \frac{1}{\beta}\,\frac{\partial\log z_{\mathbf{k}\sigma}}{\partial\mu}\,, \tag{32}
\]
whence
\[
\langle n_{\mathbf{k}\sigma}\rangle^{\rm BE,FD} = \frac{e^{\beta(\mu-\varepsilon_{\mathbf{k}\sigma})}}{1 \mp e^{\beta(\mu-\varepsilon_{\mathbf{k}\sigma})}}\,. \tag{33}
\]
Notice that in (32) we are dealing with the grand-canonical partition function $z_{\mathbf{k}\sigma}$ of one specific single-particle energy level.

Expressions (33) are the Bose-Einstein and Fermi-Dirac distributions for independent particles. We recall that µ < 0 so that also $\beta(\mu-\varepsilon_{\mathbf{k}\sigma}) < 0$ for all (k, σ). The total particle density in the system is obtained as 1/V times the sum of the particle numbers in each state, that is,

\[
n = \frac{1}{V}\sum_{\mathbf{k},\sigma}\langle n_{\mathbf{k}\sigma}\rangle^{\rm BE,FD}
= \frac{G}{8\pi^3}\int d\mathbf{k}\;\frac{e^{\beta(\mu-\varepsilon_{\mathbf{k}})}}{1 \mp e^{\beta(\mu-\varepsilon_{\mathbf{k}})}}\,, \tag{34}
\]
which of course follows more directly if you substitute (20) in (31) and carry out the differentiation with respect to µ.
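[A small numerical illustration of Eq. (33) and of figure 9, not part of the original text; the choice µ = −2 k_BT is an arbitrary assumption made only for the example.]

```python
import numpy as np

kB = 1.380649e-23  # J/K

def occupation(eps, mu, T, statistics):
    """Average occupation of a level of energy eps, Eq. (33);
    'MB' is the classical Maxwell-Boltzmann limit exp(beta(mu - eps))."""
    x = (mu - eps) / (kB * T)
    if statistics == "BE":
        return np.exp(x) / (1.0 - np.exp(x))
    if statistics == "FD":
        return np.exp(x) / (1.0 + np.exp(x))
    return np.exp(x)          # MB

T = 1.0                       # work in reduced units: energies measured in kB*T
mu = -2.0 * kB * T            # assumed chemical potential
for eps_over_kT in (0.0, 1.0, 3.0):
    eps = eps_over_kT * kB * T
    be, fd, mb = (occupation(eps, mu, T, s) for s in ("BE", "FD", "MB"))
    print(f"eps = {eps_over_kT} kBT :  BE = {be:.4f}  FD = {fd:.4f}  MB = {mb:.4f}")
# As in figure 9: BE > MB > FD, and the three merge when beta(mu - eps) is very negative.
```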


Exercise. Say how the classical limit of Eq. (33) may be taken and show that this expression then becomes the Maxwell-Boltzmann distribution.

Exercise. Calculate the first order correction term to the equation of state as given by (29) for a gas of argon at normal temperature and pressure. Compare your answer with the first order correction term for the same gas given by (V.28) and discuss the difference between the two expansions.


VIII. IDEAL FERMI GAS

VIII.1. Introduction

We will now be interested in the ideal Fermi gas (that is, composed of noninteracting particles) at temperatures T low enough so that it is strongly degenerate. Two questions, at least, arise:
(1) When may we consider that the temperature is sufficiently low?
(2) What is the ground state (the only state that counts when T → 0)?

The best example is the “gas” of conduction/valence electrons in a metal. Another example is a system of liquid ³He. The electron cloud around an atom of large atomic number may also be considered as a Fermi gas, but the Coulomb interactions are certainly not negligible, so this gas is not “ideal.”

Exercise. As we have seen, the inequality $n\lambda^3 \ll 1$ is the condition for quantum effects to be weak. Calculate $n\lambda^3$ for the electron gas in a metal at room temperature, assuming that there is one conduction electron per unit of volume equal to (3 Å)³. Does this gas behave quantum mechanically?

The exercise above shows that room temperature is a very low temperature indeed for the electron gas in a metal. As a consequence, a canonical ensemble at room temperature for this system will be dominated by states close to the ground state.

VIII.2. Ground state

We now consider a system (“gas”) of free and independent electrons^28 in a cubic volume $V = L^3$ with periodic boundary conditions. Hence $\varepsilon_{\mathbf{k}\sigma} = \varepsilon_{\mathbf{k}} = \hbar^2k^2/2m$ with $\mathbf{k} = 2\pi\mathbf{n}/L$ and $\mathbf{n} = (n_x,n_y,n_z)$ a three-dimensional vector of integers. Furthermore σ = ±1/2 whence G = 2.

I. Direct argument

The single particle wavevectors $\mathbf{k}$ constitute a cubic lattice in k space (“reciprocal space”) whose lattice distance is 2π/L; the reciprocal-space volume per lattice point is therefore $8\pi^3/V$. Each lattice point $\mathbf{k}$ may store one electron of spin σ = 1/2 and one electron of spin σ = −1/2. The relation $\varepsilon_{\mathbf{k}} = \mathrm{cst}$ defines a spherical surface in k space. If you successively add electrons to the system while keeping its energy as low as possible, you therefore fill successive spherical shells. We let $k_F$ stand for the radius of the shell when all N electrons have been added. A sphere of radius k contains
\[
\frac{4\pi}{3}k^3\cdot\frac{V}{8\pi^3} \tag{1}
\]
lattice points. In order to fill these with N electrons this number should be equal to N/2

28 The ions are considered as providing an immobile positive background that screens the Coulomb interactions between the electrons.


Figure 10: Single-particle states of an ideal gas. At high temperature only very few will be actually occupied by a particle.

Figure 11: The Fermi surface: here a circle.


Figure 12: Occupied and unoccupied states at low temperature.

(a factor 1/2 because of the spin) and hence

\[
n = \frac{N}{V} = \frac{k_F^3}{3\pi^2}\,, \tag{2}
\]
which when solved for $k_F$ leads to
\[
k_F = (3\pi^2 n)^{1/3}, \qquad\text{the Fermi wavenumber.} \tag{3}
\]
We therefore conclude that the ground state is given by the set of occupation numbers
\[
n_{\mathbf{k}\sigma} = \begin{cases} 1 & |\mathbf{k}| \le k_F\,,\\ 0 & |\mathbf{k}| > k_F\,.\end{cases} \tag{4}
\]
The spherical surface $k = k_F$ is called the Fermi surface. The last electron added has a single-particle energy
\[
\varepsilon_F \equiv \varepsilon_{k_F} = \frac{\hbar^2k_F^2}{2m}\,, \qquad\text{the Fermi energy,} \tag{5}
\]
and a velocity
\[
v_F = \frac{\hbar k_F}{m}\,, \qquad\text{the Fermi velocity.} \tag{6}
\]
The Fermi energy and velocity are characteristic of a temperature
\[
T_F = \frac{\varepsilon_F}{k_B}\,, \qquad\text{the Fermi temperature.} \tag{7}
\]

Example. Copper has $\varepsilon_F = 7.00$ eV, whence $T_F = 8.16\times10^4$ K. Furthermore it has $v_F = 1.57\times10^6$ m s⁻¹.
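[A sketch, not part of the original text, that reproduces these numbers from Eqs. (3) and (5)–(7); the conduction-electron density of copper used here, one electron per atom or $n \approx 8.5\times10^{28}$ m⁻³, is an assumed input.]

```python
import numpy as np

hbar = 1.054571817e-34    # J s
me   = 9.1093837015e-31   # kg
kB   = 1.380649e-23       # J/K
eV   = 1.602176634e-19    # J

def fermi_quantities(n):
    """Fermi wavenumber, energy, temperature and velocity for electron density n [1/m^3]."""
    kF = (3.0 * np.pi**2 * n) ** (1.0/3.0)       # Eq. (3)
    eF = hbar**2 * kF**2 / (2.0 * me)            # Eq. (5)
    return kF, eF, eF / kB, hbar * kF / me       # Eqs. (7) and (6)

n_Cu = 8.5e28   # assumed conduction-electron density of copper, m^-3
kF, eF, TF, vF = fermi_quantities(n_Cu)
print(f"kF = {kF:.3e} 1/m   eF = {eF/eV:.2f} eV   TF = {TF:.2e} K   vF = {vF:.2e} m/s")
# With this density one recovers eF close to the 7.0 eV quoted in the example above.
```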


Exercise. Show that the ground state energy $E_0$ of the ideal Fermi gas is given by
\[
E_0 = \tfrac35 N\varepsilon_F\,. \tag{8}
\]

Exercise. Show that the condition nλ3 ≈ 1 corresponds to T ≈ TF.

II. Derivation within the formalism

We return to Eq. (33). In the limit of low temperature we have

\[
\langle n_{\mathbf{k}\sigma}\rangle = \frac{1}{e^{-\beta(\mu-\varepsilon_{\mathbf{k}\sigma})}+1}
\;\xrightarrow[T\to0]{}\;
\begin{cases} 1 & \varepsilon_{\mathbf{k}\sigma} < \mu,\\ 0 & \varepsilon_{\mathbf{k}\sigma} > \mu.\end{cases} \tag{9}
\]
Using now that $\varepsilon_{\mathbf{k}\sigma} = \hbar^2k^2/2m$ we conclude that
\[
\text{all single-particle levels are }
\begin{cases}\text{occupied}\\ \text{empty}\end{cases}
\text{ for } |\mathbf{k}| \begin{matrix}<\\[-2pt]>\end{matrix}\; k_F \equiv \frac{\sqrt{2m\mu}}{\hbar}\,, \tag{10}
\]

in which we still have to determine µ. That may be done by imposing the total density of electrons, n = N/V, which in the limit T → 0 leads to
\[
n = \frac{2}{8\pi^3}\cdot 4\pi\int_0^{k_F} dk\,k^2 = \frac{k_F^3}{3\pi^2}\,. \tag{11}
\]
From this equation we obtain the same expression for $k_F$ and thence all further results of the “direct argument,” but moreover the extra relation
\[
\mu(T=0,n) = \varepsilon_F = \frac{\hbar^2}{2m}(3\pi^2n)^{2/3}. \tag{12}
\]

We may now use the relation $pV = k_BT\log Z_g$ and Eq. (VII.20) to find the pressure in the limit T → 0,
\[
p(T=0,n) = \lim_{T\to0}\frac{k_BT}{V}\,\frac{2V}{8\pi^3}\int_{k<k_F} d\mathbf{k}\;
\underbrace{\log\!\left[1+e^{\beta(\varepsilon_F-\hbar^2k^2/2m)}\right]}_{\beta(\varepsilon_F-\hbar^2k^2/2m)+\ldots}
= \frac{1}{4\pi^3}\int_{k<k_F} d\mathbf{k}\left(\varepsilon_F-\frac{\hbar^2k^2}{2m}\right)
\]
\[
= \frac{1}{4\pi^3}\left[\frac{\hbar^2k_F^2}{2m}\cdot\frac{4\pi}{3}k_F^3 - \frac{\hbar^2}{2m}\cdot\frac{4\pi}{5}k_F^5\right]
= \frac{\hbar^2k_F^5}{15\pi^2m}
= \frac25\,n\varepsilon_F
= \frac{(3\pi^2)^{2/3}}{5}\,\frac{\hbar^2}{m}\,n^{5/3}. \tag{13}
\]


We see that even at zero temperature the ideal Fermi gas exerts a pressure, which may be seen as due to the nonzero velocities of the particles.

VIII.3. Thermodynamics at very low T

Application: electron contribution to the specific heat of a metal.
Method: at room temperature the electron system is still close to its ground state, so we will set up a perturbative calculation around the latter.

Question: Find $C_V(T,V,N) = dE/dT$ at low temperature. Write E (rather than $E_g$ or $\langle E\rangle_g$) for the average total energy of the system and the task becomes finding E(T, V, N).

To answer this question we will start from

\[
E(T,V,\mu) = \sum_{\mathbf{k},\sigma}\langle n_{\mathbf{k}\sigma}\rangle\,\varepsilon_{\mathbf{k}\sigma} = \ldots
= \frac{V}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^{\infty} d\eta\;\frac{\eta^{3/2}}{e^{\beta(\eta-\mu)}+1}\,, \tag{14}
\]
where we introduced $\eta = \varepsilon_{\mathbf{k}\sigma} = \hbar^2k^2/2m$ as the variable of integration. Similarly

\[
N(T,V,\mu) = \sum_{\mathbf{k},\sigma}\langle n_{\mathbf{k}\sigma}\rangle
= \frac{V}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2}\int_0^{\infty} d\eta\;\frac{\eta^{1/2}}{e^{\beta(\eta-\mu)}+1}\,. \tag{15}
\]
Our strategy to obtain E(T, V, N) will be as follows: we will invert N(T, V, µ) given by (15) so as to obtain µ(T, V, N) and substitute the latter in (14). It will appear that this is possible only if we expand all relevant quantities as power series in the ratio $k_BT/\varepsilon_F$.

Remark. Since $(\varepsilon_F/k_BT)^{3/2} = \tfrac38\sqrt{\pi}\,n\lambda^3$ we see that $k_BT/\varepsilon_F$ is small when $n\lambda^3$ is large.

We begin by considering the integral

\[
I = \int_0^{\infty}\frac{f(\eta)\,d\eta}{e^{\beta(\eta-\mu)}+1}
= \int_0^{\mu} d\eta\,f(\eta) \;-\;\int_0^{\mu}\frac{f(\eta)\,d\eta}{e^{-\beta(\eta-\mu)}+1}
\;+\;\int_{\mu}^{\infty}\frac{f(\eta)\,d\eta}{e^{\beta(\eta-\mu)}+1} \tag{16}
\]
in which $f(\eta) = \eta^{1/2}$ [see figure 13] or $= \eta^{3/2}$. To begin with, we will expand in inverse powers of βµ, supposed to be ≫ 1. This hypothesis will have to be checked a posteriori since, even though $\mu_0 \equiv \mu(T=0) = \varepsilon_F$, and therefore finite, µ is a still unknown function of T. [We impose that n does not vary.]

In the second term in (16) we may replace the lower limit of integration by −∞, which only adds a contribution that is exponentially small in βµ. That done, we set x = µ − η


Figure 13: Graphical representation of the integrand of Eq. (16).

in the second term and x = η − µ in the third one. This yields

\[
I = \int_0^{\mu} d\eta\,f(\eta) + \int_0^{\infty} dx\;\frac{\overbrace{f(\mu+x)-f(\mu-x)}^{2xf'(\mu)+\ldots}}{e^{\beta x}+1} + \ldots \tag{17}
\]
[Asymptotic expansion discussed in class.] The result is
\[
I = \int_0^{\mu} d\eta\,f(\eta) + \frac{\pi^2}{6\beta^2}\,f'(\mu) + \ldots\,, \tag{18}
\]
where we used the identity
\[
\int_0^{\infty}\frac{z\,dz}{e^z+1} = \frac{\pi^2}{12}\,, \tag{19}
\]
and where now the dots stand for terms of higher powers in 1/β. This procedure is called a Sommerfeld expansion. Setting $f(\eta) = \eta^{1/2}$ and $= \eta^{3/2}$ successively we find

\[
E(T,V,\mu) = \frac{V(2m\mu)^{5/2}}{10\pi^2m\hbar^3}\left[1 + \frac{5\pi^2}{8}\left(\frac{k_BT}{\mu}\right)^2 + \ldots\right], \tag{20}
\]
\[
N(T,V,\mu) = \frac{V(2m\mu)^{3/2}}{3\pi^2\hbar^3}\left[1 + \frac{\pi^2}{8}\left(\frac{k_BT}{\mu}\right)^2 + \ldots\right]. \tag{21}
\]
Relation (21) gives N (or n = N/V) as a function of µ and T. In order to find µ as a function of n and T we divide it by V and raise it to the power 2/3, which gives

\[
\mu = \underbrace{\frac{\hbar^2}{2m}(3\pi^2n)^{2/3}}_{=\,\mu_0\,=\,\varepsilon_F}
\left[1 + \frac{\pi^2}{8}\left(\frac{k_BT}{\mu}\right)^2 + \ldots\right]^{-\frac23}, \tag{22}
\]

whence

\[
\mu(T,V,N) = \mu_0\left[1 - \frac{\pi^2}{12}\left(\frac{k_BT}{\mu_0}\right)^2 + \ldots\right] \tag{23}
\]


in which µ0 = µ0(n). Substitution of (23) in (20) gives

\[
E(T,V,N) = \underbrace{\tfrac35 N\varepsilon_F}_{\text{GS energy}}\left[1 + \frac{5\pi^2}{12}\left(\frac{k_BT}{\mu_0}\right)^2 + \ldots\right]. \tag{24}
\]

Hence

\[
C_V(T,V,N) = \frac{dE}{dT} = \frac{\pi^2}{2}\,\frac{k_BT}{\varepsilon_F}\,Nk_B + \ldots \qquad (T\to0) \tag{25}
\]
where the dots indicate terms with higher powers of $k_BT/\varepsilon_F = T/T_F$. The last two equations constitute the answer to the question that we asked at the beginning of this section. Eq. (25) shows that the electron heat capacity vanishes linearly with temperature in the T → 0 limit. This contrasts with the classical expression $C_V = \tfrac32 Nk_B$.

In contrast to expansion (VII.29), equation (25) is an expansion in negative powers of $n\lambda^3$. It is not an equation of state, but see the exercise below.
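[A numerical sketch of Eq. (25), not part of the original text, using the Fermi temperature of copper quoted in the example above; the comparison with the classical value 3R/2 per mole is only illustrative.]

```python
import numpy as np

kB = 1.380649e-23   # J/K
NA = 6.02214076e23  # 1/mol

def electronic_cv_per_mole(T, T_F):
    """Electronic heat capacity per mole from Eq. (25): C_V = (pi^2/2) (T/T_F) N kB."""
    return 0.5 * np.pi**2 * (T / T_F) * NA * kB   # J / (mol K)

T_F = 8.16e4   # Fermi temperature of copper (see the example of section VIII.2), K
for T in (1.0, 10.0, 300.0):
    cv = electronic_cv_per_mole(T, T_F)
    print(f"T = {T:6.1f} K   C_V(electrons) = {cv:9.5f} J/(mol K)   "
          f"ratio to classical 3R/2 = {cv/(1.5*NA*kB):.4f}")
# Even at room temperature the electronic contribution is only about 1% of the classical
# 3R/2, which is why it becomes visible only at very low T, once the phonon T^3 term has died out.
```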

Exercise. Use result (24) and the relation $pV = \tfrac23 E$ [see Chapter VII] to obtain the equation of state near zero temperature. Write it down explicitly.
Answer:
\[
p(n,T) = \underbrace{\frac{(3\pi^2)^{2/3}n^{5/3}\hbar^2}{5m}}_{p_0}
\left[1 + \frac{5\pi^2}{12}\left(\frac{k_BT}{\varepsilon_F}\right)^2 + \ldots\right]. \tag{26}
\]

Exercise. Let ∆ε be the width of the Fermi surface at temperature T. Determine the order of magnitude of $\Delta\varepsilon/\varepsilon_F$. Same question (in obvious notation) for $\Delta k/k_F$.

Remarks and comments

(1) The final result. We recall that expression (25) represents the heat capacity of the electrons in a metal; as we have seen, the lattice vibrations (phonons) contribute an amount ∝ T³. In the low temperature limit the total heat capacity of a solid is therefore given by
\[
C_V^{\rm total} = C_V^{\rm el} + C_V^{\rm ph} = AT + BT^3, \qquad T\to0. \tag{27}
\]
The electron contribution becomes visible only at very low T, typically of the order of a few degrees K. Observe that A and B are fully determined by $T_F$ and $T_D$ (the Fermi and the Debye temperature), respectively.

(2) Excitations. For T > 0 excitations from the ground state are possible: an electron close to but just inside the Fermi surface occupies a position just outside that surface and leaves a “hole.” The Fermi surface becomes slightly diffuse. This is best represented by the occupation number $\langle n_{\mathbf{k}\sigma}\rangle^{\rm FD}$ shown as a function of the energy $\varepsilon_{\mathbf{k}}$ [see figure 13].

(3) For independent electrons that are not free but subject to a periodic potential – such as the one created by the atoms of a lattice – there is still a Fermi surface, but it is no longer spherical.


IX. IDEAL BOSE GAS

IX.1. Introduction

An important application: liquid ⁴He (see at the end of this chapter). Below we will take bosons of spin S = 0, hence G = 1.

Ground state: all particles occupy the “point” (lattice site, single-particle state) $\mathbf{k} = 0$. Therefore the ground state energy is zero. We recall [Eq. (VII.33)] that for bosons without spin
\[
\langle n_{\mathbf{k}}\rangle = \frac{1}{e^{-\beta(\mu-\varepsilon_{\mathbf{k}})}-1}\,. \tag{1}
\]
The lowest single-particle energy is $\varepsilon_{\mathbf{k}=0} = 0$ and therefore we need µ < 0 if all the $\langle n_{\mathbf{k}}\rangle$ are to be nonnegative. If $-\beta(\mu-\varepsilon_{\mathbf{k}}) \ll 1$ we may write (1) as
\[
\langle n_{\mathbf{k}}\rangle \simeq \frac{k_BT}{\frac{\hbar^2k^2}{2m}-\mu} \tag{2}
\]
with the special case
\[
\langle n_{\vec 0}\rangle \simeq -\frac{k_BT}{\mu}\,. \tag{3}
\]
This shows that the average occupation number $\langle n_{\vec 0}\rangle$ may increase without bounds as µ → 0⁻, while those with k ≠ 0 stay finite. But in that case we have to redo the mathematics of Chapter VII.

IX.2. Particle density and condensate

In order to derive Eq. (VII.34) for the particle density n we replaced $\sum_{\mathbf{k}}$ with $V(2\pi)^{-3}\int d\mathbf{k}$, which required the summand to be a smooth function of $\mathbf{k}$. This operation is certainly not allowed when the term with $\mathbf{k} = 0$ dominates all the others, in which case it has to be treated separately. Instead of (34) we therefore have for the density
\[
n = \frac{1}{V}\sum_{\mathbf{k}}\langle n_{\mathbf{k}}\rangle
= \frac{1}{V}\,\langle n_{\vec 0}\rangle + \frac{1}{(2\pi)^3}\int d\mathbf{k}\;\frac{e^{\beta(\mu-\varepsilon_{\mathbf{k}})}}{1-e^{\beta(\mu-\varepsilon_{\mathbf{k}})}}\,. \tag{4}
\]

In this last expression we now expand the denominator in a power series in $e^{\beta(\mu-\varepsilon_{\mathbf{k}})}$, use that $\varepsilon_{\mathbf{k}} = \hbar^2k^2/2m$, pass to spherical coordinates in k space, integrate term by term with the aid of the identity
\[
\int_0^{\infty} dx\,x^2\,e^{-x^2} = \frac{\sqrt\pi}{4}\,, \tag{5}
\]
and multiply by λ³. This leads to (check this as an exercise)

\[
n\lambda^3 = \frac{\lambda^3}{V}\,\frac{1}{e^{-\beta\mu}-1} + \sum_{\ell=1}^{\infty}\frac{e^{\beta\mu\ell}}{\ell^{3/2}}\,. \tag{6}
\]


Figure 14: Reduced density $n\lambda^3$ as a function of the fugacity $e^{\beta\mu}$ for the ideal Bose gas. For comparison the straight line corresponding to the classical gas is also shown.

The second term on the RHS above is identical to what one finds from Eq. (VII.26). For µ → 0⁻ this term tends to
\[
\sum_{\ell=1}^{\infty}\frac{1}{\ell^{3/2}} = \zeta(\tfrac32) = 2.612\ldots, \tag{7}
\]
where ζ is the Riemann zeta function; the finiteness of this limit value is proof that the excited levels need no special treatment as µ → 0⁻. For small βµ the first term on the RHS may be written as
\[
\frac{\lambda^3}{V}\,\frac{1}{e^{-\beta\mu}-1} \simeq -\frac{\lambda^3 k_BT}{V\mu}\,, \tag{8}
\]

which for µ→ 0− tends to infinity.

Remark. In the second term the thermodynamic limit V → ∞ has been taken (where?). In the first term it is not allowed to let V → ∞, which would reduce it to zero. The correct procedure is to obtain first all final results for a finite system and only then to let V → ∞.

The first term fails to be negligible compared to the second one only if
\[
-\frac{k_BT\lambda^3}{V\mu} \gtrsim 2.612 \qquad\text{or}\qquad -\mu \lesssim \frac{k_BT\lambda^3}{2.612\,V}\,, \qquad \mu<0, \tag{9}
\]
that is, for µ in an interval that tends to zero in the thermodynamic limit. See the figure [not yet included], which represents the number of particles in a volume λ³ as a function of $e^{\beta\mu}$ [or: as a function of µ when T is kept fixed].


Distribution of the particles in k space. When the chemical potential µ is increased, the density increases; but when µ gets very close to zero, all new particles are going to occupy the level $\mathbf{k} = 0$. In the thermodynamic limit we therefore have a peak at $\mathbf{k} = 0$ plus the curve that represents the second term. See figure.

As before, we are interested in the dependence of the observable physical quantities on T and n, rather than on T and µ. Our procedure will be as follows. Let n be fixed. The figure then shows λ³ [which is ∝ $T^{-3/2}$] as a function of $\mu/k_BT$. Inversely we may view this curve as representing $\mu/k_BT$ as a function of λ³ and observe that $\mu/k_BT$ continues to increase when T goes down so long as T has not reached a value $T_c$ (corresponding to a thermal wavelength $\lambda_c$) which is the solution of
\[
n\lambda_c^3 = \frac{nh^3}{(2\pi mk_BT_c)^{3/2}} = 2.612, \tag{10}
\]
that is,
\[
T_c = \frac{h^2}{2\pi mk_B}\left(\frac{n}{2.612}\right)^{2/3}. \tag{11}
\]

We call $T_c$ a critical temperature. Let us now consider expression (6) divided by λ³. For $T > T_c$ it suffices to take the second term into account. For $T < T_c$ one may set µ = 0 in the second term which may then be rewritten as
\[
\frac{2.612}{\lambda^3} = \left(\frac{T}{T_c}\right)^{3/2} n\,. \tag{12}
\]
[This term is in fact ∝ $T^{3/2}$ and independent of n: the n cancels against the one in the definition of $T_c$.] Hence^29

\[
n = \begin{cases}
\displaystyle \frac{1}{\lambda^3}\sum_{\ell=1}^{\infty}\frac{e^{\beta\mu\ell}}{\ell^{3/2}} & T > T_c\,,\\[14pt]
\displaystyle \underbrace{-\frac{1}{V\beta\mu}}_{n_0} + \left(\frac{T}{T_c}\right)^{3/2} n & T < T_c\,.
\end{cases} \tag{13}
\]

Without the need of knowing µ for $T < T_c$ we now see that the particle density in the state $\mathbf{k} = 0$ is given by
\[
n_0 = n - \frac{2.612}{\lambda^3} = \left[1-\left(\frac{T}{T_c}\right)^{3/2}\right] n\,, \qquad T < T_c\,, \tag{14}
\]
where now the limit V → ∞ has been taken. This phenomenon is called Bose-Einstein condensation. The particles in the single-particle state $\mathbf{k} = 0$ are called the condensate.

29 We introduce the density $n_0 \equiv \langle n_{\vec 0}\rangle/V$; note that $n_{\vec 0}$ is a dimensionless number. CT writes $N_0$ instead of our $n_0$.
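[A numerical sketch of Eqs. (6), (13) and (14), not part of the original text. It solves the series in (13) for the fugacity above $T_c$ and returns the condensate fraction below $T_c$; the truncation of the series at a finite number of terms is an assumption made only for the computation.]

```python
import numpy as np
from scipy.optimize import brentq

def g32(z, terms=2000):
    """g_{3/2}(z) = sum_{l>=1} z^l / l^{3/2}, the series appearing in Eqs. (6) and (13)."""
    l = np.arange(1, terms + 1)
    return np.sum(z**l / l**1.5)

zeta32 = g32(1.0)   # ~ 2.61 with this truncation

def reduced_mu_and_condensate(t):
    """For t = T/Tc, return (beta*mu, n0/n) in the thermodynamic limit,
    using n*lambda^3 = 2.612 * t^(-3/2)."""
    n_lambda3 = zeta32 * t**-1.5
    if t > 1.0:                                  # normal phase: solve g_{3/2}(z) = n lambda^3
        z = brentq(lambda z: g32(z) - n_lambda3, 1e-12, 1.0 - 1e-12)
        return np.log(z), 0.0
    return 0.0, 1.0 - t**1.5                     # condensed phase, Eq. (14)

for t in (2.0, 1.5, 1.0, 0.8, 0.5):
    bmu, frac = reduced_mu_and_condensate(t)
    print(f"T/Tc = {t:4.2f}   beta*mu = {bmu:8.4f}   n0/n = {frac:5.3f}")
```

One sees βµ rise towards 0 as T approaches $T_c$ from above and the condensate fraction grow as $1-(T/T_c)^{3/2}$ below it, which is precisely the content of figure 15.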


Figure 15: Condensate density n0(T ) as a function of temperature for the ideal Bose gas.

It is a “condensation” (in the sense of an accumulation or piling up) not in ordinary space but in k space. In the figure [not available yet] we show the number of particles in a shell between k and k + dk, namely $4\pi k^2\langle n_{\mathbf{k}}\rangle\,dk$.

IX.3. Thermodynamics

We investigate several more thermodynamic relations. The particles with $\mathbf{k} = 0$ do not contribute to the energy E of the system, which is given by
\[
\frac{E}{V} = \frac{1}{V}\sum_{\mathbf{k}\ne0}\langle n_{\mathbf{k}}\rangle\,\varepsilon_{\mathbf{k}}
= \frac{1}{(2\pi)^3}\int d\mathbf{k}\;\varepsilon_{\mathbf{k}}\,\frac{e^{\beta(\mu-\varepsilon_{\mathbf{k}})}}{1-e^{\beta(\mu-\varepsilon_{\mathbf{k}})}}
= \ldots
= \frac32\,\frac{k_BT}{\lambda^3}\sum_{\ell=1}^{\infty}\frac{e^{\beta\mu\ell}}{\ell^{5/2}}\,. \tag{15}
\]
The calculation is analogous to a preceding one; we used the identity
\[
\int_0^{\infty} dx\,x^4\,e^{-x^2} = \tfrac38\sqrt\pi\,. \tag{16}
\]

The formulas (VII.22) and (VII.24) derived earlier for $\log Z^{\rm BE}_g$ and $p^{\rm BE}$, respectively, continue to hold. The replacement of the sum on $\mathbf{k}$ by an integral carried out there to obtain them was innocent. The particles in the condensate therefore do not contribute to the pressure (the point $\mathbf{k} = 0$ has zero weight in an integration).

If we compare (VII.24) for $p^{\rm BE}$ with the one above for E/V we see that
\[
p = \frac{2E}{3V}\,. \tag{17}
\]


For T < Tc the two formulas take a special form since µ = 0,

\[
p = \frac{2E}{3V} = \frac{k_BT}{\lambda^3}\sum_{\ell=1}^{\infty}\frac{1}{\ell^{5/2}} = \zeta(\tfrac52)\,\frac{k_BT}{\lambda^3}
= \frac{\zeta(\tfrac52)}{\zeta(\tfrac32)}\,nk_B\,\frac{T^{5/2}}{T_c^{3/2}}\,, \qquad T < T_c\,. \tag{18}
\]
Now $T_c$ is proportional to $n^{2/3}$; the last line therefore implies that below $T_c$ the pressure does not depend on the density n. If at fixed T one increases n:
(i) p at first increases as it would in an ordinary gas;
(ii) at a well-defined critical density $n = n_c$, which when T is fixed follows from the condition $n_c\lambda^3 = 2.612$, the system enters its low temperature phase: a condensate forms;
(iii) when the density is made to increase further, the pressure stays at the value it had at the critical density, which is proportional to $T^{5/2}$;
(iv) explanation: all particles that one continues to add to the system will occupy the single-particle state $\mathbf{k} = 0$.

Since CV = (∂E/∂T )V,N we have for T < Tc the proportionality CV ∝ T 3/2.

Exercise. Derive from Eq. (18) the expression for $C_V$ in the temperature regime $0 \le T < T_c$. Show that at the critical temperature $T = T_c^-$ we have the relation
\[
\frac{C_V}{k_BN} = 1.93. \tag{19}
\]

IX.4. Liquid 4He

Application: liquid ⁴He. Recall Eq. (11) for the critical temperature. Upon substituting the values of m and n for liquid ⁴He one finds
\[
T_c = 3.14\ \mathrm{K}. \tag{20}
\]
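[A quick numerical check of this value, not part of the original text. The mass density of liquid helium, ρ ≈ 145 kg/m³, is an assumed input; it is not quoted in these notes.]

```python
import numpy as np

h  = 6.62607015e-34    # J s
kB = 1.380649e-23      # J/K
u  = 1.66053907e-27    # kg

m_He = 4.0026 * u
rho  = 145.0                 # assumed mass density of liquid 4He, kg/m^3
n    = rho / m_He            # number density, ~2.2e28 m^-3

Tc = h**2 / (2.0 * np.pi * m_He * kB) * (n / 2.612)**(2.0/3.0)   # Eq. (11)
print(f"n = {n:.3e} m^-3   Tc = {Tc:.2f} K")   # ~3.1 K, to be compared with T_lambda = 2.17 K
```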

In reality ⁴He undergoes a phase transition at $T_\lambda = 2.17$ K, called the lambda transition because of the shape of the specific heat shown in the figure. It is, in fact, a Bose-Einstein condensation modified by the effects of the interatomic interactions. Below its critical temperature liquid ⁴He is superfluid, that is, it flows without resistance.

See Diu, complement VI.D (in particular pp. 892, 893), or J. Bardeen in Physics Today Vol. 43 no. 12 (December 1990), page 25 (this article also mentions superconductivity).


X. PHOTON GAS

X.1. Another view of the harmonic oscillator

Recall that a quantum harmonic oscillator of frequency ω has energy levels $(n+\tfrac12)\hbar\omega$, where n = 0, 1, 2, . . . The canonical partition function therefore is
\[
Z_c = e^{-\beta\hbar\omega/2}\sum_{n=0}^{\infty} e^{-n\beta\hbar\omega}. \tag{1}
\]
We may look at this formula differently and say that there exists one energy level, ℏω, which may be occupied by an arbitrary number, n, of fictitious particles (actually quanta of oscillation, also called quasi-particles). These particles are therefore bosons and $Z_c$ is their grand-canonical partition function at chemical potential µ = 0.

More generally, if a physical system is composed of $N_0$ oscillators of frequencies $\omega_1, \omega_2, \ldots, \omega_{N_0}$, then its state is specified by the set of “occupation numbers” of these oscillators: $n_1, n_2, \ldots, n_{N_0}$. In that case, as we have seen in Chapter V, the expression for the partition function, here now denoted by Z, is

\[
Z = \underbrace{e^{-\frac12\beta\sum_i\hbar\omega_i}}_{Z_0}\;
\underbrace{\sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}\cdots\sum_{n_{N_0}=0}^{\infty}
e^{-\beta(n_1\hbar\omega_1+n_2\hbar\omega_2+\ldots+n_{N_0}\hbar\omega_{N_0})}}_{Z'},
\qquad
Z' = \prod_{i=1}^{N_0}\frac{1}{1-e^{-\beta\hbar\omega_i}}\,, \tag{2}
\]
in which Z′ is again a grand-canonical partition function at µ = 0, analogous to the ones encountered in Chapter IX. For the free energy one has

\[
F = -k_BT\log Z = -k_BT\log Z_0 - k_BT\log Z'
= \underbrace{\frac12\sum_i\hbar\omega_i}_{F_0} + \underbrace{k_BT\sum_i\log\!\left(1-e^{-\hbar\omega_i/k_BT}\right)}_{F'} \tag{3}
\]
in which the zero point energy $F_0$ is the contribution to the free energy stemming from the ground state energy of the oscillators. Since $F_0$ is a constant and since all physical observables may be expressed as derivatives of F, we may work with F′ instead of F, and will do so below.^30

A difference with the bosons studied in Chapter IX is that here the total boson number, $N = \sum_i n_i$, is not conserved and therefore does not play the role of an independent parameter. At a given temperature T the system will automatically settle in an equilibrium state with an average 〈N〉 that is some function of T. This equilibrium is not attained as the result of the exchange of a conserved number of bosons with a reservoir, but by their creation and destruction. There is therefore here one fewer independent parameter than in the case of nondestructible bosons: (T, V) replaces (T, V, N) or (T, V, µ).

30 We are in fact eliminating a bit too easily here a nontrivial problem: in the case of electromagnetic radiation one has $F_0 = \infty$. See section X.5.

X.2. Maxwell’s equations

The Maxwell equations are classical equations: they are self-contained and without any reference to quantum mechanics. We recall here a few things that you know from your course on Electromagnetism. Let $\mathbf{E}(\mathbf{r},t)$ denote a space and time dependent electric field. Inside a vacuum cavity (that we will take cubic and of volume $V = L^3$) the electric field must satisfy
\[
\frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = \Delta\mathbf{E}\,, \qquad \mathrm{div}\,\mathbf{E} = 0, \tag{4}
\]
in which c is the speed of light in vacuum. Let us look for eigensolutions of (4) by supposing that they have the form of running waves
\[
\mathbf{E}(\mathbf{r},t) = \mathbf{E}_0\,e^{i\mathbf{k}\cdot\mathbf{r}-i\omega t} \tag{5}
\]
in which for the moment $\mathbf{k}$ and ω are arbitrary. By substitution of (5) in (4) we are led to the following conclusions.
(1) It is necessary for ω to be related to k by

ω = ωk = ck, k ≡ |k|. (6)

(2) For periodic boundary conditions $\mathbf{k}$ can take only the values
\[
\mathbf{k} = \frac{2\pi}{L}\,\mathbf{n} \tag{7}
\]
where $\mathbf{n}$ is a vector of arbitrary integer components.
(3) The vector $\mathbf{E}_0$ must be perpendicular to $\mathbf{k}$, that is, $\mathbf{E}_0\cdot\mathbf{k} = 0$. Hence $\mathbf{E}_0$ is a linear combination of two unit vectors that are orthogonal to each other and to $\mathbf{k}$, and that we will denote by
\[
\mathbf{e}_{\mathbf{k}\lambda}\,, \quad\text{with } \lambda = 1,2 \text{ the polarization index.}^{31} \tag{8}
\]
We have therefore found eigensolutions [or: vibrational modes] of the electric field (4) that are running waves characterized by a wavevector $\mathbf{k}$ and a polarization λ,
\[
\mathbf{E}(\mathbf{r},t) = A^0_{\mathbf{k}\lambda}\,\mathbf{e}_{\mathbf{k}\lambda}\,e^{i\mathbf{k}\cdot\mathbf{r}-i\omega_k t}, \tag{9}
\]
where $A^0_{\mathbf{k}\lambda}$ is an amplitude. The general solution $\mathbf{E}(\mathbf{r},t)$ is a sum of (9) over $\mathbf{k}$ and λ with arbitrary^32 amplitudes.

31 Do not confuse with a wavelength, often also indicated by the same letter.
32 The electric field is an observable taking real values and therefore we must have $A^0_{-\mathbf{k},\lambda} = A^{0*}_{\mathbf{k}\lambda}$.


X.3. Quantizing the electromagnetic field

The eigensolution (9) may be expressed alternatively as
\[
\mathbf{E}(\mathbf{r},t) = A_{\mathbf{k}\lambda}(t)\,\mathbf{e}_{\mathbf{k}\lambda}\,e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{10}
\]
which is the product of a time dependent amplitude and a fixed function of space. The amplitude $A_{\mathbf{k}\lambda}(t)$ satisfies
\[
\frac{d^2A_{\mathbf{k}\lambda}}{dt^2} + c^2k^2A_{\mathbf{k}\lambda} = 0, \tag{11}
\]
as may be verified directly or by substituting (10) in (4). Eq. (11) may be compared to the classical harmonic oscillator with Hamiltonian $H = p^2/2m + m\omega^2x^2/2$ for which
\[
\frac{d^2x}{dt^2} + \omega^2x = 0. \tag{12}
\]
The vibrational mode (k, λ) of the electromagnetic field is therefore a harmonic oscillator of frequency ω = ck. With this established we may now quantize the electromagnetic field according to the rules of quantum mechanics. It follows that the energy stored in the vibrational mode (k, λ) can take only the discrete values
\[
\varepsilon_{\mathbf{k}\lambda} = (n_{\mathbf{k}\lambda}+\tfrac12)\hbar\omega_k\,, \qquad n_{\mathbf{k}\lambda} = 0,1,2,\ldots. \tag{13}
\]
We call $n_{\mathbf{k}\lambda}$ the number of photons of type (k, λ). Photons, therefore, are bosons. What is more, they are strictly noninteracting among themselves and therefore constitute a “true” ideal gas.

Remarks.
(1) The polarization index λ is analogous to the spin σ. A photon has a spin S = 1, yet λ takes only two values and not 2S + 1 = 3. This is due to the fact that the photon mass is zero (see a book on quantum mechanics; there is no longitudinal polarization).
(2) There is no Bose-Einstein condensation since always µ = 0. There is no control parameter allowing us to fix the total number of photons.
(3) The number $N_0$ of modes (k, λ) is infinite and since classically in thermal equilibrium each of these modes would contain an energy $k_BT$, this would lead to a problem: the ultraviolet catastrophe (see below).

X.4. The black body: a photon gas at equilibrium

[Or: Thermodynamics of the radiation field.] By a black body^33 we mean a vacuum cavity (or “oven”) whose walls, having a fixed temperature T, are capable of absorbing and emitting radiation of all wavelengths (this latter feature earns it the qualification “black”). The balance between absorption and emission will lead to an equilibrium in which the cavity is filled with a photon gas at temperature T, independently of the details of the emission and absorption processes; it will be entirely determined by

(i) the temperature T of the walls;

33Do not confuse with a black hole, which is an altogether different thing!


(ii) the energy levels (modes) of the electromagnetic field inside the cavity. The radiation field thus established is called black-body radiation or sometimes also thermal radiation.

X.4.1. Spectral energy density

One may study the black-body radiation by making a small hole in the cavity (of negligible size) and measuring the radiation that comes out. The quantity easiest to measure is the intensity of the radiation in a small frequency interval between ω and ω + dω.

According to Chapter IX we have at equilibrium

\[
\langle n_{\mathbf{k}\lambda}\rangle = \frac{1}{e^{\beta\hbar\omega_k}-1}\,. \tag{14}
\]
The number of energy levels corresponding to a wavevector of length between k and k + dk is equal to
\[
\frac{2V}{(2\pi)^3}\,4\pi k^2\,dk = \frac{V\omega^2}{\pi^2c^3}\,d\omega \equiv \rho(\omega)\,d\omega\,. \tag{15}
\]
The number n(ω, T)dω of photons having an energy between ℏω and ℏω + ℏdω is this same number multiplied by (14),
\[
n(\omega,T)\,d\omega = \frac{V}{\pi^2c^3}\,\frac{\omega^2}{e^{\beta\hbar\omega}-1}\,d\omega\,. \tag{16}
\]

The energy dE of the photons in this frequency interval is

dE = ~ω n(ω, T )dω

= V~π2c3

ω3

eβ~ω − 1︸ ︷︷ ︸≡u(ω,T )

dω, (17)

where u(ω, T ) is the spectral energy34 density per unit of volume in the cavity, whichis a measurable quantity. This expression for u(ω, T ) is called Planck’s law (1900) forblack-body radiation. It is valid for all ω and all T , without any restriction, at the solecondition that the system be at equilibrium.
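As a quick numerical illustration of Planck's law (17), here is a small Python sketch (not part of the original notes; the function name u and the use of scipy.constants are our own choices) that evaluates the spectral energy density directly:

import numpy as np
from scipy.constants import hbar, k as kB, c

def u(omega, T):
    """Spectral energy density of black-body radiation, Eq. (17), in J s / m^3."""
    return hbar / (np.pi**2 * c**3) * omega**3 / np.expm1(hbar * omega / (kB * T))

# Example: green light (wavelength 500 nm) emitted by a body at roughly the solar surface temperature
omega = 2 * np.pi * c / 500e-9
print(u(omega, 5250.0))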

It is of interest to consider two limits of Planck's law.
(i) For \hbar\omega \ll k_B T we obtain the classical limit,

u(\omega, T) \simeq \frac{k_B T}{\pi^2 c^3}\, \omega^2,   (18)

which is called the Rayleigh-Jeans formula (19th century) and does not contain \hbar. When integrated over \omega it diverges at high frequencies (the "ultraviolet catastrophe"), which

34 Here: density = energy per unit interval on the frequency axis.


brings out the necessity for a modification of the classical theory.
(ii) For \hbar\omega \gg k_B T we obtain

u(\omega, T) \simeq \frac{\hbar}{\pi^2 c^3}\, \omega^3\, e^{-\hbar\omega/k_B T},   (19)

which is called Wien’s law and was discovered empirically in 1893.

Exercise. Show that for given T the spectral energy density u(\omega, T) has its maximum for \omega = \omega_m where

\omega_m = 2.821\, \frac{k_B T}{\hbar}.   (20)

The proportionality of this maximum to T is called Wien’s displacement law.
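A quick numerical check of the constant in (20) (a sketch, not part of the notes): writing x = \hbar\omega/k_B T, the maximum of x^3/(e^x - 1) satisfies 3(1 - e^{-x}) = x, which a simple bisection solves:

import math

def f(x):
    # stationarity condition of x^3/(e^x - 1): 3(1 - e^{-x}) - x = 0
    return 3.0 * (1.0 - math.exp(-x)) - x

a, b = 1.0, 5.0          # f(1) > 0, f(5) < 0, so the root lies in between
for _ in range(60):
    m = 0.5 * (a + b)
    if f(a) * f(m) <= 0.0:
        b = m
    else:
        a = m
print(0.5 * (a + b))      # ~ 2.8214, the numerical constant of Eq. (20)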

All matter radiates and in fact many ordinary objects emit thermal radiation approximately as black bodies. Thermal radiation at room temperature peaks at infrared wavelengths. The sun emits thermal radiation at a temperature of about 5250 K (the temperature of the surface of the sun), which peaks in the visible part of the spectrum.

X.4.2. Thermodynamics

Free energy. We start from (3). The free energy F ′ is given by

F' = \frac{2V}{(2\pi)^3}\, k_B T \int dk\, \log\bigl(1 - e^{-\hbar\omega_k/k_B T}\bigr).   (21)

Use now (15) to write

\frac{2V}{(2\pi)^3} \int dk = \int_0^\infty d\omega\, \rho(\omega) = \frac{V}{\pi^2 c^3} \int_0^\infty d\omega\, \omega^2.   (22)

Then, setting x = \hbar\omega/k_B T, we get

F'(T, V) = \frac{V k_B T}{\pi^2} \left(\frac{k_B T}{\hbar c}\right)^3 \int_0^\infty dx\, x^2 \log(1 - e^{-x}) = -\frac{\pi^2}{45}\, k_B T \left(\frac{k_B T}{\hbar c}\right)^3 V,   (23)

where we used \int_0^\infty dx\, x^2 \log(1 - e^{-x}) = -\frac{1}{3} \int_0^\infty dx\, \frac{x^3}{e^x - 1} = -\frac{\pi^4}{45}.

Energy. The total energy E contained in the cavity may be obtained either from (23) according to E = -T^2 (\partial/\partial T)(F'/T) or by integrating the spectral energy density (17).


The latter method leads to

E(T, V) = V \int_0^\infty d\omega\, u(\omega, T) = \frac{V\hbar}{\pi^2 c^3} \left(\frac{k_B T}{\hbar}\right)^4 \int_0^\infty dx\, \frac{x^3}{e^x - 1} = \frac{\pi^2}{15}\, V \hbar c \left(\frac{k_B T}{\hbar c}\right)^4 = \frac{\pi^2}{15}\, k_B T \underbrace{\left(\frac{k_B T}{\hbar c}\right)^3}_{\text{inverse volume}} V,   (24)

that is, E(T, V) is proportional to T^4. This is called the Stefan-Boltzmann law.

To interpret this formula, recall that a photon has a wavelength \lambda that may be written \lambda = 2\pi/k = 2\pi c/\omega_k = 2\pi\hbar c/(\hbar\omega_k). Therefore 2\pi\hbar c/k_B T is the wavelength of a photon having a thermal energy and is analogous to the de Broglie wavelength of a massive particle.

Radiation pressure. We obtain directly from (23)

p = -\frac{\partial F'}{\partial V} = \frac{\pi^2}{45} \left(\frac{k_B T}{\hbar c}\right)^3 k_B T   (25)

and hence, because of (24),

pV = \frac{1}{3} E,   (26)

which may be compared to the analogous relation for a classical ideal gas, for which pV = \frac{2}{3}E. Microscopically this pressure is due to the collisions of the photons against the walls. (Can you derive it that way?)
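As an illustration of Eqs. (24)-(26), the following sketch (not part of the notes; constants from scipy.constants, and the temperature value is just an example anticipating Section X.6) evaluates the energy density E/V and the radiation pressure of a photon gas:

import numpy as np
from scipy.constants import hbar, k as kB, c

def energy_density(T):
    """E/V of a photon gas at temperature T, Eq. (24), in J/m^3."""
    return np.pi**2 / 15.0 * kB * T * (kB * T / (hbar * c))**3

T = 2.725                           # kelvin, roughly the CMB temperature of Section X.6
u_tot = energy_density(T)
print("E/V =", u_tot, "J/m^3")      # ~ 4e-14 J/m^3
print("p   =", u_tot / 3.0, "Pa")   # radiation pressure, Eq. (26)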

X.5. Casimir force

Exercise. Strictly speaking, the free energy contains a term F_0, the zero-point (or "vacuum") energy of the electromagnetic field. We have suppressed it right from the beginning by arguing that it is a constant and that therefore its derivatives vanish. Do you agree with this for the calculation of E? For the calculation of p?

The origin of the Casimir force (1948) between two parallel metal plates in vacuum at distance L was discussed in class. This attractive force is equal to F_{Cas} = -\frac{\pi^2}{240}\, \frac{\hbar c}{L^4} per unit area. The proof of this formula is beyond the scope of our course. Show, however, that the only combination of \hbar, c, and L that makes a pressure is \hbar c/L^4.
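For orientation, a small sketch (not in the notes; the separation L = 1 micrometre is an arbitrary illustrative value) evaluating the quoted Casimir pressure:

import math
from scipy.constants import hbar, c

def casimir_pressure(L):
    """Magnitude of the Casimir force per unit area, pi^2 hbar c / (240 L^4), in Pa."""
    return math.pi**2 / 240.0 * hbar * c / L**4

print(casimir_pressure(1e-6))   # ~ 1.3e-3 Pa for plates 1 micrometre apart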


X.6. Cosmic microwave background (CMB)

[Sometimes referred to as relic radiation. In French: Fond diffus cosmologique or rayonnement fossile.]

During the first \approx 0.3 \times 10^6 years of its history, the temperature of the universe was above 3000 K and there were many free electric charges around, not combined into neutral atoms. Since the interaction between photons and charged particles is very strong, during that period the photons were being scattered, emitted, and absorbed all the time, which led to an equilibrium between photons and matter. The photons were distributed according to Planck's law at the prevailing temperature. The universe is expanding and therefore getting colder. When the temperature went down below 3000 K, neutral atoms were formed and the scattering cross section for photons off matter decreased sharply: the universe became "transparent" for photons. Said differently, "radiation decoupled from matter." Put more quantitatively, the probability for a photon that was free at that moment to have been scattered or absorbed by a particle of matter, between then and now, is negligible. These photons now fill the large empty intergalactic spaces.

The only change is that, since the universe has expanded, all distances have increased since the moment of decoupling by a factor that may be calculated and that is close to 1000. In particular, the photon wavelength has increased by this factor. Planck's law at the moment of decoupling is thereby transformed into a new law of Planck for a radiation in equilibrium at a temperature a thousand times lower, that is, at T \approx 3 K. Its spectral density peaks in the range of microwave frequencies.

This radiation is isotropic (arrives on earth with the same intensity from all directions). Its existence was predicted in 1948 and it was discovered accidentally in 1965 by A. Penzias and R. Wilson (Nobel prize 1978). It has been studied since the 1990's with the aid of satellites (COBE, WMAP, Planck). Its exact temperature has been measured with great accuracy and was determined in 2009 as 2.72548 \pm 0.00057 K. Small fluctuations of this temperature with the direction in the sky are of great interest to cosmologists since they contain information about what happened in the early universe.


XI. CORRELATION FUNCTIONS

XI.1. The single-particle density

Let us consider the microstate \Gamma = (p_1, \ldots, p_N; r_1, \ldots, r_N) that was at the basis of the classical ensembles in Chapters I, II, III, and V. Let R be some point in space. Remember that we suppose our particles to be point-like. When the system is in microstate \Gamma, then the microscopic particle density n(R; \Gamma) in R is zero if there is no particle and infinite if there happens to be one of the N particles. The function A(\Gamma) [Chapter I, Eq. (21)] that expresses this is therefore

n(R; \Gamma) = \sum_{i=1}^{N} \delta(r_i - R).   (1)

Attention: The r_i are dynamical variables (degrees of freedom of the system), whereas R is a parameter representing a fixed position.

Exercise. What quantity is represented by \int_{V' \subset V} dR\, n(R; \Gamma)?

The average density in R will be denoted by n(R) and is defined as the ensemble average of n(R; \Gamma) according to

n(R) = \frac{\int d\Gamma\, \rho(\Gamma)\, n(R; \Gamma)}{\int d\Gamma\, \rho(\Gamma)}   (2)

in which \rho = \rho_m or \rho = \rho_c. This is the density that is usually measured in experiments. Note that it may depend on R even in equilibrium, for example when the system is subject to an external potential.

Exercise. How should expression (2) for n(R) be adapted to the grand-canonical ensemble? Check that

\int_V dR\, n(R) = \begin{cases} N & \text{in the ensembles `m' and `c',} \\ \langle N \rangle_g & \text{in ensemble `g'.} \end{cases}   (3)

From here on we will work in the canonical ensemble, so that \rho(\Gamma) = \exp[-\beta H(\Gamma)]. Let as before

H = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \underbrace{\sum_{1 \le i < j \le N} U(r_{ij}) + \sum_{i=1}^{N} U_{\text{ext}}(r_i)}_{U_N(r_1, \ldots, r_N)}.   (4)

We will let Z_N denote the configurational partition function^{35}

Z_N = \int_V dr_1 \ldots \int_V dr_N\, e^{-\beta U_N(r_1, \ldots, r_N)}.   (5)

35 The same quantity was called Q_N in Chapter V.5.


Substitution of (1) and (4) in (2) gives

n(R) = Z_N^{-1} \int_V dr_1 \ldots \int_V dr_N \sum_{i=1}^{N} \delta(r_i - R)\, e^{-\beta U_N(r_1, \ldots, r_N)}   (6)

whence

n(r_1) = \frac{N}{Z_N} \int_V dr_2 \ldots \int_V dr_N\, e^{-\beta U_N(r_1, \ldots, r_N)},   (7)

where r_1 has become a parameter. Eqs. (6) and (7) show that n(R) is proportional to the partition function restricted to the microstates that have a particle in R.

Exercise. Show that for an ideal gas in an external potential U_{\text{ext}}(R) one has

n(R) = C\, e^{-\beta U_{\text{ext}}(R)},   (8)

and find the expression for C.

The dimensionless quantity n(R)\Delta R represents the average number of particles in a volume element of size \Delta R at R. When n(R)\Delta R \ll 1, it is also the probability that the volume element of size \Delta R around R contain a particle. (Question: why is the condition \ll 1 needed in this last statement?) For a spatially homogeneous system n(R) = N/V \equiv n.

XI.2. The p-particle density

XI.2.1. p-particle correlations

We now ask about the probability n^{(p)}(R_1, \ldots, R_p)\, dR_1 \ldots dR_p that p infinitesimal volume elements around given points be occupied by particles. This requires averaging a generalization of (1),

n^{(p)}(R_1, \ldots, R_p; \Gamma) = \left[\sum_{i=1}^{N} \delta(r_i - R_1)\right] \ldots \left[\sum_{i=1}^{N} \delta(r_i - R_p)\right]   (9)

and one finds^{36}

n^{(p)}(r_1, \ldots, r_p) = \frac{N!}{(N-p)!\, Z_N} \underbrace{\int_V dr_{p+1} \ldots \int_V dr_N\, e^{-\beta U_N(r_1, \ldots, r_N)}}_{\text{partition function with } p \text{ particles fixed}},   (10)

which is called the p-particle density.

36 You have to fill p positions with p particles selected among N; there are \binom{N}{p} possible selections and then p! ways to do the filling.


XI.2.2. Pair correlation

An important case is p = 2. One calls n^{(2)}(R_1, R_2) the pair density. For a system which after ensemble averaging is translation and rotation invariant we have

n^{(2)}(R_1, R_2) = n^{(2)}(R_1 - R_2) = n^{(2)}(|R_1 - R_2|) = n^{(2)}(R_{12}).   (11)

Recall that the thermodynamic limit consists in letting N, V \to \infty at a fixed ratio n = N/V. After this limit has been taken one may consider the limit |R_1 - R_2| \to \infty. It should be expected that^{37}

n^{(2)}(R_{12}) \to n^2 \quad \text{for } R_{12} \to \infty   (12)

(why?) and therefore we set

n^{(2)}(R_{12}) = n^2\, g(R_{12}).   (13)

Here the dimensionless quantity g(R) is called the pair distribution function and is such that g(\infty) = 1. Finally we introduce

h(R) = g(R) - 1,   (14)

where h(R) is called the (pair) correlation function^{38} and is such that h(\infty) = 0.

Remark. Eq. (9) leads to (10) only if R_1, \ldots, R_p are all different. One may also consider the RHS of (9) for arbitrary R_1, \ldots, R_p, not necessarily all different. For p = 2 we then get

\left[\sum_{i=1}^{N} \delta(r_i - R_1)\right] \left[\sum_{i=1}^{N} \delta(r_i - R_2)\right] = \sum_{i=1}^{N} \underbrace{\delta(r_i - R_1)\,\delta(r_i - R_2)}_{\delta(r_i - R_1)\,\delta(R_1 - R_2)} + \sum_{i \ne j} \delta(r_i - R_1)\,\delta(r_j - R_2)
= \delta(R_1 - R_2)\, n(R_1; \Gamma) + n^{(2)}(R_1, R_2; \Gamma).   (15)

This expression will be useful in the next section.

In order to discuss the behavior of the correlation function, let us consider for definiteness a fluid composed of particles interacting by a Lennard-Jones potential,

V_{LJ}(r) = \varepsilon_0 \left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right].   (16)

This potential reproduces to a good approximation the force^{39} acting between two noble gas atoms (Ne, Ar, Xe, ...). The parameters \varepsilon_0 and \sigma (see figure [not included]) are adjusted to the atomic species under consideration.

37 Except in case (c) below.
38 This same name is often also applied to g(R) or to n^{(2)}(R).
39 The behavior \propto r^{-6} as r \to \infty follows from a quantum mechanical calculation that takes into account that atoms, although neutral, have fluctuating dipole moments.


(a) At high temperature and relatively low density the system is in its gas phase and the correlation function g(r) behaves as in the figure [not included].

(b) At lower temperature and higher density it is in its liquid phase and g(r) has a behavior as in the figure [not included]. The same behavior prevails in an amorphous solid (as may be obtained, e.g., by rapidly quenching a liquid), which however is not an equilibrium state.

(c) At low enough temperature the system will become a crystalline solid and g(r) will behave as in the figure [not included]. In this case it will no longer tend to unity as r \to \infty; we say that the system has long range order.

XI.2.3. Correlation length

The oscillations of g(R), if any, occur at a scale of typically less than a few tens of Å. They are representative of the degree of local order in the fluid. For R \to \infty (in practice a few tens or at most a few hundreds of Å) one may ask how g(R) tends to its limit value g(\infty) = 1. Most often this approach will be exponential and one sets

h(R) = g(R) - 1 \sim e^{-R/\xi},   (17)

where \xi(n, T) is the correlation length and depends on the two intensive variables n and T. Typically, \xi increases with decreasing temperature. Under special circumstances not to be discussed here the correlation length may tend to infinity: the system is then said to become "critical".

XI.3. Structure factor

The (static) structure factor S(q) is defined as

S(q) = \frac{1}{N} \left\langle \left| \sum_{i=1}^{N} e^{-i q \cdot r_i} \right|^2 \right\rangle = \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{N} \left\langle e^{-i q \cdot (r_i - r_j)} \right\rangle.   (18)

We may relate this quantity to the pair density by writing

S(q) = \frac{1}{N} \int_V dR_1 \int_V dR_2\, e^{-i q \cdot (R_1 - R_2)} \underbrace{\left\langle \sum_{i=1}^{N} \delta(r_i - R_1) \sum_{j=1}^{N} \delta(r_j - R_2) \right\rangle}_{n\,\delta(R_1 - R_2)\, +\, n^{(2)}(R_1 - R_2)}
= 1 + \frac{1}{N} \int_V dR_1 \int d(R_2 - R_1)\, e^{i q \cdot (R_2 - R_1)}\, n^{(2)}(R_2 - R_1)
= 1 + n \int dR\, e^{i q \cdot R}\, g(R)
= 1 + (2\pi)^3 n\, \delta(q) + n \int_V dR\, e^{i q \cdot R}\, [g(R) - 1]   (19)


so that (for q \ne 0)

S(q) = 1 + n \underbrace{\int_V dR\, e^{i q \cdot R}\, h(R)}_{\text{Fourier transform of the correlation function}}.   (20)

The structure factor S(q) derives its importance from the fact that it may be measured directly in an elastic scattering experiment with slow neutrons or with X rays. See figure [not yet available].

At a point r in space the amplitude of the scattered wave is equal to a sum of scattering contributions from all particle positions r_i, that is,

\psi_{\text{scatt}}(r) \propto \sum_{i=1}^{N} e^{i k_0 \cdot r_i}\, e^{i k \cdot (r - r_i)} = e^{i k \cdot r} \sum_{i=1}^{N} e^{-i q \cdot r_i},   (21)

where q = k - k_0 and |k| = |k_0|. The scattered intensity measured at r therefore becomes, after ensemble averaging,

I(q) \propto \langle |\psi_{\text{scatt}}(r)|^2 \rangle \propto S(q).   (22)

By using (22) and inverting the Fourier transform in (20) one recovers h(R) and hence g(R) from a measurement of I(q).
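The defining sum (18) is easy to evaluate numerically for a given particle configuration. The sketch below (not from the notes; the uniformly random configuration is purely illustrative) does so for an ideal-gas-like sample, for which S(q) fluctuates around 1 at q \ne 0:

import numpy as np

def structure_factor(positions, q):
    """S(q) = (1/N) |sum_i exp(-i q.r_i)|^2 for one configuration, Eq. (18)."""
    phases = np.exp(-1j * positions @ q)
    return np.abs(phases.sum())**2 / len(positions)

rng = np.random.default_rng(0)
L = 10.0                                          # box size, arbitrary units
positions = rng.uniform(0.0, L, size=(1000, 3))   # uncorrelated ("ideal gas") positions
q = (2 * np.pi / L) * np.array([4.0, 0.0, 0.0])   # a wavevector compatible with the box
print(structure_factor(positions, q))

In practice one also averages over many configurations (the ensemble average in (18)) and over wavevectors of equal length.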

XI.4. Pair correlation and thermodynamics

Let us consider again Hamiltonian (4), but in the absence of an external field,

H = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \underbrace{\sum_{1 \le i < j \le N} U(|r_i - r_j|)}_{U_N(r^N)},   (23)

where we will suppose that the particles are enclosed in a volume V = L3. We now write


the average potential energy Epot as

\langle E_{\text{pot}} \rangle = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1\,(\ne i)}^{N} \langle U(|r_i - r_j|) \rangle
= \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1\,(\ne i)}^{N} \int dR_1 \int dR_2\, U(|R_1 - R_2|)\, \langle \delta(r_i - R_1)\, \delta(r_j - R_2) \rangle
= \frac{1}{2} N n \int dr\, g(r)\, U(r)
= 2\pi N n \int_0^\infty dr\, r^2 g(r)\, U(r).   (24)

We now look for a similar expression for the pressure p. To that end we start from the definition

\beta p = \frac{\partial \log Z_c}{\partial V} = \frac{1}{Z_c}\, \frac{\partial}{\partial V}\, \frac{1}{N!\, h^{3N}} \int dp^N \int dr^N\, e^{-\beta \sum_{i=1}^{N} \frac{p_i^2}{2m} - \beta U_N(r^N)}.   (25)

How to calculate the derivative with respect to V? We should take into account that the limits of integration of the r_i depend on V. Here comes the trick to do that: in order to eliminate this dependence we set

r_i = L r'_i, \qquad i = 1, 2, \ldots, N,   (26)

so that 0 \le x'_i, y'_i, z'_i \le 1 and dr^N = L^{3N} dr'^N = V^N dr'^N. In terms of these new variables of integration Eq. (25) becomes

\beta p = \frac{1}{Z_c}\, \frac{\partial}{\partial V}\, \frac{V^N}{N!\, h^{3N}} \int dp^N \int dr'^N\, e^{-\beta \sum_{i=1}^{N} \frac{p_i^2}{2m} - \beta U_N(L r'^N)},   (27)

in which now all dependence on V [or: L] is visible. The result of the differentiation is

\beta p = \frac{N}{V} - \frac{\beta}{3V} \left\langle \sum_{i=1}^{N} \frac{\partial U_N}{\partial r_i} \cdot r_i \right\rangle.   (28)

The negative of the expression in angular brackets is called the (internal) virial of the system. This equation is actually a reformulation of the virial theorem (see next section for a derivation from first principles; not available yet). The virial theorem relates the average kinetic energy of a many-particle system to another average that involves the forces (and hence the potential energy) between the particles.

So far we have not used for U_N its specific expression (23) as a sum of pair potentials. Now note the identity

\frac{\partial r_{ij}}{\partial r_i} = \frac{r_i - r_j}{r_{ij}}.   (29)


Upon using it together with the expression for UN in Eq. (27) we get

p = n k_B T - \frac{1}{3V} \sum_{i=1}^{N} \sum_{j=1\,(\ne i)}^{N} \left\langle \frac{\partial U(r_{ij})}{\partial r_{ij}}\, \frac{r_i - r_j}{r_{ij}} \cdot r_i \right\rangle
= n k_B T - \frac{1}{6V} \sum_{i=1}^{N} \sum_{j=1\,(\ne i)}^{N} \left\langle U'(r_{ij})\, \frac{r_{ij}^2}{r_{ij}} \right\rangle
= n k_B T - \frac{2}{3}\pi n^2 \int_0^\infty dr\, r^3 g(r)\, U'(r),   (30)

the last line being obtained in the same way as the derivation of (24). When you compare (30) to (V.28),

\frac{p}{k_B T} = n + B_2 n^2 + B_3 n^3 + \ldots,   (31)

you understand why that equation is called the "virial expansion." The terms of order n^3 and higher in (31) appear when g(r) in (30) is expanded as a power series in n.

Exercise. Check Eq. (30); verify in particular that it is dimensionally correct.

We have therefore succeeded in expressing the pressure with the help of the two-particle correlation function. The second term in (30) is added to the ideal gas result.
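As a sketch of how Eq. (30) can be used in practice (not part of the notes; the low-density approximation g(r) \approx e^{-\beta U(r)} is only an illustrative input, and reduced Lennard-Jones units are assumed), the pressure follows from a simple numerical quadrature:

import numpy as np

def pressure(n, T, eps0=1.0, sigma=1.0, kB=1.0):
    """Eq. (30) evaluated with the low-density estimate g(r) = exp(-U(r)/kB T)."""
    r = np.linspace(0.5 * sigma, 10.0 * sigma, 20000)                 # avoid the r = 0 divergence
    U = eps0 * ((sigma / r)**12 - (sigma / r)**6)                     # Lennard-Jones potential, Eq. (16)
    dU = eps0 * (-12.0 * sigma**12 / r**13 + 6.0 * sigma**6 / r**7)   # U'(r)
    g = np.exp(-U / (kB * T))                                         # low-density g(r), an assumption
    integrand = r**3 * g * dU
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))   # trapezoid rule
    return n * kB * T - (2.0 / 3.0) * np.pi * n**2 * integral

print(pressure(n=0.05, T=2.0))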

XI.6. Virial and virial theorem

Done in class. [See Diu, p. 704 sqq. or C. Kittel, Elementary statistical physics, p.222.]

XI.7. Application: pair correlation in the one-dimensional Ising model


XII. SYSTEMS OF CHARGED PARTICLES

A plasma is an ionized gas. An electrolyte is a system of charged particles dissolved in a liquid. The theoretical study of systems with charged particles requires special methods because of the long range of the Coulomb interaction. Many of the methods developed for systems with short range interactions do not apply any longer. As a general rule, the mathematical problems stem from the fact that the space integrals encountered do not converge, either at infinity or at the origin, or both.

XII.1. The problem

We consider N = N_ion + N_el particles, ions and electrons, in a volume V. We let Z stand for the atomic number. Out of the N particles, N_ion have an electric charge q_i = Ze and a mass m_i = M (where i = 1, 2, \ldots, N_ion), and N_el have a charge q_i = -e and a mass m_i = m (where i = N_ion + 1, \ldots, N_ion + N_el). We let N be even and impose global neutrality, \sum_{i=1}^{N} q_i = 0. This leads to N_ion = (Z+1)^{-1} N and N_el = Z(Z+1)^{-1} N. The Hamiltonian of this system is

H = \sum_{i=1}^{N} \frac{p_i^2}{2m_i} + \frac{1}{2} \underbrace{\sum_{i=1}^{N} \sum_{j=1,\, j \ne i}^{N} \frac{1}{4\pi\varepsilon}\, \frac{q_i q_j}{|r_i - r_j|}}_{\equiv\, \sum_{i=1}^{N} q_i U_i(r_i)}.   (1)

Here U_i(r_i) is the potential at the position of the ith particle due to all other particles. The canonical partition function is

Z = \frac{1}{N_{\text{el}}!\, N_{\text{ion}}!} \left(\frac{2\pi m}{\beta h^2}\right)^{\frac{3}{2}N_{\text{el}}} \left(\frac{2\pi M}{\beta h^2}\right)^{\frac{3}{2}N_{\text{ion}}} Z_{\text{conf}},   (2)

Z_{\text{conf}} = \int dr_1 \ldots \int dr_N\, e^{-\frac{1}{2}\beta \sum_{i=1}^{N} \sum_{j=1,\, j \ne i}^{N} \frac{1}{4\pi\varepsilon} \frac{q_i q_j}{|r_i - r_j|}},   (3)

where Z_{\text{conf}} is the configurational partition function. There is no known way of calculating Z_{\text{conf}} exactly and the question that arises is: how to find the best possible approximation? A physical idea is the following. A positive ion will attract electrons around itself rather than other positive ions. We may therefore imagine that around a given ion there will form a "cloud" of negative charges that "shield" its potential. As a consequence, the ion will be the center of a potential that decays at large distances not as \sim 1/r but faster^{40}.

XII.2. Debye-Hückel theory

XII.2.1 Equations.

40 The idea of an effective potential also occurs in van der Waals' theory of phase transitions (not discussed in these lecture notes); it is a fundamental concept in statistical physics.


The first particle (i = 1) is an ion of charge q_1 = Ze. We choose its position as the origin of our coordinate system and ask about the spatial distribution of the other charges. [*** We might alternatively have fixed a "test" charge at the origin.]

XII.2.2 Solution.


XIII. CHEMICAL EQUILIBRIUM

This chapter is an application of the canonical and the grand-canonical ensemble.

XIII.1. Introduction

Suppose a volume V contains N_A, N_B, and N_C particles of three different gases A, B, and C, respectively. We will denote the free energy of this system, when it is at equilibrium at temperature T, by F(T, V, N_A, N_B, N_C). It may be calculated, in principle, from the system Hamiltonian H(N_A, N_B, N_C) via the canonical partition function Z_c according to

Z_c = e^{-\beta F} = \frac{1}{h^{3N_A} N_A!}\, \frac{1}{h^{3N_B} N_B!}\, \frac{1}{h^{3N_C} N_C!} \int d\Gamma_A \int d\Gamma_B \int d\Gamma_C\, e^{-\beta H(N_A, N_B, N_C)}.   (1)

Each species has its own chemical potential,

\mu_A = \frac{\partial F}{\partial N_A}, \qquad \mu_B = \frac{\partial F}{\partial N_B}, \qquad \mu_C = \frac{\partial F}{\partial N_C}.   (2)

A variation (dN_A, dN_B, dN_C) of the three particle numbers will change the free energy by

dF = \mu_A\, dN_A + \mu_B\, dN_B + \mu_C\, dN_C.   (3)

XIII.2. Chemical reaction

XIII.2.1. Condition for chemical equilibrium

Suppose now that we “turn on” the possibility of the chemical reaction

A + B \rightleftharpoons C.   (4)

Examples are e^- + H^+ \rightleftharpoons H and 2H \rightleftharpoons H_2. That is, we have released a constraint. You know from chapter III (the Canonical Ensemble) that in such a case the general rule is that the system will tend to a new equilibrium in which the free energy takes its minimum value compatible with the remaining constraints.

In the present case this means that N_A, N_B, and N_C will take values N_A^*, N_B^*, and N_C^* such that dF = 0 for all allowed variations (dN_A, dN_B, dN_C). Why "allowed"? That is because the reaction implies that necessarily dN_C = -dN_A = -dN_B. Using this in (3) we see that the resulting free energy change is given by

dF = (−µA − µB + µC) dNC . (5)

It follows that in the presence of the reaction chemical equilibrium occurs for

µA + µB = µC . (6)

This relation is easily generalized to more complicated reactions (formula given in class).


XIII.2.2. Finding N∗A, N∗B, and N∗C

To find the particle numbers at chemical equilibrium a slightly more elaborate version of the above argument is needed. Done in class.

XIII.3. A special case: three ideal gases

Let A, B, and C be three ideal gases. Let C be a bound state of an A and a B whose binding energy is -u_0 (we will imagine u_0 > 0). We will begin by considering this system in the absence of any reaction, that is, with fixed N_A, N_B, and N_C.

The system Hamiltonian is, in self-evident notation,

H(N_A, N_B, N_C) = \sum_{i=1}^{N_A} \frac{p_i^2}{2m_A} + \sum_{j=1}^{N_B} \frac{p_j'^2}{2m_B} + \sum_{k=1}^{N_C} \frac{p_k''^2}{2m_C} - N_C u_0.   (7)

You may easily calculate the canonical partition function Z_c using Eq. (1), and then the grand-canonical partition function Z_g. The result is that

Z_g(T, V, \mu_A, \mu_B, \mu_C) = \sum_{N_A=0}^{\infty} \sum_{N_B=0}^{\infty} \sum_{N_C=0}^{\infty} e^{\beta\mu_A N_A + \beta\mu_B N_B + \beta\mu_C N_C}\, Z_c(T, V, N_A, N_B, N_C)
= \exp\left[ e^{\beta\mu_A} \frac{V}{\lambda_A^3} + e^{\beta\mu_B} \frac{V}{\lambda_B^3} + e^{\beta(\mu_C + u_0)} \frac{V}{\lambda_C^3} \right]
= \exp\Big[ \underbrace{\bigl(\zeta_A + \zeta_B + \zeta_C\, e^{\beta u_0}\bigr) V}_{-\beta J} \Big],   (8)

in which the last line defines the \zeta_\alpha and where furthermore \lambda_\alpha = h/\sqrt{2\pi m_\alpha k_B T}, for \alpha = A, B, C. For the densities of the three particle species we have

n_A = \frac{\langle N_A \rangle}{V} = \frac{1}{V}\, \frac{\partial \log Z_g}{\beta\, \partial\mu_A} = \ldots = \zeta_A   (9)

and similarly

n_B = \zeta_B, \qquad n_C = e^{\beta u_0}\, \zeta_C.   (10)

Now take the reaction into account: this means imposing Eq. (6), \mu_A + \mu_B = \mu_C. It then follows from Eqs. (9)-(10) that

\frac{\zeta_A \zeta_B}{\zeta_C} = \frac{\lambda_C^3}{\lambda_A^3 \lambda_B^3} = \left(\frac{2\pi k_B T}{h^2}\, \frac{m_A m_B}{m_C}\right)^{3/2},   (11)

which may be rewritten as

\frac{n_A n_B}{n_C} = e^{-\beta u_0} \left(\frac{2\pi k_B T}{h^2}\, \frac{m_A m_B}{m_C}\right)^{3/2}.   (12)


This is called the Saha equation or alternatively the law of mass action. It tells us where the reaction equilibrium is. It is remarkable that the ratio in Eq. (12) depends only on the particle masses, the temperature, and the binding energy u_0.

Exercise (done in class). In each of the following cases, determine with the aid of Saha's equation in which direction the equilibrium shifts:
(a) when the temperature is raised;
(b) when the binding energy u_0 is increased;
(c) when the pressure is increased (i.e., the volume decreased).
Could you comment on the case u_0 < 0?

XIII.4. Ionization equilibrium of atomic hydrogen

As an application of the preceding section we consider a gas of atomic hydrogen at high temperature, such as occurs, for example, in stellar atmospheres. Notation:

H^+ + e^- \rightleftharpoons H
masses: m_+, m_-, m_0
densities: n_+, n_-, n_0   (13)

Since m+ ≈ m0 and n+ = n− the Saha equation becomes (check this)

\frac{n_+^2}{n_0} = \frac{e^{-\beta u_0}}{\lambda_-^3},   (14)

where \lambda_- = h/\sqrt{2\pi m_- k_B T}. That is, we have one equation for two unknowns. However, there is one more equation. Let us denote by n the total number density of H atoms that we would have in the absence of ionization; equivalently, n = total proton density = total electron density. We may consider n as a given variable. Let furthermore

x = \frac{n_+}{n}   (15)

stand for the fraction of ionized atoms. Then n = n_0 + n_+ whence n_0 = n(1-x). Using this in the Saha equation (14) we find

\frac{x^2}{1-x} = \frac{e^{-\beta u_0}}{n\lambda_-^3} = \frac{1}{n} \left(\frac{2\pi m_- k_B T}{h^2}\right)^{3/2} e^{-u_0/k_B T}.   (16)

This quadratic equation for the ionized fraction x(n, T) at given density and temperature is easily solved. The temperature dependence is dominated by the exponential. A value typical of a stellar atmosphere is n = 10^{20} m^{-3}. With this value one finds from Eq. (16) that the ionized fraction goes up from close to zero below 3000 K to about \frac{1}{2} around 10 000 K and almost 1 above 15 000 K.
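A sketch that solves the quadratic equation (16) for x (not part of the notes; constants from scipy.constants) reproduces this steep, exponentially driven rise of the ionized fraction with temperature:

import numpy as np
from scipy.constants import h, k as kB, m_e, eV

def ionized_fraction(n, T, u0=13.6 * eV):
    """Physical root (0 <= x <= 1) of x^2/(1-x) = K, with K taken from Eq. (16)."""
    lambda3 = (h**2 / (2 * np.pi * m_e * kB * T))**1.5   # lambda_-^3
    K = np.exp(-u0 / (kB * T)) / (n * lambda3)
    return 0.5 * (-K + np.sqrt(K**2 + 4 * K))            # root of x^2 + K x - K = 0

n = 1e20                                                 # m^-3, typical stellar atmosphere
for T in (3000.0, 10000.0, 15000.0):
    print(T, ionized_fraction(n, T))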


Exercise. Let an atomic hydrogen gas be in equilibrium at temperature T. The ratio of the number N_2 of atoms in the excited state of quantum number n = 2 and energy -E_I/4 to the number N_1 in the ground state (quantum number n = 1, energy -E_I = -13.6 eV) is (why?)

\frac{N_2}{N_1} = 4 \exp\left(-\frac{3}{4}\beta E_I\right).   (17)

Calculate this fraction numerically for T = 10 000 K. Explain why it is in some way "easier" to ionize an H atom than to raise it to an excited state.

XIII.5. Reactants with internal states

Discussed in class.

XIII.6. Reaction kinetics

Consider again reaction (4),

A + B \underset{k_2}{\overset{k_1}{\rightleftharpoons}} C.   (18)

Here the dissociation rate k_2 is the probability per unit of time for a C to dissociate; and the recombination rate k_1 is the probability per unit of time for an A to recombine with a B, per unit of number density of the B's.

Let n_A, n_B, and n_C be the time dependent number densities (or "concentrations") of the three species. In your chemistry course you have learned to associate with a reaction of this type the time evolution equations (often called chemical rate equations)

\dot{n}_C = k_1 n_A n_B - k_2 n_C,
\dot{n}_A = -k_1 n_A n_B + k_2 n_C,
\dot{n}_B = -k_1 n_A n_B + k_2 n_C.   (19)

Note that (19) implies mass conservation, as expressed by (d/dt)(n_A + n_B + 2n_C) = 0. The rate equations say that the system is in equilibrium when the right-hand sides vanish, that is, when

\frac{n_A n_B}{n_C} = \frac{k_2}{k_1}.   (20)

Upon comparing (20) to the Saha equation (12) we see that the reaction rates k1 and k2

cannot be arbitrary but must have a ratio fixed by fundamental constants (the particle masses, the temperature, and the binding energy).
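A sketch integrating the rate equations (19) with a forward-Euler step (not in the notes; the rate constants, initial densities, and units are arbitrary illustrative choices) shows the densities relaxing to a state obeying (20):

# Forward-Euler integration of the rate equations (19); all quantities in arbitrary units.
k1, k2 = 1.0, 0.5
nA, nB, nC = 1.0, 2.0, 0.0
dt = 1.0e-3
for _ in range(200_000):
    flux = k1 * nA * nB - k2 * nC   # net rate of the reaction A + B -> C
    nC += dt * flux
    nA -= dt * flux
    nB -= dt * flux
print(nA * nB / nC, k2 / k1)        # both ~ 0.5 once equilibrium is reached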


Appendix A. PROBABILITY DISTRIBUTIONS

A.1. Random variables

Example 1. When a die is thrown, the result may be 1 or 2 or ... or 6 pips, but the outcome cannot be predicted: it is a random variable. Let's denote by P(n) the probability that the result be n. For a fair die we have P(n) = 1/6 for all n = 1, 2, \ldots, 6. If the die is biased, the P(n) may be different from 1/6, but in all cases we should have

0 \le P(n) \le 1 \quad \text{for } n = 1, 2, \ldots, 6,   (1)

P(1) + P(2) + \ldots + P(6) = 1.   (2)

This example leads us to the following definition.

Definition. A discrete random variable n is specified by
(i) the set X of all values it may take;
(ii) a function P(n) such that

0 \le P(n) \le 1 \quad \text{for all } n \in X,   (3)

\sum_{n \in X} P(n) = 1 \quad \text{(normalization)}.   (4)

We call P(n) a probability law or probability distribution. Eq. (4) is the normalization condition of this law: it expresses that we are sure that n will take one of the values in X.

Remark. In many applications X will be a subset of the integers, as in this example. However, X may be any set, e.g. X = {red, yellow, green}.

Example 2. A needle is dropped onto a two-dimensional surface equipped with a coordinate system. What is the probability that the needle make an angle \phi of, say, \pi/4 with the x axis? Obviously, the probability that the angle be exactly \pi/4 = 0.785398163\ldots is zero, and we should rephrase the question as follows: what is the probability that \phi be in the interval [\pi/4 - \epsilon, \pi/4 + \epsilon], for some small given \epsilon?

More generally, given a prescribed angle \phi_0, let P(\phi_0)\Delta\phi_0 be the probability that the angle be in the interval [\phi_0, \phi_0 + \Delta\phi_0]. Instead of (3) and (4) we now have

P(\phi) \ge 0 \quad \text{for } 0 \le \phi \le \pi,   (5)

\int_0^\pi d\phi\, P(\phi) = 1.   (6)

Note that whereas P(\phi_0)\Delta\phi is a probability (a number in [0, 1]), the function P(\phi) is a probability density (a probability per unit of angle). This example leads us to the following definition.

Definition. A continuous random variable x is specified by


(i) the set X of all values it may take;
(ii) a function P(x) such that

0 \le P(x) \quad \text{for all } x \in X,   (7)

\int_X dx\, P(x) = 1 \quad \text{(normalization)}.   (8)

Again, we refer to P(x) as a probability law or probability distribution. It is actually a probability density. In the case of a continuous variable a frequent misuse of language is an expression like "the probability P(x) that the variable be equal to x". We will tolerate this as long as the speaker is aware of what he actually means.

Remarks.
1. A random variable is also called a stochastic variable.
2. An outcome (n or x in the above examples) is also called an event.
3. Mathematicians often use a separate symbol (for example n or x) to distinguish the random variable itself from its possible outcomes. This allows such statements as "the probability that n be equal to n". However, we will not complicate our notation; it will always be clear what is meant.
4. The discrete and continuous cases discussed above are actually not very different from one another. In our general discussion below we will denote the random variable by x without specifying whether it is discrete or continuous.
5. Obviously, one may conceive of random variables whose range X is partly discrete and partly continuous. Can you think of an example from physics?

A.2. Random variables in physics

A random variable as defined above may be considered as a purely mathematical object. In that case one may choose the probability distributions, P(n) or P(x), without any constraint. However, in statistical mechanics we are going to consider certain physical observables as random variables, for example

– the number of atoms in a (mentally delineated) subvolume of 1 cm^3 inside a container filled with an atomic gas;

– the speed of a Brownian particle suspended in a gas.

In both cases the variable changes rapidly and unpredictably in time and by performing repeated measurements we may determine its probability distribution. One therefore introduces for such a variable a probability law P(x) supposed to summarize our knowledge about it. Evidently, such physical probability laws cannot be arbitrary: they must be compatible with the fundamental laws of physics. The probability laws that we will encounter in this course are all derived from a few basic postulates combined with the laws of physics. We will consider these postulates as correct when the results of our statistical theory are confirmed by the experimental measurements.

In many applications X will be equal to the real axis R or a subset thereof. However, we will often also consider multicomponent variables x = (x_1, x_2, \ldots, x_m) taking their


values in the m-dimensional space R^m.

Example 3. Suppose two particles with positions r_1 and r_2 are contained in an external potential V_{\text{ext}}(r) and interact through V(|r_1 - r_2|). Their total potential energy E_{\text{pot}} is then E_{\text{pot}} = V(|r_1 - r_2|) + V_{\text{ext}}(r_1) + V_{\text{ext}}(r_2). Statistical mechanics asserts that in thermal equilibrium at a temperature T, the probability density P(r_1, r_2) for finding the two particles at positions r_1 and r_2, respectively, is proportional to \exp(-E_{\text{pot}}/k_B T), that is, to the Boltzmann factor^{41}. Then

P(r_1, r_2) = C\, e^{-[V(|r_1 - r_2|) + V_{\text{ext}}(r_1) + V_{\text{ext}}(r_2)]/k_B T},   (9)

where C is the appropriate normalization constant. This P is a probability distribution in the six-dimensional space R^6.

Example 4. Statistical mechanics asserts that the probability distribution P(v) = P(v_1, v_2, v_3) of the velocity v = (v_1, v_2, v_3) of a particle of mass m in a gas at equilibrium at a temperature T is given by

P(v) = \left(\frac{m}{2\pi k_B T}\right)^{3/2} e^{-\frac{m v^2}{2 k_B T}},   (10)

which is called the Maxwell distribution and is a probability distribution in R^3. It is proportional to the Boltzmann factor of the kinetic energy E_{\text{kin}} = \frac{1}{2} m v^2.

A.3. Examples of probability distributions

A.3.1. Distributions of a continuous variable x

a. The exponential law

P(x) = a\, e^{-ax}, \qquad x \ge 0.   (11)

b. The Lorentz or Cauchy law

P(x) = \frac{a}{\pi(a^2 + x^2)}, \qquad -\infty < x < \infty.   (12)

c. The uniform distribution on an interval

P(x) = \frac{1}{b - a}, \qquad a \le x \le b.   (13)

d. The Gaussian or normal law

P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-a)^2}{2\sigma^2}}, \qquad -\infty < x < \infty.   (14)

A.3.2. Distributions of a discrete variable n

41 Here k_B is Boltzmann's constant (more about it later).


e. Poisson’s law

P (n) = e−aan

n!, n = 0, 1, 2, . . . (15)

f. The binomial law

P (n) =

(nmax

n

)pn(1− p)nmax−n, n = 0, 1, 2, . . . , nmax. (16)

Exercise. Verify for each case that the distribution is correctly normalized.

Exercise. How does Poisson's law arise? Let N point-like dots be placed randomly on a line segment of length L, independently and with a uniform distribution. We take the limit N, L \to \infty with N/L = \rho fixed. What is the probability p_n that a subsegment of length y contain exactly n dots?
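A small simulation of this construction (a sketch, not part of the notes; the parameter values are arbitrary). The number of dots falling in the subsegment is binomially distributed, and for large N and L at fixed \rho it approaches Poisson's law with a = \rho y:

import numpy as np
from math import exp, factorial

rng = np.random.default_rng(1)
rho, L, y = 2.0, 10_000.0, 3.0          # dot density, segment length, subsegment length
N = int(rho * L)                        # number of dots on the whole segment

# The count in a subsegment of length y is binomial(N, y/L); sample it many times.
counts = rng.binomial(N, y / L, size=200_000)

a = rho * y
for n in range(6):
    print(n, round(float(np.mean(counts == n)), 4), round(exp(-a) * a**n / factorial(n), 4))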

A.4. Averages and moments

Given a probability distribution P (x), one calls

\langle x \rangle = \int dx\, x\, P(x)   (17)

the average or expectation value of x. If g(x) is a function of x, we call

\langle g(x) \rangle = \int dx\, g(x)\, P(x)   (18)

the average of g(x). The special choice g(x) = x^m leads to

\langle x^m \rangle = \int dx\, x^m\, P(x),   (19)

called the mth moment of x.

The quantity x - \langle x \rangle is the deviation of a particular value x of the random variable

from its expectation value 〈x〉. It is frequently written as

∆x = x− 〈x〉. (20)

Evidently 〈∆x〉 = 0. However, if we first square and then average we get

\sigma_x^2 \equiv \langle \Delta x^2 \rangle = \langle (x - \langle x \rangle)^2 \rangle,   (21)

called the mean square deviation, or the variance, or in certain contexts the dispersion,^{42} of x. It is necessarily nonnegative: \sigma_x^2 \ge 0. The quantity \sigma_x itself is called the root-mean-square (rms) or standard deviation of x.

42 In your Quantum Mechanics course we used \delta x and \Delta x instead of our present \Delta x and \sigma_x, respectively.


Exercise. Show that σ2x = 〈x2〉 − 〈x〉2.

Exercise. Calculate \langle x \rangle and \sigma_x for the probability laws in section A.3. Which new phenomenon appears in case b? Adapt the definitions of \langle x \rangle and \sigma_x to the discrete case for the examples e and f.

A.5. Multivariate probability laws

By a multivariate law we mean one that depends on two or more variables, such as in Examples 3 and 4 above. We return to Example 4. The Maxwell distribution (10) may be rewritten as

P (v) = p(v1)p(v2)p(v3), (22)

where

p(v_i) = \left(\frac{m}{2\pi k_B T}\right)^{1/2} e^{-\frac{m v_i^2}{2 k_B T}}, \qquad i = 1, 2, 3.   (23)

The important point is that here P factorizes into three separate probability distributions (that in this case happen to be all equal) for v_1, v_2, and v_3. We say that v_1, v_2, and v_3 are independent random variables. This leads us to the following more general definition.

Definition. Whenever a probability distribution factorizes into two or more distributions involving disjoint subsets of its variables, we say that these subsets are independent.

In the general case, whether or not its variables are independent, when a multivariate distribution P of r variables is integrated over one of them, one obtains the probability distribution – let us call it P' – of the remaining r - 1 variables:

P'(x_1, \ldots, x_{r-1}) = \int dx_r\, P(x_1, \ldots, x_{r-1}, x_r).   (24)

Here P' is called a marginal distribution of P. By repeated integrations one obtains marginal distributions of still fewer variables.

For a multivariate distribution one defines the correlation between two variables x_i and x_j as

C_{ij} \equiv \langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \rangle = \langle \Delta x_i\, \Delta x_j \rangle.   (25)

For i = j we have that C_{ii} = \sigma_{x_i}^2; for i \ne j we call C_{ij} a covariance. The symmetric matrix (C_{ij}) is called the covariance matrix. In the frequently occurring case where for all i we have \langle x_i \rangle = 0, the (co)variances simplify to C_{ij} = \langle x_i x_j \rangle.

Two variables x_i and x_j for which C_{ij} = 0 are said to be uncorrelated. Independent variables are uncorrelated; conversely, however, uncorrelated variables are not necessarily independent. Random variables that are independent and drawn from the same probability law are said to be independent and identically distributed, or "i.i.d.".

Remark. In order to calculate an average involving only the subset of variables x_1, \ldots, x_s, but not its complement x_{s+1}, \ldots, x_r, it suffices to know the marginal law restricted to that subset.


Exercise. Illustrate this remark explicitly by calculating \langle v_3 \rangle with respect to the Maxwell distribution.

Exercise. Consider the one-dimensional analog of Example 3. Calculate the averages and the covariance matrix of the probability distribution P(x_1, x_2). Are the signs (positive/negative) of your answers compatible with what should be expected?

A.6. Generating function

A.6.1. Moments

Let k be any real parameter. The particular average

G(k) \equiv \langle e^{ikx} \rangle = \int_{-\infty}^{\infty} dx\, e^{ikx}\, P(x)   (26)

is the Fourier transform of P(x) normalized such that G(0) = 1. It is also the moment generating function associated with P(x). This becomes clear when we Taylor expand G(k) in k,

G(k) = \sum_{\ell=0}^{\infty} \frac{(ik)^\ell}{\ell!}\, \langle x^\ell \rangle.   (27)

In certain cases the easiest way to calculate the moments \langle x^\ell \rangle may be to first calculate G(k) and then expand it in powers of k. The full distribution P(x) may be obtained from G(k) by an inverse Fourier transformation.

Note: Equation (26) acquires a simpler look if you set e^{ik} = s and consider G as a function of s instead of k; you may also just set ik = p and consider G as a function of p.

If instead of x we have a discrete variable n, the generating function (26) obviously becomes (now in terms of the variable s)

G(s) \equiv \langle s^n \rangle = \sum_n s^n p_n.   (28)

The inverse of this relation is

p_n = \frac{1}{2\pi i} \oint \frac{ds}{s^{n+1}}\, G(s),   (29)

where the integration is counterclockwise around the origin of the complex s plane and the residue theorem is used.

A.6.2. Cumulants

We may ask what series is generated not by G(k) but by \log G(k). This yields, instead of (27), the series

\log G(k) = \sum_{\ell=1}^{\infty} \frac{(ik)^\ell}{\ell!}\, \kappa_\ell,   (30)


whose coefficients \kappa_\ell are called the cumulants of the distribution P(x). The cumulants are nonlinear combinations of the moments. One easily finds

\kappa_1 = \mu_1,
\kappa_2 = \mu_2 - \mu_1^2,
\kappa_3 = \mu_3 - 3\mu_2\mu_1 + 2\mu_1^3,
\kappa_4 = \mu_4 - 4\mu_3\mu_1 - 3\mu_2^2 + 12\mu_2\mu_1^2 - 6\mu_1^4.   (31)

The general formula is not simple. Note that the second cumulant is also the variance of the distribution. Cumulants appear in all sorts of expansions encountered in theoretical physics.

Exercise. Find the generating function of the Gaussian distribution (14). Show that all its cumulants beyond the second one vanish, that is, \kappa_3 = \kappa_4 = \ldots = 0.

A.7. Transformation of variables

Let P(x) be a given probability distribution and let y = f(x).
Question: What is the probability distribution P(y) of y?
Answer: If y = f(x), the "probability content" of the interval [x, x + dx] must be equal to that of [y, y + dy], that is,

P(x)\, dx = P(y)\, dy,   (32)

where dy = |df(x)/dx|\, dx. It follows that

P(y) = P(x) \left| \frac{dx}{dy} \right|.   (33)

For multivariate distributions the absolute value of the Jacobian appears.

Exercise. Derive from (10) the marginal distribution P_1(v) for the velocity v \equiv |v| and then the distribution P_2(E) of the kinetic energy E = \frac{1}{2}mv^2.

A.8. Central limit theorem

Let x_1, x_2, \ldots, x_r be independent variables drawn from distributions P_{x_1}(x_1), P_{x_2}(x_2), \ldots, P_{x_r}(x_r) whose corresponding generating functions are G_{x_1}(k_1), G_{x_2}(k_2), \ldots, G_{x_r}(k_r). We will take these variables to have zero average.

We define the sum y = x_1 + x_2 + \ldots + x_r and are interested in its probability distribution P_y(y); we will call the corresponding generating function G_y(k).

By Fourier transforming

P_y(y) = \int dx_1 \ldots \int dx_r\, P_{x_1}(x_1) \ldots P_{x_r}(x_r)\, \delta(y - x_1 - x_2 - \ldots - x_r)   (34)

we find

G_y(k) = G_{x_1}(k)\, G_{x_2}(k) \ldots G_{x_r}(k).   (35)


Suppose now all x_i identically distributed, denote their law by P_x(x), and the corresponding generating function by G_x(k). Then (35) becomes

Gy(k) = [Gx(k)]r . (36)

1. Px(x) a Gaussian. For the special case of a Gaussian Px(x) we have the correspondence

P_x(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{x^2}{2\sigma^2}} \iff G_x(k) = e^{-\frac{1}{2}\sigma^2 k^2}.   (37)

In this case we see by combining (36) and (37) that G_y(k) = \exp(-\frac{1}{2} r\sigma^2 k^2) and hence P_y(y) is a Gaussian of variance r\sigma^2.

This suggests defining a differently scaled sum variable,

z = \frac{x_1 + x_2 + \ldots + x_r}{\sqrt{r}}.   (38)

By relating its generating function to that of y we obtain

G_z(k) = \langle e^{ikz} \rangle = \left\langle e^{i\frac{k}{\sqrt{r}} y} \right\rangle = G_y\!\left(\frac{k}{\sqrt{r}}\right) = \left[G_x\!\left(\frac{k}{\sqrt{r}}\right)\right]^r = e^{-\frac{1}{2}\sigma^2 k^2},   (39)

whence it follows that the distribution Pz(z) is a Gaussian of variance σ2.

2. P_x(x) arbitrary. Now things become interesting. For an arbitrary distribution P_x(x) we have that

G_x(k) = 1 - \frac{1}{2}\sigma^2 k^2 + \ldots,   (40)

where the dots stand for higher order terms in k. Again we ask about the distribution P_z(z) of the scaled sum defined by (38). Upon substituting (40) in (36) we find

G_z(k) = \left[1 - \frac{\sigma^2 k^2}{2r} + \ldots\right]^r.   (41)

We observe that the dot terms, which are of higher order in k, are also of higher order in 1/\sqrt{r}. Hence, upon taking the limit r \to \infty we get from (41) the final result

G_z(k) = e^{-\frac{1}{2}\sigma^2 k^2}, \qquad r \to \infty.   (42)

The correspondence (37) then leads to the Central Limit Theorem (CLT), that we may now formulate as follows.

Let i.i.d. random variables x_1, x_2, \ldots, x_r be drawn from an arbitrary common probability distribution P_x(x) that has a variance \sigma^2. Then the scaled sum z = (x_1 + x_2 + \ldots + x_r)/\sqrt{r} is distributed, in the limit r \to \infty, according to the Gaussian

P_z(z) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{z^2}{2\sigma^2}}.   (43)
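A numerical illustration of the theorem (a sketch, not in the notes; the uniform law and the sample sizes are arbitrary choices): sums of r zero-mean uniform variables, scaled by 1/\sqrt{r}, acquire the variance \sigma^2 and the Gaussian weight of the interval |z| < \sigma:

import numpy as np

rng = np.random.default_rng(2)
r, samples = 50, 100_000
x = rng.uniform(-0.5, 0.5, size=(samples, r))   # zero mean, variance sigma^2 = 1/12
z = x.sum(axis=1) / np.sqrt(r)                  # scaled sum, Eq. (38)

sigma2 = 1.0 / 12.0
print(z.var(), sigma2)                          # both ~ 0.0833
print(np.mean(np.abs(z) < np.sqrt(sigma2)))     # ~ 0.683, the Gaussian value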

Remarks.


(1) It is sometimes said that under addition of i.i.d. random variables the Gaussian distribution is an attractor in the space of probability laws. Mathematically this happens in Eq. (41), where the dot terms, that differentiate between different distributions P_x(x), are seen to be negligible when r \to \infty.

(2) The CLT applies to a very wide class of laws P_x(x). Nevertheless, excluded are those that do not satisfy \sigma^2 < \infty, that is, do not have a finite variance. For such laws there exist other attractors, called Lévy distributions.


Appendix B. METHOD OF STEEPEST DESCENT

B.1. Maximum inside an interval

Suppose you are given the integral

I_N = \int_{-\infty}^{\infty} dx\, e^{N g(x)}   (1)

in which g(x) is an arbitrary (sufficiently differentiable) function having a maximum on the x axis. Of course you cannot calculate I_N explicitly if you know nothing else about g(x).

Question: In the absence of any further knowledge, what can you say about the behavior of I_N in the limit of large N?

Answer: Let the maximum of g(x) occur at x = x0. Expand

g(x) = g(x_0) + \frac{1}{2}g''(x_0)(x - x_0)^2 + \frac{1}{6}g'''(x_0)(x - x_0)^3 + \frac{1}{24}g^{\mathrm{iv}}(x_0)(x - x_0)^4 + O\bigl((x - x_0)^5\bigr),   (2)

substitute this expansion in (1), and change to the variable of integration y = x - x_0. This leads to

I_N = e^{N g(x_0)} \int_{-\infty}^{\infty} dy\, e^{N\left[\frac{1}{2}g''(x_0) y^2 + \frac{1}{6}g'''(x_0) y^3 + \frac{1}{24}g^{\mathrm{iv}}(x_0) y^4 + O(y^5)\right]}.   (3)

Now scale y by a power of N such that the leading (that is, the quadratic) term in the exponential becomes N independent. This fixes the choice \bar{y} = N^{\frac{1}{2}} y. Hence

I_N = N^{-\frac{1}{2}}\, e^{N g(x_0)} \int_{-\infty}^{\infty} d\bar{y}\, e^{\frac{1}{2}g''(x_0)\bar{y}^2 + \frac{1}{6}N^{-\frac{1}{2}} g'''(x_0)\bar{y}^3 + \frac{1}{24}N^{-1} g^{\mathrm{iv}}(x_0)\bar{y}^4 + O(N^{-\frac{3}{2}}\bar{y}^5)}.   (4)

Expand now all terms in the integrand that are accompanied by negative powers of N. This gives

I_N = N^{-\frac{1}{2}}\, e^{N g(x_0)} \int_{-\infty}^{\infty} d\bar{y}\, e^{\frac{1}{2}g''(x_0)\bar{y}^2} \left[1 + \frac{1}{6}N^{-\frac{1}{2}} g'''(x_0)\bar{y}^3 + \frac{1}{24}N^{-1} g^{\mathrm{iv}}(x_0)\bar{y}^4 + O(N^{-\frac{3}{2}}\bar{y}^5)\right].   (5)

Each term on the RHS now contains a Gaussian integral with a power of \bar{y} inserted. Using that odd powers yield zero by symmetry and that

\int_{-\infty}^{\infty} dy\, e^{-\frac{1}{2}\alpha y^2} = \frac{(2\pi)^{\frac{1}{2}}}{\alpha^{\frac{1}{2}}}, \qquad \int_{-\infty}^{\infty} dy\, y^4\, e^{-\frac{1}{2}\alpha y^2} = \frac{3\,(2\pi)^{\frac{1}{2}}}{\alpha^{\frac{5}{2}}},   (6)

, (6)

we obtain, up to the first correction term,

IN =

√2π

N |g′′(x0)|eNg(x0)

[1 +

3√

2 giv

(x0)

12|g′′(x0)|N+ . . .

]. (7)


This series may be determined to any order desired in N−1.
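A numerical check of the leading term of (7) (a sketch, not in the notes; the test function g(x) = x - e^x, which has its maximum at x_0 = 0 with g(x_0) = -1 and g''(x_0) = -1, is an arbitrary choice):

import numpy as np

def g(x):
    return x - np.exp(x)          # maximum at x0 = 0

for N in (5, 20, 100):
    x = np.linspace(-20.0, 5.0, 400_001)
    exact = np.sum(np.exp(N * g(x))) * (x[1] - x[0])     # brute-force quadrature of (1)
    leading = np.sqrt(2.0 * np.pi / N) * np.exp(-N)      # leading term of Eq. (7)
    print(N, exact / leading)                            # ratio tends to 1 as N grows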

Exercise. Slightly generalize this to do the same integral with the insertion of an arbitrary (sufficiently differentiable) function h(x), that is,

I'_N = \int_{-\infty}^{\infty} dx\, h(x)\, e^{N g(x)}.   (8)

[Indication: The first correction term receives an extra contribution proportional to h'(x_0)\, g'''(x_0).]

Exercise. Show that the series (7) does not change if the integration in (1) is only over an interval (a, b) containing x_0, the corrections being of order O(e^{-cN}) for some c > 0.

Remarks.
(1) The method presented here is called the method of steepest descent. It has several alternative names: Laplace's method, the saddle point method, or the stationary phase method. The last two names refer to the fact that when viewed in the complex plane the integrand has a saddle point.
(2) In statistical mechanics N usually represents the system size. In many instances where integrals of the above type occur one is interested first of all in the terms in the exponential that are proportional to N. In that case we may write (7) as

I_N = e^{N g(x_0) + O(\log N)},   (9)

and obtaining this result requires nothing more than the value of g(x_0).
(3) The range of x values that effectively contribute to the integral is determined by the rms deviation \langle (x - x_0)^2 \rangle^{\frac{1}{2}} = 1/\sqrt{N |g''(x_0)|}.
(4) Notice that when h(x) is inserted, the integration acts on this insertion as an unnormalized Dirac delta at x = x_0. This may also be expressed by I'_N/I_N = h(x_0) + \ldots, where the dot terms vanish as N \to \infty.

B.2. Boundary maximum

We consider again the integral (1) but now suppose that x is integrated over a finite interval [a, b] and that g(x) has a boundary maximum at x = a. The calculation then proceeds as follows. [*** To be completed.]


Appendix C. LAGRANGE MULTIPLIER METHOD

Suppose you wish to find the stationary points of a function f(x_1, x_2, \ldots, x_s) under the conditions

g_j(x_1, x_2, \ldots, x_s) = 0,   (1)

where j = 1, 2, \ldots, r and r \le s. To solve this problem, introduce parameters \lambda_1, \lambda_2, \ldots, \lambda_r, called (Lagrange) multipliers, and consider the more general problem of finding the stationary points of

F(x_1, x_2, \ldots, x_s) \equiv f(x_1, x_2, \ldots, x_s) - \sum_{j=1}^{r} \lambda_j\, g_j(x_1, x_2, \ldots, x_s).   (2)

This leads to the s equations

\frac{\partial F}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, s,   (3)

which may have one or several solutions \bigl(x_1^*(\lambda), x_2^*(\lambda), \ldots, x_s^*(\lambda)\bigr), where \lambda = (\lambda_1, \lambda_2, \ldots, \lambda_r).

Now substitute this solution in (1). This leads to

g_j\bigl(x_1^*(\lambda), x_2^*(\lambda), \ldots, x_s^*(\lambda)\bigr) = 0, \qquad j = 1, 2, \ldots, r,   (4)

which are now r equations for the unknown r-component vector \lambda. Let the solution be \lambda^*. Then \bigl(x_1^*(\lambda^*), x_2^*(\lambda^*), \ldots, x_s^*(\lambda^*)\bigr) is a stationary point of f(x_1, x_2, \ldots, x_s) that satisfies the conditions.
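A simple worked example (ours, not from the course): find the stationary points of f(x_1, x_2) = x_1 x_2 under the single condition g_1(x_1, x_2) = x_1 + x_2 - 1 = 0. Here F = x_1 x_2 - \lambda_1 (x_1 + x_2 - 1), and Eqs. (3) give x_2 - \lambda_1 = 0 and x_1 - \lambda_1 = 0, so x_1^*(\lambda) = x_2^*(\lambda) = \lambda_1. Substituting into the condition, Eq. (4) becomes 2\lambda_1 - 1 = 0, whence \lambda_1^* = 1/2 and the stationary point is x_1^* = x_2^* = 1/2, where f = 1/4.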
