Chapter 1

Review of thermodynamics and statistical mechanics

This is a short review that puts together the basic concepts and mathematical expressions of thermodynamics and statistical mechanics, mainly for future reference.

1.1 Thermodynamic variables

For the sake of simplicity, we will refer in this section to a so-called 'simple system', meaning a one-component system, without electric charge or electric or magnetic polarisation, in bulk (i.e. far from any surface), etc. Such a system is characterised from a thermodynamic point of view by means of three variables: N, the number of particles; V, the volume; and E, the internal energy. Normally one explores the thermodynamic limit,

N → ∞,  V → ∞,  N/V = ρ < ∞, (1.1)

where ρ = N/V is the mean density. In a magnetic system the role of volume is played by the magnetisation M. All these parameters are extensive, i.e. they are proportional to the system size: if the system increases in size by a factor λ, the variables change by the same factor.

1.2 Laws of Thermodynamics

Thermodynamics is a macroscopic theory that provides relationships between (but not quantitative values of) the different thermodynamic parameters; no reference is made to any microscopic variable or coordinate. It is based on three well-known laws:

1. First law of thermodynamics: it focuses on the energy, E, stating that the energy is an extensive and conserved quantity. Mathematically:

dE = δW + δQ, (1.2)

where dE is the change in energy involved in an infinitesimal thermodynamic process, δW is the mechanical work done on the system, and δQ is the amount of heat transferred to the system. δW and δQ are inexact differential quantities (i.e. quantities such as W or Q do not exist, in the sense that their values cannot be defined for the equilibrium states of the system). However, dE is the same regardless of the type of process involved, and depends only on the initial and final states: it is a state function, and every equilibrium state of the system has a unique value of E.

In fact both δW and δQ are 'work' terms, but the latter contains the contributions of microscopic interactions in the system, which cannot be evaluated in practical terms. δQ is then separated from the 'thermodynamic' work δW associated with the macroscopic (or thermodynamic) variables.

There may be various terms contributing to the thermodynamic work. In general:

δW = Σ_i x_i dX_i , (1.3)

where (x_i, X_i) are conjugate variables: x_i is an intensive variable, e.g. µ (chemical potential), p (pressure), H (magnetic field), ..., and X_i the corresponding extensive variable, e.g. N, −V, −M (minus the magnetisation), ...

2. Second law of thermodynamics: the central quantity here is the entropy, S, which is postulated to be a monotonically increasing state function of the energy E. The law states that, in an isolated system, a process from a state A to a state B is such that SB ≥ SA. The equality (inequality) holds for reversible (irreversible) processes. Then, entropy always increases (up to a maximum) or remains unchanged in an adiabatic process (one not involving energy exchanges with the environment). Possible processes involve changes in microscopic variables; this implies that entropy is a maximum with respect to changes in these variables at fixed thermodynamic variables (N, V and E).

Most of this course will be hovering over the entropy concept. The reason is that the second law is a genuine thermodynamic law in the sense that it introduces a new, non-mechanical quantity, the entropy S = S(N, V, E), which is a thermodynamic potential. This means that all other macroscopic quantities can be derived from S:

1/T = (∂S/∂E)_{N,V} ,    p/T = (∂S/∂V)_{N,E} ,    µ/T = −(∂S/∂N)_{V,E} , (1.4)

where T, p and µ are the temperature, the pressure and the chemical potential, respectively. These are intensive variables, which do not depend on the system size. The above scheme is called the entropy representation.

Since S increases monotonically with E, it can be inverted to give E = E(N, V, S). The intensive parameters in this energy representation are:

T = (∂E/∂S)_{N,V} ,    p = −(∂E/∂V)_{N,S} ,    µ = (∂E/∂N)_{V,S} . (1.5)

In a process where both N and V remain fixed (i.e. no mechanical work is involved), dE = δQ = T dS [the latter equality following from the first of Eqns. (1.5)] and we have dS = δQ/T. This is valid for reversible processes. For irreversible processes dS ≥ δQ/T.


3. Third law of thermodynamics: a system at zero absolute temperature has zero entropy.

1.3 Processes and classical thermodynamics

Processes are very important in the applications of thermodynamics. There are three types of processes:

• quasistatic process: one that occurs very slowly in time.

• reversible process: a quasistatic process that can be reversed without changing the system or the environment. The system may always be considered to be in an equilibrium state. A quasistatic process is not necessarily reversible.

• irreversible process: a fast process such that the system does not pass through equilibrium states. If the process is cyclic the system returns to the same state but the environment is changed.

Historically the discipline of thermodynamics grew out of the necessity to understand the operation and efficiency of thermal engines, i.e. systems that perform cycles while interacting with the environment. The central equation used in this context is δQ = T dS, which assumes that processes occurring in the system are reversible. Thermal engines perform cycles, where the system departs from a given state and returns to the same state after the end of the cycle. The cycle is then repeated. During the cycle, the system absorbs heat Q and performs work −W (remember that W is the work done by the environment on the system). Obviously Q = −W, since the energy does not change in a cycle, ∆E = 0 (because the system begins and ends in the same state, and E is a state function). It turns out that, after the cycle is completed, the system has necessarily released some heat, and this result follows from the second law.

Consider the cycle depicted in Fig. 1.1, called the Carnot cycle. The cycle has been represented in the T−S plane, which is quite convenient to discuss cycles. In the Carnot cycle the engine operates between two temperature reservoirs T1 and T2, with T1 > T2. From a state A, the system follows an isothermal process at temperature T1 to a state B, after which the change of entropy is ∆S_AB = Q1/T1, with Q1 the heat absorbed by the system from the reservoir. From B an adiabatic process is followed down to C, with ∆S_BC = 0 and no heat exchange. Then the system follows a reversed path at temperature T2 from C to D, along which it releases a heat Q2 (i.e. it absorbs −Q2) and changes its entropy by ∆S_CD = −Q2/T2. Finally, the system revisits state A by following another adiabatic path with ∆S_DA = 0. The total entropy change has to be zero, since S is a state function: ∆S = ∆S_AB + ∆S_BC + ∆S_CD + ∆S_DA = 0, which means

Q1/T1 = Q2/T2   →   Q1/Q2 = T1/T2 > 1, (1.6)

the last inequality following from the condition T1 > T2. This means that the system absorbs and releases heat, with a positive net amount of heat absorbed, Q = Q1 − Q2 > 0.


This heat is used to perform work, W = −Q. The amount of heat or work is given by the area enclosed by the cycle,

Q = −W = ∮ T dS = (T1 − T2)(S2 − S1). (1.7)

On the other hand, the heat absorbed in the first process is Q1 = T1(S2 − S1). The efficiency of the heat engine is defined as

η = (work performed by system)/(heat absorbed by system) = −W/Q1 = (T1 − T2)(S2 − S1) / [T1(S2 − S1)] = 1 − T2/T1 . (1.8)

Obviously, since 0 < T2 < T1, the efficiency is never 100% (i.e. there is always some heat released by the engine, Q2 > 0). The Carnot cycle can be used to measure temperature ratios by calculating the efficiency of a given cycle.

It can be shown that the Carnot cycle is the most efficient engine possible, which is an alternative statement of the second law. Other alternative statements exist, notably those due to Kelvin and Clausius. This point of view is useful in connection with engineering applications of thermodynamics, and therefore we will not discuss it in the course.
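As a quick numerical illustration of Eqns. (1.6)-(1.8) (not part of the original discussion), the following Python sketch does the bookkeeping of the cycle for made-up reservoir temperatures and entropies:

```python
# Bookkeeping of the Carnot cycle, Eqns. (1.6)-(1.8). Reservoir temperatures and
# the entropies of the isothermal end points are made-up illustrative values.
T1, T2 = 500.0, 300.0        # hot and cold reservoir temperatures (K), T1 > T2
S1, S2 = 1.0, 3.0            # entropies S_A = S_D = S1 and S_B = S_C = S2 (J/K)

Q1 = T1 * (S2 - S1)          # heat absorbed along the isotherm at T1
Q2 = T2 * (S2 - S1)          # heat released along the isotherm at T2
W_by_system = Q1 - Q2        # work performed by the system, the area of the cycle (Eqn. 1.7)
eta = W_by_system / Q1       # efficiency, Eqn. (1.8)

print(W_by_system, (T1 - T2) * (S2 - S1))   # both 400.0 J
print(eta, 1.0 - T2 / T1)                   # both 0.4
```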


Figure 1.1: Carnot cycle in a heat engine between temperatures T1 and T2. The net amount of heat absorbed by the system, which is equal to the work performed by the system, is given by the area enclosed by the four thermodynamic paths.

1.4 Thermodynamic potentials

The parameters (N, V, E) characterising the system are often inconvenient in various respects (experimental or otherwise). Generalisations of thermodynamics are necessary to contemplate different sets of variables. The maximum-entropy principle can be easily generalised by defining different thermodynamic potentials.


• (N, V, S). This set of variables is not widely used, but it is nevertheless useful sometimes, especially in theoretical arguments. The thermodynamic potential here is the energy, E = E(N, V, S), which can always be obtained from S = S(N, V, E) by inversion. The energy satisfies a minimum principle, in contrast with the entropy (which satisfies a maximum principle).

• (N, V, T). These are more realistic variables. The thermodynamic potential is the Helmholtz free energy, F = F(N, V, T), such that F = E − TS. This function satisfies a minimum principle, which is basic to most of the developments that will be made in this course.

• (N, p, T). The thermodynamic potential is the Gibbs free energy, G = G(N, p, T), with G = F + pV, also obeying a minimum principle.

• (µ, V, T). The characteristic thermodynamic potential is now the grand potential, Ω = Ω(µ, V, T), with Ω = F − µN. The grand potential is again a minimum in the equilibrium state of the system.

The fact that the different thermodynamic potentials satisfy an extremum principle can be summarised in the following differential expressions:

dS = 0, d²S < 0;  dF = 0, d²F > 0;  dG = 0, d²G > 0;  dΩ = 0, d²Ω > 0. (1.9)

Since the variations implied by the differentials are due to changes in microscopic variables, many times in statistical mechanics we will be taking derivatives of the relevant thermodynamic potentials with respect to microscopic variables, equating these derivatives to zero in order to obtain the equilibrium state of the system as an extremum condition, and imposing a sign on the second variations in order to analyse the stability of the solutions.

1.5 Thermodynamic processes and first derivatives

Finally, let us consider expressions relating the variables during thermodynamic processes; these come in terms of relations between differential quantities. For the energy:

dE = (∂E/∂N)_{V,S} dN + (∂E/∂V)_{N,S} dV + (∂E/∂S)_{N,V} dS = µ dN − p dV + T dS, (1.10)

which expresses the first law of thermodynamics when there is a change in the number of particles. For the Helmholtz free energy:

dF = dE − TdS − SdT = µdN − pdV − SdT, (1.11)

from which

µ = (∂F/∂N)_{V,T} ,    p = −(∂F/∂V)_{N,T} ,    S = −(∂F/∂T)_{N,V} . (1.12)

For the Gibbs free energy:

dG = dF + pdV + V dp = µdN − SdT + V dp, (1.13)


so that one obtains

µ = (∂G/∂N)_{p,T} ,    V = (∂G/∂p)_{N,T} ,    S = −(∂G/∂T)_{N,p} . (1.14)

Finally, for the grand potential:

dΩ = dF − µdN − Ndµ = −pdV − SdT − Ndµ, (1.15)

and

N = −(∂Ω/∂µ)_{V,T} ,    p = −(∂Ω/∂V)_{µ,T} ,    S = −(∂Ω/∂T)_{µ,V} . (1.16)
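As a minimal symbolic sketch (assuming the SymPy library is available), the derivative relations of this section can be checked explicitly for the ideal gas, whose Helmholtz free energy will be derived later in the chapter [Eqns. (1.53)-(1.55)]:

```python
import sympy as sp

# Check of Eqns. (1.12) using the ideal-gas free energy F = NkT[log(N*Lambda^3/V) - 1]
# derived later in the chapter; Lambda is the thermal wavelength of Eqn. (1.52).
N, V, T, k, m, h = sp.symbols('N V T k m h', positive=True)
Lam = h / sp.sqrt(2 * sp.pi * m * k * T)
F = N * k * T * (sp.log(N * Lam**3 / V) - 1)

p  = -sp.diff(F, V)    # pressure
mu =  sp.diff(F, N)    # chemical potential
S  = -sp.diff(F, T)    # entropy

print(sp.simplify(p))                                    # N*k*T/V, the ideal-gas law
print(sp.simplify(mu - k * T * sp.log(N * Lam**3 / V)))  # 0
print(sp.simplify(S - (sp.Rational(5, 2) * N * k - N * k * sp.log(N * Lam**3 / V))))  # 0, cf. Eqn. (1.59)
```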

1.6 Second derivatives

Second derivatives of the thermodynamic potentials incorporate useful properties of the system. They are also called response functions, since they are a measure of how the system reacts to external perturbations. The most important are:

• Thermal expansion coefficient:

α ≡ (1/V) (∂V/∂T)_{N,p} = (1/V) (∂²G/∂T∂p)_N . (1.17)

This coefficient measures the volume change of a material when the temperature is changed. Usually α > 0, since ∆T > 0 normally implies ∆V > 0.

• Isothermal compressibility:

κ_T ≡ −(1/V) (∂V/∂p)_T = −(1/V) (∂²G/∂p²)_{N,T} . (1.18)

Compressibility measures the change of volume of a material when a pressure is applied to it. A very compressible material has a large value of κ_T: on applying a slight incremental pressure the volume changes by a large amount. The opposite occurs when the material is nearly incompressible. The minus sign in the definition is needed to ensure that κ_T > 0, since ∆p > 0 implies ∆V < 0.

Another compressibility, the adiabatic compressibility, is also defined:

κ_S ≡ −(1/V) (∂V/∂p)_S . (1.19)

• Specific heat (or heat capacity) at constant volume:

C_V ≡ T (∂S/∂T)_{N,V} = (∂E/∂T)_{N,V} = (∂Q/∂T)_{N,V} = −T (∂²F/∂T²)_{N,V} . (1.20)

The heat capacity per particle, c_V = C_V/N, is also commonly used. The heat capacity is a measure of how the energy changes when the temperature of a material is changed. If it is high, a large variation in energy corresponds to a small change in temperature (this is the case in water, which explains the inertia of water to change temperature and its quality as a good thermoregulator). Obviously C_V > 0 since, when ∆T > 0, the energy must necessarily increase, ∆E > 0.


• Specific heat at constant pressure:

C_p ≡ T (∂S/∂T)_{N,p} = (∂(E + pV)/∂T)_{N,p} = (∂Q/∂T)_{N,p} = −T (∂²G/∂T²)_{N,p} . (1.21)

Likewise the heat capacity per particle, c_p = C_p/N, is commonly used. This heat capacity is more easily accessible experimentally. There is a relation between C_V and C_p involving some of the previously defined coefficients:

C_p = C_V + TVα²/κ_T . (1.22)
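A small numerical sanity check of Eqn. (1.22) can be done for the classical monoatomic ideal gas, for which α = 1/T, κ_T = 1/p and C_V = (3/2)Nk; these values are assumed here rather than derived [C_V follows from Eqn. (1.58) later in the chapter]:

```python
# Consistency check of Eqn. (1.22) for the classical monoatomic ideal gas.
k = 1.380649e-23        # J/K
N = 6.022e23            # one mole of particles
T = 300.0               # K
p = 1.0e5               # Pa
V = N * k * T / p       # ideal-gas volume

alpha   = 1.0 / T       # thermal expansion coefficient of the ideal gas
kappa_T = 1.0 / p       # isothermal compressibility of the ideal gas
C_V     = 1.5 * N * k

C_p = C_V + T * V * alpha**2 / kappa_T     # Eqn. (1.22)
print(C_p / (N * k))                       # 2.5, i.e. C_p = (5/2) N k for the ideal gas
```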

A useful additional result is the Euler theorem. To obtain the theorem, we first write the extensivity property of the energy as

E(λN, λV, λS) = λE(N, V, S), (1.23)

and differentiate with respect to λ:

(∂E/∂(λN)) N + (∂E/∂(λV)) V + (∂E/∂(λS)) S = E(N, V, S). (1.24)

Taking λ = 1 (this is legitimate since λ is arbitrary) and using the definitions of the intensive parameters:

E = µN − pV + TS (Euler theorem). (1.25)

From this we can obtain the Gibbs-Duhem equation by differentiating:

dE = µdN + Ndµ − pdV − V dp + TdS + SdT,

Ndµ − V dp + SdT = 0 (Gibbs-Duhem equation). (1.26)

The Gibbs-Duhem equation reflects the fact that the intensive parameters are not all independent.

When the Euler theorem is applied to the other thermodynamic potentials, the following relations are obtained:

Ω = −pV, G = µN, F = µN − pV. (1.27)

For example, let us obtain the second relation. We have:

G(λN, p, T ) = λG(N, p, T ). (1.28)

Differentiating with respect to λ:

(∂G/∂(λN)) N = G(N, p, T). (1.29)

Taking λ = 1 and using µ = (∂G/∂N)_{p,T}, the relation G = µN follows.


1.7 Equilibrium and stability conditions

Intensive parameters are very important for various reasons. One is that they are easily controlled in experiments, so that one can establish certain conditions for the environment of the system. Another reason is that they define specific criteria for the equilibrium conditions of a system.

Let us consider two systems in contact, 1 and 2, so that energy is allowed to flow from one to the other, but the system as a whole is otherwise isolated. Let E1 and E2 be their energies, V1 and V2 their volumes, and N1 and N2 their numbers of particles. Volumes and numbers of particles are constant, but the energies can change subject to the condition E1 + E2 = const. Since dS = 0 for arbitrary internal changes, with S = S1 + S2, and in any process dE1 = −dE2, we have:

0 = dS = (∂S1/∂E1)_{N1,V1} dE1 + (∂S2/∂E2)_{N2,V2} dE2 = dE1 [ (∂S1/∂E1)_{N1,V1} − (∂S2/∂E2)_{N2,V2} ] = dE1 (1/T1 − 1/T2). (1.30)

Therefore, at equilibrium, T1 = T2. Now if we also allow for variations in the volumes V1 and V2, with the restriction V1 + V2 = const., then in any process

0 = dS = (∂S1/∂E1)_{N1,V1} dE1 + (∂S2/∂E2)_{N2,V2} dE2 + (∂S1/∂V1)_{N1,E1} dV1 + (∂S2/∂V2)_{N2,E2} dV2

= dE1 [ (∂S1/∂E1)_{N1,V1} − (∂S2/∂E2)_{N2,V2} ] + dV1 [ (∂S1/∂V1)_{N1,E1} − (∂S2/∂V2)_{N2,E2} ] , (1.31)

so that

0 = dE1 (1/T1 − 1/T2) + dV1 (p1/T1 − p2/T2), (1.32)

from which T1 = T2 and p1 = p2. Finally, if transfer of particles is allowed between the two systems, but with the restriction N1 + N2 = const. and the volumes of the systems fixed,

0 = dS = (∂S1/∂E1)_{N1,V1} dE1 + (∂S2/∂E2)_{N2,V2} dE2 + (∂S1/∂N1)_{V1,E1} dN1 + (∂S2/∂N2)_{V2,E2} dN2

= dE1 (1/T1 − 1/T2) − dN1 (µ1/T1 − µ2/T2), (1.33)

from which T1 = T2 and µ1 = µ2. Therefore, for the two systems to be in equilibrium, thermal, mechanical and chemical equilibrium are required at the same time. These conditions will be very important when we discuss first-order phase transitions in subsequent chapters. Using the same arguments, one can also say that in a single system at equilibrium the local temperature, the local pressure and the local chemical potential must be equal everywhere, i.e. all spatial gradients of the intensive parameters must be zero (of course this requires a suitable definition of local quantities). Any nonzero gradient gives rise to a current or flow (transport of energy and mass) that restores equilibrium. The relation between current and gradient is usually given, in linear response theory, by a proportionality law of the type of Fourier's equation or similar laws.

The above equilibrium conditions are obtained from the equation dS = 0, which is an extremum condition, and involve the equality of the intensive parameters. The condition on the second differential, d²S < 0, can also be exploited, giving rise to the stability conditions for the system. These involve the response functions. As an example, we consider the Gibbs free energy G(N, p, T) which, by virtue of the Euler equation, can be written G = F + pV = E − TS + pV, i.e.

G(N, p, T) = E(N, V, S) − TS + pV. (1.34)

Suppose we fix (N, p, T) and consider an internal process in the system. This will change the values of S and V and, correspondingly, of E. The variation in G to all orders will be:

∆G = ∆E − T∆S + p∆V = (∂E/∂V) dV + (∂E/∂S) dS − T dS + p dV

+ (1/2) [ (∂²E/∂V²) dV² + 2 (∂²E/∂V∂S) dV dS + (∂²E/∂S²) dS² ] + · · · (1.35)

Since

(∂E/∂V) = −p ,    (∂E/∂S) = T , (1.36)

we get

∆G = (1/2) [ (∂²E/∂V²) dV² + 2 (∂²E/∂V∂S) dV dS + (∂²E/∂S²) dS² ] + · · · (1.37)

The condition d²G > 0 implies

(∂²E/∂V²) dV² + 2 (∂²E/∂V∂S) dV dS + (∂²E/∂S²) dS² > 0, (1.38)

which requires

(∂²E/∂V²) > 0 ,    (∂²E/∂S²) > 0 ,    (∂²E/∂V²)(∂²E/∂S²) − (∂²E/∂V∂S)² > 0. (1.39)

Using the definition of the response functions, the following conditions of stability are obtained:

C_V > 0 ,    κ_S > 0 ,    T/(V κ_S C_V) > (∂T/∂V)²_S . (1.40)

Following the same procedure but using other thermodynamic potentials one can obtain other stability conditions, e.g. κ_T > 0, C_p > 0, ...


1.8 Statistical ensembles

While thermodynamics derives relationships between macroscopic variables, statistical mechanics aims at providing numerical values for such quantities. It also explains the relations between thermodynamic quantities derived by thermodynamics. It is based on knowledge of the interactions between the microscopic entities (either real or conveniently defined) making up the system.

In principle statistical mechanics takes account of the time evolution of all degrees of freedom in the system. Thermodynamic mechanical quantities (thermodynamic energy, pressure, etc.) are identified with appropriate time averages of corresponding quantities defined for each microscopic state. Since it is not possible to follow the time evolution of the system in detail, a useful mathematical device, the ensemble, is used by equilibrium statistical mechanics. The basic postulate is that time averages coincide with ensemble averages (the ergodic hypothesis). Ensemble theory tremendously facilitates calculations, but a statistical approach is implicit via probability distributions. The loss of information inherent to a probabilistic approach is not relevant in practical terms.

An ensemble is an imaginary collection of static replicas of the system, each in one of the possible microscopic states; 'possible' means compatible with the imposed macroscopic conditions. Corresponding to different possibilities to define these conditions, different ensembles can be defined. The one associated with the primitive variables (N, V, E) is the so-called microcanonical ensemble: here we mentally collect all the possible microscopic states (Ω in number; not to be confused with the grand potential) that are compatible with the imposed conditions (N, V, E). A basic postulate of statistical mechanics is that, as the system dynamically explores all these possible states, the time spent in any one of them is the same. This means that the number (or fraction) of replicas of the system to be found in the ensemble in any one state is the same, irrespective of the state. If we consider the states to be numerable (this is actually the case quantum-mechanically), with an integer ν = 1, 2, ..., Ω, all states are equally probable and assigned the same probability pν = Ω⁻¹. The connection with thermodynamics is realised via the famous expression

S = k log Ω, (1.41)

where k is Boltzmann's constant, k = 1.3806503 × 10⁻²³ J K⁻¹. All macroscopic variables can be obtained from S using Eqns. (1.4). The microcanonical ensemble is difficult to use, since counting the number of microstates (i.e. computing Ω) is often impractical. Also, in classical mechanics, where states form a continuum and therefore Ω would in principle be infinite, one has to resort to quantum-mechanical arguments (to demand compatibility with the classical limit of quantum mechanics) and associate one state to a volume hⁿ in phase space (with h Planck's constant and n the number of degrees of freedom). One can still use Eqn. (1.41), but its interpretation is a bit awkward.

Eqn. (1.41) connects the microscopic and the macroscopic worlds. It also gives an intuitive picture of what entropy means: it is a quantity associated with order. In effect, an ordered system has very few possible configurations Ω and therefore low entropy (consider a solid, where atoms are located in the neighbourhood of the sites of a lattice and cannot wander very far), while a disordered system has a large number of possible states, which means high entropy (for instance, a fluid, where atoms can explore the available volume more or less freely). The history of the development of the entropy concept and its relation to mechanics is a fascinating one, involving the irreversibility paradox solved by Boltzmann (and the heated debate associated with it that allegedly led to Boltzmann's suicide). The paradox states that the microscopic (either classical or quantum) equations of motion are time-reversible, but macroscopically there is an arrow of time which processes have to respect (for example, a gas spontaneously expands into the available volume, but never compresses locally and leaves a void region). We will not go into many technicalities here, but simply mention an intuitive solution for the problem of why an ideal gas expands but the reversed process never occurs, Fig. 1.2. The second law of thermodynamics states that entropy always increases when an isolated system undergoes an irreversible process. Eqn. (1.41) then implies that, at the microscopic level, the number of available microstates should increase. If we take a gas confined to one half of a volume V, i.e. to V/2, and at some instant of time let the gas expand into the whole volume V, we know how to calculate the entropy increase: since the entropy of an N-particle ideal gas in a volume V is S = S0 − Nk log (NΛ³/V) [see Eqn. (1.59)], the change of entropy when the gas expands from a volume V/2 to a volume V is

∆S = S_final − S_initial = −Nk log (NΛ³/V) + Nk log (2NΛ³/V) = Nk log 2 > 0. (1.42)

Incidentally, this result can also be obtained by direct application of Eqn. (1.41). Since the particles are independent, the number of ways in which the N particles can be distributed in one half of the volume, relative to the total number, is given by (1/2)^N, since there are only two choices for each particle (it is either in the left or the right half). Then:

∆S = k (log Ω_final − log Ω_initial) = −k log (Ω_initial/Ω_final) = −k log (1/2)^N = Nk log 2. (1.43)

According to experience, the reversed process, where the gas compresses spontaneously into the left half, never occurs, and indeed thermodynamics prohibits it. Microscopically it is easy to see that Ω has also increased: a particle has a larger phase-space volume available when the gas has expanded than before. Therefore Eqn. (1.41) is reasonable. But what about the time evolution? When the gas has expanded, since every gas particle follows a reversible trajectory, there should be phase-space trajectories leading to states where all particles are again only in half of the volume, but these trajectories are very few with respect to the total number. In other words, these trajectories are very improbable or, put differently, we should wait a disproportionately long time for that state to be reached. The microscopic explanation of the second law is statistical: entropy generally increases, but there may be an insignificantly small number of particular processes which lead to an entropy decrease.

The microcanonical ensemble is very important from a conceptual point of view, but impractical for doing calculations. A more convenient ensemble is the canonical ensemble, associated with the variables (N, V, T). The system is in contact with a thermal reservoir at temperature T. Here the probability pν of the ν-th state depends on the energy of the state, Eν:

pν = e^{−βEν}/Q , (1.44)



Figure 1.2: Irreversible expansion of a gas from a volume V/2 to a volume V .

where β = 1/kT and Q is the canonical partition function, which appears as a normalisation of the probability function pν. For a classical conservative system of N identical particles characterised by a Hamiltonian H(q, p), with (q, p) a set of canonically conjugate variables, q = q1, q2, ..., qn, p = p1, p2, ..., pn,

Q = (1/(N! hⁿ)) ∫∫ dq dp e^{−βH(q,p)}. (1.45)

For a quantum system

Q = Σ_ν e^{−βEν}. (1.46)

Connection with thermodynamics is made through the expression

F = −kT log Q, (1.47)

and all macroscopic variables can be obtained from F using Eqns. (1.12). One obvious difficulty with this ensemble is that it is not always possible to evaluate the partition function, which involves a complicated sum over states. Very often the activity in statistical mechanics involves ways to avoid having to sum over states (by doing it in a different but approximate way), even though this is actually what the theory asks for. For the mechanical variables (variables that can be linked to each microscopic state) a different method to have access to macroscopic properties is via ensemble averaging using the probability pν. This implies that thermodynamic quantities are directly identified with these averages. For example, in the canonical ensemble, where the energy fluctuates, the mean energy 〈E〉 is

〈E〉 = Σ_ν Eν pν ,    pν = e^{−βEν}/Q . (1.48)

In classical mechanics we would have

〈E〉 = ∫∫ dq dp H(q, p) ρ(q, p) ,    ρ(q, p) = e^{−βH(q,p)}/Q . (1.49)

Other typical mechanical variables for which this method can be applied are the pressure and the magnetisation (for a magnetic system). Ensemble averaging finds important applications in computer-simulation methods.
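As a minimal sketch of Eqns. (1.44)-(1.48) (with made-up energy levels, not a system discussed in the text), ensemble averaging amounts to a weighted sum over microstates:

```python
import numpy as np

# Ensemble averaging for a toy system with a few discrete microstates;
# the energy levels (in units of kT) are made up for illustration.
E_nu = np.array([0.0, 1.0, 1.0, 2.5])   # energies of the microstates
beta = 1.0                              # 1/kT in the same units

weights = np.exp(-beta * E_nu)
Q = weights.sum()                       # canonical partition function, Eqn. (1.46)
p_nu = weights / Q                      # Boltzmann probabilities, Eqn. (1.44)

E_avg = np.sum(E_nu * p_nu)             # mean energy <E>, Eqn. (1.48)
F = -np.log(Q) / beta                   # free energy, Eqn. (1.47)
print(Q, E_avg, F)
```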

A simple example is a gas of N identical classical noninteracting monoatomic molecules, with Hamiltonian

H(q, p) = Σ_{i=1}^{N} p_i²/2m. (1.50)

Only kinetic terms are included. The effect of interactions, however, has to be taken into account implicitly, as it is the only possible mechanism that allows energy to flow among the degrees of freedom. The partition function of the gas can be factorised into a product of reduced molecular partition functions:

Q = (1/(N! h^{3N})) ∫dp ∫dq e^{−βH(q,p)} = [V^N/(N! h^{3N})] ( ∫_{−∞}^{∞} dp e^{−βp²/2m} )^{3N} = q^N/N! , (1.51)

with q the ‘reduced’ molecular partition function:

q = (V/h³) ( ∫_{−∞}^{∞} dp e^{−βp²/2m} )³ = V/Λ³ ,    Λ = h/√(2πmkT) . (1.52)

The factor N! in (1.51) accounts for particle indistinguishability. Λ is the so-called thermal wavelength, which is a measure of the extension of the wavepacket associated with the molecule (thus q is the total volume available to the molecule measured in units of the quantum volume Λ³). The multiplicative property of the partition function goes over into the additive property of the free energy, and all other thermodynamic functions can be derived from it. We have:

F = −kT log Q = −NkT log q + kT log N! = −NkT log (V/Λ³) + kT log N! . (1.53)

Since N is normally very large, the logarithm of N! is usually approximated using Stirling's approximation:

log N! = N log N − N + ... ≃ N log N − N ,    N ≫ 1, (1.54)

which finally gives

F/(V kT) = ρ log (ρΛ³) − ρ , (1.55)

where ρ = N/V is the mean density. From Eqn. (1.55) the pressure is

p = −(∂F/∂V)_{N,T} = NkT/V , (1.56)

which is the ideal-gas equation of state. The energy can be obtained from F = E − TS:

E = F + TS = F − T (∂F/∂T)_{N,V} = −T² (∂(F/T)/∂T)_{N,V} = kT² (∂ log Q/∂T)_{N,V} , (1.57)

which gives

E = (3/2) NkT. (1.58)


The entropy of the ideal gas follows from F = E − TS:

S = (E − F)/T = (5/2) Nk − Nk log (ρΛ³) = S0 − Nk log (ρΛ³) . (1.59)
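To put numbers on Eqns. (1.52)-(1.59), the following sketch evaluates them for an argon-like monoatomic gas at room conditions; the particular mass, temperature and volume are illustrative choices, not values taken from the text:

```python
import numpy as np

# Ideal-gas results (1.52)-(1.59) for an argon-like monoatomic gas.
k = 1.380649e-23           # J/K
h = 6.62607015e-34         # J s
m = 39.95 * 1.66054e-27    # kg (argon-like atomic mass)
T = 300.0                  # K
N = 6.022e23
V = 0.0249                 # m^3 (roughly one mole at 1 bar)

Lam = h / np.sqrt(2 * np.pi * m * k * T)          # thermal wavelength, Eqn. (1.52)
rho = N / V
p = N * k * T / V                                 # Eqn. (1.56)
E = 1.5 * N * k * T                               # Eqn. (1.58)
S = 2.5 * N * k - N * k * np.log(rho * Lam**3)    # Eqn. (1.59)

print(Lam, p, E)
print(S / (N * k))    # ~18.6, close to the measured molar entropy of argon divided by R
```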

Note that the fact that the Hamiltonian can be separated into independent terms, in this case one for each particle, brings about a factorisation of Q into N identical reduced partition functions that can be computed (in this case easily); the free energy is then obtained as a sum of N identical reduced free energies. This is a clear example of an ideal system. Frequently the Hamiltonian of a given system is approximated by uncoupling particular variables or sets of variables in order to use this mathematical trick, the so-called ideal approximation.

For completeness, we mention the statistical-mechanical expressions for the grand canonical ensemble, with variables (µ, V, T). Here the system is in contact with a reservoir at temperature T and chemical potential µ, with which the system exchanges energy and particles. We have:

Ξ = Σ_{N=0}^{∞} (1/(N! h^{n_N})) ∫∫ dq dp e^{−β[H_N(q,p) − µN]} , (1.60)

where Ξ is the grand partition function and n_N is the number of degrees of freedom for N particles. For a quantum system:

Ξ = Σ_{N=0}^{∞} Σ_ν e^{−β[Eν(N) − µN]} (1.61)

(here Eν(N) is the energy spectrum of a system of N particles). Connection with thermodynamics is made through the expression

Ω = −kT log Ξ, (1.62)

and macroscopic variables can be obtained from Ω using Eqns. (1.16). This ensemble is useful when discussing phase transitions. Sometimes it is simpler to use than the canonical ensemble, since the summation over N may lead to simplifications in the state-counting operation (an obvious example is the ideal Bose gas).
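As a minimal numerical sketch of these expressions for the classical ideal gas [combining Eqns. (1.51), (1.60) and (1.62); the fugacity and one-particle partition function values are made up], one can check that Ω = −〈N〉kT = −pV:

```python
import numpy as np
from math import factorial

# Grand canonical ensemble for the classical ideal gas: Xi = sum_N (z*q)^N / N!,
# with z = exp(beta*mu) the fugacity and q = V/Lambda^3 from Eqn. (1.52).
z, q, kT = 0.7, 50.0, 1.0

terms = [(z * q)**n / factorial(n) for n in range(150)]   # truncated sum, converges fast
Xi = sum(terms)                                           # equals exp(z*q)
Omega = -kT * np.log(Xi)                                  # Eqn. (1.62)

p_N = np.array(terms) / Xi                                # probability of having N particles
N_avg = np.sum(np.arange(150) * p_N)

print(N_avg, z * q)          # both ~35: <N> = z*q for the ideal gas
print(Omega, -N_avg * kT)    # equal: Omega = -<N> kT = -pV, i.e. the ideal-gas law
```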

1.9 Fluctuations

In statistical mechanics averages (first moments) are identified with thermodynamic mechanical quantities. But in different ensembles the fluctuating mechanical quantities are different. Equivalence of ensembles when N is large holds at the level of averaged mechanical quantities (meaning, for example, that the pressure 〈p〉 obtained as an average for given volume V in the canonical ensemble (N, V, T) gives rise, in the isobaric ensemble (N, 〈p〉, T), to a value of the average volume 〈V〉 that coincides with V). But it should also hold at the level of second-order moments; this means that fluctuations must remain small in large systems (and zero in the thermodynamic limit). What about higher-order averages (i.e. higher than first moments), then? It turns out that these are also linked to thermodynamic properties of the system. We focus on second moments, or 'fluctuations'. Their values depend on the type of ensemble.

In the canonical ensemble the system is coupled to a thermal reservoir at temperature T, with which it interchanges energy. The fluctuation in energy is

σ_E = √〈(E − 〈E〉)²〉 = √(〈E²〉 − 〈E〉²). (1.63)

A simple derivation using Eqn. (1.48) leads to

σ_E² = 〈E²〉 − 〈E〉² = kT² (∂〈E〉/∂T)_{N,V} = kT² C_V . (1.64)

Therefore the fluctuation in energy is related to the specific heat at constant volume. This is an example of the fluctuation-dissipation theorem in statistical physics, which relates fluctuations of a statistical variable to some thermodynamic quantity. This result immediately implies (since 〈E〉 ∼ NkT) that

σ_E/〈E〉 ∼ 1/√N , (1.65)

so that the relative fluctuation is very small in real materials. A similar result in the grand canonical ensemble is

σ_N² = 〈N²〉 − 〈N〉² = kT (∂〈N〉/∂µ)_{V,T} = 〈N〉 ρkTκ_T (1.66)

for the fluctuation in the number of particles. Also,

σ_N/〈N〉 ∼ 1/√N . (1.67)

The N^{−1/2} law is quite general in equilibrium statistical physics.
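The N^{−1/2} scaling can be illustrated numerically. The sketch below sums N independent, identically distributed single-particle energies (exponentially distributed values are used purely as a convenient stand-in) over many 'replicas' and compares the relative fluctuation with 1/√N:

```python
import numpy as np

# Illustration of the N^(-1/2) law, Eqn. (1.65): the total energy of N independent
# degrees of freedom fluctuates relatively less as N grows.
rng = np.random.default_rng(0)
for N in (10, 100, 10000):
    E_tot = rng.exponential(scale=1.0, size=(1000, N)).sum(axis=1)   # 1000 "replicas"
    print(N, E_tot.std() / E_tot.mean(), 1.0 / np.sqrt(N))           # the two numbers track each other
```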

1.10 Examples of some ideal systems (reminder)

As a reminder, we list a few classical examples where the ideal approximation can be successfully applied. In most cases this approximation provides a starting point from which more sophisticated approximations, incorporating the effect of interactions, can be implemented.

• Harmonic solid

In a crystalline solid the molecules are arranged into a periodic three-dimensional lattice, each molecule being attached to a particular lattice site, about which it moves in an oscillatory way. The Hamiltonian is

H(p, q) = Σ_{i=1}^{N} p_i²/2m + (1/2) Σ_{i=1}^{N} Σ_{j≠i} φ(|r_i − r_j|)

= Σ_{i=1}^{N} p_i²/2m + U0 + (1/2) Σ_{i=1}^{N} Σ_{j≠i} u_i · D_ij · u_j + · · · , (1.68)


where φ(r) is the pair potential between two molecules (we assume it, for the sake of simplicity, to be isotropic, i.e. to depend on the modulus of the relative vector). In the last equality the displacement vector of the i-th molecule has been defined, u_i = r_i − R_i, with R_i the location of its associated lattice site, and the interaction part has been approximated using a Taylor expansion in all the vectors u_i. D is a nondiagonal 3N × 3N matrix, which essentially gives the curvatures of the potential at the equilibrium (lattice) sites. U0 is the (constant) Madelung energy. Now, restricting the expansion to the quadratic term (this can be done at sufficiently low temperatures), and diagonalising the D matrix, the Hamiltonian adopts a diagonal form:

H(ξ, ξ̇) = (1/2) Σ_{k=1}^{3N} ( ξ̇_k² + ω_k² ξ_k² ) , (1.69)

which corresponds to 3N decoupled harmonic oscillators of frequencies ω_k. Again, as in the ideal gas, the degrees of freedom can be grouped into a set of 3N independent, decoupled coordinates ξ_k, called normal modes. In this case the original (natural) Cartesian coordinates were coupled but, within the context of the harmonic approximation, particular linear combinations of these coordinates (the normal modes) are decoupled. From this point of view, the harmonic solid is an 'ideal gas'. The statistical mechanics of the problem is now trivial; the partition function is

Q = Π_{k=1}^{3N} q_k , (1.70)

with q_k the reduced partition function of the k-th vibrational mode. At the low temperatures where the harmonic approximation is supposed to be valid, a quantum treatment is required. The quantum partition function (reduced partition function) of the k-th harmonic oscillator is:

q_k = Σ_{n=0}^{∞} e^{−βE_n^{(k)}} = Σ_{n=0}^{∞} e^{−β(n+1/2)ħω_k} = e^{−βħω_k/2} / (1 − e^{−βħω_k}) . (1.71)

Here E_n^{(k)} = (n + 1/2)ħω_k is the quantised energy of a single quantum harmonic oscillator. The complete partition function is the product of the reduced partition functions:

Q = Π_{k=1}^{3N} q_k = Π_{k=1}^{3N} e^{−βħω_k/2} / (1 − e^{−βħω_k}) , (1.72)

and the free energy is

F = −kT log Q = U0 + kT Σ_{k=1}^{3N} [ ħω_k/(2kT) + log (1 − e^{−βħω_k}) ] (1.73)

(the Madelung energy has been added). Since there are so many frequencies, distributed over a finite interval, it is a good approximation to introduce a frequency distribution g(ω), such that g(ω)dω gives the number of modes with frequencies in the interval [ω, ω + dω]. Obviously:

∫_0^∞ dω g(ω) = 3N. (1.74)

Going from the sum over modes k to an integral over ω involves the transformation

Σ_{k=1}^{3N} → ∫_0^∞ dω g(ω) . (1.75)

The free energy can then be written as

F = U0 + kT ∫_0^∞ dω g(ω) [ ħω/(2kT) + log (1 − e^{−βħω}) ] , (1.76)

and the energy is

E = U0 + ħ ∫_0^∞ dω g(ω) ω [ 1/2 + e^{−βħω}/(1 − e^{−βħω}) ] . (1.77)

A standard approximation is the Debye theory, which assumes

g(ω) = αω² for 0 < ω < ω_max ,    g(ω) = 0 for ω_max < ω < ∞ . (1.78)

From the normalisation of g(ω) one gets α = 9N/ω_max³. Then the energy can be written as

E = U0 + 9Nħω_max/8 + (9NkT/u³) ∫_0^u dx x³/(eˣ − 1) , (1.79)

where u ≡ βħω_max. From here the specific heat is

C_V = ∂E/∂T = 3Nk [ (12/u³) ∫_0^u dx x³/(eˣ − 1) − 3u/(e^u − 1) ] . (1.80)

It is easy to analyse the low- and high-temperature limits of C_V: for T → 0, C_V ∝ T³, while for T → ∞, C_V → 3Nk. Both limits are correct, despite the crudeness of the Debye theory (for metals, an additional contribution to C_V due to free electrons changes this scenario a bit; the above would be the contribution coming from the ionic degrees of freedom).
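The two limits can be checked numerically from Eqn. (1.80). In the sketch below (which assumes NumPy and SciPy are available), temperatures are measured in units of the Debye temperature ħω_max/k, so that u = 1/t:

```python
import numpy as np
from scipy.integrate import quad

# Numerical check of the limits of the Debye result, Eqn. (1.80).
def c_v_over_3Nk(t):
    u = 1.0 / t
    integrand = lambda x: x**3 / np.expm1(x) if x > 0 else 0.0
    integral, _ = quad(integrand, 0.0, u)
    return 12.0 * integral / u**3 - 3.0 * u / np.expm1(u)

for t in (0.02, 0.05, 2.0, 10.0):
    print(t, c_v_over_3Nk(t))
# Low t: the values scale as t^3 (Debye T^3 law); high t: they approach 1, i.e. C_V -> 3Nk.
```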

• Paramagnetism

A paramagnetic substance exhibits a finite magnetisation when subject to an external magnetic field, whereas it possesses no net magnetisation at zero field. Langevin paramagnetism is due to the interaction of the intrinsic magnetic moments of the molecules of a substance with the magnetic field, and it can be studied classically (this is in contrast with Pauli paramagnetism, which is due to free electrons in a metal and has to be analysed using quantum statistics). Thermodynamically, the magnetisation M and the magnetic field H are conjugate variables. The following derivatives can be defined:

M = −(∂G/∂H)_T ,    χ_T = (∂M/∂H)_{T,H→0} ,    C_H = −T (∂²G/∂T²)_H . (1.81)

[Here we are using the Gibbs free energy as the relevant thermodynamic potential, G = G(H, T), since the relevant variables are taken as H (which plays the role of a pressure) and the temperature; one can also use M and T as variables, in which case the relevant potential is the Helmholtz potential F = F(M, T).] These are completely equivalent to those defined for a simple substance, noting that M is to be associated with the volume V, and H with the pressure; thus, χ_T (the magnetic susceptibility at zero field) is equivalent to the compressibility, and C_H is the specific heat at constant magnetic field (similar to C_p). χ_T and C_H are response functions. In particular, χ_T will play an important role in future developments.

A very simple model of paramagnetism consists of assigning a magnetic moment (spin) to each atom, disregarding any interaction between the spins. We assume that the spins are independent and that there is an applied magnetic field H. Each spin has a magnitude S = 1/2, so that only two values of the spin along the direction of the magnetic field are possible, s = ±1. The spins have an associated magnetic moment µ_B (the Bohr magneton), so that µ = µ_B s. The Hamiltonian of the paramagnetic solid is then

H = −Σ_{i=1}^{N} µ_i H = −µ_B H Σ_{i=1}^{N} s_i . (1.82)

Let us calculate the magnetisation M. There are two ways: one is to calculate the average value by thermal averaging:

〈M〉 = [ Σ_{s1=±1} · · · Σ_{sN=±1} ( Σ_{i=1}^{N} µ_B s_i ) e^{−βH} ] / [ Σ_{s1=±1} · · · Σ_{sN=±1} e^{−βH} ]

= µ_B Σ_{i=1}^{N} [ Σ_{s_i=±1} s_i e^{βµ_B H s_i} ] / [ Σ_{s_i=±1} e^{βµ_B H s_i} ]   (the sums over all other spins cancel between numerator and denominator)

= N µ_B [ Σ_{s=±1} s e^{βµ_B H s} ] / [ Σ_{s=±1} e^{βµ_B H s} ] = N µ_B (e^{βµ_B H} − e^{−βµ_B H}) / (e^{βµ_B H} + e^{−βµ_B H}) = N µ_B tanh (βµ_B H). (1.83)

Note that 〈M〉 = Nµ_B 〈s〉, where 〈s〉 is the thermal average of a single spin. This is of course a consequence of the spins being independent.

The other route is via the free energy. The partition function is:

Q = q^N ,    q = Σ_{s=±1} e^{βµ_B H s} = e^{βµ_B H} + e^{−βµ_B H} (1.84)

(we consider the spins to be distinguishable, since they are arranged on the sites of a crystalline lattice). Then:

G = −NkT log q = −NkT log ( e^{βµ_B H} + e^{−βµ_B H} ) . (1.85)

Using the thermodynamic derivative:

M = −(∂G/∂H)_T = Nµ_B (e^{βµ_B H} − e^{−βµ_B H}) / (e^{βµ_B H} + e^{−βµ_B H}) = Nµ_B tanh (βµ_B H), (1.86)

which coincides with the previous expression. The magnetic susceptibility at zero field is:

χ_T = (∂M/∂H)_{T,H→0} = Nµ_B²/kT . (1.87)

The entropy is

S/Nk = −(1/Nk) (∂G/∂T)_H = log [2 cosh (βµ_B H)] − βµ_B H tanh (βµ_B H). (1.88)

It can be demonstrated that the limits of M and S for small H are, respectively:

M → Nµ_B²H/kT ,    S/Nk → log 2 − (1/2)(µ_B H/kT)² . (1.89)

The first equation is the famous Curie's law, M = cH/T, which is the equation of state of an ideal paramagnet. It is only valid for small M (M should be small compared with the saturation magnetisation M0, which is the maximum magnetisation that a particular magnet can exhibit, in this case Nµ_B). Note that M → χ_T H, as corresponds to χ_T being a response function at zero field. The first term in the entropy is the entropy of the system at infinite temperature (or zero magnetic field), where all spins are independent and S = k log 2^N = Nk log 2.
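As a numerical sketch of Eqns. (1.86) and (1.89) with illustrative values (not taken from the text), one can compare the full magnetisation curve with Curie's law:

```python
import numpy as np

# Ideal paramagnet: full magnetisation curve versus Curie's law at small field.
k = 1.380649e-23        # J/K
mu_B = 9.274e-24        # J/T, Bohr magneton
N = 1.0e22
T = 2.0                 # K

for H in (0.001, 0.01, 0.1, 1.0, 10.0):     # field in tesla
    x = mu_B * H / (k * T)
    M_exact = N * mu_B * np.tanh(x)         # Eqn. (1.86)
    M_curie = N * mu_B**2 * H / (k * T)     # Curie's law, Eqn. (1.89)
    print(H, M_exact / (N * mu_B), M_curie / (N * mu_B))
# The two agree at small H; at large H the exact result saturates at N*mu_B.
```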

Other typical examples where the ideal approximation can be applied are: diatomic molecules (where rotational, translational and vibrational degrees of freedom are approximately decoupled and treated as independent), chemical reactions, mixtures of ideal gases, and ideal quantum systems (giving rise to quantum statistics and their typical applications: Bose-Einstein condensation, black-body radiation, non-interacting fermion systems such as electrons in metals, etc.).


Figure 1.3: Equations of state of (a) a fluid and (b) a magnet, for different temperatures. The dashed line is the ideal approximation.

1.11 Phenomena that cannot be explained by the ideal approximation

The ideal approximation enjoys a number of successes. We have reviewed just three: the derivation of the ideal-gas law relating pressure, temperature and density for a dilute gas, Eqn. (1.56); the correct low- and high-temperature limits of the heat capacity of solids (ionic contribution in metals); and the explanation of Curie's law. The most severe limitation of the ideal approximation is its failure to explain a very important class of phenomena, namely the occurrence of phase transitions in Nature (the only exception is Bose-Einstein condensation, which occurs in a quantum ideal system of bosons and is therefore a quantum phase transition). Phase transitions are collective phenomena due to interactions; any approximation that neglects interactions will not be able to reproduce any phase transition. Since phase transitions are ubiquitous, the ideal approximation must be corrected.

But many other features of materials require consideration of interactions. For example, most properties of liquids are not even qualitatively reproduced by adopting an ideal approximation. The experimental behaviour of the equation of state of a fluid can be used to illustrate these features. Experimentally, the equation of state can be written as

p/kT = ρ + B2(T)ρ² + B3(T)ρ³ + · · · , (1.90)

where the B_n(T) are the virial coefficients, smooth functions of the temperature. The first term on the right-hand side is the ideal-gas approximation (a brief numerical sketch of this expansion is given after the list below). Fig. 1.3(a) depicts qualitatively the equation of state p = p(ρ, T) of a typical fluid at two temperatures. Several gross discrepancies with the ideal-gas approximation are worth mentioning:

• As ρ increases from zero the pressure departs from linear behaviour. This effect is due to interactions in the fluid.

• At low temperature there is a region where the pressure is constant; this reflects the presence of a phase transition: the low-density gas phase passes discontinuously to the high-density liquid phase.

• At high temperatures, T > Tc, the phase transition disappears. At Tc (the critical temperature), the system exhibits very peculiar properties; for example, the first derivative of the pressure with respect to density vanishes, implying an infinite compressibility.
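As a brief numerical sketch of the virial expansion (1.90) truncated at second order, with a hypothetical van der Waals-like coefficient B2(T) = b − a/kT whose parameters a and b are invented purely for illustration:

```python
# Virial equation of state, Eqn. (1.90), truncated at second order.
k = 1.380649e-23                  # J/K
a, b = 1.0e-48, 5.0e-29           # illustrative interaction parameters (SI units)

def p_over_kT(rho, T):
    B2 = b - a / (k * T)          # hypothetical van der Waals-like B2(T)
    return rho + B2 * rho**2      # ideal-gas term plus the first correction

rho = 1.0e27                      # number density in m^-3
for T in (100.0, 300.0, 1000.0):
    print(T, p_over_kT(rho, T) / rho)   # ratio to the ideal-gas value; deviates from 1
```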

In the case of a paramagnet, the equation of state H = H(M, T) is represented schematically in Fig. 1.3(b). In this case the ideal approximation works relatively well for low values of H. However, in a ferromagnetic material the phenomenon of spontaneous magnetisation takes place at a temperature T = Tc: below this temperature, a non-zero magnetisation arises in the material, even at H = 0. The origin of this is the interaction between spins, which reinforces the effect of any external magnetic field and may create an internal magnetic field even when an external magnetic field is absent.
