Statistical Mechanics
Contents

Chapter 1. Ergodicity and the Microcanonical Ensemble
1. From Hamiltonian Mechanics to Statistical Mechanics
2. Two Theorems From Dynamical Systems Theory
3. The Microcanonical Ensemble and the Ergodic Hypothesis
4. Density Operators in Quantum Mechanics
5. Discussion
CHAPTER 1
Ergodicity and the Microcanonical Ensemble
The pressure in an ideal gas, recall, is proportional to the average kinetic energy
per molecule. Since pressure may be understood as an average over billions upon
billions of microscopic collisions, this simple relationship illustrates how statistical
techniques may be used to suppress information about what each individual molecule
is doing in order to extract information about what the molecules do on average
as a whole. Our first task, as we examine the foundations of statistical mechanics,
is to understand more precisely why this suppression is necessary and how it is to
be accomplished. We must, therefore, begin by considering the laws of microscopic
dynamics. In physics, there are two choices here: the laws of classical mechanics
and the laws of quantum mechanics. Remarkably, the choice is not important; in
either case, detailed solutions to the dynamical equations are completely unnecessary.
We will consider both cases, but follow the classical route through Hamiltonian
mechanics first, as this provides the clearest introduction to the structure of
statistical mechanics. In this section, we review the essential elements of Hamiltonian
mechanics and motivate the basic elements of the probabilistic framework developed
in the rest of the chapter: ensembles and distribution functions here, the supporting
theorems from dynamical systems theory and the microcanonical ensemble in Sections
2 and 3, and the quantum mechanical counterpart of this framework in Section 4.
1. From Hamiltonian Mechanics to Statistical Mechanics
Newton’s second law for a particle of mass m,

F_total = m q̈,

is a second-order ordinary differential equation. Therefore, given the instantaneous
values of the particle’s position q and momentum p = m q̇ at some time t = 0, the
particle’s subsequent motion is uniquely determined for all t > 0. For this reason,
the state of a classical system consisting of n configurational degrees of freedom can
be thought of as a point (q1, . . . , qn, p1, . . . , pn) in a 2n-dimensional space called the
phase space of the system. As the state evolves in time, this point will trace out
in phase space a trajectory defined by the tangent vector,
(1) v(t) = (q̇1(t), . . . , q̇n(t), ṗ1(t), . . . , ṗn(t)).
A Hamiltonian system evolves according to the canonical equations of motion,

(2) q̇i = ∂H(q,p, t)/∂pi,

(3) ṗi = −∂H(q,p, t)/∂qi,
where the function
H(q,p, t) = H(q1, . . . , qn, p1, . . . , pn, t)
is called the Hamiltonian of the system. These equations represent the full content
of Newtonian mechanics. Note that exactly one trajectory passes through each
point in the phase space; the classical picture is completely deterministic.
Example (single particle dynamics). Find the canonical equations of motion
for a single particle of mass m in an external potential V (q).
Solution. The Hamiltonian for this system is simply
H(q, p) = p²/(2m) + V (q),
which we recognize as the sum of the kinetic and potential energies. This leads to
the following dynamical equations:
q̇ = p/m,

ṗ = −∂V (q)/∂q.
A system of many interacting particles has a similar solution, though the potential
term becomes much more complicated.
We see, therefore, that the first canonical equation (2) generalizes the relationship
between velocity and momentum (in a more complicated system, the i-th momentum
may depend on several of the qi and q̇i). Similarly, the second canonical equation
(3) generalizes the rule that force may be expressed as the negative gradient of an
energy function.
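As a concrete sketch (our own illustration, not part of the text; the integrator, parameters, and initial condition are arbitrary choices), the canonical equations for the oscillator potential V(q) = ½mω²q² can be integrated numerically and the energy H = p²/(2m) + V(q) checked for conservation along the computed trajectory:

```python
def leapfrog(q, p, m, omega, dt, steps):
    """Integrate qdot = p/m, pdot = -dV/dq for V(q) = 0.5*m*omega^2*q^2
    using a symplectic leapfrog (velocity Verlet) scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * m * omega**2 * q   # half kick: pdot = -dV/dq
        q += dt * p / m                    # full drift: qdot = dH/dp = p/m
        p -= 0.5 * dt * m * omega**2 * q   # half kick
    return q, p

def energy(q, p, m, omega):
    """H(q, p) = p^2/(2m) + 0.5*m*omega^2*q^2."""
    return p**2 / (2.0 * m) + 0.5 * m * omega**2 * q**2

q0, p0 = 1.0, 0.0
qT, pT = leapfrog(q0, p0, m=1.0, omega=1.0, dt=0.01, steps=10_000)
drift = abs(energy(qT, pT, 1.0, 1.0) - energy(q0, p0, 1.0, 1.0))
```

A symplectic step is chosen here because it respects the phase-space structure discussed later in the chapter; the observed energy drift stays far below the energy itself.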
In a Hamiltonian system, the time dependence of any function of the momenta
and coordinates
f = f(q1, . . . , qn, p1, . . . , pn, t)
can be written,
(4) df/dt = {f, H} + ∂f/∂t,

where {f, H} is the Poisson bracket of the function f and the Hamiltonian.
The Poisson bracket of two functions f1 and f2 with respect to a set of canonical
variables is defined as
(5) {f1, f2} = ∑_{j=1}^{n} ( ∂f1/∂qj · ∂f2/∂pj − ∂f1/∂pj · ∂f2/∂qj ).
The Poisson bracket is important in Hamiltonian dynamics because it is independent
of how the various coordinates and momenta are defined; that is, {f1, f2} takes
the same value for any set of canonical variables q and p. Furthermore, the canonical
equations of motion can be re-written in the following form,

(6) q̇i = {qi, H},

(7) ṗi = {pi, H}.
This is known as the Poisson bracket formulation of classical mechanics. It is impor-
tant to recognize that very similar expressions arise in quantum mechanics (we’ll
look at these in Section 4). Indeed, every classical expression involving Poisson
brackets has a quantum analogue employing commutators. This elegant correspon-
dence principle, first pointed out by Dirac, has deep significance for the relationship
between classical and quantum physics. It also provides our first glimpse of why
statistical mechanics transcends the details of the microscopic equations of motion.
For now, we return to the classical route into the heart of statistical mechanics.
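As an illustrative check (our own sketch; the finite-difference evaluation and the oscillator Hamiltonian are assumptions for the demonstration), definition (5) can be evaluated numerically at a phase point: the canonical relation {q, p} = 1 should emerge, and {q, H} and {p, H} should reproduce the right-hand sides of (6) and (7):

```python
def poisson_bracket(f, g, x, h=1e-5):
    """Evaluate {f, g} at the phase point x = (q1..qn, p1..pn) by applying
    equation (5) with central-difference derivatives."""
    n = len(x) // 2
    def d(func, i):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        return (func(xp) - func(xm)) / (2.0 * h)
    return sum(d(f, j) * d(g, n + j) - d(f, n + j) * d(g, j) for j in range(n))

# one-dimensional harmonic oscillator with m = omega = 1, phase point x = (q, p)
H = lambda x: 0.5 * (x[0] ** 2 + x[1] ** 2)
q = lambda x: x[0]
p = lambda x: x[1]

x0 = (0.3, -1.2)
qp = poisson_bracket(q, p, x0)    # canonical relation: {q, p} = 1
qdot = poisson_bracket(q, H, x0)  # equation (6): should equal p = -1.2
pdot = poisson_bracket(p, H, x0)  # equation (7): should equal -q = -0.3
```

Because the test functions are quadratic, the central differences are essentially exact here; for general observables the step h trades truncation error against rounding error.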
Examining a physical system from the classical mechanical point of view, one
first constructs the canonical equations of motion and then integrates these from
known initial conditions to determine the phase trajectory. If the system of in-
terest involves a macroscopic number of particles, this approach condemns one to
numerical computations involving matrices of bewildering size. Yet system size is
not the major obstacle: The canonical equations of motion are in general nonlin-
ear and, as a result, small changes in system parameters or initial conditions may
lead to large changes in system behavior. In particular, neighboring trajectories in
many nonlinear systems diverge from one another at an exponential rate, a phe-
nomenon known as sensitive dependence on initial conditions or, more popularly,
as the butterfly effect, the idea being that a flap of a butterfly’s wings may make
the difference between sunny skies and snow two weeks later. Systems exhibiting
sensitive dependence on initial conditions are said to be chaotic. Calculations of
chaotic trajectories are intolerant of even infinitesimal errors, such as those aris-
ing from finite precision and uncertainties in the state of the system. Therefore,
setting aside the impractical integration problem of calculating a high-dimensional
phase trajectory, our necessarily incomplete knowledge of initial conditions in a
macroscopic system seriously compromises our ability to predict future evolution.
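The divergence of neighboring trajectories is easy to exhibit numerically. This sketch (our own illustration; it uses the one-dimensional logistic map rather than a mechanical system, since sensitive dependence is most cheaply displayed in a simple chaotic map) follows two orbits whose initial conditions differ by 10⁻¹⁰:

```python
def logistic_orbit(x0, r=4.0, steps=50):
    """Iterate the chaotic logistic map x -> r*x*(1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-10)          # nearly identical initial condition
separation = [abs(u - v) for u, v in zip(a, b)]
# the tiny initial separation is amplified to order one within a few dozen steps
```

At r = 4 the map's Lyapunov exponent is ln 2, so errors roughly double each iteration; a 10⁻¹⁰ discrepancy reaches order one after about 35 steps, which is the butterfly effect in miniature.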
Though the prospects for dealing directly with the phase trajectories of a macro-
scopic system of particles seem hopeless, it is not the case that we must discard
all knowledge of the microscopic physics of the system. There are many macro-
scopic phenomena which cannot be understood from a purely macroscopic point
of view. What is combustion? What determines whether a solid will be a metal
or an insulator? What are the energy sources in stellar and galactic cores? These
questions are best dealt with by appealing to various microscopic details. On the
other hand, given the success of the laws of thermodynamics, it is evident that
macroscopic systems exhibit a collective regularity where the exact details of each
particle’s motion and state are nonessential. This suggests that we may envision
the time evolution of macroscopic quantities in a Hamiltonian system as some sort
of average over all of the microscopic states consistent with available macroscopic
knowledge and constraints. For this reason, one abandons the mechanical approach
of computing the exact time evolution from a single point in phase space in favor
of a statistical approach employing averages over an entire ensemble of points in
phase space. This is accomplished as follows:
Consider a large collection of identical copies of the system, distributed in phase
space according to a known distribution function,
ρ(q,p, t) = ρ(q1, . . . , q3N , p1, . . . , p3N , t),
where
(8) ∫ ρ(q,p, t) dq dp = 1 for all t.
ρ(q,p, t) is the density in phase space of the points representing the ensemble,
normalized according to (8), and may be interpreted as describing the probability
of finding the system in various different microscopic states. Once ρ(q,p, t) is
specified, we can compute the probabilities of different values of any quantity f
which is a function of the canonical variables. We can also compute the mean value
〈f〉 of any such function f by averaging over the probabilities of different values,
(9) 〈f(t)〉 = ∫ f(q,p) ρ(q,p, t) dq dp.
Thus, instead of following the time evolution of a single system through many
different microscopic states, we consider at a single time an ensemble of copies of
the system distributed into these states according to probability of occupancy. This
shift is one of the cornerstones of statistical mechanics.
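The ensemble average (9) can be approximated by direct Monte Carlo sampling. In the sketch below (our own illustration; the distribution and observable are hypothetical choices), the ensemble has q uniform on a unit interval and p Gaussian, and we estimate the mean kinetic energy 〈p²/2m〉, whose exact value for m = 1 and unit-variance momenta is 1/2:

```python
import random

def ensemble_average(f, sample_state, n=200_000):
    """Monte Carlo estimate of <f> = integral of f(q, p) rho(q, p) dq dp,
    with states (q, p) drawn from the ensemble's distribution rho."""
    return sum(f(*sample_state()) for _ in range(n)) / n

random.seed(0)  # reproducible sampling

# hypothetical one-particle ensemble: q uniform on [0, 1), p Gaussian (m = 1)
sample = lambda: (random.random(), random.gauss(0.0, 1.0))
kinetic = lambda q, p: 0.5 * p * p

avg_ke = ensemble_average(kinetic, sample)  # exact ensemble average is 1/2
```

The point is the shift of viewpoint: nothing here follows a trajectory in time; we simply average over many copies of the system distributed according to ρ.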
Exercise 1.1. Derive equation (4). HINT: Use the chain rule

df/dt = ∑_i (∂f/∂qi)(dqi/dt) + ∑_i (∂f/∂pi)(dpi/dt) + ∂f/∂t.
Exercise 1.2. Show that H(q,p, t) is a constant of the motion if and only if
it does not depend explicitly on time.
Exercise 1.3. Show that the canonical equations of motion can be re-written
in the following form,

(10) q̇i = {qi, H},

(11) ṗi = {pi, H}.

This is known as the Poisson bracket formulation of classical mechanics.
Exercise 1.4. Compute the following Poisson brackets:

(1) {qi, qj}
(2) {qi, pj}

Are your results in any way familiar, given your knowledge of quantum mechanics?
If so, how do the interpretations of these results differ from their quantum
mechanical analogues?
Exercise 1.5. Show that the canonical equations of motion can be written in
the symplectic form,

ẋ = M ∂H(q,p, t)/∂x,

where x = (q,p). (What’s M in this expression?)
2. Two Theorems From Dynamical Systems Theory
One is often interested in general qualitative questions about a system’s dy-
namics, such as the existence of stable equilibria or oscillations. In discussing such
questions, mathematicians often speak of the flow of a dynamical system: Any
autonomous system of ordinary differential equations can be written in the form
(12) ẋ = f(x)
(changes of variables may be required if the equations involve second-order and
higher derivatives). If we interpret a general system of differential equations (12)
as representing a fluid in which the fluid velocity at each point x is given by the
vector f(x), then we may envision any particular point x0 as flowing along the
trajectory φ(x0) defined by the velocity field. More precisely, we define
φt(x0) = φ(x0, t),
where φ(x0, t) is a point on the trajectory φ(x0) passing through the initial condition
x0; φt maps the starting point x0 to its location after moving with the flow for a
time t. It is important to note that φt defines a map on the entire phase space —
we may envision the entire phase space flowing according to the velocity field defined
by (12). Indeed, we shall see in this section that this fluid metaphor is especially
appropriate in statistical mechanics.
The notion of the flow of a dynamical system very naturally accommodates a shift
towards considering how whole regions of phase space participate in the dynamics,
a shift away from the language of initial conditions and trajectories. This shift is
what enables mathematicians to state and prove general theorems about dynamical
systems. It also turns out that this shift provides the natural setting for several of
the central concepts of statistical mechanics. In the previous section, we motivated
a statistical framework in which, rather than follow the time evolution of a single
system, we consider at a single time an ensemble of copies of that system distributed
in phase space according to probability of occupancy. The main player in this new
framework is the distribution function ρ(q,p, t) describing the ensemble. ρ allows
us to take into account which states in phase space a system is likely to
occupy¹. In this section, we examine how the ensemble interacts with the
flow defined by a set of canonical equations. It turns out that, in a Hamiltonian
system, the time evolution of ρ has several interesting properties, which are the
subject of two important theorems from dynamical systems theory.
We begin with a simple calculation of the rate of change of ρ. We know from
(4), which describes the time evolution of any function of the canonical variables q
and p, that
(13) dρ/dt = ∂ρ/∂t + {ρ, H}.
However, we also know from local conservation of probability that ρ must satisfy a
continuity equation,
(14) ∂ρ/∂t + ∇ · (ρv) = 0,

where

∇ = ( ∂/∂q1, . . . , ∂/∂qn, ∂/∂p1, . . . , ∂/∂pn )

is the gradient operator in phase space and v is defined in (1). Applying the product
rule, we see that

(15) ∇ · (ρv) = {ρ, H} + ρ(∇ · v).
Since ∇ · v vanishes for a Hamiltonian system, comparing (13) with (14) gives

(16) ∂ρ/∂t + {ρ, H} = 0

and therefore

(17) dρ/dt = 0.

This result is known as Liouville’s theorem. The partial derivative term in (16)
expresses the change in ρ due to elapsed time dt, while the (∇ρ) · v = {ρ, H}
term expresses the change in ρ due to motion along the vector field a distance v dt.
¹Mathematicians include this as part of a more general approach, called measurable dynamics,
which we need not go into here.
Thus, Liouville’s theorem tells us that the local probability density — as seen by
an observer moving with the flow in phase space — is constant in time; that is, ρ is
constant along phase trajectories. The theorem can also be interpreted as stating
that, in a Hamiltonian system, phase space volumes are conserved by the flow or,
equivalently, that ρ moves in phase space like an incompressible fluid.
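Liouville's theorem can be probed numerically: because the flow preserves phase-space volume, the Jacobian determinant of the time-T flow map must equal 1. This sketch (our own construction; the pendulum Hamiltonian H = p²/2 − cos q, the step sizes, and the base point are arbitrary choices) estimates that determinant by central differences:

```python
import math

def flow_map(q, p, dt=0.01, steps=500):
    """Time-T map (T = steps*dt) of a pendulum, H = p^2/2 - cos q,
    advanced with a symplectic leapfrog step."""
    for _ in range(steps):
        p -= 0.5 * dt * math.sin(q)   # pdot = -dH/dq = -sin q
        q += dt * p                   # qdot =  dH/dp =  p
        p -= 0.5 * dt * math.sin(q)
    return q, p

def jacobian_det(q, p, h=1e-6):
    """Determinant of d(flow_map)/d(q, p), estimated by central differences."""
    qa, pa = flow_map(q + h, p)
    qb, pb = flow_map(q - h, p)
    qc, pc = flow_map(q, p + h)
    qd, pd = flow_map(q, p - h)
    dq_dq, dp_dq = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    dq_dp, dp_dp = (qc - qd) / (2 * h), (pc - pd) / (2 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq

det = jacobian_det(1.0, 0.5)   # should be very close to 1
```

A determinant of 1 is exactly the statement that a small phase-space parallelogram carried by the flow neither expands nor contracts, i.e. that ρ moves like an incompressible fluid.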
From the incompressible fluid analogy, we see that while Hamiltonian systems
can exhibit chaotic dynamics, they cannot have any attractors! Liouville’s theorem
has other important consequences when combined with system constraints, such
as conservation laws. Conservation laws constrain the flow to lie on families of
hypersurfaces in phase space. These surfaces are bounded and invariant under the
flow:
(18) φt(X) = X
for each hypersurface X defined by a conservation law. When volume-preserving
flows are restricted to bounded, invariant regions of phase space, a surprising result
emerges: Let X be a bounded region of phase space which is invariant under a
volume-preserving flow. Take any region S which occupies a finite fraction of the
total volume in X (this specifically excludes what mathematicians call sets of mea-
sure zero: sets with no volume). Then any randomly selected initial condition x in
S generates a trajectory φt(x) which returns to S infinitely often; this is known
as the Poincaré recurrence theorem.
In order to understand where this theorem comes from and what it means, we
consider how the region S moves under the flow. Define a function f which maps
S along the flow for a time T ,
f(S) = φT (S).
Subsequent iterations of this time-T map produce a sequence of subsets of X,
f^2(S) = φ_{2T}(S), f^3(S) = φ_{3T}(S), and so on, all with finite volume in X. Each
iteration takes a bite out of X and so, if we iterate enough times, eventually we must
exhaust all of the volume in X. As a result, two of these subsets must intersect; i.e.
there must exist integers i and j, with i > j, such that f^i(S) ∩ f^j(S) is non-empty.
This implies that f^{i−j}(S) ∩ S is also non-empty. S must fold back on itself repeatedly
under this time-T flow map. By considering small subsets of S, which must also have
this property, we can convince ourselves that a randomly selected point in S does
indeed return to S infinitely often (for a precise proof of the theorem, see references
at the end of the chapter). The Poincaré recurrence theorem as stated implies that almost
every initial condition x0 in the bounded region X generates a trajectory which
returns arbitrarily close to x0 infinitely many times. This recurrence property is
truly remarkable when you consider the bewildering array of nonlinear Hamiltonian
systems to which it may be applied. Indeed, the Poincaré recurrence theorem is
considered the first great theorem of modern dynamics; we will have more to say
about its role in statistical mechanics later on.
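The recurrence phenomenon is easy to observe in a simple volume-preserving map. The sketch below (our own illustration; the circle rotation is not a mechanical system, but it is the simplest measure-preserving map on a bounded invariant set) iterates x → x + α (mod 1) and records the times at which the orbit re-enters a small neighborhood S of its starting point:

```python
import math

def rotation_orbit(x0, alpha, steps):
    """Orbit of the measure-preserving circle map x -> x + alpha (mod 1)."""
    x, orbit = x0, []
    for _ in range(steps):
        x = (x + alpha) % 1.0
        orbit.append(x)
    return orbit

alpha = math.sqrt(2.0) - 1.0          # irrational rotation number
orbit = rotation_orbit(0.0, alpha, 20_000)

eps = 1e-3                            # half-width of the neighborhood S of x0 = 0
def circle_dist(x):
    """Distance from x to the starting point 0, measured on the circle."""
    return min(x, 1.0 - x)

return_times = [t + 1 for t, x in enumerate(orbit) if circle_dist(x) < eps]
```

The orbit keeps coming back: in agreement with the recurrence theorem, the list of return times grows without bound as the orbit is extended, a fraction of all iterates proportional to the size of S landing inside it.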
3. The Microcanonical Ensemble and the Ergodic Hypothesis
As discussed earlier, the role of the ensemble in statistical mechanics is to
provide a probabilistic method of extracting important information about a macro-
scopic system. Naturally, the choice of a particular ensemble depends on the phys-
ical problem of interest but quite often one is interested in equilibrium properties
of a physical system. In this special case of equilibrium statistical mechanics, we
expect that ensemble averages (9) do not depend explicitly on time. This implies
(19) ∂ρ(q,p, t)/∂t = 0.
An ensemble satisfying (19) is said to be stationary. Note that a stationary
ensemble satisfying Liouville’s Theorem (17) has a vanishing Poisson bracket with
the Hamiltonian,
(20) {ρ(q,p), H(q,p)} = 0.
Since {qi, pj} = δij, no function of q or p alone will satisfy (20). The general
solution for a stationary ensemble therefore has the form
ρ(q,p) = ρ(H(q,p)).
The Hamiltonian plays an important role in determining the form of the distribution
function.
The simplest example of a stationary ensemble is the microcanonical ensem-
ble, in which the probabilities are uniformly distributed across the hypersurfaces
in phase space defined by energy conservation:
(21) ρ(q,p) = constant.
The microcanonical ensemble is one of the cornerstones of equilibrium statistical
mechanics. Accordingly, many introductory textbooks begin with the assumption
that this ensemble is valid, that all probabilities are equal a priori. It is not difficult,
however, to construct low-dimensional examples for which this ensemble is clearly
not valid. Why then, beyond the demonstrable success of statistical mechanics
as a physical theory, do we believe in a priori equal probabilities? The answer
comes again from dynamical systems theory. In this section, we expose some of
the dynamical machinery underlying the microcanonical ensemble. In particular,
we will introduce what physicists call the ergodic hypothesis and discuss how this
hypothesis places statistical mechanics on firm ground.
Recall from the preceding section that conservation of energy constrains the
flow to lie on families of hypersurfaces in phase space; for what follows, we consider
one such surface X. Recall also that X is bounded and is invariant (18). Since the
flow is Hamiltonian, we know that almost every point is recurrent. What we don’t
know is how “intertwined” the phase trajectories are. That is, it’s conceivable that
X can be broken up into a number of different invariant subspaces X1, X2,. . . ,
φt(Xi) = Xi for all i,
without violating the Poincaré recurrence theorem. When this partitioning into
invariant subspaces is not possible, the flow is said to be ergodic. More precisely,
the flow is ergodic if and only if the only invariant subspaces of X occupy a volume
equal to that of X or else occupy zero volume. This means that in an ergodic flow
almost every trajectory wanders almost everywhere on X. Note how much stronger
this is than the recurrence property. The Poincaré recurrence theorem is always
true for a general Hamiltonian system but we are not given ergodicity a priori. In
physics, we assume that the Hamiltonian systems used to describe the macroscopic
physical world have ergodic flows²; this is known as the ergodic hypothesis.
²Remember that we’re talking about enormous systems involving over 10²³ particles here
and not low-dimensional systems such as those used to describe rigid body motion. More about
this later.
In order to better understand the consequences of the ergodic hypothesis and
how it connects with the microcanonical ensemble, we need two more theorems
from dynamical systems. These will be stated without proof; interested readers
can refer to the end of the chapter for recommendations on where to find a more
precise treatment of what follows. The first theorem we need states very simply
that a flow is ergodic if and only if the only functions which are invariant under the
flow are constants,
(22) φt is ergodic ⇐⇒ whenever f(φt(x)) = f(x) for all t, f = constant.
This is plausible. Ergodic flows, recall, have trajectories wandering almost every-
where in X. And so, having a function be constant on some trajectory means that
it must be constant almost everywhere in X. This result leads us straight to the
microcanonical ensemble. From Liouville’s theorem, we know that the distribution
function ρ is invariant under the flow. When we add the ergodic hypothesis, (22)
tells us right away that ρ must be a constant on each energy surface X. Thus, the
ergodic hypothesis replaces the need to accept on faith a separate assumption of a
priori equal probabilities.
Next we want to consider how functions of the dynamical variables behave when
averaged along trajectories in a Hamiltonian flow (and we mean any flow here, not
just an ergodic one). Does the following have a well defined limit:
lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt ?
The answer is yes. The Birkhoff pointwise ergodic theorem states that, for almost
all x, these time averages converge to something:

lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt = f∗(x).
The limit may depend on x, which is why f∗ is written above as a function of x
(and is why mathematicians call this a “pointwise” theorem), but the limit almost
always exists³; this is the kind of thing a physicist is usually willing to take on
faith. This theorem also, however, makes some interesting statements about the
³We have to say “for almost all x” because, though the limit exists for a randomly selected
x, there may be a set of measure zero for which convergence fails and we want to be careful.
limiting function f∗(x). First, this limiting function is invariant under the flow,
(23) f∗(φt(x)) = f∗(x).
Even more surprising, the ensemble average of the limiting function simply equals
the ensemble average of the original function f,

(24) ∫ f∗(x) ρ(x) dx = ∫ f(x) ρ(x) dx;
somehow time averaging under the integral sign doesn’t affect the value of the
ensemble average!
The Birkhoff theorem does not assume that the flow is ergodic. When we combine
this theorem with the ergodic hypothesis of statistical mechanics, we get a
major result: Begin with any function of the dynamical variables f(x). We’re
interested in the long time average of this function,

lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt = f∗(x).

We know from (23) that the limiting function f∗(x) is invariant under the flow.
Since the flow is ergodic, (22) implies that f∗(x) is a constant (almost everywhere).
We can actually compute this constant by integrating over the ensemble,

∫ f∗(x) ρ(x) dx = f∗ ∫ ρ(x) dx = f∗.
However, we know from (24) the term on the left is equal to the ensemble average
of the original function f . Therefore,
(25) lim_{T→∞} (1/T) ∫_0^T f(φt(x)) dt = ∫ f(x) ρ(x) dx;
statistical averaging over the entire ensemble at fixed time is equivalent to time-
averaging a single member of the ensemble. This consequence of the ergodic
hypothesis is the justification for replacing macroscopic averages over computed
trajectories with an ensemble theory. Consider how, when we compute the pressure in
an ideal gas using kinetic theory, we ignore time evolution and consider only what
a typical gas molecule is doing on average. This works precisely because of the
equivalence of ensemble averaging and time averaging. Indeed, all of equilibrium
statistical mechanics may be understood in terms of this result. It gives physicists
great confidence that the ergodic hypothesis has not led them astray. Furthermore,
to the extent that all measurements in the lab are time averages, ergodicity and
the microcanonical ensemble firmly ground macroscopic measurements in the mi-
croscopic dynamics of the system being investigated. No experiment to date has
shaken our confidence in the foundations of statistical mechanics.
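Equation (25) can be watched in action for the simplest ergodic flow, an irrational rotation of the circle. In this sketch (our own illustration; the observable and rotation number are arbitrary choices) the time average of f(x) = cos 2πx along a single orbit converges to the ensemble average, the integral of f against the uniform density on [0, 1), which is 0:

```python
import math

def time_average(f, x0, alpha, steps):
    """(1/T) times the sum of f along the orbit of x -> x + alpha (mod 1)."""
    x, total = x0, 0.0
    for _ in range(steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / steps

alpha = (math.sqrt(5.0) - 1.0) / 2.0   # irrational, so the rotation is ergodic
f = lambda x: math.cos(2.0 * math.pi * x)

t_avg = time_average(f, x0=0.123, alpha=alpha, steps=100_000)
ensemble_avg = 0.0                     # exact: integral of cos(2*pi*x) over [0,1)
```

A single orbit, followed long enough, visits the circle so uniformly that its running average reproduces the ensemble average; this is equation (25) in its most elementary setting.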
EXERCISE: The indicator function.
4. Density Operators in Quantum Mechanics
In classical physics, the state of a system at some fixed time t is uniquely
defined by specifying the values of all of the generalized coordinates qi(t) and mo-
menta pi(t). In quantum mechanics, however, the Heisenberg uncertainty principle
prohibits simultaneous measurements of position and momentum to arbitrary pre-
cision. We might therefore anticipate some revisions in our approach. It turns
out, however, that the classical ensemble theory developed above carries over into
quantum mechanics with hardly any revision at all. Most of the necessary alterations
are built directly into the edifice of quantum mechanics and all we need is to find
suitable quantum mechanical replacements for the density function ρ(q, p) and Li-
ouville’s Theorem. Understanding this is the goal of this section. Readers who are
unfamiliar with Dirac notation and the basic concepts of quantum mechanics are
referred to the references at the end of the chapter.
The uncertainty principle renders the concept of phase space meaningless in
quantum mechanics. The quantum state of a physical system is instead repre-
sented by a state vector, |ψ〉, belonging to an abstract vector space called the
state space of the system. The use of an abstract vector space stems from the
important role that superposition of states plays in quantum mechanics — lin-
ear combinations of states provide new states and, conversely, quantum states can
always be decomposed into linear combinations of other states. The connection
between these abstract vectors and experimental results is supplied by the formal-
ism of linear algebra, by operators and their eigenvalues. Dynamical variables,
such as position and energy, are represented by self-adjoint linear operators on the
state space and the result of any measurement made on the system is always rep-
resented by the eigenvalues of the appropriate operator (that is, the eigenvectors
of an observable physical quantity form a basis for the entire state space). This
use of operators and eigenvalues directly encodes many of the distinct hallmarks of
quantum mechanical systems: Discretization, such as that of angular momentum or
energy observed in numerous experiments, simply points to an operator
with a discrete spectrum of eigenvalues. And wherever the order in which several
different measurements are made may affect the results obtained, the associated
quantum operators do not commute.
In quantum mechanics, the time evolution of the state vector is described by
Schrödinger’s equation,

(26) iℏ ∂|ψ(t)〉/∂t = H(t) |ψ(t)〉,
where H(t) is the Hamiltonian operator for the system; this evolution law replaces
the canonical equations of classical mechanics.
Exercise 1.6 (single particle dynamics). Write down, using wavefunctions
ψ(q, t), Schrödinger’s equation for a single particle of mass m in an external
potential V (q).

Solution. Recall that the classical Hamiltonian for this system is simply

H(q, p) = p²/(2m) + V (q).

We transform this into a quantum operator by replacing q and p with the appropriate
quantum operators: q is the position operator and

p = (ℏ/i) ∂/∂q

is the momentum operator for a wavefunction ψ(q, t). Then, Schrödinger’s equation
(26) becomes the following partial differential equation,

(27) iℏ ∂ψ(q, t)/∂t = ( −(ℏ²/(2m)) ∇² + V (q) ) ψ(q, t).
Schrödinger’s equation has a number of nice properties. First, as a linear
equation, it directly expresses the principle of superposition built into the vector
structure of the state space — linear combinations of solutions to (26) provide new
solutions. In addition, it can be shown that the norm of a state vector, 〈ψ(t)|ψ(t)〉,
is invariant in time; this turns out to have a nice interpretation in terms of local
conservation of probability. On the other hand, the Schrödinger equation is not easy
to solve directly. Even a system as simple as the one-dimensional harmonic
oscillator requires great dexterity. For a macroscopic system, (26) generates either
an enormous eigenvalue problem or a high-dimensional partial differential equation
(consider the generalization of (27) to a many-body system). Either way, we see
that direct solution is hopeless. The situation is essentially identical with that of
macroscopic classical mechanics — the mathematics and, more importantly, our
lack of information about the microscopic state (quantum numbers, in this case)
necessitate a statistical approach.
We would like to find a quantum mechanical entity that replaces the classical
probability density ρ(q,p), which uses probabilities to represent our ignorance of
the true state of the system. Unfortunately, the usual interpretation of quantum
mechanics already employs probabilities on a deeper level: If the measurement of
some physical quantity A in this system is made a large number of times (i.e. on
a large ensemble of identically prepared systems), the average of all the results
obtained is given by the expectation value
(28) 〈A〉 = 〈ψ|A|ψ〉,
provided the quantum state |ψ(t)〉 is properly normalized to satisfy 〈ψ(t)|ψ(t)〉 = 1.
In order to understand the consequences of this, we introduce a basis of eigenstates
for the operator A. Let |ai〉 be the eigenvector corresponding to the eigenvalue ai.
Since the |ai〉 form a basis, we can expand the identity operator as follows,
(29) 1 = ∑_i |ai〉〈ai|.
Inserting this operator into (28) twice, we obtain
(30) 〈A〉 = ∑_i ai |〈ai|ψ〉|².
Comparing this result to the definition of the expectation value,
(31) 〈A〉 = ∑_i ai p(ai),

we see that |〈ai|ψ〉|² must be interpreted as representing the probability p(ai) of
obtaining ai as the result of the measurement. This probabilistic framework replaces
the classical notion of a dynamical variable having a definite value. While the ex-
pectation value of A is a definite quantity, particular measurements are indefinite
— in quantum mechanics we can only talk about the probabilities of different out-
comes of an experiment. Now we can introduce an ensemble. Instead of considering
a single state |ψ〉, let pk represent the probability of the system being in a quantum
state represented by the normalized state vector |ψk〉. If the system is actually in
state |ψk〉, then the probability of measuring ai is simply |〈ai|ψk〉|². If, however,
we are uncertain about the true state then we have to average over the ensemble.
In this case, the total probability of measuring ai is given by
(32) p(ai) = ∑_k pk |〈ai|ψk〉|² = 〈ai| ( ∑_k |ψk〉 pk 〈ψk| ) |ai〉.
The object in parentheses in this last expression,
(33) ρ = ∑_k |ψk〉 pk 〈ψk|,
is known as the density operator. (33) turns out to be exactly what we’re looking
for, the quantum mechanical operator corresponding to the classical density
function ρ(q, p). Recall that the classical density satisfies the following properties:
(1) Non-negativity of probabilities: ρ(q, p) must be non-negative for all points
in the phase space.

(2) Normalization of probabilities: ∫ ρ(q, p) dq dp = 1.

(3) Expectation values: The average value of a dynamical variable A(q, p)
across the entire ensemble represented by ρ(q, p) is given by

〈A〉 = ∫ A(q, p) ρ(q, p) dq dp.
These properties carry over into the quantum mechanical setting, with appropriate
modification (see exercises). In particular, it can be shown that
$\langle A\rangle = \mathrm{trace}\{A\rho\}.$
Apart from traces over a density operator replacing integration over the classical
ensemble, the statistical description of a complex quantum system is essentially no
different than that of a complex classical system. The time evolution of the density
operator ρ will be given by a quantum version of Liouville’s Theorem and will lead
to the same notions of a microcanonical ensemble and ergodicity.
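This bookkeeping is easy to check numerically. The following is a minimal sketch, not part of the text's formal development; the dimension, the ensemble of random states, the weights, and the measurement basis are all arbitrary illustrative choices, assuming numpy is available. It builds $\rho$ from an ensemble as in (33) and verifies the Born-rule average (32) in a random orthonormal basis.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

# An ensemble: normalized state vectors |psi_k> occurring with probabilities p_k.
states = [rng.normal(size=dim) + 1j * rng.normal(size=dim) for _ in range(3)]
states = [v / np.linalg.norm(v) for v in states]
p = np.array([0.5, 0.3, 0.2])

# Density operator rho = sum_k |psi_k> p_k <psi_k|  (equation (33)).
rho = sum(pk * np.outer(v, v.conj()) for pk, v in zip(p, states))

# A random orthonormal basis standing in for the eigenbasis |a_i> of an observable.
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))

# Equation (32): p(a_i) = sum_k p_k |<a_i|psi_k>|^2 = <a_i| rho |a_i>.
for i in range(dim):
    a = Q[:, i]
    lhs = sum(pk * abs(a.conj() @ v) ** 2 for pk, v in zip(p, states))
    rhs = (a.conj() @ rho @ a).real
    assert np.isclose(lhs, rhs)
```

The assertions pass for any choice of ensemble and basis, since (32) is an operator identity rather than a property of a particular state.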
First, we derive the quantum evolution law for $\rho$. Using the product rule, we can write

(34) $i\hbar\,\dfrac{\partial\rho}{\partial t} = \sum_k i\hbar \left[ \left(\dfrac{\partial}{\partial t}|\psi_k\rangle\right) p_k\, \langle\psi_k| + |\psi_k\rangle\, p_k \left(\dfrac{\partial}{\partial t}\langle\psi_k|\right) \right].$
Substituting the Schrödinger equation and its adjoint, this reduces to

(35) $i\hbar\,\dfrac{\partial\rho}{\partial t} = \sum_k \left[ \big(H|\psi_k\rangle\big) p_k\, \langle\psi_k| - |\psi_k\rangle\, p_k \big(\langle\psi_k|H\big) \right] = H\rho - \rho H.$
Thus,
(36) $\dfrac{\partial\rho}{\partial t} = -\dfrac{1}{i\hbar}\,[\rho, H],$
where [ρ,H] = ρH −Hρ is called the commutator of ρ and H. Note the striking
resemblance between (36) and Liouville’s Theorem — the commutator of the density
and Hamiltonian operators has replaced the classical Poisson bracket of the density
and Hamiltonian functions but the expressions are otherwise identical. This is a
special case of a correspondence first pointed out by Dirac:
classical Poisson bracket $\{u, v\}$ $\longrightarrow$ quantum commutator $\dfrac{1}{i\hbar}\,[u, v].$
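The evolution law (36) can also be checked numerically. The sketch below is an illustration, not part of the text; the Hamiltonian and the ensemble are random, and units with $\hbar = 1$ are an arbitrary choice. It evolves each ensemble member exactly under the Schrödinger equation (via the spectral decomposition of $H$) and compares a finite-difference derivative of $\rho(t)$ against the commutator side of (36).

```python
import numpy as np

rng = np.random.default_rng(1)
dim, hbar, dt = 4, 1.0, 1e-6

# A random Hermitian Hamiltonian and its spectral decomposition.
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (M + M.conj().T) / 2
evals, V = np.linalg.eigh(H)

def evolve(v, t):
    """Exact Schrodinger evolution |psi(t)> = exp(-iHt/hbar)|psi(0)>."""
    return V @ (np.exp(-1j * evals * t / hbar) * (V.conj().T @ v))

states = [rng.normal(size=dim) + 1j * rng.normal(size=dim) for _ in range(3)]
states = [v / np.linalg.norm(v) for v in states]
p = np.array([0.2, 0.5, 0.3])

def rho_at(t):
    return sum(pk * np.outer(evolve(v, t), evolve(v, t).conj())
               for pk, v in zip(p, states))

# Finite-difference d(rho)/dt versus  -(1 / i hbar) [rho, H]  from equation (36).
drho_dt = (rho_at(dt) - rho_at(0.0)) / dt
rhs = -(rho_at(0.0) @ H - H @ rho_at(0.0)) / (1j * hbar)

assert np.allclose(drho_dt, rhs, atol=1e-4)
```

The tolerance absorbs the first-order finite-difference error; the agreement tightens as dt shrinks.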
As in the classical setting, a stationary ρ should be independent of time; for an
equilibrium quantum system, ρ must therefore be a function of the Hamiltonian,
ρ(H). The simplest choice is again a uniform distribution,
(37) $\rho = \sum_k |\psi_k\rangle\,\dfrac{1}{n}\,\langle\psi_k|,$
where n is the number of states $|\psi_k\rangle$ in the ensemble. This is the quantum microcanonical ensemble. It is essentially the same as the classical one, except discrete.
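The stationarity of this choice is easy to verify in a small example. The following sketch is illustrative only; the Hamiltonian is random and the particular energy eigenstates making up the "shell" are an arbitrary choice. It builds the microcanonical $\rho$ of (37) from n energy eigenstates and confirms that it commutes with $H$, so the right-hand side of (36) vanishes.

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 6

# A random Hermitian Hamiltonian; its eigenvectors are the energy eigenstates.
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (M + M.conj().T) / 2
evals, V = np.linalg.eigh(H)

# Take n energy eigenstates inside a narrow "shell" and weight each by 1/n,
# as in equation (37): the quantum microcanonical ensemble.
shell = V[:, 2:5]
n = shell.shape[1]
rho = sum(np.outer(shell[:, k], shell[:, k].conj()) / n for k in range(n))

# rho is a projector divided by n, a function of H alone, so it commutes
# with H and is therefore stationary under equation (36).
commutator = rho @ H - H @ rho
print(np.allclose(commutator, 0))  # True
```

Any $\rho$ built from energy eigenstates with weights depending only on the energy passes the same check; the uniform weighting is just the simplest case.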
In short, the same statistical principles apply in the quantum setting; we need only switch to a discrete formalism, with traces over the density operator replacing integrals over the classical phase space.
Exercise 1.7. Show that the eigenvalues of the density operator are non-
negative.
Solution. Let ρ′ represent any eigenvalue of ρ and let |ρ′〉 be the eigenvector
associated with this eigenvalue. Then

$\sum_k |\psi_k\rangle\, p_k\, \langle\psi_k|\rho'\rangle = \rho|\rho'\rangle = \rho'|\rho'\rangle.$
Multiplying on the left by $\langle\rho'|$, we obtain

$\sum_k p_k\,\big|\langle\psi_k|\rho'\rangle\big|^2 = \rho'\langle\rho'|\rho'\rangle.$
It follows, since the $p_k$ are positive and $\langle\rho'|\rho'\rangle$ is positive (an eigenvector is nonzero), that $\rho'$ cannot be negative. Since the eigenvalues of $\rho$ play the role of the classical probability density, this result mirrors property (1) above.
Exercise 1.8. Show that the matrix representation of ρ in any basis satisfies
(38) $\mathrm{trace}\{\rho\} = 1.$
Solution. Consider a basis of eigenstates |ai〉 of the operator A. The matrix
elements $\rho_{ij} = \langle a_i|\rho|a_j\rangle$ are the representation of $\rho$ in this basis. Then,
$\mathrm{trace}\{\rho\} = \sum_i \langle a_i|\rho|a_i\rangle = \sum_i \sum_k p_k\,\big|\langle\psi_k|a_i\rangle\big|^2 = \sum_k p_k \Big( \sum_i \big|\langle\psi_k|a_i\rangle\big|^2 \Big) = \sum_k p_k = 1.$
Since the trace is invariant under a change of basis, this result holds for any basis.
The condition $\mathrm{trace}\{\rho\} = 1$ should be compared to the normalization property (2) above.
Exercise 1.9. Show that, in a quantum ensemble represented by the operator
ρ, the expectation value of an operator A satisfies
(39) $\langle A\rangle = \mathrm{trace}\{A\rho\}.$
Solution.
$\langle A\rangle = \sum_k p_k \langle\psi_k|A|\psi_k\rangle = \sum_{k,i} p_k\,\langle\psi_k|a_i\rangle\langle a_i|A|\psi_k\rangle = \sum_{i,k} \langle a_i|A|\psi_k\rangle\, p_k\, \langle\psi_k|a_i\rangle = \sum_i \langle a_i|A\rho|a_i\rangle = \mathrm{trace}\{A\rho\}.$
This result should be compared to the classical definition of expectation value, property (3) above.
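All three exercise results can be confirmed numerically in one pass. This sketch is illustrative only; the ensemble, its weights, and the observable are random choices. It checks non-negativity of the eigenvalues of $\rho$ (Exercise 1.7), the unit trace (Exercise 1.8), and the trace formula for $\langle A\rangle$ (Exercise 1.9).

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 5

# A generic ensemble and its density operator rho = sum_k |psi_k> p_k <psi_k|.
states = [rng.normal(size=dim) + 1j * rng.normal(size=dim) for _ in range(4)]
states = [v / np.linalg.norm(v) for v in states]
p = np.array([0.4, 0.3, 0.2, 0.1])
rho = sum(pk * np.outer(v, v.conj()) for pk, v in zip(p, states))

# Exercise 1.7: the eigenvalues of rho are non-negative.
assert np.all(np.linalg.eigvalsh(rho) > -1e-12)

# Exercise 1.8: trace{rho} = 1 (the trace is basis-independent).
assert np.isclose(np.trace(rho).real, 1.0)

# Exercise 1.9: <A> = trace{A rho} for a Hermitian observable A.
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
A = (M + M.conj().T) / 2
avg_ensemble = sum(pk * (v.conj() @ A @ v).real for pk, v in zip(p, states))
assert np.isclose(avg_ensemble, np.trace(A @ rho).real)
```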
5. Discussion
We close this chapter with a few remarks on the rich geometric structure of Hamiltonian dynamics and its uneasy relationship with the ergodic hypothesis.
One important feature of Hamiltonian dynamics is the equal status given to
coordinates and momenta as independent variables, as this allows for a great deal of
freedom in selecting which quantities to designate as coordinates and momenta (the
$q_i$ and $p_i$ are often called generalized coordinates and momenta). Any set of variables which satisfies the canonical equations (2-3) is called a set of canonical variables.
One may transform between different sets of canonical variables; these changes of
variables are called canonical transformations. Note that while the form of the
Hamiltonian depends on how the chosen set of canonical variables is defined, the form of the canonical equations is by definition invariant under canonical transformations...
Hamiltonian systems have a great deal of additional structure. The quantity,
(40) $\oint_\gamma p \cdot dq = \sum_{i=1}^{n} \oint_\gamma p_i\,dq_i,$
known as Poincare’s integral invariant, is independent of time if the evolution of
the closed path γ follows the flow in phase space. The left-hand side of (40) is also
known as the symplectic area. This result can be generalized if we extend our phase
space by adding a dimension for the time t. Let Γ1 be a closed curve in phase space
(at fixed time) and consider the tube of trajectories in the extended phase space
passing through points on Γ1. If Γ2 is another closed curve in phase space enclosing
the same tube of trajectories, then
(41) $\oint_{\Gamma_1} (p \cdot dq - H\,dt) = \oint_{\Gamma_2} (p \cdot dq - H\,dt).$
This result, that the integral $\oint (p \cdot dq - H\,dt)$ takes the same value for any two paths around the same tube of trajectories, is called the Poincaré-Cartan integral theorem. Note that if both paths are taken at fixed time, then (41) simply reduces to (40).
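The invariance (40) can be illustrated numerically in the simplest case, a harmonic oscillator with $m = \omega = 1$, whose phase flow is a rigid rotation of the $(q, p)$ plane. The sketch below is an illustration; the particular loop and evolution time are arbitrary choices. It carries a closed loop of initial conditions along the flow and checks that the symplectic area $\oint p\,dq$ is unchanged.

```python
import numpy as np

# Harmonic oscillator (m = omega = 1): the exact flow rotates the phase plane,
# q(t) = q cos t + p sin t,  p(t) = p cos t - q sin t.
def flow(q, p, t):
    return q * np.cos(t) + p * np.sin(t), p * np.cos(t) - q * np.sin(t)

def loop_integral(q, p):
    """Approximate the symplectic area oint p dq over a closed polygonal loop."""
    dq = np.roll(q, -1) - q
    return np.sum(0.5 * (p + np.roll(p, -1)) * dq)

# A closed loop gamma of initial conditions (an off-center ellipse).
s = np.linspace(0.0, 2 * np.pi, 400, endpoint=False)
q0 = 1.0 + 0.5 * np.cos(s)
p0 = 0.3 + 0.2 * np.sin(s)

area0 = loop_integral(q0, p0)
q1, p1 = flow(q0, p0, t=1.7)   # carry every point of the loop along the flow
area1 = loop_integral(q1, p1)

print(np.isclose(area0, area1))  # True: the symplectic area is preserved
```

For a general Hamiltonian the flow is no longer a rigid rotation, but Liouville's Theorem and (40) guarantee the same invariance; a numerical integrator would simply replace the closed-form flow above.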
Structure of this sort, as well as the presence of additional invariant quantities,
greatly constrains the flow in phase space and one may wonder whether this struc-
ture is compatible with the ergodic hypothesis and the microcanonical ensemble.
The most extreme illustration of the conflict is the special case of integrable Hamiltonian systems. A time-independent Hamiltonian system is said to be integrable if it has n independent global constants of the motion (one of which is the Hamiltonian itself), no two of which have a non-zero Poisson bracket. The existence of n invariants confines the phase trajectories to an n-dimensional subspace
(recall that the entire phase space is 2n-dimensional; this is a significant reduction
of dimension). The independence of these invariants guarantees that none can be expressed as a function of the others. The last condition, that no two of the invariants have a non-zero Poisson bracket, restricts the topology of the manifold to which the trajectories are confined — it must be an n-dimensional torus. A canonical
transformation to what are known as action-angle variables, for which
$I_i = \dfrac{1}{2\pi} \oint_{\gamma_i} p \cdot dq$
provides the canonical momenta and the angle $\theta_i$ around the loop $\gamma_i$ provides the canonical coordinates, simplifies the description immensely: each $I_i$ determines a frequency for uniform motion around the loops defined by the $\gamma_i$, generating trajectories which spiral uniformly around the surface of the n-torus. For most choices of the $I_i$, a single trajectory will fill up the entire torus; this is called quasi-periodic motion. The microcanonical ensemble, for which the trajectories wander ergodically on a $(2n-1)$-dimensional energy surface, captures none of this structure. On one hand, highly structured Hamiltonian systems appear to exist in
Nature, the premier example being our solar system. On the other hand, we have the remarkable success of statistical mechanics (and its underlying hypotheses of ergodicity and equal a priori probabilities) in providing a foundation for thermodynamics and condensed matter physics. This success remains a mystery.