1 Construction of Haar Measure
Definition 1.1. A family G of linear transformations on a linear
topological space X is said to be equicontinuous on a subset K of X if
for every neighborhood V of the origin in X there is a neighborhood U
of the origin such that the following condition holds
if k1, k2 ∈ K and k1 − k2 ∈ U, then G(k1 − k2) ⊆ V
that is T (k1 − k2) ∈ V for all T ∈ G.
Theorem 1.2 (Kakutani). Let K be a compact, convex subset of a
locally convex linear topological space X, and let G be a group of linear
mappings which is equicontinuous on K and such that G(K) ⊆ K.
Then there exists a point p ∈ K such that
T (p) = p ∀T ∈ G
Proof. By Zorn’s lemma, K contains a minimal non-void compact
convex subset K1 such that G(K1) ⊆ K1. If K1 contains just one point
then the proof is complete. If this is not the case, the compact set
K1 −K1 contains some point other than the origin.
Thus, there exists a neighborhood V of the origin such that
V ⊉ K1 − K1.
There is a convex neighborhood V1 of the origin such that αV1 ⊆ V for
|α| ≤ 1.
By the equicontinuity of G on the set K1, there is a neighborhood U1 of
the origin such that if k1, k2 ∈ K1 and k1 − k2 ∈ U1 then
G(k1 − k2) ⊆ V1.
Because each T ∈ G is invertible, T maps open sets to open sets (open
mapping theorem) and T (A ∩B) = TA ∩ TB for any sets A,B.
Since T is linear,
T(convex-hull(A)) = convex-hull(T(A))
for any set A.
Because G is a group, G(GA) = GA for any set A.
Thus
U2 := convex-hull(GU1 ∩ (K1 − K1)) = convex-hull(G(U1 ∩ (K1 − K1))) ⊆ V1
is relatively open in K1 − K1 and satisfies GU2 = U2 ⊉ K1 − K1. By continuity, GU2 = U2. Define
∞ > δ := inf{a : a > 0, aU2 ⊇ K1 − K1} ≥ 1
and U := δU2. For each 0 < ε < 1,
(1 + ε)U ⊇ K1 − K1 ⊈ (1 − ε)U.
The family of relatively open sets {2−1U + k}, k ∈ K1, is a covering of
K1. Let {2−1U + k1, . . . , 2−1U + kn} be a finite sub-covering and let
p = (k1 + · · · + kn)/n. If k is any point in K1, then ki − k ∈ 2−1U for
some 1 ≤ i ≤ n. Since ki − k ∈ (1 + ε)U for all i and all ε > 0, we have
p ∈ (1/n)(2−1U + (n − 1)·(1 + ε)U) + k.
For ε = 1/(4(n−1)), we have p ∈ (1 − 1/(4n))U + k for each k ∈ K1. Let
K2 = K1 ∩ ⋂_{k∈K1} ((1 − 1/(4n))U + k) ≠ ∅.
Because (1 − 1/(4n))U ⊉ K1 − K1, we have K2 ≠ K1. The closed set K2
is clearly convex. Further, since T(aU) ⊆ aU for T ∈ G, we have
T(aU + k) ⊆ aU + Tk for all T ∈ G, k ∈ K1.
Recalling TK1 = K1 for T ∈ G, we find that GK2 ⊆ K2, which
contradicts the minimality of K1.
Theorem 1.3 (Haar Measure). Let G be a compact group. Let C(G) be the space of continuous maps from G to C. Then, there is a unique
linear form
m : C(G) −→ C
having the following properties:
1. m(f) ≥ 0 for f ≥ 0 (m is positive).
2. m(11) = 1 (m is normalized).
3. m(sf) = m(f) where sf is defined as the function
sf(g) = f(s−1g) s, g ∈ G
(m is left invariant).
4. m(fs) = m(f) where fs(g) = f(gs) for s, g ∈ G (m is right
invariant).
Proof. For f ∈ C(G), let Cf denote the convex hull of all left translates
of f . The elements of Cf are finite sums of the form:
g(x) = ∑ ai f(si x)   (finite sum),  with ai > 0 and ∑ ai = 1
Clearly
||g|| = max{|g(x)| : x ∈ G} ≤ ||f ||
Thus all sets Cf (x) = {g(x) : g ∈ Cf} are bounded and relatively
compact in C. Since G is compact, f is uniformly continuous, namely
for all ε > 0, ∃ a neighborhood V = Vε of the identity element e ∈ G such that:
y−1x ∈ V ⇒ |f(x) − f(y)| < ε
Since (s−1y)−1s−1x = y−1x, we also have
|sf(y)− sf(x)| < ε whenever y−1x ∈ V
Since the functions g are convex combinations of functions of the form
sf ,
|g(y)− g(x)| < ε whenever y−1x ∈ V
Thus the set Cf is equicontinuous. By Ascoli's theorem Cf is relatively
compact in C(G). Define the compact convex set Kf = closure of Cf in C(G). The compact group G acts by left translations (isometrically) on C(G) and leaves Cf and hence Kf invariant. By Kakutani's Theorem 1.2,
there is a fixed point g of this action of G in Kf. Such a fixed point
satisfies by definition
sg = g (∀s ∈ G) ⇒ g(s−1) = [sg](e) = g(e) = c (∀s ∈ G)
for some constant c.
By the definition of the set Kf , given any ε > 0 there exists a finite set
{s1, . . . sn} in G and ai > 0 such that
∑_{i=1}^{n} ai = 1 and |c − ∑_{i=1}^{n} ai f(si x)| < ε  (∀x ∈ G)   (1.1)
We first show that there is only one constant function in Kf. Start the
same construction as above, only now using right translations of f (e.g.
we can apply the preceding construction to the opposite group G′ of G,
or the function f′(x) = f(x−1)), obtaining a relatively compact set C′f with
compact convex closure K′f containing a constant function c′. It will be
enough to show c = c′ (all constants c in Kf must be equal to one
chosen constant c′ of K′f and conversely).
There is certainly a finite combination of right translates which is close
to c′, namely
|c′ − ∑ bj f(x tj)| < ε   (for some tj ∈ G, bj > 0 with ∑ bj = 1)
Let us multiply this inequality by ai and put x = si to get
|c′ai − ∑_j ai bj f(si tj)| < ε ai   (1.2)
Summing over i, we obtain
|c′ ∑ ai − ∑_{i,j} ai bj f(si tj)| < ε ∑ ai = ε   (1.3)
Operating symmetrically on Equation (1.1) (multiplying by bj , putting
x = tj and summing over j), we find:
|c − ∑_{i,j} ai bj f(si tj)| < ε   (1.4)
Subtracting (or adding) Equation (1.3) from (1.4) we get |c − c′| < 2ε. Since ε was arbitrary this completes the proof.
From now on the constant c in Kf will be denoted by m(f). It is the
only constant function which can be approximated arbitrarily closely with
convex combinations of left or right translates of f.
The following properties are obvious:
• m(11) = 1 since Kf = {1} if f = 1.
• m(f) ≥ 0 if f ≥ 0.
• m(af) = am(f) for any a ∈ C (since Kaf = Kf ).
• m(sf) = m(f) = m(fs) (by uniqueness)
The proof will be complete if we show that m is additive (hence linear).
Let us take f, g ∈ C(G) and start with Equation (1.1) above with
c = m(f). Further let
h(x) = ∑ ai g(si x)
Since h ∈ Cg, we certainly have Ch ⊆ Cg whence Kh ⊆ Kg. But the set
Kg contains only one constant: m(h) = m(g).
We can write
|m(h) − ∑ bj h(tj x)| < ε
for finitely many suitable tj ∈ G and bj > 0 with ∑ bj = 1. Using the
definition of h and m(h) = m(g), this implies
|m(g) − ∑_{i,j} ai bj g(si tj x)| < ε   (1.5)
However multiplying Equation (1.1) by bj and replacing x by tj x and
summing over j we find
|m(f) − ∑_{i,j} ai bj f(si tj x)| < ε   (1.6)
Adding Equation (1.5) and (1.6), this implies
|m(f) + m(g) − ∑_{i,j} ai bj (f + g)(si tj x)| < 2ε
Thus the constant m(f) +m(g) is in Kf+g. However note that the only
constant in this compact convex set is m(f + g). This completes the
proof.
1.1 Exercises
Exercise 1.4. Let m be the normalized Haar measure of a compact
group G. For f ∈ C(G) or L1(G) show that m(f) = m(f̌) where the
function f̌ is defined by f̌(x) = f(x−1). This equality is
usually written as
∫_G f(x) dx = ∫_G f(x−1) dx
Hint: Observe that f → m(f̌) is a Haar measure on G and use the
uniqueness part of the Theorem on Haar measures, Theorem 1.3.
Before stating the next exercise we need a definition
Definition 1.5 (Semidirect products). Let L be a group and assume
it contains a normal subgroup G and a subgroup H such that GH = L
and G ∩H = {e}. That is, suppose one can select exactly one element
h from each coset of G so that {h} forms a subgroup H. If H is also
normal then L is isomorphic with the direct product G×H. If H fails to
be normal, we can still reconstruct L if we know how the inner
automorphisms ρh behave on G. Namely for xj ∈ G and hj ∈ H (j = 1, 2), we have:
(x1h1)(x2h2) = x1(h1x2h1−1)h1h2 = (x1 ρh1(x2)) h1h2
The construction just given can be cast in an abstract form. Let G and
H be groups and suppose there is a homomorphism h→ τh which
carries H onto a group of automorphisms of G, namely τh ◦ τh′ = τhh′
for h, h′ ∈ H. Let GsH denote the cartesian product of G and H. For
(x, h) and (x′, h′) in GsH, define:
(x, h)(x′, h′) = (x (τh(x′)) , hh′)
Then GsH is a group; it is called a semidirect product of G and H. Its
identity is (e1, e2) where e1 and e2 are the identities of G and H
respectively. The inverse of (x, h) is (τh−1(x−1), h−1). Let
G1 := {(x, e2) : x ∈ G}
and
H1 := {(e1, h) : h ∈ H}
Then G1 is a normal subgroup of GsH and H1 is a subgroup. Since
(e1, h) · (x, e2) · (e1, h)−1 = (τh(x), e2)
the inner automorphism ρ(e1,h) for (e1, h) ∈ H1 reproduces the action τh
on G. Thus every semidirect product is obtained by the process
described in the previous paragraphs.
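To make the abstract multiplication rule concrete, here is a minimal numerical sketch (my own illustration, not part of the notes) of the smallest non-trivial case: G = Z/3 and H = Z/2 acting by inversion, whose semidirect product is a copy of S3. The names tau, mul and inv are ad hoc.

```python
# Sketch: the semidirect product rule (x, h)(x', h') = (x * tau_h(x'), h h')
# for G = Z/3 (additive), H = Z/2, with tau_h(x) = (-1)^h * x mod 3.
from itertools import product

G, H = range(3), range(2)
tau = lambda h, x: x % 3 if h == 0 else (-x) % 3      # homomorphism H -> Aut(G)

def mul(a, b):
    (x, h), (xp, hp) = a, b
    return ((x + tau(h, xp)) % 3, (h + hp) % 2)

elements = list(product(G, H))
# group axioms: associativity, identity (0, 0), inverses (tau_{h^-1}(x^-1), h^-1)
assert all(mul(mul(a, b), c) == mul(a, mul(b, c))
           for a in elements for b in elements for c in elements)
inv = lambda a: (tau((-a[1]) % 2, (-a[0]) % 3), (-a[1]) % 2)
assert all(mul(a, inv(a)) == (0, 0) for a in elements)
print(len(elements), "elements; the group is (isomorphic to) S_3")
```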
Exercise 1.6. Let G and H be compact groups and let GsH be a
semidirect product of G and H. Suppose also that the mapping
(x, h) → τh(x) is a continuous mapping of G×H onto G. In particular,
each τh is a homeomorphism of G onto itself. Show that the semidirect
product GsH with the product topology is a compact group. What is
the Haar measure on GsH in terms of the Haar measures on G and H?
Exercise 1.7. Let On(R) be the group of n × n orthogonal matrices.
Suppose that Zij, 1 ≤ i, j ≤ n, are i.i.d. standard normal random
variables. Let U be the random orthogonal matrix with rows obtained by
applying the Gram-Schmidt process to the vectors (Z11, . . . , Z1n), . . . , (Zn1, . . . , Znn). Show that U is distributed according to the Haar
measure on On(R).
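A quick numerical companion to Exercise 1.7 (my own sketch, assuming numpy is available): Gram-Schmidt on the rows of a standard Gaussian matrix is the same as a QR factorisation with positive diagonal of R, so the matrix below should be Haar distributed on On(R).

```python
import numpy as np

def haar_orthogonal(n, rng):
    """Orthogonalise the rows of an i.i.d. N(0,1) matrix (Gram-Schmidt via QR)."""
    Z = rng.standard_normal((n, n))
    Q, R = np.linalg.qr(Z.T)            # columns of Z.T are the rows of Z
    Q = Q * np.sign(np.diag(R))         # fix signs so R has positive diagonal
    return Q.T                          # rows are the orthonormalised vectors

rng = np.random.default_rng(0)
U = haar_orthogonal(4, rng)
print(np.allclose(U @ U.T, np.eye(4)))  # True: U is orthogonal
# Haar invariance means V @ U has the same law as U for any fixed V in O_n(R).
```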
2 Representations, General Constructions
For E , a complex Banach space, let Gl(E) denote the group of
continuous isomorphisms of E onto itself. A representation π of a
compact group G in E is a homomorphism π:
π : G −→ Gl(E)
for which all the maps G→ E defined as s→ π(s)v (v ∈ E) are
continuous. The space E = Eπ in which the representation takes place
is called the representation space of π. A representation π of a group
G in a vector space E canonically defines an action (also denoted by π)
π : G× E −→ E
(s, v) −→ π(s)v
The definition requires this action to be separately continuous. The
action is then automatically globally continuous.
We say that a representation π is unitary when E = H is a Hilbert
space and each operator π(s) (s ∈ G) is a unitary operator (i.e. each
π(s) is isometric and surjective). Thus π is unitary when E =H is a
Hilbert space and
π(s)∗ = π(s)−1 = π(s−1) (s ∈ G)
The representation π of G in E is said to be irreducible when E and
{0} are distinct and are the only two closed invariant subspaces under all
operators π(s) (s ∈ G) (topological irreducibility).
Two representations π and π′ of the same group G are called
equivalent when the two spaces over which they act are G -isomorphic,
namely there exists a continuous isomorphism A : E → E′ of their
respective spaces with
A(π(s)v) = π′(s)Av (s ∈ G, v ∈ E)
More generally, continuous linear operators A : E → E′ satisfying all
commutation relations A(π(s)) = π′(s)A for all s ∈ G are called
intertwining operators or G -morphisms (from π to π′) and their set is
a vector space denoted either by
HomG(E,E′) or Hom(π,π′)
Proposition 2.8. Let π be a unitary representation of G in the Hilbert
space H . If H1 is an invariant subspace of H (with respect to all
operators π(s), s ∈ G), then the orthogonal space H2 = H⊥1 of H1 in
H is also invariant.
Proof. We need to show that if v ∈ H, v ⊥ H1 then π(s)v is also
orthogonal to H1 for all s in G . For any x ∈ H1,
〈x,π(s)v〉 = 〈π(s)∗x, v〉 = 〈π(s−1)x, v〉 = 0
since by assumption π(s−1)x also lies in H1.
Proposition 2.9. Let π be a representation of a compact group G in a
Hilbert space H . Then there exists a positive definite hermitian form ϕ
which is invariant under the G -action, and which defines the same
topological structure on H .
Proof. By continuity of the mappings s→ π(s)v, the mappings
s −→ 〈π(s)v,π(s)w〉 (v, w ∈ H)
are also continuous (by continuity of scalar product in H ×H). We can
thus define
ϕ(v, w) = ∫_G 〈π(s)v, π(s)w〉 ds
using the Haar integral.
It is clear that ϕ is hermitian and positive. Let us show that it is
non-degenerate and defines the same topology on H . Since G is
compact, π(G) is also compact in Gl(H) (with the strong topology). In
particular, π(G) is pointwise bounded and thus uniformly bounded (uniform
boundedness principle ≡ Banach-Steinhaus theorem). Thus, there exists
a positive constant M > 0 with
||π(s)v|| ≤M ||v|| (∀s ∈ G, v ∈ H)
This implies
||v|| = ||π(s−1)π(s)v|| ≤M ||π(s)v|| ≤M2||v||
Thus
M−1||v|| ≤ ||π(s)v|| ≤M ||v||
Squaring and integrating over G, we find
M−2||v||2 ≤ ϕ(v, v) ≤ M2||v||2
Thus ϕ(v, v) = 0 implies ||v|| = 0 and v = 0. Thus ϕ and || · ||2 induce
equivalent topologies (equivalent norms) on H. Invariance of ϕ comes
from the invariance of the Haar measure: with f(s) := 〈π(s)v, π(s)w〉,
ϕ(π(t)v, π(t)w) = ∫_G 〈π(st)v, π(st)w〉 ds = ∫_G f(st) ds = ∫_G ft(s) ds = ∫_G f(s) ds = ϕ(v, w)
This shows that π is ϕ-unitary as desired.
These propositions imply any representation of a compact group in a
Hilbert space is equivalent to a unitary one, and any finite dimensional
representation (the dimension of a representation is the dimension of its
rep. space) is completely reducible (direct sum of irreducible ones.)
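For a finite group the Haar integral is the plain average over the group, so the averaging argument of Proposition 2.9 can be checked numerically. A minimal sketch (mine, not from the notes), starting from a non-unitary two-dimensional representation of Z/2:

```python
import numpy as np

S = np.array([[1.0, 1.0], [0.0, -1.0]])       # S @ S = I: a (non-unitary) rep of Z/2
reps = [np.eye(2), S]

# averaged Gram matrix: phi(v, w) = v^T M w with M = (1/|G|) sum_s pi(s)^T pi(s)
M = sum(R.T @ R for R in reps) / len(reps)

for R in reps:                                 # invariance: pi(s)^T M pi(s) = M
    assert np.allclose(R.T @ M @ R, M)
print(np.linalg.eigvalsh(M))                   # strictly positive: phi is definite
```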
Definition 2.10 (left translations). In any space of functions on G ,
define the left translations by
[l(s)f ](x) = f(s−1x)
(If we do not want to identify elements of Lp(G) with functions or
classes of functions, we can simply extend translations from C(G) to
Lp(G) by continuity).
Thus we have
l(s) ◦ l(t) = l(st)
and we get homomorphisms
l : G→ Gl(E), s→ l(s)
with any E = Lp(G), 1 ≤ p <∞.
Exercise 2.11. Check that these homomorphisms are continuous in the
representation sense.
The above were the left regular representations of G. The right
regular representations of G in the Banach space Lp(G) are defined
similarly with
[r(s)f ](x) = f(xs) (f ∈ Lp(G))
With this definition, one has r(s) ◦ r(t) = r(st).
One can also consider the biregular representation l × r of G×G in Lp(G) defined as
[l × r(s, t)f ](x) = f(s−1xt) (f ∈ Lp(G))
and its restriction to the diagonal G→ G×G, s→ (s, s) which is the
adjoint representation of G . It is defined as
[Ad(s)f ](x) = f(s−1xs) (f ∈ Lp(G))
The regular representations are faithful, i.e. π(s) = 11 ⇔ s = e.
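A concrete finite analogue (my own sketch) of the left regular representation: for G = Z/n, the operators l(s) are permutation matrices on the space of functions G → C, and the homomorphism property and faithfulness can be checked directly.

```python
import numpy as np

n = 5
def l(s):
    """Matrix of l(s), [l(s)f](x) = f(s^{-1} x) = f(x - s), in the delta-function basis."""
    L = np.zeros((n, n))
    for x in range(n):
        L[x, (x - s) % n] = 1.0
    return L

assert all(np.allclose(l(s) @ l(t), l((s + t) % n)) for s in range(n) for t in range(n))
assert all(np.allclose(l(s), np.eye(n)) == (s == 0) for s in range(n))   # faithful
```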
Let π : G→ Gl(E) and π′ : G′ → Gl(E′) be two representations. We
can define the external direct sum representation of G×G′ in E ⊕ E′
by
π ⊕ π′(s, s′) = π(s)⊕ π′(s′) (s ∈ G, s′ ∈ G′)
When G = G′, we can restrict this external direct sum to the diagonal
G of G×G, obtaining the usual direct sum of π and π′
π ⊕ π′ : G → Gl(E ⊕ E′)
s → π(s)⊕ π′(s)
The external tensor product π ⊗ π′ as a representation of G×G′ in
E ⊗ E′ is defined as
π ⊗ π′(s, s′) = π(s)⊗ π′(s′) (s ∈ G, s′ ∈ G′)
We assume the two spaces E ,E′ are finite dimensional, thus this
algebraic tensor product is complete; in general some completion has to
be devised.
The usual tensor product of two representations of the same group
G is the restriction to the diagonal of the external tensor product
(G = G′) and is given by
π ⊗ π′(s) = π(s)⊗ π′(s) (s ∈ G)
For a given finite dimensional representation π : G → Gl(E), define the
contragredient representation π̌. This representation acts in the dual
E′ of E (namely the space of linear forms on E) and
π̌(s) = tπ(s−1)   (s ∈ G)
Since transposition reverses the order of composition of mappings,
namely t(AB) = tB tA, it is necessary to reverse the operations by
taking the inverse in the group. The above construction allows us to
conclude that π̌(st) = π̌(s)π̌(t) as is required for a representation.
Conjugate representation π̄: When E = H is a Hilbert space the
conjugate π̄ of π is a representation acting on the conjugate H̄ of H.
Recall that H̄ has the same underlying additive group as H, but with
the scalar multiplication in H̄ twisted by complex conjugation, namely
the external operation of scalars is given by
(a, v) −→ a · v = āv   (we use a dot in H̄)
The inner product 〈·, ·〉− of H̄ is defined as
〈v, w〉− = 〈v, w〉‾ = 〈w, v〉
This suggests that an element v ∈ H is written as v̄ when we consider it
as an element of the dual Hilbert space H̄. With this notation we have:
(av)‾ = ā · v̄ (a ∈ C) and 〈v̄, w̄〉− = 〈w, v〉
The identity map H → H̄, v → v̄ is an anti-isomorphism. The
conjugate of π is defined as π̄(s)v̄ = (π(s)v)‾ in H̄. Since the (complex
vector) subspaces of H and H̄ are the same by definition, π and π̄ are
reducible or irreducible simultaneously. However it is important to
distinguish these two representations (in particular they are not always
equivalent). Any orthonormal basis (ei) of H is also an orthonormal
basis of H̄, but a decomposition v = ∑ vi ei in H gives rise to the
decomposition
v̄ = ∑ v̄i ēi   (complex conjugate components in H̄)
Thus the matrix representations associated with π and π̄ are complex
conjugate to one another.
Exercise 2.12. Show that when π is unitary and finite dimensional, the
contragredient π̌ and the conjugate π̄ of π are equivalent.
2.1 Exercises
Exercise 2.13. Show that the left and right representations l and r of a
group G (in any Lp(G) space) are equivalent.
Exercise 2.14. If π and π′ are two representations of the same group
G (acting in respective Hilbert spaces H and H ′), show that the matrix
coefficients of π ⊗ π′ (with respect to bases (ei) in H and (e′j) in H ′
and ei ⊗ e′j in H ⊗ H′) are products of matrix coefficients of π and π′
(Kronecker product of matrices).
Exercise 2.15. Let 11n denote the identity representation of the group
G in dimension n (the space of this identity representation is thus Cn
and 11n(s) = idCn for all s ∈ G). Show that for any representation π of
G,
π ⊗ 11n is equivalent to π ⊕ π ⊕ · · · ⊕ π (n terms)
Exercise 2.16. (Schur’s lemma) Let k be an algebraically closed field,
V a finite dimensional vector space over k and Φ any irreducible set of
operators in V (the only invariant subspaces, relatively to all operators
belonging to Φ are V and {0}). Then, if an operator A commutes with
all operators in Φ, A is a multiple of the identity operator (i.e. A is a
scalar operator).
Hint: Take an eigenvalue a in the algebraically closed field k and
consider A − a · I, which still commutes with all operators of Φ. Show
that Ker(A − a · I) (≠ {0}) is an invariant subspace.
3 Finite dimensional representations of
compact groups (Peter-Weyl theorem)
Theorem 3.17 (Peter-Weyl). Let G be a compact group. For any
s ≠ e in G, there exists a finite dimensional irreducible representation
π of G such that π(s) ≠ 11.
Proof. We start with two Lemmas.
Lemma 3.18. Let G be a compact group, k : G×G→ C a continuous
function and K : L2(G) → C(G) the operator with kernel k, namely:
(Kf)(x) = ∫_G k(x, y) f(y) dy
Then K is a compact operator. Moreover if k(x, y) = k(y, x)‾ identically
on G×G, K is Hermitian as an operator on L2(G).
Lemma 3.19. Let K be a compact Hermitian operator (in some Hilbert
space H ). Then the spectrum S of K consists of eigenvalues. Each
eigenspace Hλ with respect to a non-zero eigenvalue λ ∈ S is finite
dimensional and the number of eigenvalues outside any neighborhood of
0 is finite. Moreover, S ⊆ R and
||K|| = sup{|λ| : λ ∈ S}
and the eigenspaces associated to distinct eigenvalues are orthogonal, i.e
Hλ ⊥ Hµ for λ 6= µ in S
Proof of Theorem 3.17: Assume that s 6= e in G and take an open
symmetric neighborhood V = V −1 of e in G such that s /∈ V 2. There
exists a positive continuous function f such that
f(e) > 0 , f(x) = f(x−1) = f̄(x) , Supp(f) ⊆ V
where Supp(f) denotes the support of f, namely the complement of the
largest open set on which f vanishes. Consider the function ϕ = f ∗ f defined as
ϕ(x) = ∫_G f(y) f(y−1x) dy
The support of ϕ is contained in V 2 and
ϕ(s) = 0 (s ∉ V 2) , ϕ(e) = ||f ||22 > 0.
We also see that l(s)ϕ ≠ ϕ. But the operator K with kernel
k(x, y) = f(y−1x) is compact (see Lemma 3.18) and the convergence in
quadratic mean of
f = f0 + ∑ fi ,  fi ∈ Ker(K − λi) = Hi   (λi ∈ Spec(K))
implies that
ϕ = Kf = ∑ Kfi = ∑ λi fi
where
fi = (1/λi) Kfi ∈ Im(K) ⊆ C(G),
with uniform convergence holding in the series above. Since
l(s)ϕ ≠ ϕ, we must have l(s)fi ≠ fi for at least one index i. However
the definition of the kernel k shows that
k(sx, sy) = k(x, y) = f(y−1x)   (s, x, y ∈ G)
The consequence of these identities is the translation invariance of all
the eigenspaces Hi of K. The left regular representation restricted to a
suitable finite dimensional subspace Hi (for any i with l(s)fi ≠ fi) will
furnish an example of a finite dimensional representation π with
π(s) ≠ 11.
The corollaries of this theorem are numerous and important.
Corollary 3.20. A compact group is commutative if and only if all its
finite dimensional irreducible representations have dimension 1.
Proof. Exercise.
Corollary 3.21 (Peter-Weyl). Any continuous function on a compact
group is a uniform limit of (finite) linear combinations of coefficients of
irreducible representations.
Proof. Let π be a (finite dimensional) irreducible representation of the
compact group G and take a basis in the representation space of π in
order to identify π with a homomorphism π : G → Gln(C); the coefficients of π
are the continuous functions on G defined as
cij : g −→ cij(g) = 〈ei,π(g)ej〉
More generally if u and v are elements of H, we can define the
(function) coefficient cuv of π on G by
g −→ cuv (g) = 〈u,π(g)v〉
These functions are obviously finite linear combinations of the previously
defined matrix coefficients cij. Introduce the subspace V(π) of C(G) spanned by the cij, or equivalently by all cuv for u, v ∈ Hπ. Observe that
the subspaces of C(G) attached in this way to two equivalent
representations π and π′ coincide, namely V(π) = V(π′). Thus we can
form the algebraic sum (a priori this algebraic sum is not a direct sum)
AG = ⊕_π V(π) ⊆ C(G)
where the summation index π runs over all (classes of) finite
dimensional irreducible representations of G . The corollary can be
restated in the following form:
AG is a dense subspace of the Banach space C(G) in the
uniform norm
But this algebraic sum AG is a subalgebra of C(G) (the product of two
continuous functions being the usual pointwise product). The product of
the coefficients
cuv of π and γst of σ
is a coefficient of the representation π ⊗ σ (the coefficient of this
representation with respect to the two vectors u⊗ s and v ⊗ t). Taking
π and σ to be finite dimensional representations of G , π ⊗ σ will be
finite dimensional, hence completely reducible and all its coefficients (in
particular the product of cuv and γst) are finite linear combinations of
coefficients of (finite dimensional) irreducible representations of G.
This subalgebra AG of C(G) contains the constants, is stable under
complex conjugation (because π̄ is irreducible precisely when π is
irreducible) and separates points of G by the main Theorem 3.17. By
the Stone-Weierstrass theorem the proof is complete.
3.1 Exercises
Exercise 3.22. Let G be a compact totally disconnected group. Show
that AG is the algebra of all locally constant functions on G . (Observe
that a locally constant function on G is uniformly locally constant, hence
can be identified with a function on a quotient G/H where H is some
open subgroup of G . Conversely any finite dimensional representation
of G must be trivial on an open subgroup H of G . )
Exercise 3.23. Let G be any compact group. Show that AG consists of
the continuous functions f on G for which the left and right translates
of f generate a finite dimensional subspace of C(G). In particular if G1
and G2 are two compact groups, any continuous homomorphism
h : G1 → G2 has a transpose th : A2 → A1 where Ai = AGi , defined by
th(f) = f ◦ h. A priori this transpose is a linear mapping
th : C(G2)→ C(G1).
Exercise 3.24. Let G = Un(C) with its canonical representation π in
V = Cn. Since π is unitary, we can identify π̄ with the contragredient
of π: it acts in the dual V∗ of V.
(a) Let A^p_q denote the space of linear combinations of coefficients of the
representation
π^p_q = π̄⊗p ⊗ π⊗q  in  (V∗)⊗p ⊗ V⊗q = T^p_q(V)
Prove that the sum of the subspaces A^p_q of C(G) is an algebra A (show
that A^p_q · A^r_s ⊆ A^{p+r}_{q+s}), stable under conjugation (show that
the complex conjugate of A^p_q is A^q_p), which separates the points of G.
Using the Stone-Weierstrass theorem, conclude that A is dense in C(G).
(b) Show that A = AG. (Use part (a) to prove that any irreducible
representation of G appears as a subrepresentation of some π^p_q, or in
other words can be realized on a space of mixed tensors.)
Exercise 3.25. Let G be a closed subgroup of Un(C). Using the fact
that any finite dimensional representation of G appears in the restriction
of some finite dimensional representation of Un(C) (this is a
consequence of the theory of induced representations), show that G is a
real algebraic subvariety of Un(C). (The transpose of the embedding
G ↪→ Un(C) is the operation of restriction on polynomial functions and
is surjective. Hence AG is a quotient of the polynomial algebra A of
Un(C). By the exercise 3.24, A is generated by the coordinate functions
Un(C) −→ C , x = (xij) 7−→ xij
and their conjugates. )
4 Decomposition of the regular
representation
Lemma 4.1. Let V ⊆ L2(G) be a finite dimensional subspace which is
invariant under the right regular representation of G. Then V consists of
(classes of) continuous functions and each f ∈ V can be written as
f(x) = Tr(Aπ(x)) for some A ∈ End(V )
Here π denotes the restriction of the right regular representation to V .
Proof. Take an orthonormal basis (χi) of V and recall the coefficients cij of π defined as
π(x)χi = ∑_j cji(x) χj ,  x ∈ G
For f = ∑_i ai χi, we have
π(x)f = ∑_i ai π(x)χi = ∑_{i,j} ai cji(x) χj
Hence
f(x) = [r(x)f](e) = ∑_{i,j} cji(x) aij   (with aij = ai χj(e))
Thus,
f(x) = Tr(Aπ(x))
as claimed.
Let (π, V ) be any finite dimensional representation of the compact
group G. For any endomorphism A ∈ End(V), we define the
corresponding coefficient cA of π by cA(x) = Tr(A · π(x)). The right
translates of these coefficients are easily identified as
[r(s)cA](x) = cA(xs) = Tr(Aπ(x)π(s))
= Tr(π(s) ·Aπ(x)) = Tr(Bπ(x)) = cB(x)
where B = π(s) ·A.
Consider the representation of G in End(V ) defined by
lπ(s)A = π(s) ·A (s ∈ G,A ∈ End(V ))
The above computations show that A→ cA is a G-morphism
c : End(V ) −→ C(G) ⊆ L2(G)
(intertwining lπ and r.)
Suppose now that (π, V ) is an irreducible finite dimensional
representation of the compact group G.
Write L2(G,π) for the image of End(V ) under the map c. Note that
the vector space L2(G,π) only depends on the equivalence class of π.
The representation (lπ ,End(V )) is equivalent to π ⊕ · · · ⊕ π (n times,
where n = dim(V )) and L2(G,π) is r-invariant, so the restriction of r
to L2(G,π) is equivalent to π ⊕ · · · ⊕ π (m times for some m ≤ n).
Thus, L2(G,π) has dimension mn ≤ n2.
If V ′ is a r-invariant subspace of L2(G) such that the restriction of r to
V ′ is equivalent to π, then V ′ is a subspace of L2(G,π) by Lemma 4.1.
Hence, L2(G,π) is the sum of all subrepresentations of (L2(G), r) which
are equivalent to π.
Definition 4.2. Let π be a finite dimensional irreducible representation
of a compact group G. The space L2(G,π) consisting of the sum of all
subspaces of the right regular representation which are equivalent to π is
called the isotypical component of π in L2(G).
Note that a function f ∈ L2(G) belongs to L2(G,π) precisely when the
right translates of f generate an invariant subspace (of the right regular
representation) equivalent to a finite multiple of π (that is, a finite
direct sum of subrepresentations equivalent to π).
We shall now prove that the dimension of an isotypical component
L2(G,π) is exactly (dim π)2.
Recall the G-morphism c : End(V) → L2(G) , A 7→ cA := Tr(Aπ). The
fact that cA ≠ 0 for A ≠ 0 will be deduced from a computation of the
quadratic norm of these coefficient functions. It is easier to start with
the case of rank ≤ 1 linear mappings. We use the isomorphism
V̌ ⊗ V −→ End(V)   (V̌ = dual of V)
defined as follows: If u ∈ V̌ and v ∈ V, the operator corresponding to
u ⊗ v is
u ⊗ v : x → u(x)v = 〈u, x〉v
The image of u ⊗ v consists of multiples of v, and u ⊗ v has rank 1
when u and v are non-zero (quite generally, decomposable tensors
correspond to operators of rank ≤ 1). The coefficient cA with respect
to the operator A = u ⊗ v coincides with the previously defined
coefficient
cuv(x) = 〈u,π(x)v〉 = cu⊗v(x)
Lemma 4.3. Let π and σ be two representations of a compact group G
and A : Vπ → Vσ be a linear mapping. Then
A♮ = ∫_G σ(g) A π(g)−1 dg
is a G-morphism from Vπ to Vσ, namely A♮ ∈ HomG(Vπ, Vσ).
Proof. We have
A♮ π(s) = ∫ σ(g) A π(g)−1 π(s) dg = ∫ σ(g) A π(s−1g)−1 dg
and replacing g by sg (i.e. s−1g by g)
A♮ π(s) = ∫ σ(sg) A π(g)−1 dg = σ(s) A♮
Thus the averaging operation (given by the Haar integral) of Lemma 4.3
leads to a projector
♮ : Hom(Vπ, Vσ) → HomG(Vπ, Vσ) ,  A → A♮
In particular when π and σ are disjoint, i.e. HomG(Vπ, Vσ) = 0, we
must have A♮ = 0. This is certainly the case when π and σ are
non-equivalent irreducible representations (Schur's lemma). Another
case of special interest is π = σ, finite dimensional and irreducible.
Schur's lemma gives HomG(Vπ, Vπ) = C id and thus A♮ = λA id is a scalar
operator.
Proposition 4.4. If π is a finite dimensional irreducible representation
of the compact group G in V , the projector
End(V) → EndG(V) = C id ,  A → A♮ = λA id,
is given explicitly by the following formula:
A♮ = ∫_G π(g) A π(g)−1 dg = (Tr(A)/dim V) idV
Proof. Since we know a priori that the operator A♮ is a scalar operator
λA id, we can determine the value of the scalar by taking traces in the
defining equalities
λA Tr(idV) = Tr(∫_G π(g) A π(g)−1 dg) = ∫_G Tr(π(g) A π(g)−1) dg = ∫_G Tr(A) dg = Tr(A)
Hence λA = Tr(A)/dim V.
Theorem 4.5 (Schur's orthogonality relations). Let G be a compact
group and π, σ be two finite dimensional irreducible representations of
G. Assume that π and σ are unitary. Then
(a) If π and σ are non-equivalent, L2(G,π) and L2(G,σ) are
orthogonal in L2(G).
(b) If π and σ are equivalent, L2(G,π) = L2(G,σ) and the inner
product of two coefficients of this space is given by
〈cuv , cxy〉 = ∫_G 〈u,π(g)v〉〈x,π(g)y〉 dg = 〈u, x〉〈v, y〉/dim V
(c) More generally in the case π = σ, the inner product of general
coefficients is given by
〈cA, cB〉 = ∫_G Tr(Aπ(g)) Tr(Bπ(g)) dg = Tr(A∗B)/dim V
Proof. (a) follows from Lemma 4.3 and (b) follows similarly from
Proposition 4.4. It will be enough to show how (b) is derived. For this
purpose, we consider the particular operators v ⊗ y (v ∈ Vπ , y ∈ Vπ)
and apply the result of the proposition:
∫_G π(g)(v ⊗ y)π(g)−1 dg = (Tr(v ⊗ y)/dim V) idV = (〈v, y〉/dim V) idV
Let us apply this operator to the vector u and take the inner product
with the vector x:
〈x, ∫_G π(g)(v ⊗ y)π(g)−1 u dg〉 = (〈v, y〉/dim V) 〈x, u〉 = 〈u, x〉〈v, y〉/dim V
But we have
π(g)(v ⊗ y)π(g)−1 u = 〈v,π(g−1)u〉 π(g)y = 〈π(g)v, u〉π(g)y = 〈u,π(g)v〉π(g)y
hence
〈x, ∫_G π(g)(v ⊗ y)π(g)−1 u dg〉 = ∫_G 〈u,π(g)v〉〈x,π(g)y〉 dg = 〈cuv , cxy〉
as expected. Finally (c) follows from (b) by linearity since the operators
v ⊗ y (of rank ≤ 1) generate End(V).
In particular we see that if 0 ≠ A ∈ End(V),
||cA||22 = Tr(A∗A)/dim V ≠ 0
and the mapping c : End(V) → L2(G,π) is one-to-one and onto. The
dimension of this isotypical component is thus (dim V)2.
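For a finite group the orthogonality relations can be verified by summing over the group instead of integrating. A small sketch of mine, using the 2-dimensional irreducible (orthogonal, hence unitary) representation of S3 realised by dihedral matrices:

```python
import numpy as np
from itertools import product

c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
r = np.array([[c, -s], [s, c]])                    # rotation of order 3
f = np.array([[1.0, 0.0], [0.0, -1.0]])            # reflection of order 2
S3 = [np.eye(2), r, r @ r, f, r @ f, r @ r @ f]    # the six group elements

dim = 2
for i, j, k, l in product(range(2), repeat=4):
    val = sum(np.conj(g[i, j]) * g[k, l] for g in S3) / len(S3)
    expected = (1.0 / dim) if (i == k and j == l) else 0.0
    assert np.isclose(val, expected)
print("Schur orthogonality verified for the 2-dimensional irrep of S_3")
```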
Corollary 4.6. The Hilbert space L2(G) is the Hilbert sum of all
isotypical components
L2(G) = ⊕_π L2(G,π)
The summation index π runs over equivalence classes of finite
dimensional irreducible representations of the compact group G.
Proof. We have already seen that the isotypical subspaces L2(G,π) are
mutually orthogonal to each other. Thus our corollary will be proved if
we show that the algebraic sum
AG = ⊕_π L2(G,π) ⊆ C(G)
is dense in the Hilbert space L2(G).
But AG consists of coefficients of finite dimensional representations of G
(we have proved that all finite dimensional representations are
completely reducible), and the Peter-Weyl theorem 3.17 has shown that
AG is dense in C(G) for the uniform norm. A fortiori AG will be dense in
L2(G) for the quadratic norm.
Corollary 4.7. Any (continuous, topologically) irreducible representation
of a compact group G in a Banach space is finite dimensional.
Proof. Let σ : G→ Gl(E) be such a representation and let E′ denote
the (topological) dual of E, namely E′ is the Banach space of
continuous linear forms on E. By the Hahn-Banach theorem, for each
0 6= x ∈ E, there is a continuous linear form x′ ∈ E′ with x′(x) 6= 0. For
u ∈ E′ and v ∈ E we can consider the corresponding coefficient of σ
cuv ∈ C(G) : g → cuv (g) = 〈u,σ(g)v〉
Letting v vary in E, we get a linear mapping
Q : E → C(G) ⊆ L2(G) , v → cuv
Since G is compact and the mappings g → σ(g)v are continuous (by the
definition of continuous representations), the sets σ(G)v are compact,
hence bounded in E (for each v ∈ E). By the uniform boundedness
principle (Banach-Steinhaus theorem), the set σ(G) of operators in E is
equicontinuous and bounded
sup_{g∈G} ||σ(g)|| = M < ∞
Hence
||Qv|| = sup_{g∈G} |cuv(g)| = sup_{g∈G} |〈u,σ(g)v〉| ≤ ||u||E′ · sup_{g∈G} ||σ(g)v||E ≤ M ||u||E′ ||v||E
This proves that Q is continuous from E into C(G) (equipped with the
uniform norm); its kernel is a proper and closed subspace F ≠ E if
u ≠ 0 (in this case u(v) ≠ 0 for some v ∈ E and thus
cuv(e) = 〈u, v〉 = u(v) ≠ 0).
Take v ∈ E with Q(v) ≠ 0, and apply the orthogonal Hilbert sum
decomposition of the preceding corollary to Q(v):
∑_π Pπ(Qv) = Qv ≠ 0
with
Pπ = orthogonal projector from L2(G) onto L2(G,π)
This implies that there is a π (finite dimensional irreducible
representation of G) with PπQv ≠ 0. For this π, we consider the
composite
E —Q→ L2(G) —Pπ→ L2(G,π)
and its kernel, which is a proper closed subspace Fπ ≠ E. But Q is a
G-morphism (intertwining σ and the right regular representation):
cuσ(x)v(g) = 〈u,σ(g)σ(x)v〉 = cuv(gx)
which implies that Q(σ(x)v) = r(x)Q(v). Since Pπ is also a
G-morphism, the kernel Fπ of the composite PπQ must be an invariant
subspace of E. However E is irreducible by assumption so that
Fπ = {0}, and the composite PπQ is one-to-one (into):
dim E ≤ dim L2(G,π) = (dim V )2
Definition 4.8. The dual Ĝ of a compact group G is the set of
equivalence classes of irreducible representations of G.
Let π be an irreducible representation of the compact group G and
ϖ = [π] its equivalence class (ϖ ∈ Ĝ). We say that π is a model of ϖ
in this case, i.e. when π ∈ ϖ. The dimension of ϖ is defined as
dim(ϖ) = dim(Vπ), independently of the model chosen. Similarly
the isotypical component of ϖ (in the right regular representation) is
defined as L2(G,ϖ) = L2(G,π), independently of the model π
chosen for ϖ. By the finiteness of the dimension of the irreducible
representations of the compact group G, namely Corollaries 4.6 and 4.7,
L2(G) = ⊕_{ϖ∈Ĝ} L2(G,ϖ)
Instead of π ∈ ϖ ∈ Ĝ we shall write more simply π ∈ Ĝ.
Proposition 4.9. Let G be a compact group. Then the following
properties are equivalent:
(i) The dual Ĝ is countable.
(ii) L2(G) is separable.
(iii) G is metrizable.
Proof. The equivalence between (i) and (ii) is obvious since L2(G) is the
Hilbert sum of the finite dimensional isotypical components L2(G,ϖ)
over the index set Ĝ. Moreover G can always be embedded in a product
∏_{Ĝ} U(π) with U(π) ≅ Udim(π)(C) (a metrizable group).
Since any countable product of metrizable topological spaces is
metrizable, we see that (i) ⇒ (iii). Finally, the implication (iii) ⇒ (ii) is a
classical application of the Stone-Weierstrass theorem.
4.1 Exercises
Exercise 4.10. Let V be a finite dimensional vector space and V̌ be its
dual, and for u ∈ V̌, v ∈ V denote by u ⊗ v the operator x → u(x)v as
defined in the beginning of this Section. Show
(a) Tr(u ⊗ v) = 〈u, v〉 = u(v) (intrinsic definition of Tr),
(b) (u ⊗ v) · (x ⊗ y) = 〈u, y〉 x ⊗ v,
(c) tAu ⊗ Bv = B · (u ⊗ v) · A for A, B ∈ End(V).
Moreover, identifying the dual of V̌ ⊗ V with V ⊗ V̌ in the obvious
canonical way, show
(d) t(u ⊗ v) = v ⊗ u
Assume now that V is a representation space for a group G. Using
(a) and (c) prove
(e) cu⊗v = cuv (i.e. Tr(π(x) · u ⊗ v) = 〈u,π(x)v〉).
Exercise 4.11. Let G be a compact group and (π, V ) be a finite
dimensional representation of G. We denote by V G the subspace of
invariants of V :
V G = {v ∈ V : π(g)v = v ∀g ∈ G}
(a) Check that the operator P = ∫_G π(g) dg is a projector from V onto
V G. If π is unitary, P is the orthogonal projector onto this subspace.
(b) For two finite dimensional representations π and σ of G, consider
Hom(Vπ , Vσ) as a representation space of G via the action
g ·A = π(g) ·Aσ(g)−1
Observe that
Hom(Vπ , Vσ)G = HomG(Vπ , Vσ)
and deduce a proof of the Lemma 4.3 from this observation.
(c) Using the G-isomorphism
V̌π ⊗ Vσ −→ Hom(Vπ, Vσ)
with V̌π ⊗ Vσ being equipped with the representation π̌ ⊗ σ, conclude
that the projector ♮ : V̌π ⊗ Vσ → (V̌π ⊗ Vσ)G is given by
∫_G (π̌ ⊗ σ)(g) dg
5 Convolution, Plancherel formula and
Fourier Inversion
Definition 5.1 (Convolution). On a compact group G, the convolution
of two continuous functions f and g is defined as
f ∗ g(x) = ∫_G f(y) g(y−1x) dy
Defining f∗(x) = f̄(x−1), we can also write
f ∗ g(x) = ∫_G f(xy) g(y−1) dy = ∫_G f(xy−1) g(y) dy = ∫_G f∗(yx−1)‾ g(y) dy = 〈r(x−1)f∗, g〉
The Cauchy-Schwarz inequality gives:
|f ∗ g(x)| ≤ ||f∗||2 ||g||2 = ||f ||2 ||g||2
whence
||f ∗ g||∞ = sup_{x∈G} |f ∗ g(x)| ≤ ||f ||2 ||g||2
Consequently the convolution product can be extended by continuity
from C(G) to L2(G) and by definition
∗ : L2(G) × L2(G) → C(G) ,  (f, g) → f ∗ g
is a continuous bilinear mapping still satisfying the above inequality. On
the other hand the preceding formulas show:
〈f, g〉 = f∗ ∗ g(e) , ||f ||22 = f∗ ∗ f(e)
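The convolution formulas can be tried out on a finite group, where the normalised counting measure plays the role of the Haar measure. A sketch of mine for G = Z/n, checking the sup-norm bound and the identity 〈f, g〉 = f∗ ∗ g(e):

```python
import numpy as np

n = 7
rng = np.random.default_rng(1)
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
g = rng.standard_normal(n)

def conv(f, g):
    """f*g(x) = (1/|G|) sum_y f(y) g(y^{-1} x) on Z/n."""
    return np.array([np.mean([f[y] * g[(x - y) % n] for y in range(n)])
                     for x in range(n)])

star = lambda f: np.conj(f[(-np.arange(n)) % n])        # f^*(x) = conj(f(-x))
norm2 = lambda f: np.sqrt(np.mean(np.abs(f) ** 2))
inner = lambda f, g: np.mean(np.conj(f) * g)

assert np.max(np.abs(conv(f, g))) <= norm2(f) * norm2(g) + 1e-12
assert np.isclose(inner(f, g), conv(star(f), g)[0])     # <f, g> = f^* * g(e)
```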
The convolution product for functions in L1(G) can be defined by the
same integral formula, but this will not converge for every x ∈ G and the
result will no longer be continuous in general. To see what happens,
take f, g ∈ L1(G). By Fubini's theorem,
∫ |f ∗ g(x)| dx ≤ ∫ dx ∫ dy |f(y)g(y−1x)| = ∫ dy |f(y)| ∫ dx |g(y−1x)| = ||g||1 ∫ |f(y)| dy = ||f ||1 ||g||1 < ∞
These inequalities prove that∫f(y)g(y−1x)dy converges absolutely for
almost all x ∈ G and the result f ∗ g ∈ L1(G) satisfies:
||f ∗ g||1 ≤ ||f ||1||g||1
This convolution product is associative and L1(G) is an algebra for
convolution. This algebra has no unit element in general (more precisely,
it has no unit when G is not discrete i.e. G is not finite). We shall see
that this algebra L1(G) is commutative exactly when G is commutative.
5.1 Integration of representations
Assume that π is a unitary representation of the compact group G in a
Hilbert space H. We can “extend” π to a representation of L1(G) by
the formula
π1(f) = ∫_G f(x) π(x) dx   (f ∈ L1(G))
These integrals converge absolutely in norm: ||π(x)|| = 1 implies
∫ ||f(x)π(x)|| dx = ∫ |f(x)| dx = ||f ||1
Thus,
||π1(f)|| ≤ ||f ||1 (f ∈ L1(G))
Although G is not really embedded in L1(G) (when G is infinite), we
consider π1 as an extension of π. Later on we shall even drop the index
1, writing π instead of π1. The association
π : G→ Gl(H) ; π1 : L1(G)→ End(H)
can even be made when π is a representation in a Banach space since
π(G) is always a bounded set of End(H): being weakly compact, it is
uniformly bounded.
5.2 Comparison of several norms
Let A ∈ End(V ) be any endomorphism in V . Take any orthonormal
basis (ei) of V and assume that A is represented by the matrix (aij) in
the basis (ei). Obviously
||A||2HS = ∑_{i,j} |aij|2
defines a norm on End(V ) called the Hilbert-Schmidt norm (a priori this
norm depends on the choice of the orthonormal basis (ei)).
If B is another endomorphism, represented by the matrix (bij) (in the
same basis), a small computation shows that
Tr(A∗B) = ∑_{i,j} āij bij
This shows that the Hilbert-Schmidt norm is derived from the
Hilbert-Schmidt inner product
〈A,B〉HS = ∑_{i,j} āij bij = Tr(A∗B)
on End(V), and is in particular independent from the choice of the
orthonormal basis (ei) of V.
Return to a compact group G and a unitary irreducible representation
π ∈ Ĝ in some finite dimensional Hilbert space V = Vπ. The spaces
V̌ ⊗ V , End(V) , L2(G,π)
are G-isomorphic. We shall give explicit isomorphisms between them,
keeping track of the various norms involved.
We have introduced the coefficients
cuv(x) = 〈u,π(x)v〉   (u ∈ V̌ , v ∈ V)
and the more general coefficients (linear combinations of the preceding
ones)
cA(x) = Tr(Aπ(x)) (A ∈ End(V ))
with the relation
cA = cuv for A = u⊗ v
For fixed u ∈ V , the linear mapping
cu : V → L2(G,π) , v → cuv
is a G-morphism from π to r, the right regular representation.
Similarly if v ∈ V is fixed,
l(s)cuv(x) = cuv(s−1x) = 〈u,π(s−1)π(x)v〉 = 〈tπ(s−1)u,π(x)v〉 = cπ̌(s)uv(x)
shows that the linear mapping
cv : V̌ → L2(G,π) , u → cuv
is a G-morphism from π̌ to l (the left regular representation). Summing
up,
c : V̌ ⊗ V → L2(G,π) , u ⊗ v → cuv
is a π̌ ⊗ π to l × r (biregular representation) G×G-morphism.
Note that
[l × r(s, t)]cA(x) = cA(s−1xt) = Tr(Aπ(s−1)π(x)π(t))
= Tr(π(t)Aπ(s)−1π(x)) = cπ(t)Aπ(s)−1(x)
So, the corresponding action of G×G on End(V ) is defined by
(s, t) ·A = π(t)Aπ(s)−1
In this chain of G-morphisms, V̌ ⊗ V is equipped with the
inner product
〈u ⊗ v, x ⊗ y〉 = 〈u, x〉〈v, y〉
(we use the Riesz isomorphism between V̌ and V). This inner product
corresponds to the Hilbert-Schmidt norm:
〈u ⊗ v, x ⊗ y〉 = Tr((u ⊗ v)∗ (x ⊗ y))
Schur’s orthogonality relations (Theorem 4.5) say
〈cuv , cxy〉 = (1/dim π) 〈u, x〉〈v, y〉   (dim π = dim V),
and hence
c : u ⊗ v → cuv is (dim π)−1/2 × an isometry.
The inverse of c is nearly the extension π1 of π. Let us compute
π1(cuv ). For this purpose, we apply this operator to a vector y and
compute the inner product of the result with a vector x
〈x,π1(cuv)y〉 = ∫ cuv(s) 〈x,π(s)y〉 ds = 〈cuv , cxy〉
This quantity will vanish when π̄ is not equivalent to π.
However, Schur's relations give
〈x,π1(cuv)y〉 = 〈cuv , cxy〉 = (dim π)−1〈u, x〉〈v, y〉 = (dim π)−1〈x, 〈v, y〉u〉 = 〈x, (dim π)−1(v ⊗ u)(y)〉
This implies
π1(cuv) = (dim π)−1 v ⊗ u
and by linearity
π1(cA) = (dim π)−1 A∗
Since the cA are the coefficients of π, we see that on L2(G,π̄), f → π1(f) is of the form
π1|L2(G,π̄) = (dim π)−1/2 × an isometry.
The composite
End(V) → L2(G,π) —conj→ L2(G,π̄) → End(V)
A → cA ,  cA → c̄A = f → π1(f)
can be identified with
(dim π)−1 · (A → A∗) = (dim π)−1 × an isometry.
NOTE: From now on, we write π1(f) as simply π(f).
Theorem 5.2 (Plancherel theorem). Let G be a compact group. For
π ∈ Ĝ denote by Pπ : L2(G) → L2(G,π) the orthogonal projector on
the isotypical component of π, and for f ∈ L2(G), let fπ = Pπ(f), so
that the series ∑_Ĝ fπ converges to f in L2(G). Then
(a) fπ(x) = dim π · Tr(π(f̌)π(x)), where f̌(x) = f(x−1).
(b) ||fπ||22 = dim π · ||π(f̌)||2HS (Hilbert-Schmidt norm on the right-hand side).
(c) ||f ||22 = ∑_{π∈Ĝ} dim π · ||π(f)||2HS (Parseval equality)
Proof. The orthogonal projection fπ of f in L2(G,π), is the element
cA of this space having the same inner product with all elements cB of
L2(G,π). Let us determine A as a function of f . We must have
〈cB , f〉 = 〈cB , fπ〉 = 〈cB , cA〉 = (dim π)−1Tr(B∗A)
But
〈cB , f〉 = ∫ c̄B(x) f(x) dx = ∫ Tr(Bπ(x))‾ f(x) dx = ∫ Tr(π(x−1)B∗) f(x) dx = Tr(B∗ ∫ π(x−1)f(x) dx) = Tr(B∗π(f̌))
Comparing the two results obtained for all B, we indeed find
A = (dim π) · π(f̌)
This gives
fπ(x) = cA(x) = Tr(Aπ(x)) = (dim π) · Tr(π(f̌)π(x))
as asserted in part (a). Moreover Schur's relations show that
||fπ||22 = ||cA||22 = (dim π)−1 ||A||2HS = (dim π) · ||π(f̌)||2HS
Thus (b) is proved, and (c) follows from the observation that f and f̌
have the same L2(G)-norm: we can interchange f and f̌. Also observe that
the dimensions of π and π̄ are the same.
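In the simplest case, G = Z/n (finite and abelian), every irreducible representation is a character χk(x) = e^{2πikx/n} of dimension 1 and the Plancherel theorem reduces to the discrete Fourier inversion and Parseval formulas. A numerical sketch of mine:

```python
import numpy as np

n = 8
rng = np.random.default_rng(2)
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
x = np.arange(n)

# pi_k(fcheck) = (1/n) sum_x f(x) exp(-2*pi*i*k*x/n): one number per character
fhat = np.array([np.mean(f * np.exp(-2j * np.pi * k * x / n)) for k in range(n)])

# f_pi_k(x) = fhat[k] * chi_k(x), and the sum over k recovers f
f_rec = sum(fhat[k] * np.exp(2j * np.pi * k * x / n) for k in range(n))
assert np.allclose(f_rec, f)
# Parseval: (1/n) sum_x |f(x)|^2 = sum_k |fhat[k]|^2
assert np.isclose(np.mean(np.abs(f) ** 2), np.sum(np.abs(fhat) ** 2))
```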
5.3 Exercises
Exercise 5.3. Let G be a compact group, f, g ∈ L1(G). For any
representation σ of G (in a Banach space), prove
σ(f ∗ g) = σ(f) · σ(g)
If σ is unitary, prove also σ(f)∗ = σ(f∗) (recall that f∗(x) = f̄(x−1)).
Exercise 5.4. Show that the “extensions” of the regular representations
of a compact group G are given by
l(f)(ϕ) = f ∗ ϕ , r(g)(ϕ) = ϕ ∗ ǧ
where f, g ∈ L1(G) and ϕ ∈ L2(G) (recall that ǧ(x) = g(x−1)). Conclude from this that
||f ∗ ϕ||2 ≤ ||f ||1 ||ϕ||2
Moreover using exercise 5.3 deduce the associativity
(f ∗ g) ∗ ϕ = f ∗ (g ∗ ϕ)
Here f, g ∈ L1(G) and ϕ ∈ L2(G) or all three functions in C(G). Also
check that for any representation σ of G
σ(l(x)f) = σ(x)σ(f) , σ(r(x)f) = σ(f)σ(x−1)
Exercise 5.5. Let G be a compact group and denote by
L1inv = L1inv(G) the closure in L1(G) of the subspace of continuous
functions f satisfying f(xy) = f(yx) (or equivalently f(y−1xy) = f(x)) for all x, y ∈ G. Show that L1inv is contained in the center of L1(G) (as
convolution algebra: prove that f ∗ g = g ∗ f for f, g ∈ L1inv). For any
irreducible representation π : G → Gl(V) prove that
π(f) = (dim π)−1 〈χ, f〉 1V   (f ∈ L1inv)
where χ(g) = Tr(π(g)) and 〈χ, f〉 = ∫ χ(g)f(g) dg.
Hint: Use Schur’s lemma to prove that π(f) is a scalar operator and
then take traces to determine the value of the constant in this multiple
of the identity.
Exercise 5.6. Let σ : G→ Gl(V ) be a unitary representation of a
compact group G. Check that σ(1) = P (here 1 is the constant
function 1 in L1(G)) is the orthogonal projector V → V G on the
subspace of G-invariants of V . (Hint: show that 1 ∗ 1 = 1 and 1∗ = 1 in
L1(G); more generally 1 ∗ f = f ∗ 1 is the constant function∫f(x)dx.)
Exercise 5.7. Show that the “extended” left regular representation
l = l1 : L1(G)→ End(L2(G))
has trivial kernel {0}. (Hint: Let 0 ≠ f ∈ L1(G) and construct a
sequence (gn) ⊆ C(G) with gn ≥ 0, ∫ gn(x) dx = 1 and
l(f)(gn) = f ∗ gn → f ≠ 0.) Conclude that if 0 ≠ f ∈ L1(G), there
exists a π ∈ Ĝ such that π(f) ≠ 0.
Finally prove that
L1(G) commutative ⇐⇒ G commutative
Exercise 5.8. Let G be a compact group, π ∈ Ĝ, and consider the
adjoint representation of G in End(V) (V = Vπ) defined by the
following composition:
Ad : G −→ G×G −→ Gl(End(V))
s −→ (s, s) −→ (A → π(s)Aπ(s)−1)
where the second arrow sends (s, t) to the operator A → π(t)Aπ(s)−1.
Prove that the multiplicity of the identity representation in this adjoint
representation is 1. (This identity representation acts on the subspace of
scalar operators: Schur's lemma.)
Exercise 5.9. The decomposition of the biregular representation of a
compact group G in L2(G) gives the decomposition of the left (resp.
right) regular representation simply by the composition with
i1 : G → G×G, s → (s, e)   (resp. i2 : G → G×G, s → (e, s)).
Conclude that
l ≅ ⊕ π̄ ⊗ 1 ≅ ⊕ dim π · π̄ ,  r ≅ ⊕ 1 ⊗ π ≅ ⊕ dim π · π
(the sums running over π ∈ Ĝ).
6 Characters and Group algebras
Let (π, V ) be a finite dimensional representation of a compact group G.
The character χ = χπ of π is the (complex valued) continuous function
on G defined by
χ(x) = Tr(π(x))
Note that this is the function cA for A = idV ∈ End(V ).
When dim (V ) = 1, χ and π can be identified. In this case χ is a
homomorphism.
Quite generally, since the trace satisfies the identity Tr(AB) = Tr(BA), we see that the characters of two equivalent representations are equal.
Moreover, characters satisfy
χ(xy) = χ(yx), i.e. χ(y−1xy) = χ(x)   (x, y ∈ G)
Thus characters are invariant functions:
χ ∈ Cinv = {f ∈ C(G) : f(y−1xy) = f(x) for all x, y ∈ G}
Observe that
L1inv = closure of Cinv in L1(G)
L2inv = closure of Cinv in L2(G)
Invariant functions are also called central functions; they belong to the
center of L1(G) with respect to convolution.
Still quite generally, the character of the contragredient π̌ of π is given
by
χπ̌(x) = Tr(π̌(x)) = Tr(tπ(x−1)) = Tr(π(x−1)) = χ(x−1)
hence χπ̌ = χ̌.
When π is unitary, π(x−1) = π(x)∗ (π̌ is equivalent to π̄) and χ̌ is the
complex conjugate of χ.
One can also check without difficulty that for two finite dimensional
representations π and σ of G
χπ⊕σ = χπ + χσ , χπ⊗σ = χπ · χσ
When π is irreducible, Schur’s lemma shows that the elements z in the
center Z of G are mapped to scalar operators by π: π(z) = λzidV so
that χ(z) = λzdim (V ). Thus the restriction of (dim V )−1χ to the
center Z is a homomorphism
λ : Z −→ C×
This is the central character of π: it is independent of the special
model chosen in the equivalence class of π.
In particular if λ(Z) is not contained in {±1}, π and π̄ are not
equivalent: their central characters are different.
Also observe that χ(e) = dim (V ) (= dim π)
Proposition 6.1. Any continuous central function f ∈ Cinv on a
compact group G is a uniform limit of linear combinations of characters
of irreducible representations of G.
Proof. Let ε > 0. By the Peter-Weyl theorem 3.17, we know that there
is a finite dimensional representation (σ, V) and an A ∈ End(V) with
|f(x)− Tr(Aσ(x))| < ε (x ∈ G)
In this representation, replace x by one of its conjugates yxy−1:
|f(x)− Tr(Aσ(yxy−1))| = |f(x)− Tr(σ(y−1)Aσ(y)σ(x))| < ε
Integrating over y, we have
|f(x) − Tr(Bσ(x))| < ε , where B = ∫ σ(y−1) A σ(y) dy
By the invariance of the Haar measure, the operator B commutes with
all the operators σ(x).
Hence, if we decompose σ into isotypical components
σ ≅ ⊕_π nπ π ≅ ⊕_π π ⊗ 1nπ
the operator B will have the form
B = ⊕_π iddim π ⊗ Bπ
and
Bσ(x) = σ(x)B ≅ ⊕_π π(x) ⊗ Bπ ,  Tr(Bσ(x)) = ∑_π aπ χπ(x)   (aπ = Tr(Bπ))
This shows that
|f(x) − ∑_finite aπ χπ(x)| < ε
Theorem 6.2. Let π and σ be two finite dimensional representations of
a compact group G with respective characters χπ and χσ . Then
〈χπ , χσ〉 = dim HomG(Vπ , Vσ)
Proof. By Lemma 4.3, we know that the integral
∫ π̄(x) ⊗ σ(x) dx
is an expression for the projector
♮ : V̄π ⊗ Vσ −→ (V̄π ⊗ Vσ)G ,  i.e.  Hom(Vπ, Vσ) −→ HomG(Vπ, Vσ)
The dimension of the image space is the trace of this projector. Thus
〈χπ , χσ〉 = ∫ χ̄π(x) χσ(x) dx = dim HomG(Vπ, Vσ)
Corollary 6.3. Let π be a finite dimensional representation of G. Then
π is irreducible ⇐⇒ ||χπ||2 = √〈χπ , χπ〉 = 1
Corollary 6.4. Let π, σ ∈ Ĝ. Then
〈χπ , χσ〉 = δπσ   (= 1 if π is equivalent to σ, 0 otherwise)
Corollary 6.5. Let σ be a finite dimensional representation of G and
σ = ⊕_I nπ π (summation over a finite subset I ⊆ Ĝ) be a
decomposition into irreducible components. Then
(a) nπ = 〈χπ , χσ〉 is well determined,
(b) ||χσ||22 = ∑_I nπ2.
Proof. Let Vσ = ⊕ (Vτ ⊗ Cnτ). Since each G-morphism Vσ → Vπ
must vanish on all isotypical components Vτ ⊗ Cnτ where τ is not
equivalent to π, we have
HomG(Vσ , Vπ) = HomG(Vπ ⊗ Cnπ , Vπ) = Cnπ ⊗ HomG(Vπ , Vπ) = Cnπ
This proves assertion (a). Moreover (b) follows from the Pythagorean
theorem and Corollary 6.3.
Corollary 6.6. The set of characters (χπ)π∈Ĝ is an orthonormal basis
of the Hilbert space L2(G)inv. Every function f ∈ L2(G)inv can be
expanded in the series
f = ∑_Ĝ 〈χπ , f〉 χπ   (convergence in L2(G))
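The orthonormality of irreducible characters is easy to check numerically for a finite group. A sketch of mine for S3 (same dihedral matrices as before), whose irreducible characters are the trivial character, the sign, and the trace of the 2-dimensional representation:

```python
import numpy as np

c, s = np.cos(2 * np.pi / 3), np.sin(2 * np.pi / 3)
r = np.array([[c, -s], [s, c]])
f = np.array([[1.0, 0.0], [0.0, -1.0]])
S3 = [np.eye(2), r, r @ r, f, r @ f, r @ r @ f]

chi_triv = np.ones(6)
chi_sign = np.array([1, 1, 1, -1, -1, -1])         # det of each matrix
chi_2dim = np.array([np.trace(g) for g in S3])     # 2, -1, -1, 0, 0, 0

inner = lambda a, b: np.mean(np.conj(a) * b)       # <a, b> = (1/|G|) sum conj(a) b
chars = [chi_triv, chi_sign, chi_2dim]
gram = np.array([[inner(a, b) for b in chars] for a in chars])
print(np.round(gram.real, 6))                      # identity matrix: orthonormal
```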
Theorem 6.7. Let G be a compact group. For π ∈ Ĝ, let Pπ denote
the projector L2(G) → L2(G,π) onto the isotypical component of π (in
the right regular representation). Then Pπ is given by convolution
with the normalized character ϑπ = dim π · χπ:
Pπ : f → fπ = Pπf = f ∗ ϑπ
Proof. We have already seen that
fπ(x) = dim π · Tr(π(f̌)π(x))
Thus
fπ(x) = dim π · Tr(∫ f(y−1) π(y) π(x) dy) = dim π · ∫ f(y) Tr(π(y−1x)) dy = (f ∗ ϑπ)(x)
Exercise 6.8. Let H1 and H2 be two Hilbert spaces. Prove that any
operator A in H1 ⊗ H2 which commutes with all operators T ⊗ 1,
T ∈ End(H1), can be written in the form 1 ⊗ B for some B ∈ End(H2).
(Hint: Introduce an orthonormal basis (ei) of H1 and write A as a
matrix of blocks with respect to this basis:
A(ej ⊗ x) = ∑_i ei ⊗ Aij x   (Aij ∈ End(H2)).
Using the commutations (Pj ⊗ 1)A = A(Pj ⊗ 1), where Pj is the
orthogonal projector on Cej, conclude that Aij = 0 for i ≠ j. Finally,
using the commutation relations of A with the operators Uji ⊗ 1,
Uji(ei) = ej , Uji(ek) = 0 for k ≠ i,
conclude that Aii = B ∈ End(H2) is independent of i.)
Exercise 6.9. Check that the formula
f =∑〈χπ , f〉χπ
coincides with the Fourier inversion formula.
7 Induced representations and
Frobenius-Weil reciprocity
Suppose that K is a closed (hence compact) subgroup of G.
Recall that K\G is the space of right cosets Kg, g ∈ G.
Suppose for the moment that K\G is finite, i.e.
K\G = {Kg1, . . . ,Kgn} for some n so that G is the disjoint union of
Kg1, . . . ,Kgn.
For each s ∈ G we have an associated permutation π(s) of {1, . . . , n} that sends i to the unique j with Kgi s−1 = Kgj.
We can define a representation ρ of G on Cn by
ρ(s)(a1, . . . , an) := (aπ(s)−1(1), . . . , aπ(s)−1(n))
Equivalently, if we think of Cn as the space of functions from K\G into
C, then, for s ∈ G and a coset L ∈ K\G,
ρ(s)f(L) = f(Ls)
The space of functions from K\G into C can be identified with the
space of functions from G to C that are constant on right cosets of K,
that is, with the space of functions f : G→ C such that
f(kx) = f(x), k ∈ K, x ∈ G
Note that ρ is just the right regular representation for G restricted to
this subspace.
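Before passing to general K, here is a toy version (my own sketch) of this coset construction for a finite group: G = S3 acting on the right cosets of the two-element subgroup K generated by a transposition, realised by permutation matrices.

```python
import numpy as np
from itertools import permutations

compose = lambda a, b: tuple(a[b[i]] for i in range(3))    # (a o b)(i) = a(b(i))
G = list(permutations(range(3)))                           # S_3 on {0, 1, 2}
K = [(0, 1, 2), (1, 0, 2)]                                 # identity and the swap 0<->1

cosets = []                                                # right cosets Kg
for g in G:
    Kg = frozenset(compose(k, g) for k in K)
    if Kg not in cosets:
        cosets.append(Kg)

def rho(s):
    """[rho(s)f](Kg) = f(Kg s) as a permutation matrix on C[K\\G]."""
    M = np.zeros((len(cosets), len(cosets)))
    for i, L in enumerate(cosets):
        Ls = frozenset(compose(x, s) for x in L)
        M[i, cosets.index(Ls)] = 1.0
    return M

assert all(np.allclose(rho(s) @ rho(t), rho(compose(s, t))) for s in G for t in G)
print(len(cosets), "cosets, so rho is a", len(cosets), "dimensional representation")
```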
Can we build other representations of G by similar constructions?
We return to the case where K is an arbitrary closed subgroup of G (so
that K\G is not necessarily finite).
The canonical projection p : G → K\G pushes the Haar measure ds on G
forward to a measure dx on K\G characterized by
∫_{K\G} f(x) dx = ∫_G f(p(s)) ds   (f ∈ C(K\G))
Lemma 7.1. Negligible sets in K\G (relative to the measure dx) are
those sets N for which p−1(N) is negligible in G (relative to the Haar
measure ds of G). Moreover, for any f ∈ C(G) (or by extension for any
f ∈ L1(G))
∫_{K\G} [ ∫_K f(kx) dk ] dx = ∫_G f(x) dx
In particular, the measure dx is invariant under the translations x → xs (s ∈ G) of K\G.
Proof. Exercise.
Now let (σ, Vσ) be a unitary representation of K. We define the Hilbert
space L2(G,Vσ) as the completion of the space C(G,Vσ) of continuous
functions G→ Vσ with the norm
||f ||2 = ∫_G ||f(x)||2 dx
The norm under the integral sign is computed in Vσ .
Note that L2(G,Vσ) is a Hilbert space with inner product
〈f, g〉 = ∫_G 〈f(x), g(x)〉 dx
The inner product under the integral sign is computed in Vσ .
The elements
f ∈ L2(G,Vσ) such that f(kx) = σ(k)f(x) for all k ∈ K
constitute a subspace H ⊆ L2(G,Vσ).
Since ||f(x)|| only depends on the coset Kx of x for f ∈ H (σ is
assumed to be unitary), we can take the norm and inner product on H
to be defined by
||f ||2H = ∫_{K\G} ||f(x)||2 dx ,  〈f, g〉H = ∫_{K\G} 〈f(x), g(x)〉 dx
As before, the norm and the inner product under the integral sign are
computed in Vσ .
Note that if f ∈ H, r is the analogue of the right regular representation
of G on L2(G,Vσ) (that is, (r(s)h)(x) = h(xs) for h ∈ L2(G,Vσ) and
s, x ∈ G) and k ∈ K, then
(r(s)f)(kx) = f(kxs) = σ(k)f(xs) = σ(k)(r(s)f)(x),
so that r(s)f ∈ H also. That is, H is r-invariant.
The induced representation ρ = IndGK(σ) is by definition the “right
regular representation” r of G restricted to H ⊆ L2(G,Vσ).
The induced representation is unitary.
For example, if σ is the identity representation of K (in dimension 1),
Vσ = C, L2(G,Vσ) = L2(G), then H is simply L2(K \G) and we get
the construction considered at the start of this section.
Write HG for the G-fixed elements in H (with respect to the action
given by ρ).
As before, write V Kσ for the K-fixed elements of Vσ (with respect to the
action given by σ).
Proposition 7.2. (1) The linear map HG → V Kσ given by f → f(e) is
an isomorphism of vector spaces.
(2) Let (π,Hπ) be a unitary representation of G. Then there is an
equivalence
π ⊗ IndGK(σ) ∼→ IndGK(π|K ⊗ σ) : Hπ ⊗ H → H
given by v ⊗ f → ϕ with ϕ(x) = [π(x)v] ⊗ f(x)
Proof. The elements of HG are certainly functions f : G→ Vσ which
are (equal almost everywhere to) a constant:
f(x) = f(ex) = [r(x)f](e) = f(e)
In particular,
f(k) = f(e), k ∈ K
By definition of H, f ∈ H implies
f(k) = f(ke) = σ(k)f(e), k ∈ K
Thus, f(e) = σ(k)f(e) for all k ∈ K and so f(e) ∈ V Kσ , giving part (1)
of the proposition.
To check part (2), let us first show that the functions ϕ (as defined in
the proposition) belong to the space of the induced representation
IndGK(π|K ⊗ σ)
ϕ(kx) = [π(kx)v]⊗ f(kx) = [π(k)π(x)v]⊗ [σ(k)f(x)]
= [[π|K ⊗ σ](k)](ϕ(x))
Next we show that v ⊗ f → ϕ is a G-morphism intertwining π ⊗ ρ and
ρ′ = IndGK(π|K ⊗ σ), which we recall is the restriction of the right
regular representation to its space. Note that
[π ⊗ ρ](s)(v ⊗ f) = [π(s)v] ⊗ [ρ(s)f]
is mapped to the function on G given by
x 7→ [π(x)π(s)v] ⊗ [(ρ(s)f)(x)] = [π(xs)v] ⊗ f(xs) = (ρ′(s)ϕ)(x)
as desired.
Now we check that v ⊗ f → ϕ is isometric and hence injective. If (ei) is
an orthonormal basis of Hπ, every element of Hπ⊗H can be written
uniquely as ∑ ei ⊗ fi with ∑ ||fi||2 < ∞, and such an element has as its
image the function ϕ = ∑ ϕi given by
x 7→ ∑ π(x)ei ⊗ fi(x)
The norm of ϕ is
||ϕ||2H = ∫_{K\G} ||ϕ(x)||2 dx = ∫_{K\G} ||∑ π(x)ei ⊗ fi(x)||2 dx = ∫_{K\G} ∑ ||fi(x)||2 dx = ∑ ∫_{K\G} ||fi(x)||2 dx = ∑ ||fi||2H = ||∑ ei ⊗ fi||2
The third equality is justified by the fact that (π(x)ei) is also an
orthonormal basis of Hπ, since π is unitary.
Finally, to see that v ⊗ f → ϕ is onto, it is enough to see that all
continuous functions Φ ∈ H belong to the image. The function Φ has a
unique expansion in the orthonormal basis (π(x)ei) of Hπ:
Φ(x) = ∑ π(x)ei ⊗ fi(x)   (fi(x) ∈ Vσ) ,  ||Φ(x)||2 = ∑ ||fi(x)||2
Therefore
∑ π(kx)ei ⊗ fi(kx) = Φ(kx) = [π(k) ⊗ σ(k)]Φ(x) = ∑ [π(k)π(x)ei] ⊗ [σ(k)fi(x)]
The uniqueness of the decomposition gives fi(kx) = σ(k)fi(x). Thus,
fi ∈ H, as required.
Note that if we consider C as a trivial G- or K- space, then
HomG(C,H) ∼= HG
and
HomK(C, Vσ) ∼= V Kσ
(We identify A ∈ HomG(C,H) with the image h ∈ H of 1 ∈ C and
observe that the assumption that A intertwines with the induced
representation H is equivalent to the assumption that ρ(g)h = h for all
g ∈ G. A similar comment holds for HomK(C, Vσ).)
Therefore, the isomorphism of part (1) of the preceding proposition can
be written as
HomG(C,H) ∼→ HomK(C, Vσ)
This form admits the following generalization.
133
Theorem 7.3 (Frobenius-Weil). Let (π,Hπ) be a unitary
representation of a compact group G and (σ, Vσ) a unitary
representation of one of its closed subgroups K. Put ρ = IndGK(σ) and
H = Hρ. Then there is a canonical isomorphism
MorG(Hπ ,Hρ)∼→ MorK(Hπ , Vσ)
where we take, for morphisms between two representation spaces, the
Hilbert-Schmidt morphisms. (If our spaces are finite dimensional, then
MorG is what we have written before as HomG and, similarly, MorK is
just HomK .)
134
Proof. We have seen in finite dimensions that in the identification
Hπ ⊗Hρ∼→ Hom(Hπ ,Hρ)
the representation π ⊗ ρ is transformed into the representation
A→ ρ(s)Aπ(s)−1 (s ∈ G,A ∈ Hom(Hπ ,Hρ))
Much the same thing happens for infinite dimensional unitary
representations: we complete the algebraic tensor product of Hilbert
spaces and get an isomorphism with the space of Hilbert-Schmidt
operators,
Hπ⊗Hρ ∼→ MorHS(Hπ, Hρ)
135
Thus G-morphisms Hπ → Hρ correspond to G-invariants in Hπ⊗Hρ.
In other words,
MorG(Hπ ,Hρ) ∼= (Hπ⊗Hρ)G
Since Hπ⊗Hρ = H can be identified with the space of the
representation of G induced from the representation π|K ⊗σ of K (part
(2) of Proposition 7.2), we infer from part (1) of Proposition 7.2 that
(Hπ⊗Hρ)G ∼= HG ∼= (Hπ⊗Vσ)K
The conclusion follows from the identity
(Hπ⊗Vσ)K ∼= MorK(Hπ , Vσ)
136
Corollary 7.4. Consider (π,Hπ) ∈ Ĝ and (σ, Vσ) ∈ K̂ (irreducible unitary representations). Then the
multiplicity of π in IndGK(σ) is the same as the multiplicity of σ in π|K
Proof. Denote the (infinite dimensional in general) space of
ρ = IndGK(σ) by Hρ.
Any G-morphism Hπ → Hρ must send Hπ into the isotypical
component of π in Hρ; this isotypical component is isomorphic to a
direct sum ⊕_I Hπ, whence
MorG(Hπ, ⊕_I Hπ) = ⊕_I MorG(Hπ, Hπ) = ⊕_I C
by Schur's lemma.
137
On the other hand, every K-morphism Hπ → Vσ must vanish on those
components of a decomposition of Hπ into a direct sum of irreducibles (for K) which
are not equivalent to (σ, Vσ).
Let us write the direct sum of the copies of (σ, Vσ) in Hπ as Vσ ⊗Cm.
Then
MorK(Hπ , Vσ) = MorK(Vσ ⊗ Cm, Vσ) ∼→ Mor(Cm,MorK(Vσ , Vσ))
138
To see the isomorphism
MorK(Vσ ⊗ Cm, Vσ) ∼→ Mor(Cm,MorK(Vσ , Vσ)),
note that if (ei) is a basis for Cm we can write any element of Vσ ⊗ Cm
as ∑_i vi ⊗ ei. Any map A ∈ MorK(Vσ ⊗ Cm, Vσ) is specified by the m
maps Ai : v 7→ A(v ⊗ ei) belonging to MorK(Vσ, Vσ), and we can
think of these m maps as a single map from Cm to MorK(Vσ, Vσ) defined by ei 7→ Ai.
139
Since MorK(Vσ, Vσ) = C·id_{Vσ} (Schur's lemma), we have
MorK(Hπ, Vσ) ∼= dual of Cm
The isomorphism of the theorem implies equality of the dimensions of the
two spaces. These dimensions are respectively
Card(I) = multiplicity of π in ρ = IndGK(σ)
m = multiplicity of σ in π|K
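For a finite group the corollary can be checked directly with characters. The following Python sketch is our own illustration (not part of the notes): it takes G = S3, K = S2 and σ the sign character of K, computes the induced character by the usual averaging formula, and compares the two multiplicities.

from itertools import permutations
from fractions import Fraction

# G = S_3 realized as permutations of (0, 1, 2); K = {e, (0 1)}, a copy of S_2.
G = list(permutations(range(3)))
K = [(0, 1, 2), (1, 0, 2)]

def compose(a, b):
    # (a o b)(x) = a(b(x))
    return tuple(a[b[i]] for i in range(3))

def inverse(a):
    inv = [0, 0, 0]
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)

def cycle_type(p):
    seen, lengths = set(), []
    for start in range(3):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = p[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

# The three irreducible characters of S_3, tabulated by cycle type.
chars_G = {
    "trivial":  {(1, 1, 1): 1, (2, 1): 1,  (3,): 1},
    "sign":     {(1, 1, 1): 1, (2, 1): -1, (3,): 1},
    "standard": {(1, 1, 1): 2, (2, 1): 0,  (3,): -1},
}
# sigma = the sign character of K.
sigma = {(0, 1, 2): 1, (1, 0, 2): -1}

def induced_character(g):
    # (Ind_K^G sigma)(g) = (1/|K|) * sum over x in G with x^{-1} g x in K of sigma(x^{-1} g x)
    total = 0
    for x in G:
        conj = compose(compose(inverse(x), g), x)
        if conj in sigma:
            total += sigma[conj]
    return Fraction(total, len(K))

for name, chi in chars_G.items():
    mult_in_induced = sum(Fraction(1, len(G)) * induced_character(g) * chi[cycle_type(g)] for g in G)
    mult_in_restriction = sum(Fraction(1, len(K)) * sigma[k] * chi[cycle_type(k)] for k in K)
    assert mult_in_induced == mult_in_restriction
    print(name, mult_in_induced)
# prints: trivial 0, sign 1, standard 1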
140
8 Representations of the symmetric group
8.1 Young subgroups, tableaux and tabloids
If λ = (λ1, λ2, . . . , λl) is a partition of n, then write λ ` n. We also use
the notation |λ| = ∑_i λi, so that a partition of n satisfies |λ| = n.
Definition 8.1. Suppose λ = (λ1, λ2, . . . , λl) ` n. The Ferrers diagram,
or shape, of λ is an array of n dots having l left-justified rows with row i
containing λi dots for 1 ≤ i ≤ l.
Definition 8.2. For any set T , let ST be the set of permutations of T .
Let λ = (λ1, λ2, . . . , λl) ` n. Then the corresponding Young subgroup
of Sn is
Sλ = S{1,2,...,λ1} × S{λ1+1,λ1+2,...,λ1+λ2} × · · · × S{n−λl+1,n−λl+2,...,n}
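As a concrete illustration, here is a small Python sketch (the helper names are ours, not notation from the notes) that lists the partitions of n, prints a Ferrers diagram, and reads off the index blocks whose symmetric groups make up the Young subgroup Sλ.

from itertools import accumulate

def partitions(n, largest=None):
    # Yield the partitions of n as weakly decreasing tuples.
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def ferrers(lam):
    # The Ferrers diagram of lam as a string of left-justified rows of dots.
    return "\n".join("." * part for part in lam)

def young_subgroup_blocks(lam):
    # The blocks {1..lam1}, {lam1+1..lam1+lam2}, ... whose product of
    # symmetric groups is the Young subgroup S_lambda.
    ends = list(accumulate(lam))
    starts = [1] + [e + 1 for e in ends[:-1]]
    return [list(range(s, e + 1)) for s, e in zip(starts, ends)]

print(list(partitions(4)))        # [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
lam = (3, 2)
print(ferrers(lam))
print(young_subgroup_blocks(lam)) # [[1, 2, 3], [4, 5]]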
141
Now consider the representation (1 ↑_{Sλ}^{Sn}), by which we mean the
representation of Sn induced by the trivial representation of the subgroup
Sλ. If π1, π2, . . . , πk is a transversal for Sλ, then the vector space
V λ = C{π1Sλ, π2Sλ, . . . , πkSλ}
is a module for our induced representation.
Definition 8.3. Suppose λ ` n. A Young tableau of shape λ is an array
t obtained by replacing the dots of the Ferrers diagram with the numbers
1, 2, . . . , n bijectively.
Definition 8.4. Two λ-tableaux t1 and t2 are row equivalent, t1 ∼ t2, if
the corresponding rows of the two tableaux contain the same elements.
A tabloid of shape λ or λ-tabloid is then
{t} = {t1|t1 ∼ t}
where shape(t) = λ
142
Now π ∈ Sn acts on a tableau t = (ti,j) of shape λ ` n as follows:
πt = (π(ti,j))
This induces an action on tabloids by letting
π{t} = {πt}
Exercise: Check that this is well defined, namely independent of the
choice of t.
Definition 8.5. Suppose λ ` n. Let
Mλ = C{{t1}, . . . , {tk}}
where {t1}, . . . , {tk}, is a complete list of λ-tabloids. Then Mλ is called
the permutation module corresponding to λ.
143
Definition 8.6. Any G-module M is cyclic if there is a v ∈M such that
M = CGv
where Gv = {gv|g ∈ G}. In this case we say that M is generated by v.
Proposition 8.7. If λ ` n, then Mλ is cyclic, generated by any given
λ-tabloid. In addition, dim Mλ = n!/λ! (where λ! = λ1!λ2! · · · λl!), the number of λ-tabloids.
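The count of λ-tabloids is easy to confirm by brute force; a rough Python sketch (representation of tabloids and names are ours):

from itertools import permutations
from math import factorial

def tabloids(lam):
    # All lambda-tabloids: each is a tuple of frozensets, one per row.
    n = sum(lam)
    result = set()
    for perm in permutations(range(1, n + 1)):
        rows, start = [], 0
        for part in lam:
            rows.append(frozenset(perm[start:start + part]))
            start += part
        result.add(tuple(rows))
    return result

lam = (3, 2)
n = sum(lam)
lam_factorial = 1
for part in lam:
    lam_factorial *= factorial(part)

assert len(tabloids(lam)) == factorial(n) // lam_factorial == 10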
Theorem 8.8. Consider λ ` n with the Young subgroup Sλ and tabloid
{tλ}, as before. Then V λ = CSnSλ and Mλ = CSn{tλ} are isomorphic
as Sn-modules.
Proof. Let π1, π2, . . . , πk be a transversal for Sλ. Define a map:
θ : V λ →Mλ
by θ(πiSλ) = {πitλ} for i = 1, 2, . . . , k and linear extension. It is not
hard to verify that θ is the desired Sn-isomorphism of modules.
144
8.2 Dominance and Lexicographic ordering
Definition 8.9. Suppose λ = (λ1, . . . , λl) and µ = (µ1, µ2, . . . , µm) are
partitions of n. Then λ dominates µ, written λ ⊵ µ, if
λ1 + λ2 + · · · + λi ≥ µ1 + µ2 + · · · + µi
for all i ≥ 1. If i > l (respectively, i > m), then we take λi (respectively,
µi) to be zero.
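In code, the dominance condition is just a comparison of partial sums; a minimal Python sketch (function name ours):

from itertools import zip_longest

def dominates(lam, mu):
    # True if lam dominates mu: every partial sum of lam is at least the
    # corresponding partial sum of mu.
    total_l = total_m = 0
    for l_i, m_i in zip_longest(lam, mu, fillvalue=0):
        total_l += l_i
        total_m += m_i
        if total_l < total_m:
            return False
    return True

assert dominates((3, 1), (2, 2))
assert not dominates((2, 2), (3, 1))
# (3, 3) and (4, 1, 1) are incomparable in dominance order:
assert not dominates((3, 3), (4, 1, 1)) and not dominates((4, 1, 1), (3, 3))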
145
Lemma 8.10 (Dominance lemma for partitions). Let tλ and sµ be
tableaux of shape λ and µ respectively. If for each index i, the elements
of row i of sµ are all in different columns in tλ, then λ ⊵ µ.
Proof. By hypothesis, we can sort the entries in each column of tλ so
that the elements of rows 1, 2, . . . , i of sµ all occur in the first i rows of
tλ.
Now note that
λ1 + λ2 + · · ·+ λi = number of elements in the first i rows of tλ
≥ number of elements of sµ in the first i rows of tλ
= µ1 + µ2 + · · ·+ µi
146
Definition 8.11. Let λ = (λ1, λ2, . . . , λl) and µ = (µ1, µ2, . . . , µm) be
partitions of n. Then λ < µ in lexicographic order if for some index i,
λj = µj for j < i and λi < µi
Proposition 8.12. If λ, µ ` n with λ ⊵ µ, then λ ≥ µ in lexicographic order.
Proof. If λ ≠ µ, then find the first index i where they differ. Thus
∑_{j=1}^{i−1} λj = ∑_{j=1}^{i−1} µj and ∑_{j=1}^{i} λj ≥ ∑_{j=1}^{i} µj (since λ ⊵ µ). Since λi ≠ µi, we conclude
λi > µi.
147
8.3 Specht Modules
Definition 8.13. Suppose now that the tableau t has rows
R1, R2, . . . Rl and columns C1, C2, . . . , Ck. Then
Rt = SR1 × SR2 × . . .× SRl
and
Ct = SC1 × SC2 × . . .× SCk
are the row-stabilizer and column-stabilizer of t respectively.
Note that our equivalence classes can be expressed as {t} = Rtt.
148
In general, given a subset H ⊆ Sn, we can form the group algebra
elements
H+ = ∑_{π∈H} π
and
H− = ∑_{π∈H} sgn(π)π
For a tableau t, the element R+_t is already implicit in the corresponding
tabloid by the remark at the end of the previous paragraph. However we
will also need to make use of
κt := C−_t = ∑_{π∈Ct} sgn(π)π
Note that if t has columns C1, C2, . . . , Ck, then κt factors as
κt = κ_{C1} κ_{C2} · · · κ_{Ck}
149
Definition 8.14. If t is a tableau, then the associated polytabloid is
et = κt{t}
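To make this concrete, here is a small Python sketch (our own ad hoc encoding, not from the notes) that computes the polytabloid e_t for one tableau of shape (2, 1) by summing sgn(π)·{πt} over the column group C_t, built as a product over the columns just as in the factorization of κt above.

from itertools import permutations, product

def sign(perm):
    # Sign of a permutation given as a dict {i: perm(i)}.
    seen, sgn = set(), 1
    for start in perm:
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        sgn *= (-1) ** (length - 1)
    return sgn

def column_group(tableau):
    # All permutations (as dicts) that permute entries within each column of t.
    width = max(len(row) for row in tableau)
    columns = [[row[j] for row in tableau if j < len(row)] for j in range(width)]
    for choice in product(*[list(permutations(col)) for col in columns]):
        perm = {}
        for col, images in zip(columns, choice):
            perm.update(dict(zip(col, images)))
        yield perm

def apply_to_tabloid(perm, tableau):
    # The tabloid {pi t}: apply perm to every entry, keep only the row sets.
    return tuple(frozenset(perm[x] for x in row) for row in tableau)

t = ((1, 2), (3,))            # a tableau of shape (2, 1)
polytabloid = {}
for pi in column_group(t):
    key = apply_to_tabloid(pi, t)
    polytabloid[key] = polytabloid.get(key, 0) + sign(pi)

for tabloid, coeff in polytabloid.items():
    print(coeff, [sorted(row) for row in tabloid])
# e_t = {rows 12 | 3} - {rows 23 | 1}, since C_t here is generated by (1 3)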
150
Lemma 8.15. Let t be a tableau and π be a permutation. Then
1. Rπt = πRtπ−1
2. Cπt = πCtπ−1
3. κπt = πκtπ−1
4. eπt = πet
Proof. 1. We have the following equivalent statements:
σ ∈ Rπt ←→ σ{πt} = {πt}
←→ π−1σπ{t} = {t}
←→ π−1σπ ∈ Rt
←→ σ ∈ πRtπ−1
The proofs of 2 and 3 are similar to that of part 1.
151
4. We have
eπt = κπt{πt} = πκtπ−1{πt} = πκt{t} = πet
Definition 8.16. For any partition λ, the corresponding Specht module,
Sλ is the submodule of Mλ spanned by the polytabloids et, where t is of
shape λ
152
Proposition 8.17. The Sλ are cyclic modules generated by any given
polytabloid.
Proof. This follows from part 4 of Lemma 8.15
153
8.4 The Submodule theorem
Recall that H− = ∑_{π∈H} (sgn π)π for any subset H ⊆ Sn. If H = {π}, then we write π− for H−. We need the unique inner product on Mλ for
which
〈{t}, {s}〉 = δ_{{t},{s}} (8.1)
154
Lemma 8.18 (Sign Lemma). Let H ≤ Sn be a subgroup. Then
1. If π ∈ H, then
πH− = H−π = (sgn π)H−
Equivalently, π−H− = H−.
2. For any u, v ∈ Mλ,
〈H−u, v〉 = 〈u, H−v〉
3. If the transposition (b, c) ∈ H, then we can factor
H− = k(ε − (b, c))
where k ∈ C[Sn].
4. If t is a tableau with b, c in the same row of t and (b, c) ∈ H, then
H−{t} = 0
155
Proof. Exercise
156
Corollary 8.19. Let t = tλ be a λ-tableau and s = sµ be a µ-tableau,
where λ, µ ` n. If κt{s} ≠ 0, then λ ⊵ µ. And if λ = µ, then
κt{s} = ±et
Proof. Suppose b and c are two elements in the same row of sµ. Then
they cannot be in the same column of tλ, for if so then κt = k(ε − (b, c))
and κt{s} = 0 by parts 3 and 4 of the preceding lemma. Thus the
dominance lemma 8.10 yields λ ⊵ µ.
If λ = µ, then we must have {s} = π{t} for some π ∈ Ct, by the same
argument that established the dominance lemma. Using part 1 of the
preceding lemma yields
κt{s} = κtπ{t} = (sgn π)κt{t} = ±et
157
Corollary 8.20. If u ∈ Mµ and shape t = µ, then κtu is a multiple of et.
Proof. We can write u = ∑_i ci{si}, where the si are µ-tableaux. By
the previous corollary, κtu = ∑_i ±ci et.
158
Theorem 8.21 (Submodule Theorem). Let U be a submodule of
Mµ. Then
U ⊇ Sµ or U ⊆ Sµ⊥
In particular the Sµ are irreducible.
Proof. Consider u ∈ U and a µ-tableau t. By the preceding corollary, we
know that κtu = fet for some field element f . There are two cases,
depending on which multiples can arise.
Suppose that there exists a u and a t with f ≠ 0. Then since u is in the
submodule U , we have fet = κtu ∈ U . Thus et ∈ U (since f is
nonzero) and Sµ ⊆ U (since Sµ is cyclic).
159
On the other hand, suppose we always have κtu = 0. We claim that this
forces U ⊆ Sµ⊥. Consider any u ∈ U . Given an arbitrary µ-tableau t,
we can apply part 2 of the sign lemma to obtain
〈u, et〉 = 〈u, κt{t}〉 = 〈κtu, {t}〉 = 〈0, {t}〉 = 0
Since the et span Sµ, we have u ∈ Sµ⊥, as claimed.
160
Proposition 8.22. Suppose θ ∈ HomSn(Sλ,Mµ) is nonzero. Then
λ ⊵ µ. Moreover, if λ = µ, then θ is multiplication by a scalar.
Proof. Since θ ≠ 0, there is some basis vector et such that θ(et) ≠ 0.
Because 〈·, ·〉 is an inner product with complex scalars, Mλ = Sλ ⊕ Sλ⊥.
Thus we can extend θ to an element of HomSn(Mλ,Mµ) by setting
θ(Sλ⊥) = 0. So
0 ≠ θ(et) = θ(κt{t}) = κtθ({t}) = κt(∑_i ci{si})
where the si are µ-tableaux. By Corollary 8.19, we have λ ⊵ µ.
In the case λ = µ, Corollary 8.20 yields θ(et) = cet for some constant c.
So for any permutation π,
θ(eπt) = θ(πet) = πθ(et) = π(cet) = ceπt
Thus θ is multiplication by c.
161
Theorem 8.23. The Sλ for λ ` n form a complete list of irreducible
Sn-modules.
Proof. The Sλ are irreducible by the submodule theorem.
Since we have the right number of modules for a full set, it suffices to
show that they are pairwise inequivalent. But if Sλ ∼= Sµ, then there is a
nonzero homomorphism θ ∈ HomSn(Sλ,Mµ), since Sµ ⊆Mµ. Thus
λ ⊵ µ (Proposition 8.22). Similarly µ ⊵ λ, so λ = µ.
162
8.5 Standard Tableaux and a Basis for Sλ
Definition 8.24. A tableau t is standard if the rows and columns of t
are increasing sequences. In this case we also say that the corresponding
tabloid and polytabloid are standard.
Theorem 8.25. The set
{et : t is a standard λ-tableau}
is a basis for Sλ.
Proof. Omitted
163
8.6 The Branching Rule
Definition 8.26. If λ is a diagram, then an inner corner of λ is a node
(i, j) ∈ λ whose removal leaves the Ferrers diagram of a partition. Any
partition obtained by such a removal is denoted by λ−. An outer corner
of λ is a node (i, j) /∈ λ whose addition produces the Ferrers diagram of
a partition. Any partition obtained by such an addition is denoted by λ+
164
Lemma 8.27. We have
fλ = ∑_{λ−} fλ−
Proof. Every standard tableau of shape λ ` n consists of n in some
inner corner together with a standard tableau of shape λ− ` n− 1. The
result follows.
165
Theorem 8.28 (Branching Rule). If λ ` n, then
1. Sλ ↓_{Sn−1} ∼= ⊕_{λ−} Sλ−, and
2. Sλ ↑^{Sn+1} ∼= ⊕_{λ+} Sλ+
Proof. 1. Let the inner corners of λ appear in rows r1 < r2 < · · · < rk.
For each i, let λi denote the partition of n − 1 obtained by removing the
corner cell in row ri. In addition, if n is at the end of row ri of tableau t
(respectively, in row ri of tabloid {ti}), then ti (respectively, {ti} ) will
be the array obtained by removing the n.
Now given any group G with module V and submodule W , it is easy to
see that
V ∼= W ⊕ (V/W ),
where V/W is the quotient space. Thus it suffices to find a chain of
subspaces
{0} = V (0) ⊂ V (1) ⊂ V (2) ⊂ · · · ⊂ V (k) = Sλ
166
such that V (i)/V (i−1) ∼= Sλi as Sn−1-modules for 1 ≤ i ≤ k. Let V (i)
be the vector space spanned by the standard polytabloids et where n
appears in t at the end of one of rows r1 through ri. We show that the
V (i) are our desired modules as follows.
Define maps θi : Mλ → Mλi by linearly extending
θi({t}) = {ti} if n is in row ri of {t}, and θi({t}) = 0 otherwise.
Verify that θi is an Sn−1-homomorphism. Furthermore, for standard t
we have
θi(et) = eti if n is in row ri of t, and θi(et) = 0 if n is in row rj of t, where j < i.
167
This is because any tabloid appearing in et, t standard, has n in the
same row or higher than in t.
Since the standard polytabloids form a basis for the corresponding
Specht module,
θi V (i) = Sλi
and
V (i−1) ⊆ ker θi.
We can construct the chain
{0} = V (0) ⊆ V (1) ∩ ker θ1 ⊆ V (1) ⊆ V (2) ∩ ker θ2 ⊆ V (2) ⊆ · · · ⊆ V (k) = Sλ
But
dim V (i)/(V (i) ∩ ker θi) = dim θi V (i) = fλi
168
By the preceding lemma, the dimensions of these quotients add up to
dim Sλ. Since this leaves no space to insert extra modules, the chain
must have equality for the first, third, etc. containments. Furthermore,
V (i)/V (i−1) ∼= V (i)/(V (i) ∩ ker θi) ∼= Sλi
as desired.
2. We will show that this part follows from the first by Frobenius
reciprocity. In fact, parts 1 and 2 can be shown to be equivalent by the
same method.
Let χλ be the character of Sλ. If Sλ ↑^{Sn+1} ∼= ⊕_{µ`n+1} mµ Sµ, then by
taking characters, χλ ↑^{Sn+1} = ∑_{µ`n+1} mµ χµ.
169
The multiplicities are given by
mµ = 〈χλ ↑^{Sn+1}, χµ〉
= 〈χλ, χµ ↓_{Sn}〉
= 〈χλ, ∑_{µ−} χ_{µ−}〉
= 1 if λ = µ− for some µ−, and 0 otherwise
= 1 if µ = λ+ for some λ+, and 0 otherwise
170
9 Symmetric Functions
9.1 Symmetric functions in general
Let x = (x1, x2, . . .) be a set of indeterminates. A homogeneous
symmetric function of degree n over Q is a formal power series
f(x) = ∑_α cα x^α
where
(a) α = (α1, α2, . . .) ranges over all sequences of non-negative integers
that sum to n (i.e. weak compositions of n)
(b) cα ∈ Q
(c) x^α stands for the monomial x1^{α1} x2^{α2} · · ·
(d) f(x_{w(1)}, x_{w(2)}, . . .) = f(x1, x2, . . .) for every permutation w of the
positive integers.
171
The set of all homogeneous symmetric functions of degree n over Q is
denoted as Λn.
If f ∈ Λm and g ∈ Λn, then it is clear that fg ∈ Λm+n (where fg is a
product of the formal power series). Hence if we define
Λ = Λ0 ⊕ Λ1 ⊕ · · · (vector space direct sum)
Then Λ has the structure of a Q-algebra
172
9.2 Monomial Symmetric Functions
Given λ = (λ1, λ2, . . .) ` n, define a symmetric function mλ(x) ∈ Λn by
mλ = ∑_α x^α
where the sum ranges over all distinct permutations α = (α1, α2, . . .) of
the entries of the vector λ = (λ1, λ2, . . .).
We call mλ a monomial symmetric function. Clearly if
f = ∑_α cα x^α ∈ Λn then f = ∑_{λ`n} cλ mλ. The set {mλ : λ ` n} is a
(vector space) basis for Λn, and hence
dim Λn = p(n),
the number of partitions of n. Moreover the set {mλ : λ ∈ Par} is a
basis for Λ.
173
9.3 Elementary Symmetric Functions
We define the elementary symmetric functions eλ for λ ∈ Par by the
formula
en = m_{1^n} = ∑_{i1<···<in} x_{i1} · · · x_{in}, n ≥ 1 (with e0 = m∅ = 1)
eλ = e_{λ1} e_{λ2} · · · , if λ = (λ1, λ2, . . .)
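A quick computational illustration after truncating to finitely many variables (a Python sketch; the polynomial encoding and names are ours): en is a sum over n-element subsets of the variables, and products eλ can be expanded into monomials.

from itertools import combinations
from collections import Counter

K = 4   # truncate to the variables x1, ..., x4

def poly_mul(p, q):
    # multiply two polynomials stored as {exponent tuple: coefficient}
    out = Counter()
    for ea, ca in p.items():
        for eb, cb in q.items():
            out[tuple(a + b for a, b in zip(ea, eb))] += ca * cb
    return out

def e(n):
    # elementary symmetric polynomial e_n(x1, ..., xK)
    out = Counter()
    for subset in combinations(range(K), n):
        exp = [0] * K
        for i in subset:
            exp[i] = 1
        out[tuple(exp)] += 1
    return out

def e_lambda(lam):
    out = Counter({(0,) * K: 1})
    for part in lam:
        out = poly_mul(out, e(part))
    return out

# e_(2,1) in four variables: the coefficient of x1*x2*x3 is 3
print(e_lambda((2, 1))[(1, 1, 1, 0)])   # -> 3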
174
Suppose that A = (aij)_{i,j≥1} is an integer matrix with finitely many
nonzero entries, with row and column sums
ri = ∑_j aij,  cj = ∑_i aij
Define the row-sum vector row(A) and column-sum vector col(A) by
row(A) = (r1, r2, . . .),  col(A) = (c1, c2, . . .)
A (0, 1)-matrix is a matrix all of whose entries are either 0 or 1.
175
Proposition 9.1. Let λ ` n, and let α = (α1, α2, . . .) be a weak
composition of n. Then the coefficient Mλα of xα in eλ is equal to the
number of (0, 1)-matrices A = (aij)i,j≥1 satisfying row(A) = λ and
col(A) = α. That is,
eλ = ∑_{µ`n} Mλµ mµ (9.1)
176
Corollary 9.2. Let Mλµ be given by Equation (9.1). Then Mλµ = Mµλ.
That is, the transition matrix between the bases {mλ : λ ` n} and
{eλ : λ ` n} is a symmetric matrix.
177
Proposition 9.3. We have
∏_{i,j} (1 + xi yj) = ∑_{λ,µ} Mλµ mλ(x) mµ(y) (9.2)
= ∑_λ mλ(x) eλ(y) (9.3)
Here λ and µ range over Par. (It suffices to take |λ| = |µ|, since
otherwise Mλµ = 0.)
178
Proof. A monomial x1^{α1} x2^{α2} · · · y1^{β1} y2^{β2} · · · = x^α y^β appearing in the
expansion of ∏ (1 + xi yj) is obtained by choosing a (0, 1)-matrix
A = (aij) with finitely many 1's, satisfying
∏_{i,j} (xi yj)^{aij} = x^α y^β
But
∏_{i,j} (xi yj)^{aij} = x^{row(A)} y^{col(A)}
so the coefficient of x^α y^β in the product ∏ (1 + xi yj) is the number of
(0, 1)-matrices satisfying row(A) = α and col(A) = β. Hence equation
(9.2) follows. Equation (9.3) is then a consequence of (9.1).
179
Theorem 9.4. Let λ, µ ` n. Then Mλµ = 0 unless λ′ ⊵ µ, while
Mλλ′ = 1. Hence the set {eλ : λ ` n} is a basis for Λn (so
{eλ : λ ∈ Par} is a basis for Λ). Equivalently, e1, e2, . . . are algebraically
independent and generate Λ as a Q-algebra, which we write as
Λ = Q[e1, e2, . . .]
180
Proof. Suppose Mλµ ≠ 0, so by Proposition 9.1 there is a (0, 1)-matrix
A with row(A) = λ and col(A) = µ. Let A′ be the matrix with
row(A′) = λ and with its 1's left justified, i.e. A′_{ij} = 1 precisely for
1 ≤ j ≤ λi. For any i the number of 1's in the first i columns of A′
clearly is not less than the number of 1's in the first i columns of A, so
by definition of dominance order we have col(A′) ⊵ col(A) = µ.
But col(A′) = λ′, so λ′ ⊵ µ as desired. Moreover it is easy to see that
A′ is the only (0, 1)-matrix with row(A′) = λ and col(A′) = λ′, so
Mλ,λ′ = 1.
181
The previous argument shows the following. Let λ1, λ2, . . . , λp(n) be an
ordering of Par(n) that is compatible with the dominance order and such
that the “reverse conjugate” order (λp(n))′, . . . , (λ2)′, (λ1)′ is also
compatible with the dominance order. (Exercise: give an example of
such an order). Then the matrix (Mλµ), with the row order λ1, λ2, . . .
and column order (λ1)′, (λ2)′, . . ., is upper triangular with 1's on the
main diagonal. Hence it is invertible, so {eλ : λ ` n} is a basis for Λn.
(In fact it is a basis for Λn over Z since the diagonal entries are actually 1's,
and not merely nonzero.)
182
The set {eλ : λ ∈ Par} consists of all monomials e1^{a1} e2^{a2} · · · (where
ai ∈ N, ∑ ai < ∞). Hence the linear independence of {eλ : λ ∈ Par} is
equivalent to the algebraic independence of e1, e2, . . ., as desired.
183
9.4 Complete Homogeneous Symmetric functions
Define the complete homogeneous symmetric functions (or just complete
symmetric functions) hλ for λ ∈ Par by the formulas
hn = ∑_{λ`n} mλ = ∑_{i1≤···≤in} x_{i1} · · · x_{in} (with h0 = m∅ = 1) (9.4)
hλ = h_{λ1} h_{λ2} · · · if λ = (λ1, λ2, . . .)
184
Proposition 9.5. Let λ ` n, and let α = (α1, α2, . . .) be a weak
composition of n. Then the coefficient Nλα of xα in hλ is equal to the
number of N-matrices A = (aij) satisfying row(A) = λ and col(A) = α.
That is,
hλ = ∑_{µ`n} Nλµ mµ, (9.5)
185
Corollary 9.6. Let Nλµ be given by Equation (9.5). Then Nλµ = Nµλ,
i.e. the transition matrix between the bases {mλ : λ ` n} and
{hλ : λ ` n} is a symmetric matrix.
Note that by a Corollary in the next section (Corollary 9.9), it follows
that the set {hλ : λ ` n} is indeed a basis.
186
Proposition 9.7. We have
∏_{i,j} (1 − xi yj)^{−1} = ∑_{λ,µ} Nλµ mλ(x) mµ(y) (9.6)
= ∑_λ mλ(x) hλ(y) (9.7)
where λ and µ range over Par (and where it suffices to take |λ| = |µ|).
187
9.5 An Involution
Since Λ = Q[e1, e2, . . .], an algebra endomorphism f : Λ→ Λ is
determined uniquely by its values f(en), n ≥ 1; and conversely any
choice of (f(en)) ∈ Λ determines an endomorphism f . Define an
endomorphism ω : Λ→ Λ by ω(en) = hn, n ≥ 1. Thus (since ω
preserves multiplication), ω(eλ) = hλ for all partitions λ.
Theorem 9.8. The endomorphism ω is an involution i.e. ω2 = 1 (the
identity automorphism), or equivalently ω(hn) = en. (Thus, ω(hλ) = eλ
for all partitions λ.)
188
Proof. Consider the formal power series
H(t) := ∑_{n≥0} hn t^n ∈ Λ[[t]]
E(t) := ∑_{n≥0} en t^n ∈ Λ[[t]]
Check the identities
H(t) = ∏_n (1 − xn t)^{−1} (9.8)
E(t) = ∏_n (1 + xn t) (9.9)
Hence H(t)E(−t) = 1. Equating the coefficients of t^n on both sides
yields
0 = ∑_{i=0}^{n} (−1)^i ei h_{n−i}, n ≥ 1 (9.10)
189
Conversely, if ∑_{i=0}^{n} (−1)^i ui h_{n−i} = 0 for all n ≥ 1, for certain ui ∈ Λ with u0 = 1, then ui = ei. Now apply ω to Equation (9.10) to obtain
0 = ∑_{i=0}^{n} (−1)^i hi ω(h_{n−i}) = (−1)^n ∑_{i=0}^{n} (−1)^i ω(hi) h_{n−i},
whence ω(hi) = ei as desired.
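Since H(t)E(−t) = 1 already holds after specializing to finitely many variables, identity (9.10) can be checked by machine; a rough Python sketch (helper names ours):

from itertools import combinations, combinations_with_replacement
from collections import Counter

K, N = 4, 3   # number of variables x1..xK, and the degree n to test

def monomial(indices):
    exp = [0] * K
    for i in indices:
        exp[i] += 1
    return tuple(exp)

def e(n):
    # elementary symmetric polynomial e_n(x1, ..., xK) as {exponent tuple: coefficient}
    return Counter(monomial(s) for s in combinations(range(K), n))

def h(n):
    # complete homogeneous symmetric polynomial h_n(x1, ..., xK)
    return Counter(monomial(m) for m in combinations_with_replacement(range(K), n))

def poly_mul(p, q):
    out = Counter()
    for ea, ca in p.items():
        for eb, cb in q.items():
            out[tuple(a + b for a, b in zip(ea, eb))] += ca * cb
    return out

total = Counter()
for i in range(N + 1):
    for exp, c in poly_mul(e(i), h(N - i)).items():
        total[exp] += (-1) ** i * c

assert all(c == 0 for c in total.values())
print("sum_{i=0}^{n} (-1)^i e_i h_{n-i} = 0 holds for n =", N, "in", K, "variables")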
190
Corollary 9.9. The set {hλ : λ ` n} is a basis for Λn (so {hλ : λ ∈ Par} is a basis for Λ). Equivalently, h1, h2, . . . are algebraically independent
and generate Λ as a Q-algebra, which we write as
Λ = Q[h1, h2, . . .]
Proof. Theorem 9.8 shows that the endomorphism ω : Λ→ Λ defined
by ω(en) = hn is invertible (since ω = ω−1), and hence is an
automorphism of Λ. The proof now follows from Theorem 9.4.
191
9.6 Power Sum Symmetric Functions
We define a fourth set pλ of symmetric functions indexed by λ ∈ Par
and called the power sum symmetric functions as follows:
pn = m_{(n)} = ∑_i xi^n, n ≥ 1 (with p0 = m∅ = 1)
pλ = p_{λ1} p_{λ2} · · · if λ = (λ1, λ2, . . .)
192
Proposition 9.10. Let λ = (λ1, . . . , λl) ` n, where l = l(λ), and set
pλ = ∑_{µ`n} Rλµ mµ (9.11)
Let k = l(µ). Then Rλµ is equal to the number of ordered partitions
π = (B1, B2, . . . , Bk) of {1, . . . , l} such that
µj = ∑_{i∈Bj} λi, 1 ≤ j ≤ k (9.12)
193
Proof. Rλµ is the coefficient of x^µ = x1^{µ1} x2^{µ2} · · · in
pλ = (∑ xi^{λ1})(∑ xi^{λ2}) · · ·
To obtain the monomial x^µ in the expansion of the product, we choose
the term x_{ij}^{λj} from each factor ∑ xi^{λj} so that ∏_j x_{ij}^{λj} = x^µ. Define
Br = {j : ij = r}. Then (B1, . . . , Bk) will be an ordered partition of
{1, . . . , l} satisfying Equation (9.12), and conversely every such ordered
partition gives rise to a term x^µ.
194
Corollary 9.11. Let Rλµ be as in Equation (9.11). Then Rλµ = 0 unless µ ⊵ λ, while
Rλλ = ∏_i mi(λ)! (9.13)
where λ = 〈1^{m1(λ)} 2^{m2(λ)} · · · 〉. Hence {pλ : λ ` n} is a basis for Λn (so
{pλ : λ ∈ Par} is a basis for Λ). Equivalently, p1, p2, . . . are algebraically
independent and generate Λ as a Q-algebra, i.e.
Λ = Q[p1, p2, . . .]
195
We now consider the effect of the involution ω on pλ. For any partition
λ = 〈1^{m1(λ)} 2^{m2(λ)} · · · 〉 define
zλ = 1^{m1} m1! 2^{m2} m2! · · · (9.14)
If w ∈ Sn, then the cycle type ρ(w) of w is the partition
ρ(w) = (ρ1, ρ2, . . .) ` n such that the cycle lengths of w (in its
factorization into disjoint cycles) are ρ1, ρ2, . . .. The number of
permutations w ∈ Sn of a fixed cycle type ρ = 〈1^{m1} 2^{m2} · · · 〉 is given by
#{w ∈ Sn : ρ(w) = ρ} = n! / (1^{m1} m1! 2^{m2} m2! · · ·) = n! z_ρ^{−1} (9.15)
The set {v ∈ Sn : ρ(v) = ρ} is just the conjugacy class in Sn containing
w. For any finite group G, the order #Kw of the conjugacy class Kw is
equal to the index [G : C(w)] of the centralizer of w.
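Equation (9.15) is easy to confirm by brute force for a small symmetric group; a Python sketch (names ours):

from itertools import permutations
from collections import Counter
from math import factorial

def cycle_type(perm):
    # Cycle type of a permutation of {0..n-1} given as a tuple of images.
    n, seen, lengths = len(perm), set(), []
    for start in range(n):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def z(lam):
    # z_lambda = prod_i i^{m_i} m_i!, where m_i is the multiplicity of i in lambda.
    out = 1
    for part, mult in Counter(lam).items():
        out *= part ** mult * factorial(mult)
    return out

n = 5
counts = Counter(cycle_type(p) for p in permutations(range(n)))
for lam, size in counts.items():
    assert size == factorial(n) // z(lam)     # Equation (9.15)
print("checked", len(counts), "cycle types in S_5")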
196
Hence:
Proposition 9.12. Let λ ` n. Then zλ is equal to the number of
permutations v ∈ Sn that commute with a fixed wλ of cycle type λ.
197
For a partition λ = 〈1^{m1} 2^{m2} · · · 〉, define
ελ = (−1)^{m2+m4+···} = (−1)^{n−l(λ)} (9.16)
Thus for any w ∈ Sn, ε_{ρ(w)} is +1 if w is an even permutation and −1 otherwise, so the map Sn → {±1} defined by w 7→ ε_{ρ(w)} is the usual
“sign homomorphism”.
198
Proposition 9.13. We have
∏_{i,j} (1 − xi yj)^{−1} = exp ∑_{n≥1} (1/n) pn(x) pn(y)
= ∑_λ z_λ^{−1} pλ(x) pλ(y) (9.17)
∏_{i,j} (1 + xi yj) = exp ∑_{n≥1} (1/n) (−1)^{n−1} pn(x) pn(y)
= ∑_λ z_λ^{−1} ελ pλ(x) pλ(y) (9.18)
199
Proof. We have
log ∏_{i,j} (1 − xi yj)^{−1} = ∑_{i,j} log (1 − xi yj)^{−1}
= ∑_{i,j} ∑_{n≥1} (1/n) xi^n yj^n
= ∑_{n≥1} (1/n) (∑_i xi^n)(∑_j yj^n)
= ∑_{n≥1} (1/n) pn(x) pn(y)
200
Proposition 9.14. Let λ ` n. Then
ωpλ = ελpλ
In other words, pλ is an eigenvector for ω corresponding to the
eigenvalue ελ.
Proof. Regard ω as acting on symmetric functions in the variables
y = (y1, y2, . . .). Those in the variables x are regarded as scalars.
201
Apply ω to Equation (9.17). We obtain
ω ∑_λ z_λ^{−1} pλ(x) pλ(y) = ω ∏_{i,j} (1 − xi yj)^{−1}
= ∑_ν mν(x) ω hν(y) (by (9.7))
= ∑_ν mν(x) eν(y) (by Theorem 9.8)
= ∏_{i,j} (1 + xi yj) (by (9.3))
= ∑_λ z_λ^{−1} ελ pλ(x) pλ(y) (by (9.18))
Since the pλ(x)’s are linearly independent, their coefficients in the first
and last sums of the above chain of equalities must be the same. In
other words, ωpλ(y) = ελpλ(y), as desired.
202
Proposition 9.15. We have
hn = ∑_{λ`n} z_λ^{−1} pλ (9.19)
en = ∑_{λ`n} ελ z_λ^{−1} pλ (9.20)
Proof. Substituting y = (t, 0, 0, . . .) in Equation (9.17) immediately
yields Equation (9.19). Equation (9.20) is similarly obtained from
(9.18), or by applying ω to (9.19).
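Equation (9.19) survives specialization to finitely many variables, so it can also be checked mechanically; a rough Python sketch (names ours):

from fractions import Fraction
from itertools import combinations_with_replacement
from collections import Counter
from math import factorial

K, N = 3, 4   # number of variables, and the degree n

def partitions(n, largest=None):
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def p(n):
    # power sum p_n(x1, ..., xK) as {exponent tuple: coefficient}
    out = Counter()
    for i in range(K):
        exp = [0] * K
        exp[i] = n
        out[tuple(exp)] += 1
    return out

def h(n):
    out = Counter()
    for multiset in combinations_with_replacement(range(K), n):
        exp = [0] * K
        for i in multiset:
            exp[i] += 1
        out[tuple(exp)] += 1
    return out

def poly_mul(a, b):
    out = Counter()
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ca * cb
    return out

def z(lam):
    out = 1
    for part, mult in Counter(lam).items():
        out *= part ** mult * factorial(mult)
    return out

total = Counter()
for lam in partitions(N):
    p_lam = Counter({(0,) * K: 1})
    for part in lam:
        p_lam = poly_mul(p_lam, p(part))
    for exp, c in p_lam.items():
        total[exp] += Fraction(c, z(lam))

expected = h(N)
assert set(total) == set(expected) and all(total[exp] == expected[exp] for exp in expected)
print("h_4 equals the sum of p_lambda / z_lambda over lambda |- 4, in 3 variables")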
203
9.7 A Scalar product
Define a scalar product on Λ by requiring that {mλ} and {hµ} be dual
bases i.e.
〈mλ, hµ〉 = δλµ (9.22)
for all λ, µ ∈ Par. Notice that 〈·, ·〉 respects the grading of Λ, in the
sense that if f and g are homogeneous then 〈f, g〉 = 0 unless
deg f = deg g.
204
Proposition 9.16. The scalar product 〈·, ·〉 is symmetric, i.e.
〈f, g〉 = 〈g, f〉 for all f, g ∈ Λ.
Proof. The result is equivalent to Corollary 9.6. More specifically, it
suffices by linearity to prove 〈f, g〉 = 〈g, f〉 for some bases {f} and {g} of Λ. Take {f} = {g} = {hλ}. Then
〈hλ, hµ〉 = 〈∑_ν Nλν mν, hµ〉 = Nλµ (9.23)
Since Nλµ = Nµλ by Corollary 9.6, we have 〈hλ, hµ〉 = 〈hµ, hλ〉 as
desired.
205
Lemma 9.17. Let {uλ} and {vλ} be bases of Λ such that for all λ ` n we have uλ, vλ ∈ Λn. Then {uλ} and {vλ} are dual bases if and only if
∑_λ uλ(x) vλ(y) = ∏_{i,j} (1 − xi yj)^{−1}
206
Proof. Write mλ = ∑_ρ ζλρ uρ and hµ = ∑_ν ηµν vν. Thus
δλµ = 〈mλ, hµ〉 = ∑_{ρ,ν} ζλρ ηµν 〈uρ, vν〉 (9.24)
For each fixed n ≥ 0, regard ζ and η as matrices indexed by Par(n), and
let A be the matrix defined by Aρν = 〈uρ, vν〉. Then (9.24) is equivalent
to I = ζAη^t, where t denotes the transpose and I is the identity matrix.
Therefore
{uλ} and {vλ} are dual bases ⇐⇒ A = I
⇐⇒ I = ζη^t
⇐⇒ I = ζ^t η
⇐⇒ δρν = ∑_λ ζλρ ηλν (9.25)
207
Now by Proposition 9.7 we have
∏_{i,j} (1 − xi yj)^{−1} = ∑_λ mλ(x) hλ(y)
= ∑_λ (∑_ρ ζλρ uρ(x)) (∑_ν ηλν vν(y))
= ∑_{ρ,ν} (∑_λ ζλρ ηλν) uρ(x) vν(y)
Since the power series uρ(x)vν(y) are linearly independent over Q, the
proof follows from (9.25).
208
Proposition 9.18. We have
〈pλ, pµ〉 = zλδλµ (9.26)
Hence the pλ's form an orthogonal basis of Λ. (They don't form an
orthonormal basis, since 〈pλ, pλ〉 ≠ 1.)
Proof. By Proposition 9.13 and Lemma 9.17 we see that {pλ} and
{pµ/zµ} are dual bases, which is equivalent to (9.26).
209
Corollary 9.19. The scalar product 〈·, ·〉 is positive definite i.e.
〈f, f〉 ≥ 0 for all f ∈ Λ, with equality if and only if f = 0.
Proof. Write (uniquely) f = ∑_λ cλ pλ. Then
〈f, f〉 = ∑ c_λ^2 zλ
The proof follows since each zλ > 0.
210
Proposition 9.20. The involution ω is an isometry, i.e.
〈ωf, ωg〉 = 〈f, g〉 for all f, g ∈ Λ.
Proof. By the bilinearity of the scalar product, it suffices to take f = pλ
and g = pµ. The result then follows from Propositions 9.14 and
9.18.
211
9.8 The Combinatorial Definition of Schur Functions
The fundamental combinatorial objects associated with Schur functions
are semistandard tableaux. Let λ be a partition. A semistandard
(Young) tableau (SSYT) of shape λ is an array T = (Tij) of positive
integers of shape λ (i.e., 1 ≤ i ≤ l(λ), 1 ≤ j ≤ λi) that is weakly
increasing in every row and strictly increasing in every column. The size
of an SSYT is its number of entries.
212
If T is an SSYT of shape λ then we write λ = sh(T ). Hence the size of
T is just |sh(T )|. We may also think of an SSYT of shape λ as the
Ferrers diagram of λ whose boxes have been filled with positive integers
(satisfying certain conditions).
213
We say that T has type α = (α1, α2, . . .), denoted α = type(T ), if T
has αi = αi(T ) parts equal to i. For any SSYT T of type α (or indeed
for any multiset on P with possible additional structure), write
x^T = x1^{α1(T)} x2^{α2(T)} · · ·
214
There is a generalization of SSYTs of shape λ that fits naturally into the
theory of symmetric functions. If λ and µ are partitions with µ ⊆ λ (i.e.
µi ≤ λi for all i), then define a semistandard tableau of (skew) shape
λ/µ to be an array T = (Tij) of positive integers of shape λ/µ (i.e.
1 ≤ i ≤ l(λ), µi < j ≤ λi) that is weakly increasing in every row and
strictly increasing in every column.
215
We can similarly extend the definition of a Ferrers diagram of shape λ to
one of shape λ/µ. Thus an SSYT of shape λ/µ may be regarded as a
Ferrers diagram of shape λ/µ whose boxes have been filled with positive
integers (satisfying certain conditions), just as for “ordinary shapes” λ.
216
The definitions of type(T ) and xT carry over directly from SSYTs T of
ordinary shape to those of skew shape.
Definition 9.21. Let λ/µ be a skew shape. The skew Schur function
sλ/µ = sλ/µ(x) of shape λ/µ in the variables x = (x1, x2, . . .) is the
formal power series
sλ/µ(x) = ∑_T x^T
summed over all SSYTs T of shape λ/µ. If µ = ∅, then we call sλ(x) the Schur function of shape λ.
217
Theorem 9.22. For any skew shape λ/µ, the skew Schur function sλ/µ
is a symmetric function.
Proof. It suffices to show that sλ/µ is invariant under interchanging xi
and xi+1. Suppose that |λ/µ| = n and that α = (α1, α2, . . .) is a weak
composition of n. Let
ᾱ = (α1, α2, . . . , αi−1, αi+1, αi, αi+2, . . .).
If Tλ/µ,α denotes the set of all SSYTs of shape λ/µ and type α, then we
seek a bijection ϕ : Tλ/µ,α → Tλ/µ,ᾱ.
218
Let T ∈ Tλ/µ,α. Consider the parts of T equal to i or i+ 1. Some
columns of T will contain no such parts, while some others will contain
two such parts, viz., one i and i+ 1. These columns we ignore. The
remaining parts equal to i or i+ 1 occur once in each column, and
consist of rows with a certain number r of i’s followed by a certain
number s of i+ 1’s. (Of course r and s depend on the row in question.)
For example, a portion of T could look as follows: a row containing r = 2 entries equal to i followed by s = 4 entries equal to i + 1, with some further entries equal to i in the row above and entries equal to i + 1 in the row below.
219
In each such row convert the r i's and s (i + 1)'s to s i's followed by r (i + 1)'s; in the example above the row then contains s = 4 entries equal to i followed by r = 2 entries equal to i + 1.
It's easy to see that the resulting array ϕ(T ) belongs to Tλ/µ,ᾱ, and that
ϕ establishes the desired bijection.
220
If λ ` n and α is a weak composition of n, then let Kλα denote the
number of SSYTs of shape λ and type α. Kλα is called a Kostka
number. By Definition 9.21 we have
sλ = ∑_α Kλα x^α
summed over all weak compositions α of n, so by Theorem 9.22 we have
sλ = ∑_{µ`n} Kλµ mµ (9.27)
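Kostka numbers are finite counts, so they can be computed by brute force; the following Python sketch (names ours) enumerates the SSYTs of a given shape and type:

from itertools import permutations

def is_ssyt(rows):
    # rows weakly increasing, columns strictly increasing
    for row in rows:
        if any(a > b for a, b in zip(row, row[1:])):
            return False
    for r in range(1, len(rows)):
        for j in range(len(rows[r])):
            if rows[r][j] <= rows[r - 1][j]:
                return False
    return True

def kostka(lam, mu):
    # K_{lambda,mu}: number of SSYTs of shape lambda and type mu, by brute force
    entries = [i + 1 for i, m in enumerate(mu) for _ in range(m)]
    count = 0
    for perm in set(permutations(entries)):
        rows, start = [], 0
        for part in lam:
            rows.append(perm[start:start + part])
            start += part
        if is_ssyt(rows):
            count += 1
    return count

assert kostka((2, 1), (1, 1, 1)) == 2
assert kostka((2, 1), (2, 1)) == 1
assert kostka((3, 2), (1, 1, 1, 1, 1)) == 5   # the five standard tableaux of shape (3, 2)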
221
More generally, we can define the skew Kostka number Kλ/ν,α as the
number of SSYTs of shape λ/ν and type α, so that if |λ/ν| = n then
sλ/ν = ∑_{µ`n} Kλ/ν,µ mµ (9.28)
222
Consider the number Kλ,1n , also denoted by fλ. By definition, fλ is the
number of ways to insert the numbers 1, 2, . . . , n into the shape λ ` n,
each number appearing once, so that every row and column is increasing.
Such an array is called a standard Young tableau (SYT) (or just standard
tableau) of shape λ. The number fλ has several alternative
combinatorial interpretations as given by the following proposition.
223
Proposition 9.23. Let λ ∈ Par. Then the number fλ counts the
objects in items (a)-(e) below. We illustrate these objects with the case
λ = (3, 2).
(a) Chains of partitions. Saturated chains in the interval [∅, λ] of
Young’s lattice Y , or equivalently, sequences ∅ = λ0, λ1, . . . , λn = λ of
partitions (which we identify with their diagrams) such that λi is
obtained from λi−1 by adding a single square.
∅ ⊂ 1 ⊂ 2 ⊂ 3 ⊂ 31 ⊂ 32
∅ ⊂ 1 ⊂ 2 ⊂ 21 ⊂ 31 ⊂ 32
∅ ⊂ 1 ⊂ 2 ⊂ 21 ⊂ 22 ⊂ 32
∅ ⊂ 1 ⊂ 11 ⊂ 21 ⊂ 31 ⊂ 32
∅ ⊂ 1 ⊂ 11 ⊂ 21 ⊂ 22 ⊂ 32
(b) Linear extensions. Let Pλ be the poset whose elements are the
squares of the diagram of λ, with t covering s if t lies directly to the
224
right or directly below s (with no squares in between). Such posets are
just the finite order ideals of N× N. Then fλ = e(Pλ), the number of
linear extensions of Pλ
(c) Ballot sequences. Ways in which n voters can vote sequentially in an
election for candidates A1, A2, . . . , so that for all i, Ai receives λi votes,
and so that Ai never trails Ai+1 in the voting. (We denote such a
voting sequence as a1a2 · · · an, where the k-th voter votes for A_{ak}.)
11122 11212 11221 12112 12121
(d) Lattice permutations. Sequences a1a2 · · · an in which i occurs λi
times, and such that in any left factor a1a2 · · · aj , the number of i’s is at
least as great as the number of (i + 1)'s (for all i). Such a sequence is
called a lattice permutation (or Yamanouchi word or ballot sequence) of
type λ.
11122 11212 11221 12112 12121
225
(e) Lattice paths. Lattice paths 0 = v0, v1, . . . , vn in R^l (where l = l(λ)) from
the origin v0 to vn = (λ1, λ2, . . . , λl), with each step a unit coordinate
vector, and staying within the region (or cone) x1 ≥ x2 ≥ · · · ≥ xl ≥ 0.
226
Define a reverse SSYT or column-strict plane partition of (skew) shape
λ/µ to be an array of positive integers of shape λ/µ that is weakly
decreasing in rows and strictly decreasing in columns. Define the type α
of a reverse SSYT exactly as for ordinary SSYT.
Define K̄λ/µ,α to be the number of reverse SSYTs of shape λ/µ and
type α.
Proposition 9.24. Let λ/µ be a skew partition of n, and let α be a
weak composition of n. Then K̄λ/µ,α = Kλ/µ,α.
Proof. Suppose that T is a reverse SSYT of shape λ and type
α = (α1, α2, . . .). Let k denote the largest part of T . The
transformation Tij 7→ k + 1 − Tij shows that K̄λα = Kλᾱ, where
ᾱ = (αk, αk−1, . . . , α1, 0, 0, . . .). But by Theorem 9.22 we have
Kλᾱ = Kλα, and the proof is complete.
227
Proposition 9.25. Suppose that µ and λ are partitions with |µ| = |λ| and Kλµ ≠ 0. Then λ ⊵ µ. Moreover Kλλ = 1.
Proof. Suppose Kλµ ≠ 0. By definition, there exists an SSYT T of
shape λ and type µ. Suppose that a part Tij = k appears below the
k-th row (i.e. i > k). Then we have 1 ≤ T1j < T2j < · · · < Tij = k with
i > k, which is impossible. Hence the parts 1, 2, . . . , k all appear in the
first k rows, so that µ1 + µ2 + · · · + µk ≤ λ1 + λ2 + · · · + λk, as desired.
Moreover, if µ = λ then we must have Tij = i for all (i, j), so
Kλλ = 1.
228
Corollary 9.27. The Schur functions sλ with λ ∈ Par(n) form a basis
for Λn, so {sλ : λ ∈ Par} is a basis for Λ. In fact, the transition matrix
Kλµ which expresses the sλ’s in terms of the mµ’s, with respect to any
linear ordering of Par(n) that extends dominance order, is lower
triangular with 1’s on the main diagonal.
Proof. Proposition 9.25 is equivalent to the assertion about Kλµ. Since
a lower triangular matrix with 1’s on the main diagonal is invertible, it
follows that {sλ : λ ∈ Par(n)} is a Q-basis for Λn.
230
9.9 The RSK Algorithm
The basic operation of the RSK algorithm consists of the row insertion
P ← k of a positive integer k into a nonskew SSYT P = (Pij). The
operation P ← k is defined as follows: Let r be the largest integer such
that P1,r−1 ≤ k. (If P11 > k then let r = 1.) If P1r doesn’t exist (i.e. P
has r − 1 columns), then simply place k at the end of the first row. The
insertion process stops, and the resulting SSYT is P ← k. If on the
other hand, P has at least r columns, so that P1r exists, then replace
P1r by k. The element k then “bumps” k′ := P1r into the second row, i.e.
insert k′ into the second row of P by the insertion rule just described.
Continue until an element is inserted at the end of a row (possibly as the
first element of the next row). The resulting array is P ← k.
231
Lemma 9.28. (a) When we insert k into an SSYT P , then the insertion
path moves to the left. More precisely, if (r, s), (r + 1, t) ∈ I(P ← k), then t ≤ s.
(b) Let P be an SSYT, and let j ≤ k. Then I(P ← j) lies strictly to the
left of I((P ← j)← k). More precisely, if (r, s) ∈ I(P ← j) and
(r, t) ∈ I((P ← j)← k), then s < t. Moreover, I((P ← j)← k) does
not extend below the bottom of I(P ← j). Equivalently
#I((P ← j)← k) ≤ #I(P ← j)
232
Proof. (a) Suppose that (r, s) ∈ I(P ← k). Now either Pr+1,s > Prs
(since P is strictly increasing in columns) or else there is no (r + 1, s) entry of P . In the first case, Prs cannot get bumped to the right of
column s without violating the fact that the rows of P ← k are weakly
increasing, since Prs would be to the right of Pr+1,s on the same row.
The second case is clearly impossible, since we would otherwise have a
gap in row r + 1. Hence (a) is proved.
(b) Since a number can only bump a strictly larger number, it follows
that k is inserted in the first row of P ← j strictly to the right of j.
Since the first row of P is weakly increasing, j bumps an element no
larger than the element k bumps. Hence by induction I(P ← j) lies
strictly to the left of I((P ← j)← k).
233
The bottom element b of I(P ← j) was inserted at the end of its row.
By what was just proved, if I((P ← j)← k) has an element c in this
row, then it lies to the right of b. Hence c was inserted at the end of the
row, so the insertion procedure terminates. It follows that
I((P ← j)← k) can never go below the bottom of I(P ← j).
234
Corollary 9.29. If P is an SSYT and k ≥ 1, then P ← k is also an
SSYT.
Proof. It is clear that the rows of P ← k are weakly increasing. Now a
number a can only bump a larger number b. By Lemma 9.28(a), b does
not move to the right when it is bumped. Hence b is inserted
below a number that is strictly smaller than b, so P ← k remains an
SSYT.
235
Now let A = (aij) be an N-matrix with finitely many nonzero entries. We
will say that A is an N-matrix of finite support. We can think of A as
either an infinite matrix or as an m× n matrix when aij = 0 for i > m
and j > n.
236
Associate with A a generalized permutation or two-line array wA defined
by
wA = ( i1 i2 i3 . . . im
       j1 j2 j3 . . . jm )   (9.29)
where (a) i1 ≤ i2 ≤ · · · ≤ im
(b) if ir = is, and r ≤ s, then jr ≤ js,
(c) for each pair (i, j), there are exactly aij values of r for which
(ir, jr) = (i, j)
237
It is easily seen that A determines a unique two line array wA satisfying
(a)− (c), and conversely any such array corresponds to a unique A.
238
We now associate with A (or wA) a pair (P,Q) of SSYTs of the same
shape, as follows. Let wA be given by (9.29). Begin with
(P (0), Q(0)) = (∅, ∅) (where ∅ denotes the empty SSYT). If t < m and
(P (t), Q(t)) are defined, then let
(a) P (t+ 1) = P (t)← jt+1;
(b) Q(t+ 1) be obtained from Q(t) by inserting it+1 (leaving all parts of
Q(t) unchanged) so that P (t+ 1) and Q(t+ 1) have the same shape.
The process ends at (P (m), Q(m)), and we define
(P,Q) = (P (m), Q(m)). We denote this correspondence by A →RSK (P,Q)
and call it the RSK algorithm. We call P the insertion tableau and Q
the recording tableau of A or of wA.
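Here is a compact Python sketch of the algorithm just described (the encoding of tableaux as lists of rows and the function names are ours): row_insert implements P ← k, and rsk builds the pair (P, Q) from a two-line array as in (9.29).

def row_insert(P, k):
    # RSK row insertion P <- k.  P is a list of rows (lists); returns (new P, r, c),
    # where (r, c) is the box added at the end of the insertion path.
    P = [row[:] for row in P]
    r = 0
    while True:
        if r == len(P):          # k falls off the bottom: start a new row
            P.append([k])
            return P, r, 0
        row = P[r]
        bump = next((j for j, entry in enumerate(row) if entry > k), None)
        if bump is None:         # k goes at the end of this row; insertion stops
            row.append(k)
            return P, r, len(row) - 1
        row[bump], k = k, row[bump]   # k bumps the leftmost entry larger than k
        r += 1

def rsk(two_line):
    # two_line is a list of pairs (i_t, j_t); returns (P, Q)
    P, Q = [], []
    for i, j in two_line:
        P, r, c = row_insert(P, j)
        if r == len(Q):
            Q.append([])
        Q[r].append(i)           # record i in the box just added to P
    return P, Q

# the permutation w = 3 1 2 5 4, written as a two-line array
w = [(1, 3), (2, 1), (3, 2), (4, 5), (5, 4)]
P, Q = rsk(w)
print(P)   # [[1, 2, 4], [3, 5]]
print(Q)   # [[1, 3, 4], [2, 5]]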
239
Theorem 9.30. The RSK algorithm is a bijection between N-matrices
A = (aij)i,j≥1 of finite support and ordered pairs (P,Q) of SSYT of the
same shape. In this correspondence,
j occurs in P exactly ∑_i aij times (9.30)
i occurs in Q exactly ∑_j aij times (9.31)
(These last two conditions are equivalent to type(P ) = col(A), type(Q) = row(A).)
240
Proof. By Corollary 9.29, P is an SSYT. Clearly, by definition of the
RSK algorithm P and Q have the same shape, and also (9.30) and
(9.31) hold. Thus we must show the following: (a) Q is an SSYT , and
(b) the RSK algorithm is a bijection, i.e., given (P,Q) , one can uniquely
recover A.
To prove (a), first note that since the elements of Q are inserted in
weakly increasing order, it follows that the rows and columns of Q are
weakly increasing. Thus we must show that the columns of Q are
strictly increasing, i.e. no two equal elements of the top row of wA can
end up in the same column of Q. But if ik = ik+1 in the top row, then
we must have jk ≤ jk+1. Hence by Lemma 9.28(b), the insertion path of
jk+1 will always lie strictly to the right of the path for jk, and will never
extend below the bottom of jk's insertion path. It follows that the
bottom elements of the two insertion paths lie in different columns, so
the columns of Q are strictly increasing as desired.
241
The above argument establishes an important property of the RSK
algorithm: Equal elements of Q are inserted strictly left to right.
It remains to show that the RSK algorithm is a bijection. Thus given
(P,Q) = (P (m), Q(m)), let Qrs be the rightmost occurrence of the
largest entry of Q (where Qrs is the element of Q in row r and column
s). Since equal elements of Q are inserted left to right, it follows that
Qrs = im, Q(m− 1) = Q(m) \Qrs (i.e., Q(m) with the element Qrs
deleted), and that Prs was the last element of P to be bumped into
place after inserting jm into P (m− 1). But it is then easy to reverse the
insertion procedure P (m− 1)← jm.
242
Prs must have been bumped by the rightmost element Pr−1,t of row
r − 1 of P that is smaller than Prs. Hence remove Prs from P , replace
Pr−1,t with Prs, and continue by replacing the rightmost element of row
r − 2 of P that is smaller than Pr−1,t with Pr−1,t, etc. Eventually some
element jm is removed from the first row of P . We have thus uniquely
recovered (im, jm) and (P (m− 1), Q(m− 1)). By iterating this
procedure we recover the entire two-line array wA. Hence the RSK
algorithm is injective.
243
To show surjectivity, we need to show that applying the procedure of the
previous paragraph to an arbitrary pair (P,Q) of SSYTs of the same
shape always yields a valid two-line array
wA = ( i1 i2 i3 . . . im
       j1 j2 j3 . . . jm )   (9.32)
Clearly, i1 ≤ i2 ≤ · · · ≤ im, so we need to show that if ik = ik+1 then
jk ≤ jk+1. Let ik = Qrs and ik+1 = Quv, so r ≥ u and s < v. When
we begin to apply inverse bumping to Puv, it occupies the end of its row
(row u).
244
Hence when we apply inverse bumping to Prs, its “inverse insertion
path” intersects row u strictly to the left of the column v. Thus at row
u the inverse insertion path of Prs lies strictly to the left of that of Puv.
By a simple induction argument (essentially the “inverse” of Lemma
9.28(b)), the entire inverse insertion path of Prs lies strictly to the left
of that of Puv. In particular, before removing ik+1 the two elements jk
and jk+1 appear in the first row with jk to the left of jk+1. Hence
jk ≤ jk+1 as desired, completing the proof.
245
When the RSK algorithm is applied to a permutation matrix A (or a
permutation w ∈ Sn), the resulting tableaux P,Q are just standard
Young tableaux (of the same shape). Conversely, if P and Q are SYTs
of the same shape, then the matrix A satisfying A →RSK (P,Q) is a
permutation matrix. Hence the RSK algorithm sets up a bijection
between the symmetric group Sn and pairs (P,Q) of SYTs of the same
shape λ ` n. In particular, if fλ denotes the number of SYTs of shape
λ, then we have the fundamental identity
∑_{λ`n} (fλ)^2 = n! (9.33)
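Identity (9.33) is easy to confirm by brute force for small n; a Python sketch (names ours):

from itertools import permutations
from math import factorial

def partitions(n, largest=None):
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def f(lam):
    # Number of standard Young tableaux of shape lam (brute force).
    n = sum(lam)
    count = 0
    for perm in permutations(range(1, n + 1)):
        rows, start = [], 0
        for part in lam:
            rows.append(perm[start:start + part])
            start += part
        rows_ok = all(row[j] < row[j + 1] for row in rows for j in range(len(row) - 1))
        cols_ok = all(rows[r][j] < rows[r + 1][j]
                      for r in range(len(rows) - 1) for j in range(len(rows[r + 1])))
        if rows_ok and cols_ok:
            count += 1
    return count

n = 5
assert sum(f(lam) ** 2 for lam in partitions(n)) == factorial(n)   # Equation (9.33)
print("sum of (f^lambda)^2 over lambda |- 5 equals 5! = 120")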
246
Although permutation matrices are very special cases of N-matrices of finite
support, in fact the RSK algorithm for arbitrary N-matrices A can be
reduced to the case of permutation matrices. Namely, given the two-line
array wA, say of length n, replace the first row by 1, 2, . . . , n. Suppose
the second row of wA has ci i's. Then replace the 1's in the second row
from left to right with 1, 2, . . . , c1, next the 2's from left to right with
c1 + 1, c1 + 2, . . . , c1 + c2, etc., until the second row becomes a
permutation of 1, 2, . . . , n. Denote the resulting two-line array by w̃A.
247
Lemma 9.31. Let
wA = ( i1 i2 i3 . . . in
       j1 j2 j3 . . . jn )
be a two-line array, and let
w̃A = ( 1 2 3 . . . n
        j̃1 j̃2 j̃3 . . . j̃n )
Suppose that w̃A →RSK (P̃, Q̃). Let (P,Q) be the tableaux obtained from
P̃ and Q̃ by replacing each entry j̃k of P̃ by jk, and each entry k of Q̃ by ik. Then
wA →RSK (P,Q). In other words, the operation wA 7→ w̃A “commutes”
with the RSK algorithm.
248
Proof. Suppose that when the number j is inserted into a row at some
stage of the RSK algorithm, it occupies the k-th position in the row. If
this number j were replaced by a larger number j + ε, smaller than any
element of the row which is greater than j, then j + ε would also be
inserted at the k-th position. From this we see that the insertion
procedure for the elements j̃1, j̃2, . . . , j̃n exactly mimics that for
j1, j2, . . . , jn, and the proof follows.
The process of replacing wA with w̃A, P with P̃, etc., is called
standardization.
249
9.10 Some consequences of the RSK algorithm
Theorem 9.32 (Cauchy identity). We have
∏_{i,j} (1 − xi yj)^{−1} = ∑_λ sλ(x) sλ(y) (9.34)
Proof. Write
∏_{i,j} (1 − xi yj)^{−1} = ∏_{i,j} ∑_{aij≥0} (xi yj)^{aij} (9.35)
A term x^α y^β in this expansion is obtained by choosing an N-matrix
A = (aij) of finite support with row(A) = α and
col(A) = β.
250
Hence the coefficient of x^α y^β in (9.35) is the number Nαβ of
N-matrices A with row(A) = α and col(A) = β. This statement is also
equivalent to (9.6). On the other hand the coefficient of x^α y^β in
∑_λ sλ(x) sλ(y) is the number of pairs (P,Q) of SSYT of the same shape λ
such that type(P ) = α and type(Q) = β. The RSK algorithm sets up a
bijection between the matrices A and the tableau pairs (P,Q), so the
proof follows.
251
Corollary 9.33. The Schur functions form an orthonormal basis for Λ,
i.e. 〈sλ, sµ〉 = δλµ
Proof. Combine Corollary 9.27 and Lemma 9.17.
252
Corollary 9.34. Fix partitions µ, ν ` n. Then
∑_{λ`n} Kλµ Kλν = Nµν = 〈hµ, hν〉
where Kλµ and Kλν denote Kostka numbers, and Nµν is the number of
N-matrices A with row(A) = µ and col(A) = ν.
Proof. Take the coefficient of xµyν on both sides of (9.34).
253
Corollary 9.35. We have
hµ = ∑_λ Kλµ sλ (9.36)
In other words, if M(u, v) denotes the transition matrix from the basis
{vλ} to the basis {uλ} of Λ (so that uλ = ∑_µ M(u, v)λµ vµ), then
M(h, s) = M(s,m)^t
We give three proofs of this corollary, all essentially equivalent.
First proof. Let hµ = ∑_λ aλµ sλ. By Corollary 9.33, we have
aλµ = 〈hµ, sλ〉. Since 〈hµ, mν〉 = δµν by definition (9.22) of the scalar
product 〈·, ·〉, we have from Equation (9.27) that 〈hµ, sλ〉 = Kλµ.
254
Second proof. Fix µ. Then
hµ = ∑_A x^{col(A)}
= ∑_{(P,Q)} x^P by the RSK algorithm
= ∑_λ Kλµ ∑_P x^P
= ∑_λ Kλµ sλ
where (i) A ranges over all N-matrices with row(A) = µ,
(ii) (P,Q) ranges over all pairs of SSYT of the same shape with
type(Q) = µ, and
(iii) P ranges over all SSYT of shape λ.
255
Third proof. Take the coefficient of mµ(x) on both sides of the identity
∑_λ mλ(x) hλ(y) = ∑_λ sλ(x) sλ(y)
The two sides are equal by (9.7) and (9.34).
256
Corollary 9.36. We have
h1^n = ∑_{λ`n} fλ sλ (9.37)
Proof. Take the coefficient of x1 x2 · · · xn on both sides of (9.34). To
obtain a bijective proof, consider the RSK algorithm A →RSK (P,Q) when
col(A) = 〈1^n〉.
257
9.11 Symmetry of the RSK algorithm
Theorem 9.37. Let A be an N-matrix of finite support, and suppose
that A →RSK (P,Q). Then A^t →RSK (Q,P ), where t denotes the
transpose.
Proof. Let wA = (u ; v) (top row u, bottom row v) be the two-line array associated to A. Hence
w_{A^t} = (v ; u) sorted, i.e., sort the columns of (v ; u) so that the columns are
weakly increasing in lexicographic order. It follows from Lemma 9.31
that we may assume u and v have no repeated elements.
258
Consider
wA = ( u1 . . . un
       v1 . . . vn ) = (u ; v)
Supposing the ui's and vj's are distinct, define the inversion poset
I = I(A) = I((u ; v)) as follows. The vertices of I are the columns of
(u ; v).
For notational convenience, we denote a column (a ; b) simply by ab. Define ab < cd
in I if a < c and b < d.
259
Lemma 9.38. The map ϕ : I(A) → I(A^t) defined by ϕ(ab) = ba is
an isomorphism of posets.
260
Now given the inversion poset I = I(A), define I1 to be the set of
minimal elements of I, then I2 to be the set of minimal elements of
I − I1, then I3 to be the set of minimal elements of I − I1 − I2, etc.
Note that since Ii is an antichain of I, its elements can be labeled
(u_{i1}, v_{i1}), (u_{i2}, v_{i2}), . . . , (u_{i n_i}, v_{i n_i}) (9.38)
where ni = #Ii, such that
u_{i1} < u_{i2} < · · · < u_{i n_i}
v_{i1} > v_{i2} > · · · > v_{i n_i} (9.39)
261
Lemma 9.39. Let I1, . . . , Id be the (nonempty) antichains defined
above, labeled as in (9.39). Let A →RSK (P,Q). Then the first row of P
is v_{1n_1} v_{2n_2} · · · v_{dn_d}, while the first row of Q is u_{11} u_{21} · · · u_{d1}. Moreover,
if (u_k, v_k) ∈ Ii, then v_k is inserted into the i-th column of the first row
of the tableau P (k − 1) in the RSK algorithm.
Proof. Induction on n, the case n = 1 being trivial. Assume the
assertion for n − 1, and let
(u ; v) = ( u1 u2 · · · un
            v1 v2 · · · vn ),
(u′ ; v′) = ( u1 u2 · · · u_{n−1}
              v1 v2 · · · v_{n−1} )
262
Let P (n − 1), Q(n − 1) be the tableaux obtained after inserting
v1, . . . , v_{n−1}, and let the antichains I′_i := Ii((u′ ; v′)), 1 ≤ i ≤ e (where
e = d − 1 or e = d), be given by (u_{i1}, v_{i1}), . . . , (u_{im_i}, v_{im_i}) where
u_{i1} < · · · < u_{im_i} and v_{i1} > · · · > v_{im_i}. By the induction hypothesis, the
first row of P (n − 1) is v_{1m_1} v_{2m_2} · · · v_{em_e}, while the first row of Q(n − 1) is
u_{11} u_{21} · · · u_{e1}. Now we insert vn into P (n − 1). If v_{im_i} > vn, then
I′_i ∪ {(un, vn)} is an antichain of I((u ; v)). Hence (un, vn) ∈ Ii((u ; v)) if i is
the least index for which v_{im_i} > vn. If there is no such i, then (un, vn)
is the unique element of the antichain Id((u ; v)) of I((u ; v)). These
conditions mean that vn is inserted into the i-th column of P (n − 1), as
claimed. We start a new column exactly when vn = v_{d1}, in which
case un = u_{d1}, so un is inserted into the d-th column of the first row of
Q(n − 1), as desired.
263
Proof of Theorem 9.37. If the antichain Ii((u ; v)) is given by (9.38) such
that (9.39) is satisfied, then by Lemma 9.38 the antichain Ii((v ; u)) is just
(v_{im_i}, u_{im_i}), . . . , (v_{i2}, u_{i2}), (v_{i1}, u_{i1})
where
v_{im_i} < · · · < v_{i2} < v_{i1}
u_{im_i} > · · · > u_{i2} > u_{i1}
Hence by Lemma 9.39, if A^t →RSK (P′, Q′), then the first row of P′ is
u_{11} u_{21} · · · u_{d1}, and the first row of Q′ is v_{1m_1} v_{2m_2} · · · v_{dm_d}. Thus
the first rows of P′ and Q′ agree with the first rows of Q
and P, respectively.
264
When the RSK algorithm is applied to (u ; v), the element v_{ij}, 1 ≤ j < m_i,
gets bumped into the second row of P before the element v_{rs}, 1 ≤ s < m_r,
if and only if u_{i,j+1} < u_{r,s+1}. Let P̄ and Q̄ denote P and Q with their
first rows removed. It follows that
(a ; b) := ( u_{12} · · · u_{1m_1} u_{22} · · · u_{2m_2} · · · u_{d2} · · · u_{dm_d}
             v_{11} · · · v_{1m_1−1} v_{21} · · · v_{2m_2−1} · · · v_{d1} · · · v_{dm_d−1} ) sorted
→RSK (P̄, Q̄)
265
Similarly let (P̄′, Q̄′) denote P′ and Q′ with their first rows removed.
Applying the same argument to (v ; u) rather than (u ; v) yields
(a′ ; b′) := ( v_{1m_1−1} · · · v_{11} v_{2m_2−1} · · · v_{21} · · · v_{dm_d−1} · · · v_{d1}
               u_{1m_1} · · · u_{12} u_{2m_2} · · · u_{22} · · · u_{dm_d} · · · u_{d2} ) sorted
→RSK (P̄′, Q̄′)
But (a ; b) = (b′ ; a′) sorted, so by induction on n (or on the number of rows)
we have (P̄′, Q̄′) = (Q̄, P̄) and the proof follows.
266
Corollary 9.40. Let A be an N-matrix of finite support, and let
A →RSK (P,Q). Then A is symmetric (i.e. A = A^t) if and only if P = Q.
Proof. Immediate from the fact that A^t →RSK (Q,P ).
267
Corollary 9.41. Let A = A^t and A →RSK (P, P ), and let
α = (α1, α2, . . .) where αi ∈ N and ∑ αi < ∞. Then the map A 7→ P
establishes a bijection between symmetric N-matrices with row(A) = α
and SSYTs of type α.
Proof. Follows from Corollary 9.40 and Theorem 9.30.
268
Corollary 9.42. We have
1 / (∏_i (1 − xi) · ∏_{i<j} (1 − xi xj)) = ∑_λ sλ(x) (9.40)
summed over all λ ∈ Par.
Proof. The coefficient of xα on the left side is the number of symmetric
N-matrices A with row(A) = α while the coefficient of xα on the right
hand side is the number of SSYTs of type α. Now apply Corollary
9.41.
269
Corollary 9.43. We have
∑_{λ`n} fλ = #{w ∈ Sn : w^2 = 1},
the number of involutions in Sn.
Proof. Let w ∈ Sn and w →RSK (P,Q) where P and Q are SYT of the
same shape λ ` n. The permutation matrix corresponding to w is
symmetric if and only if w2 = 1. By Theorem 9.37 this is the case if and
only if P = Q, and the proof follows.
Alternatively, take the coefficient of x1 · · · xn on both sides of
(9.40).
270
9.12 The dual RSK Algorithm
There is a variation of the RSK algorithm that is related to the product
∏ (1 + xi yj) in the same way that the RSK algorithm itself is related to
∏ (1 − xi yj)^{−1}. We call this variation the dual RSK algorithm and
denote it by A →RSK∗ (P,Q). The matrix A will now be a (0, 1)-matrix of
finite support. Form the two-line array wA just as before. The RSK∗
algorithm proceeds exactly like the RSK algorithm, except that an
element i bumps the leftmost element ≥ i, rather than the leftmost
element > i. (In particular, RSK and RSK∗ agree for permutation
matrices.) It follows that each row of P is strictly increasing.
271
Theorem 9.44. The RSK∗ algorithm is a bijection between
(0, 1)-matrices A of finite support and pairs (P,Q) such that P t (the
transpose of P ) and Q are SSYTs with sh(P ) = sh(Q). Moreover,
col(A) = type(P ) and row(A) = type(Q).
272
Theorem 9.45. We have
∏_{i,j} (1 + xi yj) = ∑_λ sλ′(x) sλ(y)
273
Lemma 9.46. Let ωy denote ω acting on the y variables only (so we
regard the xi’s as constants commuting with ω). Then
ωy ∏ (1 − xi yj)^{−1} = ∏ (1 + xi yj)
Proof. We have
ωy ∏ (1 − xi yj)^{−1} = ωy ∑_λ mλ(x) hλ(y) (by Proposition 9.7)
= ∑_λ mλ(x) eλ(y) (by Theorem 9.8)
= ∏ (1 + xi yj) (by Proposition 9.3)
274
Theorem 9.47. For every λ ∈ Par we have
ωsλ = sλ′
Proof. We have
∑_λ sλ(x) sλ′(y) = ∏ (1 + xi yj) (by Theorem 9.45)
= ωy ∏ (1 − xi yj)^{−1} (by Lemma 9.46)
= ωy ∑_λ sλ(x) sλ(y) (by Theorem 9.32)
= ∑_λ sλ(x) ωy(sλ(y))
Take the coefficient of sλ(x) on both sides. Since the sλ(x)’s are linearly
independent, we obtain sλ′(y) = ωy(sλ(y)), or just sλ′ = ωsλ.
275
9.13 The Classical definition of the Schur functions
Let α = (α1, α2, . . . , αn) ∈ N^n and w ∈ Sn. As usual write
x^α = x1^{α1} · · · xn^{αn} and define
w(x^α) = x1^{α_{w(1)}} · · · xn^{α_{w(n)}}
Now define
aα = aα(x1, . . . , xn) = ∑_{w∈Sn} εw w(x^α) (9.41)
where
εw = 1 if w is an even permutation, and εw = −1 if w is an odd permutation.
276
Note that the right-hand side of equation (9.41) is just the expansion of
a determinant, namely
aα = det(xi^{αj})_{i,j=1}^n
Note also that aα is skew-symmetric, i.e. w(aα) = εw aα, so aα = 0 unless all the αi's are distinct. Hence assume that
α1 > α2 > · · · > αn ≥ 0, so α = λ + δ, where λ ∈ Par, l(λ) ≤ n, and
δ = δn = (n − 1, n − 2, . . . , 0). Since αj = λj + n − j, we get
aα = a_{λ+δ} = det(xi^{λj+n−j})_{i,j=1}^n (9.42)
277
For instance,
a421 = a_{211+210} = det ( x1^4 x1^2 x1
                           x2^4 x2^2 x2
                           x3^4 x3^2 x3 )
Note in particular that
aδ = det(xi^{n−j}) = ∏_{1≤i<j≤n} (xi − xj) (9.43)
the Vandermonde determinant. If for some i ≠ j we put xi = xj in aα,
then because aα is skew-symmetric (or because the i-th row and j-th
row of the determinant (9.42) become equal), we obtain 0.
278
Hence aα is divisible by xi − xj and thus by aδ (in the ring
Z[x1, . . . , xn]). Thus aα/aδ ∈ Z[x1, . . . , xn]. Moreover, since aα and aδ
are skew-symmetric, the quotient is symmetric, and is clearly
homogeneous of degree |α| − |δ| = |λ|. In other words, aα/aδ ∈ Λn^{|λ|}.
279
Theorem 9.48. We have
aλ+δ/aδ = sλ(x1, . . . , xn)
280
Proof. There are many proofs of this result. We give one that can be
extended to give an important result on skew Schur functions (Theorem
9.51).
Applying ω to (9.36) and replacing λ by λ′ yields
eµ = ∑_λ Kλ′µ sλ
Since the matrix (Kλ′µ) is invertible, it suffices to show that
eµ(x1, . . . , xn) = ∑_λ Kλ′µ a_{λ+δ} / aδ
or equivalently (always working with n variables),
aδ eµ = ∑_λ Kλ′µ a_{λ+δ} (9.44)
281
Since both sides of (9.44) are skew-symmetric, it is enough to show that
the coefficient of x^{λ+δ} in aδ eµ is Kλ′µ. We multiply aδ by eµ by
successively multiplying by e_{µ1}, e_{µ2}, . . .. Each partial product aδ e_{µ1} · · · e_{µk}
is skew-symmetric, so any term x1^{i1} · · · xn^{in} appearing in aδ e_{µ1} · · · e_{µk} has
all exponents ij distinct. When we multiply such a term x1^{i1} · · · xn^{in} by a
term x_{m1} · · · x_{mj} from e_{µ_{k+1}} (so j = µ_{k+1}), either two exponents
become equal, or the exponents maintain their relative order. If two
exponents become equal then that term disappears from aδ e_{µ1} · · · e_{µ_{k+1}}.
Hence to get the term x^{λ+δ}, we must start with the term x^δ in aδ and
successively multiply by a term x^{α^1} of e_{µ1}, then x^{α^2} of e_{µ2}, etc., keeping
the exponents strictly decreasing. The number of ways to do this is the
coefficient of x^{λ+δ} in aδ eµ.
282
Given the terms x^{α^1}, x^{α^2}, . . . as above, define an SSYT
T = T (α^1, α^2, . . .) as follows: Column j of T contains an i if the
variable xj occurs in x^{α^i} (i.e. the j-th coordinate of α^i is equal to 1).
For example, suppose n = 4, λ = 5332, λ′ = 44311, λ + δ = 8542, µ = 3222211,
x^{α^1} = x1x2x3, x^{α^2} = x1x2, x^{α^3} = x3x4, x^{α^4} = x1x2, x^{α^5} = x1x4, x^{α^6} = x1, x^{α^7} = x3. Then T is given by
1 1 1 3
2 2 3 5
4 4 7
5
6
283
It is easy to see that the map (α^1, α^2, . . .) 7→ T (α^1, α^2, . . .) gives a
bijection between ways of building up the term x^{λ+δ} from x^δ (according
to the rules above) and SSYT of shape λ′ and type µ, so the proof
follows.
284
From the combinatorial definition of Schur functions it is clear that
sλ(x1, . . . , xn) = 0 if l(λ) > n. It is not hard to check that
dim (Λn) = #{λ ∈ Par : l(λ) ≤ n}. It follows that the set
{sλ(x1, . . . , xn) : l(λ) ≤ n} is a basis for Λn. (This also follows from a
simple extension of the proof of Corollary 9.27.) We define on Λn a
scalar product 〈·, ·〉n by requiring that {sλ(x1, . . . , xn)} is an orthonormal
basis. If f, g ∈ Λ, then we write 〈f, g〉n as short for
〈f(x1, . . . , xn), g(x1, . . . , xn)〉n. Thus
〈f, g〉 = 〈f, g〉n
provided that every monomial appearing in f involves at most n distinct
variables, e.g., if deg f ≤ n.
285
Corollary 9.49. If f ∈ Λn, l(λ) ≤ n, and δ = (n − 1, n − 2, . . . , 1, 0), then
〈f, sλ〉n = [xλ+δ]aδf
the coefficient of xλ+δ in aδf .
Proof. All functions will be in the variables x1, . . . , xn. Let
f = ∑_{l(λ)≤n} cλ sλ.
Then by Theorem 9.48 we have
aδ f = ∑_{l(λ)≤n} cλ aλ+δ,
so
〈f, sλ〉n = cλ = [x^{λ+δ}] aδ f.
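As a tiny illustration of the corollary (a sketch of mine, not part of the notes), one can compute 〈f, sλ〉n by literally multiplying f by aδ and reading off a single coefficient. Polynomials are stored as dictionaries from exponent vectors to integer coefficients; the test case f = h2 in three variables with λ = (2) is chosen so that the expected answer is 1, since h2 = s2 + s11.

from collections import defaultdict
from itertools import permutations

def poly_mul(p, q):
    """Multiply two polynomials given as {exponent tuple: coefficient} dictionaries."""
    out = defaultdict(int)
    for a, ca in p.items():
        for b, cb in q.items():
            out[tuple(x + y for x, y in zip(a, b))] += ca * cb
    return dict(out)

def sign(perm):
    inv = sum(1 for i in range(len(perm)) for j in range(i) if perm[j] > perm[i])
    return -1 if inv % 2 else 1

def a_delta(n):
    """a_delta = sum over w in S_n of sign(w) x^{w(delta)}, delta = (n-1, ..., 1, 0)."""
    delta = tuple(range(n - 1, -1, -1))
    return {tuple(delta[w[i]] for i in range(n)): sign(w) for w in permutations(range(n))}

n = 3
h2 = {(2, 0, 0): 1, (0, 2, 0): 1, (0, 0, 2): 1,   # h_2 = sum of all degree-2 monomials
      (1, 1, 0): 1, (1, 0, 1): 1, (0, 1, 1): 1}
lam, delta = (2, 0, 0), (2, 1, 0)
target = tuple(l + d for l, d in zip(lam, delta))
print(poly_mul(a_delta(n), h2).get(target, 0))    # 1, i.e. <h2, s2>_3 = 1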
286
Let us now consider a “skew generalization” of Theorem 9.48. We
continue to work in n variables x1, . . . , xn. For any
λ, ν ∈ Par with l(λ) ≤ n, l(ν) ≤ n, consider the expansion
sν eµ = ∑_λ L^{λ′}_{ν′µ} sλ,
or equivalently (multiplying by aδ),
aν+δ eµ = ∑_λ L^{λ′}_{ν′µ} aλ+δ     (9.45)
Arguing as in the proof of Theorem 9.48 shows that L^{λ′}_{ν′µ} is equal to the
number of ways to write
λ + δ = ν + δ + α1 + α2 + · · · + αk,
287
Here l(µ) = k, each αi is a (0, 1)-vector with µi 1's, and each partial
sum ν + δ + α1 + · · · + αi has strictly decreasing coordinates. Define a
skew SSYT T = Tλ′/ν′(α1, . . . , αk) of shape λ′/ν′ and type µ by the
condition that i appears in column j of T if the j-th coordinate of αi is
a 1. This establishes a bijection which shows that L^{λ′}_{ν′µ} is equal to the
skew Kostka number Kλ′/ν′,µ, the number of skew SSYT of shape
λ′/ν′ and type µ (see Equation (9.28)). (If ν′ ⊈ λ′ then this number is
0.)
Corollary 9.50. We have
sν eµ = ∑_λ Kλ′/ν′,µ sλ.     (9.46)
Proof. Divide (9.45) by aδ and let n→∞.
288
Theorem 9.51. For any f ∈ Λ, we have
〈fsν , sλ〉 = 〈f, sλ/ν〉.
In other words, the two linear transformations Mν : Λ→ Λ and
Dν : Λ→ Λ defined by Mνf = sνf and Dνsλ = sλ/ν are adjoint with
respect to the scalar product 〈·, ·〉. In particular
〈sµsν , sλ〉 = 〈sµ, sλ/ν〉. (9.47)
289
Proof. Apply ω to (9.46) and replace ν by ν′ and λ by λ′. We obtain
sν hµ = ∑_λ Kλ/ν,µ sλ.
Hence
〈sν hµ, sλ〉 = Kλ/ν,µ = 〈hµ, sλ/ν〉,     (9.48)
by (9.28) and the fact that 〈hµ, mρ〉 = δµρ by definition of 〈·, ·〉. But
equation (9.48) is linear in hµ, so since {hµ} is a basis for Λ, the proof
follows.
290
Theorem 9.52. For any λ, ν ∈ Par we have ωsλ/ν = sλ′/ν′ .
Proof. By equation (9.47) and the fact that ω is an isometry, we have
〈ω(sµsν), ωsλ〉 = 〈ωsµ, ωsλ/ν〉.
Hence by Theorem 9.47 we get
〈sµ′sν′ , sλ′〉 = 〈sµ′ , ωsλ/ν〉 (9.49)
On the other hand, substituting λ′, µ′, ν′ for λ, µ, ν respectively in (9.47)
yields
〈sµ′sν′ , sλ′〉 = 〈sµ′ , sλ′/ν′〉 (9.50)
Comparing Equations (9.49) and (9.50), and using that the sµ′ form a basis of Λ, we conclude ωsλ/ν = sλ′/ν′.
291
9.14 The Jacobi-Trudi Identity
Theorem 9.53. Let λ = (λ1, . . . , λn) and µ = (µ1, . . . , µn) ⊆ λ. Then
sλ/µ = det(h_{λi−µj−i+j})_{i,j=1}^n     (9.51)
where we set h0 = 1 and hk = 0 for k < 0.
292
Proof. Let c^λ_{µν} = 〈sλ, sµ sν〉, so
sµ sν = ∑_λ c^λ_{µν} sλ,     sλ/µ = ∑_ν c^λ_{µν} sν.
Then
∑_λ sλ/µ(x) sλ(y) = ∑_{λ,ν} c^λ_{µν} sν(x) sλ(y)
= ∑_ν sν(x) sµ(y) sν(y)
= sµ(y) ∑_ν sν(x) sν(y)
= sµ(y) ∑_ν hν(x) mν(y)
293
Let y = (y1, . . . , yn). Multiplying by aδ(y) gives
∑_λ sλ/µ(x) aλ+δ(y) = (∑_ν hν(x) mν(y)) aµ+δ(y)
= (∑_{α∈N^n} hα(x) y^α)(∑_{w∈Sn} εw y^{w(µ+δ)})
= ∑_{w∈Sn} ∑_α εw hα(x) y^{α+w(µ+δ)}
Now take the coefficient of y^{λ+δ} on both sides (so we are looking for
terms where λ + δ = α + w(µ + δ)). We get
sλ/µ(x) = ∑_{w∈Sn} εw h_{λ+δ−w(µ+δ)}(x)     (9.52)
= det(h_{λi−µj−i+j}(x))_{i,j=1}^n
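Here is a quick numerical illustration of (9.51) in the case µ = ∅ (my own sketch; the partition λ = (3, 1) and the evaluation point are arbitrary). It evaluates the h-determinant at a rational point and checks it against the bialternant formula of Theorem 9.48.

from fractions import Fraction
from itertools import combinations_with_replacement

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def h(k, x):
    """Complete homogeneous symmetric function h_k at the point x (h_k = 0 for k < 0)."""
    if k < 0:
        return Fraction(0)
    if k == 0:
        return Fraction(1)
    total = Fraction(0)
    for c in combinations_with_replacement(x, k):
        term = Fraction(1)
        for v in c:
            term *= v
        total += term
    return total

lam = (3, 1)
x = [Fraction(2), Fraction(3), Fraction(5)]

m = len(lam)                                     # Jacobi-Trudi determinant of size l(lambda)
jt = det([[h(lam[i] - i + j, x) for j in range(m)] for i in range(m)])

n = len(x)                                       # bialternant check: pad lambda to n parts
full = list(lam) + [0] * (n - len(lam))
delta = list(range(n - 1, -1, -1))
num = det([[x[i] ** (full[j] + delta[j]) for j in range(n)] for i in range(n)])
den = det([[x[i] ** delta[j] for j in range(n)] for i in range(n)])
print(jt, num / den)                             # the two values agree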
294
Corollary 9.54 (Dual Jacobi-Trudi identity). Let µ ⊆ λ with λ1 ≤ n.
Then
sλ/µ = det(e_{λ′i−µ′j−i+j})_{i,j=1}^n     (9.53)
295
Let fλ/µ be the number of SYT of shape λ/µ.
Corollary 9.55. Let |λ/µ| = N and l(λ) ≤ n. Then
fλ/µ = N! det(1/(λi − µj − i + j)!)_{i,j=1}^n     (9.54)
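Equation (9.54) is easy to use directly, with the convention that 1/k! is read as 0 for k < 0. The short sketch below (mine, with small test shapes) computes fλ for λ = (3, 2), which has 5 standard Young tableaux, and fλ/µ for the skew shape (2, 2)/(1), which has 2.

from fractions import Fraction
from math import factorial

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def num_syt(lam, mu=()):
    """f^{lambda/mu} = N! det(1/(lambda_i - mu_j - i + j)!), with 1/k! read as 0 for k < 0."""
    n = len(lam)
    mu = list(mu) + [0] * (n - len(mu))
    N = sum(lam) - sum(mu)
    M = [[Fraction(1, factorial(lam[i] - mu[j] - i + j))
          if lam[i] - mu[j] - i + j >= 0 else Fraction(0)
          for j in range(n)] for i in range(n)]
    return factorial(N) * det(M)

print(num_syt((3, 2)))         # 5
print(num_syt((2, 2), (1,)))   # 2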
296
9.15 The Murnaghan-Nakayama Rule
A skew shape λ/µ is connected if the interior of the diagram of λ/µ,
regarded as a union of solid squares, is a connected (open) set. A border
strip (or rim hook or ribbon) is a connected skew shape with no 2 × 2 square.
Given positive integers a1, . . . , ak, there is a unique border strip λ/µ (up
to translation) with ai squares in row i (i.e. ai = λi − µi). It follows
that the number of border strips of size n (up to translation) is 2^{n−1}.
Define the height ht(B) of a border strip B to be one less than its
number of rows.
297
Theorem 9.56. For any µ ∈ Par and r ∈ N we have
sµ pr = ∑_λ (−1)^{ht(λ/µ)} sλ,     (9.55)
summed over all partitions λ ⊇ µ for which λ/µ is a border strip of size
r.
298
Proof. Let δ = (n − 1, n − 2, . . . , 0), and let all functions be in the
variables x1, . . . , xn. In equation (9.41) let α = µ + δ and multiply by pr.
We get
aµ+δ pr = ∑_{j=1}^n a_{µ+δ+rεj},     (9.56)
where εj is the sequence with a 1 in the j-th place and 0 elsewhere.
Arrange the sequence µ + δ + rεj in descending order. If it has two
terms equal, then it contributes nothing to (9.56). Otherwise there is
some p ≤ q for which
µ_{p−1} + n − p + 1 > µq + n − q + r > µp + n − p,
in which case a_{µ+δ+rεj} = (−1)^{q−p} a_{λ+δ}, where λ is the partition
λ = (µ1, . . . , µ_{p−1}, µq + p − q + r, µp + 1, . . . , µ_{q−1} + 1, µ_{q+1}, . . . , µn)
299
Such partitions are precisely those for which λ/µ is a border strip B of
size r, and q − p is just ht(B). Hence
aµ+δ pr = ∑_λ (−1)^{ht(λ/µ)} aλ+δ.
Divide by aδ and let n → ∞ to obtain (9.55).
300
Let α = (α1, α2, . . .) be a weak composition of n. Define a border-strip
tableau (or rim-hook tableau) of shape λ/µ (where |λ/µ| = n) and
type α to be an assignment of positive integers to the squares of λ/µ
such that
(a) every row and column is weakly increasing,
(b) the integer i appears αi times, and
(c) the set of squares occupied by i forms a border strip.
Equivalently, one may think of a border-strip tableau as a sequence
µ = λ0 ⊆ λ1 ⊆ · · · ⊆ λr = λ
of partitions such that each skew shape λi/λ_{i−1} is a border strip of size αi
(including the empty border strip ∅ when αi = 0).
301
Define the height ht(T ) of a border-strip tableau T to be
ht(T) = ht(B1) + ht(B2) + · · · + ht(Bk)
where B1, . . . , Bk are the (nonempty) border strips appearing in T .
302
Theorem 9.57. We have
sµ pα = ∑_λ χλ/µ(α) sλ,     (9.57)
where
χλ/µ(α) = ∑_T (−1)^{ht(T)}     (9.58)
summed over all border-strip tableaux T of shape λ/µ and type α.
303
Corollary 9.58. We have
pα = ∑_λ χλ(α) sλ     (9.59)
where χλ(α) is given by (9.58) with µ = ∅.
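Theorem 9.56, applied once for each part of µ, gives a recursion for χλ(µ): strip a border strip of size µ1 from λ in every possible way, pick up the sign (−1)^{ht}, and recurse on the remaining parts of µ. The sketch below is my own; it encodes border-strip removal through the β-numbers λi + n − i, exactly as in the proof of Theorem 9.56, and reproduces a few entries of the character tables of S3 and S4.

def mn_char(lam, mu):
    """chi^lambda(mu) via the Murnaghan-Nakayama recursion (requires |lambda| = |mu|).

    Removing a border strip of size r corresponds to subtracting r from one of the
    beta-numbers lam[i] + (n - 1 - i); the sign is (-1) to the number of beta-numbers
    jumped over, which equals (-1)^{ht} as in the proof of Theorem 9.56.
    """
    if not mu:
        return 1 if sum(lam) == 0 else 0
    r, rest = mu[0], mu[1:]
    n = len(lam)
    beta = [lam[i] + (n - 1 - i) for i in range(n)]
    total = 0
    for i in range(n):
        b = beta[i] - r
        if b < 0 or b in beta:
            continue                                  # no border strip of size r ending here
        height = sum(1 for v in beta if b < v < beta[i])
        new_beta = sorted([v for k, v in enumerate(beta) if k != i] + [b], reverse=True)
        new_lam = tuple(new_beta[k] - (n - 1 - k) for k in range(n))
        total += (-1) ** height * mn_char(tuple(p for p in new_lam if p > 0), rest)
    return total

print(mn_char((2, 1), (1, 1, 1)), mn_char((2, 1), (2, 1)), mn_char((2, 1), (3,)))  # 2 0 -1
print(mn_char((2, 2), (2, 2)), mn_char((2, 2), (4,)))                              # 2 0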
304
Corollary 9.59. We have
sλ/µ = ∑_ν z_ν^{−1} χλ/µ(ν) pν,     (9.60)
where χλ/µ(ν) is given by (9.58).
Proof. We have from (9.57) that
χλ/µ(ν) = 〈sµ pν, sλ〉 = 〈pν, sλ/µ〉     (by Theorem 9.51),
and the proof follows from Proposition 9.18.
305
The orthogonality properties of the bases {sλ} and {pλ} translate into
orthogonality relations satisfied by the coefficients χλ(µ).
Proposition 9.60. (a) Fix µ, ν. Then
∑_λ χλ(µ) χλ(ν) = zµ δµν.
(b) Fix λ, µ. Then
∑_ν z_ν^{−1} χλ(ν) χµ(ν) = δλµ.
Proof. (a) Expand pµ and pν by (9.59) and take 〈pµ, pν〉.
(b) Expand sλ and sµ by (9.60) and take 〈sλ, sµ〉.
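For example (a quick check, not carried out in the notes), take n = 3: the classes are 1^3, 21, 3 with z111 = 6, z21 = 2, z3 = 3, and the character table is χ3 = (1, 1, 1), χ21 = (2, 0, −1), χ111 = (1, −1, 1). For (a) with µ = ν = 21 one gets 1·1 + 0·0 + (−1)(−1) = 2 = z21, and with µ = 3, ν = 21 one gets 1·1 + (−1)·0 + 1·(−1) = 0. For (b) with λ = µ = 21 one gets 4/6 + 0/2 + 1/3 = 1.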
306
10 Characters of the Symmetric and Unitary
Groups
10.1 Characters of the Symmetric Group
Let CFn denote the set of all class functions (i.e. functions constant on
conjugacy classes) f : Sn → Q. Recall that CFn has a natural scalar
product defined by
〈f, g〉 = (1/n!) ∑_{w∈Sn} f(w) g(w)
Sometimes by abuse of notation we write 〈φ, γ〉 instead of 〈f, g〉 when φ
and γ are representations of Sn with characters f and g.
307
If α = (α1, . . . , αl) is a vector of positive integers and
|α| := α1 + · · ·+ αl = n, then recall the Young subgroup Sα ⊆ Sn given
by
Sα = Sα1 × Sα2 × · · · × Sαl
where Sα1 permutes 1, 2, . . . , α1; Sα2 permutes
α1 + 1, α1 + 2, . . . , α1 + α2; etc.
308
Consider the following linear transformations ch : CFn → Λn, called the
Frobenius characteristic maps. If f ∈ CFn, then
ch f = (1/n!) ∑_{w∈Sn} f(w) pρ(w) = ∑_µ z_µ^{−1} f(µ) pµ,
where f(µ) denotes f(w) for any w of cycle type ρ(w) = µ. Equivalently, extending
the ground field Q to the algebra Λ and defining Ψ(w) = pρ(w), we have
ch f = 〈f, Ψ〉     (10.1)
309
Note that if fµ is the class function defined by
fµ(w) = 1 if ρ(w) = µ, and fµ(w) = 0 otherwise,
then ch fµ = z_µ^{−1} pµ.
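For instance (an easy check), for n = 2 the trivial character 1S2 and the sign character ε give ch 1S2 = (1/2)(p1² + p2) = h2 = s2 and ch ε = (1/2)(p1² − p2) = e2 = s11.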
NOTE. Let ϕ : Sn → GL(V ) be a representation of Sn with character
χ. Sometimes by abuse of notation we will write ch ϕ or ch V instead
of ch χ.
310
Proposition 10.1. The linear transformation ch is an isometry, i.e.,
〈f, g〉_{CFn} = 〈ch f, ch g〉_{Λn}.
Proof. We have (using Proposition 9.18)
〈ch f, ch g〉 = 〈∑_λ z_λ^{−1} f(λ) pλ, ∑_µ z_µ^{−1} g(µ) pµ〉
= ∑_λ z_λ^{−1} f(λ) g(λ)
= 〈f, g〉
311
We now want to define a product on class functions that will correspond
to the ordinary product of symmetric functions under the characteristic
map ch . Let f ∈ CFm and g ∈ CFn. Define the pointwise product
f × g ∈ CF(Sm × Sn) by
(f × g)(u, v) = f(u)g(v).
If f and g are characters of representations ϕ and ψ, then f × g is
just the character of the tensor product representation ϕ ⊗ ψ of
Sm × Sn. Now define the induction product f ◦ g of f and g to be the
induction of f × g to Sm+n, where as before Sm permutes 1, 2, . . . , m
while Sn permutes m + 1, m + 2, . . . , m + n. In symbols,
f ◦ g = ind_{Sm×Sn}^{Sm+n}(f × g).
312
Let CF = CF0 ⊕CF1 ⊕ · · · , and extend the scalar product on CFn to
all of CF by setting 〈f, g〉 = 0 if f ∈ CFm, g ∈ CFn, and m ≠ n.
The induction product on characters extends to all of CF by
(bi)linearity. It is not hard to check that this makes CF into an
associative commutative graded Q-algebra with identity 1 ∈ CF0.
Similarly we can extend the characteristic map ch to a linear
transformation ch : CF→ Λ.
313
Proposition 10.2. The characteristic map ch : CF→ Λ is a bijective
algebra homomorphism, i.e. ch is one-to-one and onto, and satisfies
ch (f ◦ g) = (ch f)(ch g)
314
Proof. Let res^G_H f denote the restriction of the class function f on G to
the subgroup H. We then have
ch (f ◦ g) = ch (ind_{Sm×Sn}^{Sm+n}(f × g))
= 〈ind_{Sm×Sn}^{Sm+n}(f × g), Ψ〉     by (10.1)
= 〈f × g, res_{Sm×Sn}^{Sm+n} Ψ〉_{Sm×Sn}     (by Frobenius reciprocity)
= (1/m!n!) ∑_{u∈Sm} ∑_{v∈Sn} f(u) g(v) Ψ(uv)
= (1/m!n!) ∑_{u∈Sm} ∑_{v∈Sn} f(u) g(v) Ψ(u) Ψ(v)     (since ρ(uv) is the disjoint union of ρ(u) and ρ(v))
= 〈f, Ψ〉_{Sm} 〈g, Ψ〉_{Sn}
= (ch f)(ch g)
315
Moreover, from the definition of ch and the fact that the power sums
pµ form a Q-basis for Λ it follows that ch is bijective.
316
Note that by Equation (9.19) and the definition of ch we have
ch 1_{Sn} = ∑_{λ⊢n} z_λ^{−1} pλ = hn     (10.2)
Corollary 10.3. We have ch 1_{Sα}^{Sn} = hα.
Proof. Since 1_{Sα}^{Sn} = 1_{Sα1} ◦ 1_{Sα2} ◦ · · · ◦ 1_{Sαl}, the proof follows from
Proposition 10.2 and Equation (10.2).
317
Now let Rn denote the set of all virtual characters of Sn, i.e. functions
on Sn that are differences of two characters (equivalently, integer linear
combinations of irreducible characters). Thus Rn is a lattice (a discrete
subgroup of maximal rank) in the vector space CFn. The rank of Rn
is p(n), the number of partitions of n, and a basis consists of the
irreducible characters of Sn. This basis is the unique orthonormal basis
of Rn up to sign and order, since the transition matrix between two such
bases must be an integral orthogonal matrix and hence a signed
permutation matrix. Define R = R0 ⊕ R1 ⊕ · · · .
318
Proposition 10.4. The image of R under the characteristic map ch is
ΛZ. Hence ch : R→ ΛZ is a ring isomorphism.
319
Proof. It will suffice to find integer linear combinations of the characters
ηα of the representations 1_{Sα}^{Sn} that are the irreducible characters of Sn. The
Jacobi-Trudi identity (Theorem 9.53) suggests we define the (possibly
virtual) characters ψλ = det(η_{λi−i+j}), where the product used in
evaluating the determinant is the induction product. Then by
Proposition 10.2 and Corollary 10.3, together with the Jacobi-Trudi identity, we have
ch (ψλ) = sλ     (10.3)
Since ch is an isometry (Proposition 10.1) we get 〈ψλ, ψµ〉 = δλµ. As
pointed out above, this means that the class functions ψλ are, up to
sign, the irreducible characters of Sn. Hence the ψλ for λ ⊢ n form a Z-basis for Rn, and the image of Rn is the Z-span of the sλ's, which is
just Λ_Z^n as claimed.
320
Theorem 10.5. Regard the functions χλ (where λ ⊢ n) of Section 9.15
as functions on Sn given by χλ(w) = χλ(µ), where w has cycle type µ.
Then the χλ are the irreducible characters of the symmetric group Sn.
321
Proof. By the Murnaghan-Nakayama rule (Corollary 9.59), we have
ch (χλ) = ∑_µ z_µ^{−1} χλ(µ) pµ = sλ.
Hence by Equation (10.3) and the injectivity of ch, we get χλ = ψλ. Since the ψλ, up to sign,
are the irreducible characters of Sn, it remains only to determine whether
χλ or −χλ is a character. But χλ(1^n) = fλ > 0, and a genuine character
takes a positive value (its degree) at the identity, so χλ is an irreducible
character.
322
By definition, ηλ is the character of the module Mλ. It can be shown
that χλ is the character of the Specht module Sλ.
Proposition 10.6. Let α be a composition of n and λ ⊢ n. Then the
multiplicity of the irreducible character χλ in the character ηα is just the
Kostka number Kλα. In symbols
〈ηα, χλ〉 = Kλα.
Proof. By Corollary 10.3 we have ch ηα = hα. Then the proof follows
from Corollary 9.35 and Theorem 10.5.
323
10.2 The Characters of GL(n, C)
A linear representation of GL(V ) is a homomorphism
ϕ : GL(V )→ GL(W ), where W is a complex vector space. From now
on we assume that all representations are finite dimensional, i.e.
dim(W) < ∞. We call dim(W) the dimension of the representation ϕ,
denoted dim(ϕ).
324
The representation ϕ is a polynomial representation if, after choosing
ordered bases for V and W , the entries of ϕ(A) are polynomials in the
entries of A ∈ GL(n,C). It is clear that the notion of polynomial
representations is independent of the choice of ordered bases of V and
W , since linear combinations of polynomials remain polynomials.
325
Fact: If ϕ is a polynomial representation of GL(V ), then there is a
symmetric polynomial char ϕ in dim V variables such that
Trϕ(A) = char ϕ(θ1, . . . , θn)
for all A ∈ GL(V ), where θ1, . . . , θn are the eigenvalues of A.
326
Theorem 10.7. The irreducible polynomial representations ϕλ of GL(V)
can be indexed by partitions λ of length at most n so that
char ϕλ = sλ(x1, . . . , xn)
327
Examples:
• If ϕ(A) = 1 (the trivial representation), then char ϕ = s∅ = 1.
• If ϕ(A) = A (the defining representation), then
char ϕ = x1 + · · ·+ xn = s1.
• If ϕ(A) = (det A)^m for a positive integer m, then
char ϕ = (x1 · · · xn)^m = s_{(m^n)}.
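Two further standard examples, easy to deduce from the theorem (they are not worked out in these notes): for the k-th exterior power ϕ(A) = Λ^k A one has char ϕ = ek = s_{(1^k)}, and for the k-th symmetric power ϕ(A) = Sym^k A one has char ϕ = hk = s_{(k)}.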
328
Proof. (Sketch) Let V be an n-dimensional complex vector space.
Then GL(V ) acts diagonally on the k-th tensor power V^{⊗k}, i.e.
A · (v1 ⊗ · · · ⊗ vk) = (A · v1) ⊗ · · · ⊗ (A · vk),     (10.4)
and the symmetric group Sk acts on V^{⊗k} by permuting tensor
coordinates, i.e.
w · (v1 ⊗ · · · ⊗ vk) = v_{w^{−1}(1)} ⊗ · · · ⊗ v_{w^{−1}(k)}     (10.5)
329
The actions of GL(V ) and Sk commute, so we have an action of
Sk × GL(V ) on V^{⊗k}. A crucial fact is that the two actions centralize
each other: the algebra of linear transformations V^{⊗k} → V^{⊗k} commuting
with the Sk-action is generated (as a C-algebra) by the operators (10.4),
while conversely the algebra of linear transformations commuting with the
GL(V )-action is generated (as a C-algebra) by the operators (10.5).
From this it can be shown that V^{⊗k} decomposes into irreducible
Sk × GL(V )-modules as follows:
V^{⊗k} = ⊕_λ (Mλ ⊗ Fλ)     (10.6)
where ⊕ denotes the direct sum (the “double commutant theorem”).
330
Here the Mλ’s are nonisomorphic irreducible Sk modules, the Fλ’s are
nonisomorphic irreducible GL(V ) modules, and λ ranges over some
index set. We know (Theorem 10.5) that the irreducible representations
of Sk are indexed by partitions λ of k, so we choose the indexing so that
Mλ is the irreducible Sk module corresponding to λ ⊢ k via Theorem
10.5. Thus we have constructed irreducible (or possibly 0)
GL(V )-modules Fλ. These modules afford polynomial representations
ϕλ, and the nonzero ones are inequivalent.
331
Next we compute the character of ϕλ. Let w × A be an element of
Sk × GL(V ), and let tr(w × A) denote the trace of w × A acting on
V^{⊗k}. Then by Equation (10.6) we have
tr(w × A) = ∑_λ χλ(w) · tr(ϕλ(A)).
Let A have eigenvalues θ = (θ1, . . . , θn). A straightforward computation
shows that tr(w × A) = pρ(w)(θ), so
pρ(w)(θ) = ∑_λ χλ(w) (char ϕλ)(θ).
But we know (Corollary 9.58) that
pρ(w) = ∑_λ χλ(w) sλ
332
Since the χλ’s are linearly independent, we conclude char ϕλ = sλ.
A separate argument shows that there are no other irreducible
polynomial characters.
333
Fact: The ϕλ remain irreducible when restricted to U(V ) because the
(dim V )2 entries of a general unitary matrix are algebraically
independent, and so every irreducible polynomial representation of
GL(V ) is still irreducible when restricted to U(V ).
334
11 Eigenvalues of Random Matrices
For n ∈ N, let Mn be a random n× n unitary matrix with distribution
given by Haar measure on the unitary group. The eigenvalues of Mn lie
on the unit circle T of the complex plane C. Write Ξn for the random
measure on T that places a unit mass at each of the eigenvalues of Mn.
That is, if the eigenvalues are {νn1, . . . , νnn}, then
Ξn(f) := ∫_T f dΞn = ∑_j f(νnj)
335
Note that if f : T → C has Fourier expansion f(e^{iθ}) = ∑_{j∈Z} fj e^{ijθ},
then
Ξn(f) = n f0 + ∑_{j=1}^∞ fj Tr (M_n^j) + ∑_{j=1}^∞ f_{−j} \overline{Tr (M_n^j)},
where Tr denotes the trace.
Questions about the asymptotic behaviour of cn(Ξn(fn) − E[Ξn(fn)])
for a sequence of test functions {fn} and a sequence of norming constants
{cn} may therefore be placed in the larger framework of questions about
the asymptotic behaviour of ∑_{j=1}^∞ (anj Tr (M_n^j) + bnj \overline{Tr (M_n^j)}) for
arrays of complex constants {anj : n ∈ N, j ∈ N} and
{bnj : n ∈ N, j ∈ N}.
336
Definition 11.1. A complex random variable is said to be standard
complex normal if the real and imaginary parts are independent centred
(real) normal random variables with common variance 1/2.
337
11.1 Moments of Traces
Theorem 11.2. a) Consider a = (a1, . . . , ak) and b = (b1, . . . , bk) with
aj, bj ∈ {0, 1, . . .}. Let Z1, Z2, . . . , Zk be independent standard
complex normal random variables. Then for
n ≥ (∑_{j=1}^k j aj) ∨ (∑_{j=1}^k j bj),
E[∏_{j=1}^k (Tr (M_n^j))^{aj} (\overline{Tr (M_n^j)})^{bj}] = δab ∏_{j=1}^k j^{aj} aj!
= E[∏_{j=1}^k (√j Zj)^{aj} (√j \overline{Zj})^{bj}].
b) For any j, k,
E[Tr (M_n^j) \overline{Tr (M_n^k)}] = δjk (j ∧ n).
338
Proof. (a) Define the simple power sum symmetric function pj to be the
symmetric function pj(x1, . . . , xn) = x_1^j + · · · + x_n^j. Let µ be the
partition (1^{a1}, 2^{a2}, . . . , k^{ak}) of the integer K = 1a1 + 2a2 + · · · + kak,
and set pµ = ∏_j p_j^{aj}, the corresponding compound power sum
symmetric function. Associate µ with the conjugacy class of the
symmetric group on K letters that consists of permutations with aj
j-cycles for 1 ≤ j ≤ k. We have the expansion
pµ = ∑_{λ⊢K} χλµ sλ,
339
Here the sum is over all partitions of K, the coefficient χλµ is the
character of the irreducible representation of the symmetric group
associated with the partition λ evaluated on the conjugacy class
associated with the partition µ, and sλ is the Schur function
corresponding to the partition λ.
340
Given an n × n unitary matrix U, write sλ(U) (resp. pµ(U)) for the
function sλ (resp. pµ) applied to the eigenvalues of U. Writing ℓ(λ) for
the number of parts of the partition λ (that is, the length of λ), the
functions U ↦ sλ(U) are irreducible characters of the unitary group
when ℓ(λ) ≤ n, and sλ(U) = 0 otherwise. Thus
E[sλ(Mn) \overline{sπ(Mn)}] = δλπ 1(ℓ(λ) ≤ n).
Set ν = (1^{b1}, 2^{b2}, . . . , k^{bk}) and L = 1b1 + 2b2 + · · · + kbk.
341
We have
E[∏_{j=1}^k (Tr (M_n^j))^{aj} (\overline{Tr (M_n^j)})^{bj}] = E[pµ(Mn) \overline{pν(Mn)}]
= E[(∑_{λ⊢K} χλµ sλ(Mn)) (\overline{∑_{π⊢L} χπν sπ(Mn)})]
= δKL ∑_{λ⊢K} χλµ χλν 1(ℓ(λ) ≤ n).     (11.1)
342
When K ≤ n, all partitions of K are necessarily of length at most n,
and so, by the second orthogonality relation for characters of the
symmetric group, the rightmost term of (11.1) becomes
δKL δµν ∏_{j=1}^k j^{aj} aj! = δab ∏_{j=1}^k j^{aj} aj!,
which coincides with the claimed mixed moment of √j Zj, 1 ≤ j ≤ k.
343
(b) We have from (11.1) that
E[Tr (M_n^j) \overline{Tr (M_n^k)}] = δjk ∑_{λ⊢j} |χλ(j)|² 1(ℓ(λ) ≤ n),
where (j) is the partition of j consisting of a single part of size j. Now
χλ(j) = 0 unless λ is a hook partition (that is, a partition with at most
one part of size greater than 1), in which case
χλ(j) = (−1)^{ℓ(λ)−1}.
Since there are j ∧ n hook partitions of j of length at most n, part (b)
follows.
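Part (b) is easy to check by simulation. The sketch below is mine and is not part of the notes: it samples Haar unitary matrices with the usual QR construction (a complex Ginibre matrix, a QR factorization, and a column-phase correction) and estimates E[Tr(M_n^j)] and E[|Tr(M_n^j)|²] for n = 5, j = 3, where the theorem gives 0 and j ∧ n = 3.

import numpy as np

rng = np.random.default_rng(0)

def haar_unitary(n):
    """Haar-distributed n x n unitary matrix via QR of a complex Ginibre matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))      # fix the column phases

n, j, trials = 5, 3, 20000
traces = np.array([np.trace(np.linalg.matrix_power(haar_unitary(n), j))
                   for _ in range(trials)])
print(traces.mean())                 # close to 0
print((np.abs(traces) ** 2).mean())  # close to min(j, n) = 3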
344
11.2 Linear Combinations of Traces
Theorem 11.3. Consider an array of complex constants
{anj : n ∈ N, j ∈ N}. Suppose there exists σ² such that
lim_{n→∞} ∑_{j=1}^∞ |anj|² (j ∧ n) = σ².
Suppose also that there exists a sequence of positive integers
{mn : n ∈ N} such that
lim_{n→∞} mn/n = 0
and
lim_{n→∞} ∑_{j=mn+1}^∞ |anj|² (j ∧ n) = 0.
345
Then ∑_{j=1}^∞ anj Tr (M_n^j) converges in distribution as n → ∞ to σZ,
where Z is a complex standard normal random variable.
346
Proof. Recall from Theorem 11.2 that E[Tr (M_n^j)] = 0 and
E[Tr (M_n^j) \overline{Tr (M_n^k)}] = δjk (j ∧ n). Consequently, the series
∑_{j=1}^∞ anj Tr (M_n^j) converges in L² for each n and
lim_{n→∞} E[|∑_{j=mn+1}^∞ anj Tr (M_n^j)|²] = 0.
It therefore suffices to show that σ^{−1} ∑_{j=1}^{mn} anj Tr (M_n^j) converges in
distribution as n → ∞ to a complex standard normal random variable.
Let Z0, Z1, Z2, . . . be a sequence of independent complex standard
normals.
347
From Theorem 11.2 we know that
E[(∑_{j=1}^{mn} anj Tr (M_n^j))^α (\overline{∑_{j=1}^{mn} anj Tr (M_n^j)})^β]
= E[(∑_{j=1}^{mn} anj √j Zj)^α (\overline{∑_{j=1}^{mn} anj √j Zj})^β]
= E[((∑_{j=1}^{mn} |anj|² j)^{1/2} Z0)^α (\overline{(∑_{j=1}^{mn} |anj|² j)^{1/2} Z0})^β],
provided that αmn ≤ n and βmn ≤ n. The result now follows by
convergence of moments for complex normal distributions and the
assumption that mn/n→ 0.
348
Theorem 11.4. Consider arrays of complex constants
{anj : n ∈ N, j ∈ N} and {bnj : n ∈ N, j ∈ N}. Suppose there exist σ²,
τ², and γ such that
lim_{n→∞} ∑_{j=1}^∞ |anj|² (j ∧ n) = σ²,
lim_{n→∞} ∑_{j=1}^∞ |bnj|² (j ∧ n) = τ²,
and
lim_{n→∞} ∑_{j=1}^∞ anj bnj (j ∧ n) = γ.
Suppose also that there exists a sequence of positive integers
{mn : n ∈ N} such that
lim_{n→∞} mn/n = 0
349
and
lim_{n→∞} ∑_{j=mn+1}^∞ (|anj|² + |bnj|²)(j ∧ n) = 0.
Then ∑_{j=1}^∞ (anj Tr (M_n^j) + bnj \overline{Tr (M_n^j)}) converges in distribution as
n → ∞ to X + iY, where (X, Y) is a pair of centred jointly normal real
random variables with
E[X²] = (1/2)(σ² + τ² + 2 Re γ),
E[Y²] = (1/2)(σ² + τ² − 2 Re γ),
and
E[XY] = Im γ.
350
Given f ∈ L2(T) (where we define L2(T) to be the space of real–valued
square–integrable functions), write
fj := (1/2π) ∫ e^{−ijθ} f(θ) dθ,     j ∈ Z,
for the Fourier coefficients of f .
Recall that a positive sequence {ck}k∈N is said to be slowly varying if
lim_{k→∞} c_{⌊λk⌋}/ck = 1 for every λ > 0.
351
Theorem 11.5. Suppose that f ∈ L2(T) is such that the sequence
{∑_{j=−k}^k |fj|² |j|}_{k∈N} is slowly varying. Then
(Ξn(f) − E[Ξn(f)]) / (∑_{j=−n}^n |fj|² |j|)^{1/2}
converges in distribution to a standard normal random variable as
n→∞.
352
Let H^{1/2}_2 denote the space of functions f ∈ L2(T) such that
‖f‖²_{1/2} := ∑_{j∈Z} |fj|² |j| < ∞,
and define an inner product on H^{1/2}_2 by
〈f, g〉_{1/2} := ∑_{j∈Z} fj \overline{gj} |j|.
353
Alternatively, H^{1/2}_2 is the space of functions f ∈ L2(T) such that
(1/16π²) ∫∫ (f(φ) − f(θ))² / sin²((φ − θ)/2) dθ dφ < ∞,     (11.2)
Moreover,
〈f, g〉_{1/2} = (1/16π²) ∫∫ (f(φ) − f(θ))(g(φ) − g(θ)) / sin²((φ − θ)/2) dθ dφ
354
Theorem 11.6. If f1, . . . , fk ∈ H^{1/2}_2 with E[Ξn(fh)] = (n/2π) ∫ fh(θ) dθ = 0
for 1 ≤ h ≤ k, then the random vector (Ξn(f1), . . . , Ξn(fk)) converges
in distribution to a jointly normal, centred random vector
(Ξ(f1), . . . , Ξ(fk)) with E[Ξ(fh) Ξ(fℓ)] = 〈fh, fℓ〉_{1/2}.
355
For 0 ≤ α < β < 2π write Nn(α, β) for the number of eigenvalues of
Mn of the form eiθ with θ ∈ [α, β]. That is, Nn(α, β) = Ξn(f) where f
is the indicator function of the arc {eiθ : θ ∈ [α, β]}. Note that
E[Nn(α, β)] = n(β − α)/2π.
356
Theorem 11.7. As n → ∞, the finite-dimensional distributions of the
processes
(Nn(α, β) − E[Nn(α, β)]) / ((1/π) √(log n)),     0 ≤ α < β < 2π,
converge to those of a centred Gaussian process
{Z(α, β) : 0 ≤ α < β < 2π} with the covariance structure
E[Z(α, β) Z(α′, β′)] =
  1,     if α = α′ and β = β′,
  1/2,   if α = α′ and β ≠ β′,
  1/2,   if α ≠ α′ and β = β′,
  −1/2,  if β = α′,
  0,     otherwise.
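A small simulation (again my own sketch, using the same Haar sampler as before) shows the logarithmic growth of the variance of Nn(α, β); note that the limit in Theorem 11.7 is approached quite slowly, so for moderate n the normalized variance still sits noticeably above 1.

import numpy as np

rng = np.random.default_rng(1)

def haar_unitary(n):
    """Haar-distributed n x n unitary matrix via QR of a complex Ginibre matrix."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

n, alpha, beta, trials = 50, 0.0, np.pi, 2000
counts = []
for _ in range(trials):
    theta = np.angle(np.linalg.eigvals(haar_unitary(n))) % (2 * np.pi)
    counts.append(np.sum((theta >= alpha) & (theta <= beta)))
counts = np.array(counts, dtype=float)
print(counts.mean())                          # approx n (beta - alpha) / (2 pi) = 25
print(counts.var() * np.pi ** 2 / np.log(n))  # of order 1 (still above 1 for moderate n)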
357