The Hopf algebra of symmetric functions

(1)

CHAPTER 2

The Hopf algebra of symmetric functions

1. Symmetric polynomials

The theory of symmetric polynomials is as old as algebra itself, and had been actually its main topic for more than two centuries.

The story begins with the relations between the coefficients of a polynomial and its roots

(287) P(x) =

n

Y

i=1

(x−x_i) =

n

X

k=0

(−1)^ke_k(X)x^n−k, (X = (x₁, . . . , x_n))

attributed to Fran¸cois Vi`ete, which were known in the sixteenth century, and certainly to the ancient civilisations in the case of quadratic polynomials.

As is well known, the

(288) e_k(x₁, . . . , x_n) = X

i1<i2<...<ik

x_i₁x_i₂· · ·x_i_k

are theelementary symmetric polynomials, and the fundamental result of the theory states that every symmetric polynomial is the x_i is expressible in a unique way as a polynomial in the e_k.

The first developments, which aimed at the solution of algebraic equations, con- sisted essentially in expressing various families of symmetric polynomials in terms of each other. For example, thepower-sums

(289) p_m(X) =

n

X

k=1

x^m_k

or thecomplete homogeneous symmetric polynomials

(290) h_m(X) = X

k1+...+kn=m

x^k₁¹x^k₂²· · ·x^k_nⁿ (sum of all monomials of degree m), and the obvious linear basis

(291) m_λ(X) = Σx^λ:= X

µ∈S_n(λ)

x^µ₁¹x^µ₂²· · ·x^µ_nⁿ sum of alldistinct permutations of the monomial

(292) x^λ =x^λ₁¹x^λ₂²· · ·x^λ_nⁿ

where we assume λ₁ ≥λ₂ ≥ . . .≥ λ_n ≥0, and S_n(λ) stands for the set of distinct permutations of λ.

51

(2)

We denote bySym(X) the algebra of symmetric polynomials (as usual, over some field Kof characteristic 0). It is the algebra of invariants of the natural action ofS_n on K[X]

(293) (σf)(x₁, . . . , x_n) =f(x_σ(1), . . . , x_σ(n)). It is naturally graded

(294) Sym(X) =M

k≥0

Sym_k(X) and its Hilbert series is

(295) H(t) =X

k≥0

dim (Sym_k(X))t^k = 1

(1−t)(1−t²)· · ·(1−tⁿ)

A nonincreasing sequenceλ= (λ₁ ≥λ₂ ≥. . .≥λ_r ≥. . .) of nonnegative integers, with finitely many nonzero terms, is called apartition of the integerm=P

iλ_i, also called the weight |λ|of λ. We write then λ⊢m. The number r of nonzero terms is called its length and is denoted byℓ(λ). Thus, the dimension ofSym_m(x₁, . . . , x_n) is equal to the number of partitions ofm of length at mostn.

If we let n tend to infinity, we can get rid of the condition “length at most n”.

In this way, we arrive at the notion of symmetric “functions”, which are defined as the symmetric “polynomials” in infinitely many variables (meaning, as usual, formal series of bounded degree). In this way, we can deal simultaneously with symmetric polynomials in any number of variables, and as we shall see, the resulting algebra acquires some extra structure.

2. Symmetric functions

Let now X ={x₁, x₂, . . .}be an infinite set of variables, which will be referred to as an alphabet. The elementary and complete symmetric functions of X e_n and h_n are best defined by their generating series

λ_t(X) := Y

i≥1

(1 +tx_i) =X

n≥0

e_n(X)tⁿ, (296)

σ_t(X) := Y

i≥1

(1−tx_i)⁻¹ =X

n≥0

h_n(X)tⁿ = (λ−t(X))⁻¹. (297)

Alternatively, one may writeE(t;X) instead ofλ_t(X) andH(t;X) instead ofσ_t(X).

Exercise 2.1. Thehn and theendetermine each other via the relation

(298) hn−e1hn−1+e2hn−2− · · ·+ (−1)ⁿen= 0 forn≥1, andh0=e0= 1.

We obtain the power-sums by taking a logarithm (299) logσ_t(X) =X

i≥1

log 1

1−tx_i

=X

i≥1

X

n≥1

xⁿ_i

n tⁿ =X

n≥1

p_n(X)tⁿ n

(3)

2. SYMMETRIC FUNCTIONS 53

so that

(300) σ_t(X) = exp

"

X

n≥1

p_n(X)tⁿ n

# .

Picking the coefficient of tⁿ, we find

(301) h_n(X) =X

µ⊢n

p_µ z_µ

where z_µ = Q

ii^mⁱm_i! if m_i is the number of occurences of the part i in µ (this is often written µ= (1^m¹2^m²· · ·n^mⁿ)).

Alternatively, taking the logarithmic derivatives of (296) and (297), we obtain the recurrrence relations

ne_n = p₁e_n−1−p₂e_n−2+p₃e_n−3− · · ·+ (−1)ⁿ⁻¹p_n, (302)

nh_n = p₁h_n−1+p₂h_n−2+p₃h_n−3+· · ·+p_n, (303)

respectively attributed to Newton and Wronski.

Exercise 2.2. These relations provide a short way to prove thatSym(X) is a polynomial algebra in either familyek,hk orpk. Assume first thatX={x1, . . . , xn}. Then, the Jacobian ofp1, . . . , pn

is a Vandermonde determinant, so that they are algebraically independent. Next, ifλis a partition of lengthr,

(304) pλ:=pλ1· · ·pλr =cλmλ+· · ·

wherecλis a nonzero integer and the dots stand for a linear combination ofmµsuch thatℓ(µ)< r.

This proves that thepk generateSym(X) overQ(we need to invert thecλ, which implies that the same is true of the ek and of thehk. We shall see that these last two families actually generate Sym(X) overZ.

Naturally, the monomial symmetric function m_λ(X) is defined as above as the (now infinite) sum of all distinct permutations of the monomial x^λ.

At this point, we have at our disposal four bases ofSym(X): threemultiplicative bases e_λ,h_λandp_λ, and the monomial basism_λ. To understand the relations between them, let us take a second set of variables Y = {y1, y2, . . .}, and form the product alphabet

(305) XY :={x_iy_j|i, j≥1}.

We can evaluate our symmetric functions on it, and consider the following generating series

(306) K(X, Y) =σ₁(XY) = Y

i,j≥1

(1−x_iy_j)⁻¹ called the Cauchy kernel.

Writing it in the form

(307) K(X, Y) =Y

i≥1

σ_x_i(Y) =Y

i≥1

X

ni≥0

xⁿ_iⁱh_n_i(Y)

(4)

we obtain the expansion

(308) K(X, Y) =X

λ

m_λ(X)h_λ(Y) Observing that

(309) p_n(XY) =p_n(X)p_n(Y), we have also

(310) K(X, Y) =X

λ

p_λ(X)p_λ(Y) z_λ .

Exercise 2.3. Check this.

The multiplicative property (309) of the p_n can be nicely completed by a similar additive property if we defineX +Y as the (disjoint) union of X and Y. Then,

(311) p_n(X +Y) = X

z∈X⊔Y

zⁿ =p_n(X) +p_n(Y)

so that the power-sums appear as some kind of homomophisms for something which would look like an algebra of alphabets. We would then have as well

λ_t(X +Y) =λ_t(X)λ_t(Y) so thate_n(X +Y) =

n

X

i=0

e_i(X)e_n−i(Y), (312)

σ_t(X+Y) =σ_t(X)σ_t(Y) so that h_n(X +Y) =

n

X

i=0

h_i(X)h_n−i(Y).

(313)

This can be made precise. Symmetric functions actually define functions onmul- tisets of variables, or even of scalars if this does not lead to divergent series. Because of the symmetry, we can make sense of the substitution of, say

(314) A={q, q,(−2),(−2),(−2), x³y, x³y, i√ 2}

in a symmetric function: just specialize any two variables to q, three other ones to (−2), again two other ones to x³y, a last one to i√

2, and set all the remaining ones to zero. For example, with the above multiset,

(315) p_n(A) = 2qⁿ+ 3(−2)ⁿ+ 2(x³y)ⁿ+ (i√ 2)ⁿ and this defines all symmetric functions of A.

A multiset can be conveniently replaced by the formal sum of its elements, e.g., (316) A= 2q+ 3(−2) + 2x³y+ (i√

2) Note that the term 3(−2) should not be interpreted as −6.

Exercise 2.4. Check thatp2(−2,−2,−2) is indeed different fromp2(−6).

Exercise 2.5. Computeh2(A),e2(A),e8(A).

(5)

2. SYMMETRIC FUNCTIONS 55

To avoid confusion, it is best to interpret symmetric functions as operators on polynomial rings, and to allow specialization of the variables only after application of the operators. The power sums are then ring homomorphisms defined by

(317) p_n(z) =zⁿ if z is a variable, and p_n(1) = 1 so that p_n(α) =p_n(α·1) =α for any scalar α.

Exercise 2.6. Computehn(α) anden(α).

Hint: σt(α) = (1−t)^−α.

A ring Rendowed with such an action ofSymis called aψ-ring, as the operators p_nare ofted denoted byψⁿ (and sometimes called Adams operations). This defines as well the action of thee_k, called exteriors powers and often denoted by Λ^k or λ^k, and of the h_k, called symmetric powers and often denoted by S^k or σ^k. A ring endowed with an action of the e_k is called a λ-ring¹.

The origin of these ideas can be traced back to the following facts from linear algebra. If P(x) is the characteristic polynomial of a square matrix M, and the x_i are its eigenvalues, then e_k(X) = tr Λ^k(M), where Λ^k(M) is the kth exterior power of M, i.e., the matrix whose entries are the minors of orderk ofM.

It is often more convenient to assume that the x_i are the reciprocals of the eigenvalues, so that

(318) |I−tM|=

n

Y

i=1

(1−x_it) =

n

X

k=0

e_k(X)(−t)^k is invertible as a formal power series, and its inverse

(319) |I−tM|⁻¹ =

n

Y

i=1

(1−x_it)⁻¹ =X

k≥0

h_k(X)t^k

has as coefficients the complete homogeneous symmetric functionsh_k(X), which can be interpreted as the traces of the symmetric powers S^k(M). This last statement is essentially McMahon’s “Master Theorem”.

The power sums p_k(X) = P

x^k_i are obviously the traces of the powers M^k, and at a more advanced level, one knows that the traces of the images of M under the irreducible polynomial representation of GL_n, labelled by partitions λ, are the so- called Schur functions s_λ(X), to be defined later.

Exercise 2.7. There are three basic kinds of polynomials in n variables. The algebra K[X] of ordinary polynomials is obtained by taking mutually commuting variables xi. The algebra of noncommutative polynomials (or free associative algebra)KhAiis built from noncommmuting lettersai. The third kind, anticommutative polynomials, or the Grassmann algebra Λ_K(η), is built from anti- commuting variablesηi. That is,ηiηj=−ηjηi(and in particular,η_i²= 0). IfV is then-dimensional vector space spanned by the variables, these algebras are respectively called thesymmetric algebra S(V), the tensor algebra T(V) and theexterior algebra Λ(V). All these algebras are graded. If f is an endomorphism ofV, it induces endomorphismsS^k(f),T^k(f) and Λ^k(f) of the homogeneous components of these three algebras. Check that Λⁿ(V) is one dimensional, so that Λⁿ(f) acts by a scalar, and prove that this scalar is the determinant off. Then, prove that the matrix of Λ^k(f)

1Over a field of characteristic 0, there is no difference between ψ-rings andλ-rings.

(6)

is formed of the k×k-minors of the matrix off, and deduce that its trace is the k-th elementary symmetric function of the eigenvalues off. [Hint. – Do it first for a diagonal matrix, and invoke a density argument for the general case.] Compute similarly the traces ofS^k(f) andT^k(f).

Exercise 2.8. The following facts are obviously true for diagonal matricesD: (i) ifP(x) =|D−xI| is the characteristic polynomial ofD, theP(D) = 0. (ii) dete^D =e^tr^D. By a density argument, show that they are true for an arbitrary complex matrix.

3. Hopf algebras enter the scene

As we have seen, with a second alphabet Y, we can form symmetric functions of XY and X +Y, by the very simple rules (309) and (311). In each case, the result can be expanded as a sum of terms u(X)v(Y). Such a term can be naturally identified with a tensor product u⊗v, as the tensor product is nothing but a formal product having no other property than being linear in its arguments u and v.

With this interpretation, the multiplication of symmetric functions is the linear map u(X)v(Y)7→ u(X)v(X). These considerations allow us to reformulate the definition of an algebra, and to introduce the dual notion:

Definition 3.1. (i) An algebra is a vector space V endowed with a product (or multiplication), that is, a linear map µ: V ⊗V → V.

(ii) A coalgebra is a vector space V endowed with a coproduct (or comultiplication), that is, a linear map ∆ : V → V ⊗V.

Clearly, these notions are dual to each other: the dual of a product is a coproduct and conversely.

Thus, our operations on alphabets provide us with two (very different) coproducts on Sym(X). Let us see what they are good for. We shall start with X+Y, which will be denoted by ∆. We have thus

∆pn =pn⊗1 + 1⊗pn, (320)

∆e_n =Pn

i=0e_i⊗e_j, (321)

∆h_n =Pn

i=0h_i⊗h_j. (322)

On the last two equations, we can observe that the involutive algebra automorphism ω exchanging h_n and e_n is also an automophism of the coalgebra structure. The first equation says that the p_n are primitive elements (elements such that ∆f = f ⊗1 + 1⊗f).

Exercise 3.1. Computeω(pn).

It is also obvious that (f g)(X+Y) =f(X+Y)g(X+Y), so that ∆ is an algebra morphism from Sym to Sym⊗Sym.

Definition 3.2. A bialgebra is an algebra V which is also a coalgebra, and such that the coproduct ∆ is a morphism of algebras for the structure (a⊗b)·(a^′⊗b^′) = aa^′⊗bb^′ on V ⊗V (called tensor product of algebras).

Clearly, ∆ endows Sym with the structure of a bialgebra. This is not all. Obvi- ously, (X+Y) +Z =X+ (Y +Z) (recall that this is just the disjoint union of sets),

(7)

3. HOPF ALGEBRAS ENTER THE SCENE 57

so that

(323) (∆⊗I)◦∆(f) =f((X+Y) +Z) =f(X+ (Y +Z)) = (I⊗∆)◦∆(f). This property is called coassociativity. It is indeed dual to associativity, which can be expressed as

(324) µ◦(µ⊗I) =µ◦(I⊗µ).

Thus, Sym is an associative and coassociative bialgebra. It has of course a unit, the constant 1, which one may interpret as the linar map u:K→ Sym sending the scalar 1 to the unit element 1_SymofSym. Dually, one defines acounit on a coalgebra V as a linar map ǫ: V →K such thatµ◦(I⊗ǫ)◦∆ =I =I =µ◦(ǫ⊗I)◦∆.

Exercise 3.2. Check that this is indeed dual to the characteristic property of the unitu.

The counit of Sym is just the constant term map (projection onto the homogeneous component of degree 0).

There is more. If U is a coalgebra andV an algebra, one can define an operation

⋆called convolution on the space L(U, V) of linear mapsU →V by (325) (f ⋆ g)(u) =µ◦(f ⊗g)◦∆(u)

(this may not be well-defined in the infinite-dimensional case, but we shall never encounter this situation).

Exercise 3.3. IfU is coassociative andV is associative, then⋆is associative.

Exercise 3.4. The polynomial algebraK[x] can be endowed with a bialgebra structure defined by the coproduct ∆(x) =x⊗1 + 1⊗x(so that one may write ∆P =P(x+y)). LetA=K[x]^∗be the dual space, endowed with pointwise multiplication of linear forms. Letf, g∈Abe defined by their valuesf(xⁿ) =fn,g(xⁿ) =gn, and leth=f ⋆ g be their convolution. Check that

(326) X

n≥0

hn

tⁿ n! =



 X

n≥0

fn

tⁿ n!







 X

n≥0

gn

tⁿ n!



.

Exercise 3.5. The free associative algebraKhAi is a (coassociative) bialgebra for the coproduct defined on the letters by ∆(ai) = ai ⊗1 + 1⊗ai. Each permutation σ ∈ Sn defines a linear endomorphism of KhAiacting on words by gσ(w) = 0 is w is not of length n, and gσ(w) = wσ otherwise. Compute the convolutiongσ⋆ gτ, and make an interesting observation.

In particular, for endomorphisms of a bialgebra with unit and counit, u◦ǫ is the neutral element of convolution. When the identity map is invertible for this operation, its inverse is called anantipode. That is, an antipode is an endomorphism S such that

(327) (S ⋆ I)(x) :=µ◦(S⊗I)◦∆(x) =u◦ǫ(x) = (I ⋆ S)(x) For symmetric functions, it is clear that if we define X −Y by (328) p_n(X −Y) =p_n(X)−p_n(Y)

then, we have the antipode

(329) (Sf)(X) =f(−X)

Indeed, (S ⋆ I)(f) =f(X−X) =f(0) = constant term of f =u◦ǫ(f).

(8)

Exercise 3.6. Computehn(−X) anden(−X). Interpretingf(X−Y) as (I⊗S)◦∆(f), compute σt(X−Y).

Definition 3.3. A Hopf algebra is an associative, coassociative bialgebra with unit, counit and antipode.

Corollary 3.4. Sym is a Hopf algebra.

As an algebra, Sym is commutative. It is also cocommutative which means that

∆ commutes with the swap operator P(u⊗v) = v⊗u, or, less formally, that ∆ is dual to a commutative product.

Exercise 3.7. The Hopf structure of Symis a typical example of a Hopf algebra associated with a group. If we denote byGthe multiplicative group of formal power series with constant term 1, (330) G= 1 +tK[[t]] ={a(t) = 1 +a1t+a2t²+· · · },

we can interprethn as the coordinate function

(331) hn(a(t)) =an,

and Symas the algebra of polynomial functions onG. The standard coproduct for functions on a group turns a functionf(x) into a function of two variables ∆f(x, y) =f(xy), and the antipode is Sf(x) =f(x⁻¹). Check that this defines the same Hopf algebra structure as described above.

4. Duality

The definition of a Hopf algebra H is self-dual in the sense that each defining property comes with its dual notion. If the dual spaceH^′ is well-defined, which will always be the case in this book, it is then automatically a Hopf algebra. This is the case when H is finite-dimensional. When it is graded, that is

(332) H =M

n≥0

H_n with µ: H_m⊗H_n →H_m+n and ∆ : H_n → M

i+j=n

H_i⊗H_j

with eachH_n finite-dimensional, one can define the graded dual

(333) H^∗=M

n≥0

H_n^′.

This is a Hopf algebra for the dual maps ∆^∗, µ^∗, ǫ^∗, u^∗.

When H^∗ is isomorphic to H, one says that H is self-dual. In this case, there exists a scalar product on H such that

(334) hf g, hi=hf ⊗g,∆hi, where hu⊗v, u^′⊗v^′i:=hu, vihu^′, v^′i.

A closer look at the expansion (308) of the Cauchy kernel reveals that there is such a thing in Sym. Indeed, if, for two partitions λ, µ we denote by λ∪µ the partition composed of the parts of λand µ, on the one hand we have obviously

(335) h_λh_µ = h_λ∪µ

(9)

5. SCHUR FUNCTIONS 59

and on the other hand

σ₁((X +Y)Z) =X

ν

m_ν(X+Y)h_ν(Z) =σ₁(XZ)σ₁(Y Z)

=X

ν

X

λ∪µ=ν

m_λ(X)m_µ(Y)

! h_ν(Z) (336)

so that

(337) ∆m_ν = X

λ∪µ=ν

m_λ⊗m_µ. Hence, if we define the scalar product on Sym by

(338) hm_λ, h_µi=δ_λµ (Kronecker symbol) we have

(339) hm_ν, h_λh_µi=h∆m_ν, h_λ⊗h_µi which proves that Sym is self-dual.

The scalar product (338) is called Hall’s scalar product. With it, we can now understand the significance of the Cauchy kernel. It is indeed a reproducing kernel, which means that for any f ∈Sym(X),

(340) hK(X, Y), f(X)i=f(Y).

Or, interpreting h_µ(Y) as the dual basis of m_µ(X), this is the identity map of Sym, regarded as an element of Sym⊗Sym^∗.

This has the consequence that any pair of bases u_λ,v_λ satisfying

(341) K(X, Y) =X

λ

u_λ(X)v_λ(Y)

are dual to each other: hu_λ, v_µi= δ_λµ. In particular, the power-sum products form an orthogonal basis:

(342) hp_λ, p_µi=δ_λµz_λ. 5. Schur functions

5.1. Antisymmetric polynomials. A polynomial f(x₁, . . . , x_n) is said to be antisymmetric if it changes sign when two variables are swapped. Equivalently, (343) f(x_σ(1), . . . , x_σ(n)) = (−1)^inv(σ)f(x₁, . . . , x_n).

Hence, such a polynomial vanishes if one sets x_i = x_j for some j > i. As the polynomialsx_i−x_j are pairwise relatively prime,f must be divisible by their product

(344) ∆(x₁, . . . , x_n) =Y

j>i

(x_j−x_i) =

1 x₁ x²₁ . . . xⁿ⁻¹₁ 1 x₂ x²₂ . . . xⁿ⁻¹₂

... . .. ...

1 x_n x²_n . . . xⁿ⁻¹_n

(10)

a Vandermonde determinant. The quotient is then a symmetric polynomial. This how Schur functions arise. Let An be theantisymmetrization operator

(345) Anf(x₁, . . . , x_n) = X

σ∈S_n

ε(σ)f(x_σ(1), . . . , x_σ(n)) whereε(σ) = (−1)^inv(σ) is thesignature of σ.

Then, for a monomial x^α, An(x^α) = 0 unless the exponents α_i are all distinct.

In which case, since the result is antisymmetric, we can assume that α₁ > α₂. . . >

α_n ≥0, and write

(346) α=λ+ρ

whereλ= (λ₁ ≥λ₂ ≥. . .≥λ_n ≥0) is a partition, and (347) ρ= (n−1, n−2, . . . ,1,0) Then, the monomial antisymmetric functions

(348) A_α(x₁, . . . , x_n) =An(x^α)

form a basis of the space of antisymmetric polynomials, and the ratios

(349) s_λ = A_α

A_ρ

are symmetric polynomials, called Schur functions. Up to a sign, A_ρ is the Van- dermonde determinant, hence of degree ¹₂n(n−1) = |ρ|, so that s_λ is homogenous of degree |λ|. From the above discussion, it is clear that the s_λ form a basis of Sym(x₁, . . . , x_n).

Note that s_(n)=h_n and s₍₁ⁿ₎=e_n.

5.2. The Jacobi symmetrizer. The definition (349) can be rewritten as

(350) s_λ = X

σ∈Sn

σ

x^λ+ρ A_ρ

since A_ρ is antisymmetric and takes care of the signature of σ. The linear operator on K[x₁, . . . , x_n]

(351) Ω_nf = X

σ∈S_n

σ

f ·x^ρ A_ρ

is called the Jacobi symmetrizer. Its fundamental property is Proposition 5.1. If f is symmetric, then, for any g,

(352) Ω_n(f g) =fΩ_n(g).

(11)

Proof –By linearity, it is sufficient to check this property withf =p_k(X) andg =x^α. In this case,

Ω_n(p_k(X)x^α) = 1 A_ρ

n

X

j=1

X

σ∈Sn

ε(σ)x^α_σ(1)¹⁺ⁿ⁻¹· · ·x^α_σ(j)^j^+n−j+k· · ·x^α_σ(n)ⁿ

= 1 A_ρ

X

σ∈Sn

ε(σ)

n

X

j=1

x^k_σ(j)

!

x^α_σ(1)¹⁺ⁿ⁻¹· · ·x^α_σ(j)^j^+n−j· · ·x^α_σ(n)ⁿ

=p_k(X)Ω_n(x^α). (353)

5.2.1. Muir’s identity. Remark that (350) can be used to define s_λ(X) for an arbitraryλ∈Zⁿ. The result is then either 0 (ifλ+ρ has repeated parts), or plus or minus a Schur function:

(354) s_λ=ε(σ)s_µ if (λ+ρ)·σ−ρ=µ, a partition

Whenµdefined as above has negative parts, the result is still a Schur function up to a negative power ofx₁x₂· · ·x_n. When defining Schur functions of an infinite alphabet, this has to be changed, and as we shall see, the result is also zero in this case.

With these nonstandard Schur functions, we can now state Muir’s rule for the product of a monomial function and a Schur function:

Proposition 5.2. Let λ, µ be two partitions in at most n parts, and α=λ+ρ, β =µ+ρ. Then,

(355) m_µs_λ = X

γ∈Sn(β)

s_α+γ

Proof – It suffices to write

m_µs_λ =m_λ(X)Ω_n(x^α) = Ω_n(m_λ(X)x^α

= Ω_n



 X

γ∈Sn(β)

x^γ+α



= X

γ∈Sn(β)

s_α+γ. (356)

Exercise 5.1. Show that

(357) pn=

n−1

X

k=0

(−1)^ks(n−k,1^k). Hint –Apply Muir’s formula topn=m(n)·s∅.

5.3. The first Pieri formula. Applying Muir’s identity to m₍₁^k₎ = e_k = s₍₁^k₎, we obtain

(358) e_ks_λ = X

γ∈S_n(0^n−k1^k)

s_α+γ

and among theα+γ occuring in this expression, the only ones which are not weakly decreasing are those having two consecutive components of the formα_i, α_i+1+ 1 with α_i =α_i+1. But in this case, the Schur function s_α+γ is zero. Hence,

(12)

Proposition 5.3. The product of a Schur function by an elementary function is a multiplicity free sum of Schur functions

(359) e_ks_λ =X

µ

s_µ

where the sum is over all partitions µ whose diagram is obtained from λ by addingk boxes, no two ones in the same row.

5.4. The Cauchy kernel again.

Lemma 5.4 (Cauchy). Let X ={x₁, . . . , x_n}and Y ={y₁, . . . , y_n}. Then, (360) D(X, Y) := det

1 x_i+y_j

= ∆(X)∆(Y)

n

Y

i,j=1

1 x_i+y_j . Proof – Subtract the first column to the other ones. Since

(361) 1

x_i+y_j − 1

x_i+y₁ = y₁−y_j (xi₊y₁)(x_i+y_j),

we can extract a factor 1/(x_i+y₁) from theith row and a factory₁−y_j from the jth column (j >1). Hence,

(362) D(X, Y) = (y₁−y₂)(y₁−y₃)· · ·(y₁−y_n) (x₁+y₁)(x₂+y₁)· · ·(x_n+y₁)

1 _x ¹

1+y2 · · ·_x₁_+y¹ _n 1 _x ¹

1+y2 · · ·_x₁_+y¹ _n ... ... . .. ...

1 _x ¹

n+y2 · · ·_x_n_+y¹ _n and subtracting the first row to the other ones, we extract a factor x₁−x_i from row i and a factor 1/(x₁+y_j) from columnj (i, j >1). Thus,

(363) D(X, Y) =

Qn

i=2(y₁−y_i)(x₁−x_i) Qn

i=1(x_i+y₁))Qn

j=2(x₁+y_j)D(x₂, . . . , x_n;y₂, . . . , y_n). Replacing x_i by 1/x_i and y_i by −y_i, we obtain on the one hand

(364) det

1 1−x_iy_j

= ∆(X)∆(Y)

n

Y

i,j=1

1

1−x_iy_j = ∆(X)∆(Y)σ₁(XY). On the other hand, the matrix ((1−x_iy_j)⁻¹is the product of the infinite Vandermonde matrix V(X) = (x^j−1_i )_{1≤i≤n, j≥1} by the transpose of V(Y). Applying the Binet- Cauchy formula for minors of orders n to this product, we have

(365) det

1 1−x_iy_j

= X

α

|V(X)|α|V(Y)|α= ∆(X)∆(Y)X

λ

s_λ(X)s_λ(Y). Letting the number of variables tend to infinity, we obtain:

(13)

Corollary 5.5 (The Cauchy identity for Schur functions). Let X, Y be any two alphabets. Then,

(366) K(X, Y) =X

λ

s_λ(X)s_λ(Y).

In particular, Schur functions form an orthonormal basis of Sym for the Hall scalar product.

5.5. The Jacobi-Trudi identity. As we have seen, the monomial antisymmetric functions are the minors of an infinite Vandermonde matrix. Since we know that they are divisible by the product of differences A_ρ, it is is natural to try to extract this factor by row and column manipulations. This is has been done by Jacobi, and independently by his student Trudi.

Let us first observe that for a single variable,

(367) xⁿ_i =h_n(x_i)

and that if i6=j,

(368) h_n(X+x_i)−h_n(X+x_j) = (x_i−x_j)h_n−1(X+x_i+x_j).

Let us now compute some Schur function, say s₅₂₂(x₁, x₂, x₃). The numerator of (349) is the antisymmetrization of x⁵⁺²₁ x²⁺¹₂ x²⁺⁰₃ , that is

(369) A₇₃₂=

x⁷₁ x³₁ x²₁ x⁷₂ x³₂ x²₂ x⁷₃ x³₃ x²₃

=

h₆(x₁+x₂) h₂(x₁+x₂) h₁(x₁+x₂) h₆(x₂+x₃) h₂(x₂+x₃) h₁(x₂+x₃)

h₇(x₃) h₃(x₃) h₂(x₃)

(x₁−x₃)(x₂−x₃) the equality resulting from the subtraction of the last row to the first two ones and from (367) and (371). Subtracting now the second row to the first one, and appying again (371), we get

A₇₃₂=

h₅(x₁+x₂+x₃) h₁(x₁+x₂+x₃) h₀(x₁+x₂+x₃) h₆(x₂+x₃) h₂(x₂+x₃) h₁(x₂+x₃)

h₇(x₃) h₃(x₃) h₂(x₃)

(x₁−x₃)(x₂−x₃)(x₁−x₂)

=

h₅(X₃) h₁(X₃) h₀(X₃) h₆(X₃) h₂(X₃) h₁(X₃) h₇(X₃) h₃(X₃) h₂(X₃)

(x₁−x₃)(x₂−x₃)(x₁−x₂) (370)

the final equality following from the expansion

(371) h_n(X−x_i) =h_n(X)−x_ih_n−1(X)

which shows that inserting the extra variables does not change the value of the determinant:

(372) h_n(x₃) =h_n((x₂+x₃)−x₂) =h_n(x₂+x₃)−x₂h_n−1(x₂+x₃) so that x₃ can be replaced by x₂+x₃ in the last row, and so on.

We have thus proved:

(14)

Theorem 5.6 (Jacobi-Trudi). The Schur function s_λ is equal to the following determinant

(373) s_λ= det (h_λ_i_+j−i) .

In particular, adding null parts to λdoes not change the value of the determinant, so that Schur functions are stable w.r.t. addition of new variables:

(374) s_λ(x₁, . . . , x_n,0) =s_λ(x₁, . . . , x_n).

This formula also makes sense of s_γ for an arbitrary γ ∈ Zⁿ, if one makes the convention that h_n = 0 for n <0.

5.6. The Kostka-Naegelsbach identity. The Jacobi-Trudi identity shows that Schur functions are minors of the infinite Toeplitz matrixS = (h_j−i)_i,j≥1. The inverse of this matrix is ((−1)^j⁻ⁱe_j−i). There is another identity of Jacobi which relates the minors of order k of a matrixM to those of its inverse: if adj M = detM·M⁻¹ is the adjugate of M,

(375) Λ^kM = (detM)·adj^(k)M⁻¹

where the kth adjugate adj^(k)A of a matrix Ais defined as detA·Λ^kA.

Applying this to the Jacobi-Trudi determinant, we obtain

Proposition 5.7 (Kostka-Naegelsbach). Schur functions are determinants of elementary functions:

(376) s_λ = det(e_λ^′

i+j−i)

where λ^′ is the conjugate partition of λ. As a consequence,

(377) ω(s_λ) =s_λ^′.

5.7. The second Pieri rule. Applying ω to the first Pieri rule, we obtain:

Proposition 5.8. The product of a Schur function by a complete function is a multiplicity free sum of Schur functions

(378) h_ks_λ =X

µ

s_µ

where the sum is over all partitions µ whose diagram is obtained from λ by addingk boxes, no two ones in the same column.

5.8. Skew Schur functions. Schur functions are minors of the infinite Toeplitz matrix S(X) = (h_j−i(X)). This matrix satisfies S(X +Y) = S(X)S(Y). Applying the Binet-Cauchy identity to this product, we obtain:

Proposition 5.9. The coproduct of a Schur function is given by

(379) s_λ(X+Y) =X

µ⊆λ

s_λ/µ(X)s_µ(Y)

where µ⊆λ means that µ_i ≤λ_i for all i, so that the diagram ofµ is included in the one of λ, and the skew Schur functions s_λ/µ are defined by

(380) s_λ/µ= det h_λ_i_−µ_j_+j−i .

(15)

The commutative images r_I(X) of the noncommutative ribbon Schur functions R_I(A) are particular skew Schur functions, precisely those such that the Ferrers diagram of λ/µcoincides with the ribbon diagram of the composition I.

5.9. Differential operators. Since Sym=K[p₁, p₂, . . .], one may write a Tay- lor formula forf(X +Y) =F[p₁(X) +p₁(Y), p₂(X) +p₂(Y), . . .]

(381) f(X+Y) = exp

( X

n≥1

p_n(Y) ∂

∂p_n(X) )

f(X)

Actually, the partial derivative with respect to p_n is almost the adjoint of multiplication by p_n. If, for any symmetric function g, we denote by D_g the adjoint of the map f 7→gf, we have

(382) D_p_nf =n ∂

∂p_nf .

Exercise 5.2. Check this by takingf=pµ.

Thus, the coproduct f(X)7→f(X+Y) can be rewritten as (383) f(X+Y) = exp

( X

n≥1

p_n(Y)D_p_n_(X) n

)

f(X) =D_σ^(X₁_(XY⁾ ₎f(X),

the notationD^(X)meaning that we take the adjoint inSym(X), they_ibeing regarded as parameters.

Expanding σ₁(XY) by the Cauchy formula, we find

(384) s_λ(X+Y) =X

µ

s_µ(Y)· D_s_µs_λ (X) so that

(385) s_λ/µ=D_s_µs_λ,

a result due to Foulkes².

5.10. Dual Pieri rules. Writinghh_ks_µ, s_νi=hs_µ, s_ν/kiandhe_ks_µ, s_νi=hs_µ, s_ν/1^ki, and applying the Pieri rules, we obtain

Proposition 5.10. The skew Schur function s_ν/k (resp. s_ν/1^k)is the sum of all Schur functions s_µ such that the diagram of µ is obtained by removing k boxes from the diagram of λ, at most one from each column (resp. row).

2The operatorsDf are often calledFoulkes derivatives. They are denote byf^⊥ in Macdonald [66].

(16)

5.11. Young tableaux. A semi-standard Young tableau of shape a partition λ (or a skew-partitionλ/µ) is a filling of its Ferrers diagrams by postive integers, such that rows are nondecreasing from left to right and columns are strictly increasing from bottom to top.

For a tableauT, denote byX^T the monomial obtained by replacing each entryiof T by the variablex_i and taking the product. Letm_i(T) be the number of occurences of iin T. The vector (m_i(T)) is called theweight of T.

We already know that the ribbon Schur functionr_I(X) is the sum of the monomi- alsX^T forT of shapeI. This includes in particularr_n =s_n (single row tableaux) and r₁ⁿ =s₁ⁿ (single column tableaux). We may suspect that a more general statement should be true. Let us first compute the coefficient of a monomial function m_µ in a Schur functions_λ. Writing

(386) K_λµ:=hs_λ, h_µi=hs_λ/µ_s, h_µ_¯i=X

ν

hs_ν, h_µ_¯i whereµ = (µ₁, . . . , µ_s) and ¯µ= (µ₁, . . . , µ_s−1), we prove by induction

Proposition 5.11. The Kostka number K_λµ is the number of semi-standard tableaux of shape λ and weight µ.

This suggest that s_λ(X) should actually be the sum of theX^T for T of shapeλ.

That this is true is easily seen by induction on the number of variables. Suppose that it is proved for n−1 variables. Then,

(387) s_λ(X_n−1+x_n) =X

µ

s_µ(X_n−1)s_λ/µ(x_n)

Exercise 5.3. Show that a skew Schur function sλ/µ(x) of a single variable x is nonzero iff the diagram ofλ/µ is a disjoint union of rows.

Assuming the exercise, each s_µ(X_n−1) in the sum above can be expanded as a sum of tableaux of shape µ, and the powers of x_n coming from the nonzero s_λ/µ(x_n) can now be added to these tableaux to build all the tableaux of shape λover X_n.

Exercise 5.4. Extend this to skew Schur functions.

Theorem 5.12. Schur functions are sums of tableaux:

(388) s_λ/µ(X) = X

T∈Tab(λ/µ)

X^T ,

where Tab(λ/µ)denotes the set of semi-standard tableaux of shape λ/µ.

5.12. The Littlewood-Richardson rule.

(17)

6. GROUP REPRESENTATIONS 67

6. Group representations

A linear representation of a group G (over a field K) is a homomorphism R : G → GL(V) from G to the group of automorphisms of some vector space V (over K).

Exercise 6.1. Let V =Cⁿ. Then the map R: S_n →GLn(C) defined byR(σ)ij = 1 ifi=σ(j) and 0 otherwise is a linear representation (permutation matrices).

Identify now Cⁿ with the linear span of variablesx1, . . . , xn. Then, S_n acts on the space Vd

homogeneous polynomials of degree d by Rd(σ)(f) = f(xσ(1), . . . , xσ(n)). The representations Rd

are the symmetric powersRd=S^d(R) of the representationR=R1.

Exercise 6.2. Similarly, the group G = GLn(C) has a natural representation (the vector representation) V = Cⁿ of dimension n, and representations in T^k(V), S^k(V), Λ^k(V). Compute the dimensions of these representations. Forn= 2, write down the matricesS²(g),T²(g) and Λ²(g) for a generic matrixg.

For a finite group, a linear representation is the same thing as a module over the group algebra KG.

Two representations R, R^′ on V, V^′ are said to be isomorphic (or equivalent) if there is a linear isomorphismf : V → V^′ such that f ◦R=R^′◦f.

A representation is said to be irreducible if it has no nontrivial G-invariant subspace.

Exercise 6.3. Check that the representation ofS_nby permutation matrices is not irreducible. In the casen= 3, decomposeC³as a direct sum of irreducible representations.

Let V and V^′ be two irreducible representations of some group G, and suppose we have a map f : V → V^′ which commutes withG: f ◦R= R^′◦f. This is called an intertwining operator.

If f is not injective, its kernel is a G-invariant subspace, so we must have f = 0.

Similarly, if f is not surjective, its image is G-invariant, so again, f = 0. Hence, if f 6= 0, it must be an isomorphism. We can then assume that V =V^′ and R=R^′. If we assume now thatK=C(which will always be the case in the sequel), thenf has at least one eigenvalueλ. The mapf−λI_V has now a nonzero kernel, and commutes with G. It must therefore be zero, and we have proved:

Lemma 6.1 (Schur’s lemma). There are no nonzero intertwining operators be- tween two non-isomophic irreducible representations of a group. Moreover, over C, the space of intertwining operators between two isomorphic irreducible representations is one dimensional. In particular, any endomorphism of an irreducible representation commuting with G must be a scalar multiple of the identity.

6.1. Finite groups. We shall now concentrate on the case of representations of finite groups over C. When there is no ambiguity, we shall drop the symbol R and write only gv for R(g)v.

So, let V, V^′ be two representations of a finite group G. Given any linear map f : V →V^′, we can build an intertwining operator ˆf by averaging it as

(389) fˆ= 1

|G| X

g∈G

g⁻¹f g .