
An Introduction to Invariant Theory

Harm Derksen, University of Michigan

Optimization, Complexity and Invariant Theory, Institute for Advanced Study, June 4, 2018


Plan of the Talk

- applications of invariants

- a classical, motivating example: binary forms

- polynomial rings, ideals

- group representations and invariant rings

- Hilbert’s Finiteness Theorem

- the null cone and the Hilbert-Mumford criterion

- degree bounds for invariants

- polarization of invariants and Weyl’s Theorem

- Invariant Theory for other fields


Applications of Invariants

Definition

an invariant is a quantity or expression that stays the same under certain operations

the total energy in a physical system is an invariant as the system evolves over time

loop invariants can be used to prove the correctness of an algorithm: although the number of iterations in a loop may vary, the loop invariant allows us to say something about the variables after the iterations

knot invariants (such as the Jones polynomial) can be used to distinguish knots

knot invariants remain unchanged under Reidemeister moves

(co-)homology groups are invariants of topological manifolds

Invariant Theory

in invariant theory we restrict ourselves to

- invariants that are polynomial functions on a vector space

- invariants that remain unchanged under group symmetries such as rotations, permutations etc.

we start with a motivating example from 19th century invariant theory


Classical Invariant Theory: Binary Forms

a binary form of degree 2 is a polynomial

$$p(z, w) = p_1 z^2 + p_2 zw + p_3 w^2$$

with p1, p2, p3 ∈ C

$$\mathrm{SL}_2 = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : ad - bc = 1 \right\}$$

is the group of 2 × 2 matrices with determinant one

a matrix $A \in \mathrm{SL}_2$ gives a linear change of coordinates in $\mathbb{C}^2$

the group $\mathrm{SL}_2$ acts on (the coefficients of) binary forms: we make the substitution $(z, w) \mapsto (az + cw, bz + dw)$ and get another polynomial

$$p'(z, w) = p(az + cw, bz + dw) = p'_1 z^2 + p'_2 zw + p'_3 w^2$$

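for $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2$, expanding $p(az + cw, bz + dw)$ gives

$$\begin{pmatrix} p'_1 \\ p'_2 \\ p'_3 \end{pmatrix} = \begin{pmatrix} a^2 & ab & b^2 \\ 2ac & ad + bc & 2bd \\ c^2 & cd & d^2 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix}$$

the polynomial $f(x_1, x_2, x_3) = x_2^2 - 4x_1x_3 \in \mathbb{C}[x_1, x_2, x_3]$ (the discriminant) can be viewed as a function from $\mathbb{C}^3$ to $\mathbb{C}$ and an easy calculation shows that

$$f(p_1, p_2, p_3) = p_2^2 - 4p_1p_3 = (p'_2)^2 - 4p'_1p'_3 = f(p'_1, p'_2, p'_3)$$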

we say that $f(x_1, x_2, x_3)$ is an invariant under the action of $\mathrm{SL}_2$

$f(x_1, x_2, x_3)$ is a fundamental invariant that generates all invariants: if $h(x_1, x_2, x_3)$ is another polynomial invariant, then there exists a polynomial $q(y)$ such that $h(x_1, x_2, x_3) = q(f(x_1, x_2, x_3))$
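The "easy calculation" for the discriminant can also be checked numerically. The following is a minimal sketch with exact rational arithmetic; the helper names (`substitute`, `evaluate`, `discriminant`, `random_sl2`) are illustrative choices, not notation from the talk, and the coefficient formulas in `substitute` are obtained by expanding $p(az + cw, bz + dw)$ by hand.

```python
from fractions import Fraction
import random

def substitute(p, g):
    """Coefficients (p1', p2', p3') of p(a*z + c*w, b*z + d*w)."""
    p1, p2, p3 = p
    (a, b), (c, d) = g
    return (
        a * a * p1 + a * b * p2 + b * b * p3,
        2 * a * c * p1 + (a * d + b * c) * p2 + 2 * b * d * p3,
        c * c * p1 + c * d * p2 + d * d * p3,
    )

def evaluate(p, z, w):
    """Value of p1*z^2 + p2*z*w + p3*w^2."""
    p1, p2, p3 = p
    return p1 * z * z + p2 * z * w + p3 * w * w

def discriminant(p):
    """The invariant f(p1, p2, p3) = p2^2 - 4*p1*p3."""
    p1, p2, p3 = p
    return p2 * p2 - 4 * p1 * p3

def random_sl2(rng):
    """A rational matrix of determinant 1, built from three elementary matrices."""
    s, t, u = (Fraction(rng.randint(-5, 5)) for _ in range(3))
    # [[1, s], [0, 1]] * [[1, 0], [t, 1]] * [[1, u], [0, 1]]
    return ((1 + s * t, (1 + s * t) * u + s), (t, t * u + 1))

if __name__ == "__main__":
    rng = random.Random(0)
    g = random_sl2(rng)
    p = (Fraction(1), Fraction(-3), Fraction(2))
    # The two printed values should agree.
    print(discriminant(p), discriminant(substitute(p, g)))
```

Comparing `evaluate(substitute(p, g), z, w)` with `evaluate(p, a*z + c*w, b*z + d*w)` at a few points is a quick way to confirm the coefficient formulas themselves.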


we may identify binary forms of degree n with vectors in $\mathbb{C}^{n+1}$:

$$p_1 z^n + p_2 z^{n-1} w + \cdots + p_{n+1} w^n \;\leftrightarrow\; \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_{n+1} \end{pmatrix}$$

the vector space of binary forms of degree n is an (n + 1)-dimensional representation of $\mathrm{SL}_2$

polynomial invariants for binary forms of arbitrary degree were extensively studied in the 19th century by mathematicians like Boole, Sylvester, Cayley, Aronhold, Hermite, Eisenstein, Clebsch, Gordan, Lie, Klein, Capelli etc.

Theorem (Gordan 1868)

for binary forms of degree d there exists a finite system of fundamental invariants that generate all invariants (i.e., every invariant is a polynomial expression in the fundamental invariants)

one of the main objectives was to find an explicit system of fundamental invariants for binary forms up to degree d

(currently known for d ≤ 10)

The Polynomial Ring

$x_1, x_2, \dots, x_n$ coordinate functions on $V = \mathbb{C}^n$

a polynomial $f(x_1, \dots, x_n)$ can be viewed as a function from V to C

$\mathbb{C}[x] = \mathbb{C}[x_1, \dots, x_n]$ graded ring of polynomial functions

Definition (Ideal)

a subset I ⊆ C[x] is an ideal if

1. 0 ∈ I;

2. f(x), g(x) ∈ I ⇒ f(x) + g(x) ∈ I;

3. f(x) ∈ C[x], g(x) ∈ I ⇒ f(x)g(x) ∈ I.

Hilbert’s Basis Theorem

the ideal (S) generated by a subset S ⊆ C[x] is

$$\{a_1(x)f_1(x) + \cdots + a_r(x)f_r(x) \mid r \in \mathbb{N},\ \forall i:\ a_i(x) \in \mathbb{C}[x],\ f_i(x) \in S\}$$

Theorem (Hilbert 1890)

every ideal I ⊆ C[x] is generated by a finite set (C[x] is noetherian)

if S ⊆ C[x], then (S) = (T) for some finite subset T ⊆ S

Hilbert used this theorem to prove his Finiteness Theorem in Invariant Theory (discussed later)

Action of a Group G

suppose $V = \mathbb{C}^n$ is a representation of a group G

this means that every g ∈ G acts by some n × n matrix $M_g : V \to V$ (so $g \cdot v = M_g v$)

and we have $M_e = I$ and $M_{gh} = M_g M_h$

this also implies $M_{g^{-1}} = (M_g)^{-1}$

if f(x) ∈ C[x] and $M = (m_{i,j})$ is an n × n matrix, then $v \mapsto f(Mv)$ is a polynomial function given by the formula

$$f\Big(\sum_{j=1}^{n} m_{1,j} x_j, \dots, \sum_{j=1}^{n} m_{n,j} x_j\Big)$$

Action of a Group G

G acts on C[x] as follows:

if g ∈ G and f(x) ∈ C[x] then define (g · f)(x) ∈ C[x] by $(g \cdot f)(v) = f(M_{g^{-1}} v)$

(we use $M_{g^{-1}}$ instead of $M_g$ to make it a left action)

C[x] is an ∞-dimensional C-vector space

the monomials form a basis

G acts by linear transformations on C[x]

C[x] is an ∞-dimensional representation of G
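The remark about the inverse can be made concrete on a small example. This is a sketch only: polynomials are modelled as plain Python functions on pairs, the group elements are two integer matrices of determinant 1 chosen for illustration, and the names `act` and `act_without_inverse` are not from the talk.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def inverse_det1(A):
    """Inverse of a 2x2 matrix with determinant 1."""
    (a, b), (c, d) = A
    assert a * d - b * c == 1
    return [[d, -b], [-c, a]]

def apply(A, v):
    """Matrix-vector product A v for a 2x2 matrix A."""
    return (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])

def act(M, f):
    """(g . f)(v) = f(M_g^{-1} v), the convention used above."""
    Minv = inverse_det1(M)
    return lambda v: f(apply(Minv, v))

def act_without_inverse(M, f):
    """v -> f(M_g v); this composes in the opposite order."""
    return lambda v: f(apply(M, v))

g = [[1, 1], [0, 1]]
h = [[1, 0], [1, 1]]
f = lambda v: v[0] ** 2 + 3 * v[1]

v = (1, 0)
# With the inverse: g . (h . f) agrees with (gh) . f.
print(act(g, act(h, f))(v) == act(matmul(g, h), f)(v))
# Without the inverse the composite corresponds to hg rather than gh.
print(act_without_inverse(g, act_without_inverse(h, f))(v)
      == act_without_inverse(matmul(h, g), f)(v))
```

Since g and h do not commute here, dropping the inverse gives g · (h · f) = (hg) · f, which is a right action rather than a left one.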


The Invariant Ring

f(x) ∈ C[x] is G-invariant if (g · f)(x) = f(x) for all g ∈ G

f(x) ∈ C[x] is G-invariant if and only if it is constant on all G-orbits in V

Definition

$\mathbb{C}[x]^G$ is the set of all G-invariant polynomials in C[x]

$\mathbb{C}[x]^G$ is a subalgebra, i.e., it contains C and is closed under addition, subtraction and multiplication

if $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]$ then

$$\mathbb{C}[f_1(x), \dots, f_r(x)] := \{p(f_1(x), \dots, f_r(x)) \mid p(y_1, \dots, y_r) \in \mathbb{C}[y_1, \dots, y_r]\}$$

is the subalgebra of C[x] generated by $f_1(x), \dots, f_r(x)$.


The Symmetric Group

$G = S_n$ acts on $V = \mathbb{C}^n$ by permuting the coordinates

for $\sigma \in S_n$, $M_\sigma$ is the corresponding permutation matrix

$S_n$ acts on C[x] as

$$(\sigma \cdot f)(x_1, \dots, x_n) = f(x_{\sigma(1)}, \dots, x_{\sigma(n)})$$

define the k-th elementary symmetric function as

$$e_k(x) = \sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} x_{i_1} x_{i_2} \cdots x_{i_k}$$

for example $e_1 = x_1 + x_2 + \cdots + x_n$ and $e_n = x_1 x_2 \cdots x_n$

Theorem

$$\mathbb{C}[x]^{S_n} = \mathbb{C}[e_1(x), \dots, e_n(x)]$$
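As a small illustration of the theorem, the sketch below evaluates the $e_k$ at an integer point and checks one instance of "every symmetric polynomial is a polynomial in the $e_k$", namely the standard identity $x_1^2 + \cdots + x_n^2 = e_1^2 - 2e_2$. The function names and the sample point are illustrative assumptions, and checking at a point is of course weaker than a symbolic proof.

```python
from itertools import combinations, permutations
from math import prod

def elementary(k, xs):
    """k-th elementary symmetric polynomial evaluated at the point xs."""
    return sum(prod(c) for c in combinations(xs, k))

def is_symmetric_at(f, xs):
    """Check that f takes the same value on every permutation of the point xs."""
    return all(f(*perm) == f(*xs) for perm in permutations(xs))

def power_sum_2(*xs):
    """The symmetric polynomial x1^2 + ... + xn^2."""
    return sum(x * x for x in xs)

xs = (2, 3, 5)
e1, e2, e3 = (elementary(k, xs) for k in (1, 2, 3))

# The power sum is symmetric and equals e1^2 - 2*e2 at this point.
print(is_symmetric_at(power_sum_2, xs), power_sum_2(*xs) == e1 ** 2 - 2 * e2)
```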


Hilbert’s Finiteness Theorem

assume that G is (linearly) reductive, which means that every representation of G is a direct sum of irreducible representations

examples are $\mathrm{GL}_n$, $\mathrm{SL}_n$, $\mathrm{O}_n$, finite groups

$\mathbb{C}[x]^G$ is a finitely generated algebra, i.e., $\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_r(x)]$ for some r < ∞ and $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$

proof sketch:

J ⊆ C[x] ideal generated by all homogeneous, non-constant $f(x) \in \mathbb{C}[x]^G$ (∞ many!)

Basis Theorem: $J = (f_1(x), \dots, f_r(x))$ for some r < ∞ and homogeneous $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$

by induction one shows that $\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_r(x)]$
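The induction step is only named in the sketch. One standard way to carry it out, given here as an outline and not necessarily the argument used in the talk, relies on the Reynolds operator $R$, which is an added assumption: $R$ is the G-equivariant, degree-preserving linear projection from C[x] onto the invariants (it exists because G is linearly reductive) and satisfies $R(af) = R(a)f$ for invariant $f$. For a homogeneous invariant $f$ of positive degree:

```latex
% Assumption: R : C[x] -> C[x]^G is the Reynolds operator (not introduced in the talk so far).
% f lies in J, so it is a combination of f_1, ..., f_r with homogeneous coefficients a_i.
\begin{aligned}
f &= \sum_{i=1}^{r} a_i(x)\, f_i(x), \qquad \deg a_i = \deg f - \deg f_i < \deg f, \\
f &= R(f) = \sum_{i=1}^{r} R(a_i)(x)\, f_i(x), \qquad R(a_i) \in \mathbb{C}[x]^G .
\end{aligned}
```

Each $R(a_i)$ is an invariant of degree lower than $\deg f$, so by induction on the degree it already lies in $\mathbb{C}[f_1(x), \dots, f_r(x)]$, and hence so does $f$.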


Degree Bounds

Definition

$\beta(\mathbb{C}[x]^G)$ is the smallest d such that $\mathbb{C}[x]^G$ is generated by polynomials of degree ≤ d

Theorem (Jordan 1876)

for binary forms of degree d we have $\beta(\mathbb{C}[x_1, \dots, x_{d+1}]^{\mathrm{SL}_2}) \le d^6$

Theorem (Emmy Noether 1916)

if G is finite then $\beta(\mathbb{C}[x]^G) \le |G|$


A Constructive Proof

the proof of Hilbert’s finiteness theorem does not give an algorithm for finding generators, nor does it give an upper bound for $\beta(\mathbb{C}[x]^G)$ for arbitrary G

so Hilbert gave another, more constructive proof in 1893 of his Finiteness Theorem using his notion of the null cone

Hilbert’s Null cone

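for v ∈ V, $G \cdot v = \{g \cdot v \mid g \in G\}$ is the orbit of v

$\overline{G \cdot v} \subseteq V$ is the closure of the orbit

Theorem

$$\overline{G \cdot v} \cap \overline{G \cdot w} \ne \emptyset \iff f(v) = f(w) \text{ for all } f(x) \in \mathbb{C}[x]^G$$

⇒: $f \in \mathbb{C}[x]^G$ is constant on $\overline{G \cdot v}$ and $\overline{G \cdot w}$

Definition

Hilbert’s Null cone:

$$N := \{v \in V \mid 0 \in \overline{G \cdot v}\} = \{v \in V \mid f(v) = f(0) \text{ for all } f(x) \in \mathbb{C}[x]^G\}$$

if $\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_r(x)]$ with $f_1(x), \dots, f_r(x)$ homogeneous, non-constant, then $N = \{v \in V \mid f_1(v) = \cdots = f_r(v) = 0\}$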


Example: Multiplicative Group

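$G = \mathbb{C}^\star$, $V = \mathbb{C}^4$

for $t \in \mathbb{C}^\star$, define

$$t \cdot \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{pmatrix} = \begin{pmatrix} t & 0 & 0 & 0 \\ 0 & t & 0 & 0 \\ 0 & 0 & t^{-1} & 0 \\ 0 & 0 & 0 & t^{-1} \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{pmatrix} = \begin{pmatrix} t v_1 \\ t v_2 \\ t^{-1} v_3 \\ t^{-1} v_4 \end{pmatrix}$$

N = {v1 = v2 = 0} ∪ {v3 = v4 = 0}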

Example: Multiplicative Group

$$\mathbb{C}[x_1, x_2, x_3, x_4]^{\mathbb{C}^\star} = \mathbb{C}[x_1x_3, x_1x_4, x_2x_3, x_2x_4]$$

N = {v1v3 = v1v4 = v2v3 = v2v4 = 0} = {v1 = v2 = 0} ∪ {v3 = v4 = 0}

Note that in this case, there is an algebraic relation between the generators, namely

(x1x3)(x2x4) = (x1x4)(x2x3)
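The claims in this example are easy to test on sample points. The sketch below uses exact rationals; `act`, `generators` and `in_null_cone` are illustrative names, and the null cone is computed from the generators as in the general description of N above. It checks invariance and the relation at given points only, not as polynomial identities.

```python
from fractions import Fraction
from itertools import product

def act(t, v):
    """t . (v1, v2, v3, v4) = (t*v1, t*v2, v3/t, v4/t)."""
    v1, v2, v3, v4 = v
    return (t * v1, t * v2, v3 / t, v4 / t)

def generators(v):
    """Values of x1*x3, x1*x4, x2*x3, x2*x4 at the point v."""
    v1, v2, v3, v4 = v
    return (v1 * v3, v1 * v4, v2 * v3, v2 * v4)

def in_null_cone(v):
    """v is in N exactly when all generating invariants vanish at v."""
    return all(value == 0 for value in generators(v))

t = Fraction(3, 7)
v = (2, -1, 5, 4)
# The generators take the same values at v and at t . v.
print(generators(v) == generators(act(t, v)))

# Points of {0, 1}^4 that lie in the null cone.
print([w for w in product((0, 1), repeat=4) if in_null_cone(w)])
```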

Hilbert-Mumford criterion

Definition

a one parameter subgroup (1-PSG) is a homomorphism of algebraic groups $\lambda : \mathbb{C}^\star \to G$

Theorem (Hilbert-Mumford criterion)

if $v \in V = \mathbb{C}^n$, then v ∈ N ⇔ there exists a 1-PSG $\lambda : \mathbb{C}^\star \to G$ with $\lim_{t \to 0} \lambda(t) \cdot v = 0$


Conjugation of n × n Matrices

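$V = \mathrm{Mat}_{n,n}$, the space of n × n matrices

$G = \mathrm{GL}_n$ (the group of invertible n × n matrices) acts by conjugation: if $A = (a_{i,j}) \in V$ and g ∈ G then $g \cdot A = gAg^{-1}$

if

$$(\star) \quad \lambda(t) = \begin{pmatrix} t^{k_1} & & \\ & \ddots & \\ & & t^{k_n} \end{pmatrix}$$

with $k_1 \ge k_2 \ge \cdots \ge k_n$, then

$$\lambda(t) \cdot A = \lambda(t) A \lambda(t)^{-1} = (t^{k_i - k_j} a_{i,j}).$$

so if $\lim_{t \to 0} \lambda(t) \cdot A = 0$ then A is strictly upper triangular; conversely, if A is strictly upper triangular then the limit is 0 whenever $k_1 > k_2 > \cdots > k_n$

every 1-PSG is of the form $(\star)$ after a base change, so

A ∈ N ⇔ A is conjugate to a strictly upper triangular matrix ⇔ A is nilpotent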
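The entrywise formula makes the limit behaviour easy to observe. The following sketch assumes the diagonal one-parameter subgroup with exponents (2, 1, 0) and two sample 3 × 3 matrices chosen for illustration; `conjugate_by_one_psg` and `max_abs` are hypothetical helper names.

```python
from fractions import Fraction

def conjugate_by_one_psg(A, ks, t):
    """Entries t^(k_i - k_j) * a_ij of lambda(t) A lambda(t)^(-1),
    where lambda(t) = diag(t^k_1, ..., t^k_n)."""
    n = len(A)
    return [[t ** (ks[i] - ks[j]) * A[i][j] for j in range(n)] for i in range(n)]

def max_abs(M):
    """Largest absolute value of an entry of M."""
    return max(abs(x) for row in M for x in row)

ks = (2, 1, 0)
upper = [[0, 1, 2], [0, 0, 3], [0, 0, 0]]      # strictly upper triangular
not_upper = [[0, 1, 2], [1, 0, 3], [0, 0, 0]]  # one nonzero entry below the diagonal

for m in (1, 2, 3):
    t = Fraction(1, 10 ** m)
    # First column of output shrinks as t -> 0, second column grows.
    print(m, max_abs(conjugate_by_one_psg(upper, ks, t)),
          max_abs(conjugate_by_one_psg(not_upper, ks, t)))
```

For the second matrix the entry below the diagonal is multiplied by a negative power of t, so this particular subgroup does not send it to 0. That alone does not decide whether the matrix is in N, since the criterion allows any 1-PSG.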


$X = (x_{i,j})$ where $x_{i,j}$ are indeterminates

$$\det(tI - X) = t^n - f_1(x) t^{n-1} + \cdots + (-1)^n f_n(x)$$

where $x = (x_{1,1}, x_{1,2}, \dots, x_{n,n})$

$f_1(A) = \operatorname{trace}(A)$, $f_n(A) = \det(A)$

Theorem

$$\mathbb{C}[x]^G = \mathbb{C}[f_1(x), \dots, f_n(x)]$$

$$A \in N \iff f_1(A) = \cdots = f_n(A) = 0 \iff \det(tI - A) = t^n \iff A \text{ nilpotent}$$
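The invariants $f_1, \dots, f_n$ can be evaluated without expanding a determinant symbolically. The sketch below uses the Faddeev-LeVerrier recursion over exact rationals, which is a technique swapped in here for illustration and not mentioned in the talk; `char_invariants` and the other helper names are likewise assumptions.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def char_invariants(A):
    """Return [f_1(A), ..., f_n(A)] with
    det(tI - A) = t^n - f_1 t^(n-1) + ... + (-1)^n f_n,
    computed by the Faddeev-LeVerrier recursion."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]
    c = Fraction(1)  # coefficient of t^n in det(tI - A)
    fs = []
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A * M_(k-1) + c_(n-k+1) * I
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
        c = -trace(matmul(A, M)) / k  # coefficient of t^(n-k)
        fs.append((-1) ** k * c)
    return fs

A = [[1, 2], [3, 4]]
g = [[1, 1], [0, 1]]
g_inv = [[1, -1], [0, 1]]
conjugated = matmul(matmul(g, A), g_inv)

# Trace and determinant are unchanged by conjugation.
print(char_invariants(A) == char_invariants(conjugated))

# All invariants vanish on a nilpotent matrix.
nilpotent = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(char_invariants(nilpotent))
```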


Degree Bounds

suppose $f_1(x), \dots, f_r(x) \in \mathbb{C}[x]^G$ are homogeneous and $N = \{v \mid f_1(v) = \cdots = f_r(v) = 0\}$

then there exist finitely many homogeneous invariants $h_1(x), \dots, h_s(x)$ such that every invariant $p(x) \in \mathbb{C}[x]^G$ is of the form