We use Numerical continuation Bifurcation theory with symmetries to analyze a class of optimization...

Date post: 21-Dec-2015

A Class of Problems

We use

• Numerical continuation
• Bifurcation theory with symmetries

to analyze a class of optimization problems of the form

$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} \big( G(q) + \beta D(q) \big)$.

The goal is to solve for $\beta = B \in (0,\infty)$, where:

• $\Delta := \{\, q(z|y) \;|\; \sum_{z \in Z} q(z|y) = 1 \;\; \forall y \in Y \,\} \subset \mathbb{R}^n$.
• G and D are infinitely differentiable in $\Delta$.
• G is strictly concave.
• D is convex.
• G and D must be invariant under relabeling of the classes.
• The Hessian of F is block diagonal with N blocks $\{B_\nu\}$, and $B_\nu = B_\eta$ if $q(z_\nu|y) = q(z_\eta|y)$ for every $y \in Y$.

Problems in this Class

• Deterministic Annealing (Rose 1998): $\max\, H(Z|Y) - \beta D(Y,Z)$. Clustering algorithm.
• Rate Distortion Theory (Shannon ~1950): $\max\, -I(Y,Z) - \beta D(Y,Z)$. Optimal source coding.
• Information Distortion (Dimitrov and Miller 2001): $\max\, H(Z|Y) + \beta I(X,Z)$. Used in neural coding.
• Information Bottleneck Method (Tishby, Pereira, Bialek 2000): $\max\, -I(Y,Z) + \beta I(X,Z)$. Used for document classification, gene expression, neural coding and spectral analysis.
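As a concrete instance of the form F = G + βD, the following minimal sketch evaluates the Information Distortion cost with G = H(Z|Y) and D = I(X,Z). The function names, the array layout (`q[z, y]` for q(z|y)) and the toy joint distribution are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def cost_terms(p_xy, q):
    """Return (H(Z|Y), I(X;Z)) in bits for a joint p_xy[x, y] and a quantizer q[z, y] = q(z|y)."""
    p_y = p_xy.sum(axis=0)
    safe_q = np.where(q > 0, q, 1.0)                 # log2(1) = 0, so entries with q = 0 contribute nothing
    h_z_given_y = float(-(p_y * q * np.log2(safe_q)).sum())
    p_xz = p_xy @ q.T                                # p(x,z) = sum_y p(x,y) q(z|y)
    outer = p_xz.sum(axis=1, keepdims=True) @ p_xz.sum(axis=0, keepdims=True)
    mask = p_xz > 0
    i_xz = float((p_xz[mask] * np.log2(p_xz[mask] / outer[mask])).sum())
    return h_z_given_y, i_xz

def information_distortion(p_xy, q, beta):
    """F(q, beta) = H(Z|Y) + beta * I(X;Z)."""
    h, i = cost_terms(p_xy, q)
    return h + beta * i

# Hypothetical joint distribution: two disjoint 2x2 blocks, uniform on its support.
p_xy = np.kron(np.eye(2), np.ones((2, 2))) / 8.0
q_uniform = np.full((2, 4), 0.5)                     # every y is split evenly over N = 2 classes
q_vertex = np.array([[1.0, 1.0, 0.0, 0.0],           # deterministic quantizer that follows the blocks
                     [0.0, 0.0, 1.0, 1.0]])

print(cost_terms(p_xy, q_uniform), cost_terms(p_xy, q_vertex))
```

For this toy joint the uniform quantizer carries all of its value in H(Z|Y) and the deterministic one carries all of it in I(X,Z), so which of the two scores higher depends on β.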

[Figure: the $2^{H(X)}$ input sequences (X axis) against the $2^{H(Y)}$ output sequences (Y axis), grouped into $2^{I(X,Y)}$ distinguishable input/output classes of (x,y) pairs, labeled 1 to 4. Size of an input/output class: $2^{H(X|Y) + H(Y|X)}$ pairs.]
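The counts in this picture can be checked on a small example. The sketch below assumes a hypothetical joint distribution made of 4 disjoint 2x2 blocks (uniform on its support) and computes the number of distinguishable classes, $2^{I(X,Y)}$, and the size of a class, $2^{H(X|Y)+H(Y|X)}$. The distribution and the helper name `entropy_bits` are illustrative choices only.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a probability array (zero entries contribute nothing)."""
    p = np.asarray(p, dtype=float).ravel()
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

# Hypothetical joint distribution: 4 disjoint 2x2 blocks on an 8x8 grid.
p_xy = np.kron(np.eye(4), np.ones((2, 2))) / 16.0

h_x = entropy_bits(p_xy.sum(axis=1))
h_y = entropy_bits(p_xy.sum(axis=0))
h_xy = entropy_bits(p_xy)

i_xy = h_x + h_y - h_xy                     # I(X,Y) = H(X) + H(Y) - H(X,Y)
num_classes = 2.0 ** i_xy
class_size = 2.0 ** ((h_xy - h_y) + (h_xy - h_x))   # 2^(H(X|Y) + H(Y|X))

print(num_classes, class_size)
```

With this block structure both quantities equal 4: there are four classes and each block holds four (x,y) pairs.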

Rate Distortion: How well is the source X represented by Z?

Information Distortion

Goal: Determine the input/output classes of (x,y) pairs.

Idea: We seek to quantize (X,Y) into clusters which correspond with the input/output classes.

Method: We determine a quantizer, Q*, between X and Z, a representation of Y using N elements, such that $F(Q^*, B)$ is a maximum for some $B \in (0,\infty)$.

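[Diagram: the input source X, with distribution p(X), is sent through P(Y|X) to the output source Y; q*(Z|Y) maps Y to the clustered outputs Z, which induces the quantizer Q*(Z|X) between X and Z.]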

Z is a representation of X using N symbols (or clusters)

A good communication system has p(X,Y) like:

Some nice properties of the problem

The feasible region $\Delta$, a product of simplices, is nice.

Lemma: $\Delta$ is the convex hull of vertices($\Delta$).

The optimal quantizer q* is DETERMINISTIC.

Theorem: The extrema of the problem lie generically on the vertices of $\Delta$.

Corollary: The optimal quantizer is invariant to small perturbations in the model.
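A small numerical illustration of why vertices matter: a convex function on $\Delta$ cannot exceed its largest vertex value. The sketch below uses I(X,Z) as the convex term (an assumption; any convex D would do), enumerates all deterministic quantizers for toy sizes, and compares them with random interior points. It checks this consequence of convexity only, not the theorem itself, and all names and sizes are hypothetical.

```python
import itertools
import numpy as np

def mutual_info_xz(p_xy, q):
    """I(X;Z) in bits, where q[z, y] = q(z|y) and Z depends on X only through Y."""
    p_xz = p_xy @ q.T                           # p(x,z) = sum_y p(x,y) q(z|y)
    p_x = p_xz.sum(axis=1, keepdims=True)
    p_z = p_xz.sum(axis=0, keepdims=True)
    mask = p_xz > 0
    return float((p_xz[mask] * np.log2(p_xz[mask] / (p_x @ p_z)[mask])).sum())

rng = np.random.default_rng(0)
K, N = 4, 2                                     # toy sizes: |Y| = 4 outputs, N = 2 classes
p_xy = rng.random((3, K))
p_xy /= p_xy.sum()

# Vertices of Delta: every y is assigned to exactly one class z.
vertex_values = []
for labels in itertools.product(range(N), repeat=K):
    q = np.zeros((N, K))
    q[list(labels), np.arange(K)] = 1.0
    vertex_values.append(mutual_info_xz(p_xy, q))

# Random interior points of Delta: each column q(.|y) is drawn from the simplex.
interior_values = []
for _ in range(200):
    q = rng.dirichlet(np.ones(N), size=K).T     # shape (N, K), columns sum to 1
    interior_values.append(mutual_info_xz(p_xy, q))

best_vertex = max(vertex_values)
best_interior = max(interior_values)
print(best_vertex >= best_interior)
```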


Solution of the problem when p(X,Y) := 4 Gaussian blobs

[Figure panels: p(X,Y); I(X,Z) vs. N]

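The Dynamical System

Goal: To efficiently solve $\max_{q \in \Delta} (G(q) + \beta D(q))$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to B$.

Method: Study the equilibria of the flow

$$\begin{pmatrix} \dot q \\ \dot \lambda \end{pmatrix} = \nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta), \qquad \mathcal{L}(q,\lambda,\beta) := G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \Big( \sum_{z} q(z|y) - 1 \Big).$$

• The Jacobian with respect to q of the K constraints $\{\sum_z q(z|y) - 1\}$ is $J = (I_K \;\; I_K \;\; \dots \;\; I_K)$.
• The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.
• The Hessian
$$\nabla^2_{q,\lambda} \mathcal{L}(q,\lambda,\beta) = \begin{pmatrix} \nabla^2_q F & J^T \\ J & 0 \end{pmatrix}$$
determines stability and location of bifurcation.

Assumptions:

• Let q* be a local solution to $\max_{q \in \Delta} F(q,\beta)$ and fixed by $S_M$.
• Call the M identical blocks of $\nabla^2_q F(q^*,\beta)$: B. Call the other N-M blocks of $\nabla^2_q F(q^*,\beta)$: $\{R_\nu\}$.
• At a singularity $(q^*,\lambda^*,\beta^*)$, B has a single nullvector v and $R_\nu$ is nonsingular for every $\nu$.
• If M<N, then $B \sum_\nu R_\nu^{-1} + M I_K$ is nonsingular.

Theorem: If $\nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$ is singular then $\nabla^2_q F(q^*,\beta^*)$ is singular.

Theorem: $(q^*,\lambda^*,\beta^*)$ is a bifurcation of equilibria of the flow if and only if $\nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$ is singular.

Theorem: If $(q^*,\lambda^*,\beta^*)$ is a bifurcation of equilibria of the flow, then $\beta^* \ge 1$.

Theorem: $\dim(\ker \nabla^2_q F(q^*,\beta^*)) = M$ with basis vectors $w_1, w_2, \dots, w_M$, where

$$[w_i]_\nu = \begin{cases} v & \text{if } \nu \text{ is the } i^{th} \text{ unresolved class} \\ 0 & \text{otherwise.} \end{cases}$$

Theorem: $\dim(\ker \nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)) = M-1$ with basis vectors

$$\left\{ \begin{pmatrix} w_i - w_M \\ 0 \end{pmatrix} \right\}_{i=1}^{M-1}.$$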
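The two kernel dimensions can be checked numerically on a toy singular point. The sketch below assumes M = N (every class unresolved), picks an arbitrary nullvector v, and builds a symmetric block B whose kernel is spanned by v. These choices are hypothetical and only exercise the linear algebra of the bordered matrix.

```python
import numpy as np

K, N = 3, 3                                   # toy sizes: K constraints, N classes, all M = N blocks identical
v = np.array([1.0, -2.0, 1.0])                # hypothetical nullvector of the block B

# Symmetric negative semidefinite block whose kernel is spanned by v.
B = -(np.eye(K) - np.outer(v, v) / (v @ v))

H = np.kron(np.eye(N), B)                     # block diagonal Hessian with N identical blocks
J = np.tile(np.eye(K), (1, N))                # J = (I_K I_K ... I_K)
bordered = np.block([[H, J.T], [J, np.zeros((K, K))]])

dim_ker_H = N * K - np.linalg.matrix_rank(H)
dim_ker_bordered = (N * K + K) - np.linalg.matrix_rank(bordered)

print(dim_ker_H, dim_ker_bordered)
```

The constraint rows J remove one direction from the kernel: a kernel vector of the bordered matrix must have class blocks that are multiples of v and sum to zero.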

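Continuation

• A local maximum $q_k^*(\beta_k)$ of $\max_{q \in \Delta} (G(q) + \beta_k D(q))$ is an equilibrium of the gradient flow $(\dot q, \dot\lambda) = \nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta_k)$.
• The initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving the matrix system

$$\nabla^2_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k).$$

• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton's method.

[Figure: the solution branch in the $(\beta, q)$ plane, showing the equilibrium $(q_k^*, \beta_k)$, the tangent direction $\partial_\beta q_k$, the initial guess $(q_{k+1}^{(0)}, \beta_{k+1}^{(0)})$ and the corrected equilibrium $(q_{k+1}^*, \beta_{k+1})$.]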
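The tangent system can be sketched in a few lines. Because $\mathcal{L}$ is linear in $\beta$, the right-hand side is minus the stacked vector $(\nabla D(q_k), 0)$. The function name `tangent_step`, the quadratic toy G and D, and the step-length rule $d_k = \Delta s / \sqrt{\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1}$ as written here are assumptions for illustration. The toy D is not invariant under relabeling; it is only meant to exercise the linear solve.

```python
import numpy as np

def tangent_step(hess_F, grad_D, J, ds):
    """Solve the bordered system for (dq/dbeta, dlambda/dbeta) and return a step length d_k."""
    n, K = hess_F.shape[0], J.shape[0]
    bordered = np.block([[hess_F, J.T], [J, np.zeros((K, K))]])
    rhs = -np.concatenate([grad_D, np.zeros(K)])     # minus d/dbeta of grad L
    sol = np.linalg.solve(bordered, rhs)
    dq, dlam = sol[:n], sol[n:]
    d_k = ds / np.sqrt(dq @ dq + dlam @ dlam + 1.0)
    return dq, dlam, d_k

# Toy data (all hypothetical): G(q) = -|q|^2 / 2 and D(q) = q^T A q / 2 with A positive semidefinite.
rng = np.random.default_rng(1)
K, N = 2, 3
n = N * K
R = rng.standard_normal((n, n))
A = R @ R.T / n
beta = 0.1
q = np.full(n, 1.0 / N)                              # satisfies the K constraints
J = np.tile(np.eye(K), (1, N))

dq, dlam, d_k = tangent_step(-np.eye(n) + beta * A, A @ q, J, ds=0.1)
q_guess = q + d_k * dq                               # predictor for the next continuation step
```

Since the last K rows of the system read $J\,\partial_\beta q_k = 0$, the predictor stays on the constraint set $\sum_z q(z|y) = 1$.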

Investigating the Dynamical System

How:

Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.

Use bifurcation theory with symmetries to understand bifurcations of the equilibria.

Bifurcations of $q^*(\beta)$

[Figure panels: Observed Bifurcations for the 4 Blob Problem; Conceptual Bifurcation Structure. Labels: $q^*(Y_N|Y)$ and the branch $q^* = 1/N$.]

Bifurcations with symmetry

To better understand the bifurcation structure, we capitalize on the symmetries of the optimization function $F(q,\beta)$.

The "obvious" symmetry is that $F(q,\beta)$ is invariant to relabeling of the N classes of Z.

The symmetry group of all permutations on N symbols is $S_N$.

The action of $S_N$ on $\begin{pmatrix} q \\ \lambda \end{pmatrix}$ and $\nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta)$ is represented by the finite Lie group

$$\Gamma := \left\{ \begin{pmatrix} P & 0_{n \times K} \\ 0_{K \times n} & I_K \end{pmatrix} \right\},$$

where P is a "block permutation" matrix.

The symmetry of $\begin{pmatrix} q \\ \lambda \end{pmatrix}$ is measured by its isotropy group, the subgroup of $\Gamma$ which fixes it.
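A block permutation matrix can be built as a Kronecker product of an N x N permutation matrix with $I_K$, assuming q is stored class by class (N consecutive blocks of length K, the ordering implied by $J = (I_K \dots I_K)$). The sketch below checks that such a relabeling leaves H(Z|Y) and the constraints unchanged. The helper names and the random test data are hypothetical.

```python
import numpy as np

def block_permutation(perm, K):
    """Matrix that places class block perm[i] of q at block position i."""
    N = len(perm)
    return np.kron(np.eye(N)[list(perm)], np.eye(K))

def cond_entropy_z_given_y(q_vec, p_y, N):
    """H(Z|Y) in nats for q stored as N consecutive blocks q(z|.) of length K."""
    Q = q_vec.reshape(N, -1)
    return float(-(p_y * Q * np.log(Q)).sum())

rng = np.random.default_rng(2)
K, N = 4, 3
p_y = rng.dirichlet(np.ones(K))
Q = rng.dirichlet(np.ones(N), size=K).T          # shape (N, K); each column is q(.|y)
q = Q.ravel()                                    # class-major ordering

P = block_permutation((2, 0, 1), K)              # relabel the three classes cyclically
J = np.tile(np.eye(K), (1, N))

# Element of the group acting on (q, lambda): P on q, identity on the K multipliers.
gamma = np.block([[P, np.zeros((N * K, K))],
                  [np.zeros((K, N * K)), np.eye(K)]])

print(cond_entropy_z_given_y(P @ q, p_y, N) - cond_entropy_z_given_y(q, p_y, N))
```

The identity block in the lower right reflects that relabeling the classes does not touch the multipliers, which are indexed by y.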

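The Equivariant Branching Lemma gives the existence of bifurcating solutions for every isotropy subgroup which fixes a one dimensional subspace of $\ker \nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$.

Theorem: Let $(q^*,\lambda^*,\beta^*)$ be a singular point of the flow

$$\begin{pmatrix} \dot q \\ \dot \lambda \end{pmatrix} = \nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta)$$

such that q* is fixed by $S_M$. Then there exist M bifurcating solutions, $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$, each with isotropy group $S_{M-1}$, where

$$[u_k]_\nu = \begin{cases} (M-1)v & \text{if } \nu \text{ is the } k^{th} \text{ unresolved class} \\ -v & \text{if } \nu \text{ is any other unresolved class} \\ 0 & \text{otherwise.} \end{cases}$$

What do the bifurcations look like?

Let

$$T(q^*,\beta^*) = \sum_{l,m,k} \frac{\partial^3 F(q^*,\beta^*)}{\partial q_l \, \partial q_m \, \partial q_k} [v]_l [v]_m [v]_k.$$

Transcritical or Degenerate?

Theorem: If $T(q^*,\beta^*) \ne 0$ and M>2, then the bifurcation at $(q^*,\beta^*)$ is transcritical. If $T(q^*,\beta^*) = 0$, it is degenerate.

Branch Orientation?

Theorem: If $T(q^*,\beta^*) > 0$ or if $T(q^*,\beta^*) < 0$, then the branch is supercritical or subcritical respectively. If $T(q^*,\beta^*) = 0$, then $\partial^4_{qqqq} F(q,\beta)$ dictates orientation.

Branch Stability?

Theorem: If $T(q^*,\beta^*) \ne 0$, then all branches fixed by $S_{M-1}$ are unstable.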
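The bifurcating direction $u_k$ and the cubic form T are both simple to assemble once v is known. The sketch below is illustrative only: the nullvector v, the list of unresolved classes and the third-derivative tensor are made-up inputs, and the function names are hypothetical.

```python
import numpy as np

def bifurcating_direction(v, k, unresolved, N):
    """Stack the N class blocks of u_k: (M-1)v for the k-th unresolved class, -v for the others."""
    M = len(unresolved)
    blocks = [np.zeros_like(v) for _ in range(N)]
    for i, cls in enumerate(unresolved):
        blocks[cls] = (M - 1) * v if i == k else -v
    return np.concatenate(blocks)

def cubic_form(third_derivs, v):
    """T = sum over l, m, k of third_derivs[l, m, k] * v[l] * v[m] * v[k]."""
    return float(np.einsum("lmk,l,m,k->", third_derivs, v, v, v))

v = np.array([2.0, -1.0])                         # hypothetical nullvector of the identical block B
N, K = 4, len(v)
u = bifurcating_direction(v, k=2, unresolved=[0, 1, 2, 3], N=N)
direction = np.concatenate([u, np.zeros(K)])      # (u_k, 0): no change in the multipliers

J = np.tile(np.eye(K), (1, N))

# Made-up third-derivative tensor with ones on the diagonal, so T = sum of v[l]**3.
third = np.zeros((K, K, K))
for l in range(K):
    third[l, l, l] = 1.0
T = cubic_form(third, v)
```

With M = N = 4 the block pattern of `u` is (-v, -v, 3v, -v), and the blocks sum to zero, so moving along `u` keeps every constraint $\sum_z q(z|y) = 1$ satisfied.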

Bifurcation Structure

[Figure: Partial lattice of the isotropy subgroups of $S_4$ (and associated bifurcating directions). $S_4$ sits at the top, four copies of $S_3$ below it, and copies of $S_2$ below those. Each node carries a bifurcating direction written as a column of blocks, with entries $3v$, $-v$ and $0$ at the $S_3$ level and $2v$, $-v$ and $0$ at the $S_2$ level.]

For the 4 blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches

isotropy group    bif direction
$S_4$             $(-v,-v,3v,-v,0)^T$
$S_3$             $(-v,2v,0,-v,0)^T$
$S_2$             $(-v,0,0,v,0)^T$
1                 … No more bifs!

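Other Branches

The Smoller-Wasserman Theorem ascertains the existence of bifurcating branches for every maximal isotropy subgroup.

Theorem: If M is a composite number, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order M in $\Gamma$ and every prime $p \mid M$. The bifurcating direction is in the $p-1$ dimensional subspace of $\ker \nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$ which is fixed by $\langle \gamma^p \rangle$.

We have never numerically observed solutions fixed by $\langle \gamma^p \rangle$, so perhaps they are unstable.

An example of redundancy: $(1423)^2 = (1324)^2 = (12)(34)$.

The full lattice of subgroups of the group $S_M$ is not known for arbitrary M.

[Figure: Lattice of the maximal isotropy subgroups $\langle \gamma^p \rangle$ in $S_4$. $S_4$ and $A_4$ sit at the top; below them are the subgroups $\langle (1324)^2 \rangle$, $\langle (1234)^2 \rangle$ and $\langle (1243)^2 \rangle$, each with its bifurcating direction, together with the cyclic subgroups $\langle (1324) \rangle$ and $\langle (1423) \rangle$.]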
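The redundancy example is easy to verify by composing permutations directly. This is a minimal sketch with hypothetical helper names; permutations are stored as dictionaries on {1, 2, 3, 4}.

```python
def from_cycles(cycles, n):
    """Build a permutation on {1, ..., n} as a dict from a list of cycles, e.g. [(1, 4, 2, 3)]."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a] = b
    return perm

def compose(f, g):
    """Return the permutation x -> f(g(x))."""
    return {x: f[g[x]] for x in g}

g1 = from_cycles([(1, 4, 2, 3)], 4)
g2 = from_cycles([(1, 3, 2, 4)], 4)
target = from_cycles([(1, 2), (3, 4)], 4)

g1_squared = compose(g1, g1)
g2_squared = compose(g2, g2)
print(g1_squared == target, g2_squared == target)
```

Two different elements of order 4 therefore generate the same subgroup after squaring, which is why listing one subgroup per element overcounts.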

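The efficient algorithm to solve $\max_q F(q,\beta)$

Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q (G(q) + \beta_k D(q))$. Iterate the following steps until $\beta_K = B$ for some K.

1. Perform $\beta$-step: solve

$$\nabla^2_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k)$$

for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$, where $d_k = \Delta s / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.

2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + d_k \, \partial_\beta q_k$.

3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}^*$, using initial guess $q_{k+1}^{(0)}$.

4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\nabla^2_q [G(q_k) + \beta_k D(q_k)]$ and $\nabla^2_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where $u$ is the bifurcating direction $u_k$ given by the Equivariant Branching Lemma, and repeat step 3.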
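The four steps can be sketched end to end on a toy problem. Everything specific below is an assumption made for illustration: G is the conditional entropy H(Z|Y) in nats, D is a made-up convex quadratic $\tfrac12 \sum_z q_z^T A q_z$ that is invariant under relabeling, the continuation starts at $\beta = 0$ from the uniform solution, and the sketch stops and reports the $\beta$ interval at the first detected sign change instead of switching branches.

```python
import numpy as np

def derivatives(Q, beta, p_y, A):
    """Gradient of F, gradient of D, and Hessian blocks of F; Q[z] holds q(z|.)."""
    grad_G = -p_y * (np.log(Q) + 1.0)             # G = -sum_y p(y) sum_z q log q
    grad_D = Q @ A                                # D = 1/2 sum_z q_z^T A q_z, A symmetric
    blocks = [-np.diag(p_y / Q[z]) + beta * A for z in range(Q.shape[0])]
    return (grad_G + beta * grad_D).ravel(), grad_D.ravel(), blocks

def bordered_matrix(blocks, J):
    """Assemble [[Hessian of F, J^T], [J, 0]] from the per-class Hessian blocks."""
    K, n = J.shape
    H = np.zeros((n, n))
    for z, Bz in enumerate(blocks):
        H[z * K:(z + 1) * K, z * K:(z + 1) * K] = Bz
    return np.block([[H, J.T], [J, np.zeros((K, K))]])

def continue_branch(p_y, A, N, beta0, beta_max, ds):
    """Follow the branch from the uniform solution; return the beta interval of the first sign change."""
    K = len(p_y)
    n = N * K
    J = np.tile(np.eye(K), (1, N))
    q, beta = np.full(n, 1.0 / N), beta0
    grad_F, _, _ = derivatives(q.reshape(N, K), beta, p_y, A)
    lam = -grad_F[:K]                             # multipliers at the uniform equilibrium
    while beta < beta_max:
        grad_F, grad_D, blocks = derivatives(q.reshape(N, K), beta, p_y, A)
        sign_old = np.linalg.slogdet(blocks[0])[0]
        # Step 1: tangent direction and step length.
        rhs = -np.concatenate([grad_D, np.zeros(K)])
        tangent = np.linalg.solve(bordered_matrix(blocks, J), rhs)
        d_k = ds / np.sqrt(tangent @ tangent + 1.0)
        # Step 2: initial guess along the tangent.
        beta_new = beta + d_k
        q_new, lam_new = q + d_k * tangent[:n], lam + d_k * tangent[n:]
        # Step 3: Newton corrector on grad L = 0.
        for _ in range(20):
            grad_F, _, blocks = derivatives(q_new.reshape(N, K), beta_new, p_y, A)
            residual = np.concatenate([grad_F + J.T @ lam_new, J @ q_new - 1.0])
            if np.linalg.norm(residual) < 1e-10:
                break
            delta = np.linalg.solve(bordered_matrix(blocks, J), -residual)
            q_new, lam_new = q_new + delta[:n], lam_new + delta[n:]
        # Step 4: compare the sign of the determinant of an identical block.
        if np.linalg.slogdet(blocks[0])[0] != sign_old:
            return beta, beta_new
        q, lam, beta = q_new, lam_new, beta_new
    return None

A = np.array([[2.0, -1.0], [-1.0, 2.0]])          # hypothetical positive definite matrix
interval = continue_branch(np.full(2, 0.5), A, N=2, beta0=0.0, beta_max=1.0, ds=0.05)
```

For this toy choice the identical block at the uniform solution is $-I + \beta A$, which first becomes singular at $\beta = 1/3$ (the reciprocal of the largest eigenvalue of A), so the returned interval should bracket that value. A fuller implementation would restart step 3 from $q_k + d_k u$ at that point.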

