Extending metric multidimensional scaling with Bregman divergences

Mr. Jigang Sun
Supervisor: Prof. Colin Fyfe

Nov 2009

Multidimensional Scaling (MDS)

• A group of information visualisation methods that project data from a high dimensional space to a low dimensional space, often of two or three dimensions, keeping the inter-point dissimilarities (e.g. distances) in the low dimensional space as close as possible to the original dissimilarities in the high dimensional space. When Euclidean distances are used, it is metric MDS.

Basic MDS: an example

[Figure: an example showing the high dimensional space (data space / input space) and the low dimensional space (latent space / output space).]

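Basic MDS

• We minimise the stress function

$$E_{BasicMDS} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} (L_{ij} - D_{ij})^2 = \sum_{i=1}^{N} \sum_{j=i+1}^{N} E_{ij}^2$$

where

$D_{ij} = \|X_i - X_j\|$, the distance between points $i$ and $j$ in data space

$L_{ij} = \|Y_i - Y_j\|$, the mapped distance between points $i$ and $j$ in latent space

$E_{ij} = \mathrm{abs}(L_{ij} - D_{ij})$, the error

[Figure: points $X_i$ and $X_j$ at distance $D_{ij}$ in data space, and points $Y_i$ and $Y_j$ at distance $L_{ij}$ in latent space.]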
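The stress function above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`pairwise_distances`, `basic_mds_stress`, `gradient_step`), the plain gradient-descent optimiser, the learning rate and the toy data are all hypothetical and are not taken from the slides.

```python
import math
import random

def pairwise_distances(points):
    """Return {(i, j): Euclidean distance} for every pair i < j."""
    n = len(points)
    return {(i, j): math.dist(points[i], points[j])
            for i in range(n) for j in range(i + 1, n)}

def basic_mds_stress(D, Y):
    """Sum over i < j of (L_ij - D_ij)^2, with L_ij measured between rows of Y."""
    L = pairwise_distances(Y)
    return sum((L[pair] - D[pair]) ** 2 for pair in D)

def gradient_step(D, Y, lr=0.01):
    """One plain gradient-descent step on the stress (assumed optimiser)."""
    n, dim = len(Y), len(Y[0])
    grad = [[0.0] * dim for _ in range(n)]
    for (i, j), d in D.items():
        l = math.dist(Y[i], Y[j]) or 1e-12  # guard against coincident points
        coeff = 2.0 * (l - d) / l           # derivative of (l - d)^2 along Y_i - Y_j
        for k in range(dim):
            g = coeff * (Y[i][k] - Y[j][k])
            grad[i][k] += g
            grad[j][k] -= g
    return [[y - lr * g for y, g in zip(Y[i], grad[i])] for i in range(n)]

# Toy data: the four corners of a unit square, embedded in 3-D data space.
X = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
D = pairwise_distances(X)

random.seed(0)
Y = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in X]  # random 2-D start
stress_before = basic_mds_stress(D, Y)
for _ in range(500):
    Y = gradient_step(D, Y)
stress_after = basic_mds_stress(D, Y)
print(stress_before, stress_after)
```

Because the toy points lie in a plane, a two dimensional layout with zero stress exists, so the stress printed after the descent should be far smaller than the starting value.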

Sammon Mapping (1969)

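$$E_{Sammon} = \frac{1}{C} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{(L_{ij} - D_{ij})^2}{D_{ij}} = \frac{1}{C} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{E_{ij}^2}{D_{ij}}$$

where

$E_{ij} = \mathrm{abs}(L_{ij} - D_{ij})$, the error

$C = \sum_{i=1}^{N} \sum_{j=i+1}^{N} D_{ij}$, the normalisation scalar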

Focuses on small distances: for the same error, the smaller distance is given bigger stress, thus on average the small distances are mapped more accurately than long distances. Small neighbourhoods are well preserved.
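A small numeric sketch of this weighting, assuming the Sammon stress is the sum of squared errors divided by the data-space distance and normalised by the sum of all data-space distances. The function name `sammon_terms` and the two-pair toy data are hypothetical.

```python
def sammon_terms(D, L):
    """Per-pair Sammon stress contributions (L_ij - D_ij)^2 / (C * D_ij)."""
    C = sum(D)  # normalisation scalar: sum of all data-space distances
    return [(l - d) ** 2 / (C * d) for d, l in zip(D, L)]

def sammon_stress(D, L):
    """Total Sammon stress for aligned lists of data and latent distances."""
    return sum(sammon_terms(D, L))

# Two pairs with the same absolute error (0.5) but different original distances.
D = [1.0, 10.0]
L = [1.5, 10.5]
terms = sammon_terms(D, L)
print(terms, sammon_stress(D, L))
```

In this example the pair with the short original distance contributes ten times the stress of the pair with the long one, because the two errors are equal and the distances differ by a factor of ten.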

Bregman divergence

$$d_F(p, q) = F(p) - F(q) - \langle p - q, \nabla F(q) \rangle$$

is the Bregman divergence between p and q based on a strictly convex function F. Intuitively, it is the difference between the value of F at point p and the value of the first-order Taylor expansion of F around point q, evaluated at point p.

• When F is a function of one variable, the Bregman divergence is the residual left after truncating the Taylor series of F around q at first order:

$$d_F(p, q) = F(p) - F(q) - F'(q)(p - q)$$

• A useful property for MDS is non-negativity:

$$d_F(p, q) \ge 0, \quad \text{and} \quad d_F(p, q) = 0 \iff p = q$$

• If $d_F(p, q)$ is regarded as a function of p, p approaches q when it is minimised.
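The definition can be checked numerically with a short sketch. The helper `bregman` and the two base functions used here (squared norm and negative entropy) are illustrative choices, not functions named on the slides.

```python
import math

def bregman(F, grad_F, p, q):
    """d_F(p, q) = F(p) - F(q) - <p - q, grad F(q)> for equal-length sequences."""
    inner = sum((pi - qi) * gi for pi, qi, gi in zip(p, q, grad_F(q)))
    return F(p) - F(q) - inner

# F(x) = sum of x_k^2: the divergence is the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)

def sq_norm_grad(x):
    return [2.0 * v for v in x]

# F(x) = sum of x_k log x_k, for positive entries only.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)

def neg_entropy_grad(x):
    return [math.log(v) + 1.0 for v in x]

p, q = [1.0, 2.0], [2.0, 4.0]
d_sq = bregman(sq_norm, sq_norm_grad, p, q)
d_ent = bregman(neg_entropy, neg_entropy_grad, p, q)
d_ent_reversed = bregman(neg_entropy, neg_entropy_grad, q, p)
d_same = bregman(neg_entropy, neg_entropy_grad, q, q)
print(d_sq, d_ent, d_ent_reversed, d_same)
```

The squared-norm case reproduces the squared Euclidean distance between p and q, the divergence of a point from itself is zero, and the entropy-based divergence differs when its arguments are swapped, which shows that a Bregman divergence is not symmetric in general.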

Bregman divergence

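MDS using Bregman divergence

• Bregmanised MDS:

$$E_{BMMDS} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} d_F(L_{ij}, D_{ij}) = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \left( F(L_{ij}) - F(D_{ij}) - (L_{ij} - D_{ij}) F'(D_{ij}) \right)$$

• Equivalent expression, as a residual Taylor series:

$$E_{BMMDS} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \sum_{n=2}^{\infty} \frac{F^{(n)}(D_{ij})}{n!} (L_{ij} - D_{ij})^n$$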
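A minimal sketch of the two forms, assuming the Bregmanised stress is the sum over pairs of the one-variable divergence $d_F(L_{ij}, D_{ij})$, i.e. with the latent distance as first argument and the expansion taken around the data distance. The base function $F(x) = e^x$ is chosen here only because all of its derivatives are known in closed form; it is not taken from the slides.

```python
import math

def bmmds_stress(F, dF, D, L):
    """Sum of d_F(l, d) = F(l) - F(d) - (l - d) F'(d) over aligned distance lists."""
    return sum(F(l) - F(d) - (l - d) * dF(d) for d, l in zip(D, L))

def residual_taylor(nth_derivative, d, l, max_order):
    """Sum for n = 2..max_order of F^(n)(d) (l - d)^n / n!."""
    return sum(nth_derivative(n, d) * (l - d) ** n / math.factorial(n)
               for n in range(2, max_order + 1))

# Illustrative base function: F(x) = exp(x), whose n-th derivative is exp(x).
def exp_derivative(n, x):
    return math.exp(x)

D = [1.0, 2.0, 3.0]   # data-space distances (toy values)
L = [1.2, 1.7, 3.1]   # latent-space distances (toy values)

exact = bmmds_stress(math.exp, math.exp, D, L)
series = sum(residual_taylor(exp_derivative, d, l, 12) for d, l in zip(D, L))
print(exact, series)
```

The two printed values should agree to many decimal places, since the series is just the Taylor expansion of F around each $D_{ij}$ with the constant and linear terms removed.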

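Basic MDS is a special BMMDS

• The base convex function is chosen as $F(x) = x^2$.

• Its second derivative is $F''(x) = 2$, and the higher order derivatives are $F^{(n)}(x) = 0$ for $n > 2$.

• So only the second-order term of the residual Taylor series remains, and the stress is derived as

$$E_{BMMDS} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} (L_{ij} - D_{ij})^2 = E_{BasicMDS}$$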
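The special case can be verified directly, assuming the base function is $F(x) = x^2$, the standard choice that turns a Bregman divergence into a squared difference. The helper name `bregman_1d` and the toy distances are hypothetical.

```python
def bregman_1d(F, dF, p, q):
    """One-variable Bregman divergence F(p) - F(q) - (p - q) F'(q)."""
    return F(p) - F(q) - (p - q) * dF(q)

def square(x):
    return x * x

def square_derivative(x):
    return 2.0 * x

D = [1.0, 2.5, 4.0]   # data-space distances (toy values)
L = [1.3, 2.0, 4.4]   # latent-space distances (toy values)

bregman_stress = sum(bregman_1d(square, square_derivative, l, d) for d, l in zip(D, L))
basic_stress = sum((l - d) ** 2 for d, l in zip(D, L))
print(bregman_stress, basic_stress)
```

Algebraically, $l^2 - d^2 - 2d(l - d) = (l - d)^2$, so the two sums agree up to floating-point rounding.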

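Example 2: Extended Sammon

• Base convex function:

$$F(x) = x \log x, \quad x > 0$$

• This is equivalent to

$$E_{ExtendedSammon} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \left( L_{ij} \log \frac{L_{ij}}{D_{ij}} - L_{ij} + D_{ij} \right) = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \left( \frac{(L_{ij} - D_{ij})^2}{2 D_{ij}} - \frac{(L_{ij} - D_{ij})^3}{6 D_{ij}^2} + \frac{(L_{ij} - D_{ij})^4}{12 D_{ij}^3} - \cdots \right)$$

where the series form holds for $|L_{ij} - D_{ij}| < D_{ij}$.

• The Sammon mapping is rewritten as

$$E_{Sammon} = \frac{1}{C} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{(L_{ij} - D_{ij})^2}{D_{ij}}$$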

Sammon and Extended Sammon

• The common term, up to a constant factor, is $\frac{(L_{ij} - D_{ij})^2}{D_{ij}}$.

• The Sammon mapping is considered to be an approximation to the Extended Sammon mapping using the common term.

• The Extended Sammon mapping will do more adjustments on the basis of the higher order terms.
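The relation between the two mappings can be illustrated for a single pair of distances. The sketch assumes the Extended Sammon stress of a pair is the Bregman divergence $d_F(L_{ij}, D_{ij})$ with $F(x) = x \log x$, and that the series is expanded around $D_{ij}$. The function names and the numbers are hypothetical.

```python
import math

def extended_sammon_pair(d, l):
    """d_F(l, d) for F(x) = x log x, which simplifies to l*log(l/d) - l + d."""
    return l * math.log(l / d) - l + d

def taylor_term(n, d, l):
    """n-th order term of the residual series: (-1)^n (l - d)^n / (n (n-1) d^(n-1))."""
    return (-1) ** n * (l - d) ** n / (n * (n - 1) * d ** (n - 1))

d, l = 2.0, 2.4                     # data distance and latent distance (toy values)
exact = extended_sammon_pair(d, l)
sammon_like = taylor_term(2, d, l)  # (l - d)^2 / (2 d): the Sammon-style term
series = sum(taylor_term(n, d, l) for n in range(2, 30))
print(exact, sammon_like, series)
```

The second-order term alone is already close to the exact value, which is the sense in which the Sammon mapping approximates the Extended Sammon mapping. The remaining terms supply the further adjustment, and the series converges here because the error is smaller than the data distance.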

An Experiment on Swiss roll data set

At a glance

• Basic MDS captures the global curve, but poorly differentiates local points that share the same X and Y coordinates but differ in their Z coordinate.

• The Sammon mapping does better than BasicMDS.

• The Extended Sammon mapping is the best.

Distance preservation

• Horizontal axis: mean distances in data space, 40 sets.

• Vertical axis: relative mean distances in latent space.

• Sammon is better than BasicMDS, Extended Sammon is better than Sammon:

• Small distances are mapped closer to their original value in data space; long distances are mapped longer.

Distance preservation

Relative standard deviation

Relative standard deviation

• On short distances, Sammon has smaller variance than BasicMDS, Extended Sammon has smaller variance than Sammon, i.e. control of small distances is enhanced.

• Large distances are given more and more freedom in the same order as above.

LCMC: local continuity meta-criterion (L. Chen 2006)

• A common measure for assessing the projection quality of different MDS methods.

• Defined in terms of neighbourhood preservation.

• Value between 0 and 1, the higher the better.
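A rough sketch of a neighbourhood-preservation score in this spirit. It assumes the criterion is read as the average overlap between each point's K nearest neighbours in data space and in latent space, divided by K. Published definitions of LCMC may also subtract a chance-level term, which is omitted here. The function names and the toy data are hypothetical.

```python
import math

def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbours (itself excluded)."""
    n = len(points)
    neighbours = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        neighbours.append(set(order[:k]))
    return neighbours

def neighbourhood_overlap(X, Y, k):
    """Average fraction of k-nearest neighbours shared between data and latent space."""
    n = len(X)
    data_nn, latent_nn = knn_sets(X, k), knn_sets(Y, k)
    shared = sum(len(a & b) for a, b in zip(data_nn, latent_nn))
    return shared / (n * k)

# Toy data: points on a line in 3-D, and a 1-D projection that keeps every distance.
X = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
Y = [(0.0,), (1.0,), (3.0,), (6.0,), (10.0,)]
score = neighbourhood_overlap(X, Y, 2)
print(score)
```

A projection that preserves every neighbourhood reaches the maximum score, and any projection scores between 0 and 1 under this reading.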

Quality assessed by LCMC

Stress comparison between Sammon and Extended Sammon

Stress comparison between Sammon and Extended Sammon

• For the Extended Sammon mapping, a latent distance that is too short (e.g. $D_{ij} - L_{ij} = 2$) is penalised more than a latent distance that is too long by the same amount (e.g. $D_{ij} - L_{ij} = -2$).
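The asymmetry can be seen with one worked pair, assuming the Extended Sammon stress of a pair is $d_F(L_{ij}, D_{ij})$ with $F(x) = x \log x$. The data distance of 10 is an arbitrary illustrative value.

```python
import math

def extended_sammon_pair(d, l):
    """d_F(l, d) for F(x) = x log x, i.e. l*log(l/d) - l + d."""
    return l * math.log(l / d) - l + d

D_ij = 10.0
too_short = extended_sammon_pair(D_ij, 8.0)    # D_ij - L_ij = 2
too_long = extended_sammon_pair(D_ij, 12.0)    # D_ij - L_ij = -2
symmetric_term = 2.0 ** 2 / (2.0 * D_ij)       # second-order term, same for both signs
print(too_short, symmetric_term, too_long)
```

The second-order term treats both errors identically, while the full divergence lies above it for the shortened distance and below it for the lengthened one.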

Stress formation by items

Stress formation by terms

• Stress coming from the term of the Sammon mapping is the largest. It is the main part of stress.

• However, for small distances, the contribution from other terms is not negligible.
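A short sketch of how much of a pair's stress comes from the terms beyond the second-order one, again assuming the per-pair stress is $d_F(L_{ij}, D_{ij})$ with $F(x) = x \log x$. The function names and the two example pairs are hypothetical.

```python
import math

def extended_sammon_pair(d, l):
    """d_F(l, d) for F(x) = x log x, i.e. l*log(l/d) - l + d."""
    return l * math.log(l / d) - l + d

def higher_order_share(d, l):
    """Fraction of the pair's stress not explained by the term (l - d)^2 / (2 d)."""
    total = extended_sammon_pair(d, l)
    second_order = (l - d) ** 2 / (2.0 * d)
    return (total - second_order) / total

# Same absolute error (latent distance 0.5 too short), different data distances.
share_small = higher_order_share(1.0, 0.5)
share_large = higher_order_share(10.0, 9.5)
print(share_small, share_large)
```

For the short data distance the higher order terms account for a visibly larger share of the stress than for the long one, which matches the observation that they are not negligible for small distances.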

OpenBox, Sammon and FirstGroup

SecondGroup on OpenBox

Future work

• Combining two opposite strategies for choosing base convex functions.

• Right Bregman divergences is one kind of CCA.

Conclusion

• Applied Bregman divergences to multidimensional scaling.

• Shown that basic MMDS is a special case and Sammon mapping approximates a BMMDS.

• Improved upon both with 2 families of divergences.

• Shown results on two artificial data sets.