Bregman Information Bottleneck NIPS’03, Whistler December 2003 Koby Crammer Hebrew University of...

Page 1: Bregman Information Bottleneck NIPS’03, Whistler December 2003 Koby Crammer Hebrew University of Jerusalem Noam Slonim Princeton University.

Bregman Information Bottleneck

NIPS’03, Whistler, December 2003

Koby Crammer, Hebrew University of Jerusalem

Noam Slonim, Princeton University

Page 2:

Motivation

• Extend the IB to a broad family of representations
• Relation to the exponential family

[Figure labels: "Hello, world"; Multinomial distribution; Vectors]

Page 3:

Outline

• Rate-Distortion Formulation
• Bregman Divergences
• Bregman IB
• Statistical Interpretation
• Summary

Page 4:

Information Bottleneck

[Diagram: variables X, T, Y. X is represented by the vector [ p(y=1|X) … p(y=n|X) ] and T by the vector [ p(y=1|T) … p(y=n|T) ].]

Page 5:

Rate-Distortion Formulation

• Input
• Variables
• Distortion
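
The formulas behind the Input, Variables and Distortion bullets are not present in the transcript. As a hedged reconstruction from the standard Information Bottleneck literature (not copied from the slide), the rate-distortion view is usually written as below. The trade-off parameter $\beta$ and the exact notation are assumptions.

```latex
\begin{align*}
\text{Input:} \quad & p(x, y) \\
\text{Variables:} \quad & p(t \mid x) \\
\text{Distortion:} \quad & d(x, t) = D_{\mathrm{KL}}\big[\, p(y \mid x) \,\|\, p(y \mid t) \,\big] \\
\text{Functional:} \quad & \min_{p(t \mid x)} \; I(X; T) + \beta \sum_{x, t} p(x)\, p(t \mid x)\, d(x, t)
\end{align*}
```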

Page 6:

Self-Consistent Equations

• Boltzmann distribution:
• Markov + Bayes
• Marginal
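
The three equations named by these bullets are missing from the transcript. In the standard Information Bottleneck setting they are commonly stated as follows. This is a sketch in assumed notation, with $Z(x, \beta)$ the normaliser (partition function) and $\beta$ the trade-off parameter.

```latex
\begin{align*}
\text{Boltzmann distribution:} \quad
  & p(t \mid x) = \frac{p(t)}{Z(x, \beta)}
    \exp\!\Big( -\beta \, D_{\mathrm{KL}}\big[\, p(y \mid x) \,\|\, p(y \mid t) \,\big] \Big) \\
\text{Markov + Bayes:} \quad
  & p(y \mid t) = \frac{1}{p(t)} \sum_{x} p(x)\, p(t \mid x)\, p(y \mid x) \\
\text{Marginal:} \quad
  & p(t) = \sum_{x} p(x)\, p(t \mid x)
\end{align*}
```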

Page 7:

Bregman Divergences

$f : S \to \mathbb{R}$

$B_f(v \| u) = f(v) - \big( f(u) + f'(u)(v - u) \big)$

[Figure: graph of $f$ with the points $(u, f(u))$ and $(v, f(v))$, and the point $(v,\ f(u) + f'(u)(v - u))$ on the tangent line at $u$.]
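
To make the definition concrete, here is a minimal sketch of the divergence as the gap between $f(v)$ and the tangent to $f$ at $u$, evaluated at $v$. The function names (`bregman`, `half_sq`, `xlogx`) are illustrative and not from the talk; the gradient is passed in by hand rather than derived automatically.

```python
import numpy as np

def bregman(f, grad_f, v, u):
    """B_f(v || u) = f(v) - (f(u) + <grad f(u), v - u>)."""
    v = np.asarray(v, dtype=float)
    u = np.asarray(u, dtype=float)
    return f(v) - (f(u) + np.dot(grad_f(u), v - u))

# f(x) = 1/2 ||x||^2, whose gradient is x itself.
half_sq = lambda x: 0.5 * np.dot(x, x)
half_sq_grad = lambda x: x

# f(x) = sum_i (x_i log x_i - x_i), whose gradient is log x (elementwise).
xlogx = lambda x: float(np.sum(x * np.log(x) - x))
xlogx_grad = np.log

d_sq = bregman(half_sq, half_sq_grad, [3.0, 0.0], [0.0, 4.0])

# Bregman divergences are generally not symmetric in their arguments.
d_vu = bregman(xlogx, xlogx_grad, [2.0], [1.0])
d_uv = bregman(xlogx, xlogx_grad, [1.0], [2.0])
```

With the quadratic choice the result is half the squared Euclidean distance between the two points; with the second choice, swapping the arguments changes the value.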

Page 8:

Bregman IB: Rate-Distortion Formulation

• Functional
• Bregman Function
• Input
• Variables
• Distortion
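
The slide's formulas are not in the transcript. One plausible reading, consistent with the Bregman divergence defined earlier and with the later statement that prototypes are convex combinations of input vectors, is sketched below. The symbols $v_x$ (input vector attached to $x$), $v_t$ (prototype of cluster $t$) and $\beta$ are assumptions, not the authors' notation.

```latex
\begin{align*}
\text{Bregman function:} \quad & f : S \to \mathbb{R}, \ \text{strictly convex} \\
\text{Input:} \quad & p(x), \ \text{vectors } v_x \in S \\
\text{Variables:} \quad & p(t \mid x), \ \text{prototypes } v_t \in S \\
\text{Distortion:} \quad & d(x, t) = B_f(v_x \,\|\, v_t) \\
\text{Functional:} \quad & \min \; I(X; T) + \beta \sum_{x, t} p(x)\, p(t \mid x)\, B_f(v_x \,\|\, v_t)
\end{align*}
```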

Page 9:

Self-Consistent Equations

• Boltzmann distribution:
• Prototypes: convex combination of input vectors
• Marginal
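
The three bullets suggest a fixed-point iteration: a Boltzmann-form soft assignment, prototypes as convex combinations of the inputs, and a marginal over clusters. The following is a minimal sketch under those assumptions, not the authors' algorithm. The names `bregman_ib`, `vx`, `px` and `beta` are hypothetical, and the default divergence is the half squared Euclidean distance.

```python
import numpy as np

def half_sq_euclidean(vx, vt):
    """Pairwise B_f(v_x || v_t) for f(x) = 1/2 ||x||^2; returns shape (n, k)."""
    diff = vx[:, None, :] - vt[None, :, :]
    return 0.5 * np.sum(diff ** 2, axis=2)

def prototypes_and_marginal(pt_x, px, vx):
    """Marginal p(t) and prototypes v_t = sum_x p(x|t) v_x."""
    pt = np.maximum(px @ pt_x, 1e-12)          # guard against empty clusters
    px_t = pt_x * px[:, None] / pt[None, :]    # p(x|t), shape (n, k)
    return pt, px_t.T @ vx

def bregman_ib(vx, px, k, beta, divergence=half_sq_euclidean, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    pt_x = rng.dirichlet(np.ones(k), size=len(vx))   # random soft assignment p(t|x)
    for _ in range(n_iter):
        pt, vt = prototypes_and_marginal(pt_x, px, vx)
        # Boltzmann form: p(t|x) proportional to p(t) exp(-beta * d(x, t)).
        logits = np.log(pt)[None, :] - beta * divergence(vx, vt)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        pt_x = np.exp(logits)
        pt_x /= pt_x.sum(axis=1, keepdims=True)      # divide by Z(x, beta)
    pt, vt = prototypes_and_marginal(pt_x, px, vx)
    return pt_x, vt, pt

# Illustrative data: two loose groups of 2-D points with uniform p(x).
vx = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
px = np.full(4, 0.25)
pt_x, vt, pt = bregman_ib(vx, px, k=2, beta=5.0)
```

Passing a different `divergence` function (for example a pairwise Kullback-Leibler divergence on probability vectors) leaves the prototype update unchanged, since it is always the weighted mean of the inputs.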

Page 10:

Special Cases

• Information Bottleneck
  Bregman function: $f(x) = x \log(x) - x$
  Domain: simplex
  Divergence: Kullback-Leibler

• Soft K-means
  Bregman function: $f(x) = \tfrac{1}{2} x^2$
  Domain: $\mathbb{R}^n$
  Divergence: Euclidean distance [Still, Bialek, Bottou, NIPS 2003]
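
Both special cases follow by substituting the Bregman function into the definition of $B_f$. The short derivation below applies $f$ coordinate-wise and sums over coordinates, which is an assumption about how the scalar formulas on the slide extend to vectors.

```latex
\begin{align*}
f(x) = \sum_i \big( x_i \log x_i - x_i \big):\quad
B_f(v \,\|\, u)
  &= \sum_i \big( v_i \log v_i - v_i \big)
   - \sum_i \big( u_i \log u_i - u_i \big)
   - \sum_i \log u_i \,(v_i - u_i) \\
  &= \sum_i v_i \log \frac{v_i}{u_i} - \sum_i v_i + \sum_i u_i
   \;=\; D_{\mathrm{KL}}[\, v \,\|\, u \,]
   \quad \text{when } \textstyle\sum_i v_i = \sum_i u_i = 1, \\[4pt]
f(x) = \tfrac{1}{2} \|x\|^2:\quad
B_f(v \,\|\, u)
  &= \tfrac{1}{2}\|v\|^2 - \tfrac{1}{2}\|u\|^2 - u \cdot (v - u)
   \;=\; \tfrac{1}{2} \|v - u\|^2 .
\end{align*}
```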

Page 11:

Bregman IB

[Diagram labels: Information Bottleneck; Bregman Clustering; Rate-Distortion; Exponential Family]

Page 12:

Exponential Family

• Expectation parameters:
• Examples (single dimension): Normal, Poisson

Page 13:

Exponential Family and Bregman Divergences

• Expectation parameters:
• Properties:

Page 14:

Illustration

Page 15:

Exponential Family and Bregman Divergences

• Expectation parameters:
• Properties:

Page 16:

Back to Distributional Clustering

• Distortion:
• Data vectors and prototypes: expectation parameters
• Question: for which exponential-family distribution do we have [formula missing]?
  Answer: Poisson
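
The formula in the question is missing, but the answer can be checked numerically under one natural reading: that the Bregman divergence of $f(x) = x \log x - x$ plays the role of the negative log-likelihood. The sketch below (helper names are hypothetical) shows that the Poisson log-probability and that divergence differ only by a term that does not depend on the mean $\mu$.

```python
import math

def poisson_log_pmf(x, mu):
    """log of the Poisson probability of count x with mean mu."""
    return x * math.log(mu) - mu - math.lgamma(x + 1)

def gen_kl(x, mu):
    """B_f(x || mu) for f(x) = x log x - x (requires x > 0)."""
    return x * math.log(x / mu) - x + mu

x = 7
# log p(x | mu) + B_f(x || mu) should be the same for every mu,
# i.e. p(x | mu) is proportional to exp(-B_f(x || mu)) as a function of mu.
residuals = [poisson_log_pmf(x, mu) + gen_kl(x, mu) for mu in (0.5, 3.0, 7.0, 20.0)]
```

Since the residual is constant in `mu`, maximising the Poisson likelihood over the mean is the same as minimising this divergence.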

Page 17:

Illustration

[Figure: the sequence "a a b a a a b a a a"; values .8 and .2 over symbols a and b; values 60 and 40 over symbols a and b; axis label "Pr"; panel labels "Product of Poisson Distributions" and "Multinomial Distribution".]
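
The figure appears to contrast a multinomial view of symbol counts with a product of Poisson distributions. A standard fact connecting the two is that independent Poisson counts factor into a Poisson over the total count times a multinomial over the split. The sketch below checks this numerically; the rates are hypothetical values chosen to match the 8-to-2 symbol counts and are not taken from the slide.

```python
import math

def poisson_pmf(n, lam):
    return math.exp(n * math.log(lam) - lam - math.lgamma(n + 1))

def multinomial_pmf(counts, probs):
    total = sum(counts)
    log_p = math.lgamma(total + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)

counts = (8, 2)        # occurrences of a and b in "a a b a a a b a a a"
rates = (8.0, 2.0)     # hypothetical Poisson rates for a and b
total_rate = sum(rates)
probs = tuple(r / total_rate for r in rates)   # normalised rates: (0.8, 0.2)

# Product of independent Poissons ...
product_of_poissons = math.prod(poisson_pmf(c, r) for c, r in zip(counts, rates))
# ... equals Poisson(total count) times Multinomial(counts | total, probs).
factorised = poisson_pmf(sum(counts), total_rate) * multinomial_pmf(counts, probs)
```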

Page 18:

Back to Distributional Clustering

• Information Bottleneck: distributional clustering of Poisson distributions
• (Soft) k-means: (soft) clustering of Normal distributions

Page 19:

Maximum Likelihood Perspective

• Distortion
• Input: observations
• Output: parameters of the distribution
• IB functional: EM [Elidan & Friedman, before]

Page 20:

Back to Self-Consistent Equations

• Posterior:
• Partition function:
  Weighted β-norm of the likelihood
• β → ∞: the most likely cluster governs
• β → 0: clusters collapse into a single prototype
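
The two limits can be illustrated with a small numerical sketch. It assumes the missing symbol is the trade-off parameter β and that the posterior has the form p(t|x) ∝ p(t) L_t^β, where L_t is the likelihood of the observation under cluster t, so that the partition function is the weighted sum Σ_t p(t) L_t^β. The prior and likelihood values are made up for illustration.

```python
import numpy as np

def posterior(prior, likelihood, beta):
    """Return p(t|x) proportional to p(t) * L_t**beta, and the normaliser Z(x, beta)."""
    w = prior * likelihood ** beta
    z = w.sum()
    return w / z, z

prior = np.array([0.5, 0.3, 0.2])        # p(t), illustrative values
likelihood = np.array([0.2, 0.9, 0.4])   # L_t for one fixed observation, illustrative

# beta -> 0: the posterior ignores the data and equals the prior for every x,
# so all prototypes become the same average of the inputs.
p_small, _ = posterior(prior, likelihood, beta=1e-6)

# beta -> infinity: the cluster with the largest likelihood takes all the mass.
p_large, _ = posterior(prior, likelihood, beta=200.0)
```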

Page 21:

Summary

• Bregman Information Bottleneck
  Clustering/compression for many representations and divergences
• Statistical Interpretation
  Clustering of distributions from the exponential family
  EM-like formulation
• Current Work:
  Algorithms
  Characterize distortion measures which also yield Boltzmann distributions
  General distortion measures

