Mixed Membership Stochastic Blockmodels
Journal of Machine Learning Research, 2008

by E.M. Airoldi, D.M. Blei, S.E. Fienberg, E.P. Xing
as interpreted by Ted Westling

STAT 572 Final Talk
May 8, 2014

Overview

1. Notation and motivation

2. The Mixed Membership Stochastic Blockmodel

3. Simulations

4. Applications

5. Conclusions

Network theory: notation

N nodes or individuals.

We observe relations/interactions $Y(i, j)$ on pairs of individuals.

Here we assume $Y(i, j) \in \{0, 1\}$ and $Y(i, i) = 0$, but do not assume $Y(i, j) = Y(j, i)$ (we deal with directed networks).
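
As a small illustration of this notation, the sketch below stores a directed binary network as a nested list with a zero diagonal. The helper name `adjacency_matrix` and the three-node edge list are made up for the example, not taken from the talk.

```python
def adjacency_matrix(n_nodes, edges):
    """Build a directed 0/1 adjacency matrix Y from (sender, receiver) pairs."""
    Y = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j in edges:
        if i == j:
            # The notation above fixes Y(i, i) = 0, so self-loops are rejected.
            raise ValueError("self-loops are excluded: Y(i, i) = 0")
        Y[i][j] = 1
    return Y

# Hypothetical three-node network: 0 -> 1, 1 -> 2, 2 -> 1.
Y = adjacency_matrix(3, [(0, 1), (1, 2), (2, 1)])
# Y[0][1] is 1 while Y[1][0] is 0, so the matrix is not symmetric.
```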

Scientific Motivation for Blockmodels

General blockmodels:

Hypothesis: the nodes can be grouped into non-overlapping blocks such that the probability of observing an edge from $i$ to $j$ is determined by latent block indicators $\pi_i$, $\pi_j$ and the block relation $\pi_i^\top B \pi_j$.
Goal: recover the blocks and the block relationships.

MMSB:

Hypothesis: nodes exhibit mixed membership over latent blocks such that ... (same as above, except $\pi_i$, $\pi_j$ are now distributions).
Goal: recover the mixed memberships and the block relationships.
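
The difference between the two hypotheses can be seen by evaluating $\pi_i^\top B \pi_j$ for an indicator vector and for a proper distribution. The sketch below does this in plain Python; the function name `edge_probability` and the 2 x 2 matrix `B` are illustrative assumptions, not values from the talk.

```python
def edge_probability(pi_i, pi_j, B):
    """Return pi_i^T B pi_j, the probability of an edge from i to j."""
    K = len(B)
    return sum(pi_i[g] * B[g][h] * pi_j[h] for g in range(K) for h in range(K))

# Hypothetical block relation matrix: dense within blocks, sparse between.
B = [[0.8, 0.1],
     [0.1, 0.8]]

# General blockmodel: indicator vectors simply pick out one entry, here B[0][1].
hard = edge_probability([1, 0], [0, 1], B)

# MMSB: node i splits its membership evenly, so the result averages
# B[0][1] and B[1][1] with weights 0.5 each.
mixed = edge_probability([0.5, 0.5], [0, 1], B)
```

With indicator vectors the formula reduces to a single entry of $B$; with mixed memberships it becomes a weighted average over entries of $B$.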

Two contrasting examples: monks and a rural Indian village.

Example 1: Crisis in a Cloister

Sampson (1968) collected a directed social network of 18 monks.

Qualitative analysis indicated three clusters of monks: Young Turks, Loyal Opposition, and Outcasts.

Figure: Monk adjacency matrices, ordered randomly (left) and using the a priori clusters (right).

Example 1: Crisis in a Cloister

Is this qualitative hypothesis supported by the data?

What are the relationships between the blocks?

How strongly defined are the blocks? (MMSB can answer this, while a regular blockmodel cannot.)

Example 2: Indian village network

Researchers collected individual- and household-level demographic and network data for 75 villages near Bangalore, India.

I focused on one village of 114 households. Constructed network from household-level visit survey. No ground truth.

Figure: Indian village adjacency matrix, arbitrarily ordered.

Example 2: Indian village network

Do the data exhibit block structure?

Do the estimated blocks correspond to anything in the data?

How strongly defined are the blocks?

Are there particularly influential households?

Generative MMSB

1. Fix the number of latent blocks $K$.
2. For each node $i = 1, \dots, N$:
   1. Draw a distribution $\pi_i$ over the latent blocks, $\pi_i \sim \mathrm{Dirichlet}(\alpha)$.
3. For each ordered pair of nodes $(i, j)$:
   1. Assume $i$ and $j$ are in particular blocks for the interaction $i \to j$, indicated by $z_{i \to j}$, $z_{i \leftarrow j}$, where
      $z_{i \to j} \sim \mathrm{Categorical}(\pi_i)$,
      $z_{i \leftarrow j} \sim \mathrm{Categorical}(\pi_j)$.
   2. Draw the binary relation $Y(i, j) \sim \mathrm{Bernoulli}(b)$ where $b = z_{i \to j}^\top B z_{i \leftarrow j}$. This is $B(g, h)$ when $z_{i \to j} = e_g$ and $z_{i \leftarrow j} = e_h$.
4. Choose $K$ by estimating the model for $K = 1, 2, 3, \dots$ and picking the model with the highest approximate BIC.
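
Steps 1 to 3 above can be sketched as a sampler using only the Python standard library. The function names (`sample_dirichlet`, `sample_categorical`, `sample_mmsb`) and the example values of `alpha` and `B` are assumptions for illustration; the Dirichlet draw uses the usual construction from normalized Gamma variates. Step 4 (choosing K) is not covered.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw a block index g, which stands for the indicator vector e_g."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def sample_mmsb(N, alpha, B, seed=0):
    """Simulate memberships pi and a directed network Y from the generative process."""
    rng = random.Random(seed)
    # Step 2: one membership distribution per node.
    pi = [sample_dirichlet(alpha, rng) for _ in range(N)]
    Y = [[0] * N for _ in range(N)]
    # Step 3: fresh block indicators for every ordered pair (i, j), i != j.
    for i in range(N):
        for j in range(N):
            if i == j:
                continue  # Y(i, i) = 0 by convention
            g = sample_categorical(pi[i], rng)  # z_{i -> j} = e_g
            h = sample_categorical(pi[j], rng)  # z_{i <- j} = e_h
            Y[i][j] = 1 if rng.random() < B[g][h] else 0  # Bernoulli(B(g, h))
    return pi, Y

# Hypothetical setting with K = 3 blocks.
B = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]
pi, Y = sample_mmsb(N=6, alpha=[0.1, 0.1, 0.1], B=B, seed=1)
```

Note that a node draws a new block indicator for each interaction, so the same node can act as a member of different blocks toward different partners.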

MMSB vs. Previous methods

1. Central advancement over previous methods: each node $i$ is given a distribution $\pi_i$ over blocks rather than being a member of exactly one block.

2. Allows us to determine how strong someone's allegiance to each block is.

3. Learn about the block-level relationships through $B$.

4. Limitations: unlike previous methods, requires fixing the number of blocks $K$. Also does not incorporate other possible network mechanisms, e.g. reciprocity or triangle-closing.

Model Estimation: Overview

Strategy: treat $\{\pi, Z_{\to}, Z_{\leftarrow}\} \equiv \theta$ as random latent variables and obtain their posterior distribution. Treat $\{\alpha, B\} \equiv \beta$ as fixed parameters to estimate via Empirical Bayes.

Cannot use the EM algorithm, because there is no closed form for $p(\theta \mid Y, \beta)$. Sampling from the exact posterior does not scale well, so instead use Variational Bayes.

Variational Bayes

Main idea: write down a simple parametric approximation $q(\pi, Z \mid \Delta)$ for the posterior distribution that depends on free variational parameters $\Delta$.

Minimize the KL divergence between $q$ and the true posterior in terms of $\Delta$, which is equivalent to maximizing the evidence lower bound, or ELBO:

$$L(\Delta \mid Y, \alpha, B) = E_q[\log p(\pi, Z, Y \mid \alpha, B)] - E_q[\log q(\pi, Z \mid \Delta)].$$

Empirical Bayes hyperparameter estimation

We said we would update $\alpha$ and $B$ with Empirical Bayes.

This usually means maximizing $p(Y \mid \alpha, B)$, which we can't get in closed form.

However, $L$ is a lower bound:

$$\log p(Y \mid \alpha, B) = K(p, q) + L(\Delta \mid Y, \alpha, B),$$

where $K(p, q) \geq 0$ is the KL divergence between $q$ and the true posterior, so we maximize $L$ instead, hoping that $K(p, q)$ is relatively constant in $\alpha, B$.

Really, this is approximate Empirical Bayes: a different sort of approximation than variational inference.
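
The decomposition of the log evidence into a KL term plus the ELBO can be checked numerically on a toy model where everything is available in closed form. The sketch below uses a single discrete latent variable with three states instead of the MMSB latent variables; the numbers in `prior`, `lik`, and `q` are arbitrary assumptions chosen only for the check.

```python
import math

# Toy model (hypothetical numbers): latent z in {0, 1, 2}, one observation y.
prior = [0.5, 0.3, 0.2]   # p(z)
lik = [0.9, 0.4, 0.1]     # p(y | z) for the observed y
q = [0.6, 0.3, 0.1]       # an arbitrary variational distribution q(z)

# Exact evidence and posterior are tractable here, unlike in the MMSB.
evidence = sum(p * l for p, l in zip(prior, lik))
posterior = [p * l / evidence for p, l in zip(prior, lik)]

# ELBO = E_q[log p(z, y)] - E_q[log q(z)]
elbo = sum(qz * (math.log(pz * lz) - math.log(qz))
           for qz, pz, lz in zip(q, prior, lik))

# KL divergence from q to the exact posterior p(z | y)
kl = sum(qz * math.log(qz / post) for qz, post in zip(q, posterior))

# log p(y) = KL + ELBO, and KL >= 0 makes the ELBO a lower bound.
gap = math.log(evidence) - (kl + elbo)
```

Here `gap` is zero up to floating-point rounding for any valid `q`, and `elbo` never exceeds `math.log(evidence)`.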

“Nested” algorithm for the MMSB

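1. Initialize $B^{(0)}, \alpha^{(0)}, \gamma^{(0)}_{1:N}, \Phi^{(0)}_{\to}, \Phi^{(0)}_{\leftarrow}$.

2. E-step:
   (a) For each $i, j$:
       (i) Update $\phi_{i \to j}$, $\phi_{i \leftarrow j}$.
       (ii) Update $\gamma_i$, $\gamma_j$.
       (iii) Update $B$.
   (b) Repeat (a) until convergence.

3. M-step: Update $\alpha$.

4. Repeat steps 2 and 3 until convergence.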
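
The slide does not state the update equations, so the following is only a rough sketch under assumptions. It assumes the usual mean-field coordinate-ascent updates for this model: each $\phi$ is proportional to $\exp(E_q[\log \pi])$ times the expected Bernoulli likelihood, $\gamma_i$ equals $\alpha$ plus the sum of node $i$'s sender and receiver $\phi$ vectors, and $B(g, h)$ is a $\phi$-weighted edge frequency. It also simplifies the schedule: $\gamma$ and $B$ are updated once per sweep over all pairs rather than inside the pair loop, $\alpha$ is held fixed (no M-step), and a fixed number of sweeps replaces the convergence checks. All function names and the initial values are hypothetical.

```python
import math
import random

def digamma(x):
    """Digamma for x > 0: shift x upward by recurrence, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def fit_mmsb(Y, K, alpha=0.1, n_sweeps=20, seed=0):
    """Simplified variational sweeps with alpha held fixed (assumed updates, see text)."""
    rng = random.Random(seed)
    N = len(Y)
    pairs = [(i, j) for i in range(N) for j in range(N) if i != j]
    gamma = [[alpha + rng.random() for _ in range(K)] for _ in range(N)]
    B = [[0.8 if g == h else 0.1 for h in range(K)] for g in range(K)]
    send = {p: [1.0 / K] * K for p in pairs}  # phi_{i -> j}
    recv = {p: [1.0 / K] * K for p in pairs}  # phi_{i <- j}
    for _ in range(n_sweeps):
        # E_q[log pi_{i,k}] under the Dirichlet(gamma_i) factor.
        elog = [[digamma(g) - digamma(sum(row)) for g in row] for row in gamma]
        for (i, j) in pairs:
            # Log-likelihood of the observed Y(i, j) under each block pair (g, h).
            ll = [[math.log(B[g][h]) if Y[i][j] else math.log(1.0 - B[g][h])
                   for h in range(K)] for g in range(K)]
            send[i, j] = normalize([
                math.exp(elog[i][g] + sum(recv[i, j][h] * ll[g][h] for h in range(K)))
                for g in range(K)])
            recv[i, j] = normalize([
                math.exp(elog[j][h] + sum(send[i, j][g] * ll[g][h] for g in range(K)))
                for h in range(K)])
        # gamma_i: prior plus expected counts of i's sender and receiver roles.
        for i in range(N):
            for k in range(K):
                gamma[i][k] = (alpha
                               + sum(send[i, j][k] for j in range(N) if j != i)
                               + sum(recv[j, i][k] for j in range(N) if j != i))
        # B(g, h): phi-weighted edge frequency, clipped so the logs stay finite.
        for g in range(K):
            for h in range(K):
                num = sum(Y[i][j] * send[i, j][g] * recv[i, j][h] for (i, j) in pairs)
                den = sum(send[i, j][g] * recv[i, j][h] for (i, j) in pairs)
                B[g][h] = min(max(num / den, 1e-6), 1.0 - 1e-6)
    return gamma, B, send, recv
```

Normalizing each row of the returned `gamma` gives posterior expected memberships. Each sweep visits all N(N - 1) ordered pairs, so the cost per sweep grows quadratically in N.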

Model Simulations

Simulated data from the assumed model under various parameter values.

K = 3, 6, 9.
N = 20, 50, 80.
α = 0.025 · 1, 0.1 · 1, 0.25 · 1 (where 1 is the vector of ones).
B with .8 on the diagonal and .1 off-diagonal (“Diagonal”), or with Uniform(.5, 1) on the diagonal and Uniform(.2, .5) off-diagonal (“Diffuse”).

For each combination of parameter values, ran five simulations.

For each simulation, initialized at the true parameter values (“Oracle”) and at uninformed values (“Naive”).
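
The design above can be enumerated in a few lines to see how many fits it implies. This is a sketch only: the helper `make_B` and the variable names are hypothetical, and it builds the settings without running any model.

```python
import itertools
import random

K_values = [3, 6, 9]
N_values = [20, 50, 80]
alpha_values = [0.025, 0.1, 0.25]
B_types = ["Diagonal", "Diffuse"]

def make_B(K, kind, rng):
    """Build a K x K block matrix following the two designs described above."""
    if kind == "Diagonal":
        return [[0.8 if g == h else 0.1 for h in range(K)] for g in range(K)]
    # "Diffuse": Uniform(.5, 1) on the diagonal, Uniform(.2, .5) off it.
    return [[rng.uniform(0.5, 1.0) if g == h else rng.uniform(0.2, 0.5)
             for h in range(K)] for g in range(K)]

settings = list(itertools.product(K_values, N_values, alpha_values, B_types))
n_datasets = len(settings) * 5   # five simulations per combination
n_fits = n_datasets * 2          # Oracle and Naive initialization for each

rng = random.Random(0)
B_diffuse = make_B(3, "Diffuse", rng)
```

This gives 54 parameter combinations, 270 simulated networks, and 540 model fits.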

Simulations: run time

Figure: Run time (minutes) against N (number of nodes), in panels by alpha (0.025, 0.1, 0.25) and K (3, 6, 9). Legend: B (Diagonal, Diffuse) and Initialization (Naive, Oracle).

Simulations: MSE of γ

Figure: MSE of block membership vectors against N (number of nodes), in panels by alpha (0.025, 0.1, 0.25) and K (3, 6, 9). Legend: B (Diagonal, Diffuse) and Initialization (Naive, Oracle).

Simulations: conclusions

Runtime is of order $N^2$.

Initialization matters, especially when α is small.

Better performance when there is less mixing (small α, B close to diagonal).

Crisis in a Cloister: posterior expected memberships

Figure: Left: Posterior expected node memberships for monks 1 to 18, with color indicating the group membership identified by Sampson. Legend (True Faction): Loyal Opposition, Outcasts, Waverers, Young Turks.

Crisis in a Cloister: adjacency matrices

Figure: Left: Original adjacency matrix, ordered according to posterior group memberships. Right: smoothed adjacency matrix using posterior Z.

Indian Village: adjacency matrices

Figure: Left: Original adjacency matrix, ordered according to posterior group memberships. Right: smoothed adjacency matrix using posterior Z.

Conclusions

MMSB is a useful extension to blockmodels when there is a ground truth or the goal is prediction.

However, when ground truth is lacking, the model lacks interpretability.

Could be improved by incorporating other network components.

Variational algorithm also makes interpretation difficult, since the posterior distribution and the Empirical Bayes estimates are both approximations.

Thank you!
