Communities, Spectral Clustering, and Random Walks

David Bindel

Department of Computer Science
Cornell University

15 Nov 2011

Introduction
Block models
Optimization
Random walks
Mining subspaces
Ritz vectors
Examples
Conclusions

Basic setting

Informal: Community = “unusually tight” node group?

Formal: Graph $G = (V, E)$, seek subgraph $G' = (V', E')$:
1. By model fitting
2. By optimization of some metric
3. By random walks on $G$

Unified by linear algebra!

Plan for today

- Three routes to an invariant subspace
- How to mine a subspace for information
- From eigenvectors to Ritz vectors
- Some examples

Notation

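Adjacency matrix $A \in \{0,1\}^{n \times n}$ is

$$A_{ij} = \begin{cases} 1, & (i,j) \in E \\ 0, & \text{otherwise} \end{cases}$$

Also define

- $e$ = vector of $n$ ones
- $d = Ae$ = degree vector
- $D = \operatorname{diag}(d)$
- $L = D - A$ = graph Laplacian
- $B = A - \frac{d d^T}{m}$ = modularity matrix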

Spectrum for a random graph

[Figure: sparsity pattern of a random graph adjacency matrix, 100 × 100, nz = 2015]

Spectrum of a $G_{n,p}$ graph:
- One large eigenvalue $\approx np$
- Other eigs between $\approx \pm\sqrt{np(1-p)/4}$

Adjacency matrix = $p\,ee^T$ + “noise”

Spectrum for a $G_{100,0.2}$ sample

[Figure: eigenvalue versus index]

Perron vector for a $G_{100,0.2}$ sample

[Figure: Perron vector entries versus index]

Block model approach

[Figure: sparsity pattern of a block model adjacency matrix, nz = 3996]

Composite model: $A \approx S \operatorname{diag}(\beta) S^T$, $S \in \{0,1\}^{n \times c}$

- Motivation: possibly-overlapping random graphs
- Columns of $S$ are one basis for range space
- Want to go from some general basis back to $S$

Spectrum for a block model sample

[Figure: eigenvalue versus index]

Dominant vectors for a block model example

[Figure: entries of the dominant vectors versus index]

Same space, different basis

[Figure: entries of the alternate basis vectors versus index]

Questions

- What about different matrices (e.g. $L$)?
- What about more interesting graph structures?
- How do we find the “right” subspace basis?

Measurement by quadratic forms

Indicate $V' \subseteq V$ by $s \in \{0,1\}^n$. Measure subgraph:

- $s^T A s = |E'|$ = internal edges
- $s^T D s$ = edges incident on subgraph
- $s^T L s$ = edges between $V'$ and $\bar{V}'$
- $s^T B s$ = “surprising” internal edges
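A minimal NumPy sketch of these quadratic forms. The 5-node graph is made up for illustration, and $m = e^T d$ (the sum of degrees) is an assumption about the normalization of $B$. With a symmetric $A$, $s^T A s$ counts each internal edge once in each direction.

```python
import numpy as np

# Hypothetical undirected graph: triangle {0, 1, 2} plus the path 2-3-4.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

e = np.ones(n)
d = A @ e                    # degree vector
D = np.diag(d)
L = D - A                    # graph Laplacian
m = d.sum()                  # assumption: m = e^T d, the sum of degrees
B = A - np.outer(d, d) / m   # modularity matrix

s = np.array([1.0, 1.0, 1.0, 0.0, 0.0])   # indicator for V' = {0, 1, 2}

internal = s @ A @ s   # internal edges, each counted in both directions
incident = s @ D @ s   # sum of degrees over V'
cut = s @ L @ s        # edges between V' and its complement
surprise = s @ B @ s   # internal count minus its value under the null model
```

Here the triangle has three internal edges, one cut edge (2-3), and a positive "surprise" score.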

Graph bisection

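Idea: Find $s \in \{0,1\}^n$ such that $e^T s = n/2$ to
- minimize $s^T L s$ (min cut)
- maximize $s^T B s$ (max modularity)

Equivalently: Find $\bar{s} = 2s - e \in \{\pm 1\}^n$ such that $e^T \bar{s} = 0$ to
- minimize $\bar{s}^T L \bar{s} = 4 s^T L s$ or
- maximize $\bar{s}^T B \bar{s} = 4 s^T B s$

(The factor of 4 follows from $Le = 0$ and $Be = 0$; it does not change the optimizer.)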

Oops — NP hard!

Relax!

[Figure: vector entries between −1 and 1 versus index, 1 to 20]

Hard: $\min \bar{s}^T L \bar{s}$ s.t. $e^T \bar{s} = 0$, $\bar{s} \in \{\pm 1\}^n$.
Easy: $\min v^T L v$ s.t. $e^T v = 0$, $v \in \mathbb{R}^n$, $\|v\|^2 = n$.
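A sketch of the relaxed problem, assuming a connected graph: the minimizer is the eigenvector for the second-smallest eigenvalue of $L$ (the Fiedler vector), scaled so that $\|v\|^2 = n$. Rounding by sign is one simple, illustrative way to get back to $\{\pm 1\}^n$; it does not enforce $e^T \bar{s} = 0$ in general. The helper names and the two-clique test graph are made up for this example.

```python
import numpy as np

def two_cliques(k):
    """Adjacency matrix of two k-cliques joined by a single edge."""
    n = 2 * k
    A = np.zeros((n, n))
    A[:k, :k] = 1.0
    A[k:, k:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[k - 1, k] = A[k, k - 1] = 1.0
    return A

def spectral_bisection(A):
    """Solve the relaxed min-cut problem, then round the solution by sign."""
    d = A.sum(axis=1)
    L = np.diag(d) - A
    lam, V = np.linalg.eigh(L)        # eigenvalues in ascending order
    v = V[:, 1] * np.sqrt(len(d))     # Fiedler vector with ||v||^2 = n
    s_bar = np.where(v >= 0, 1.0, -1.0)
    return v, s_bar
```

On `two_cliques(4)` the sign pattern of `v` separates the two cliques, so the rounded vector cuts only the bridging edge.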

Rayleigh quotients

- $\frac{s^T A s}{s^T s}$ = mean internal degree in subgraph
- $\frac{s^T L s}{s^T s}$ = edges cut between $V'$ and $\bar{V}'$
- $\frac{s^T A s}{s^T D s}$ = fraction of incident edges internal to $V'$
- $\frac{s^T L s}{s^T D s}$ = fraction of incident edges cut
- $\frac{s^T B s}{s^T s}$ = mean “surprising” internal degree in subgraph
- $\frac{s^T B s}{s^T D s}$ = mean fraction of internal degree that is surprising

Rayleigh quotients and eigenvalues

Basic connection ($M$ spd):

$$\frac{x^T K x}{x^T M x} \text{ stationary at } x \iff Kx = \lambda Mx$$

Easy despite lack of convexity.
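A small numerical check of this connection, taking $K = A$ and $M = D$ as one illustrative choice (it assumes no isolated nodes, so $D$ is spd). The gradient of the Rayleigh quotient is $2(Kx - \rho M x)/(x^T M x)$ with $\rho$ the quotient itself, so it vanishes exactly at generalized eigenvectors. The function names and the graph are hypothetical.

```python
import numpy as np

def rayleigh_quotient(K, M, x):
    return (x @ K @ x) / (x @ M @ x)

def rayleigh_gradient(K, M, x):
    """Gradient of the Rayleigh quotient with respect to x."""
    rho = rayleigh_quotient(K, M, x)
    return 2.0 * (K @ x - rho * (M @ x)) / (x @ M @ x)

def generalized_eig_diag(K, m_diag):
    """Solve K x = lambda M x for symmetric K and M = diag(m_diag) > 0."""
    r = 1.0 / np.sqrt(m_diag)
    # Symmetric reduction: M^{-1/2} K M^{-1/2} y = lambda y, x = M^{-1/2} y.
    lam, Y = np.linalg.eigh(r[:, None] * K * r[None, :])
    X = r[:, None] * Y    # columns are M-orthonormal eigenvectors
    return lam, X

# Hypothetical graph: triangle {0, 1, 2} plus the path 2-3-4.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
d = A.sum(axis=1)
K, M = A, np.diag(d)
lam, X = generalized_eig_diag(K, d)
```

Each column of `X` gives a zero gradient, and the quotient there equals the matching entry of `lam`; a generic vector does not.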

Limits of Rayleigh quotients

But small variations kill us:

$$\max_{x \neq 0} \frac{x^T A x}{\|x\|_2^2} = \lambda_{\max}(A), \quad \text{but} \quad \max_{x \neq 0} \frac{x^T A x}{\|x\|_1^2} = 1 - \omega^{-1}$$

where $\omega$ is the max clique size (Motzkin-Straus).

Rayleigh quotients and eigenproblems

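Decompose:

$$W^T M W = I \quad \text{and} \quad W^T K W = \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n).$$

For any $x \neq 0$,

$$\frac{x^T K x}{x^T M x} = \sum_{j=1}^n \lambda_j z_j^2, \quad \text{where } z = \frac{W^{-1} x}{\|W^{-1} x\|_2}.$$

So

$$\frac{s^T K s}{s^T M s} \approx \lambda_{\max} \implies s \approx \sum_{\lambda_j \approx \lambda_{\max}} w_j z_j.$$

So look at invariant subspaces for extreme eigenvalues.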

The random walker

Basic idea: extract structure from random walk.

Old: start at seed and walk forward
- Day 1: I came up with a funny joke!
- Day 2: I tell everyone in my family
- Day 3: My mother tells a friend?

New: look at how quickly source is forgotten
- Day 1: David came up with a funny joke!
- Day 2: There’s a joke going around Cornell CS.
- Day 3: I read this bad joke on the web...

Lazy random walk with transition matrix $T = \frac{1}{2}(I + AD^{-1})$.

1. Start at $p_0$, take $k$ steps. Distribution:

$$p_k = T^k p_0 \quad (\to d/m \text{ as } k \to \infty)$$

2. End at $q_0$ after $k$ steps. Conditional distribution on start:

$$q_k \propto (T^T)^k q_0 \quad (\to e/n \text{ as } k \to \infty)$$

Notes:
- If the graph is undirected, $T^T = D^{-1} T D$.
- If the graph is also regular, $T^T = T$.
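A sketch of both walks on a small connected, undirected graph (made up for this example), checking the two limits and the similarity relation numerically. It assumes $m = e^T d$; the step count of 500 is an arbitrary "long enough" choice for this graph.

```python
import numpy as np

# Hypothetical graph: triangle {0, 1, 2} plus the path 2-3-4.
n = 5
A = np.zeros((n, n))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

d = A.sum(axis=0)
m = d.sum()                       # assumption: m = e^T d
T = 0.5 * (np.eye(n) + A / d)     # A / d scales column j by 1/d_j, i.e. A D^{-1}

k = 500
seed = np.zeros(n)
seed[0] = 1.0

# Forward walk: distribution after k steps when starting at node 0.
p_k = np.linalg.matrix_power(T, k) @ seed

# Backward view: distribution on the start node given the walk ends at node 0.
q_k = np.linalg.matrix_power(T.T, k) @ seed
q_k /= q_k.sum()
```

After many steps `p_k` is close to `d / m` and `q_k` is close to the uniform vector, so neither remembers the seed; the interesting structure is in short walks.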

Simon-Ando theory

Markov chain with loosely-coupled subchains:
- Rapid local mixing: after a few steps

$$p_k \approx \sum_{j=1}^c \alpha_{j,k} \, p^{(j)}_\infty$$

  where $p^{(j)}_\infty$ is a local equilibrium for the $j$th subchain
- Slow equilibration: $\alpha_{j,k} \to \alpha_{j,\infty}$.

Alternately, rapid local mixing looks like:

$$q_k \approx \sum_{j=1}^c \gamma_{j,k} \, s_j$$

where $s_j$ is an indicator for nodes in one subchain.

In chemistry: transitions among metastable states.

In network analysis: transitions among communities?

Spectral Simon-Ando picture

Exactly decoupled case ($c$ decoupled chains):
- Eigenvalue one has multiplicity $c$.
- Eigenvectors of $T$ are local equilibria.
- Eigenvectors of $T^T$ are indicators for chains.
- Rapid mixing $\implies$ large gap to $\lambda_{c+1}$.

Weakly coupled case:
- Cluster of $c$ eigenvalues near 1.
- Eigenvectors of $T$ are combinations of local equilibria.
- Eigenvectors of $T^T$ are combinations of indicators.
- Large gap between $\lambda_c$ and $\lambda_{c+1}$.
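A sketch of this picture for $c = 2$, using two 6-cliques as the chains (an illustrative choice) and the lazy walk $T = \frac{1}{2}(I + AD^{-1})$. The eigenvalues are computed from the symmetric matrix $\frac{1}{2}(I + D^{-1/2} A D^{-1/2})$, which is similar to $T$ and so has the same spectrum. All helper names are hypothetical.

```python
import numpy as np

def clique_pair(k, bridge):
    """Two k-cliques, optionally coupled by a single bridging edge."""
    n = 2 * k
    A = np.zeros((n, n))
    A[:k, :k] = 1.0
    A[k:, k:] = 1.0
    np.fill_diagonal(A, 0.0)
    if bridge:
        A[k - 1, k] = A[k, k - 1] = 1.0
    return A

def lazy_walk(A):
    d = A.sum(axis=0)
    return 0.5 * (np.eye(len(d)) + A / d), d

def walk_spectrum(A):
    """Eigenvalues of T in ascending order, via a similar symmetric matrix."""
    d = A.sum(axis=0)
    r = 1.0 / np.sqrt(d)
    S = 0.5 * (np.eye(len(d)) + r[:, None] * A * r[None, :])
    return np.linalg.eigvalsh(S)

k = 6
A_dec = clique_pair(k, bridge=False)
A_weak = clique_pair(k, bridge=True)

lam_dec = walk_spectrum(A_dec)     # eigenvalue one appears twice
lam_weak = walk_spectrum(A_weak)   # two eigenvalues near one, then a gap

T_dec, d_dec = lazy_walk(A_dec)
s1 = np.concatenate([np.ones(k), np.zeros(k)])   # indicator of the first chain
p1 = s1 * d_dec / (s1 @ d_dec)                   # local equilibrium of that chain
```

In the decoupled case `s1` is fixed by `T_dec.T` and `p1` is fixed by `T_dec`; adding the bridge moves the second eigenvalue slightly below one while the rest of the spectrum stays well separated.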

Summary so far

- Indicator vectors approximately in invariant subspaces
- Several possible motivations
- Several possible matrices (I like $T^T$)

But how do we go from the subspace to the indicators?

Indicators from subspaces: spectral clustering

$U$ spans a small subspace (e.g. an invariant subspace)

1. $S \approx UY$ indicates a partition. If $S_{ij} = 1$,

$$U(i,:) \approx (Y^{-1})(j,:).$$

   Idea: Treat rows of $U$ as latent coordinates. Cluster.

2. Suppose some indicator $s \approx Uy$. Then row $U(j,:)$
   - forms an acute angle with $y$ when $s_j = 1$
   - is almost normal to $y$ when $s_j = 0$.

Clustering? What if sets overlap?
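A sketch of the first idea on a non-overlapping example: take the $c$ dominant eigenvectors of $A$ as $U$ and group nodes whose rows of $U$ are close. Nearest-seed-row assignment is used here as a simple stand-in for a real clustering method such as k-means; the function names, the graph, and the choice of seed rows are all illustrative assumptions.

```python
import numpy as np

def dominant_subspace(A, c):
    """Orthonormal basis for the invariant subspace of the c largest eigenvalues."""
    lam, V = np.linalg.eigh(A)    # ascending order
    return V[:, -c:]

def cluster_rows(U, seeds):
    """Assign each row of U to the nearest seed row (stand-in for k-means)."""
    diffs = U[:, None, :] - U[seeds][None, :, :]
    dist = np.linalg.norm(diffs, axis=2)    # shape (n, number of seeds)
    return dist.argmin(axis=1)

# Two 6-cliques joined by one edge: a partition into two communities.
k = 6
A = np.zeros((2 * k, 2 * k))
A[:k, :k] = 1.0
A[k:, k:] = 1.0
np.fill_diagonal(A, 0.0)
A[k - 1, k] = A[k, k - 1] = 1.0

U = dominant_subspace(A, 2)
labels = cluster_rows(U, seeds=[0, 2 * k - 1])
```

Rows belonging to the same clique are nearly identical, so the labels reproduce the two cliques. This row-clustering view is exactly what breaks down once sets overlap.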

Clustering and overlap

Dominant eigenvectors for A:

Alternate basis for the space:

How do we get the latter basis?

Desiderata

Given a basis $U$, want to extract a vector $\tilde{s}$ s.t.
- $\tilde{s}$ lies close to the span of $U$
- $\tilde{s}$ is almost an indicator for a community

- Maybe nonnegative?
- Not too many ones?

Indicators from subspaces: LP version

To find $s \approx Uy$ for some $y$, $s_i = 1$:

minimize $\|\tilde{s}\|_1$ (proxy for sparsity of $\tilde{s}$)
s.t. $\tilde{s} = Uy$ ($\tilde{s}$ in the right space)
     $\tilde{s}_i \geq 1$ (“seed” constraint)
     $\tilde{s} \geq 0$ (componentwise nonnegativity)

Recovers smallest set containing node $i$ if
- $U = SY^{-1}$ exactly.
- Each set contains at least one element only in that set.

(Frequently works if there is not “too much” overlap.)

Got noise? Need thresholding!
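A sketch of this LP using `scipy.optimize.linprog`, written in terms of $y$ only: since $\tilde{s} = Uy \geq 0$, the objective $\|\tilde{s}\|_1$ equals $e^T U y$. The function name `lp_indicator` and the two overlapping sets are made up for illustration; the noise-free setting $U = SY^{-1}$ is assumed.

```python
import numpy as np
from scipy.optimize import linprog

def lp_indicator(U, seed):
    """Minimize ||U y||_1 subject to U y >= 0 and (U y)[seed] >= 1."""
    n, c = U.shape
    cost = U.sum(axis=0)          # e^T U y, equal to ||U y||_1 when U y >= 0
    A_ub = -U                     # -U y <= b_ub encodes U y >= -b_ub
    b_ub = np.zeros(n)
    b_ub[seed] = -1.0             # seed constraint: (U y)[seed] >= 1
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * c)
    if not res.success:
        raise RuntimeError(res.message)
    return U @ res.x

# Two overlapping sets on six nodes: {0, 1, 2, 3} and {3, 4, 5}.
S = np.array([[1, 0], [1, 0], [1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)
U, _ = np.linalg.qr(S)            # some other orthonormal basis for range(S)

s_from_0 = lp_indicator(U, seed=0)   # node 0 lies only in the first set
s_from_3 = lp_indicator(U, seed=3)   # node 3 lies in both sets
```

Seeding at node 0 recovers the first set; seeding at the shared node 3 recovers the smaller of the two sets that contain it.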

Indicators from subspaces: QP version

Alternate optimization (box-constrained quadratic program):

minimize $\frac{1}{2} \tilde{s}^T P \tilde{s} + \tau \|\tilde{s}\|_1$
s.t. $\tilde{s}_i \geq 1$
     $\tilde{s} \geq 0$

- Recover LP with $P = I - UU^T$ and $\tau \to 0$ (for $U^T U = I$).
- Can let $P$ be general semidefinite matrix (e.g. $P = L$)
- Size of $\tau$ controls sparsity (can automate choice)

Summary so far

Two pieces to spectral community detection:
- Pull out an invariant subspace
- Mine the subspace for community structure

Motivation: optimization or random walk dynamics.

But...
- What about when $n$ and $c$ are both large?
- What if there is no clear spectral gap?

Would like an alternative to invariant subspaces!

Eigenvectors to Ritz vectors

Variational characterization of eigenpairs:

$$\frac{x^T K x}{x^T M x} \text{ stationary at } x \iff Kx = \lambda Mx$$

Rayleigh-Ritz approximation: $x = Vy$

$$\frac{(Vy)^T K (Vy)}{(Vy)^T M (Vy)} \text{ stationary at } y \iff (V^T K V) y = \hat{\lambda} (V^T M V) y$$

Role of Ritz vectors

Usual approach to large-scale symmetric eigenproblems:

1. Generate a basis for a Krylov subspace

$$\mathcal{K}_k(A, x_0) = \operatorname{span}\{x_0, Ax_0, A^2 x_0, \ldots, A^{k-1} x_0\}$$

2. Use Rayleigh-Ritz on subspace
3. Re-start if subspace gets too large
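A sketch of steps 1 and 2 for a symmetric $A$ with $M = I$, leaving out restarting. The basis is built by repeated multiplication with Gram-Schmidt orthogonalization, which is a simplified stand-in for a production Lanczos code; the function names and the test graph are illustrative.

```python
import numpy as np

def krylov_basis(A, x0, k):
    """Orthonormal basis for span{x0, A x0, ..., A^(k-1) x0}."""
    n = len(x0)
    V = np.zeros((n, k))
    v = x0 / np.linalg.norm(x0)
    for j in range(k):
        V[:, j] = v
        w = A @ v
        for _ in range(2):    # two Gram-Schmidt passes for numerical stability
            w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        nrm = np.linalg.norm(w)
        if nrm < 1e-12:       # the Krylov space stopped growing early
            return V[:, :j + 1]
        v = w / nrm
    return V

def rayleigh_ritz(A, V):
    """Ritz values and vectors of symmetric A on the span of orthonormal V."""
    H = V.T @ A @ V
    theta, Y = np.linalg.eigh(0.5 * (H + H.T))
    return theta, V @ Y

# Two 5-cliques joined by one edge; start the subspace from node 0.
k_clique = 5
n = 2 * k_clique
A = np.zeros((n, n))
A[:k_clique, :k_clique] = 1.0
A[k_clique:, k_clique:] = 1.0
np.fill_diagonal(A, 0.0)
A[k_clique - 1, k_clique] = A[k_clique, k_clique - 1] = 1.0

x0 = np.zeros(n)
x0[0] = 1.0
V = krylov_basis(A, x0, 4)
theta, R = rayleigh_ritz(A, V)   # Ritz values (ascending) and Ritz vectors
```

The Ritz values always lie between the extreme eigenvalues of $A$, and the Ritz vectors stay in the Krylov space, so a 4-dimensional subspace already gives usable approximations without any converged eigenvector.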

“Know when to walk away”

You’ve got to know when to hold ’em
Know when to fold ’em
Know when to walk away
Know when to run

(with apologies to Kenny Rogers)

Ritz subspaces

Idea: Use unconverged subspace computations

Invariant subspace       Ritz subspace
Long random walks        Short random walks
Potentially expensive    Cheap to compute
May need large space     Small space okay

Current favorite method

1. Pick “seed” nodes $j_1, j_2, \ldots$
2. Take short random walks (length $k$) from each seed
3. Extract few Ritz vectors from $\operatorname{span}\{q_0, q_1, \ldots, q_{k-1}\}$.
4. Approx indicators in span of all Ritz vectors.
5. Possibly add more seeds and return to step 1.
6. Convert raw “score” vector to a $\{0,1\}$ indicator.

Wang test graph

[Figure: drawing of the Wang test graph, nodes labeled 1 to 30]

Spectrum for Wang test graph

[Figure: eigenvalue versus index]

Zachary Karate graph

[Figure: drawing of the Zachary Karate graph, nodes labeled 1 to 34]

Spectrum for Karate

[Figure: eigenvalue versus index]

Football graph

[Figure: drawing of the Football graph, nodes labeled 1 to 115]

Spectrum for Football

[Figure: eigenvalue versus index]

Dolphin graph

[Figure: drawing of the Dolphin graph, nodes labeled 1 to 62]

Spectrum for Dolphin

[Figure: eigenvalue versus index]

Non-overlapping synthetic benchmark (µ = 0.5)

[Figure: sparsity pattern of the benchmark adjacency matrix, 1000 × 1000, nz = 15746]

Spectrum for synthetic benchmark

[Figure: eigenvalue versus index]

Score vector

[Figure: score versus index]

Score vector for the two-node seed of 492 and 513 in the first LFR benchmark graph. Ten steps, three Ritz vectors.

Non-overlapping synthetic benchmark (µ = 0.6)

[Figure: sparsity pattern of the benchmark adjacency matrix, 1000 × 1000, nz = 15316]

Spectrum for synthetic benchmark

[Figure: eigenvalue versus index]

Score vector

[Figure: score versus index]

Score vector for the two-node seed of 492 and 513 in the first LFR benchmark graph. Ten steps, three Ritz vectors.

Overlapping synthetic benchmark (µ = 0.3)

- 1000 nodes
- 47 communities
- 500 nodes belong to two communities

Spectrum for synthetic benchmark

[Figure: eigenvalue versus index]

Score vector

[Figure: score versus index]

Score vector for the two-node seed of 521 and 892. The desired indicator is in red.

Score vector

[Figure: score versus index]

Score vector for the two-node seed of 521 and 892 + twelve reseeds. The desired indicator is in red.

Conclusions

Classic spectral methods use eigenvectors to find communities, but:

We don’t need to stop at partitioning!
- Overlap is okay
- Key is how we mine the subspace

We don’t need to stop at eigenvectors!
- Can also use Ritz vectors
- Computation is cheap: short random walks

