Communities, Spectral Clustering, and Random Walks David Bindel Department of Computer Science Cornell University 26 Sep 2011
Page 1: Communities, Spectral Clustering, and Random Walks (bindel/present/2011-09-scan.pdf)

Communities, Spectral Clustering, and Random Walks

David Bindel

Department of Computer Science
Cornell University

26 Sep 2011

Page 2:

[Figure: an example graph with numbered nodes 1-30]

Page 3:

Basic setting

Informal approach:
Community = unusually tightly connected nodes in a network

Formal version:
Given a graph G = (V, E), seek a subgraph G′ = (V′, E′):

1. Based on optimality properties (cut size, modularity, etc.)
2. Based on dynamics on G (random walks and variants)

Two approaches unified by linear algebra!

(For today, all graphs are undirected, most are unweighted.)

Page 4:

Unusually tightly connected?

What constitutes “unusual” connectivity of a subgraph?
- High internal connectivity?
- Low external connectivity?

Page 5:

Basic notation

The adjacency matrix A ∈ {0,1}^{n×n} for G is

A_ij = 1 if (i,j) ∈ E, and 0 otherwise.

Also define

e = vector of n ones
d = Ae = degree vector
D = diag(d)
L = D − A = graph Laplacian
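The notation above can be sketched in a few lines of NumPy. The graph here (a triangle plus a pendant node) is an assumed toy example, not from the slides:

```python
import numpy as np

# Small assumed example: a triangle {0,1,2} plus a pendant node 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

n = 4
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1          # undirected: symmetric adjacency

e = np.ones(n)                      # vector of n ones
d = A @ e                           # degree vector d = Ae
D = np.diag(d)                      # degree matrix
L = D - A                           # graph Laplacian

print(d)        # [2. 2. 3. 1.]
print(L @ e)    # the Laplacian annihilates the all-ones vector: [0. 0. 0. 0.]
```

Note that Le = 0 always holds, since each row of L sums to zero.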

Page 6:

Measuring subgraphs of G

Indicate V′ ⊆ V by s ∈ {0,1}^n. Many properties of the induced subgraph can be written via quadratic forms:

s^T A s = |E′| = number of (directed) edges in the subgraph
s^T D s = number of (directed) edges incident on the subgraph
s^T L s = s^T (D − A) s = edges between V′ and its complement

Example: e indicates all of V, and

m = e^T A e = e^T D e = number of (directed) edges
0 = e^T L e = edges between V and ∅
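As a quick numerical check of these identities (on an assumed toy graph, a triangle plus a pendant node, not from the slides):

```python
import numpy as np

# Assumed example graph: triangle {0,1,2} plus pendant node 3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1
D = np.diag(A.sum(axis=1))
L = D - A

s = np.array([1, 1, 1, 0])          # indicate V' = {0, 1, 2} (the triangle)

print(s @ A @ s)   # 6: directed edges inside V' (each triangle edge counted twice)
print(s @ D @ s)   # 7: directed edges incident on V'
print(s @ L @ s)   # 1: edges cut between V' and its complement, namely edge (2,3)
```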

Page 7:

Configuration model

Configuration model for a random graph G = (V, E):
- The degree vector d is specified.
- P{(i,j) ∈ E} = d_i d_j / m.

Self-loops are allowed.

The expected adjacency matrix, degree vector, degree matrix, and Laplacian are

Ā = d d^T / m
d̄ = Ā e = d
D̄ = D
L̄ = D̄ − Ā

Page 8:

Modularity

Define B = A − Ā = L̄ − L. Then

s^T B s = unexpected extra edges in subgraph
        = unexpected lack of cut edges

If s_1, ..., s_c indicate a partition into c sets,

Q := (1/2) Σ_{j=1}^{c} s_j^T B s_j = modularity of the partition
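A small sketch of B and Q, using an assumed example graph (two triangles joined by a bridge edge) and the slide's Q := (1/2) Σ s_j^T B s_j:

```python
import numpy as np

# Assumed example: two triangles {0,1,2} and {3,4,5} joined by bridge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
d = A.sum(axis=1)
m = d.sum()                         # m = e^T A e = 14 (directed edge count)
B = A - np.outer(d, d) / m          # B = A - expected adjacency

s1 = np.array([1, 1, 1, 0, 0, 0])   # first triangle
s2 = 1 - s1                         # second triangle
Q = 0.5 * (s1 @ B @ s1 + s2 @ B @ s2)
print(round(Q, 6))   # 2.5: more internal edges than the null model expects
```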

Page 9:

Bisection by optimization

Idea: Find s ∈ {0,1}^n such that e^T s = n/2 and
- s^T L s is minimized (min cut), or
- s^T B s is maximized (max modularity)

Equivalently: Find s̃ ∈ {−1/2, 1/2}^n such that e^T s̃ = 0 and
- s̃^T L s̃ = s^T L s is minimized, or
- s̃^T B s̃ = s^T B s is maximized

(The shift s̃ = s − e/2 leaves the quadratic forms unchanged, since Le = 0 and Be = 0.)

Oops — NP hard!

Page 10:

Spectral bisection

Relaxation makes the problem easier:
Hard: minimize s^T L s s.t. e^T s = 0, s ∈ {−1/2, 1/2}^n.
Easy: minimize v^T L v s.t. e^T v = 0, v ∈ R^n, ‖v‖² = n/4.

Now v is an eigenvector for the second smallest eigenvalue of L.
Use the sign pattern of v to partition ⟹ spectral bisection.
The heuristic works well in practice (often with some refinement).

Same idea works for modularity.
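A minimal sketch of spectral bisection on an assumed example (two triangles joined by a bridge). The Fiedler vector is the eigenvector for the second-smallest eigenvalue of L, and is automatically orthogonal to e:

```python
import numpy as np

# Assumed example: two triangles joined by bridge edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

w, V = np.linalg.eigh(L)            # eigenvalues in ascending order
v = V[:, 1]                         # Fiedler vector
part = v > 0                        # sign pattern => bisection
print(part)  # the two triangles land on opposite sides (which side is which is arbitrary)
```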

Page 11:

Rayleigh quotients

Given matrices (K, M), the generalized Rayleigh quotient is

ρ_{K,M}(x) = (x^T K x) / (x^T M x).

It can represent interesting subgraph properties:

ρ_{A,I}(s) = mean internal degree in subgraph
ρ_{L,I}(s) = mean number of cut edges per node of V′
ρ_{A,D}(s) = fraction of incident edges internal to V′
ρ_{L,D}(s) = fraction of incident edges cut
ρ_{B,I}(s) = mean “surprising” internal degree in subgraph
ρ_{B,D}(s) = mean fraction of internal degree that is surprising
ρ_{B,L}(s) = fraction of edge cuts that are surprising

Page 12:

Rayleigh quotients and eigenvalues

Suppose M is positive definite. The basic connection:

ρ_{K,M} is stationary at x ⟺ Kx = ρ_{K,M}(x) Mx

Stationary points are (generalized) eigenvectors, and the stationary values are eigenvalues. These are reasonable to compute (even though the optimization is nonconvex!).
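The stationarity condition can be checked numerically with SciPy's symmetric-definite eigensolver. The pair K = L, M = D on a small path graph is an assumed example:

```python
import numpy as np
from scipy.linalg import eigh

# Assumed example: path graph on 5 nodes; K = L, M = D (positive definite).
n = 5
A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
D = np.diag(A.sum(axis=1))
L = D - A

lam, W = eigh(L, D)                 # solves L x = lambda D x
x = W[:, 1]                         # a generalized eigenvector
rho = (x @ L @ x) / (x @ D @ x)     # generalized Rayleigh quotient
print(np.isclose(rho, lam[1]))      # True: the quotient equals the eigenvalue
print(np.allclose(L @ x, rho * (D @ x)))  # True: Kx = rho(x) M x
```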

Page 13:

Limits of Rayleigh quotients

But small variations kill us:

max_{x≠0} (x^T A x) / ‖x‖₂² = λmax(A), but

max_{x≠0} (x^T A x) / ‖x‖₁² = 1 − ω^{−1}

where ω is the max clique size (Motzkin–Straus).

Page 14:

Rayleigh quotients and eigenproblems

For M positive definite, we have the generalized eigendecomposition

W^T M W = I and W^T K W = Λ = diag(λ_1, ..., λ_n).

For any x, the generalized Rayleigh quotient is a weighted average of eigenvalues:

ρ_{K,M}(x) = Σ_{j=1}^{n} λ_j z_j²,

where z = W^{−1} x / ‖W^{−1} x‖₂. Therefore:

1. λmax = max_{x≠0} ρ_{K,M}(x)
2. If ρ_{K,M}(s) is near λmax, most of the weight is on large eigenvalues, so s nearly lies in the invariant subspace associated with the large eigenvalues.

So look at invariant subspaces for extreme eigenvalues.
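The weighted-average identity can be verified directly on an assumed random symmetric-definite pair (K, M); SciPy's `eigh(K, M)` returns exactly the M-orthonormal basis W used above:

```python
import numpy as np
from scipy.linalg import eigh

# Assumed example: random symmetric K and positive definite M.
rng = np.random.default_rng(0)
K = rng.standard_normal((6, 6)); K = K + K.T
M = rng.standard_normal((6, 6)); M = M @ M.T + 6 * np.eye(6)

lam, W = eigh(K, M)                  # W^T M W = I, W^T K W = diag(lam)
x = rng.standard_normal(6)
z = np.linalg.solve(W, x)
z /= np.linalg.norm(z)               # z = W^{-1} x / ||W^{-1} x||_2
rho = (x @ K @ x) / (x @ M @ x)
print(np.isclose(rho, lam @ z**2))   # True: a weighted average of eigenvalues
print(bool(rho <= lam[-1] + 1e-9))   # True: bounded above by lambda_max
```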

Page 15:

Another reason to look at subspaces

Spectrum of a G_{n,p} graph:
- One large eigenvalue ≈ np
- Other eigenvalues between ≈ ±√(np(1−p))/4
- Adjacency matrix = p e e^T + “noise”

Composite model: A ≈ S diag(β) S^T, S ∈ {0,1}^{n×c}
- Motivation: possibly-overlapping random graphs
- Columns of S are one basis for the range space
- Want to go from some general basis back to S

Page 16:

Indicators from subspaces, take 1

U spans a small subspace (e.g. an invariant subspace).

1. If span(u_1, ..., u_c) ≈ span(s_1, ..., s_c) where {s_j} indicate a partition, rows of U in the same partition are identical.
   Idea: treat rows of U as latent coordinates. Cluster.

2. Suppose we have some indicator s ≈ Uy. Then row U(j,:)
   - forms an acute angle with y when s_j = 1
   - is almost normal to y when s_j = 0.

Clustering? What if sets overlap?

Page 17:

Indicators from subspaces, take 2

Suppose s ≈ Uy for some y, with s_i = 1. Want to find s.
Try optimization (a linear program):

minimize ‖s‖₁ (proxy for sparsity of s)
s.t. s = Uy (s in the right space)
     s_i ≥ 1 (“seed” constraint)
     s ≥ 0 (componentwise nonnegativity)

This recovers the smallest set containing node i if
- U = SY^{−1} exactly, and
- each set contains at least one element only in that set.

(It frequently works if there is not “too much” overlap.)

What about noise? Generally we need a thresholding strategy.
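The LP can be sketched with `scipy.optimize.linprog`. Since s ≥ 0, we have ‖s‖₁ = e^T U y, so the problem is linear in y. The two-set example below (U a mixed basis for the span of two true indicators) is an assumed illustration:

```python
import numpy as np
from scipy.optimize import linprog

# Assumed example: two disjoint sets {0,1,2} and {3,4}; U is a mixed basis
# for span(S), playing the role of U = S Y^{-1}.
S = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
Y = np.array([[2.0, 1.0], [1.0, 1.0]])           # arbitrary invertible mixing
U = S @ Y                                        # general basis, same span

i = 0                                            # seed node
c = U.sum(axis=0)                                # ||s||_1 = e^T U y since s >= 0
A_ub = np.vstack([-U, -U[i:i + 1, :]])           # -Uy <= 0 and -(Uy)_i <= -1
b_ub = np.concatenate([np.zeros(5), [-1.0]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2)
s = U @ res.x
print(np.round(s, 6))   # ≈ [1, 1, 1, 0, 0]: the smallest set containing the seed
```

Note the explicit `bounds=(None, None)`: `linprog` defaults to y ≥ 0, but y must be free here.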

Page 18:

Indicators from subspaces, take 3

An alternate optimization (box-constrained quadratic program):

minimize (1/2) s^T P s + τ ‖s‖₁
s.t. s_i ≥ 1
     s ≥ 0

This recovers the LP with P = I − U U^T and τ → 0 (assuming U^T U = I).
- P can be a more general semidefinite matrix (e.g. P = L)
- The size of τ controls sparsity (the choice can be automated)
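One way to sketch this QP: on the feasible set s ≥ 0 we have ‖s‖₁ = e^T s, so the objective is a smooth quadratic and a bound-constrained solver like L-BFGS-B applies. The example below (same assumed two-set U as before, P = I − UU^T) is an illustration, not the slides' implementation:

```python
import numpy as np
from scipy.optimize import minimize

# Assumed example: orthonormal basis U for the span of two disjoint indicators.
S = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
U, _ = np.linalg.qr(S)               # U^T U = I
P = np.eye(5) - U @ U.T              # penalizes departure from span(U)
tau = 0.01
i = 0                                 # seed node

def f(s):
    return 0.5 * s @ P @ s + tau * s.sum()   # ||s||_1 = sum(s) since s >= 0

def grad(s):
    return P @ s + tau

bounds = [(1.0, None) if j == i else (0.0, None) for j in range(5)]
res = minimize(f, np.ones(5), jac=grad, bounds=bounds, method="L-BFGS-B")
print(np.round(res.x, 2))   # ≈ [1, 0.97, 0.97, 0, 0]: near the seed's indicator
```

The small τ shrinks the indicator slightly toward zero, which is why thresholding follows.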

Page 19:

Summary so far

Two pieces to spectral community detection:
- Pull out an invariant subspace
- Mine the subspace for community structure

Page 20:

The random walker

Page 21:

The random walker, take 1

Lazy random walk on a graph:

p_{k+1} = T̂ p_k = (1/2)(I + T) p_k → p_∞ = d/m

where T = A D^{−1} is a transition matrix (column stochastic) and T̂ is its lazy variant.

Idea: extract community structure from random walk dynamics
- Start at a node i and take a few steps
- Rapidly explore the local community (only one?)
- Probability “leaks” into adjoining communities (slowly?)
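A minimal sketch of the lazy walk and its stationary distribution, on an assumed example graph (two triangles joined by a bridge):

```python
import numpy as np

# Assumed example: two triangles joined by bridge edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
d = A.sum(axis=1)
T = A / d                            # column-stochastic: entry (i,j) is A_ij / d_j

p = np.zeros(n); p[0] = 1.0          # start the walker at node 0
for _ in range(200):
    p = 0.5 * (p + T @ p)            # lazy step p <- (1/2)(I + T) p
print(np.allclose(p, d / d.sum()))   # True: p_infty = d/m
```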

Page 22:

The random walker, take 2

If a random walk starts at known i and goes k steps:

p_k(j) = P{end at j | start at i} = e_j^T T^k e_i

If we end at known j, with a uniform prior on the starting point, then

q_k(i) = P{start at i | end at j} = (1/Z_{j,k}) e_j^T T^k e_i

Idea: extract structure from how fast we forget starting points.
Day 1: David came up with a funny joke!
Day 2: There’s a joke going around the CS department.
Day 3: I read this bad joke while browsing the web...

Page 23:

Simon-Ando theory

Markov chain with loosely-coupled subchains. The dynamics are:
- Rapid local mixing: after a few steps

  p_k ≈ Σ_{j=1}^{c} α_{j,k} p_∞^{(j)}

  where p_∞^{(j)} is a local equilibrium for the j-th subchain
- Slow equilibration: α_{j,k} → α_{j,∞}.

Alternately, rapid local mixing looks like:

φ_k ≈ Σ_{j=1}^{c} γ_{j,k} s_j

where s_j is an indicator for nodes in one subchain.

Page 24:

Simon-Ando theory

In chemistry: transient dynamics = transitions among metastable states.

In network analysis: transient dynamics = transitions among communities?

But what if mixing happens so fast we miss the transient?

Page 25:

Random walks and spectrum

Write

T̂ = (1/2)(I + T),  T = A D^{−1} = D^{1/2} Â D^{−1/2},  Â = D^{−1/2} A D^{−1/2}.

We have an eigendecomposition Â = Q Λ Q^T. Then

p_k = (D^{1/2} Q) ((I + Λ)/2)^k (Q^T D^{−1/2}) p_0
φ_k = (D^{−1/2} Q) ((I + Λ)/2)^k (Q^T D^{1/2}) φ_0 / Z_{j,k}
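The spectral formula for p_k can be checked against direct iteration of the lazy walk; the graph (two triangles plus a bridge) is an assumed example:

```python
import numpy as np

# Assumed example: two triangles joined by bridge edge (2,3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1
d = A.sum(axis=1)
Dh = np.diag(np.sqrt(d))             # D^{1/2}
Dhi = np.diag(1 / np.sqrt(d))        # D^{-1/2}
Ahat = Dhi @ A @ Dhi                 # symmetrized transition matrix
lam, Q = np.linalg.eigh(Ahat)

p0 = np.zeros(n); p0[0] = 1.0
k = 5

# Direct iteration of the lazy walk
T = A / d
p = p0.copy()
for _ in range(k):
    p = 0.5 * (p + T @ p)

# Spectral formula: p_k = (D^{1/2} Q) ((I + Lambda)/2)^k (Q^T D^{-1/2}) p_0
p_spec = Dh @ Q @ np.diag(((1 + lam) / 2) ** k) @ Q.T @ Dhi @ p0
print(np.allclose(p, p_spec))        # True
```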

Page 26:

Spectral picture of Simon-Ando

φ_k = (D^{−1/2} Q) ((I + Λ)/2)^k (Q^T D^{1/2}) φ_0 / Z_{j,k}
    = (1/Z_{j,k}) Σ_{j=1}^{n} ((1 + λ_j)/2)^k c_j (D^{−1/2} q_j)
    ≈ (1/Z_{j,k}) Σ_{j=1}^{c} ((1 + λ_j)/2)^k c_j (D^{−1/2} q_j)

- Gap in the spectrum between λ_c and λ_{c+1}
- After a few steps, ((1 + λ_{c+1})/2)^k is negligible, but ((1 + λ_c)/2)^k is not. So φ_k lies approximately in the span of D^{−1/2} q_1, ..., D^{−1/2} q_c.
- Treat as a perturbation of the decoupled case, where subchain indicator vectors are eigenvectors for unit eigenvalues: D^{−1/2} q_j ≈ linear combination of indicators, j = 1, ..., c.

Page 27:

Summary so far

Two pieces to spectral community detection:
- Pull out an invariant subspace
- Mine the subspace for community structure

Motivation: optimization or random walk dynamics.

But...
- What about when n and c are both large?
- What if there is no clear spectral gap?

Would like an alternative to invariant subspaces!

Page 28:

Eigenvectors to Ritz vectors

Eigenvectors are stationary points of Rayleigh quotients.
Finding stationary points within a subspace gives Ritz vectors.

The usual approach to large-scale eigenproblems:
1. Generate a basis for a Krylov subspace
   K_k(A, x_0) = span{x_0, A x_0, A² x_0, ..., A^{k−1} x_0}
2. Ritz values rapidly approximate extreme eigenvalues
3. Ritz vectors approximate extreme eigenvectors

Idea: instead of searching an invariant subspace, search a space spanned by a few scaled Ritz vectors. This pulls out the dynamics of short random walks (vs. long ones).

Page 29:

Current favorite method

1. Pick “seed” nodes j_1, j_2, ...
2. Take short random walks (length k) from each seed
3. Extract a few Ritz vectors (fewer than k) from span{φ_0, φ_1, ..., φ_{k−1}}
4. Use quadratic programming to find approximate indicators in the subspace spanned by all Ritz vectors
5. Possibly add more seeds and return to step 1
6. Threshold to get an initial indicator approximation
7. Greedily optimize the angle between the indicator and the space

Page 30:

Wang test graph

[Figure: drawing of the 30-node Wang test graph]

Page 31:

Spectrum for Wang test graph

[Figure: plot of eigenvalue vs. index, values ranging from about −0.5 to 1]

Page 32:

Zachary Karate graph

[Figure: drawing of the 34-node Zachary karate club graph]

Page 33:

Spectrum for Karate

[Figure: plot of eigenvalue vs. index, values ranging from about −0.5 to 1]

Page 34:

Football graph

[Figure: drawing of the 115-node football graph]

Page 35:

Spectrum for Football

[Figure: plot of eigenvalue vs. index, values ranging from about 0 to 1]

Page 36:

Dolphin graph

[Figure: drawing of the 62-node dolphin graph]

Page 37:

Spectrum for Dolphin

[Figure: plot of eigenvalue vs. index, values ranging from about −0.5 to 1]

Page 38:

Non-overlapping synthetic benchmark (µ = 0.5)

[Figure: adjacency sparsity pattern, 1000 nodes, nz = 15746]

Page 39:

Spectrum for synthetic benchmark

[Figure: plot of eigenvalue vs. index, values ranging from about 0.4 to 1]

Page 40:

Score vector

[Figure: plot of score vs. node index, values in [0, 1]]

Score vector for the two-node seed of 492 and 513 in the first LFR benchmark graph. Ten steps, three Ritz vectors.

Page 41:

Non-overlapping synthetic benchmark (µ = 0.6)

[Figure: adjacency sparsity pattern, 1000 nodes, nz = 15316]

Page 42:

Spectrum for synthetic benchmark

[Figure: plot of eigenvalue vs. index, values ranging from about 0.4 to 1]

Page 43:

Score vector

[Figure: plot of score vs. node index, values in [0, 1]]

Score vector for the two-node seed of 492 and 513 in the first LFR benchmark graph. Ten steps, three Ritz vectors.

Page 44:

Overlapping synthetic benchmark (µ = 0.3)

- 1000 nodes
- 47 communities
- 500 nodes belong to two communities

Page 45:

Spectrum for synthetic benchmark

[Figure: plot of eigenvalue vs. index, values ranging from about 0.4 to 1]

Page 46:

Score vector

[Figure: plot of score vs. node index, values in [0, 1]]

Score vector for the two-node seed of 521 and 892. The desired indicator is in red.

Page 47:

Score vector

[Figure: plot of score vs. node index, values in [0, 1.5]]

Score vector for the two-node seed of 521 and 892 plus twelve reseeds. The desired indicator is in red.

Page 48:

Conclusions

Classic spectral methods use eigenvectors to find communities, but:

- We don’t need to stop at partitioning!
  - Overlap is okay
  - The key is how we mine the subspace
- We don’t need to stop at eigenvectors!
  - We can also use Ritz vectors
  - Computation is cheap: short random walks

