Large networks, clusters and Kronecker products

Date post: 23-Feb-2016
Jure Leskovec ([email protected]), Computer Science Department, Cornell University / Stanford University. Joint work with: Jon Kleinberg (Cornell), Christos Faloutsos (CMU), Michael Mahoney (Stanford), Kevin Lang (Yahoo), Anirban Dasgupta (Yahoo).
Page 1: Large networks, clusters and Kronecker products

Large networks, clusters and Kronecker products
Jure Leskovec ([email protected])
Computer Science Department
Cornell University / Stanford University
Joint work with: Jon Kleinberg (Cornell), Christos Faloutsos (CMU), Michael Mahoney (Stanford), Kevin Lang (Yahoo), Anirban Dasgupta (Yahoo)

Page 2: Large networks, clusters and Kronecker products

Rich data: Networks
Large on-line computing applications have detailed records of human activity:
  On-line communities: Facebook (120 million)
  Communication: Instant Messenger (~1 billion)
  News and social media: Blogging (250 million)
We model the data as a network (an interaction graph)
Can observe and study phenomena at scales not possible before
[Figure: Communication network]

Page 3: Large networks, clusters and Kronecker products

Small vs. Large networks
Community (cluster) structure of networks
[Figures: Collaborations in NetSci (N=380); Tiny part of a large social network]
What is the structure of the network? How can we model that?

Page 4: Large networks, clusters and Kronecker products

Conductance (normalized cut): Φ(S)
How expressed are communities?
How community-like is a set of nodes?
Idea: Use approximation algorithms for NP-hard graph partitioning problems as experimental probes of network structure.
Small Φ(S) == more community-like sets of nodes
[Figure: node sets S and S’]
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
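
As a rough illustration of the conductance score, here is a minimal Python sketch. It assumes the commonly used definition, Φ(S) = (edges leaving S) / min(vol(S), vol(rest)), where vol is the sum of node degrees; the slide itself does not spell out the formula. The names `conductance` and `toy` and the toy graph are hypothetical.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # undirected graph stored as adjacency sets


def conductance(graph: Graph, nodes: Set[int]) -> float:
    """Edges leaving `nodes` divided by the smaller of the two volumes.

    Assumed definition (not taken from the slides): volume = sum of degrees.
    """
    cut = sum(1 for u in nodes for v in graph[u] if v not in nodes)
    vol_in = sum(len(graph[u]) for u in nodes)
    vol_out = sum(len(graph[u]) for u in graph if u not in nodes)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
toy: Graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

print(conductance(toy, {0, 1, 2}))  # a whole triangle: one cut edge
print(conductance(toy, {0, 1}))     # part of a triangle: two cut edges
```

The whole triangle scores lower than the partial one, which matches the reading that a small Φ(S) marks a more community-like set.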

Page 5: Large networks, clusters and Kronecker products

Network Community Profile Plot
We define: Network community profile (NCP) plot
Plot the score of the best community of size k
[Figure: NCP plot; x-axis: community size, log k; y-axis: log Φ(k); example points Φ(5)=0.25 at k=5 and Φ(7)=0.18 at k=7]
[w/ Mahoney, Lang, Dasgupta, WWW ’08]
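
The NCP idea can be sketched on a tiny graph by exhaustive search. This is only an illustration: the talk relies on approximation algorithms because the underlying partitioning problem is NP-hard, whereas the brute-force loop below is feasible only for a handful of nodes. The conductance definition (cut edges over the smaller volume) and all names are assumptions of this sketch.

```python
from itertools import combinations
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]


def conductance(graph: Graph, nodes: Set[int]) -> float:
    """Assumed definition: cut edges / min(volume inside, volume outside)."""
    cut = sum(1 for u in nodes for v in graph[u] if v not in nodes)
    vol_in = sum(len(graph[u]) for u in nodes)
    vol_out = sum(len(graph[u]) for u in graph if u not in nodes)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")


def ncp_brute_force(graph: Graph, max_k: int) -> List[float]:
    """Phi(k) for k = 1..max_k: the best (lowest) score over all k-node sets."""
    profile = []
    for k in range(1, max_k + 1):
        best = min(conductance(graph, set(c)) for c in combinations(graph, k))
        profile.append(best)
    return profile


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
toy: Graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

for k, phi in enumerate(ncp_brute_force(toy, 3), start=1):
    print(k, round(phi, 3))
```

Plotting these (k, Φ(k)) pairs on log-log axes would give the NCP plot for the toy graph.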

Page 6: Large networks, clusters and Kronecker products

NCP plot: Network Science
Collaborations between scientists in Networks [Newman, 2005]
[Figure: NCP plot; x-axis: community size, log k; y-axis: conductance, log Φ(k)]
[w/ Mahoney, Lang, Dasgupta, WWW ’08]

Page 7: Large networks, clusters and Kronecker products

NCP plot: Large network
Typical example: General relativity collaboration network (4,158 nodes, 13,422 edges)
[w/ Mahoney, Lang, Dasgupta, WWW ’08]

Page 8: Large networks, clusters and Kronecker products

More NCP plots of networks
[w/ Mahoney, Lang, Dasgupta, WWW ’08]

Page 9: Large networks, clusters and Kronecker products

NCP: LiveJournal (n=5m, e=42m)
[Figure: NCP plot; x-axis: k (community size); y-axis: Φ(k) (conductance)]
Better and better communities
Communities get worse and worse
Best community has ~100 nodes
[w/ Mahoney, Lang, Dasgupta, WWW ’08]

Page 10: Large networks, clusters and Kronecker products

Community size is bounded!
Each dot is a different network
Practically constant!
[w/ Mahoney, Lang, Dasgupta, WWW ’08]

Page 11: Large networks, clusters and Kronecker products

Structure of large networks
Core-periphery (jellyfish, octopus)
  Small good communities
  Denser and denser core of the network
  Core contains ~60% of the nodes and ~80% of the edges
So, what’s a good model?

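Page 12: Large networks, clusters and Kronecker products

Kronecker product: Definition
Kronecker product of matrices A (N x M) and B (K x L) is the N*K x M*L matrix given by
$C = A \otimes B = \begin{pmatrix} a_{1,1}B & a_{1,2}B & \cdots & a_{1,M}B \\ \vdots & \vdots & \ddots & \vdots \\ a_{N,1}B & a_{N,2}B & \cdots & a_{N,M}B \end{pmatrix}$
We define a Kronecker product of two graphs as a Kronecker product of their adjacency matrices
[w/ Chakrabarti-Kleinberg-Faloutsos, PKDD ’05]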
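
To make the definition concrete, here is a small pure-Python sketch of the Kronecker product of two matrices given as nested lists. The function name `kron` and the example initiator are illustrative choices, not part of the talk.

```python
from typing import List

Matrix = List[List[float]]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product: block (i, j) of the result equals a[i][j] * b."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i][j] * b[p][q] for j in range(len(a[0])) for q in range(cols_b)]
        for i in range(len(a))
        for p in range(rows_b)
    ]


# Hypothetical 2-node graph (adjacency matrix with one self-loop).
g1 = [[1, 1],
      [1, 0]]

for row in kron(g1, g1):  # adjacency matrix of the 4-node product graph
    print(row)
```

An N x M matrix times a K x L matrix yields an N*K x M*L result, so the product of two 2-node graphs has 4 nodes.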

Page 13: Large networks, clusters and Kronecker products

Kronecker graphs
Kronecker graph: a growing sequence of graphs by iterating the Kronecker product: $G_k = G_{k-1} \otimes G_1$
Each Kronecker multiplication exponentially increases the size of the graph: if $G_1$ has $N_1$ nodes, $G_k$ has $N_1^k$ nodes
One can easily use multiple initiator matrices (G1’, G1’’, G1’’’) that can be of different sizes
[w/ Chakrabarti-Kleinberg-Faloutsos, PKDD ’05]
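
The growth of the sequence can be checked with a short sketch that repeatedly multiplies a graph by its initiator. The 3-node initiator below is a hypothetical example (a path with self-loops), and `kron` and `kron_power` are illustrative names.

```python
from typing import List

Matrix = List[List[int]]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a[i][j] * b[p][q] for j in range(len(a[0])) for q in range(len(b[0]))]
        for i in range(len(a))
        for p in range(len(b))
    ]


def kron_power(g1: Matrix, k: int) -> Matrix:
    """k-th Kronecker power: G_k = G_{k-1} (x) G_1, starting from G_1."""
    result = g1
    for _ in range(k - 1):
        result = kron(result, g1)
    return result


# Hypothetical 3-node initiator: a path 0-1-2 with self-loops (7 nonzeros).
g1 = [[1, 1, 0],
      [1, 1, 1],
      [0, 1, 1]]

for k in range(1, 4):
    gk = kron_power(g1, k)
    print(k, len(gk), sum(map(sum, gk)))  # k, number of nodes, nonzero entries
```

Both the node count (3^k) and the number of nonzero adjacency entries (7^k) grow geometrically with k.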

Page 14: Large networks, clusters and Kronecker products

Kronecker graphs
Kronecker graphs mimic real networks:
  Theorem: Power-law degree distribution, Densification, Shrinking/stabilizing diameter, Spectral properties
[Figure: edge probabilities pij for the initiator (3x3) and its Kronecker powers (9x9) and (27x27)]
Starting intuition: Recursion & self-similarity
[w/ Chakrabarti, Kleinberg, Faloutsos, PKDD ’05]
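
One way to read the edge-probability picture: take the Kronecker power of a probability initiator and flip an independent coin for every entry. The sketch below follows that reading; the initiator values, the function names, and the choice to sample every ordered pair (giving a directed graph with possible self-loops) are assumptions made for illustration.

```python
import random
from typing import List

Matrix = List[List[float]]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a[i][j] * b[p][q] for j in range(len(a[0])) for q in range(len(b[0]))]
        for i in range(len(a))
        for p in range(len(b))
    ]


def edge_probabilities(theta: Matrix, k: int) -> Matrix:
    """Entry (u, v) of the k-th Kronecker power is the probability p_uv."""
    probs = theta
    for _ in range(k - 1):
        probs = kron(probs, theta)
    return probs


def sample_graph(probs: Matrix, seed: int = 0) -> List[List[int]]:
    """Include each ordered pair (u, v) independently with probability p_uv."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for p in row] for row in probs]


theta = [[0.9, 0.5], [0.5, 0.1]]     # illustrative probability initiator
probs = edge_probabilities(theta, 3)  # 8 x 8 matrix of edge probabilities
adjacency = sample_graph(probs, seed=1)
print(sum(map(sum, probs)))           # expected number of sampled entries
print(sum(map(sum, adjacency)))       # entries actually sampled
```

Materializing the full probability matrix costs quadratic memory in the number of nodes, so this form is only practical for small k.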

Page 15: Large networks, clusters and Kronecker products

Various Kronecker initiator matrices

Page 16: Large networks, clusters and Kronecker products

Kronecker graphs: Interpretation
Initiator matrix G1 is a similarity matrix
Node u is described with k binary attributes: u1, u2, …, uk
Probability of a link between nodes u, v: $P(u,v) = \prod_{i=1}^{k} G_1[u_i, v_i]$
Example with $G_1 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ (rows and columns indexed by attribute values 0 and 1):
  u = (0,1,1,0), v = (1,1,0,1)
  P(u,v) = b∙d∙c∙b
Given a real graph, how to estimate the initiator G1?
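
The attribute formula can be checked numerically against the Kronecker power. The sketch below uses hypothetical values for a, b, c, d and assumes that a node's index in the power matrix is its attribute vector read as a binary number with u1 as the most significant bit.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def link_probability(g1: Matrix, u: Sequence[int], v: Sequence[int]) -> float:
    """P(u, v) = product over attributes i of G1[u_i][v_i]."""
    p = 1.0
    for ui, vi in zip(u, v):
        p *= g1[ui][vi]
    return p


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a[i][j] * b[p][q] for j in range(len(a[0])) for q in range(len(b[0]))]
        for i in range(len(a))
        for p in range(len(b))
    ]


def to_index(bits: Sequence[int]) -> int:
    """Read an attribute vector as a binary number (first attribute = top bit)."""
    index = 0
    for bit in bits:
        index = 2 * index + bit
    return index


# Hypothetical initiator values: a=0.9, b=0.5, c=0.4, d=0.1.
g1 = [[0.9, 0.5],
      [0.4, 0.1]]
u, v = (0, 1, 1, 0), (1, 1, 0, 1)

g4 = kron(kron(kron(g1, g1), g1), g1)  # 16 x 16 matrix for k = 4
print(link_probability(g1, u, v))      # b * d * c * b
print(g4[to_index(u)][to_index(v)])    # same entry via the Kronecker power
```

Computing a single P(u,v) this way takes k multiplications and never builds the full matrix.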

Page 17: Large networks, clusters and Kronecker products

Estimating Kronecker graphs
Want to generate realistic networks:
  Given a real network, generate a synthetic network
  Compare graph properties, e.g., degree distribution
How to estimate the initiator matrix $G_1 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$:
  Method of moments [Owen ’09]: compare counts of subgraphs and solve
  Maximum likelihood [Leskovec & Faloutsos, ’07]: $\arg\max_{G_1} P(G \mid G_1)$
  SVD [Van Loan & Pitsianis ’93]: $\min_{G_1} \lVert G - G_1 \otimes G_1 \rVert_F^2$, can solve using SVD
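
As a hedged illustration of the maximum-likelihood idea, the sketch below evaluates log P(G | G1) for a small graph and compares two candidate initiators. It makes a strong simplifying assumption: the correspondence between graph nodes and Kronecker indices is fixed and known. It also treats every ordered pair as an independent coin flip. The adjacency matrix and candidate values are made up for the example.

```python
import math
from typing import List

Matrix = List[List[float]]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two matrices given as nested lists."""
    return [
        [a[i][j] * b[p][q] for j in range(len(a[0])) for q in range(len(b[0]))]
        for i in range(len(a))
        for p in range(len(b))
    ]


def log_likelihood(adj: List[List[int]], g1: Matrix, k: int) -> float:
    """log P(G | G1) with a fixed node ordering (an assumption of this sketch).

    Requires every edge probability to lie strictly between 0 and 1.
    """
    probs = g1
    for _ in range(k - 1):
        probs = kron(probs, g1)
    total = 0.0
    for adj_row, prob_row in zip(adj, probs):
        for present, p in zip(adj_row, prob_row):
            total += math.log(p) if present else math.log(1.0 - p)
    return total


# Hypothetical 4-node graph: node 0 links to itself and to nodes 1 and 2.
adj = [[1, 1, 1, 0],
       [1, 0, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 0]]

candidate_a = [[0.9, 0.5], [0.5, 0.1]]
candidate_b = [[0.1, 0.5], [0.5, 0.9]]
print(log_likelihood(adj, candidate_a, 2))
print(log_likelihood(adj, candidate_b, 2))
```

The candidate with the higher log-likelihood explains the observed graph better; an estimator would search over the initiator entries rather than compare just two guesses.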

Page 18: Large networks, clusters and Kronecker products

Kronecker & Network structure
What do estimated parameters tell us about the network structure?
$G_1 = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$
[Figure: two groups of nodes; labels: a edges, d edges, b edges, c edges]
[w/ Dasgupta-Lang-Mahoney, WWW ’08]

Page 19: Large networks, clusters and Kronecker products

Kronecker & Network structure
What do estimated parameters tell us about the network structure?
$G_1 = \begin{pmatrix} 0.9 & 0.5 \\ 0.5 & 0.1 \end{pmatrix}$
[Figure: Core (0.9 edges) and Periphery (0.1 edges), with 0.5 edges between them in each direction]
Core-periphery (jellyfish, octopus)
[w/ Dasgupta-Lang-Mahoney, WWW ’08]
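
A quick calculation suggests why this initiator reads as core-periphery. Under the product formula P(u,v) = ∏ G1[u_i, v_i], the expected out-degree of node u factorizes into a product of row sums of G1, one per attribute. The sketch below computes it for k = 3; the function name and the choice of k are illustrative, and the count includes self-loops.

```python
from itertools import product
from typing import List, Sequence

Matrix = List[List[float]]


def expected_out_degree(g1: Matrix, u: Sequence[int]) -> float:
    """Sum over all v of P(u, v), which equals the product of row sums G1[u_i]."""
    degree = 1.0
    for ui in u:
        degree *= sum(g1[ui])
    return degree


g1 = [[0.9, 0.5],
      [0.5, 0.1]]
k = 3  # illustrative number of attributes, giving 2**k = 8 nodes

for u in product((0, 1), repeat=k):
    print(u, round(expected_out_degree(g1, u), 3))
```

Nodes whose attributes are mostly 0 get the largest expected degree (the core), and each additional 1 attribute scales the degree down by 0.6 / 1.4 (the periphery).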

Page 20: Large networks, clusters and Kronecker products

Small vs. Large networks
Small and large networks are very different:
  Collaboration network (N=4,158, E=13,422): $G_1 = \begin{pmatrix} 0.99 & 0.54 \\ 0.49 & 0.13 \end{pmatrix}$
  Scientific collaborations (N=397, E=914): $G_1 = \begin{pmatrix} 0.99 & 0.17 \\ 0.17 & 0.82 \end{pmatrix}$

Page 21: Large networks, clusters and Kronecker products

Conclusion
Computational tools as probes into the structure of large networks
Community structure of large networks:
  Core-periphery structure
  Scale to natural community size: Dunbar number
Model: Kronecker graphs
  Analytically tractable: provable properties
  Can efficiently estimate parameters from data
Implications:
  No large clusters: no/little hierarchical structure
  Can’t be well embedded: no underlying geometry

Page 22: Large networks, clusters and Kronecker products

Reflections
Why are networks the way they are?
  Only recently have basic properties been observed on a large scale
  Confirms social science intuitions; calls others into question
What are good tractable network models?
  Builds intuition and understanding
Benefits of working with large data
  Observe structures not visible at smaller scales

Page 23: Large networks, clusters and Kronecker products

[email protected]
http://cs.stanford.edu/~jure

Page 24: Large networks, clusters and Kronecker products

References
Graphs over Time: Densification Laws, Shrinking Diameters and Possible Explanations, by J. Leskovec, J. Kleinberg, C. Faloutsos, KDD 2005
Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication, by J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, PKDD 2005
Scalable Modeling of Real Graphs using Kronecker Multiplication, by J. Leskovec, C. Faloutsos, ICML 2007
Statistical Properties of Community Structure in Large Social and Information Networks, by J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney, WWW 2008
Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters, by J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney, arXiv 2008

