Finding All Maximal Cliques in Very Large Social Networks

16 March 2016, EDBT 2016, Bordeaux, France

Alessio Conte°, Roberto De Virgilio§, Antonio Maccioni§, Maurizio Patrignani§, Riccardo Torlone§

°Università di Pisa §Università Roma Tre

Social Network Analysis

Finding all Maximal Cliques in Very Large Social Networks @ EDBT 2016 1

Community Detection

Cliques

[Figure: an example graph with nodes A, J, H, F, Z, followed by the node subsets {A, J}, {A, J, F}, and {A, J, H, F}]

Maximal Clique Enumeration (MCE)

[Figure: an example graph with nodes A, J, H, F, D, E, S, Y, G, U, W, shown with its maximal cliques]

Distributed MCE

[Cheng et al., KDD 2012] [Cheng et al., SIGMOD 2010]

Block size m (maximum number of nodes) = 5

[Figure: example graph split into blocks; node labels A, J, H, F, D, U, S, W, E, G, Y, with D, E, and S appearing more than once]

[Figure: a larger example graph with nodes A, J, H, F, Z, R, P, D, L, E, S, W, U, X, G, Y]

[Figure: the example graph with nodes A, J, H, F, Z, R, P, D, L, E, S, W, U, X, G, Y split into blocks of at most 5 nodes; D, E, and S appear in more than one block]

[Figure: the same block decomposition, with the nodes D, E, S singled out under the label "undetected cliques"]

Hub Nodes Effect

Block size m (maximum number of nodes) = 5

[Figure: example graph with nodes A, J, H, F, Z, R, P, D, L, E, S, W, U, X, G, Y]

The Block Size Effect

* Taken from [Cheng et al., KDD 2012]

Efficiency vs. completeness/correctness

Overview of the Approach

[Diagram: FIND MAX CLIQUES takes G = (N, E) and returns C = c1, c2, ..., cn. The 1st level decomposition splits N into Nf and Nh. The 2nd level decomposition splits Nf into Block 1, ..., Block z, and block analysis of the blocks yields Cf. The graph induced by Nh is passed again to FIND MAX CLIQUES, which yields Ch. The output is C = Cf ∪ Ch.]

1st Level Decomposition

Separate the hubs from the rest of the nodes in N, according to a maximum block size m

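[Diagram: within FIND MAX CLIQUES, the 1st level decomposition splits the nodes N of G = (N, E) into Nf and Nh, illustrated on the example graph with nodes A, J, H, F, Z, R, P, L, W, U, X, G, Y, D, E, S]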
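A minimal sketch of this split, assuming that a hub is a node whose degree is greater than or equal to m (the threshold used in the convergence theorem of this talk) and that the graph is stored as a plain adjacency dictionary. The function name `split_hubs` and the toy graph are illustrative and not taken from the slides.

```python
def split_hubs(adj, m):
    """Split the nodes of `adj` into feasible nodes (degree < m) and hubs (degree >= m)."""
    feasible = {v for v, nbrs in adj.items() if len(nbrs) < m}
    hubs = set(adj) - feasible
    return feasible, hubs


# Hypothetical toy graph: D is the only node with at least m = 5 neighbours.
adj = {
    "A": {"D", "J"},
    "J": {"A", "D"},
    "D": {"A", "J", "E", "S", "L"},
    "E": {"D", "S"},
    "S": {"D", "E"},
    "L": {"D"},
}
feasible, hubs = split_hubs(adj, m=5)
```

With this threshold, a feasible node together with all its neighbours always fits in a block of at most m nodes, which is the reason for tying the split to the block size.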

1st Level Decomposition - Lemma

The set of all maximal cliques C of G can be obtained by computing Cf and Ch alone
- C = Cf ∪ Ch
- Cf is the set of cliques with at least one node in Nf
- Ch is the set of cliques with all the nodes in Nh
- the proof of this Lemma is in our paper
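The lemma suggests a recursive scheme, sketched below on a single machine. Several choices here are assumptions of the sketch rather than the method of the talk: hubs are nodes of degree at least m, each block is simply the closed neighbourhood of one feasible node (the talk builds blocks with kernel and border nodes instead), every block is solved with a basic Bron-Kerbosch routine, and the recursion falls back to direct enumeration if no node is feasible. All function names are hypothetical.

```python
def bron_kerbosch(adj, r=frozenset(), p=None, x=frozenset()):
    """Yield every maximal clique of `adj` ({node: set of neighbours})."""
    if p is None:
        p = frozenset(adj)
    if not p and not x:
        yield r
        return
    for v in list(p):
        yield from bron_kerbosch(adj, r | {v}, p & adj[v], x & adj[v])
        p, x = p - {v}, x | {v}


def induced(adj, nodes):
    """Subgraph of `adj` induced by `nodes`."""
    return {v: adj[v] & nodes for v in nodes}


def find_max_cliques(adj, m):
    """Sketch of C = Cf ∪ Ch, with hubs taken to be nodes of degree >= m."""
    if not adj:
        return set()
    feasible = {v for v in adj if len(adj[v]) < m}
    hubs = set(adj) - feasible
    if not feasible:
        # Assumption of this sketch: if every node is a hub, enumerate directly.
        return set(bron_kerbosch(adj))

    # Cf: cliques with at least one feasible node. Each block is the closed
    # neighbourhood of a feasible node, so it has at most m nodes.
    c_f = set()
    for v in feasible:
        c_f.update(bron_kerbosch(induced(adj, adj[v] | {v})))

    # Ch: cliques made only of hubs, found by recursing on the induced graph
    # and discarding those that some feasible node would extend.
    c_h = {
        c for c in find_max_cliques(induced(adj, hubs), m)
        if not any(c <= adj[u] for u in feasible)
    }
    return c_f | c_h


# Hypothetical toy graph: the hubs D, E, S form a triangle, each with two leaves.
adj = {
    "D": {"E", "S", "A", "L"},
    "E": {"D", "S", "Y", "G"},
    "S": {"D", "E", "U", "W"},
    "A": {"D"}, "L": {"D"}, "Y": {"E"}, "G": {"E"}, "U": {"S"}, "W": {"S"},
}
cliques = find_max_cliques(adj, m=3)
```

In this toy graph the clique {D, E, S} contains no feasible node, so it is recovered only by the recursion on the hubs, which is the case the lemma is about.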

2nd Level Decomposition

[Diagram: within FIND MAX CLIQUES, the 2nd level decomposition splits Nf into Block 1, ..., Block z]

[Figure: step-by-step construction of block B1 on the example graph, starting from nodes A, J, H and then adding F and D; the legend distinguishes kernel nodes, border nodes, and visited nodes]

Maximal Clique Enumeration

There are many algorithms for MCE
- no single algorithm outperforms all the others
- but each has specific advantages

Tomita et al., Eppstein et al., ...

[Diagram: within FIND MAX CLIQUES, block analysis is applied to Block 1, ..., Block z and produces Cf]
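For orientation, here is a short sketch of one member of this family: Bron-Kerbosch enumeration with a pivot chosen to have as many neighbours as possible among the candidate nodes, the pivot rule usually associated with Tomita et al. It is an illustrative single-machine version, not the implementation used in the talk, and the function name and toy graph are hypothetical.

```python
def maximal_cliques_pivot(adj):
    """Return the maximal cliques of `adj` ({node: set of neighbours}) as frozensets."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend r, x: already explored.
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot: the node of P ∪ X with the most neighbours in P. Only
        # candidates that are not neighbours of the pivot need to be tried.
        pivot = max(p | x, key=lambda u: len(p & adj[u]))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical toy graph: a 4-clique on A, J, H, F plus a pendant node Z attached to F.
adj = {
    "A": {"J", "H", "F"},
    "J": {"A", "H", "F"},
    "H": {"A", "J", "F"},
    "F": {"A", "J", "H", "Z"},
    "Z": {"F"},
}
result = maximal_cliques_pivot(adj)
```

The pivot prunes branches that could only rediscover cliques reachable through the pivot or one of its non-neighbours, which is where most of the savings over the basic recursion come from.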

Block Analysis

We determine the best-fit MCE algorithm on each block

[Diagram: block analysis takes a block, selects the best-fit algorithm (Tomita et al., Eppstein et al., ...), and outputs Cf; within FIND MAX CLIQUES it is applied to Block 1, ..., Block z]

Decision Tree

Recursion Over Hub Nodes

[Diagram: within FIND MAX CLIQUES, the graph induced by Nh is passed again to FIND MAX CLIQUES, which produces Ch]

The induced graph of the hub nodes is recursively processed

[Figure: block B12 on the hub nodes D, E, S; the legend distinguishes kernel nodes, border nodes, and visited nodes]

Convergence Guarantee

Given a degeneracy of the graph lower than the block size, the whole process converges

Theorem. Let G be a graph and let G_i, with i = 1, 2, 3, ..., be a sequence of subgraphs of G such that G_1 = G and G_i, for i > 1, is the graph induced by the nodes of G_{i-1} of degree greater than or equal to m. Let the degeneracy d of G be strictly less than m + 1.

1. There is a value q such that all G_j, with j ≥ q, are empty graphs.

2. There exists a graph with n nodes for which q is Ω(n).
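The sequence in the theorem can be traced directly on a small graph. The sketch below keeps only the nodes of degree at least m at each step and reports the first index q at which the graph is empty. The function name is hypothetical, and returning `None` when every remaining node is a hub (so the sequence stops shrinking) is a convention of this sketch; that situation is what the degeneracy assumption is meant to exclude.

```python
def hub_sequence_length(adj, m):
    """Return the smallest q such that G_q is empty, or None if the sequence stalls."""
    q = 1  # G_1 is the input graph itself
    while adj:
        hubs = {v for v, nbrs in adj.items() if len(nbrs) >= m}
        if len(hubs) == len(adj):
            return None  # every node is a hub: G_{i+1} would equal G_i
        # G_{i+1}: the graph induced by the hubs of G_i.
        adj = {v: adj[v] & hubs for v in hubs}
        q += 1
    return q


# Hypothetical toy graph: the triangle D, E, S with two leaves on each node.
adj = {
    "D": {"E", "S", "A", "L"},
    "E": {"D", "S", "Y", "G"},
    "S": {"D", "E", "U", "W"},
    "A": {"D"}, "L": {"D"}, "Y": {"E"}, "G": {"E"}, "U": {"S"}, "W": {"S"},
}
q = hub_sequence_length(adj, m=3)
```

Here G_2 is the triangle on D, E, S, whose nodes all have degree 2, so G_3 is already empty.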

Degeneracy and Sparsity

A measure of the sparsity of a graph
- it is the highest value d for which the network contains a d-core. A d-core is obtained by recursively removing nodes with degree less than d
- degeneracy is typically < 100 on scale-free real graphs
- Facebook has a degeneracy ~ 54
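The definition translates into a simple peeling procedure: repeatedly remove a node of minimum degree and record the largest degree seen at removal time. The sketch below is a deliberately plain quadratic-time version for small graphs; the function name and example graphs are illustrative.

```python
def degeneracy(adj):
    """Return the degeneracy of an undirected graph given as {node: set of neighbours}."""
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy
    d = 0
    while remaining:
        # Peel off a node of minimum degree in what is left of the graph.
        v = min(remaining, key=lambda u: len(remaining[u]))
        d = max(d, len(remaining[v]))
        for u in remaining.pop(v):
            remaining[u].discard(v)
    return d


# Hypothetical examples: a 4-clique and a star with three leaves.
k4 = {v: set("abcd") - {v} for v in "abcd"}
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
```

A star has a node of high degree but degeneracy 1, which illustrates why a graph with large hubs can still be sparse in this sense.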

Experiments: Decomposition

Experiments: the Block Size Effect

Experiments: Decision Tree

Experiments: Effectiveness

Conclusion

Approach for computing maximal cliques over an arbitrarily large graph
- taking advantage of distributed computation
- leveraging the advantages of different existing algorithms for MCE

Completeness and correctness are not compromised by hub nodes
- unlike the state of the art

Future Work

Take into account the “semantics” of the graph in order to search for specific kinds of cliques

Extend our framework for computing more relaxed communities
- k-plexes, k-clans, k-cores, etc.

Thanks For The Attention
