1
On the Eigenvalue Power Law
Milena Mihail, Georgia Tech
&
Christos Papadimitriou, U.C. Berkeley
2
Internet Measurement and Models
Network and application studies need properties and models of:
Internet graphs & Internet traffic.
Shift of networking paradigm: Open, decentralized, dynamic.
Intense measurement efforts. Intense modeling efforts.
Routers
WWW
P2P
3
Internet & WWW Graphs
[Figure: web pages (http://www.XXX.net, http://www.YYY.com, http://www.ZZZ.edu, http://www.XXX.com, http://www.etc) connected by hyperlinks]
Routers exchanging traffic. Web pages and hyperlinks.
10K – 300K nodes. Average degree ~ 3.
4
Real Internet Graphs
CAIDA http://www.caida.org
Average Degree = Constant
A Few Degrees VERY LARGE
Degrees not sharply concentrated around their mean.
5
Degree-Frequency Power Law
[Figure: plot of frequency (vertical axis) against degree (horizontal axis)]
WWW measurement: Kumar et al 99
Internet measurement: Faloutsos et al 99
E[d] = const., but
No sharp concentration
6
Degree-Frequency Power Law
[Figure: plot of frequency (vertical axis) against degree (horizontal axis)]
E[d] = const., but
No sharp concentration
Erdős-Rényi: sharp concentration
Models by Kumar et al 00, Bollobas et al 01, Fabrikant et al 02
7
Rank-Degree Power Law
[Figure: plot of degree (vertical axis) against rank (horizontal axis); labeled points: UUNET, Sprint, C&W USA, AT&T, BBN]
Internet measurement: Faloutsos et al 99
8
Eigenvalue Power Law
[Figure: plot of eigenvalue (vertical axis) against rank (horizontal axis)]
Internet measurement: Faloutsos et al 99
9
This Paper: Large Degrees & Eigenvalues
[Figure: plots of eigenvalues and of degrees against rank; labeled points: UUNET, Sprint, C&W USA, AT&T, BBN]
10
This Paper: Large Degrees & Eigenvalues
11
Principal Eigenvector of a Star
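[Figure: a star whose center has degree d; the principal eigenvector has entry sqrt(d) at the center and 1 at each leaf, and the principal eigenvalue is sqrt(d)]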
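The star picture can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `star_adjacency` and the choice d = 9 are illustrative and do not come from the slides.

```python
import numpy as np

def star_adjacency(d):
    """Adjacency matrix of a star: vertex 0 is the center, vertices 1..d are leaves."""
    A = np.zeros((d + 1, d + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    return A

d = 9  # illustrative choice
A = star_adjacency(d)

# eigh returns the eigenvalues of a symmetric matrix in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(A)
top_value = eigenvalues[-1]
top_vector = eigenvectors[:, -1]

# Rescale so that the leaf entries equal 1
top_vector = top_vector / top_vector[1]

print("principal eigenvalue:", top_value, "sqrt(d):", np.sqrt(d))
print("center entry of the principal eigenvector:", top_vector[0])
```

Both printed quantities should agree with sqrt(d) up to floating-point error.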
12
Large Degrees
13
Large Eigenvalues
14
Main Result of the Paper
The largest eigenvalues of the adjacency matrix of a graph whose large degrees are power law distributed (Zipf) are also power law distributed.
Explains Internet measurements.
Negative implications for the spectral filtering method in information retrieval.
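A short calculation suggests why a power law on the large degrees would carry over to the large eigenvalues. It assumes, as the star picture in this talk suggests, that the i-th largest eigenvalue \lambda_i is close to \sqrt{d_i}, where d_i is the i-th largest degree. The symbols c and \beta are illustrative and are not values taken from the slides.

```latex
\[
d_i = c \, i^{-\beta}
\quad\Longrightarrow\quad
\lambda_i \approx \sqrt{d_i} = \sqrt{c}\; i^{-\beta/2},
\qquad
\log \lambda_i \approx \tfrac{1}{2}\log c - \tfrac{\beta}{2}\,\log i .
\]
```

Under this assumption the eigenvalues follow a power law in the rank i whose exponent is half the exponent of the degrees.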
15
Random Graph Model
let
Connectivity analyzed by Chung & Lu 01
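The model's definition did not survive on this slide, so the following is only a sketch under a stated assumption: it uses the expected-degree random graph commonly associated with Chung & Lu, in which edge {i, j} appears independently with probability min(1, w_i w_j / sum of all w). The function name `expected_degree_graph`, the weights, and the graph size are hypothetical choices for illustration.

```python
import numpy as np

def expected_degree_graph(weights, rng):
    """Sample a symmetric 0/1 adjacency matrix without self-loops.

    Assumed rule (not confirmed by the slide): edge {i, j} is present
    independently with probability min(1, w_i * w_j / sum(w)).
    """
    w = np.asarray(weights, dtype=float)
    n = len(w)
    p = np.minimum(1.0, np.outer(w, w) / w.sum())
    # Draw each pair i < j once, then mirror to keep the matrix symmetric
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(0)
n = 1500
weights = np.ones(n)            # hypothetical: most vertices have weight 1
weights[:4] = [36, 18, 12, 9]   # hypothetical Zipf-like large weights, 36 / i

A = expected_degree_graph(weights, rng)
degrees = A.sum(axis=1)
top_degrees = np.sort(degrees)[::-1][:4]
top_eigs = np.sort(np.linalg.eigvalsh(A))[::-1][:4]

print("sqrt of the largest degrees:", np.sqrt(top_degrees))
print("largest eigenvalues:        ", top_eigs)
```

Comparing the two printed rows gives a rough feel for how closely the largest eigenvalues track the square roots of the largest degrees in one sample.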
16
Random Graph Model
17
Random Graph Model
18
Theorem:
For large enough
With probability at least
19
Proof : Step 1. Decomposition
Parts: LL, RR, Vertex Disjoint Stars, LR-extra
LR-extra = LR - Vertex Disjoint Stars
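The decomposition can be sketched in code. The slide does not state how L, R, or the stars are chosen, so everything below is an assumption made for illustration: L is taken to be the k largest-degree vertices, R the rest, and a vertex of R becomes a star leaf only when it has exactly one neighbor in L. The function name `decompose` and the example graph are hypothetical.

```python
import numpy as np

def decompose(A, k):
    """Split a symmetric 0/1 adjacency matrix into LL, RR, stars and LR-extra."""
    n = A.shape[0]
    degrees = A.sum(axis=1)
    in_L = np.zeros(n, dtype=bool)
    in_L[np.argsort(-degrees, kind="stable")[:k]] = True  # assumed: top-k degrees

    LL = A * np.outer(in_L, in_L)      # edges with both endpoints in L
    RR = A * np.outer(~in_L, ~in_L)    # edges with both endpoints in R
    LR = A - LL - RR                   # edges between L and R

    # Hypothetical rule: an R vertex is a star leaf if it has exactly one L neighbor
    l_neighbours = A[:, in_L].sum(axis=1)
    leaf = (~in_L) & (l_neighbours == 1)
    leaf_mask = np.outer(in_L, leaf)
    stars = LR * (leaf_mask | leaf_mask.T)
    LR_extra = LR - stars
    return LL, RR, stars, LR_extra

# Small illustrative graph: vertices 0 and 1 are hubs
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5),
         (1, 5), (1, 6), (1, 7), (6, 7)]
A = np.zeros((8, 8))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

LL, RR, stars, LR_extra = decompose(A, k=2)
for name, part in [("LL", LL), ("RR", RR), ("stars", stars), ("LR-extra", LR_extra)]:
    print(name, "edges:", int(part.sum() / 2))
```

With this rule the four parts are edge-disjoint and add up to the original adjacency matrix, and no vertex of R belongs to more than one star.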
20
Proof: Step 2: Vertex Disjoint Stars
The degree of each of the Vertex Disjoint Stars is sharply concentrated around its mean d_i.
Hence its principal eigenvalue is sharply concentrated around sqrt(d_i).
21
Proof: Step 3: LL, RR, LR-extra
LR-extra has max degree
LL has
edges
RR has max degree
23
Proof: Step 4: Matrix Perturbation Theory
Vertex Disjoint Stars have principal eigenvalues
All other parts have max eigenvalue
QED
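One standard tool of matrix perturbation theory is Weyl's inequality: for symmetric matrices S and E, the i-th largest eigenvalue of S + E differs from that of S by at most the spectral norm of E. Whether the proof uses exactly this form is an assumption. The sketch below illustrates it on two vertex-disjoint stars perturbed by a path through their leaves; the star sizes and the perturbation are illustrative choices.

```python
import numpy as np

def sorted_eigs(M):
    """Eigenvalues of a symmetric matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(M))[::-1]

def disjoint_stars(sizes):
    """Adjacency matrix of vertex-disjoint stars, plus the list of leaf indices."""
    n = sum(s + 1 for s in sizes)
    S = np.zeros((n, n))
    leaves = []
    start = 0
    for s in sizes:
        center = start
        for leaf in range(start + 1, start + s + 1):
            S[center, leaf] = S[leaf, center] = 1.0
            leaves.append(leaf)
        start += s + 1
    return S, leaves

S, leaves = disjoint_stars([16, 9])  # principal eigenvalues sqrt(16) and sqrt(9)

# Illustrative perturbation: a path through the leaves (max degree 2)
E = np.zeros_like(S)
for u, v in zip(leaves, leaves[1:]):
    E[u, v] = E[v, u] = 1.0

norm_E = np.linalg.norm(E, 2)  # spectral norm of the perturbation
shift = np.abs(sorted_eigs(S + E) - sorted_eigs(S))

print("top eigenvalues of the stars:      ", sorted_eigs(S)[:2])
print("top eigenvalues after perturbation:", sorted_eigs(S + E)[:2])
print("largest shift:", shift.max(), "bound:", norm_E)
```

The perturbation has maximum degree 2, so its spectral norm is at most 2, and every eigenvalue moves by no more than that.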
24
Implication for Info Retrieval
Term-Norm Distribution Problem: Spectral filtering, without preprocessing, reveals only the large degrees.
25
Implication for Info Retrieval
Term-Norm Distribution Problem: Spectral filtering, without preprocessing, reveals only the large degrees.
Local information.
No “latent semantics”.
26
Implication for Information Retrieval
Term-Norm Distribution Problem:
Application-specific preprocessing (normalization of degrees) reveals clusters:
WWW: related to searching, Kleinberg 97
IR, collaborative filtering, …
Internet: related to congestion, Gkantsidis et al 02
Open: Formalize “preprocessing”.
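"Normalization of degrees" can be read in several ways, and the slides leave its formalization open. The sketch below assumes one common choice, the symmetric scaling D^{-1/2} A D^{-1/2} with D the diagonal matrix of degrees; it is not claimed to be the preprocessing used in the cited works. The function names are hypothetical.

```python
import numpy as np

def normalize_by_degree(A):
    """Return D^{-1/2} A D^{-1/2}; isolated vertices keep all-zero rows."""
    degrees = A.sum(axis=1)
    inv_sqrt = np.zeros(len(degrees))
    positive = degrees > 0
    inv_sqrt[positive] = 1.0 / np.sqrt(degrees[positive])
    return A * np.outer(inv_sqrt, inv_sqrt)

def star_adjacency(d):
    """Star with center 0 and leaves 1..d."""
    A = np.zeros((d + 1, d + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    return A

raw_top = {}
normalized_top = {}
for d in (9, 100):  # illustrative star sizes
    A = star_adjacency(d)
    raw_top[d] = np.linalg.eigvalsh(A)[-1]
    normalized_top[d] = np.linalg.eigvalsh(normalize_by_degree(A))[-1]
    print("d =", d, "raw:", raw_top[d], "normalized:", normalized_top[d])
```

Without normalization the principal eigenvalue of a star grows as sqrt(d); after this scaling it equals 1 for every d, so a large degree alone no longer dominates the top of the spectrum.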