Using Adaptive Methods for Updating/Downdating PageRank
Gene H. Golub, Stanford University SCCM
Joint work with Sep Kamvar, Taher Haveliwala
Slide 2: Motivation
Problem: Compute PageRank after the Web has changed slightly.
Motivation: "freshness"
Note: Since the Web is growing, PageRank computations don't get faster as computers do.
Slide 3: Outline
[Thumbnails: power-method iteration x^(k+1) = A x^(k) converging to (0.4, 0.2, 0.4); MFlops bar chart (axis 0 to 400)]
Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results
Slide 4: Link Counts
[Diagram: one page linked by 2 important pages, another linked by 2 unimportant pages; example pages shown: Martin's Home Page, Gene's Home Page, Yahoo!, Iain Duff's Home Page, George W. Bush, Donald Rumsfeld]
Slide 5: Definition of PageRank
The importance of a page is given by the importance of the pages that link to it:
x_i = Σ_{j ∈ B_i} x_j / N_j
where x_i is the importance of page i, B_i is the set of pages j that link to page i, N_j is the number of outlinks from page j, and x_j is the importance of page j.
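As a minimal sketch of this recurrence (the 3-page link structure below is invented for illustration), the importance values can be computed by repeated substitution:

```python
# Sketch of the recurrence x_i = sum_{j in B_i} x_j / N_j on a made-up web.
links = {0: [1, 2], 1: [2], 2: [0]}     # page -> pages it links to
n = 3
x = [1.0 / n] * n                       # start from uniform importance
for _ in range(100):                    # iterate the recurrence
    new = [0.0] * n
    for j, outs in links.items():
        for i in outs:
            new[i] += x[j] / len(outs)  # page j spreads x_j over its N_j outlinks
    x = new
print(x)                                # approaches (0.4, 0.2, 0.4)
```

Each pass spreads every page's current importance evenly over its outlinks; the values settle at the fixed point of the recurrence.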
Slide 8: PageRank Diagram
[Diagram: ranks propagated across links, multiplying by link weights; node values 0.167, 0.167, 0.333, 0.333]
Slide 13: Matrix Notation
In matrix form, the recurrence x_i = Σ_{j ∈ B_i} x_j / N_j reads x = P^T x, where P is the link matrix with P_{ji} = 1/N_j if page j links to page i, and 0 otherwise.
[Numerical example: P^T multiplied by an importance vector]

Slide 14: Matrix Notation
Find x that satisfies: x = P^T x
[Same numerical example]
Slide 15: Eigenvalue Distribution
The matrix P^T has several eigenvalues on the unit circle. This makes power-method-like algorithms less effective.
Slide 16: Rank-1 Correction
PageRank doesn't actually use P^T. Instead, it uses A = cP^T + (1-c)E^T. E is a rank-1 matrix, and in general c = 0.85. This ensures a unique solution and fast convergence. For the matrix A, λ_2 = c.¹
¹ From "The Second Eigenvalue of the Google Matrix" (http://dbpubs.stanford.edu/pub/2003-20)
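A minimal sketch of the correction (the 3-page link matrix P is invented for illustration):

```python
import numpy as np

def google_matrix(P, c=0.85):
    """A = c*P^T + (1-c)*E^T, with E the rank-1 matrix of uniform rows."""
    n = P.shape[0]
    E = np.ones((n, n)) / n        # rank 1: every entry is 1/n
    return c * P.T + (1 - c) * E.T

# Made-up 3-page web: 0 links to 1 and 2, 1 links to 2, 2 links to 0.
P = np.array([[0.0, 0.5, 0.5],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
A = google_matrix(P)

# Columns of A still sum to 1, so lambda_1 = 1 and the power method applies.
print(sorted(abs(np.linalg.eigvals(A)), reverse=True))
```

Because E is stochastic and rank 1, the cited result gives |λ_2(A)| ≤ c for any stochastic P.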
Slide 17: Outline
Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results
[Thumbnails: eigenvector components u_1, ..., u_5; power-method iteration "Repeat: x^(k+1) = A x^(k)" converging to (0.4, 0.2, 0.4)]
Slide 24: Why does it work?
Imagine our n × n matrix A has n distinct eigenvectors u_i, so that A u_i = λ_i u_i.
Then you can write any n-dimensional vector as a linear combination of the eigenvectors of A:
x^(0) = u_1 + α_2 u_2 + ... + α_n u_n
[Figure: bar chart of the components of x^(0) along u_1, ..., u_5]
Slide 25: Why does it work?
From the last slide: x^(0) = u_1 + α_2 u_2 + ... + α_n u_n
To get the first iterate, multiply x^(0) by A. The first eigenvalue is λ_1 = 1, and λ_2, ..., λ_n are all less than 1 in magnitude. Therefore:
x^(1) = A x^(0) = A u_1 + α_2 A u_2 + ... + α_n A u_n
x^(1) = u_1 + α_2 λ_2 u_2 + ... + α_n λ_n u_n
Slide 26: Power Method
x^(0) = u_1 + α_2 u_2 + ... + α_n u_n
x^(1) = u_1 + α_2 λ_2 u_2 + ... + α_n λ_n u_n
x^(2) = u_1 + α_2 λ_2^2 u_2 + ... + α_n λ_n^2 u_n
[Figure: bar charts of the components along u_1, ..., u_5; the components along u_2, ..., u_5 shrink by a factor λ_i at each iteration]
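The iteration on this slide can be sketched in a few lines (the 3-by-3 column-stochastic matrix is invented for illustration):

```python
import numpy as np

# Power method sketch: components along u_2, ..., u_n decay like lambda_i^k.
A = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])   # made-up column-stochastic matrix

x = np.ones(3) / 3                # x^(0): uniform start vector
for _ in range(60):
    x = A @ x                     # x^(k+1) = A x^(k); the sum of x stays 1

# Only the u_1 component survives, so x settles on the dominant eigenvector.
print(x)                          # ≈ [0.4, 0.2, 0.4]
```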
Slide 27: Outline
Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results
[Thumbnails: eigenvector components u_1, ..., u_5; power-method iteration "Repeat: x^(k+1) = A x^(k)" converging to (0.4, 0.2, 0.4)]
Slide 28: Convergence
The smaller |λ_2|, the faster the convergence of the Power Method.
x^(k) = u_1 + α_2 λ_2^k u_2 + ... + α_n λ_n^k u_n
[Figure: components along u_2, ..., u_5 scaled by λ_i^k]
Slide 29: Quadratic Extrapolation (joint work with Kamvar and Haveliwala)
Estimate the components of the current iterate in the directions of the second and third eigenvectors, and eliminate them.
[Figure: components along u_1, ..., u_5]
Slide 30: Facts that work in our favor
For traditional problems: A is smaller, often dense; λ_2 is often close to λ_1, making the power method slow.
In our problem, A is huge and sparse. More importantly, λ_2 is small.¹
¹ "The Second Eigenvalue of the Google Matrix" (dbpubs.stanford.edu/pub/2003-20)
Slide 31: How do we do this?
Assume x^(k) can be written as a linear combination of the first three eigenvectors (u_1, u_2, u_3) of A.
Compute an approximation to the {u_2, u_3} components, and subtract it from x^(k) to get x^(k)'.
Slide 32: Sequence Extrapolation
A classical and important field in numerical analysis: techniques for accelerating the convergence of slowly convergent infinite series and integrals.
Slide 33: Example: the Aitken Δ²-process
Suppose A_n = A + aλ^n + r_n, where r_n = bμ^n + o(min{1, |μ|^n}), with a, b, λ, μ all nonzero and |λ| > |μ|. It can be shown that
S_n = (A_n A_{n+2} − A_{n+1}²) / (A_n − 2A_{n+1} + A_{n+2})
satisfies, as n goes to infinity,
|S_n − A| / |A_n − A| = O((|μ|/|λ|)^n) = o(1).
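A sketch of the Δ²-process on a made-up model series A_n = A + aλ^n + bμ^n (dropping the o(·) remainder; all constants below are invented):

```python
# Aitken's Delta^2 process on a model series A_n = A + a*lam^n + b*mu^n.
def aitken(a0, a1, a2):
    """Extrapolated limit estimate S_n from three consecutive terms."""
    return (a0 * a2 - a1 * a1) / (a0 - 2 * a1 + a2)

A, a, lam, b, mu = 2.0, 1.0, 0.5, 0.1, 0.2   # invented constants, |lam| > |mu|
seq = [A + a * lam**n + b * mu**n for n in range(8)]

for n in range(3):
    plain = abs(seq[n] - A)                              # error of A_n itself
    extrap = abs(aitken(seq[n], seq[n + 1], seq[n + 2]) - A)
    print(n, plain, extrap)                              # extrapolated error is far smaller
```

The extrapolated estimate is closer to the limit A than A_n itself by roughly the factor (|μ|/|λ|)^n, as the bound on the slide predicts.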
Slide 34: In other words…
Assuming a certain pattern for the series is helpful in accelerating convergence.
We can apply this component-wise in order to get a better estimate of the eigenvector.
Slide 35: Another approach
Assume x^(k) can be represented by three eigenvectors of A:
x^(k) = u_1 + α_2 u_2 + α_3 u_3
x^(k+1) = A x^(k) = u_1 + α_2 λ_2 u_2 + α_3 λ_3 u_3
x^(k+2) = u_1 + α_2 λ_2^2 u_2 + α_3 λ_3^2 u_3
x^(k+3) = u_1 + α_2 λ_2^3 u_2 + α_3 λ_3^3 u_3
Slide 36: Linear Combination
We take some linear combination of these 3 iterates:
β_1 x^(k+1) + β_2 x^(k+2) + β_3 x^(k+3)
= β_1 (u_1 + α_2 λ_2 u_2 + α_3 λ_3 u_3)
+ β_2 (u_1 + α_2 λ_2^2 u_2 + α_3 λ_3^2 u_3)
+ β_3 (u_1 + α_2 λ_2^3 u_2 + α_3 λ_3^3 u_3)
Slide 37: Rearranging Terms
We can rearrange the terms to get:
β_1 x^(k+1) + β_2 x^(k+2) + β_3 x^(k+3)
= (β_1 + β_2 + β_3) u_1
+ α_2 (β_1 λ_2 + β_2 λ_2^2 + β_3 λ_2^3) u_2
+ α_3 (β_1 λ_3 + β_2 λ_3^2 + β_3 λ_3^3) u_3
Goal: Find β_1, β_2, β_3 so that the coefficients of u_2 and u_3 are 0, and the coefficient of u_1 is 1.
Slide 40: Estimating the coefficients
Procedure 1: Set β_1 = 1 and solve the least-squares problem.
Procedure 2: Use the SVD to compute the coefficients of the characteristic polynomial.
Slide 42: Take-home message
Quadratic Extrapolation estimates the components of the current iterate in the directions of the second and third eigenvectors, and subtracts them off.
It achieves a significant speedup, and the ideas are useful for further speedup algorithms.
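The procedure can be sketched as follows. This is a reconstruction from the derivation in the preceding slides, not the authors' code: the least-squares step works on differences of iterates so that the u_1 component drops out, and the test matrix with eigenvalues 1, 0.8, 0.6 is invented.

```python
import numpy as np

def quadratic_extrapolation(x0, x1, x2, x3):
    """Estimate u_1 from four successive iterates, assuming
    x0 = u_1 + a2*u_2 + a3*u_3 and lambda_1 = 1."""
    # Differences remove the u_1 component entirely.
    y1, y2, y3 = x1 - x0, x2 - x0, x3 - x0
    # Least-squares fit g1*y1 + g2*y2 + y3 ≈ 0 recovers (up to scaling) the
    # quadratic with roots lambda_2, lambda_3, i.e. the coefficients of the
    # characteristic polynomial with the leading coefficient fixed at 1.
    g, *_ = np.linalg.lstsq(np.column_stack([y1, y2]), -y3, rcond=None)
    b1, b2, b3 = g[0] + g[1] + 1.0, g[1] + 1.0, 1.0
    x = b1 * x1 + b2 * x2 + b3 * x3   # u_2 and u_3 components cancel here
    return x / x.sum()                # renormalize so the entries sum to 1

# Invented test matrix with eigenvalues 1, 0.8, 0.6; columns of V are u_1, u_2, u_3.
V = np.array([[1.0, 1.0, 0.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
A = V @ np.diag([1.0, 0.8, 0.6]) @ np.linalg.inv(V)

x0 = V @ np.array([1.0, 0.5, 0.3])    # exactly u_1 + 0.5 u_2 + 0.3 u_3
x1 = A @ x0; x2 = A @ x1; x3 = A @ x2
print(quadratic_extrapolation(x0, x1, x2, x3))   # ≈ u_1/sum(u_1) = [0.25, 0.5, 0.25]
```

In practice the result is then cleaned up with a few ordinary power iterations, as the summary slide notes.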
Slide 43: Summary of this part
We make an assumption about the current iterate.
We solve for the dominant eigenvector as a linear combination of the next three iterates.
We use a few iterations of the Power Method to "clean it up".
Slide 44: Outline
[Thumbnail: power-method iteration x^(k+1) = A x^(k) converging to (0.4, 0.2, 0.4)]
Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results
Slide 49: Updates
Use the previous vector as a start vector. The speedup is not that great. Why? The old pages converge quickly, but the new pages still take long to converge.
But if you use Adaptive PageRank, you save the computation on the old pages.
Slide 50: Outline
[Thumbnails: MFlops bar chart (axis 0 to 400); power-method iteration "Repeat: x^(k+1) = A x^(k)" converging to (0.4, 0.2, 0.4)]
Definition of PageRank
Computation of PageRank
Convergence Properties
Outline of Our Approach
Empirical Results
Slide 51: Empirical Results
[Bar chart: MFlops (0 to 400) for 3 update algorithms on the Stanford Web (n = 700,000): full PageRank, PR using last month's PR, and APR using last month's PR]
Slide 52: Take-home message
Simply not recomputing the PageRank of pages that have converged after an update speeds up PageRank by a factor of 2.
Slide 53: An Arnoldi/SVD approach (joint work with C. Greif)
Perform Arnoldi (of degree k << n) on A.
Compute the SVD of the (k+1)-by-k unreduced Hessenberg matrix, after first subtracting the augmented identity matrix from it.
Combine the columns of the Arnoldi basis Q using the null vector of H.
Use the resulting vector as the new guess for the Arnoldi procedure.
Repeat until satisfied.
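The steps above can be sketched as follows (a reconstruction under the stated assumptions, not the authors' implementation; the 10-page test matrix is randomly generated):

```python
import numpy as np

def arnoldi(A, q0, k):
    """k steps of Arnoldi: returns Q (n x (k+1)) with orthonormal columns and
    the (k+1) x k unreduced Hessenberg matrix H, so that A Q_k = Q_{k+1} H."""
    n = len(q0)
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    Q[:, 0] = q0 / np.linalg.norm(q0)
    for j in range(k):
        w = A @ Q[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            H[i, j] = Q[:, i] @ w
            w -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    return Q, H

def refine(A, x, k=4):
    """One outer iteration: Arnoldi, then the SVD of H minus the augmented
    identity; the right singular vector of the smallest singular value gives
    the (approximate) null vector used to combine the Arnoldi vectors."""
    Q, H = arnoldi(A, x, k)
    I_aug = np.vstack([np.eye(k), np.zeros((1, k))])  # identity padded with a zero row
    _, _, Vt = np.linalg.svd(H - I_aug)
    x_new = Q[:, :k] @ Vt[-1]             # combine columns of the Arnoldi basis
    return x_new / x_new.sum()            # renormalize so the entries sum to 1

# Randomly generated column-stochastic test matrix (n = 10).
rng = np.random.default_rng(0)
M = rng.random((10, 10))
A = M / M.sum(axis=0)

x = np.ones(10) / 10
for _ in range(8):                        # repeat until satisfied
    x = refine(A, x)
print(np.linalg.norm(A @ x - x))          # residual shrinks toward machine precision
```

Since ||A Q_k v − Q_k v|| = ||(H − I_aug) v||, the smallest singular vector gives the best eigenvector guess for eigenvalue 1 within the Krylov subspace, which is exactly where knowing the largest eigenvalue pays off.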
Slide 54: Advantages
We *do* take advantage of knowing the largest eigenvalue (as opposed to most general-purpose eigensolver packages).
Computing the corresponding eigenvector does not rely on prohibitive inversions or decompositions. (The matrix is BIG!)
Orthogonalizing "feels right" from a numerical linear algebra point of view.
Smooth convergence behavior. Overhead is minimal.