Date post: 16-Sep-2020
PAGERANK PARAMETERS
David F. Gleich, Amy N. Langville
American Institute of Mathematics, Workshop on Ranking
Palo Alto, CA, August 17th, 2010

The most important page on the web

PageRank details

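[Figure: directed graph on nodes 1 to 6]

$$
P = \begin{bmatrix}
1/6 & 1/2 & 0 & 0 & 0 & 0 \\
1/6 & 0 & 0 & 1/3 & 0 & 0 \\
1/6 & 1/2 & 0 & 1/3 & 0 & 0 \\
1/6 & 0 & 1/2 & 0 & 0 & 0 \\
1/6 & 0 & 1/2 & 1/3 & 0 & 1 \\
1/6 & 0 & 0 & 0 & 1 & 0
\end{bmatrix},
\qquad P_{ij} \ge 0, \quad e^T P = e^T
$$

“jump” → $v = [\tfrac{1}{n} \; \dots \; \tfrac{1}{n}]^T \ge 0$, $e^T v = 1$

Markov chain: $(\alpha P + (1 - \alpha) v e^T) x = x$, with unique $x$ ⇒ $x_j \ge 0$, $e^T x = 1$.

Linear system: $(I - \alpha P) x = (1 - \alpha) v$

Ignored: dangling nodes patched back to $v$; algorithms later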
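The Markov-chain and linear-system formulations on this slide can be checked against each other numerically. The following is a minimal sketch, assuming NumPy, the 6-node matrix from the slide, and an illustrative value α = 0.85 (the slide does not fix α at this point); all variable names are hypothetical.

```python
import numpy as np

# Column-stochastic matrix P from the slide. Column 1 is uniform (1/6),
# which is what a dangling node looks like after being patched back to v.
P = np.array([
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
])
n = P.shape[0]
e = np.ones(n)
v = e / n            # uniform "jump" vector
alpha = 0.85         # illustrative choice, not prescribed by the slide

# Linear system: (I - alpha P) x = (1 - alpha) v
x_linear = np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * v)

# Markov chain: fixed point of M = alpha P + (1 - alpha) v e^T,
# found here by plain power iteration.
M = alpha * P + (1 - alpha) * np.outer(v, e)
x_markov = v.copy()
for _ in range(200):
    x_markov = M @ x_markov

print(np.round(x_linear, 4))
```

Both vectors agree to rounding error, are entrywise positive, and sum to 1, which is the uniqueness statement on the slide.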

Other uses for PageRank
What else people use PageRank to do

GeneRank (Morrison et al., GeneRank, 2005)

[Figure: GeneRank plot with rows labeled by gene identifiers (NM_003748, NM_003862, Contig32125_RC, ...) and a horizontal axis running from 10 to 70]

Use $(I - \alpha G D^{-1}) x = w$ to find “nearby” important genes.
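The GeneRank system has the same shape as the PageRank linear system, with $G D^{-1}$ in place of $P$. A minimal sketch follows, assuming NumPy, a made-up 4-gene undirected network `G`, made-up normalized weights `w`, and an illustrative α = 0.5; none of these values come from the slide.

```python
import numpy as np

# Hypothetical symmetric gene-network adjacency matrix (4 genes).
G = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# D holds the degrees, so G D^{-1} is column-stochastic.
D_inv = np.diag(1.0 / G.sum(axis=0))

# Hypothetical normalized microarray weights (they sum to 1).
w = np.array([0.1, 0.2, 0.3, 0.4])
alpha = 0.5  # illustrative value

# Solve (I - alpha G D^{-1}) x = w, exactly as written on the slide.
x = np.linalg.solve(np.eye(4) - alpha * G @ D_inv, w)

# Genes ordered from highest to lowest score.
ranking = np.argsort(-x)
print(ranking, np.round(x, 4))
```

Because the right-hand side is $w$ rather than $(1 - \alpha) w$, the scores sum to $1 / (1 - \alpha)$ instead of 1; the ranking is unaffected by that scaling.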

ProteinRank, ObjectRank, EventRank
IsoRank, Clustering
Sports ranking, Food webs, Centrality
Reverse PageRank, FutureRank
SocialPageRank, BookRank
ArticleRank, ItemRank, SimRank
DiffusionRank, TrustRank, TweetRank

Note: New paper LabRank with a random scientist?

Ulam Networks

Chirikov map:
$y_{t+1} = \eta y_t + k \sin(x_t + \theta_t)$
$x_{t+1} = x_t + y_{t+1}$

Ulam network
1. divide phase space into uniform cells
2. form P based on trajectories.

[Figure panels: $\log(E[x(A)])$ and $\log(\mathrm{Std}[x(A)]) / \log(E[x(A)])$, with $A \sim \mathrm{Beta}(2,16)$]
Note: White is larger, black is smaller.

Google matrix, dynamical attractors, and Ulam networks, Shepelyansky and Zhirov, arXiv
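The two-step Ulam construction can be sketched in a few lines. This is a rough illustration only, assuming NumPy, a single fixed phase `theta`, a phase space $[0, 2\pi) \times [-\pi, \pi)$ with periodic wrapping, one map step per trajectory, and illustrative parameter values; these choices are assumptions and are not taken from the slide or the cited paper.

```python
import numpy as np

def ulam_matrix(eta=0.99, k=0.22, theta=0.0, cells=8, samples=50, seed=0):
    """Column-stochastic transition matrix between phase-space cells."""
    rng = np.random.default_rng(seed)
    two_pi = 2.0 * np.pi
    width = two_pi / cells
    n = cells * cells
    counts = np.zeros((n, n))
    for cx in range(cells):
        for cy in range(cells):
            j = cx * cells + cy  # index of the source cell
            # Step 1: sample starting points uniformly inside cell (cx, cy).
            x = (cx + rng.random(samples)) * width
            y = -np.pi + (cy + rng.random(samples)) * width
            # One step of the map from the slide.
            y_new = eta * y + k * np.sin(x + theta)
            x_new = x + y_new
            # Wrap back into the phase space (assumed periodic).
            x_wrapped = np.mod(x_new, two_pi)
            y_wrapped = np.mod(y_new + np.pi, two_pi)
            ix = np.minimum((x_wrapped / width).astype(int), cells - 1)
            iy = np.minimum((y_wrapped / width).astype(int), cells - 1)
            # Step 2: count which cells the trajectories land in.
            counts[:, j] = np.bincount(ix * cells + iy, minlength=n)
    return counts / counts.sum(axis=0)

P = ulam_matrix()
print(P.shape)
```

The resulting `P` can be passed to any PageRank solver; finer grids and more samples per cell give a closer approximation of the dynamics.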

Choosing alpha

Outline:

Choosing alpha

Choosing personalization

Related methods

Open issues

What is alpha? There’s no single answer.
Ask yourself, why am I computing PageRank? Then use the best value for your application.

web-search → tune α for the best feature vector
node centrality → understand what random jumps mean in your graph
find important nodes in a web-graph → use the random surfer interpretation

Author                  α
Brin and Page (1998)    0.85
Najork et al. (2007)    0.85
Litvak et al. (2006)    0.5
Pan et al. (2004)       0.15
Algorithms (...)        ≥ 0.85
Experiment              ???

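The PageRank limit value
Singular? $(I - \alpha P) x = (1 - \alpha) v$

$$P = X \begin{bmatrix} I & 0 \\ 0 & J_1 \end{bmatrix} X^{-1}$$

$$\left( I - \alpha X \begin{bmatrix} I & 0 \\ 0 & J_1 \end{bmatrix} X^{-1} \right) x = (1 - \alpha) v$$

$$X \left( I - \alpha \begin{bmatrix} I & 0 \\ 0 & J_1 \end{bmatrix} \right) X^{-1} x = (1 - \alpha) v$$

With $y = X^{-1} x$ and $z = X^{-1} v$:

$$\left( I - \alpha \begin{bmatrix} I & 0 \\ 0 & J_1 \end{bmatrix} \right) y = (1 - \alpha) z$$

$$(1 - \alpha) y_1 = (1 - \alpha) z_1, \qquad (I - \alpha J_1) y_2 = (1 - \alpha) z_2$$

For $\alpha < 1$ the first block reduces to $y_1 = z_1$, so the limit $\alpha \to 1$ exists.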

Boldi et al. 2003: PageRank as a function of the damping parameter

TotalRank

$$t = \int_0^1 x(\alpha) \, d\alpha$$

Proposed by Boldi et al. (2005) as a parameter-free PageRank.
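The TotalRank integral can be approximated by numerical quadrature over α. The sketch below assumes NumPy, reuses the 6-node example matrix from the "PageRank details" slide with a uniform $v$, and uses Gauss-Legendre quadrature; the quadrature rule is a choice made for this illustration, not a method stated in the talk.

```python
import numpy as np

# Example column-stochastic matrix (6-node graph from the earlier slide).
P = np.array([
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
])
v = np.full(6, 1 / 6)

def pagerank(P, v, alpha):
    """Solve (I - alpha P) x = (1 - alpha) v."""
    return np.linalg.solve(np.eye(len(v)) - alpha * P, (1 - alpha) * v)

def totalrank(P, v, nodes=64):
    """Approximate t = integral of x(alpha) over [0, 1]."""
    # Gauss-Legendre nodes and weights on [-1, 1], mapped to [0, 1].
    z, w = np.polynomial.legendre.leggauss(nodes)
    alphas = 0.5 * (z + 1.0)
    weights = 0.5 * w
    # All nodes are strictly inside (0, 1), so every solve is nonsingular.
    return sum(wk * pagerank(P, v, ak) for ak, wk in zip(alphas, weights))

t = totalrank(P, v)
print(np.round(t, 4))
```

Since every $x(\alpha)$ sums to 1 and the quadrature weights sum to 1, the resulting `t` is again a probability vector.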

Generalized PageRank

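PageRank: $(I - \alpha P) x = (1 - \alpha) v$, so $x = \sum_{k=0}^{\infty} (1 - \alpha) \alpha^k P^k v$

Generalized PageRank: $y = \sum_{k=0}^{\infty} f(k) P^k v$, with $\sum_{k=0}^{\infty} f(k) < \infty$

TotalRank: $f(k) = \frac{1}{k+1} - \frac{1}{k+2}$

LinearRank: ...
HyperRank: ...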

Baeza-Yates et al. 2006
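The series form translates directly into code: accumulate $f(k) P^k v$ for a finite number of terms. This is a minimal sketch, assuming NumPy, the 6-node example matrix from the "PageRank details" slide, a uniform $v$, and an illustrative α = 0.85; the function name and the truncation lengths are hypothetical.

```python
import numpy as np

P = np.array([
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
])
v = np.full(6, 1 / 6)

def generalized_pagerank(P, v, f, terms):
    """Truncated series: sum of f(k) P^k v for k = 0 .. terms-1."""
    y = np.zeros_like(v)
    pk_v = v.copy()          # holds P^k v
    for k in range(terms):
        y += f(k) * pk_v
        pk_v = P @ pk_v
    return y

alpha = 0.85  # illustrative value

def pagerank_weight(k):
    return (1 - alpha) * alpha**k

def totalrank_weight(k):
    return 1.0 / (k + 1) - 1.0 / (k + 2)

# With f(k) = (1 - alpha) alpha^k the series reproduces ordinary PageRank.
x_series = generalized_pagerank(P, v, pagerank_weight, terms=400)
x_solve = np.linalg.solve(np.eye(6) - alpha * P, (1 - alpha) * v)

# TotalRank weights decay like 1/k^2, so the series converges slowly.
t_series = generalized_pagerank(P, v, totalrank_weight, terms=5000)
```

The TotalRank weights come from integrating the PageRank weights over α: $\int_0^1 (1 - \alpha) \alpha^k \, d\alpha = \frac{1}{k+1} - \frac{1}{k+2}$. They telescope, so the truncated sum of `t_series` falls short of 1 by exactly $1 / (\text{terms} + 1)$.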

Pick a distribution
Multiple surfers should have an impact!

Each person picks α from distribution A

↓ $x(E[A])$ (PageRank at the mean α)
↓ $E[x(A)]$ (mean of the PageRank vectors)

↘ ↙ $x(E[A]) \neq E[x(A)]$

TotalRank: $E[x(A)]$ with $A \sim U[0,1]$
Constantine & Gleich, Internet Mathematics, in press.
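The gap between $x(E[A])$ and $E[x(A)]$ is easy to see with a Monte Carlo estimate. The sketch below assumes NumPy, the 6-node example matrix from the "PageRank details" slide, and a Beta distribution in NumPy's own parameterization with mean 0.85; the distribution, sample size, and seed are assumptions made for this illustration.

```python
import numpy as np

P = np.array([
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
])
v = np.full(6, 1 / 6)

def pagerank(P, v, alpha):
    """Solve (I - alpha P) x = (1 - alpha) v."""
    return np.linalg.solve(np.eye(len(v)) - alpha * P, (1 - alpha) * v)

# Each simulated surfer draws an alpha; numpy's beta(17, 3) has mean 0.85.
rng = np.random.default_rng(0)
alphas = rng.beta(17.0, 3.0, size=2000)

# PageRank at the (sample) mean alpha.
x_of_mean = pagerank(P, v, alphas.mean())

# (Sample) mean of the PageRank vectors.
mean_of_x = np.mean([pagerank(P, v, a) for a in alphas], axis=0)

gap = np.abs(x_of_mean - mean_of_x).max()
print(gap)
```

The two vectors differ because $x(\alpha)$ is a nonlinear (rational) function of α, so averaging before solving is not the same as averaging after.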

From users

[Figure: density of raw α values; horizontal axis from 0.0 to 1.0, vertical axis (density) from 0.0 to 2.0]

Sample mean $\bar{\mu} = 0.631$.
Gleich et al., WWW2010
Note: 257,664 users from Microsoft toolbar data

Choosing personalization

Personalization choices

Application specific

• GeneRank: v = normalized microarray weights
• TopicRank: v = pages on the same topic
• TrustRank: v = only pages known to be good
• BadRank: v = only pages known to be bad (and reverse the graph)

Super-personalized

• Set v to have only a single non-zero: $v = e_i$.

Personalized PageRank

$$B = (1 - \alpha)(I - \alpha P)^{-1}$$

$B_{ij}$ = “personalized score of page $i$ when jumping to page $j$”
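Column $j$ of $B$ is the PageRank vector for the super-personalized choice $v = e_j$, and any other personalization is a weighted combination of columns. A minimal sketch, assuming NumPy, the 6-node example matrix from the "PageRank details" slide, and an illustrative α = 0.85 (forming the inverse explicitly is only sensible for tiny examples like this one):

```python
import numpy as np

P = np.array([
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
])
n = P.shape[0]
alpha = 0.85  # illustrative value

# B = (1 - alpha) (I - alpha P)^{-1}
B = (1 - alpha) * np.linalg.inv(np.eye(n) - alpha * P)

# Personalized PageRank for jumps that always land on page j (v = e_j).
j = 2
e_j = np.zeros(n)
e_j[j] = 1.0
x_j = np.linalg.solve(np.eye(n) - alpha * P, (1 - alpha) * e_j)

# PageRank for the uniform jump vector is the average of the columns of B.
v = np.full(n, 1 / n)
x_uniform = B @ v
```

Here `B[i, j]` is the score of page `i` when every jump goes to page `j`, matching the description on the slide.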

Related methods

PageRank history

See Vigna 2010: Spectral Ranking and Franceschet 2010: PageRank: Standing on the shoulders of giants.

Let A be the adjacency matrix of a graph.

PageRank        $(I - \alpha P) x = (1 - \alpha) v$, equivalently $(\alpha P + (1 - \alpha) v e^T) x = x$
Seeley 1949     $P x = x$
Wei 1952        $A^T x = x$
Katz 1953       $(I - \alpha A) x = e$
Hubbell 1965    $A^T x = x + v$

Graph centrality

For a graph G, a score assigned to each vertex $v \in V$ is a centrality score if larger scores indicate “more central” vertices and the score is independent of the labeling on the vertices.
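The label-independence requirement can be tested directly: relabeling the vertices should permute the scores and nothing else. The sketch below checks this for PageRank with a uniform jump vector, assuming NumPy, the 6-node example matrix from the "PageRank details" slide, an illustrative α = 0.85, and an arbitrary permutation chosen for the example.

```python
import numpy as np

P = np.array([
    [1/6, 1/2, 0,   0,   0, 0],
    [1/6, 0,   0,   1/3, 0, 0],
    [1/6, 1/2, 0,   1/3, 0, 0],
    [1/6, 0,   1/2, 0,   0, 0],
    [1/6, 0,   1/2, 1/3, 0, 1],
    [1/6, 0,   0,   0,   1, 0],
])
n = P.shape[0]
v = np.full(n, 1 / n)   # uniform v does not depend on the labels
alpha = 0.85            # illustrative value

def pagerank(P, v, alpha):
    """Solve (I - alpha P) x = (1 - alpha) v."""
    return np.linalg.solve(np.eye(len(v)) - alpha * P, (1 - alpha) * v)

x = pagerank(P, v, alpha)

# Relabel: new vertex a is old vertex perm[a].
perm = np.array([3, 0, 5, 1, 4, 2])
P_relabeled = P[np.ix_(perm, perm)]
x_relabeled = pagerank(P_relabeled, v, alpha)

# Scores follow the vertices, not the labels.
same_scores = np.allclose(x_relabeled, x[perm])
print(same_scores)
```

With a non-uniform personalization vector the same check holds only if $v$ is permuted along with the graph.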

Open issues

From Vigna, A history of spectral ranking, MMDS 2010

Other issues

QUESTIONS?

