
Topic-Sensitive PageRank Taher H. Haveliwala. PageRank Importance is propagated A global ranking...

Date post: 22-Dec-2015
Topic-Sensitive PageRank Taher H. Haveliwala
Transcript
Page 1: Topic-Sensitive PageRank Taher H. Haveliwala. PageRank Importance is propagated A global ranking vector is pre-computed.

Topic-Sensitive PageRank

Taher H. Haveliwala

Page 2

PageRank

Importance is propagated
A global ranking vector is pre-computed

Page 3

PageRank

Page 4

Topic-Sensitive PageRank

Basic idea
• For each topic, the importance scores for each page are computed
• The composite score of a page is calculated by combining the scores of the page based on the topics of the query
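A minimal sketch of the combining step, assuming the composite score is a weighted sum of the page's per-topic scores, with one weight per topic derived from the query. The function name, the topic names, and all numbers are hypothetical and only illustrate the arithmetic.

```python
def composite_score(topic_scores, topic_weights):
    """Weighted sum of a page's per-topic importance scores.

    topic_scores:  {topic: importance score of the page for that topic}
    topic_weights: {topic: weight of that topic for the current query}
    """
    return sum(topic_weights.get(t, 0.0) * s for t, s in topic_scores.items())


# Hypothetical scores for one page and hypothetical query weights.
page_scores = {"Sports": 0.002, "Health": 0.010, "Computers": 0.001}
query_weights = {"Sports": 0.1, "Health": 0.8, "Computers": 0.1}
score = composite_score(page_scores, query_weights)
```

A query that leans toward one topic lets that topic's score dominate the composite value.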

Page 5

Topic-Sensitive PageRank

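ODP-Biasing
• The top-level categories of the Open Directory (16 topics) are used
• Let $T_j$ be the set of URLs in the ODP category $c_j$
• In computing the PageRank vector for topic $c_j$, we replace the uniform damping vector $\vec{p} = [1/N]_{N \times 1}$ by the non-uniform vector $\vec{p} = \vec{v}_j$, where

$$v_{ji} = \begin{cases} 1/|T_j| & i \in T_j \\ 0 & i \notin T_j \end{cases}$$

• It will be referred to as $\vec{PR}(\alpha, \vec{v}_j)$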
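The biased computation can be sketched as a power iteration whose random jump lands only on pages of the chosen topic. This is an illustrative sketch, not the author's implementation: the function name, the toy graph, the default `alpha=0.25`, the iteration count, and the handling of pages without out-links are all assumptions.

```python
def topic_pagerank(links, topic_pages, alpha=0.25, iters=100):
    """Power iteration with the jump vector concentrated on topic_pages.

    links:       {page: [pages it links to]}
    topic_pages: set of pages listed under the topic
    alpha:       probability of jumping to a topic page at each step (assumed)
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n_topic = sum(1 for p in pages if p in topic_pages)
    if n_topic == 0:
        raise ValueError("no topic page appears in the graph")
    # Non-uniform jump vector: uniform over topic pages, zero elsewhere.
    v = {p: (1.0 / n_topic if p in topic_pages else 0.0) for p in pages}
    rank = dict(v)
    for _ in range(iters):
        nxt = {p: alpha * v[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = (1 - alpha) * rank[p] / len(outs)
                for q in outs:
                    nxt[q] += share
            else:
                # Assumption: a page with no out-links hands its mass to v.
                for q in pages:
                    nxt[q] += (1 - alpha) * rank[p] * v[q]
        rank = nxt
    return rank


# Toy graph (hypothetical): bias the ranking toward the topic containing "a".
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = topic_pagerank(toy_links, {"a"})
```

Running the same function once per topic, each time with a different `topic_pages` set, yields one pre-computed vector per topic.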

Page 6

Topic-Sensitive PageRank

We chose to make P(cj) uniform

Page 7

Topic-Sensitive PageRank

Page 8

Experiment

Page 9

Experimental Results

Similarity Measure for Induced Rankings
• Overlap of two sets A and B (each of size k) = $|A \cap B| / k$, with k = 20
• Kendall’s distance measure
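Both measures are easy to sketch for two top-k lists. The overlap function follows the set-overlap definition; the distance function is the plain normalized Kendall tau distance over two rankings of the same items, which is an assumption here, since the variant used in the experiments may treat items missing from one list differently. Function names are hypothetical.

```python
from itertools import combinations


def overlap(a, b):
    """Fraction of items shared by two top-k lists of equal length k."""
    return len(set(a) & set(b)) / len(a)


def kendall_distance(a, b):
    """Fraction of item pairs that two rankings of the same items order differently."""
    pos_a = {x: i for i, x in enumerate(a)}
    pos_b = {x: i for i, x in enumerate(b)}
    pairs = list(combinations(a, 2))
    disagree = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return disagree / len(pairs)


shared = overlap(["a", "b", "c", "d"], ["b", "a", "e", "f"])
swapped = kendall_distance(["a", "b", "c"], ["a", "c", "b"])
```

A distance of 0 means the two rankings agree on every pair, and 1 means one ranking is the reverse of the other.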

Page 10

Experimental Results

Page 11

Experimental Results

Page 12

Experimental Results

Page 13

Experimental Results

Page 14

Experimental Results

Page 15

Experimental Results

Query-Sensitive Scoring

User Study
• 10 queries (randomly selected from our test set)
• 5 volunteers
• For each query, the volunteer was shown 2 result rankings:
  1. top 10 results ranked with the unbiased PageRank vector
  2. top 10 results ranked with the topic-sensitive PageRank vector

Page 16

Experimental Results

User Study (cont’d)
• The volunteer was asked to
  1. select all URLs which were “relevant” to the query
  2. select the ranking list which is better
• (They were not told anything about how either of the rankings was generated.)

Page 17

Experimental Results

Page 18

Experimental Results

Page 19

Experimental Results

Context-Sensitive Scoring

Page 20

Experimental Results

Page 21

Other issues

Search Context
• hierarchical directory
• users’ browsing patterns
• bookmarks
• email archives

Page 22

Other issues

Flexibility
• Applies to any kind of context

Transparency
• Tune the classifier used on the search context, or adjust topic weights

Privacy
• A client-side program could use the user context to generate the user profile locally

Efficiency
• Query-time cost and offline preprocessing cost are low

Page 23

Automatic Identification of User Interest For Personalized Search

Feng Qiu, Junghoo Cho

Page 24

User Preference Representation

Topic Preference Vector
• T = [T(1),…,T(m)]
• T(i) represents the user’s degree of interest in the ith topic

Page 25

User Preference Representation

Page 26

User Model

Topic-Driven Random Surfer Model
• The user browses the web in a two-step process.
• First, the user chooses a topic of interest t for the ensuing sequence of random walks with probability T(t).
• Then, with equal probability, she jumps to one of the pages on topic t.
• Starting from this page, the user then performs a random walk, such that at each step, with probability d, she randomly follows an out-link on the current page; with the remaining probability 1-d she gets bored, picks a new topic of interest for the next sequence of random walks based on T, and jumps to a page on the chosen topic.
• This process is repeated forever.
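The browsing process described above can be approximated with a short Monte Carlo simulation. This is a sketch under stated assumptions: the function name, the toy graph, the default `d=0.85`, the step count, and the choice to jump whenever a page has no out-links are not taken from the slides.

```python
import random


def simulate_surfer(links, topic_pages, T, d=0.85, steps=100_000, seed=0):
    """Estimate page visit frequencies under the two-step browsing process.

    links:       {page: [out-linked pages]}
    topic_pages: {topic: [pages on that topic]}
    T:           {topic: probability of choosing that topic}
    d:           probability of following an out-link at each step (assumed)
    """
    rng = random.Random(seed)
    topics = list(T)
    weights = [T[t] for t in topics]

    def jump():
        # Pick a topic according to T, then a uniformly random page on it.
        t = rng.choices(topics, weights=weights)[0]
        return rng.choice(topic_pages[t])

    visits = {}
    page = jump()
    for _ in range(steps):
        visits[page] = visits.get(page, 0) + 1
        outs = links.get(page, [])
        if outs and rng.random() < d:
            page = rng.choice(outs)   # follow a random out-link
        else:
            page = jump()             # bored (or stuck): pick a new topic
    return {p: count / steps for p, count in visits.items()}


# Toy example (hypothetical): the user only cares about topic "x".
freq = simulate_surfer(
    {"a": ["b"], "b": ["a"], "c": ["a"]},
    {"x": ["a"], "y": ["c"]},
    {"x": 1.0, "y": 0.0},
    steps=10_000,
)
```

The returned frequencies estimate how often the modeled user lands on each page.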

Page 27

User Model

Topic-Driven Searcher Model
• The user always visits web pages through a search engine in a two-step process.
• First, the user chooses a topic of interest t with probability T(t).
• Then the user goes to the search engine and issues a query on the chosen topic t.
• The search engine then returns pages ranked by TSPR_t(p), on which the user clicks.

Page 28

User Model

Relationship between V and T

Under Topic-Driven Random Surfer Model

Under Topic-Driven Searcher Model

Page 29

Learning Topic Preference Vector

Problem

Given V and TSPR_i, find T that satisfies

Page 30

Learning Topic Preference Vector

Linear regression
• Minimize the square-root error

Maximum likelihood estimator
• = the probability that the user visits the page p
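One way to sketch a maximum likelihood estimate of the topic preference vector is expectation-maximization over mixture weights. The sketch assumes the probability of visiting page p is the linear mixture sum over i of T(i)·TSPR_i(p); the exact likelihood from the slide is not in this transcript, and EM is a technique chosen here for illustration, not necessarily the authors' procedure. The function name and the toy data are hypothetical.

```python
def estimate_topic_preferences(clicks, tspr, iters=200):
    """EM estimate of the mixture weights T from a list of clicked pages.

    clicks: list of clicked pages (repeats allowed)
    tspr:   {topic: {page: score}}, each topic's scores treated as a
            probability distribution over pages
    """
    topics = list(tspr)
    T = {t: 1.0 / len(topics) for t in topics}  # start from uniform weights
    for _ in range(iters):
        new_T = {t: 0.0 for t in topics}
        for p in clicks:
            # Share of this click attributed to each topic under the current T.
            joint = {t: T[t] * tspr[t].get(p, 0.0) for t in topics}
            total = sum(joint.values())
            if total == 0.0:
                continue  # page has zero score under every topic; skip it
            for t in topics:
                new_T[t] += joint[t] / total
        norm = sum(new_T.values())
        if norm == 0.0:
            break
        T = {t: new_T[t] / norm for t in topics}
    return T


# Toy data (hypothetical): two topics, two pages, ten clicks.
toy_tspr = {"A": {"p1": 0.9, "p2": 0.1}, "B": {"p1": 0.1, "p2": 0.9}}
toy_clicks = ["p1"] * 8 + ["p2"] * 2
T_est = estimate_topic_preferences(toy_clicks, toy_tspr)
```

With these toy numbers the observed click rate on `p1` is 0.8, so the estimate settles where 0.9·T(A) + 0.1·T(B) = 0.8.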

Page 31

Ranking Search Results Using Topic Preference Vectors

Ranking of page p =

because

Page 32

Evaluation Metrics

Accuracy of topic preference vector

Te is our estimation based on the user’s click history
T is the user’s actual topic preference vector

Page 33

Evaluation Metrics

Accuracy of personalized ranking
• Kendall distance between two lists:
  1. the sorted list of top-k pages based on the estimated personalized ranking scores
  2. the sorted list of top-k pages computed from the user’s true preference vector

Page 34

Evaluation Metrics

Improvement in search quality
• Average rank of relevant pages in the search result: $\frac{1}{|S|} \sum_{p \in S} R(p)$
• S denotes the set of the pages the user u selected
• R(p) is the ranking of the page p
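A small sketch of this metric, assuming R(p) is the 1-based position of page p in the result list and that every selected page appears in that list. The function name and the example lists are hypothetical.

```python
def average_rank(ranking, selected):
    """Mean 1-based position, within `ranking`, of the pages in `selected`."""
    position = {page: i + 1 for i, page in enumerate(ranking)}
    ranks = [position[p] for p in selected]
    return sum(ranks) / len(ranks)


# Hypothetical example: the user selected pages "a" and "c".
baseline = average_rank(["a", "b", "c", "d"], {"a", "c"})
personalized = average_rank(["c", "a", "b", "d"], {"a", "c"})
```

A lower value means the selected pages sit closer to the top of the list.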

Page 35

Experiments

User Study
• 10 subjects in the UCLA Computer Science Department
• 04/2004 – 10/2004 (6 months)
• Queries to Google, results and clicked URLs
• average number of queries per subject = 255.6
• average number of clicks per query = 0.91

Page 36

Experiments

Accuracy of Learning Method
• Synthetic dataset generated by simulation based on our topic-driven searcher model
• Generation of topic preference vector
  • Randomly choose K topics and assign random weights to them. The weights of the others are set to zero. The vector is then normalized.
• Generation of click history
  • Use the generated topic preference vector to generate the clicks by the visit probability distribution dictated by the topic-driven searcher model
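The two generation steps can be sketched as follows. The visit probability distribution is passed in as a plain dictionary because its exact form under the searcher model is not given in this transcript; the function names, the values of `m`, `k`, and `n`, and the toy distribution are hypothetical.

```python
import random


def random_topic_preferences(m, k, rng):
    """Pick k of m topics, give them random weights, and normalize to sum 1."""
    T = [0.0] * m
    for i in rng.sample(range(m), k):
        T[i] = 1.0 - rng.random()  # in (0, 1], so a chosen topic is never 0
    total = sum(T)
    return [w / total for w in T]


def sample_clicks(visit_prob, n, rng):
    """Draw n clicked pages from a visit probability distribution."""
    pages = list(visit_prob)
    return rng.choices(pages, weights=[visit_prob[p] for p in pages], k=n)


rng = random.Random(0)
T_true = random_topic_preferences(m=16, k=3, rng=rng)
clicks = sample_clicks({"a": 0.5, "b": 0.5, "c": 0.0}, n=50, rng=rng)
```

Fixing the seed keeps a synthetic run repeatable, which helps when comparing an estimated vector against the generated one.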

Page 37

Experiments

Accuracy of estimated topic preference vector

Page 38

Experiments

Accuracy of estimated topic preference vector

Page 39

Experiments

Accuracy of Personalized PageRank

Page 40

Experiments

Accuracy of Personalized PageRank

Page 41

Experiments

Quality of Personalized Search

Page 42

Experiments

Quality of Personalized Search

Page 43

Conclusion

Proposed a framework to investigate the problem of personalizing web searching by the user search history and TSPR

Conducted both theoretical and real life experiments to evaluate the approach

Page 44

Thank you

