Exploring Online Social Activities for Adaptive Search Personalization
CIKM’10
Advisor : Jia Ling, Koh
Speaker : SHENG HONG, CHUNG
Outline
• Introduction
• System Design
  – Overview
  – User Interest Profile
  – Search Result Personalization
  – Adaptive adjustment
• Evaluation
• Conclusion
Introduction
• Social networks have experienced explosive growth in the past few years.
• Social online activities carry valuable information about users’ background and interests.
• How to choose the right sources?
  – Availability
  – Privacy
  – Accuracy
Introduction
• This paper proposes a personalization framework that infers users’ interests and preferences from their public activities on a variety of online social systems.
  – Retrieve information and create an interest profile for each user.
  – Personalize search results based on the interest profile.
  – Automatically adjust the weights of the different information sources.
[Figure: system overview. Each user's social activities feed the system, which builds one user interest profile per user; the profiles are used for personalization and adaptive adjustment.]
System Design---Overview
• User Interest Profile
  – Create an interest profile for each user
• Receiving a query from a user
  – The search engine returns a number of webpages (relevance score)
  – Retrieve the interest vector from the interest profile
  – Compute an interest score based on how well each webpage matches the user’s interest
• Search Result Personalization
  – Combine both scores into a final score
• Adaptive Adjustments
  – Personalization degree
  – The weights of the different social information sources
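[Figure: query flow. The user's query goes to the search engine, which returns a result list (Webpage1, Webpage2, Webpage3, Webpage4, Webpage5, ...) with a relevance score. The interest vector (keyword : t, score : s) from the user interest profile is compared with each webpage by cosine similarity to give an interest score. Relevance score + interest score give the personalized result.]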
User Interest Profile
• Three parts :
  – Creating interest vectors
  – Combining interest vectors
  – Updating interest vectors
• Pre-definition :
  – A user interest profile is represented as {V, W, p}
  – V : {v1,……,vk}, a set of interest vectors
  – W : {w1,……,wk}, the weights of the corresponding interest vectors
  – p : a real number called the personalization degree
Example: the system builds {V, W, p} for a user from three sources (1. Facebook, 2. Twitter, 3. Bookmarks)
• V = {v1, v2, v3}
  – v1 : user information from Facebook
  – v2 : user information from Twitter
  – v3 : user information from Bookmarks
• w1 ~ w3 : the corresponding weights
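One way to picture the {V, W, p} profile is as a small record type. The sketch below is only an illustration: the class name `InterestProfile`, its field names, and the sample numbers are assumptions made here and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class InterestProfile:
    """Hypothetical container for a user interest profile {V, W, p}."""
    # V: one interest vector per source, mapping keyword -> score in [0, 1]
    vectors: list[dict[str, float]] = field(default_factory=list)
    # W: one weight per interest vector, in the same order as `vectors`
    weights: list[float] = field(default_factory=list)
    # p: the personalization degree
    p: float = 0.5


# Illustrative values only: three sources (Facebook, Twitter, Bookmarks).
profile = InterestProfile(
    vectors=[{"rice": 1.0}, {"noodle": 0.5}, {"spaghetti": 0.5}],
    weights=[0.4, 0.3, 0.3],
    p=0.5,
)
```

Keeping V and W as parallel lists mirrors the slide's notation, where wi is the weight of vi.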
Creating interest vectors
• There are different ways to create a vector
  – Depending on the information source
• Text resources :
  – Keywords : the most important keywords
  – Score : the number of texts that contain the keyword
• Tag-based resources :
  – Keywords : tags are treated as keywords
  – Score : the number of people who have tagged the user with the keyword
• For each user, normalize the scores into [0,1]
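The text-resource case can be sketched as follows. The slide does not say how keywords are extracted or how scores are normalized, so this sketch assumes a given keyword list, whitespace tokenization, and division by the largest count; `text_interest_vector` is a hypothetical name.

```python
from collections import Counter


def text_interest_vector(texts, keywords):
    """Score each keyword by the number of texts containing it, scaled to [0, 1]."""
    counts = Counter()
    for text in texts:
        words = set(text.lower().split())  # assumption: simple whitespace tokens
        for kw in keywords:
            if kw in words:
                counts[kw] += 1
    top = max(counts.values(), default=0)
    if top == 0:
        return {}
    # Assumption: normalize by the largest count so the top keyword scores 1.0.
    return {kw: c / top for kw, c in counts.items()}


posts = ["rice and noodle", "rice bowl", "spaghetti night", "rice again"]
vector = text_interest_vector(posts, ["rice", "noodle", "spaghetti"])
```

A tag-based source would follow the same shape, with the count being the number of people who applied each tag to the user.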
Combining interest vectors
Example: the system holds {V, W, p} for a user with three sources (1. Facebook, 2. Twitter, 3. Bookmarks)
• Keyword scores : Rice(4), Noodle(2), Spaghetti(2), ...
• T1 : { Rice, Noodle, Spaghetti }
• Weights : 0.4 0.3 0.3
• s(t) = 4*0.4 + 2*0.4 + 2*0.4 = 3.2
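A minimal sketch of combining per-source vectors into an overall interest vector is given here. It assumes the overall score of a keyword is the weighted sum of its per-source scores, which is one plausible reading of the weights W; the function name `combine_vectors` and the sample vectors are hypothetical and do not reproduce the slide's numbers.

```python
def combine_vectors(vectors, weights):
    """Weighted sum of keyword scores across sources (assumed combination rule)."""
    overall = {}
    for vec, w in zip(vectors, weights):
        for kw, score in vec.items():
            overall[kw] = overall.get(kw, 0.0) + w * score
    return overall


# Illustrative normalized vectors for three sources.
facebook = {"rice": 1.0, "noodle": 0.5}
twitter = {"rice": 0.5}
bookmarks = {"spaghetti": 1.0}
overall = combine_vectors([facebook, twitter, bookmarks], [0.4, 0.3, 0.3])
```

A keyword that appears in several sources accumulates score from each of them, so agreement between sources raises its overall score.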
Updating interest vectors
• Periodically crawl new data from social systems
• Integrate new information
• Add new social information sources
  – Add a new interest vector and make use of the new data
• Give higher probability to new data
Search Result Personalization
• Relevance score
  – The search engine returns a list of webpages
  – The kth webpage in the result list gets relevance score 1 / (1+k)
• Interest score
  – Cosine similarity between the word vector of the webpage and the overall interest vector
• Final score
  – gf(x) = gr(x) * (1-p) + gi(x) * p
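The scoring steps above can be sketched as a small re-ranking function. The formulas for gr, gi and gf follow the slide; representing pages and interests as keyword-to-score dictionaries, and the names `cosine` and `personalize`, are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse keyword -> score vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def personalize(results, page_vectors, interest, p):
    """Re-rank `results` (ordered by the search engine) by the final score gf."""
    scored = []
    for k, page in enumerate(results, start=1):
        g_r = 1.0 / (1 + k)                           # relevance score of kth page
        g_i = cosine(page_vectors[page], interest)    # interest score
        g_f = g_r * (1 - p) + g_i * p                 # final score
        scored.append((g_f, page))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [page for _, page in scored]


results = ["A", "B"]
page_vectors = {"A": {"car": 1.0}, "B": {"rice": 1.0}}
interest = {"rice": 1.0}
reranked = personalize(results, page_vectors, interest, p=0.5)
unchanged = personalize(results, page_vectors, interest, p=0.0)
```

With p = 0 the original order is kept, and larger p lets the interest score dominate, which is why p is called the personalization degree.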
Adaptive adjustment
• Adjusting personalization degree– Su : the set of search results that are actually
clicked by the user u– Lg : original list of results returned by the search
engine– Lp : final list of result returned by personalized
search system
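NDCG : Normalized Discounted Cumulative Gain
• Calculate two values : NDCG(Lg , Su ) and NDCG(Lp , Su )
• x : only the top x results of Lg or Lp are considered
• ri : 0 or 1
  – ri = 1 : the ith element of L is in S
  – ri = 0 : otherwise

Example (x = 3) :
Lg   Lp   Su
A    A    A
D    B    B
F    G    C

NDCG(Lg , Su ) = Z3 ( 1/1 + 0 + 0 ) = Z3
NDCG(Lp , Su ) = Z3 ( 1/1 + 1/log2(3) + 0 ) = 1… * Z3
• Compare NDCG(Lp , Su ) with NDCG(Lg , Su ) : if NDCG(Lp , Su ) > NDCG(Lg , Su ), the personalized list matched the user's clicks better, and the personalization degree is adjusted based on this comparison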
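The worked example can be reproduced with a short sketch. It assumes a clicked result at position i contributes 1 / log2(1 + i), which matches the terms 1/1 and 1/log2(3) in the example, and it compares unnormalized sums because both lists share the same factor Zx. The update rule (a fixed `step` up or down, clamped to [0, 1]) and the names `dcg` and `adjust_degree` are assumptions, not the paper's stated procedure.

```python
import math


def dcg(ranked, clicked, x):
    """Sum of 1 / log2(1 + i) over the top-x positions i whose item was clicked."""
    return sum(
        1.0 / math.log2(1 + i)
        for i, item in enumerate(ranked[:x], start=1)
        if item in clicked
    )


def adjust_degree(p, l_g, l_p, clicked, x=3, step=0.05):
    """Hypothetical rule: raise p if the personalized list did better, else lower it."""
    gain_g = dcg(l_g, clicked, x)
    gain_p = dcg(l_p, clicked, x)
    if gain_p > gain_g:
        p += step
    elif gain_p < gain_g:
        p -= step
    return min(1.0, max(0.0, p))


l_g = ["A", "D", "F"]      # original list
l_p = ["A", "B", "G"]      # personalized list
s_u = {"A", "B", "C"}      # clicked results
new_p = adjust_degree(0.5, l_g, l_p, s_u)
```

Here the personalized list places two clicked results in the top three while the original list places one, so the sketch moves p upward.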
Adaptive adjustment
• Adjusting source weights
  – Su : the set of search results that are actually clicked by the user u
  – vi : the interest vector of the ith information source
Example : Su = { A, B, C }, with v1 : Facebook and v2 : Twitter
• h1( v1, Su ) = cos(v1, A) + cos(v1, B) + cos(v1, C)
• h2( v2, Su ) = cos(v2, A) + cos(v2, B) + cos(v2, C)
• The average of h = (h1+h2) / 2
• Which of h1 and h2 is greater than the average of h?
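A sketch of this comparison follows. Computing hi as the sum of cosine similarities and comparing it with the average follows the slide; what happens next is not stated there, so raising weights of above-average sources by a fixed `step`, lowering the others, and renormalizing are assumptions, as are the names `cosine` and `adjust_weights`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse keyword -> score vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjust_weights(vectors, weights, clicked_pages, step=0.05):
    """Hypothetical update: favor sources whose h_i exceeds the average h."""
    h = [sum(cosine(v, page) for page in clicked_pages) for v in vectors]
    average = sum(h) / len(h)
    updated = []
    for w, h_i in zip(weights, h):
        if h_i > average:
            w += step
        elif h_i < average:
            w = max(0.0, w - step)
        updated.append(w)
    total = sum(updated)
    # Assumption: keep the weights summing to 1 after the update.
    return [w / total for w in updated] if total > 0 else list(weights)


v1 = {"rice": 1.0}                      # e.g. Facebook
v2 = {"car": 1.0}                       # e.g. Twitter
clicked = [{"rice": 1.0}, {"rice": 0.5, "noodle": 0.5}, {"noodle": 1.0}]
new_weights = adjust_weights([v1, v2], [0.5, 0.5], clicked)
```

In this example v1 is closer to the clicked pages than v2, so its weight ends up larger.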
Evaluation
• Experiment
  – Blogs
  – Social bookmarks
  – Mutual tags
• 208 users
  – At least 10 blogs
  – No fewer than 10 people tags
  – Bookmarked 20 webpages or more
Evaluation Method and Metrics
• Use 25% of the bookmarks to create the interest profile
• The other 75% is the testing corpus
• For the ith user ui, randomly choose 30 words
• For each word, a search query consisting of that word is issued on behalf of ui
• A search query consists of a word t
• Lt[1,k] is the list of the top k results returned by the search system
• St is the set of webpages that have been tagged with t by ui
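The definitions of Lt[1,k] and St suggest a recall measure, which can be sketched as below. The exact formula is not present in the extracted slide, so this assumes recall is the fraction of St that appears among the top k results; `recall_at_k` and `average_recall` are hypothetical names.

```python
def recall_at_k(results, relevant, k):
    """Fraction of the relevant set S_t found in the top-k list L_t[1,k]."""
    if not relevant:
        return 0.0
    return len(set(results[:k]) & relevant) / len(relevant)


def average_recall(runs, k):
    """Mean recall over (results, relevant) pairs, one pair per query."""
    return sum(recall_at_k(res, rel, k) for res, rel in runs) / len(runs)


# Two illustrative queries issued on behalf of one user.
runs = [
    (["a", "b", "c"], {"a", "c", "z"}),
    (["d", "e"], {"e"}),
]
mean_recall = average_recall(runs, k=2)
```

In the experiment each user has 30 such queries, so `runs` would hold 30 pairs per user.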
Evaluation Method and Metrics
• Compute the average value of the recall over the 30 search queries issued for ui
• Improvement percentage
• ra and rb are the average recall of approaches A and B
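The improvement formula itself did not survive in the extracted slide. The sketch below assumes the usual relative gain of approach A over approach B, expressed as a percentage; both the formula and the name `improvement_percentage` are assumptions.

```python
def improvement_percentage(r_a, r_b):
    """Assumed definition: relative gain of average recall r_a over r_b, in percent."""
    if r_b == 0:
        raise ValueError("r_b must be non-zero to compute a relative improvement")
    return (r_a - r_b) / r_b * 100.0


gain = improvement_percentage(0.5, 0.25)
```

Under this definition a positive value means approach A recalled more of the user's tagged pages than approach B.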
Experimental Results
• Personalization vs. Non-personalization
Experimental Results
• Active users vs. Less active users
Experimental Results
• Multiple sources vs. Single source
Experimental Results
• Effectiveness of adaptation
  – Personalization degree adjustment (PDA)
  – Source weight initialization (SWI)
  – Source weight adjustment (SWA)
Conclusion
• Propose a personalization framework
  – Infer users’ preferences from their activities on a variety of online social systems
  – Create user interest profiles
  – Integrate information from different information sources
  – Personalize search results with the interest profile
  – Adapt the personalization degree and the source weights