Personalized Social Media Search through Augmented Folksonomy Graph
Qing LI
Department of Computer Science
Multimedia Software Engineering Research Centre
City University of Hong Kong
Outline
Background Motivation Methodology Experiment Conclusion
Background
• Huge amount of resources on the World Wide Web
• Users can contribute a lot more since Web 2.0
• User profiling: assists users in finding what they want
• Folksonomy/Collaborative tagging system: Large popularity on the Web
• Users can contribute on resource descriptions
• Users can annotate resources by tags from their perspectives
• A resource can be annotated by users collaboratively
Web pages photos music books videos
Background
Collaborative tagging systems
• Folksonomy/Collaborative tagging system: Large popularity on the Web
• Definition: A folksonomy F is a tuple
$F = (U, T, R, Y)$, $Y \subseteq U \times T \times R$
Web pages photos music books videos
Users Tags Resources
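The definition above can be sketched as a small data structure. This is only an illustration: the class name `Folksonomy`, the helper `build_folksonomy`, and the toy triples are assumptions, not part of the talk.

```python
from typing import NamedTuple


class Folksonomy(NamedTuple):
    """F = (U, T, R, Y) with Y a subset of U x T x R."""
    users: frozenset
    tags: frozenset
    resources: frozenset
    assignments: frozenset  # Y: (user, tag, resource) triples


def build_folksonomy(triples):
    """Derive U, T and R from an iterable of (user, tag, resource) triples."""
    y = frozenset(triples)
    return Folksonomy(
        users=frozenset(u for u, _, _ in y),
        tags=frozenset(t for _, t, _ in y),
        resources=frozenset(r for _, _, r in y),
        assignments=y,
    )


# Hypothetical toy data: two users annotate two resources.
F = build_folksonomy([
    ("tom", "java", "coffee-guide"),
    ("bob", "java", "jvm-tutorial"),
    ("bob", "programming", "jvm-tutorial"),
])
```

Storing Y as a set of triples keeps the ternary relation explicit; U, T and R are simply its three projections.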
Background
Collaborative tagging systems
Example: Tom (a coffee fan) and Bob (a programmer) both search for “Java” (results from www.delicious.com)
Most current resource search tools depend only on how well the query matches the resource descriptions: no personalization
Background
Collaborative tagging systems
- Modeling users by tags (user profiling): the tags the user has annotated resources with
- Modeling resources by tags (resource profiling): the tags users have given to the resource
Spicy, pork,…, delicious
Beef, lily,…, healthy
Acetous, chop,…, fried
Acetous, sugary, …,high-calorie
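A minimal sketch of the two profiles, assuming plain tag counts (the TF weighting mentioned in the related work) as the profile weights. The function names and the toy assignments are hypothetical.

```python
from collections import Counter


def user_profile(assignments, user):
    """Tag-frequency profile: how often `user` applied each tag."""
    return Counter(t for u, t, _ in assignments if u == user)


def resource_profile(assignments, resource):
    """Tag-frequency profile: how often each tag was given to `resource`."""
    return Counter(t for _, t, r in assignments if r == resource)


# Hypothetical (user, tag, resource) assignments.
assignments = [
    ("alice", "spicy", "dish-1"),
    ("alice", "pork", "dish-1"),
    ("alice", "spicy", "dish-2"),
    ("carol", "delicious", "dish-1"),
    ("carol", "spicy", "dish-1"),
]

alice = user_profile(assignments, "alice")
dish1 = resource_profile(assignments, "dish-1")
```

Other weighting schemes (TF-IDF, BM25, NTF) would replace the raw counts but keep the same tag-vector shape.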
Background
Collaborative tagging systems
Personalized Resource Search Framework
- M. G. Noll et al. (ISWC’07) (TF): Web Search Personalization via Social Bookmarking & Tagging
- S. Xu et al. (SIGIR’08) (TF-IDF): Exploring Folksonomy for Personalized Search
- D. Vallet et al. (ECIR’10) (BM25): Personalizing Web Search with Folksonomy-based User & Document Profiles
- Y. Cai & Q. Li (CIKM’10) (NTF): Personalizing Search by Tag-based User Profile & Resource Profile in Collaborative Tagging Systems
- F. Abel et al. (ESWC’11) (CDW): Semantic Enrichment of Twitter Posts for User Profile Construction on the Social Web
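[Figure: personalized resource search framework. Input: a query and a user profile. The similarity between the query and the resources gives a resource rank based on content relevance; the similarity between the user profile and the resources gives a resource rank based on user interest relevance; the two ranks are aggregated into the personalized rank.]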
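The framework can be sketched as two similarity scores that are then aggregated. This is a sketch under assumptions: cosine similarity over sparse tag vectors and a weighted sum with a hypothetical weight `gamma` stand in for whatever similarity and aggregation a given method uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse tag vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def personalized_rank(query, user, resources, gamma=0.5):
    """Rank resources by gamma * content relevance + (1 - gamma) * interest relevance."""
    scores = {}
    for name, profile in resources.items():
        content = cosine(query, profile)    # query vs. resource profile
        interest = cosine(user, profile)    # user profile vs. resource profile
        scores[name] = gamma * content + (1 - gamma) * interest
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical resource profiles, query and user profiles.
resources = {
    "coffee-guide": {"java": 1.0, "coffee": 2.0},
    "jvm-tutorial": {"java": 1.0, "programming": 2.0},
}
query = {"java": 1.0}
tom = {"coffee": 1.0}
bob = {"programming": 1.0}
```

With the same query, `personalized_rank(query, tom, resources)` and `personalized_rank(query, bob, resources)` order the two resources differently, which is the point of the personalization step.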
Motivation
- User communities can help address the problem of conflicting tags and so provide more precise search results.
Existing community mining methods are mainly based on the ternary relationships (tripartite graph) among users, tags and resources.
- Wu et al. (WWW’06): Exploring social annotations for the semantic web
- Ma et al. (TMM’10): Bridging the semantic gap between image contents and tags
- Xie et al. (JCST’12): Community-Aware Resource Profiling for Personalized Search in Folksonomy
Problems:
-- Sparsity
-- Noise
What have we missed?
Content similarity between resources, e.g., visual similarity (images), acoustic similarity (audio)
Semantic similarity between tags, e.g., the “is-a” relationship (apple & fruit)
• We first propose a method for discovering latent user communities via MFG -- Multi-facet Folksonomy Graph
• MFG includes not only the conventional tripartite user-resource-tag relations, but also the tag-to-tag semantic and resource-to-resource content similarities!
Methodology
Q: How can these two kinds of similarity be integrated with the original ternary relations (user-resource-tag)? -- Augmented Folksonomy Graph!
Methodology
Augmented Folksonomy Graph (AFG)
Image-to-Image Visual Similarity
The cosine similarity between global features (e.g. color correlogram, color histogram, edge direction histogram, wavelet texture) extracted from resource images:
$sim(r_i, r_j) = \frac{\mathbf{v}_i \cdot \mathbf{v}_j}{\|\mathbf{v}_i\| \, \|\mathbf{v}_j\|}$
where $\mathbf{v}_i$ is the feature vector for image $r_i$
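A minimal sketch of how the image-to-image edges might be computed: a pairwise cosine similarity matrix over feature vectors. The three short vectors are made-up stand-ins for real global features such as a 64-D color histogram.

```python
import math


def cosine_similarity(v, w):
    """Cosine of the angle between two dense feature vectors."""
    dot = sum(a * b for a, b in zip(v, w))
    norm = math.sqrt(sum(a * a for a in v)) * math.sqrt(sum(b * b for b in w))
    return dot / norm if norm else 0.0


def similarity_matrix(features):
    """Pairwise image-to-image similarities, one row per image."""
    n = len(features)
    return [[cosine_similarity(features[i], features[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical 3-bin histograms: images 0 and 1 look alike, image 2 differs.
features = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.0, 0.1, 0.9],
]
S = similarity_matrix(features)
```

In practice the different feature types would be concatenated or combined per feature before the similarity is taken; that choice is not specified here.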
Methodology
Augmented Folksonomy Graph (AFG)
Tag-to-Tag Semantic Similarity
Semantic similarity between two tags (e.g. Lin’s similarity on WordNet):
$sim(t_x, t_y) = \frac{2 \log p(lso(s_x, s_y))}{\log p(s_x) + \log p(s_y)}$
where $s_x$ is the corresponding synset (i.e. the set of synonyms) for tag $t_x$,
$lso(s_x, s_y)$ is the lowest super-ordinate of the two synsets (i.e. their most specific common ancestor in the concept hierarchy), used to compare their information contents, and
$p(s)$ is the probability of encountering an instance of the synset $s$
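Lin’s measure can be illustrated on a tiny hand-made hierarchy instead of WordNet. Everything below (the `PARENT` taxonomy, the `COUNT` frequencies, the function names) is a hypothetical toy, meant only to show how the lowest super-ordinate and the synset probabilities enter the formula.

```python
import math

# Hypothetical concept hierarchy (child -> parent) and corpus counts.
PARENT = {"apple": "fruit", "banana": "fruit", "fruit": "food",
          "bread": "food", "food": None}
COUNT = {"apple": 4, "banana": 2, "fruit": 2, "bread": 4, "food": 4}


def ancestors(concept):
    """The concept itself followed by its super-ordinates, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def probability(concept):
    """p(s): share of all counts that fall on the concept or its descendants."""
    total = sum(COUNT.values())
    mass = sum(n for node, n in COUNT.items() if concept in ancestors(node))
    return mass / total


def lowest_superordinate(a, b):
    """Most specific concept that subsumes both a and b."""
    shared = set(ancestors(b))
    return next(c for c in ancestors(a) if c in shared)


def lin_similarity(a, b):
    """Lin's similarity: 2 log p(lso) / (log p(a) + log p(b))."""
    denominator = math.log(probability(a)) + math.log(probability(b))
    if denominator == 0:
        return 1.0  # both concepts are the root, which has probability 1
    return 2 * math.log(probability(lowest_superordinate(a, b))) / denominator
```

Here "apple" and "banana" share the fairly specific ancestor "fruit" and get a positive score, while "apple" and "bread" only meet at the root and score zero.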
Methodology
Augmented Folksonomy Graph (AFG)
Q: How to measure the distance between two users?
Random walk distance: multiple paths; multiple types of edges; unified measurement
Methodology
Random Walk Model
Methodology
Random Walk Distance
We use two parameters α and β to control the impact of image-to-image visual similarity and tag-to-tag semantic similarity on the AFG,
where $W_{l+1|l}(j \mid i)$ is the final transition probability matrix applied directly to the random walk
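The exact formula for the transition matrix is not recoverable from the slide text, so the following is only a sketch under a stated assumption: the folksonomy edges are added to α times the visual-similarity edges and β times the semantic-similarity edges, and each row is then normalised. The helper names and the six-vertex toy graph are hypothetical.

```python
def symmetric(n, edges):
    """Build an n x n symmetric weight matrix from (i, j, weight) edges."""
    m = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        m[i][j] = m[j][i] = w
    return m


def transition_matrix(base, visual, semantic, alpha, beta):
    """Assumed form: row-normalise base + alpha * visual + beta * semantic."""
    n = len(base)
    W = []
    for i in range(n):
        row = [base[i][j] + alpha * visual[i][j] + beta * semantic[i][j]
               for j in range(n)]
        total = sum(row)
        W.append([w / total if total else 0.0 for w in row])
    return W


def walk(W, start, steps):
    """Distribution over vertices after `steps` steps from `start`."""
    n = len(W)
    dist = [0.0] * n
    dist[start] = 1.0
    for _ in range(steps):
        dist = [sum(dist[i] * W[i][j] for i in range(n)) for j in range(n)]
    return dist


# Vertices: 0, 1 = users; 2, 3 = tags; 4, 5 = images.
# User 0 tagged image 4 with tag 2; user 1 tagged image 5 with tag 3.
base = symmetric(6, [(0, 2, 1), (0, 4, 1), (2, 4, 1),
                     (1, 3, 1), (1, 5, 1), (3, 5, 1)])
visual = symmetric(6, [(4, 5, 0.9)])     # the two images look alike
semantic = symmetric(6, [(2, 3, 0.5)])   # the two tags are related

plain = transition_matrix(base, visual, semantic, alpha=0.0, beta=0.0)
augmented = transition_matrix(base, visual, semantic, alpha=1.0, beta=1.0)
```

With α = β = 0 the two users sit in disconnected parts of the tripartite graph, so a walk from one never reaches the other; the added similarity edges create paths between them, which is how the augmentation counters sparsity.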
Methodology
Cluster-based User Community Discovery with APC:
• Cluster initialization by density function
• Find new user to maximize the user-cluster similarity
• Repeat until the intra-cluster sum (the objective function) converges!
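The three steps above can be sketched as a simple iterative procedure over a user similarity matrix. This is not the APC algorithm itself, whose density function and objective are not given in the slide text; the density (sum of similarities), the seed selection rule, and the average-similarity assignment below are all assumptions.

```python
def density(S, i):
    """Assumed density: total similarity of user i to all other users."""
    return sum(S[i][j] for j in range(len(S)) if j != i)


def init_seeds(S, k):
    """Pick k dense users that are dissimilar to the seeds chosen so far."""
    n = len(S)
    seeds = [max(range(n), key=lambda i: density(S, i))]
    while len(seeds) < k:
        candidates = [i for i in range(n) if i not in seeds]
        seeds.append(max(
            candidates,
            key=lambda i: density(S, i) * (1 - max(S[i][s] for s in seeds))))
    return seeds


def user_cluster_similarity(S, i, members):
    """Average similarity between user i and the other cluster members."""
    others = [j for j in members if j != i]
    return sum(S[i][j] for j in others) / len(others) if others else 0.0


def objective(S, clusters):
    """Sum of pairwise intra-cluster similarities."""
    return sum(S[i][j] for c in clusters for i in c for j in c if i < j)


def discover_communities(S, k, max_iter=100):
    seeds = init_seeds(S, k)
    clusters = [[s] for s in seeds]
    previous = None
    for _ in range(max_iter):
        updated = [[s] for s in seeds]          # seeds stay in their cluster
        for i in range(len(S)):
            if i in seeds:
                continue
            best = max(range(k),
                       key=lambda c: user_cluster_similarity(S, i, clusters[c]))
            updated[best].append(i)
        clusters = updated
        score = objective(S, clusters)
        if previous is not None and abs(score - previous) < 1e-9:
            break                                # objective has converged
        previous = score
    return [sorted(c) for c in clusters]


# Hypothetical similarities: users 0-2 and users 3-5 form two tight groups.
S = [[1.0 if i == j else (0.9 if i // 3 == j // 3 else 0.1) for j in range(6)]
     for i in range(6)]
communities = discover_communities(S, k=2)
```

In the talk the input similarities would come from the random walk distance between user vertices on the AFG.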
Application of APC in Search
Tag-based Personalized Search (TBPS): search by terms/tags
Graph-based TBPS
Find the resource (image) vertex that has the largest sum of probabilities of reaching both the user vertex and all query term (tag) vertices of the query.
Social-based TBPS
Based on the “voting” results of the user community members.
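A sketch of the graph-based variant, assuming the reach probabilities have already been computed (for example by a random walk on the AFG). The `reach` table, its numbers, and the function name are hypothetical.

```python
def graph_based_tbps(reach, resources, user, query_tags):
    """Rank resources by the summed probability of reaching the user
    vertex and every query tag vertex."""
    def score(resource):
        return reach[resource][user] + sum(reach[resource][t] for t in query_tags)
    return sorted(resources, key=score, reverse=True)


# Hypothetical reach probabilities: reach[resource][vertex].
reach = {
    "coffee-guide": {"tom": 0.30, "bob": 0.02, "java": 0.25},
    "jvm-tutorial": {"tom": 0.03, "bob": 0.28, "java": 0.27},
}
resources = ["coffee-guide", "jvm-tutorial"]

for_tom = graph_based_tbps(reach, resources, "tom", ["java"])
for_bob = graph_based_tbps(reach, resources, "bob", ["java"])
```

The same query tag yields a different top result per user because the user vertex contributes to the sum.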
Application of APC in Search
Content-based Personalized Search (CBPS): search by image example
Graph-based CBPS
Find the resource (image) vertex that has the largest sum of probabilities of reaching both the user vertex and the query example image of the query.
Social-based CBPS
Based on the “voting” results of the user community members.
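The social-based variant is only described as "voting" by community members, so the following is one possible reading rather than the method from the talk: each member of the searcher's community casts one vote for every candidate resource that member has annotated. The candidate list would come from the tag match (TBPS) or from visual similarity to the example image (CBPS); all names and data are hypothetical.

```python
from collections import Counter


def social_based_rank(candidates, community, annotated_by):
    """Rank candidates by how many community members annotated them."""
    votes = Counter()
    for member in community:
        for resource in annotated_by.get(member, ()):
            if resource in candidates:
                votes[resource] += 1
    # sorted() is stable, so tied candidates keep their original order.
    return sorted(candidates, key=lambda r: votes[r], reverse=True)


# Hypothetical candidates and community annotations.
candidates = ["img-a", "img-b", "img-c"]
community = ["u1", "u2", "u3"]
annotated_by = {
    "u1": {"img-b"},
    "u2": {"img-b", "img-c"},
    "u3": {"img-x"},
}
ranking = social_based_rank(candidates, community, annotated_by)
```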
Experiment
Dataset
NUS-WIDE Data Set
• 269,648 resources (images), 5,018 unique tags, and 50,120 users
• Each user has annotated about 5 images on average
• Ground truth for 81 categories (concepts) of the images
• Six types of low-level features are available for these images: 64-D color histogram, 144-D color correlogram, 73-D edge direction histogram, 128-D wavelet texture, 225-D block-wise color moments and 500-D bag of words based on SIFT descriptions
• We split the dataset into 80% as the training set and 20% as the test set
Experiment
Metrics
• P@N (precision at N)
• MRR (mean reciprocal rank)
• IMP
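P@N and MRR have standard definitions, sketched below; the formula for IMP is not recoverable from the slide text and is left out. Function names and the toy rankings are illustrative only.

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the top-n ranked resources that are relevant."""
    return sum(1 for r in ranked[:n] if r in relevant) / n


def mean_reciprocal_rank(rankings, relevants):
    """Average over queries of 1 / (rank of the first relevant resource)."""
    total = 0.0
    for ranked, relevant in zip(rankings, relevants):
        for position, resource in enumerate(ranked, start=1):
            if resource in relevant:
                total += 1.0 / position
                break  # only the first relevant hit counts
    return total / len(rankings)


# Hypothetical results for illustration.
p_at_2 = precision_at_n(["a", "b", "c", "d"], {"a", "c"}, 2)
mrr = mean_reciprocal_rank([["a", "b"], ["x", "y", "z"]], [{"b"}, {"z"}])
```

A query with no relevant resource in its ranking contributes 0 to the MRR sum.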
Experiment
Baselines
• Carmel et al. (JVLDB 2010): Social bookmark weighting for search and recommendation
• Xie et al. (JCST 2012): Community-aware resource profiling for personalized search in folksonomy
• Ma et al. (TMM 2010): Bridging the semantic gap between image contents and tags
Experiment
Performance on TBPS (Graph-based)
Experiment
Performance on TBPS (Social-based)
Experiment
Performance on TBPS (Graph-based & Social-based MRR)
Experiment
Performance on CBPS (Graph-based)
Experiment
Performance on CBPS (Social-based)
Experiment
Performance on CBPS (Graph-based & Social-based MRR)
Experiment
Influence of Parameters (impact of k for social-based methods)
• k from 20 to 80 (or 100): MRR values for all methods increase. Explanation: more user communities lead to a more precise partition of all users.
• k = 80 (or 100): MRR values for all methods reach their peak. Explanation: the dataset used contains 81 categories.
• k from 80 (or 100) to 120: MRR values for all methods drop. Explanation: the number of users available for constructing user communities becomes smaller and smaller as k increases.
Experiment
Influence of Parameters (impact of α, β for graph-based methods)
Impact on Graph-based TBPS
• α: the impact of image-to-image visual similarity
• β: the impact of tag-to-tag semantic similarity
• The best MRR (0.5416) is obtained with α = 0.0001, β = 0.01
• The MRR performance is more sensitive to β than to α, because α has little impact on the performance when β = 0, while β seems to be quite influential when α = 0.
• Tag-to-tag semantic similarity is more dominant than image-to-image visual similarity in TBPS.
Experiment
Influence of Parameters (impact of α, β for graph-based methods)
Impact on Graph-based CBPS
• α: the impact of image-to-image visual similarity
• β: the impact of tag-to-tag semantic similarity
• The best MRR (0.4832) is obtained with α = 0.01, β = 0.001
• The MRR performance is more sensitive to α than to β, because β has little impact on the performance when α = 0, while α seems to be quite influential when β = 0.
• Image-to-image visual similarity is more dominant than tag-to-tag semantic similarity in CBPS.
Experiment
Influence of Alternatives (SimRank vs. Random Walk)
SimRank is based on the concept of structural-context similarity, in which ‘two objects are similar if they relate to similar objects’. For any two user vertices in the AFG, the SimRank value at step l+1 is calculated from the values at step l.
SimRank is also a suitable method for measuring the distance between two vertices in the AFG.
(Jeh, G. and Widom, J. SimRank: A Measure of Structural-Context Similarity. KDD 2002)
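A minimal sketch of the SimRank iteration, assuming an undirected graph so that a vertex's neighbours play the role of the in-neighbours in Jeh and Widom's definition. The decay factor of 0.8, the iteration count, and the toy graph are assumptions.

```python
def simrank(neighbors, decay=0.8, iterations=10):
    """Iterate s_{l+1}(a, b) = decay / (|N(a)||N(b)|) * sum of s_l(x, y)
    over x in N(a), y in N(b), with s(a, a) = 1."""
    nodes = list(neighbors)
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        updated = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    updated[a][b] = 1.0
                elif not neighbors[a] or not neighbors[b]:
                    updated[a][b] = 0.0
                else:
                    total = sum(sim[x][y]
                                for x in neighbors[a] for y in neighbors[b])
                    updated[a][b] = decay * total / (
                        len(neighbors[a]) * len(neighbors[b]))
        sim = updated
    return sim


# Hypothetical graph: users u1 and u2 share tag t1, user u3 only uses t2.
neighbors = {
    "u1": ["t1"], "u2": ["t1"], "u3": ["t2"],
    "t1": ["u1", "u2"], "t2": ["u3"],
}
scores = simrank(neighbors)
```

Users u1 and u2 relate to the same tag and come out similar, while u3 has no structural overlap with either and stays at zero.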
Experiment
Influence of Alternatives (K-means vs. APC)
The classical clustering method K-means can also be applied to the user similarity matrix to discover the user communities.
We compare K-means with our clustering method APC.
Conclusion
• We have proposed the AFG -- Augmented Folksonomy Graph, which incorporates both image-to-image visual similarity and tag-to-tag semantic similarity
• User communities are then discovered by a density-based clustering method based on user similarity
• We have conducted extensive experiments on the NUS-WIDE dataset; the results show that the proposed approach outperforms the baselines in search applications
Future work
• Matrix factorization for finding similar users
• Implementing real applications based on the proposed method
Thank you!