LEARNING TO MODEL RELATEDNESS FOR NEWS RECOMMENDATION
Authors: Yuanhua Lv et al. (UIUC, Yahoo! Labs)
Presenter: Robbie
WWW 2011
OUTLINE
Introduction and Motivation
Model Relatedness
Experiment
Conclusion
INTRODUCTION
Post-click news recommendation
Seed news
Candidate news
MOTIVATION
Promote users’ navigation on the visited website
Yahoo! and Google focus on initial clicks; post-click news recommendation is largely under-explored
Post-click recommendations mainly depend on editors’ manual effort
No existing method proposed to model relatedness directly
MODEL RELATEDNESS
Four aspects
• Relevance
• Novelty
• Connection clarity
• Transition smoothness
RELEVANCE AND NOVELTY
Similar but not duplicate
Novelty often in contrast to relevance
Use the same set of features to measure them:
• cosine similarity
• BM25
• language models with Dirichlet prior smoothing
• language models with Jelinek-Mercer smoothing
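As a rough illustration of two of the features listed above, the sketch below computes cosine similarity over raw term-frequency vectors and a Dirichlet-smoothed log-likelihood of the seed article s given a candidate d. The function names, the plain token-list input, the `collection_prob` table, and the value `mu=2000.0` are assumptions made for this example; the slides do not specify term weighting or parameter settings.

```python
import math
from collections import Counter


def cosine_similarity(s_tokens, d_tokens):
    """Cosine similarity between raw term-frequency vectors of s and d."""
    s_tf, d_tf = Counter(s_tokens), Counter(d_tokens)
    dot = sum(s_tf[w] * d_tf[w] for w in s_tf.keys() & d_tf.keys())
    norm_s = math.sqrt(sum(c * c for c in s_tf.values()))
    norm_d = math.sqrt(sum(c * c for c in d_tf.values()))
    if norm_s == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_s * norm_d)


def dirichlet_log_likelihood(s_tokens, d_tokens, collection_prob, mu=2000.0):
    """log P(s | d) under a unigram model of d with Dirichlet prior smoothing.

    collection_prob (hypothetical input) maps each word to its probability
    in the whole collection and must cover every word in s_tokens.
    mu is an assumed smoothing parameter, not a value from the slides.
    """
    d_tf = Counter(d_tokens)
    d_len = len(d_tokens)
    score = 0.0
    for w in s_tokens:
        p = (d_tf[w] + mu * collection_prob[w]) / (d_len + mu)
        score += math.log(p)
    return score
```

Both functions take the seed as the "query" and the candidate as the "document"; a high score suggests relevance, while a very high score may also signal a near-duplicate, which is why the same features can serve both relevance and novelty.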
CONNECTION CLARITY
Relevance and novelty can only model word overlap between two articles s and d
Example: s: White House: Obamas earn $5.5 million in 2009
d: Obama’s oil spill bill seeks $118 million, oil company
s and d must be topically cohesive
Connection clarity defines the topical cohesion of two news articles
CONNECTION CLARITY
\[
P(w \mid s, d) \propto P(w, s, d) = \sum_{r \in R} P(w \mid r) \prod_{w' \in V} P(w' \mid r)^{c(w', s, d)}
\]
\[
\prod_{w' \in V} P(w' \mid r)^{c(w', s, d)}
\]
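The following is a minimal sketch of a relevance-model-style estimate of P(w | s, d), assuming the form "sum over reference models r of P(w | r) times the product over words w' of P(w' | r) raised to c(w', s, d)". Two things are assumptions of this example and are not stated on the slide: c(w', s, d) is taken to be the smaller of the word's counts in s and d, and each reference model r is a plain dictionary from words to probabilities. All function names are hypothetical.

```python
from collections import Counter


def overlap_counts(s_tokens, d_tokens):
    """Assumed c(w, s, d): the smaller of the word's counts in s and in d."""
    s_tf, d_tf = Counter(s_tokens), Counter(d_tokens)
    return {w: min(s_tf[w], d_tf[w]) for w in s_tf.keys() & d_tf.keys()}


def joint_word_model(s_tokens, d_tokens, reference_models):
    """Estimate P(w | s, d) by normalizing
    sum_r P(w | r) * prod_{w'} P(w' | r) ** c(w', s, d).

    reference_models is a list of dicts mapping word -> P(word | r).
    Words with c = 0 contribute a factor of 1, so only overlapping
    words need to be visited in the product.
    """
    c = overlap_counts(s_tokens, d_tokens)
    scores = Counter()
    for model in reference_models:
        weight = 1.0
        for w_prime, count in c.items():
            weight *= model.get(w_prime, 0.0) ** count
        for w, p in model.items():
            scores[w] += p * weight
    total = sum(scores.values())
    if total <= 0:
        return {}
    return {w: v / total for w, v in scores.items()}
```

In practice the reference models would need smoothing so that a single unseen overlap word does not zero out a model's weight; the sketch leaves that out for brevity.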
TRANSITION SMOOTHNESS
Example: s: Toyota dismisses account of runaway Prius
d1: What to do if your car suddenly accelerates
d2: Toyota to build Prius at 3rd Japan plant: report
Definition: Measures how well a user’s reading interests can transition from s to d
Transition smoothness is measured from s-d to d-s, i.e. from the “known” to the “novel” information
smooth(s, d1) > smooth(s, d2)
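To make the s-d and d-s notation concrete, the sketch below splits a seed and a candidate into the words only in the seed ("known") and the words only in the candidate ("novel"). It illustrates the set difference only; the slides do not show how the smoothness score itself is computed from these two sets, so no score is attempted here. The tokenizer and function names are assumptions for this example.

```python
def tokenize(text):
    """Very simple tokenizer (an assumption): lowercase, strip punctuation."""
    tokens = []
    for raw in text.split():
        token = raw.strip(".,:;!?\"'").lower()
        if token:
            tokens.append(token)
    return tokens


def split_known_novel(s_tokens, d_tokens):
    """Return (s - d, d - s): words only in the seed, words only in the candidate."""
    s_set, d_set = set(s_tokens), set(d_tokens)
    known = s_set - d_set  # s-d: context the reader already has
    novel = d_set - s_set  # d-s: what the candidate would add
    return known, novel


seed = tokenize("Toyota dismisses account of runaway Prius")
candidate = tokenize("Toyota to build Prius at 3rd Japan plant: report")
known, novel = split_known_novel(seed, candidate)
```

For this pair, the shared words "toyota" and "prius" drop out, leaving the "runaway" story on the known side and the "plant" story on the novel side, which matches the intuition that the transition to d2 is not smooth.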
TRANSITION SMOOTHNESS
LEARNING A RELATEDNESS FUNCTION
CONSTRUCTING TEST COLLECTION
Yahoo! News articles from March 1st to June 30th, 2010
In each run, randomly sample 549 seed news articles from June 10th to June 20th, each with at least 2,000 visits
Perform redundancy detection
EDITORIAL JUDGMENTS
A group of professional news editors from a commercial online news website
4-point relatedness scale: “very related”, “somewhat related”, “redundant”, “unrelated”
For any document with two different judgments, select the judgment with the higher ratio
High agreement in relative relatedness (80.8%), which motivates learning relatedness functions from pairwise preference information
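One way to picture "pairwise preference information" is to turn the editorial labels for a single seed article into ordered pairs of candidates. The sketch below does this under a loudly stated assumption: the numeric grades in `ASSUMED_GRADES` are hypothetical, since the slides do not say how the four labels are ordered or whether "redundant" and "unrelated" are tied. The function name and input layout are also illustrative.

```python
from itertools import combinations

# Hypothetical grades: the ordering (and the tie between the last two
# labels) is an assumption made for this example, not taken from the slides.
ASSUMED_GRADES = {
    "very related": 2,
    "somewhat related": 1,
    "redundant": 0,
    "unrelated": 0,
}


def preference_pairs(judgments, grades=ASSUMED_GRADES):
    """Build (preferred, other) pairs for one seed article.

    judgments maps a candidate id to its editorial label.
    Candidates with equal grades produce no pair.
    """
    pairs = []
    for a, b in combinations(sorted(judgments), 2):
        grade_a, grade_b = grades[judgments[a]], grades[judgments[b]]
        if grade_a > grade_b:
            pairs.append((a, b))
        elif grade_b > grade_a:
            pairs.append((b, a))
    return pairs
```

Pairs like these can be fed to any pairwise learning-to-rank method; passing a different `grades` mapping changes which pairs are generated without touching the rest of the code.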
EXPERIMENTS: COMPARING INDIVIDUAL RETRIEVAL MODELS
• “Body” is the best field; title and abstract may lose information
• Cosine similarity performs as well as or even better than language models in some cases, but its NDCG1 is the worst
• Cosine similarity is effective for redundancy detection, which brings redundant documents to the top
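Since the results are reported in NDCG at different cutoffs, a small sketch of the metric may help in reading them. It uses the common gain `2**g - 1` with a log2 position discount; whether the original experiments used exactly this variant is an assumption, as is the idea that redundant documents receive a gain of 0.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top k positions."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG at k divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    if ideal <= 0:
        return 0.0
    return dcg_at_k(gains, k) / ideal
```

Under these assumptions, a ranking that places a redundant (gain 0) document first scores 0 at cutoff 1 no matter how good the rest of the list is, which is consistent with a method being competitive overall yet worst on NDCG1.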
EXPERIMENTS: COMPARING MACHINE-LEARNED RELATEDNESS MODELS
EXPERIMENTS: ANALYZING THE UNIFIED RELATEDNESS MODEL
• Cosine similarity is significantly worse than BM25 as an individual relatedness function, but it is the most important feature in the unified model
• Connection clarity and transition smoothness contribute 7/15 together
CONCLUSIONS
First attempt at post-click news recommendation
Propose 4 aspects to characterize news relatedness
Future work
Incorporate non-content features into the unified relatedness function
Document and user adaptive measures will be more accurate
RELEVANCE AND NOVELTY
Problem: Top-ranked documents may be redundant or unrelated articles
Solution: Passage retrieval
EXPERIMENTS: PASSAGE RETRIEVAL EVALUATION
o Fixed-length (250, set empirically) arbitrary passage retrieval
o Passage retrieval doesn’t help in most cases
o It clearly improves NDCG1, probably because it relaxes the concern of ranking redundant documents on top
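A minimal sketch of fixed-length passage retrieval is shown below: the candidate is cut into windows of 250 tokens, each window is scored against the seed, and the best window's score represents the document. Treating the length of 250 as a token count, the `stride` of 125, and the pluggable `score_fn` are all assumptions for this example; truly arbitrary passages could start at any token, which corresponds to `stride=1`.

```python
def fixed_length_passages(tokens, length=250, stride=125):
    """Split tokens into fixed-length windows.

    stride is an assumed step between window starts; a final window is
    added so the tail of the document is always covered.
    """
    if len(tokens) <= length:
        return [tokens]
    last_start = len(tokens) - length
    passages = [tokens[i:i + length] for i in range(0, last_start + 1, stride)]
    if last_start % stride != 0:
        passages.append(tokens[last_start:])
    return passages


def best_passage_score(seed_tokens, doc_tokens, score_fn, length=250, stride=125):
    """Score a document by its best-matching passage.

    score_fn(seed_tokens, passage_tokens) is any relatedness feature,
    for example cosine similarity or BM25.
    """
    passages = fixed_length_passages(doc_tokens, length, stride)
    return max(score_fn(seed_tokens, p) for p in passages)
```

Scoring only the best passage means a candidate no longer gets credit for matching the seed across its entire body, which is one plausible reason near-duplicates are less likely to land in the first position.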