Learning to Re-rank: Query-Dependent Image
Re-Ranking using Click Data
Manik Varma, Microsoft Research India
Vidit Jain, Yahoo! Labs India
Keyword-based image search
• Best actress, Academy Awards 2011: “Natalie Portman”
• Highest h-index in computer science: “Scott Shenker” (Professor @ UC Berkeley)
similar results from other search engines
Our holy grail
• Huge improvements in performance for tail queries
• Only the raw click data
• Efficient use of visual content
In this talk
[Bar chart: average gain in mean(nDCG@20) (%) vs. # of clicked images per query, in buckets <20 (16 queries), 20–40 (23), 40–60 (15), 60–100 (21), >100 (118)]
Limitation 1: Ignore visual content
Simple visual processing could improve results
• Score(x) = wᵀx with a fixed w
• Query : “taj mahal”
• Query : “delhi”
Limitation 2: Static ranker within a vertical
delhi.jpg
tajmahal.jpg
Limitation 3: Noisy training labels
Query : “night train”
Overview of our solution
[Diagram: query → ranker → original ranked list → pseudo-click estimation over clicked images (f*text, f*visual) → re-ranked list]
• Existing work for text document search
  • linear scanning of ranked list
  • relationship between relevance and clicks
  • [Joachims07, Radlinski08, Agrawal09, Cutrell07, …]
• Our contribution for image search
  • (challenge) 2D display of results
  • (challenge) no model for browsing/click behavior
  • use only raw click data
Novelty in our use of click data
Evidence for clicks-relevance relationship
[Bar chart: clicked items that are relevant (%), document search (short snippet) vs. image search (thumbnail) [Agichtein06]]
• Rich gets richer
• The curious case of “distracting” images
• Little gain when only a few images were clicked
Naïve solution – ClickBoosting
[Illustration: original ranked list vs. click-boosting ranked list; clicked images (e.g., with 50, 35, 20, 5, and 2 clicks) are promoted to the top]
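The ClickBoosting baseline above can be written as a stable partition of the ranked list. A minimal sketch, assuming click data arrives as a dict from image id to raw click count (the data format here is an assumption, not the talk's):

```python
def click_boost(ranked_list, clicks):
    """Naive ClickBoosting: move clicked images above unclicked ones,
    ordering clicked images by descending click count and keeping the
    original search-engine order as a tie-breaker."""
    clicked = [img for img in ranked_list if clicks.get(img, 0) > 0]
    unclicked = [img for img in ranked_list if clicks.get(img, 0) == 0]
    # list.sort() is stable, so ties keep their original rank order
    clicked.sort(key=lambda img: -clicks[img])
    return clicked + unclicked

# Toy example: five ranked images, clicks observed on c and e
order = click_boost(["a", "b", "c", "d", "e"], {"c": 50, "e": 35})
print(order)  # ['c', 'e', 'a', 'b', 'd']
```

This makes the "rich gets richer" failure mode visible: unclicked but relevant images can never rise above any clicked image, however noisy the clicks.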
Pseudo-click estimation: Regression
Query: “night train”
[Illustration: clicked images with their click counts (e.g., 50, 23, 20, 5, 2) used as regression targets]
Visual features are not enough
Need both visual and text features
Query: “night train”
[Illustration: clicked “night rod” images and their click counts]
Re-ranking function
[Diagram: query → compute text features (query dependent) and visual features: color, texture, shape (query independent) → regression → yText, yVisual]
Score: sR(x) = a1 sO(x) + a2 yText(x) + a3 yVisual(x), where sO is the original ranker’s score and sR the re-ranking score
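The re-scoring step is a fixed linear combination of the original score and the two regressed pseudo-click estimates. A minimal sketch; the weight values are illustrative assumptions (the talk does not state them):

```python
def rescore(s_orig, y_text, y_visual, a=(1.0, 1.0, 1.0)):
    """Combined score sR(x) = a1*sO(x) + a2*yText(x) + a3*yVisual(x).
    The default weights are placeholders, not the paper's values."""
    a1, a2, a3 = a
    return a1 * s_orig + a2 * y_text + a3 * y_visual

# Re-rank (image, sO, yText, yVisual) tuples by the combined score
images = [("img1", 0.9, 0.1, 0.2), ("img2", 0.5, 0.8, 0.7)]
ranked = sorted(images, key=lambda t: -rescore(*t[1:]))
print([name for name, *_ in ranked])  # ['img2', 'img1']
```

Setting a2 = a3 = 0 recovers the baseline ranking, which is exactly the ablation reported in the re-scoring table.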
Regression: challenges
• #features (~3000) >> #clicked images (~10)
• Dimensionality reduction
  • unsupervised: only positive labeled data
  • and the winner is… PCA!
• Bayesian formulation of regression
  • Gaussian Process regression
  • prevents over-fitting to a small no. of examples
  • non-optimized Matlab: 20 ms for a query with 20 clicked images
Development set for experiments
• Bing: top 1000 retrieved images for 19 tail queries
  • e.g., “gnats”, “Bern”, “fracture”, “child drinking water”
• Relevance
  • highly relevant, relevant, non-relevant
  • referred to parent documents when needed
  • used only for evaluation; not training
• Evaluation
  • nDCG@20 (normalized discounted cumulative gain)
Pseudo-click estimation: Regression
Baseline: mean nDCG@20 for Bing = 0.6854

Approach            Mean nDCG@20    Relative improvement
Linear Regression   0.6871          +0.2%
GP Regression       0.7692          +12.2%

SVR and 1-NN performance lie in between
Re-scoring Function
sR(x) = a1 sO(x) + a2 yText(x) + a3 yVisual(x)

Approach                       Mean nDCG@20    Relative improvement
Baseline (a2 = a3 = 0)         0.6854          -
Baseline + yText (a3 = 0)      0.7077          +3.3%
Baseline + yVisual (a2 = 0)    0.6136          -10.5%
Baseline + yText + yVisual     0.7692          +12.2%
Evaluation on 193 Queries
[Bar chart: average gain in mean(nDCG@20) (%) across the 193 queries]
[Bar chart: average gain in mean(nDCG@20) (%) by # of clicked images per query: <20 (16 queries), 20–40 (23), 40–60 (15), 60–100 (21), >100 (118)]
Query: “fracture”
Bing Our results
Clicks were predominantly (~92%) on images of bone fractures
Query: “gnats”
Bing Our results
Re-ranked list has only “highly relevant” images
Query: “camel caravan”
Bing Our results
Anecdotally, our results were perceived as more visually pleasant
Query: “turkey”
Bing Our results
Multiple interpretations are retained if manifested by clicks
Query: “Stargate (1994)”
Bing Our results
Leads to visually diverse results for some queries
Conclusions
• Significant improvement in nDCG@20 over a commercial image ranking system
• Use of only the raw click data
• Address three limitations of existing search engines
  • incorporate visual features
  • user clicks to handle noisy “expert” labels
  • query-dependent re-ranking using GP regression
Thank You!
Additional slides
Measuring Search Performance – nDCG
• Given a ranked list of relevance judgments R
• Cumulative Gain at P: CG_P(R) = Σ_{i=1..P} (2^{R_i} − 1)
• Discounted Cumulative Gain: DCG_P(R) = Σ_{i=1..P} (2^{R_i} − 1) / log2(i + 1)
• Normalized Discounted Cumulative Gain: nDCG_P(R) = DCG_P(R) / DCG_P(I), where I is the judgment for the ideal ranked list
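The definitions above translate directly into code. A minimal sketch of nDCG@P on a graded relevance scale (the 0/1/2 grading below is an assumption matching the non-relevant / relevant / highly relevant scale of the development set):

```python
import math

def dcg_at_p(rels, p=20):
    """DCG_P(R) = sum_{i=1..P} (2^{R_i} - 1) / log2(i + 1)."""
    return sum((2 ** r - 1) / math.log2(i + 1)
               for i, r in enumerate(rels[:p], start=1))

def ndcg_at_p(rels, p=20):
    """nDCG_P(R) = DCG_P(R) / DCG_P(I), where I is the same set of
    judgments sorted into the ideal (descending) order."""
    ideal = dcg_at_p(sorted(rels, reverse=True), p)
    return dcg_at_p(rels, p) / ideal if ideal > 0 else 0.0

# A highly relevant image at rank 1, a non-relevant one at rank 2
print(ndcg_at_p([2, 0, 1], p=3))  # ~0.964: the swap at ranks 2-3 costs little
```

Because the log2(i + 1) discount shrinks with rank, mistakes near the top of the list are penalized far more than mistakes near rank 20.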
Click Estimation - Dimensionality Reduction
• We only have “positive” training data, so discriminative methods did not work well (generating negative training data is non-trivial)
• Simple methods did work well

Approach                  Mean nDCG@20    Relative improvement
Average click rank        0.6266          -8.6%
Correlation with score    0.7209          +5.2%
Correlation with clicks   0.7409          +8.1%
PCA                       0.7692          +12.2%
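The winning unsupervised reduction, PCA, can be sketched with a plain SVD. The dimensions are illustrative (the talk reports ~3000 features and ~10 clicked images per query); this is a generic PCA sketch, not the authors' exact pipeline:

```python
import numpy as np

def pca_project(X, k):
    """Center the feature matrix X (n_images x n_features) and project
    onto the top-k principal directions found by SVD."""
    mu = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    W = Vt[:k].T                     # n_features x k projection matrix
    return (X - mu) @ W, mu, W

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3000))      # 10 clicked images, 3000-dim features
Z, mu, W = pca_project(X, k=5)
print(Z.shape)  # (10, 5)
```

With only ~10 positive examples the centered data has rank at most 9, so a handful of components already captures all the variance, which is why an unsupervised reduction is viable here while discriminative methods are not.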
Gaussian Process regression
• y(x) = k(x, xTrain) [ k(xTrain, xTrain) + σ²I ]⁻¹ yTrain = dᵀ(x, xTrain) yTrain
• where
  • y is the predicted number of clicks and yTrain the number of clicks for the set of training images
  • x are the features extracted from a novel image
  • xTrain are the training set features
  • σ is a noise parameter
  • k is a Gaussian kernel function
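The GP mean prediction above is a single linear solve. A minimal NumPy sketch, where the kernel bandwidth gamma and noise sigma are illustrative assumptions rather than the paper's settings:

```python
import numpy as np

def gp_predict(x, X_train, y_train, sigma=0.1, gamma=1.0):
    """GP regression mean: y(x) = k(x, X) [K + sigma^2 I]^{-1} y_train,
    with a Gaussian (RBF) kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    def kernel(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    K = kernel(X_train, X_train)
    k_x = kernel(x[None, :], X_train)                  # 1 x n
    # Solve (K + sigma^2 I) alpha = y_train instead of inverting K
    alpha = np.linalg.solve(K + sigma**2 * np.eye(len(X_train)), y_train)
    # Equivalently y(x) = d(x, X)^T y_train with d = [K + sigma^2 I]^{-1} k_x
    return (k_x @ alpha).item()

# Toy example: predict pseudo-clicks for a new image's feature vector
X_train = np.array([[0.0, 0.0], [1.0, 1.0]])
y_train = np.array([50.0, 5.0])                        # observed click counts
print(gp_predict(np.array([0.1, 0.1]), X_train, y_train))
```

The σ²I term regularizes the solve, which is what "prevents over-fitting to a small no. of examples" refers to: with ~10–20 clicked images per query this whole computation is tiny, consistent with the reported 20 ms in non-optimized Matlab.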
Click Estimation - Regression

Approach                    Mean nDCG@20    Relative improvement
Linear Regression           0.6871          +0.2%
Support Vector Regression   0.6997          +2.1%
Nearest Neighbour           0.7428          +8.3%
GP Regression               0.7692          +12.2%