
Date post: 29-Mar-2015
Page 1: Learning to Re-rank: Query-Dependent Image Re-Ranking using Click Data Manik Varma Microsoft Research India Vidit Jain Yahoo! Labs India.

Learning to Re-rank: Query-Dependent Image Re-Ranking using Click Data

Manik Varma, Microsoft Research India

Vidit Jain, Yahoo! Labs India

Page 2

Keyword-based image search

Best actress, Academy Awards 2011
• “Natalie Portman”

Highest h-index in computer science
• “Scott Shenker”, Professor @ UC Berkeley

Page 3

“Scott Shenker”

similar results from other search engines

Page 4

Our holy grail

Page 5

In this talk

• Huge improvements in performance for tail queries

• Only the raw click data

• Efficient use of visual content

[Figure: bar chart. x-axis: # of clicked images per query, bins < 20 (16), 20-40 (23), 40-60 (15), 60-100 (21), > 100 (118). y-axis: Avg gain in mean(nDCG@20) (%), ticks 0 to 14.]

[Figure: bar chart. y-axis: Avg gain in mean(nDCG@20) (%), ticks 0 to 8.]

Page 6

Limitation 1: Ignore visual content

Simple visual processing could improve results

Page 7

Limitation 2: Static ranker within a vertical

• Score(x) = w^T x with a fixed w

• Query: “taj mahal”

• Query: “delhi”

[Images: delhi.jpg, tajmahal.jpg]

Page 8

Limitation 3: Noisy training labels

Query : “night train”

Page 9

Overview of our solution

[Diagram. Labels: query, Ranker, Original ranked list, clicked, Pseudo-click estimation, f*_text, f*_visual, Re-ranked list]

Page 10

Novelty in our use of click data

• Existing work for text document search
  • linear scanning of ranked list
  • relationship between relevance and clicks
  • [Joachims07, Radlinski08, Agrawal09, Cutrell07, …]

• Our contribution for image search
  • (challenge) 2D display of results
  • (challenge) no model for browsing/click behavior
  • use only raw click data

Page 11

Evidence for clicks-relevance relationship

[Figure: bar chart. y-axis: clicked items that are relevant (%), ticks 0 to 90. Bars: Document search (short snippet), Image search (thumbnail). Citation on slide: [Agichtein06]]

Page 12

Naïve solution – ClickBoosting

• Rich gets richer
• The curious case of “distracting” images
• Little gain when only a few were clicked

[Image panels: Original ranked list, Click-boosting ranked list]
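The slides do not spell out how ClickBoosting is computed. The sketch below assumes the simplest reading: clicked images are promoted above unclicked ones and ordered by click count, while unclicked images keep their original relative order. The function name `click_boost` and the toy image ids are hypothetical.

```python
def click_boost(ranked_ids, clicks):
    """Promote clicked images to the top, most-clicked first.

    ranked_ids: image ids in the original ranked order.
    clicks: dict mapping image id -> raw click count (missing means 0).
    """
    position = {img: i for i, img in enumerate(ranked_ids)}
    clicked = [img for img in ranked_ids if clicks.get(img, 0) > 0]
    unclicked = [img for img in ranked_ids if clicks.get(img, 0) == 0]
    # Descending click count; ties fall back to the original rank.
    clicked.sort(key=lambda img: (-clicks[img], position[img]))
    return clicked + unclicked


# Toy data (assumed, not from the talk).
original = ["a", "b", "c", "d", "e"]
clicks = {"c": 50, "e": 20, "a": 2}
boosted = click_boost(original, clicks)
print(boosted)
```

Under this reading, an image that was never clicked can never move up relative to other unclicked images, which is one way to see why the gain is small when only a few images were clicked.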

Page 13

Pseudo-click estimation: Regression

Query: “night train”

[Figure: images labeled with #clicks 50, 20, 5, 2; a further image labeled 35]

Page 14

Visual features are not enough

Need both visual and text features

Query: “night train”

[Figure: images labeled with #clicks 50, 20, 5, 2; further labels on the slide: 23, “night rod”]

Page 15

Re-ranking function

[Diagram. Labels: query; Compute text features; Compute visual features (color, texture, shape); query dependent; query independent; Regression; y_text; y_visual; s_O; s_R]

Score: $s_R(x) = a_1 s_O(x) + a_2 y_{\text{text}}(x) + a_3 y_{\text{visual}}(x)$
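The re-scoring formula on this slide is a weighted sum of the original ranker score and the two pseudo-click estimates. A minimal sketch follows. The function name `rescore`, the weights, and the per-image numbers are all made up for illustration, and the slides do not say how a1, a2, a3 were chosen.

```python
def rescore(s_o, y_text, y_visual, a=(1.0, 1.0, 1.0)):
    """s_R(x) = a1 * s_O(x) + a2 * y_text(x) + a3 * y_visual(x)."""
    a1, a2, a3 = a
    return a1 * s_o + a2 * y_text + a3 * y_visual


# Hypothetical (s_O, y_text, y_visual) triples for three images.
images = {
    "img1": (0.9, 0.1, 0.2),
    "img2": (0.5, 0.8, 0.7),
    "img3": (0.7, 0.3, 0.1),
}
weights = (1.0, 0.5, 0.5)  # assumed values, for illustration only

scores = {name: rescore(*feats, a=weights) for name, feats in images.items()}
reranked = sorted(scores, key=scores.get, reverse=True)

# Setting a2 = a3 = 0 recovers the original ranking.
baseline = sorted(images, key=lambda n: rescore(*images[n], a=(1.0, 0.0, 0.0)),
                  reverse=True)
print(reranked, baseline)
```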

Page 16

Regression: challenges

• #features (~3000) >> #clicked images (~10)

• Dimensionality reduction
  • unsupervised: only positive labeled data
  • and the winner is… PCA !!!

• Bayesian formulation of regression
  • Gaussian Process regression
  • prevents over-fitting to small no. of examples
  • non-optimized Matlab: 20 ms for a query with 20 clicked images
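To make the dimensionality-reduction step concrete, here is a sketch of PCA on a clicked-image feature matrix with far more features than examples. It assumes NumPy is available (the talk reports a Matlab implementation). The helper names, the random data, and the choice of 5 components are assumptions, since the slides do not state how many components were kept.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the feature mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def pca_transform(X, mean, components):
    """Project rows of X onto the principal directions."""
    return (X - mean) @ components.T


# Toy stand-in: 10 clicked images, 3000 features each (random, not real data).
rng = np.random.default_rng(0)
X_clicked = rng.normal(size=(10, 3000))

mean, components = pca_fit(X_clicked, n_components=5)
Z = pca_transform(X_clicked, mean, components)
print(Z.shape)
```

Because PCA needs no labels, it can be fit on the clicked (positive-only) images, which fits the "unsupervised" bullet above.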

Page 17

Development set for experiments

• Bing: top 1000 retrieved images for 19 tail queries
  • e.g., “gnats”, “Bern”, “fracture”, “child drinking water”

• Relevance
  • highly relevant, relevant, non-relevant
  • referred to parent documents when needed
  • used only for evaluation; not training

• Evaluation
  • nDCG @ 20 (normalized discounted cumulative gain)

Page 18

Pseudo-click estimation: Regression

Baseline: mean nDCG@20 for Bing = 0.6854

Approach             Mean nDCG @ 20   Relative Improvement
Linear Regression    0.6871           +0.2%
GP Regression        0.7692           +12.2%

SVR and 1-NN performance in between

Page 19

Re-scoring Function

$s_R(x) = a_1 s_O(x) + a_2 y_{\text{text}}(x) + a_3 y_{\text{visual}}(x)$

Approach                        Mean nDCG @ 20   Relative Improvement
Baseline (a_2 = a_3 = 0)        0.6854           –
Baseline + y_text (a_3 = 0)     0.7077           +3.3%
Baseline + y_visual (a_2 = 0)   0.6136           -10.5%
Baseline + y_text + y_visual    0.7692           +12.2%
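The "Relative Improvement" column appears to be the percentage change in mean nDCG@20 relative to the Bing baseline of 0.6854. That reading is an assumption, but the quick check below reproduces the reported figures from the table values. The function name is hypothetical.

```python
def relative_improvement(score, baseline):
    """Percentage change of score relative to baseline."""
    return 100.0 * (score - baseline) / baseline


baseline = 0.6854  # mean nDCG@20 for Bing, from the slides
results = {
    "Baseline + y_text": 0.7077,
    "Baseline + y_visual": 0.6136,
    "Baseline + y_text + y_visual": 0.7692,
}

gains = {name: round(relative_improvement(v, baseline), 1)
         for name, v in results.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}%")
```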

Page 20

Evaluation on 193 Queries

[Figure: bar chart. y-axis: Avg gain in mean(nDCG@20) (%), ticks 0 to 8.]

Page 21

Evaluation on 193 Queries

[Figure: bar chart. x-axis: # of clicked images per query, bins < 20 (16), 20-40 (23), 40-60 (15), 60-100 (21), > 100 (118). y-axis: Avg gain in mean(nDCG@20) (%), ticks 0 to 14.]

Page 22

Query: “fracture”

Bing Our results

Clicks predominately (~92%) on images of bone fracture

Page 23

Query: “gnats”

Bing Our results

Re-ranked list has only “highly relevant” images

Page 24

Query: “camel caravan”

Bing Our results

Anecdotally, our results were perceived as more visually pleasant

Page 25

Query: “turkey”

Bing Our results

Multiple interpretations are retained if manifested by clicks

[Figure labels: 305, 446, 81446]

Page 26

Query: “Stargate (1994)”

Bing Our results

Leads to visually diverse results for some queries

Page 27

Conclusions

• Significant improvement in nDCG@20 over commercial image ranking system

• Use of raw click data

• Address three limitations of existing search engines
  • incorporate visual features
  • user clicks to handle noisy “expert” labels
  • query-dependent re-ranking using GP regression

Page 28

Thank You!

Page 29

Additional slides

Page 30

Measuring Search Performance – nDCG

• Given a ranked list of relevance judgments R

• Cumulative Gain at P
  $CG_P(R) = \sum_{i=1}^{P} (2^{R_i} - 1)$

• Discounted Cumulative Gain
  $DCG_P(R) = \sum_{i=1}^{P} (2^{R_i} - 1) / \log_2(i+1)$

• Normalized Discounted Cumulative Gain
  $nDCG_P(R) = DCG_P(R) / DCG_P(I)$
  where I is the judgment for the ideal ranked list
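The nDCG definition on this slide translates directly into a few lines of code. The sketch below assumes relevance grades are mapped to integers (2 = highly relevant, 1 = relevant, 0 = non-relevant), a mapping the slides do not state. It also assumes the ideal list I is obtained by sorting the judged list in decreasing relevance. The function names and the toy judgments are hypothetical.

```python
import math


def dcg_at(rels, p):
    """DCG_P(R) = sum over i = 1..P of (2^R_i - 1) / log2(i + 1)."""
    return sum((2 ** r - 1) / math.log2(i + 1)
               for i, r in enumerate(rels[:p], start=1))


def ndcg_at(rels, p):
    """nDCG_P(R) = DCG_P(R) / DCG_P(I), with I the ideal ordering of rels."""
    ideal = sorted(rels, reverse=True)
    denom = dcg_at(ideal, p)
    return dcg_at(rels, p) / denom if denom > 0 else 0.0


# Toy judgments for a ranked list of four images (assumed grades).
judgments = [1, 2, 0, 2]
score = ndcg_at(judgments, 4)
print(round(score, 4))
```

To compute nDCG@20 as in the talk, pass the judgments for the whole ranked list with `p=20`, so that the ideal ordering is taken over all judged images rather than only the top 20.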

Page 31

Click Estimation - Dimensionality Reduction

• We only have “positive” training data so discriminative methods did not work well (generating negative training data is non-trivial)

• Simple methods did work well

Approach                  Mean nDCG at 20   Relative Improvement
Average click rank        0.6266            -8.6%
Correlation with score    0.7209            +5.2%
Correlation with clicks   0.7409            +8.1%
PCA                       0.7692            +12.2%

Page 32

Gaussian Process regression

• Gaussian Process Regression
  $y(x) = k(x, x_{\mathrm{Train}}) \, [\, k(x_{\mathrm{Train}}, x_{\mathrm{Train}}) + \sigma^2 I \,]^{-1} \, y_{\mathrm{Train}} = d^{\top}(x, x_{\mathrm{Train}}) \, y_{\mathrm{Train}} = w^{\top}(x)$

• where
  • y is the predicted number of clicks and y_Train the number of clicks for the set of training images
  • x are the features extracted from a novel image
  • x_Train are the training set features
  • σ is a noise parameter
  • k is a Gaussian kernel function
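The GP predictive mean on this slide can be sketched in a few lines. This is an illustration of the formula only, not the authors' implementation (the talk mentions non-optimized Matlab). It assumes NumPy is available. The helper names, the kernel length scale, the value of the noise parameter, and the toy features are all hypothetical. The click counts 50, 20, 5, 2 echo the "night train" slide, but the features are invented.

```python
import numpy as np


def gaussian_kernel(A, B, length_scale=1.0):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * length_scale ** 2))


def gp_predict(X_train, y_train, X_new, sigma=0.1, length_scale=1.0):
    """Predictive mean: k(x, X) [k(X, X) + sigma^2 I]^-1 y."""
    K = gaussian_kernel(X_train, X_train, length_scale)
    # Solve the linear system instead of forming the inverse explicitly.
    alpha = np.linalg.solve(K + sigma ** 2 * np.eye(len(X_train)), y_train)
    return gaussian_kernel(X_new, X_train, length_scale) @ alpha


# Toy reduced features for four clicked images and their click counts.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
y_train = np.array([50.0, 20.0, 5.0, 2.0])

# One image near the most-clicked example, one far from all examples.
X_new = np.array([[0.1, 0.0], [10.0, 10.0]])
pred = gp_predict(X_train, y_train, X_new)
print(pred)
```

With a zero-mean prior, as in the formula, the estimate for an image far from every clicked example decays toward zero pseudo-clicks, while an image close to a heavily clicked example inherits a large estimate.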

Page 33

Click Estimation - Regression

Approach                    Mean nDCG at 20   Relative Improvement
Linear Regression           0.6871            +0.2%
Support Vector Regression   0.6997            +2.1%
Nearest Neighbour           0.7428            +8.3%
GP Regression               0.7692            +12.2%

