Modelling Relevance and User Behaviour in Sponsored Search using Click-Data Adarsh Prasad, IIT Delhi...

Modelling Relevance and User Behaviour in Sponsored Search using Click-Data

Adarsh Prasad, IIT Delhi
Advisors: Dinesh Govindaraj, SVN Vishwanathan*
Group: Revenue and Relevance

* Visiting Researcher from Purdue

Overview

• Click-Data seems to be the perfect source of information when deciding which Ads to show in answer to a query. It can be thought of as the result of users voting in favour of the documents they find interesting.

• This information can be fed into the ranker to tune search parameters, or even used as training points for the ranker.

• The aim of the project is to develop a model which takes in Click-Data and generates output, in the form of constraints or an updated ranking score, as input to the ranker.


Motivation

• Quality of training points is of critical importance for learning a ranking function.

• Currently, labeled data is collected using human judges. Human labeling is time-consuming and labor-intensive.

• Need to ensure “temporal relevance” of Ads, i.e. something relevant today might not be relevant 6 months later. Labeling must therefore be repeated, and there is a need to automate the labeling process.

Main Difficulty: Presentation Bias

• Results at lower positions are less likely to be clicked even if they are relevant. (Position)
• Clicks depend on other Ads being shown. (Externalities)

Example [1]

Query: myspace
URL = www.myspace.com
Market = U.K.

Ranking 1:
Pos 1: uk.myspace.com: ctr = 0.97
Pos 2: www.myspace.com: ctr = 0.11

Ranking 2:
Pos 1: www.myspace.com: ctr = 0.97

[1] Olivier Chapelle et al. A Dynamic Bayesian Network Click Model for Web Search Ranking


Procedure

For learning a web search function, clicks can be used as a target [2] or as a feature [3].

• Use of Click Data as target: useful for markets with few editorial judgments.

• Train on pairwise preferences. Two sets of preferences: PE from editorial judgments and PC from click modeling.

Minimize:

Target

1. Derive preference relations on the basis of the click pattern and feed them as constraints to the ranker (Rocky-Road):
   • Position and order-of-click based constraints [4]
   • Aggregate constraints
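
One well-known position-based rule from [4] is "click > skip above": a clicked result is preferred over every unclicked result ranked above it. The sketch below illustrates that rule only. Whether the project used exactly this rule is an assumption, and the impression format (a ranked list of ad ids plus a set of clicked ids) and the function name are hypothetical.

```python
from typing import List, Set, Tuple


def click_skip_above(ranking: List[str], clicked: Set[str]) -> List[Tuple[str, str]]:
    """Return (preferred, less_preferred) pairs for one impression.

    Rule: a clicked ad is preferred over each unclicked ad shown above it.
    """
    prefs = []
    for pos, ad in enumerate(ranking):
        if ad not in clicked:
            continue
        for above in ranking[:pos]:
            if above not in clicked:
                prefs.append((ad, above))
    return prefs


# Hypothetical impression: four ads shown, only the third was clicked.
pairs = click_skip_above(["ad_a", "ad_b", "ad_c", "ad_d"], {"ad_c"})
```

Each pair can then be passed to a pairwise ranker as a constraint of the form score(preferred) > score(less_preferred). A click on the top ad yields no pairs, which is how the rule avoids rewarding position alone.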

Feature

1. Sample clicked Ads and label them as relevant.

2. Types of sampling:
   • Random
   • Position-based weighted: a user clicking the ml-4 Ad is a stronger signal of relevance than a user clicking ml-1.

3. Feed them to the binary classifier.
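
The two sampling types can be sketched as follows. The slides do not give the weighting scheme, so the linear weight (ml-4 counts four times as much as ml-1) is an assumption, as are the record layout and the function names.

```python
import random
from typing import List, Tuple

Click = Tuple[str, int]  # (ad id, mainline position 1..4); hypothetical record layout


def position_weight(position: int) -> float:
    # Assumed weighting: linear in position, so a click at ml-4 weighs 4x a click at ml-1.
    return float(position)


def sample_clicked_ads(clicks: List[Click], k: int, weighted: bool,
                       rng: random.Random) -> List[Click]:
    """Draw k clicked ads (with replacement) to be labeled as relevant."""
    if not weighted:
        return [rng.choice(clicks) for _ in range(k)]
    weights = [position_weight(pos) for _, pos in clicks]
    return rng.choices(clicks, weights=weights, k=k)


clicks = [("ad_a", 1), ("ad_b", 4), ("ad_c", 2)]
rng = random.Random(0)
positives = sample_clicked_ads(clicks, k=1000, weighted=True, rng=rng)
share_ml4 = sum(1 for _, pos in positives if pos == 4) / len(positives)
```

With these weights the ml-4 click is expected to make up about 4/7 of the weighted sample, against 1/3 under random sampling.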

[2] Joachims et al. Optimizing Search Engines using Clickthrough Data
[3] Agichtein et al. Improving Web Search Ranking by Incorporating User Behavior Information
[4] Joachims et al. Accurately Interpreting Clickthrough Data as Implicit Feedback


Results

EXACTMATCH BROADMATCH PHRASEMATCH SMARTMATCH
Sampling +0.39% +1.02%
Position and Order Constraints +1.22% +5.93% +4.15% +0.38%
Aggregate Constraints +0.2% +5.17% +0.77% +0.5%

SAME SUPERSET DISJOINT
Sampling +5.72% +4.22%
+3.1% +2.28%
+7.4% +5.28%
-6.28%
-3.9%
-11.3%

Fisher Score =
-0.06% -0.5%

Log Loss (Label Based)
Sampling +0.001%
+3.07%
+1.75%

Weighted LL

Background on Click Models

• Use CTR (click-through rate) data.
• Pr(click) = Pr(examination) x Pr(click | examination), where Pr(click | examination) is the relevance.
• Need user browsing models to estimate Pr(examination).


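Notation

• Φ(i): result at position i

• Examination event: $E_i = \begin{cases} 1, & \text{if the user examined } \Phi(i) \\ 0, & \text{otherwise} \end{cases}$

• Click event: $C_i = \begin{cases} 1, & \text{if the user clicked on } \Phi(i) \\ 0, & \text{otherwise} \end{cases}$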


Examination Hypothesis

Richardson et al, WWW 2007: Pr(Ci = 1) = Pr(Ei = 1) Pr(Ci = 1 | Ei = 1)

• αi = Pr(Ei = 1): position bias
  • Depends solely on position.
  • Can be estimated by looking at CTR of the same result in different positions.
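
Under the examination hypothesis, the CTR of a result at position i is αi times its relevance, so the ratio of its CTR at position i to its CTR at position 1 equals αi / α1. The sketch below averages that ratio over results seen at both positions, fixing α1 = 1. The data layout, the numbers, and the function name are hypothetical, and plain averaging is only one simple way to combine the ratios.

```python
from collections import defaultdict
from typing import Dict, Tuple


def estimate_position_bias(ctr: Dict[Tuple[str, int], float]) -> Dict[int, float]:
    """ctr maps (result id, position) -> observed CTR.

    Returns alpha_i relative to alpha_1 = 1, using only results that were
    also observed at position 1 with a non-zero CTR.
    """
    ratios = defaultdict(list)
    for (result, pos), rate in ctr.items():
        top = ctr.get((result, 1))
        if top:
            ratios[pos].append(rate / top)
    return {pos: sum(r) / len(r) for pos, r in sorted(ratios.items())}


# Made-up CTRs for two ads shown at several positions.
ctr = {
    ("ad_a", 1): 0.20, ("ad_a", 2): 0.10,
    ("ad_b", 1): 0.08, ("ad_b", 2): 0.04, ("ad_b", 3): 0.02,
}
alpha = estimate_position_bias(ctr)
```

In this toy data both ads halve their CTR when moved from position 1 to position 2, so the estimate for position 2 is half the position 1 bias regardless of how relevant each ad is.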


Using Prior Clicks

Results R1 to R5, clicks on R1 and R3: Pr(E5 | C1,C3) = 0.5
Results R1 to R5, click on R1: Pr(E5 | C1) = 0.3

Examination depends on prior clicks.

• Cascade model
• Dependent click model (DCM)
• User browsing model (UBM) [Dupret & Piwowarski, SIGIR 2008]
  • More general and more accurate than Cascade, DCM.
  • Conditions Pr(examination) on closest prior click.
• Bayesian browsing model (BBM) [Liu et al, KDD 2009]
  • Same user behavior model as UBM.
  • Uses Bayesian paradigm for relevance.


User browsing model (UBM)

• Use position of closest prior click to predict Pr(examination).

  Pr(Ei = 1 | C1:i-1) = αi βi,p(i)

  Pr(Ci = 1 | C1:i-1) = Pr(Ei = 1 | C1:i-1) Pr(Ci = 1 | Ei = 1)

• αi: position bias
• p(i) = position of closest prior click
• Prior clicks don’t affect relevance.
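
The two UBM equations above can be evaluated position by position once αi, βi,p(i) and the relevance Pr(Ci = 1 | Ei = 1) are known. This is a minimal sketch with made-up parameter values. The convention p(i) = 0 when nothing above position i has been clicked is an assumption, as is the function name.

```python
from typing import Dict, List, Tuple


def ubm_click_probs(
    relevance: List[float],
    clicks: List[int],
    alpha: Dict[int, float],
    beta: Dict[Tuple[int, int], float],
) -> List[float]:
    """Pr(C_i = 1 | C_1:i-1) for each position i, given the observed clicks."""
    probs = []
    prior = 0  # assumed convention: p(i) = 0 when there is no prior click
    for i, (rel, clicked) in enumerate(zip(relevance, clicks), start=1):
        p_exam = alpha[i] * beta[(i, prior)]  # Pr(E_i = 1 | C_1:i-1)
        probs.append(p_exam * rel)            # times Pr(C_i = 1 | E_i = 1)
        if clicked:
            prior = i
    return probs


# Made-up parameters for a three-slot page; the user clicked position 1 only.
alpha = {1: 1.0, 2: 0.8, 3: 0.5}
beta = {(1, 0): 1.0, (2, 0): 0.5, (2, 1): 0.9, (3, 0): 0.3, (3, 1): 0.6, (3, 2): 0.8}
probs = ubm_click_probs([0.5, 0.4, 0.2], [1, 0, 0], alpha, beta)
```

The relevance values enter only through the last factor, which reflects the statement that prior clicks do not affect relevance in this model.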

Other Related Work

• Examination depends on prior clicks and prior relevance
  • Click chain model (CCM)
  • General click model (GCM)
• Post-click models
  • Dynamic Bayesian model
  • Session utility model


User Browsing in Sponsored Search

• Is user browsing in sponsored search similar to browsing in Web Search?
• Generally, the assumption in organic search is that users examine and click in a linear top-to-bottom fashion.
• We observed that for sponsored search, where the number of returned results is small, a fair share (~30%) of users click out of order.
• Users behaving in a non-linear fashion is a strong signal, which may contain important information.
• Combining position and temporal behavior of user.

The statistic x that has been counted is the difference between the positions of temporally consecutive clicks.

Example: if the user clicks on ml1 and then ml2, then x = -1; if ml2 and then ml1, then x = 1; and so on.
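
The statistic x can be counted from click logs as sketched below, using the sign convention of the example (position of the earlier click minus position of the later click). The session format, a list of clicked positions in time order, and the function name are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def click_order_histogram(sessions: Iterable[List[int]]) -> Counter:
    """Count x = (earlier click position) - (later click position)
    over every pair of consecutive clicks in each session."""
    hist = Counter()
    for positions in sessions:
        for earlier, later in zip(positions, positions[1:]):
            hist[earlier - later] += 1
    return hist


# Made-up sessions: each list holds clicked mainline positions in time order.
sessions = [[1, 2], [2, 1], [1, 3, 2], [4]]
hist = click_order_histogram(sessions)

# x > 0 means the later click was higher on the page, i.e. out of order.
out_of_order = sum(n for x, n in hist.items() if x > 0) / sum(hist.values())
```

Sessions with a single click contribute nothing to the histogram, so the out-of-order share is measured over consecutive click pairs rather than over users.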

A New Model

• Allow users to move in a non-linear fashion.
• Also, incorporate the notion of externalities, i.e. perceived relevance changes with other clicks.

For learning our parameters, we can use the EM algorithm.
(1) In the E step, we estimate our hidden variables by a forward-backward algorithm.
(2) In the M step, we have closed-form solutions to maximize the expected log-likelihood.
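
The slides do not spell out the new model, so the sketch below is not that model. It shows the same E step / M step pattern on a simpler stand-in, the position-based examination hypothesis Pr(Ci = 1) = αi x relevance. In this stand-in the hidden variables are independent per impression, so the E step is a closed-form posterior rather than a forward-backward pass. The log format and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Impression = Tuple[str, int, int]  # (result id, position, clicked 0/1); hypothetical format


def em_position_model(log: List[Impression], iters: int = 50
                      ) -> Tuple[Dict[int, float], Dict[str, float]]:
    alpha = defaultdict(lambda: 0.5)  # Pr(E_i = 1) per position
    rel = defaultdict(lambda: 0.5)    # Pr(C = 1 | E = 1) per result
    for _ in range(iters):
        exam_sum, exam_n = defaultdict(float), defaultdict(int)
        rel_sum, rel_n = defaultdict(float), defaultdict(int)
        # E step: posterior of "examined" and "found relevant" per impression.
        for result, pos, clicked in log:
            a, r = alpha[pos], rel[result]
            if clicked:
                p_exam, p_rel = 1.0, 1.0
            else:
                denom = max(1.0 - a * r, 1e-12)
                p_exam = a * (1.0 - r) / denom
                p_rel = r * (1.0 - a) / denom
            exam_sum[pos] += p_exam
            exam_n[pos] += 1
            rel_sum[result] += p_rel
            rel_n[result] += 1
        # M step: closed-form updates, the average of the posteriors.
        for pos in exam_n:
            alpha[pos] = exam_sum[pos] / exam_n[pos]
        for result in rel_n:
            rel[result] = rel_sum[result] / rel_n[result]
    return dict(alpha), dict(rel)


# Made-up log: ad_a is clicked more than ad_b, position 1 more than position 2.
log = (
    [("ad_a", 1, 1)] * 6 + [("ad_a", 1, 0)] * 4
    + [("ad_a", 2, 1)] * 3 + [("ad_a", 2, 0)] * 7
    + [("ad_b", 1, 1)] * 2 + [("ad_b", 1, 0)] * 8
    + [("ad_b", 2, 1)] * 1 + [("ad_b", 2, 0)] * 9
)
alpha_hat, rel_hat = em_position_model(log)
```

A model with non-linear movement and externalities would couple the hidden states across positions within a session, which is presumably why the slide's E step needs forward-backward instead of the per-impression posterior used here.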

Date post:	14-Dec-2015