Date posted: 15-Jul-2015
Category: Data & Analytics
Uploaded by: julia-kiseleva
Modeling and Detecting
Changes in User Satisfaction
Julia Kiseleva*, Eric Crestan, Riccardo Brigo, Roland Dittel
*Eindhoven University of Technology
Microsoft Bing
What is User Satisfaction?
Assumption: If a “significant” number of users reformulate a query issued with a particular SERP, this indicates a change in user preferences.
[Figure: timeline with the pair <QUERY, SERP> observed at times ti and ti+1, each with its reformulation probability Pr(Ref.), written Pr ti and Pr ti+1; the change is measured as | Pr ti - Pr ti+1 |]
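The quantity | Pr ti - Pr ti+1 | can be illustrated with a minimal sketch. It assumes, hypothetically, that each log record is a tuple (period, query, serp_id, was_reformulated); the record layout, the function names, and the toy counts are illustrative and are not taken from the authors' system.

```python
from collections import defaultdict


def reformulation_probability(records):
    """Estimate Pr(Ref.) per (period, query, serp_id).

    The estimate is the share of impressions of that query/SERP pair
    that were followed by a reformulation in the given period.
    """
    shown = defaultdict(int)
    reformulated = defaultdict(int)
    for period, query, serp_id, was_reformulated in records:
        key = (period, query, serp_id)
        shown[key] += 1
        if was_reformulated:
            reformulated[key] += 1
    return {key: reformulated[key] / shown[key] for key in shown}


def probability_shift(probs, query, serp_id, period_a, period_b):
    """Absolute difference of Pr(Ref.) between two periods for one pair."""
    return abs(probs[(period_a, query, serp_id)] - probs[(period_b, query, serp_id)])


# Toy log (made-up counts): 1 of 10 impressions reformulated at t_i,
# 7 of 10 reformulated at t_i+1.
toy_log = (
    [("t_i", "cikm conference", "serp-1", False)] * 9
    + [("t_i", "cikm conference", "serp-1", True)] * 1
    + [("t_i+1", "cikm conference", "serp-1", False)] * 3
    + [("t_i+1", "cikm conference", "serp-1", True)] * 7
)
probs = reformulation_probability(toy_log)
shift = probability_shift(probs, "cikm conference", "serp-1", "t_i", "t_i+1")
print(round(shift, 2))
```

How large a shift counts as “significant” is not specified on the slide, so the sketch only computes the difference and leaves the decision to the caller.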
How Can We Detect the Changes?
• The literature offers many definitions of query reformulation
• We use query expansion, where terms are added to the original query:
o new years wallpaper IS REFORMULATED WITH 2014
o medals Olympics IS REFORMULATED WITH 2014
o ct 40ez IS REFORMULATED WITH 2013
o march 31 holiday IS REFORMULATED WITH 2014
o …
Detecting Query Reformulation
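The query expansion examples above can be detected with a small helper. This is a sketch under stated assumptions: a follow-up query counts as an expansion when it keeps every term of the original query and adds at least one new term, terms are split on whitespace, and case is ignored. The function name `expansion_terms` is hypothetical.

```python
def expansion_terms(query, next_query):
    """Return the terms that next_query adds to query, or None.

    None is returned when next_query is not a pure expansion, that is,
    when it drops or changes an original term or adds nothing.
    """
    original = query.lower().split()
    remaining = next_query.lower().split()
    for term in original:
        if term not in remaining:
            return None  # an original term is missing: not an expansion
        remaining.remove(term)  # consume one occurrence of the term
    return remaining or None


print(expansion_terms("new years wallpaper", "new years wallpaper 2014"))
print(expansion_terms("march 31 holiday", "easter holiday"))
```

Under these assumptions the first call yields the added term and the second yields None, since the follow-up query replaces terms instead of adding them.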
The Explanation of the Drift
[Figure: before November 2013 vs. after November 2013]
The question: “How can we detect this kind of change?”
Change Detection Techniques
• Change detection techniques
o In dynamically changing and non-stationary environments, the data distribution can change over time, yielding the phenomenon of concept drift
o Real concept drift refers to changes in the conditional distribution of the output (i.e., the target variable) given the input (the input features)
• Concept drift types:
o Sudden/abrupt: disambiguation, such as “flawless Beyoncé” [Figure: data mean over time]
o Incremental: disambiguation, such as “cikm conference 2014” [Figure: data mean over time]
o Gradual: breaking news, such as “idaho bus crash investigation” [Figure: data mean over time]
o Reoccurring: seasonal change, such as “black Friday 2014” [Figure: data mean over time]
[Figure: data mean over time for sudden/abrupt, incremental, gradual, and reoccurring concepts, plus an outlier]
o Sudden/abrupt: disambiguation, such as “medal olympics 2014”
o Outlier (not concept drift)
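To make the drift types concrete, the sketch below generates a synthetic “data mean over time” series for each of the four types. The shapes follow the usual textbook reading of these terms and are an assumption, not a reproduction of the slide's figure; all function names, parameters, and values are hypothetical. Outliers are left out because the slide marks them as not being concept drift.

```python
import random


def sudden(n, change_at, old=0.0, new=1.0):
    """The mean jumps from old to new at a single time step."""
    return [old if t < change_at else new for t in range(n)]


def incremental(n, start, end, old=0.0, new=1.0):
    """The mean moves from old to new through intermediate values."""
    series = []
    for t in range(n):
        progress = min(max((t - start) / (end - start), 0.0), 1.0)
        series.append(old + (new - old) * progress)
    return series


def gradual(n, start, end, old=0.0, new=1.0, seed=0):
    """Old and new concepts alternate; the new one grows more likely."""
    rng = random.Random(seed)  # seeded so the series is reproducible
    series = []
    for t in range(n):
        p_new = min(max((t - start) / (end - start), 0.0), 1.0)
        series.append(new if rng.random() < p_new else old)
    return series


def reoccurring(n, start, end, old=0.0, new=1.0):
    """The new concept holds for a while, then the old one returns."""
    return [new if start <= t < end else old for t in range(n)]


print(sudden(6, 3))
print(incremental(5, 1, 3))
print(reoccurring(6, 2, 4))
```

A seasonal query would resemble `reoccurring` repeated once per season, while a breaking-news query would resemble `gradual` as interest builds.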
Detecting Drifts in Reformulation Signal
Query: “cikm conference”
Reformulation: “2014”
[Figure: reformulation signal along the timeline from t0 through ti to ti+Δt]
Window W0 (t0 to ti, size n0): 0.1 0.1 0.2 0.2 0.3, summarized as E(W0)
Window W1 (ti to ti+Δt, size n1): 0.7 0.8 0.8, summarized as E(W1)
Cause of the rise in W1: the upcoming conference event
If |E(W0) - E(W1)| > eout
Then Drift Detected
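The drift test on the two windows W0 and W1 can be sketched as follows. Two assumptions are made here: E(W) is taken to be the arithmetic mean of the window, and the threshold is a fixed constant. The slide does not say how eout is chosen, and since it lists the window sizes n0 and n1, the real threshold may well depend on them; the value 0.3 below is purely illustrative.

```python
from statistics import mean


def drift_detected(w0, w1, threshold):
    """Return True when the window means differ by more than threshold."""
    if not w0 or not w1:
        raise ValueError("both windows must contain at least one value")
    return abs(mean(w0) - mean(w1)) > threshold


# Reformulation signal for the query "cikm conference" with the
# reformulation "2014", using the values shown on the slide.
w0 = [0.1, 0.1, 0.2, 0.2, 0.3]
w1 = [0.7, 0.8, 0.8]
THRESHOLD = 0.3  # hypothetical stand-in for eout

print(drift_detected(w0, w1, THRESHOLD))
```

With these values the window means are 0.18 and roughly 0.77, so any threshold below their difference of about 0.59 flags a drift.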
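[Figure: change detection pipeline along the timeline]
o Learn reformulation model M from user behavior logs collected between t0 and ti
o Detect changes in model M using incoming user behavior logs between ti and ti+Δt
o If a change is detected, raise an alarm; else do nothing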
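[Figure: the pipeline at a later position on the timeline]
o Learn reformulation model M from user behavior logs in a window of length w1 starting at ti
o Detect changes in model M using incoming user behavior logs between ti+w1 and ti+w1+w2
o If a change is detected, raise the alarm “Change of user satisfaction detected for pairs {<Qi, SERPi>}, 1<i<n”; else do nothing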
1) List of reformulation terms per query
2) List of URLs per reformulation
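The learn, detect, and alarm loop from the pipeline figure can be sketched in a few lines. Everything here is an assumption made for illustration: model M is reduced to a mapping from query to the probability of each reformulation term, a change is flagged when that probability moves by more than a fixed threshold, and alarms are keyed by (query, term) rather than by the <Q, SERP> pairs named on the slide. All names and the toy events are hypothetical.

```python
from collections import Counter


def learn_model(events):
    """Build {query: {term: probability}} from (query, term_or_None) events."""
    issued = Counter()
    added = Counter()
    for query, term in events:
        issued[query] += 1
        if term is not None:
            added[(query, term)] += 1
    model = {}
    for (query, term), count in added.items():
        model.setdefault(query, {})[term] = count / issued[query]
    return model


def detect_changes(reference, incoming, threshold):
    """Return sorted (query, term) pairs whose probability shifted."""
    pairs = {
        (query, term)
        for model in (reference, incoming)
        for query, terms in model.items()
        for term in terms
    }
    return sorted(
        (query, term)
        for query, term in pairs
        if abs(
            incoming.get(query, {}).get(term, 0.0)
            - reference.get(query, {}).get(term, 0.0)
        )
        > threshold
    )


def monitor(train_events, test_events, threshold=0.3):
    """One pipeline step; an empty result means 'do nothing'."""
    return detect_changes(learn_model(train_events), learn_model(test_events), threshold)


train = (
    [("cikm conference", None)] * 9 + [("cikm conference", "2014")] * 1
    + [("black friday", None)] * 8 + [("black friday", "2014")] * 2
)
test = (
    [("cikm conference", None)] * 2 + [("cikm conference", "2014")] * 8
    + [("black friday", None)] * 8 + [("black friday", "2014")] * 2
)
alarms = monitor(train, test)
print(alarms)
```

Comparing over the union of pairs means a reformulation term that disappears from the incoming logs is flagged as well as one that newly appears.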
Experimentation
o The dataset consists of 6 months of behavioral log data from a commercial search engine
o The training window size is one month
o The test window size is two weeks
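The window setup can be illustrated by enumerating train and test windows over a log period. This is a sketch with several assumptions that the slides do not confirm: one month is treated as 30 days, six months as 182 days, and the windows advance by one test window at a time. The start date is a placeholder, not the date of the actual dataset.

```python
from datetime import date, timedelta


def sliding_windows(start, end, train_days=30, test_days=14):
    """Yield (train_start, train_end, test_end) triples.

    Training covers [train_start, train_end) and testing covers
    [train_end, test_end). Windows stop once a test window would
    run past the end of the log period.
    """
    train = timedelta(days=train_days)
    test = timedelta(days=test_days)
    cursor = start
    while cursor + train + test <= end:
        yield cursor, cursor + train, cursor + train + test
        cursor += test  # assumed step: slide forward by one test window


LOG_START = date(2000, 1, 1)  # placeholder start date
LOG_END = LOG_START + timedelta(days=182)  # roughly six months

windows = list(sliding_windows(LOG_START, LOG_END))
for train_start, train_end, test_end in windows[:2]:
    print(train_start, train_end, test_end)
```

A different step size, for example non-overlapping train windows, would only require changing the last line of the loop.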
Conclusion and Future Work
o We successfully leveraged concept drift detection techniques to detect changes in user satisfaction
o The proposed technique works in an unsupervised way
o A large-scale evaluation has been performed
o Classification of the drift type is needed
o Prediction of the lifetime of the drift would help