Query Session Detection as a Cascade
Matthias Hagen, Benno Stein, Tino Rüb
Bauhaus-Universität Weimar
[email protected]
SIR 2011, Dublin, Ireland, April 18, 2011
Hagen, Stein, Rüb Query Session Detection as a Cascade 1
Introduction Motivation
It’s quiz time!
What is the user searching?
paris hilton
Without context . . .
paris hilton
[Figure: photo of Paris Hilton]
What if you knew the previous queries?
paris hotels
paris marriott
paris hyatt
paris hilton
[Figure: a Hilton hotel in Paris, a map, and the Hilton brand logo]
Query sessions: same information need
The benefits
Improved understanding of user intent
Improved retrieval performance via session knowledge
The “minor” issue
Users do not announce when querying for a new information need.
A typical query log
User  Query                Click domain          Click rank  Time
773   istanbul             en.wikipedia.org      1           2011-04-16 20:34:17
773   istanbul archeology  -                     -           2011-04-17 12:02:54
773   istanbul archeology  www.kulturturizm.tr   6           2011-04-17 12:03:15
773   istanbul archeology  www.arkeoloji.gov.tr  13          2011-04-17 18:24:07
773   constantinople       -                     -           2011-04-17 19:00:40
773   constantinople       www.roman-empire.net  4           2011-04-17 19:01:02
773   hurling              -                     -           2011-04-17 19:03:01
773   hurling              en.wikipedia.org      1           2011-04-17 19:03:05
773   liam mccarthy cup    -                     -           2011-04-17 23:33:04
773   liam mccarthy cup    www.hurling.net       5           2011-04-17 23:33:12
773   liam mccarthy cup    starbets.ie           16          2011-04-18 12:42:48
How to determine the break points?
User  Query                Click domain          Click rank  Time
773   istanbul             en.wikipedia.org      1           2011-04-16 20:34:17
773   istanbul archeology  -                     -           2011-04-17 12:02:54
773   istanbul archeology  www.kulturturizm.tr   6           2011-04-17 12:03:15
773   istanbul archeology  www.arkeoloji.gov.tr  13          2011-04-17 18:24:07
773   constantinople       -                     -           2011-04-17 19:00:40
773   constantinople       www.roman-empire.net  4           2011-04-17 19:01:02
- - - - - - - - - - - - - - - - - - (break point)
773   hurling              -                     -           2011-04-17 19:03:01
773   hurling              en.wikipedia.org      1           2011-04-17 19:03:05
773   liam mccarthy cup    -                     -           2011-04-17 23:33:04
773   liam mccarthy cup    www.hurling.net       5           2011-04-17 23:33:12
773   liam mccarthy cup    starbets.ie           16          2011-04-18 12:42:48
Introduction The Problem
The key is . . .
Automatic query session detection
Usual “technique”
For each pair of consecutive queries, check whether they express the same or a new information need.
Example
773  istanbul             2011-04-16 20:34:17
                                               ✓ same
773  istanbul archeology  2011-04-17 18:24:07
                                               ✓ same
773  constantinople       2011-04-17 19:01:02
- - - - - - - - - - - - - - - - - -            new
773  hurling              2011-04-17 19:03:05
Introduction Related Work
Typical features
Temporal thresholds   5 minutes             [Silverstein et al., 1999]
                      10–15 minutes         [He and Göker, 2000]
                      30 minutes            [Downey et al., 2007]
                      user specific         [Murray et al., 2006]
Lexical similarity    n-gram overlap        [Zhang and Moffat, 2006]
                      Levenshtein distance  [Jones and Klinkner, 2008]
Semantic similarity   Search results        [Radlinski and Joachims, 2005]
                      ESA                   [Lucchese et al., 2011]
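To make the simplest feature concrete, here is a minimal sketch of a fixed temporal threshold, assuming a 30-minute cutoff and four timestamps taken from the example log of user 773. The function name `split_by_time_gap` is made up for illustration and is not from any of the cited papers.

```python
from datetime import datetime, timedelta

def split_by_time_gap(timestamps, threshold):
    """Return the indices i at which a new session starts, i.e. where the
    gap between timestamps[i - 1] and timestamps[i] exceeds the threshold."""
    breaks = []
    for i in range(1, len(timestamps)):
        if timestamps[i] - timestamps[i - 1] > threshold:
            breaks.append(i)
    return breaks

FMT = "%Y-%m-%d %H:%M:%S"
log_times = [datetime.strptime(t, FMT) for t in [
    "2011-04-17 18:24:07",  # istanbul archeology
    "2011-04-17 19:00:40",  # constantinople
    "2011-04-17 19:03:01",  # hurling
    "2011-04-17 23:33:04",  # liam mccarthy cup
]]

# Assumed cutoff: 30 minutes.
breaks_30min = split_by_time_gap(log_times, timedelta(minutes=30))
print(breaks_30min)
```

On this excerpt the threshold places breaks before "constantinople" and before "liam mccarthy cup", but none before "hurling", which is where the information need actually changes in the example log.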
Previous methods
Observations
Temporal thresholds: fast but bad accuracy
Feature combinations: more accurate
One of the best: Geometric method (time + lexical) [Gayo-Avello, 2009]
Shortcomings
All features evaluated simultaneously → runtime
Geometric method ignores semantics → accuracy
Examples
Subset test suffices
hurling
    ✓ same
hurling gaa
Geometric method fails
hurling
    ✓ same
mccarthy cup
Cascading Method The Framework
We address the shortcomings in a cascade . . .
. . . well . . . a small 4-step cascade
Step 1: Subset tests
↘ Step 2: Geometric method
↘ Step 3: ESA similarity
↙ Step 4: Search results
Basic Idea
Increased feature cost (runtime) from step to step.
Expensive features only if previous steps “unreliable.”
Cascading Method Step 1: Subset tests
Simple string comparison
Criterion
Consecutive queries q and q′ in same session if q sub- or superset of q′.
Else: Goto Step 2.
Remarks: Repetition, specialization, or generalization.
Time gap = continuing a pending session.
Example
Repetition     Specialization   Generalization
hurling        hurling          hurling gaa
  ✓ same         ✓ same           ✓ same
hurling        hurling gaa      hurling
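The subset test can be sketched in a few lines. This is a minimal illustration that assumes queries are compared as sets of lowercased, whitespace-separated terms; the slides do not specify the tokenization, and `subset_test` is a hypothetical name.

```python
def subset_test(q, q_next):
    """Step 1 sketch: True if the term set of one query contains the other's.

    False does not mean "new session"; it only means that Step 1 cannot
    decide and the pair is handed on to Step 2.
    """
    terms = set(q.lower().split())
    terms_next = set(q_next.lower().split())
    return terms <= terms_next or terms >= terms_next

examples = [
    ("hurling", "hurling"),        # repetition
    ("hurling", "hurling gaa"),    # specialization
    ("hurling gaa", "hurling"),    # generalization
    ("hurling", "mccarthy cup"),   # no term overlap: defer to Step 2
]
results = [subset_test(q, q_next) for q, q_next in examples]
print(results)
```

Note that the time gap is deliberately ignored here, in line with the remark that a repeated or refined query after a pause continues a pending session.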
Cascading Method Step 2: Geometric method
Combination of temporal and lexical features [Gayo-Avello, 2009]
For consecutive queries q and q′
$f_{\mathrm{temp}} = \max\{0,\ 1 - \frac{t}{24\,\mathrm{h}}\}$, where $t$ is the time between q and q′
$f_{\mathrm{lex}}$ = cosine similarity of the 3- to 5-grams of q′ and s, where s is the session of q
Criterion (original)
Consecutive queries q and q′ in same session if $\sqrt{f_{\mathrm{temp}}^2 + f_{\mathrm{lex}}^2} \ge 1$.
[Figure: temporal similarity (x-axis, 0 to 1.0) vs. lexical similarity (y-axis, 0 to 1.0), divided into the regions "Same session" and "New session". Annotations: "Nearly identical queries at long temporal distance" and "Different queries with no temporal distance".]
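The two features and the original criterion can be sketched as follows. Several details are assumptions made only for this illustration: n-grams are taken over characters, the session s is represented by joining its queries with spaces, and raw n-gram counts serve as vector weights. All function names are made up.

```python
import math
from collections import Counter

def char_ngrams(text, n_min=3, n_max=5):
    """Count all character n-grams of length n_min..n_max (assumed setup)."""
    grams = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            grams[text[i:i + n]] += 1
    return grams

def cosine(a, b):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def f_temp(seconds):
    """max(0, 1 - t / 24h) with t given in seconds."""
    return max(0.0, 1.0 - seconds / (24 * 3600))

def f_lex(q_next, session_queries):
    """Lexical similarity between q' and the session of q."""
    return cosine(char_ngrams(q_next), char_ngrams(" ".join(session_queries)))

def geometric_same_session(q_next, session_queries, seconds):
    """Original criterion: sqrt(f_temp^2 + f_lex^2) >= 1."""
    return math.hypot(f_temp(seconds), f_lex(q_next, session_queries)) >= 1.0

# "hurling" followed by "mccarthy cup" about 4.5 hours later (16199 s).
print(f_temp(16199), f_lex("mccarthy cup", ["hurling"]))
print(geometric_same_session("mccarthy cup", ["hurling"], 16199))
```

Under these assumptions the pair shares no n-gram, so the lexical feature is 0 and the temporal feature alone stays below 1: the pair is assigned to a new session although both queries are about hurling.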
Performs well on standard test corpus . . .
[Figure: two plots of temporal similarity (x-axis, 0 to 1.0) vs. lexical similarity (y-axis, 0 to 1.0), one titled "Same session" and one titled "New session".]
. . . but has some problems “on the edge”
[Figure: temporal similarity (x-axis, 0 to 1.0) vs. lexical similarity (y-axis, 0 to 1.0), with counts per grid cell.]
Major problems
Similar queries, time gap (upper left) → Merely a matter of opinion
Diff. queries, same semantics (lower right) → Incorporate semantics
Criterion (adapted)
Original geometric method if $f_{\mathrm{temp}} < 0.8$ or $f_{\mathrm{lex}} > 0.4$.
Else: Goto Step 3.
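The adapted criterion amounts to a three-way decision. The sketch below takes the two feature values as inputs and returns `None` when the pair falls into the "unreliable" region and should be passed on; the function name and the return convention are assumptions for illustration.

```python
import math

def step2_decision(f_temp, f_lex):
    """Adapted Step 2: decide with the original geometric criterion only if
    f_temp < 0.8 or f_lex > 0.4; otherwise return None (go to Step 3)."""
    if f_temp < 0.8 or f_lex > 0.4:
        return "same" if math.hypot(f_temp, f_lex) >= 1.0 else "new"
    return None

print(step2_decision(0.5, 0.9))   # lexically similar, moderate time gap
print(step2_decision(0.5, 0.1))   # dissimilar, moderate time gap
print(step2_decision(0.81, 0.0))  # dissimilar, short time gap: deferred
```

The deferred region (temporal similarity of at least 0.8 and lexical similarity of at most 0.4) is the lower-right corner of the plot, where different query strings may still share the same semantics.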
Cascading Method Step 3: Explicit Semantic Analysis
How ESA works [Gabrilovich and Markovitch, 2007]
Preprocessing
tf · idf-weighted inverted index of Wikipedia articles
→ term-document matrix $M$
For consecutive queries q and q′
$f_{\mathrm{esa}}$ = cosine similarity of $M^T \cdot q'$ and $M^T \cdot s$, where s is the session of q
Criterion
Consecutive queries q and q′ in same session if $f_{\mathrm{esa}} \ge 0.35$.
Else: Goto Step 4.
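A toy version of the ESA step may help to see what $M^T \cdot q'$ does. The matrix below is entirely made up: five terms, three imaginary "articles", and invented weights, whereas real ESA uses a tf · idf index over all of Wikipedia. Function names are hypothetical as well.

```python
import math

# Hypothetical term-document matrix M: one row per term, one column per
# (imaginary) article about hurling, Istanbul, and hotels.
M = {
    "hurling":        [0.9, 0.0, 0.0],
    "mccarthy":       [0.7, 0.0, 0.0],
    "cup":            [0.4, 0.0, 0.1],
    "istanbul":       [0.0, 0.9, 0.2],
    "constantinople": [0.0, 0.8, 0.0],
}
N_DOCS = 3

def concept_vector(text):
    """Compute M^T * v, where v is the term-count vector of `text`.
    Terms missing from the toy matrix contribute nothing."""
    vec = [0.0] * N_DOCS
    for term in text.lower().split():
        for j, weight in enumerate(M.get(term, [0.0] * N_DOCS)):
            vec[j] += weight
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def f_esa(q_next, session_queries):
    return cosine(concept_vector(q_next),
                  concept_vector(" ".join(session_queries)))

def step3_decision(q_next, session_queries, threshold=0.35):
    """'same' if f_esa >= threshold, otherwise None (go to Step 4)."""
    return "same" if f_esa(q_next, session_queries) >= threshold else None

print(step3_decision("mccarthy cup", ["hurling"]))
print(step3_decision("hurling", ["istanbul", "constantinople"]))
```

With these invented weights, "mccarthy cup" and "hurling" land on the same article and are merged, even though they share no term or n-gram.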
Cascading Method Step 4: Search results
Even more “semantics”
Idea
Enrich the short query strings with the results of some web search engine.
Criterion
Consecutive queries q and q′ in same session iff they share at least one of the top 10 search results.
Remark
If q and q′ share no top 10 result, decision should be “not sure.”
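The comparison of result lists is a set intersection. The sketch assumes the top results of both queries have already been fetched as lists of URLs; no particular search engine API is implied, and the URLs below are placeholders.

```python
def step4_decision(results_q, results_q_next, k=10):
    """'same' if the two top-k result lists share at least one URL,
    otherwise 'not sure' (as suggested in the remark)."""
    shared = set(results_q[:k]) & set(results_q_next[:k])
    return "same" if shared else "not sure"

# Placeholder result lists, invented for illustration only.
results_hurling = ["a.example/hurling", "b.example/gaa", "c.example/rules"]
results_cup = ["d.example/cup", "b.example/gaa"]
results_istanbul = ["e.example/istanbul", "f.example/travel"]

print(step4_decision(results_hurling, results_cup))
print(step4_decision(results_hurling, results_istanbul))
```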
Cascading Method Experimental Results
That’s the complete cascade
Step 1: Subset tests
↘ Step 2: Geometric method
↘ Step 3: ESA similarity
↙ Step 4: Search results
What about accuracy and performance?
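How the steps chain together can be sketched with a small driver: each step either returns a decision or `None`, and a more expensive step runs only if all cheaper ones returned `None`. The driver, the stub step, and the "not sure" fallback are assumptions for illustration, not the authors' implementation.

```python
def run_cascade(q, q_next, steps, fallback="not sure"):
    """Apply the steps in order of increasing cost.
    Returns (decision, number of the deciding step or None)."""
    for number, step in enumerate(steps, start=1):
        decision = step(q, q_next)
        if decision is not None:
            return decision, number
    return fallback, None

def subset_step(q, q_next):
    """Cheap step: decide 'same' on sub-/superset, else pass on."""
    a, b = set(q.split()), set(q_next.split())
    return "same" if a <= b or a >= b else None

calls = []  # records which pairs reached the expensive step

def expensive_stub(q, q_next):
    """Stand-in for a costly feature; it never decides in this sketch."""
    calls.append((q, q_next))
    return None

steps = [subset_step, expensive_stub]
first = run_cascade("hurling", "hurling gaa", steps)
second = run_cascade("hurling", "mccarthy cup", steps)
print(first, second, calls)
```

Only the second pair reaches the expensive stub, which is the point of the design: costly features are computed for the small share of pairs the cheap steps cannot settle.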
Accuracy and runtime
Accuracy on Gayo-Avello’s corpus (11 000 queries, 2.7 per session)
            Precision  Recall  F-Measure (β = 1.5)
Geometric   0.8673     0.9431  0.9184
Cascading   0.8618     0.9676  0.9328
Performance per step on Gayo-Avello’s corpus
        affected  F-Measure  time     factor
Step 1  40.49%    0.8303     0.08 ms    1.0
Step 2  35.15%    0.9292     0.20 ms    2.5
Step 3   2.05%    0.9316     0.27 ms    3.4
Step 4   0.85%    0.9328     9.85 ms  123.1
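For readers who want to relate the columns, the sketch below assumes the standard weighted F-measure, $F_\beta = (1 + \beta^2) P R / (\beta^2 P + R)$, and reads the "factor" column as each step's time divided by the time of Step 1. Values recomputed from the rounded table entries need not match the reported figures to the last digit.

```python
def f_measure(precision, recall, beta=1.5):
    """Weighted F-measure; beta > 1 weights recall higher than precision."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Geometric method row of the accuracy table.
geometric_f = round(f_measure(0.8673, 0.9431), 4)

# Per-step times in milliseconds, relative to Step 1.
step_times_ms = [0.08, 0.20, 0.27, 9.85]
factors = [round(t / step_times_ms[0], 1) for t in step_times_ms]

print(geometric_f, factors)
```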
Goal: high quality session test data
Our own use case
Sample sessions from the AOL log as test data.
AOL log (cleaned): 35.4 million interactions from 470 000 users.
Some figures
Step 4 involved in 22.5% of cases → 8 million web queries → 300 ms per search → 1 month
Way out
Drop Step 4 and the sessions on which it would have been invoked
Remaining sessions: F-Measure = 0.9755
Cleaned AOL log: 27 minutes
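The "1 month" estimate can be retraced with a back-of-the-envelope calculation. The sketch assumes that the 22.5% refers to the 35.4 million interactions of the cleaned log and that the web searches are issued one after another.

```python
interactions = 35_400_000      # cleaned AOL log
step4_share = 0.225            # share on which Step 4 is involved
seconds_per_search = 0.3       # 300 ms per web search

web_queries = interactions * step4_share             # roughly 8 million
total_days = web_queries * seconds_per_search / 86_400

print(round(web_queries), round(total_days, 1))
```

That is a little under 28 days of sequential searching, compared with 27 minutes for the whole log once Step 4 is dropped.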
Conclusion
Almost the end: The take-away messages!
What we have (not) done
Results
Cascading method
Cheap features first
Beats geometric
3-step version: simple, fast, high-quality sessions
Future Work
Postprocessing for multi-tasking
Postprocessing for goals/missions
Thank you,