
Analyzing and Evaluating Query Reformulation Strategies in Web...

  • Analyzing Query Reformulation & Search Abandonment in Web Search

    Efthimis N. Efthimiadis
    University of Washington

    University of Geneva, March 18, 2010

  • Work presented here is in collaboration with Jeff Huang (UW) and Sofia Stamou (UPatras)

    Related Publications:

    • Huang, J. & Efthimiadis, E. N. (2009). Analyzing and Evaluating Query Reformulation Strategies in Web Search Logs. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM 2009), Hong Kong, November 2-6, 2009, pp. 77-86. (Conference acceptance rate 15%; nominated for best student paper award.)

    • Huang, J. & Efthimiadis, E. N. Search Abandonment in Web Search Logs. Submitted for publication.

    • Stamou, S. & Efthimiadis, E. N. (2009). Queries without Clicks: Successful or Failed Searches? In Proceedings of the SIGIR 2009 Workshop on the Future of IR Evaluation, Boston, MA, USA, July 23, 2009.

    • Stamou, S. & Efthimiadis, E. N. (2010). Interpreting User Inactivity on Search Results. In Proceedings of the 32nd European Conference on IR Research (ECIR) on Advances in Information Retrieval, Milton Keynes, UK, 28-31 March 2010. Springer, 2010.

  • Agenda

    A. Search Interaction: overview
    B. Reformulations
       1. Classifier
       2. Study
       3. Findings
    C. Abandonment
       1. Classifier
       2. Study
       3. Findings
    D. User Study
    E. Future work

  • Search Interaction: Overview

    [Diagram: User, i-Need, Query, S.E., Results]

    Research on Search Interaction: QF & models
    e.g., Bates, Belkin, Fidel, … Ingwersen, Tenopir, … Hartley, Borlund, Toms, … Efthimiadis … & many others

    • Beyond string matching
    • User Intent (e.g., Broder, Rose)
    • Query prediction performance
    • difficulty
    • QRF, QE expansion risk
    • Customize results to intent
    • present Results
    • spelling correction: did you mean this?
    • query refinement: related searches
    • related works: more like this
    • search trails
    • Clickthroughs

    Term suggestions

    • Search logs

  • REFORMULATIONS


  • Query Reformulations are…

    a modification of a previous query
    made by computers or users
    used to retrieve different search results
    in a web search engine

    We study these cases

  • Query Types

    Same Query

    Query Reformulation

    New Query

  • We ask,

    Which reformulation strategies work?

    How do we classify these reformulation strategies?

    How are users doing query reformulation?

  • Example: university of washington information school

  • More examples…

    Word Reorder: Reorder Word
    Stemming: Stemmed
    Word Substitution: Term Replacement
    Spelling Correction: Speling Correctoin
    Remove Words: Remove

  • Prior Reformulation Taxonomies

    Present Study Anick [3] Teevan [34] Jansen [16], He [15], Lau [24] Whittle [36] Bruza [7] Guo [13]

    word reorder syntactic variant word order

    whitespace and punctuation

    non-alphanumerics, word merge SPL, PUN

    word splitting, word merging

    remove words remove words / duplicates generalization D(k)

    add words head, modifier add words, add stopwords specialization C(k) ADD

    url stripping domain

    stemming morphological variant stemming and pluralization M(k) DER word stemming

    acronym acronym abbreviations ABR expansion

    substring

    abbreviation

    word substitution alternative, hyponym, change word swaps, synonyms reformulation W(k), w(k) SUB

    spelling correction spelling misspellings M(k) SPE spelling correction

    * not detected elaboration, location reformulation S(k), s(k)

    * not in data capitalization, extra whitespace J(k) CAS

  • Reformulation Classifier

  • A Rule-Based Classifier

    First automated reformulation strategy classifier

    Based on heuristics
    – Reformulation types are intuitive, no machine learning

    Primary Goal: High Precision
    Secondary Goal: Adequate Recall

    High precision enables accurate comparison between properties of reformulation types
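To make the idea of a heuristic classifier concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the rules from the CIKM 2009 paper: the function name `classify_reformulation`, the rule order, and the string tests are all hypothetical, and only a few types from the taxonomy are covered (spelling correction, stemming, acronyms and the rest would need their own rules).

```python
import re


def _words(query: str) -> list[str]:
    """Lowercase a query and split it on whitespace."""
    return query.lower().split()


def _squash(query: str) -> str:
    """Keep only letters and digits (hypothetical normalizer)."""
    return re.sub(r"[^a-z0-9]", "", query.lower())


def classify_reformulation(prev: str, curr: str) -> str:
    """Label the change from prev to curr with one illustrative type.

    Rules are tried from most to least specific; the order is an assumption.
    """
    p, c = _words(prev), _words(curr)
    if p == c:
        return "same"
    if _squash(prev) == _squash(curr):
        return "whitespace / punctuation"
    if sorted(p) == sorted(c):
        return "word reorder"
    if set(c) < set(p):
        return "remove words"
    if set(p) < set(c):
        return "add words"
    # Assumption: anything no rule matches is treated as a new query.
    return "new"
```

Because every rule is an exact string test, a pair is labeled only when it clearly fits, which favors precision over recall in the spirit of the stated goals. For example, `classify_reformulation("welime", "we lime")` falls under the whitespace rule, while a misspelling fix such as "basketbal" to "basketball" is not recognized by this sketch and falls through to "new".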

  • Architecture

    Query Logs:
    user1, query string1, timestamp, rank, url
    user1, query string2, timestamp, rank, url
    user1, query string3, timestamp, rank, url
    user2, query string1, timestamp, rank, url
    user3, query string1, timestamp, rank, url
    user3, query string2, timestamp, rank, url

    Query Logs → Classifier → New Queries, Same Queries, Reformulation (Acronym, Stemming, etc...)

  • 36M Queries from AOL Query Logs

    UserId  Query                          Timestamp        ClickRank  ClickUrl
    16348   lucille roberts                5/3/2006 8:01    1          http://www.lucilleroberts.com
    16348   tmobile                        5/22/2006 14:06
    16348   torontolime                    5/23/2006 13:48  1          http://www.toronto-lime.com
    16348   welime                         5/30/2006 14:58
    16348   we lime                        5/30/2006 14:59
    16348   back2basics                    5/30/2006 15:07  6          http://back2basics.mypicgallery.com
    16348   nycaribbeanvibes               5/30/2006 15:15  2          http://nycaribbeanvibes.photosite.com
    16473   theused.com                    3/1/2006 23:55
    16473   slipknot masks                 3/2/2006 0:20    1          http://www.hauntmasters.com
    16473   aol maps                       3/2/2006 22:08
    16473   southeast missouri basketbal   3/2/2006 22:11
    16473   southeast missouri basketball  3/2/2006 22:11   5          http://www.semohoops.com
    16473   sikeston basketball            3/2/2006 22:13   2          http://www.semissourian.com
    16473   sikeston basketball            3/2/2006 22:13   1          http://www.topix.net
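Rows like the ones above can be turned into consecutive query pairs, which is the unit a reformulation classifier works on. The sketch below is a hedged illustration: it assumes tab-separated fields in the column order shown, the timestamp format of the sample rows, and rows already sorted by user and time. The names `LogRow`, `parse_row` and `query_pairs` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator, Optional


@dataclass
class LogRow:
    user_id: str
    query: str
    timestamp: datetime
    click_rank: Optional[int]  # None when the row records no click
    click_url: Optional[str]


def parse_row(line: str) -> LogRow:
    """Parse one tab-separated row; the two click fields may be missing."""
    fields = line.rstrip("\n").split("\t")
    fields += [""] * (5 - len(fields))  # pad rows that have no click
    user_id, query, ts, rank, url = fields[:5]
    return LogRow(
        user_id=user_id,
        query=query,
        # Assumed format, matching the sample rows (e.g. "5/30/2006 14:58").
        timestamp=datetime.strptime(ts, "%m/%d/%Y %H:%M"),
        click_rank=int(rank) if rank else None,
        click_url=url or None,
    )


def query_pairs(rows: Iterable[LogRow]) -> Iterator[tuple[LogRow, LogRow]]:
    """Yield (previous, current) for consecutive rows of the same user."""
    prev = None
    for row in rows:
        if prev is not None and prev.user_id == row.user_id:
            yield prev, row
        prev = row
```

Note that the sample contains two rows for "sikeston basketball" at the same minute with different clicked URLs, which suggests one row per click. This sketch does not merge such rows, so they would surface as a pair with identical query strings.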

  • Comparative Evaluation (Unscientific)

    Precision Recall Accuracy

    Present Study 98.2% 61.3% 89.1%

    He et al. 60% 98%

    Jones et al. 87.3%

    Murray et al. 97.3% 76%

    Radlinski et al. 96.5% 92.3%

    Unscientific because:
    - Different data sources
    - Different ways of counting (i.e., include same queries or not?)
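For reference, the three measures in the table follow the standard definitions over true/false positives and negatives. The helper below is a small sketch with a hypothetical name (`evaluate`); it does not reproduce how any of the cited studies counted their cases.

```python
def evaluate(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Standard precision, recall and accuracy from confusion counts.

    Assumes each denominator is non-zero.
    """
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }
```

The counting caveat on the slide shows up directly here: if "same" query pairs are included and counted as easy true negatives, `tn` grows and accuracy rises while precision and recall stay unchanged, which is one reason figures from different studies are hard to compare.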

  • FINDINGS


  • Comparing click pattern frequencies between reformulation types

    [Figure: frequency of each click pattern (SkipSkip, ClickClick, SkipClick, ClickSkip) by Query Reformulation Type: word reorder, word substitution, stemming, spelling correction, url stripping, expand acronym, superstring, substring, whitespace / punctuation, form acronym, abbreviation, remove words, add words, same, new.]

  • Comparing click pattern frequencies between reformulation types

    [Figure: SkipSkip and SkipClick frequencies by Query Reformulation Type: word reorder, word substitution, stemming, spelling correction, url stripping, expand acronym, superstring, substring, whitespace / punctuation, form acronym, abbreviation, remove words, add words, same, new.]

    Compare the ratio of SkipSkip to SkipClick to see whether a user is more likely to click if the initial action is Skip.

    Spelling correction and expand acronym have high ratios, i.e., people use these reformulations.
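The comparison described here can be sketched in a few lines of Python. The code assumes each query pair has already been labeled with a reformulation type and with whether a click followed the initial query and the reformulated query; the function names and the input tuple layout are hypothetical.

```python
from collections import Counter


def click_pattern(first_clicked: bool, second_clicked: bool) -> str:
    """Name the pattern: action after the initial query, then after the reformulation."""
    first = "Click" if first_clicked else "Skip"
    second = "Click" if second_clicked else "Skip"
    return first + second


def skipskip_to_skipclick(pairs):
    """Return {type: SkipSkip count / SkipClick count}.

    pairs: iterable of (reformulation_type, first_clicked, second_clicked).
    Types with no SkipClick pair are left out to avoid dividing by zero.
    """
    counts = Counter(
        (rtype, click_pattern(a, b)) for rtype, a, b in pairs
    )
    types = {rtype for rtype, _ in counts}
    return {
        t: counts[(t, "SkipSkip")] / counts[(t, "SkipClick")]
        for t in types
        if counts[(t, "SkipClick")] > 0
    }
```

Only pairs whose initial action is Skip enter the ratio, so it isolates what happens after an unproductive first query.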

  • Comparing click pattern frequencies between reformulation types

    [Figure: ClickClick and ClickSkip frequencies by Query Reformulation Type: word reorder, word substitution, stemming, spelling correction, url stripping, expand acronym, superstring, substring, whitespace / punctuation, form acronym, abbreviation, remove words, add words, same, new.]

    Some reformulations are performed to improve the result set, while others redo the result set.

    Different reformulations are “effective” depending on the initial action, i.e., the action performed after the initial query.

  • Comparing websites clicked between reformulation types

    [Figure: Same vs. Different websites clicked, by Query Reformulation Type: word reorder, superstring, word substitution, stemming, same, remove words, new, spelling correction, url stripping, expand acronym, substring, whitespace / punctuation, add words, form acronym, abbreviation.]

  • Comparing time between queries (secs) and rank change between reformulation types

    Reformulation Type        Median Time (s) between Queries  Mean Rank Change
    word substitution         73                               +4.04
    add words                 63                               +3.19
    substring                 33                               +3.15
    remove words              68                               +3.02
    word reorder              85                               +2.86
    expand acronym            42                               +2.02
    stemming                  33                               +2.00
    new                       2,417                            +1.91
    abbreviation              35                               +1.39
    superstring               53                               +1.10
    spelling correction       22                               +1.03
    form acronym              103                              +0.64
    whitespace & punctuation  27                               +0.54
    url stripping             57                               +0.29
    same                      1                                -1.83

    Positive rank change = successful reformulation
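A table of this shape could be computed from labeled query pairs roughly as follows. This is a sketch under assumptions: the slides do not spell out how rank change is defined, so the code takes it to be the click rank after the initial query minus the click rank after the reformulation (positive when the clicked result moved up), and it skips pairs where either query had no click. The function name `summarize` and the tuple layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def summarize(pairs):
    """Per reformulation type: median seconds between queries, mean rank change.

    pairs: iterable of (reform_type, seconds_between, rank_before, rank_after),
    where either rank may be None when no click was recorded.
    Returns {type: (median_seconds, mean_rank_change or None)}.
    """
    times = defaultdict(list)
    changes = defaultdict(list)
    for rtype, seconds, rank_before, rank_after in pairs:
        times[rtype].append(seconds)
        if rank_before is not None and rank_after is not None:
            # Assumed definition: positive means the click moved up the list.
            changes[rtype].append(rank_before - rank_after)
    return {
        rtype: (
            median(times[rtype]),
            mean(changes[rtype]) if changes[rtype] else None,
        )
        for rtype in times
    }
```

The median is used for time because gaps between queries are heavily skewed: the "new" row above has a median of 2,417 seconds, and a mean would be dominated by even longer gaps.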

  • Future Work

    Multi-reformulation
    Using instances of sequential reformulations to detect multi-reformulations

    seattle pizza → seattle sausage pizza → sausage pizza

    Abandonment redefined
    Search abandonment defined in terms of reformulation:

    initial query → reformulations → session end (timeout or new query)

    netbook eee pc eeepc netbook deals

    NoClick
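One way to read the reformulation-based definition of abandonment is sketched below. Everything specific in it is an assumption for illustration: the 30-minute timeout, the `Event` fields, and the rule that a session counts as abandoned when none of its queries received a click. The reformulation labels are expected to come from a classifier that marks unrelated queries as "new".

```python
from dataclasses import dataclass


@dataclass
class Event:
    query: str
    seconds: float  # time of the query, in seconds
    label: str      # reformulation label relative to the previous query
    clicked: bool   # whether any result was clicked for this query


def split_sessions(events, timeout=1800.0):
    """Group one user's time-ordered events into sessions.

    A session is an initial query plus its reformulations; it ends at a
    timeout gap (hypothetical default: 30 minutes) or at a "new" query.
    """
    sessions, current = [], []
    for ev in events:
        gap_too_long = bool(current) and ev.seconds - current[-1].seconds > timeout
        if current and (ev.label == "new" or gap_too_long):
            sessions.append(current)
            current = []
        current.append(ev)
    if current:
        sessions.append(current)
    return sessions


def abandoned(session) -> bool:
    """Assumed criterion: no query in the session received a click."""
    return not any(ev.clicked for ev in session)
```

Defining the session boundary through reformulation labels, instead of a timeout alone, keeps a chain of rewrites of the same need in one session even when the user pauses between attempts.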

  • Applications

    • UIs supporting Reformulations

    • Query session boundary detection

    • Intelligent query assistance

    • Personalized Search


  • Summary

    • We created a taxonomy of query reformulation strategies and a rule-based classifier to classify reformulations from the AOL query logs, and measured characteristics of each reformulation strategy

    • Different reformulations are useful depending on the initial action

    • Some reformulations re-rank clicked results higher while others generate new results

  • Thank You!

    Efthimis N. Efthimiadis
    University of Washington

    Questions?
