Contextual IR
Naama Kraus
Slides are based on the papers:
– Searching with Context. Kraft, Chang, Maghoul, Kumar
– Context-Sensitive Query Auto-Completion. Bar-Yossef and Kraus
The Problem (recap)
• User queries are an imperfect description of their information needs
• Examples:
– Ambiguous queries: jaguar
– General queries: haifa
– Terminology differences (synonyms) between user and corpus: stars - planets
Contextual IR
• Leverage context to better understand the user's information need
• Context types
– Short-term context
• Current time and location, recent queries, recent page visits, currently viewed page, recent tweets, recent e-mails …
– Long-term context (user profile/model)
• Long-term search history, user interests, user demographics (gender, education …), e-mails, desktop files …
Today's focus: short-term context
Example
• Query: jaguar
• Context: a recently viewed page
• Document retrieval – use context to disambiguate the query
Searching with Context
Kraft, Chang, Maghoul, Kumar, WWW'06
Searching with Context
• Goal: improve document retrieval
• Capture the user's recent context
– A piece of text
– Extract terms from a page the user is currently viewing, a file the user is currently editing, …
• Proposes three different methods
– Query rewriting (QR)
• Add terms to the user's original query
– Rank biasing (RB)
• Re-rank results
– Iterative filtering meta-search (IFM)
• Generate sub-queries and aggregate results
Query Rewriting
• Send one simple query to a standard search engine
• Append the top context terms to the original query
– AND semantics
– Parameter: how many terms to add
• Example:
– Query q; weighted context-term vector (a b c d e)
– Terms are ranked by their weight
– Q_new = (q a b) for parameter value 2
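The QR step above can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation; the function name and the numeric weights are assumptions:

```python
def rewrite_query(query, context_weights, k=2):
    """Append the k highest-weighted context terms to the query (AND semantics)."""
    top = sorted(context_weights, key=context_weights.get, reverse=True)[:k]
    return " ".join([query] + top)

# Context vector (a b c d e), weights in descending order as on the slide:
context = {"a": 0.9, "b": 0.7, "c": 0.5, "d": 0.3, "e": 0.1}
print(rewrite_query("q", context, k=2))  # -> 'q a b'
```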
Rank-Biasing
• Send a complex query that contains ranking instructions to the search engine
• Does not change the original result set, only the ranking
• New query definition: <q> = <selection=cat> <optional=persian,2.0>
– Selection terms – the original query terms; must appear in results
– Optional terms – context terms, with a boost factor that influences the ranking only
• The boost is a function of the term's weight
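A toy sketch of the RB semantics, assuming a simple additive-boost scoring; the function name and scoring form are illustrative, not the engine's actual query language:

```python
def rank_bias(docs, selection, optional):
    """docs: list of (doc_id, text). selection: required terms (define the
    result set). optional: {term: boost} context terms that bias ranking only."""
    results = []
    for doc_id, text in docs:
        words = set(text.lower().split())
        if not all(t in words for t in selection):
            continue  # selection terms must appear
        boost = sum(b for t, b in optional.items() if t in words)
        results.append((doc_id, boost))
    # Same result set; order is biased by the optional-term boosts
    return [d for d, _ in sorted(results, key=lambda x: -x[1])]

docs = [("d1", "cat food"), ("d2", "persian cat breeds"), ("d3", "dog park")]
print(rank_bias(docs, ["cat"], {"persian": 2.0}))  # -> ['d2', 'd1']
```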
Iterative Filtering Meta-Search
• Intuition: "explore" different ways to express an information need
• Algorithm outline
– Generate sub-queries
– Send them to a search engine
– Aggregate the results
Sub-query Generation
• Use a query template
• Example:
– Query q; context = (a, b, c)
– Sub-queries:
• q a, q b, q c
• q a b, q b c
• q a b c
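One way to read the template above is sliding windows of consecutive context terms (note q a c is absent from the example). A sketch under that assumption:

```python
def sub_queries(query, context):
    """Yield the query combined with every window of consecutive context terms."""
    for width in range(1, len(context) + 1):
        for i in range(len(context) - width + 1):
            yield " ".join([query] + context[i:i + width])

print(list(sub_queries("q", ["a", "b", "c"])))
# -> ['q a', 'q b', 'q c', 'q a b', 'q b c', 'q a b c']
```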
Ranking and Filtering
• Issue the k sub-queries to a standard search engine
• Obtain results
• Challenge: how to combine, rank, and filter the results?
• Use rank aggregation techniques
Rank Averaging
• A rank aggregation method (one out of many …)
• Given: k lists of top results
• Assign a score to each position in a list
– E.g., 1 to the first position, 2 to the second position, …
• For each document, average its scores over the k lists
• The final list is ordered by the average scores
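A minimal sketch of rank averaging. Documents missing from a list are scored as position len(list) + 1; that penalty is an assumption, since the slide does not specify how absences are handled:

```python
def rank_average(lists):
    """Average each document's position (1 = first) over the k lists."""
    all_docs = {d for lst in lists for d in lst}
    totals = dict.fromkeys(all_docs, 0)
    for lst in lists:
        pos = {doc: i for i, doc in enumerate(lst, start=1)}
        for d in all_docs:
            totals[d] += pos.get(d, len(lst) + 1)  # assumed penalty for absence
    k = len(lists)
    return sorted(all_docs, key=lambda d: totals[d] / k)

lists = [["d1", "d2", "d3"], ["d2", "d1"], ["d2", "d3", "d1"]]
print(rank_average(lists))  # -> ['d2', 'd1', 'd3']
```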
Context-Sensitive Query Auto-Completion
Z. Bar-Yossef and N. Kraus, WWW’11
Query Auto-Completion
An integral part of the user's search experience
Use Cases
• Predict the user's intended query
– Save her keystrokes
• Assist the user in formulating her information need
Motivating Example
• I am attending WWW 2011
• I need some information about Hyderabad
• Typed: hyderabad
– Current completions: hyderabad airport, hyderabad history, hyderabad maps, hyderabad india, hyderabad hotels
– Desired completion: hyderabad www
MostPopular is not always good enough
• User queries follow a power-law distribution, with a heavy tail of unpopular queries
• MostPopular is likely to mis-predict when given a small number of keystrokes
MostPopular Completion
• For the prefix "hy", MostPopular suggests: hydroxycut, hyperbola, hyundai, hyatt

Nearest Completion
• Idea: leverage recent query context
• Intuition: the user's intended query is similar to her context query
– Requires a similarity measure between queries (refer to the paper)
• Given the context query "www 2011", the prefix "hy" completes to: hyderabad, hyderabad airport, hyderabad maps, hyderabad india
Nearest Completion: Framework
• Offline:
1. Expand completions
2. Index the candidate completions in a repository
• Online:
1. Expand the context query
2. Nearest-neighbors search for similar completions
3. Return the top k context-related completions
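The online steps can be sketched as below. A plain bag-of-words cosine stands in for the paper's richer expansion-based similarity measure, and all names and candidate data are illustrative:

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two whitespace-tokenized strings."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def nearest_completion(prefix, context, candidates, k=3):
    """Rank prefix-matching candidates by similarity to the context query."""
    matches = [c for c in candidates if c.startswith(prefix)]
    return sorted(matches, key=lambda c: -cosine(context, c))[:k]

candidates = ["hyderabad airport", "hyderabad www", "hydroxycut", "hyundai"]
print(nearest_completion("hy", "www 2011", candidates, k=2))
# -> ['hyderabad www', 'hyderabad airport']
```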
HybridCompletion
• Problem: if the context queries are irrelevant to the current query, NearestCompletion fails to predict the user's query
• Solution: HybridCompletion – a combination of highly popular and highly context-similar completions
– Completions that are both popular and context-similar get promoted
• hybscore(q) = c · Z[simscore(q)] + (1 − c) · Z[popscore(q)], c ∈ [0,1]
– A convex combination; Z denotes score standardization
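A sketch of the convex combination, assuming Z is z-score standardization over the candidate set; the candidate names and raw scores are made up for illustration:

```python
from statistics import mean, pstdev

def zscores(xs):
    """Standardize a list of scores (z-scores); all-equal lists map to 0."""
    m, s = mean(xs), pstdev(xs)
    return [(x - m) / s if s else 0.0 for x in xs]

def hybrid_rank(candidates, sim, pop, c=0.5):
    """hybscore(q) = c * Z[simscore(q)] + (1 - c) * Z[popscore(q)]."""
    zs = zscores([sim[q] for q in candidates])
    zp = zscores([pop[q] for q in candidates])
    hyb = {q: c * zs[i] + (1 - c) * zp[i] for i, q in enumerate(candidates)}
    return sorted(candidates, key=lambda q: -hyb[q])

cands = ["hyderabad www", "hyderabad airport", "hyatt"]
sim = {"hyderabad www": 0.9, "hyderabad airport": 0.7, "hyatt": 0.0}  # context similarity
pop = {"hyderabad www": 10, "hyderabad airport": 500, "hyatt": 900}   # raw popularity
print(hybrid_rank(cands, sim, pop))  # 'hyderabad airport' wins: strong on both signals
```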
MostPopular, Nearest, and Hybrid (1)
MostPopular, Nearest, and Hybrid (2)
Anecdotal Examples

Context: french flag; current query: italian flag
– MostPopular: internet, im help, irs, ikea, internet explorer
– Nearest: italian flag, itunes and french, ireland, italy, irealand
– Hybrid: internet, italian flag, itunes and french, im help, irs

Context: neptune; current query: uranus
– MostPopular: ups, usps, united airlines, usbank, used cars
– Nearest: uranus, uranas, university, university of chic…, ultrasound
– Hybrid: uranus, uranas, ups, united airlines, usps

Context: improving acer laptop battery; current query: bank of america
– MostPopular: bank of america, bankofamerica, best buy, bed bath and b…
– Nearest: battery powered …, battery plus cha…
– Hybrid: bank of america, best buy, battery powered …