A Field Relevance Model for Structured Document Retrieval
JIN YOUNG KIM @ ECIR 2012
Three Themes
• The Concept of Field Relevance
• Using Field Relevance for Retrieval
• The Estimation of Field Relevance
[Figure: Field Relevance at the intersection of Relevance and Field Weighting]
THE FIELD RELEVANCE
IR: The Quest for Relevance
• The Role of Relevance
  • Core component of retrieval models
  • Basis of (pseudo) relevance feedback
• Retrieval Models Based on Relevance
  • Binary Independence Model (BM25) [Robertson76]
  • Relevance-based Language Model [Lavrenko01]
The relevance model is a distribution $P(w \mid R)$ over the vocabulary $V = (w_1\ w_2\ \ldots\ w_m)$.
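For orientation, one standard way to estimate $P(w \mid R)$ without explicit judgments is the relevance-based language model's use of top-ranked documents. An RM1-style form, shown here as background rather than taken from the slides:

```latex
P(w \mid R) \;\approx\; \sum_{D \in \mathcal{D}_k} P(w \mid D)\, P(D \mid Q)
```

where $\mathcal{D}_k$ is the set of top-$k$ retrieved documents for query $Q$.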
Structured Document Retrieval
• Documents have multiple fields
  • Emails, products (entities), and so on
• Retrieval models exploit the structure
  • Field weighting is common
[Figure: field-weighted retrieval; each query term $q_1 \ldots q_m$ is scored against each field $f_1 \ldots f_n$, the field scores are combined with fixed field weights $w_1 \ldots w_n$ (sum over fields), and the per-term scores are multiplied across the query]
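In equation form, the fixed-weight combination the figure depicts (the mixture-of-field-language-models score; $w_j$ is the weight of field $j$, shared by all query terms) is:

```latex
P(Q \mid D) \;=\; \prod_{i=1}^{m} \sum_{j=1}^{n} w_j \, P(q_i \mid F_j)
```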
Relevance for Structured Document Retrieval
• Term-level relevance: which term is important for the user's information need?
• Field-level relevance: which field is important for the user's information need?
Term-level relevance: $P(w \mid R)$ over the vocabulary $V = (w_1\ w_2\ \ldots\ w_m)$.
Field-level relevance: $P(F \mid R)$ over the fields $F = (F_1\ F_2\ \ldots\ F_n)$.
Defining the Field Relevance
[Figure: for each query term $q_1 \ldots q_i \ldots q_m$, the field relevance $P(F \mid q_i, R)$ is a distribution over the document fields $F_1 \ldots F_j \ldots F_n$]
Field Relevance
The distribution of per-term relevance over document fields: $P(F \mid w, R)$
• Query: $m$ words, $Q = (q_1\ q_2\ \ldots\ q_m)$
• Collection: $n$ fields for each document, $F = (F_1\ F_2\ \ldots\ F_n)$
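Since field relevance is a per-term distribution over fields, it normalizes for each query term:

```latex
\sum_{j=1}^{n} P(F_j \mid q_i, R) = 1 \qquad \text{for each query term } q_i
```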
Why P(F|w,R) instead of P(F|R)?
Query: 'james registration'
• Different fields are relevant for different query terms
  • 'james' is relevant when it occurs in <to>
  • 'registration' is relevant when it occurs in <subject>
More Evidence for Field Relevance
• Field operators / advanced search interfaces
• Users' search terms are found in multiple fields
Elsweiler, D., Harvey, M., Hacker, M. Understanding Re-finding Behavior in Naturalistic Email Interaction Logs. [SIGIR'11]
Lee, C.-J., Croft, W.B., Kim, J. Evaluating Search in Personal Social Media Collections. [WSDM'12]
THE FIELD RELEVANCE MODEL
Retrieval over Structured Documents
• Field-based retrieval models
  • Score each field against each query term
  • Combine field-level scores using field weights
Fixed field weights $w_j$ can be too restrictive.
Using Field Relevance for Retrieval
• Field Relevance Model
• Comparison with the Mixture of Field Language Models (MFLM)
[Figure: in MFLM, per-term field scores are combined with fixed field weights $w_1 \ldots w_n$; in the Field Relevance Model, each query term $q_i$ gets its own per-term field weights $P(F_1 \mid q_i) \ldots P(F_n \mid q_i)$. Weighted field scores are summed over fields, then multiplied across query terms]
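Replacing the fixed $w_j$ with per-term field relevance yields the Field Relevance Model score. A sketch consistent with the diagram (smoothing details omitted):

```latex
\mathrm{Score}(Q, D) \;=\; \prod_{i=1}^{m} \sum_{j=1}^{n} P(F_j \mid q_i, R)\, P(q_i \mid F_j)
```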
Structured Document Retrieval: PRM-S [Kim, Xue, Croft 09]
• Probabilistic Retrieval Model for Semi-structured data
• Estimate the mapping between query terms and document fields (see the sketch below)
• Use the mapping probability as per-term field weights
Estimation is based on limited sources.
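A minimal Python sketch of the PRM-S mapping estimate, assuming the standard form $P(F_j \mid w) \propto P(w \mid F_j)\,P(F_j)$ with collection-wide field language models; the function name, data structures, and numbers below are illustrative, not the paper's API:

```python
from collections import Counter

def prms_field_weights(term, field_term_counts, field_priors):
    """P(F_j | w) ∝ P(w | F_j) * P(F_j): the term likelihood comes from
    collection-wide field language models, P(F_j) is an a priori field
    probability, and the result is normalized over fields."""
    scores = {}
    for field, counts in field_term_counts.items():
        total = sum(counts.values())
        p_w_given_f = counts[term] / total if total else 0.0
        scores[field] = p_w_given_f * field_priors.get(field, 1.0)
    norm = sum(scores.values())
    return {f: s / norm for f, s in scores.items()} if norm else scores

# Toy collection statistics for an email corpus (illustrative numbers).
field_term_counts = {
    "to":      Counter({"james": 120, "registration": 2}),
    "subject": Counter({"james": 5, "registration": 40}),
    "content": Counter({"james": 30, "registration": 25}),
}
field_priors = {"to": 1.0, "subject": 1.0, "content": 1.0}

print(prms_field_weights("james", field_term_counts, field_priors))
# -> most of the probability mass falls on the <to> field
```

Because the estimate relies only on collection statistics, terms whose field usage in relevant documents differs from the collection at large get misweighted, which is the "limited sources" issue noted above.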
Using Field Relevance for Retrieval
• Field Relevance Model
• Comparison with PRM-S
  • FRM has the same functional form as PRM-S
  • FRM differs in how the per-term field weights are estimated
[Figure: per-term field weights combined with per-term field scores; only the weight estimation changes]
ESTIMATING FIELD RELEVANCE
Estimating Field Relevance: In a Nutshell
• If the user provides feedback
  • A relevant document provides sufficient information
• If no feedback is available
  • Combine field-level term statistics from multiple sources
[Figure: the field-level term distribution (content / title / from-to) of relevant documents ≅ the collection's distribution + the top-k documents' distribution]
Estimating Field Relevance Using Feedback
• Assume a user who marked $D_R$ as relevant
  • Estimate field relevance from the field-level term distribution of $D_R$ (a minimal sketch follows below)
• We can personalize the results accordingly
  • Rank documents with a similar field-level term distribution higher
Field relevance in the example:
- <to> is relevant for 'james'
- <content> is relevant for 'registration'
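A minimal sketch of the feedback-based estimate, assuming field relevance is read off the field language models of the relevant document $D_R$ and then normalized; the toy email below is illustrative, and in practice the field LMs would be smoothed:

```python
def field_relevance_from_feedback(term, doc_fields):
    """P(F_j | w, R) ∝ P(w | field F_j of D_R): read the term's likelihood
    off each field of the relevant document, then normalize over fields
    so the weights form a distribution."""
    likelihoods = {
        field: (tokens.count(term) / len(tokens) if tokens else 0.0)
        for field, tokens in doc_fields.items()
    }
    norm = sum(likelihoods.values())
    return ({f: p / norm for f, p in likelihoods.items()}
            if norm else likelihoods)

# Toy relevant email D_R, echoing the talk's running example.
d_r = {
    "to":      ["james"],
    "subject": ["seminar", "registration"],
    "content": ["please", "complete", "your", "registration", "today"],
}
print(field_relevance_from_feedback("james", d_r))         # mass on <to>
print(field_relevance_from_feedback("registration", d_r))  # <subject>/<content>
```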
Estimating Field Relevance without Feedback
• Method
  • Linear combination of multiple sources (one plausible form follows below)
  • Weights estimated using training queries
• Features
  • Field-level term distribution of the collection (unigram and bigram LM); the unigram feature is the same as in PRM-S
  • Field-level term distribution of the top-k documents (unigram and bigram LM); a form of pseudo-relevance feedback
  • A priori importance of each field ($w_j$), similar to MFLM and BM25F; estimated using held-out training queries
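One plausible form of the combination, using the feature abbreviations from the ablation slide ($c_{ug}/c_{bg}$: collection unigram/bigram; $r_{ug}/r_{bg}$: top-$k$ unigram/bigram); the exact parameterization is in the paper, and the $\lambda$ weights are learned on training queries:

```latex
\hat{P}(F_j \mid q_i, R) \;=\;
   \lambda_{c_{ug}} P_{c_{ug}}(F_j \mid q_i)
 + \lambda_{c_{bg}} P_{c_{bg}}(F_j \mid q_i)
 + \lambda_{r_{ug}} P_{r_{ug}}(F_j \mid q_i)
 + \lambda_{r_{bg}} P_{r_{bg}}(F_j \mid q_i)
 + \lambda_{w}\, w_j
```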
EXPERIMENTS
Experimental Setup
• Collections
  • TREC Emails
  • IMDB Movies
  • Monster Resumes
• Distribution of the Most Relevant Field
Collection   #Documents   #Queries   #RelDocs / Query
TREC            198,394        125          1
IMDB            437,281         50          2
Monster       1,034,795         60         15
Query Examples (Indri)
• Oracle Estimates of Field Relevance
[Figure: example Indri queries and oracle field relevance estimates for the TREC, IMDB, and Monster collections]
Retrieval Methods Compared
• Baselines
  • DQL / BM25F
  • MFLM: field weights fixed regardless of terms
  • PRM-S: field weights estimated using the collection
• Field Relevance Models
  • FRM-C: field relevance estimated using the combination of sources
  • FRM-O: field relevance estimated using relevant documents (oracle)
The methods differ only in the field weighting!
Retrieval Effectiveness (Metric: Mean Reciprocal Rank)

           DQL     BM25F   MFLM    PRM-S   FRM-C   FRM-O
TREC      54.2%   59.7%   60.1%   62.4%   66.8%   79.4%
IMDB      40.8%   52.4%   61.2%   63.7%   65.7%   70.4%
Monster   42.9%   27.9%   46.0%   54.2%   55.8%   71.6%

(DQL, BM25F, and MFLM use fixed field weights; PRM-S, FRM-C, and FRM-O use per-term field weights.)
Quality of Field Relevance Estimation
• Aggregated KL-divergence from oracle estimates
• Aggregated cosine similarity with oracle estimates
[Figure: bar charts comparing MFLM, PRM-S, and FRM-C on the TREC, Monster, and IMDB collections under both measures]
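One plausible formalization of the KL measure (the paper's exact aggregation may differ): for each query term, compare the oracle distribution $P^{*}(F \mid q_i, R)$ with the estimated one $\hat{P}(F \mid q_i, R)$, then sum over the terms of each query and over the query set $\mathcal{Q}$:

```latex
\mathrm{AggKL} \;=\; \sum_{Q \in \mathcal{Q}} \sum_{q_i \in Q}
  \mathrm{KL}\!\left( P^{*}(F \mid q_i, R) \,\Vert\, \hat{P}(F \mid q_i, R) \right)
```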
Feature Ablation Results
• Features revisited
  • Field-level term distribution of the collection (PRM-S)
  • Field-level term distribution of the top-k documents
  • A priori relevance of the term (prior)
• Results for the TREC collection

Feature notation:
                 Unigram   Bigram
Collection LM      cug       cbg
Top-k Docs LM      rug       rbg

Feature Set     All      -rug/rbg   -cbg/rbg   -cbg/cug   -prior
MAP             0.668    0.662      0.651      0.648      0.644
%Reduction      0%       -0.9%      -2.5%      -3.0%      -3.6%
CONCLUSIONS
Summary
• Field relevance as a generalization of field weighting
  • Relevance modeling for structured document retrieval
• A field relevance model for structured document retrieval
  • Using field relevance to combine per-field LM scores
• Estimating field relevance using relevant documents
  • Provides a natural way to incorporate relevance feedback
• Estimating field relevance by combining sources
  • Improved performance over MFLM and PRM-S
Ongoing Work
• Large-scale batch evaluation on a book collection
  • Test collections built using OpenLibrary.org query logs
• Evaluation of relevance feedback in FRM
  • Does relevance feedback improve subsequent results?
• Integrating term relevance and field relevance
  • Further improvement is expected when they are combined
[Figure: integrating field relevance and term relevance]
I'm on the job market!
• Structured Document Retrieval
  • A Probabilistic Retrieval Model for Semi-structured Data [ECIR09]
  • A Field Relevance Model for Structured Document Retrieval [ECIR12]
• Personal Search
  • Retrieval Experiments using Pseudo-Desktop Collections [CIKM09]
  • Ranking using Multiple Document Types in Desktop Search [SIGIR10]
  • Evaluating an Associative Browsing Model for Personal Info. [CIKM11]
  • Evaluating Search in Personal Social Media Collections [WSDM12]
• Web Search
  • An Analysis of Instability for Web Search Results [ECIR10]
  • Characterizing Web Content, User Interests, and Search Behavior by Reading Level and Topic [WSDM12]
More at @jin4ir, or cs.umass.edu/~jykim
OPTIONAL SLIDES
Optimality of Field Relevance Estimation
• This results in the optimal field weighting
  • Scores $D_R$ as highly as possible against other documents
  • Under the language modeling framework for IR
[Figure: per-term field weights and per-term field scores in the scoring function]
Proof in the extended version.
Features Based on Field-level Term Distributions
• Summary
• Estimation: unigram LM (= PRM-S) and bigram LM

                 Unigram   Bigram
Collection LM      cug       cbg
Top-k Docs LM      rug       rbg