1
Carnegie Mellon
Maximum Likelihood Estimation for Information Thresholding
Yi Zhang & Jamie Callan
Carnegie Mellon University
{yiz,callan}@cs.cmu.edu
2
Overview
• Adaptive filtering: definition and challenges
• Threshold based on score distribution and the sampling bias problem
• Maximum likelihood estimation for score distribution parameters
• Results of experiments
• Conclusion
3
Given an initial description of an information need, a filtering system sifts through a stream of documents and delivers each relevant document to the user as soon as it arrives. Relevance feedback may be available for some of the delivered documents, so user profiles can be updated adaptively.
Filtering System
Adaptive Filtering
4
Adaptive Filtering
Three major problems:
• Learning corpus statistics, such as idf
• Learning the user profile, such as adding or deleting key words and adjusting term weights (scoring method)
• Learning the delivery threshold (binary judgment)

Evaluation measures:
• Linear utility = r1*RR + r2*NR + r3*RN + r4*NN
  Optimizing linear utility => finding P(relevant|document)
  In one dimension: P(relevant|document) = P(relevant|score)
• F measure:

$$F(\beta) = \frac{(1+\beta^2)\cdot \mathrm{Precision}\cdot \mathrm{Recall}}{\beta^2\cdot \mathrm{Precision} + \mathrm{Recall}}$$
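Both evaluation measures can be computed directly from delivery counts. The following is a minimal sketch, assuming RR and NR count the relevant and non-relevant documents that were delivered, and RN and NN count those that were not delivered. The function names and the default weights are illustrative and do not come from the slides.

```python
def linear_utility(rr, nr, rn, nn, r1=2.0, r2=-1.0, r3=0.0, r4=0.0):
    """Linear utility r1*RR + r2*NR + r3*RN + r4*NN.

    rr / nr: relevant / non-relevant documents delivered (assumed meaning).
    rn / nn: relevant / non-relevant documents not delivered (assumed meaning).
    The default weights are an illustrative choice that yields 2*RR - NR.
    """
    return r1 * rr + r2 * nr + r3 * rn + r4 * nn


def f_measure(precision, recall, beta=1.0):
    """F(beta) = (1 + beta^2) * P * R / (beta^2 * P + R); 0 when undefined."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        return 0.0
    return (1.0 + b2) * precision * recall / denom
```

With beta = 1 the F measure reduces to the harmonic mean of precision and recall.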
5
A Model of Score Distribution: Assumptions and Empirical Justification
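Relevant: normal distribution

$$P(score \mid R = r) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(score - u)^2}{2\sigma^2}}$$

Non-relevant: exponential distribution

$$P(score \mid R = nr) = \lambda\, e^{-\lambda\,(score - c)}$$

According to other researchers, this is generally true for various statistical searching systems (scoring methods, Manmatha’s paper, Arampatzis’s paper)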
[Plots: number of documents (vertical axis) against document score (horizontal axis)]
Figure 1. Density of document scores: TREC9 OHSU Topic 3 and Topic 5
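The two score densities of the model are simple to evaluate. The sketch below assumes the exponential density is zero for scores below c (a shifted exponential), which the slides do not state explicitly. The function names are illustrative.

```python
import math


def relevant_density(score, u, sigma):
    """Normal density with mean u and standard deviation sigma."""
    z = (score - u) ** 2 / (2.0 * sigma ** 2)
    return math.exp(-z) / (math.sqrt(2.0 * math.pi) * sigma)


def nonrelevant_density(score, lam, c):
    """Exponential density lam * exp(-lam * (score - c)).

    Assumption: the density is 0 for score < c.
    """
    if score < c:
        return 0.0
    return lam * math.exp(-lam * (score - c))
```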
6
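Optimize for Linear Utility Measure: from Score Distribution to Probability of Relevancy

p = P(r): ratio of relevant documents

$$\begin{aligned}
P(r \mid score) &= \frac{P(score \mid r)\,P(r)}{P(score)} \\
&= \frac{P(score \mid r)\,P(r)}{P(score \mid r)\,P(r) + P(score \mid nr)\,P(nr)} \\
&= \frac{p \cdot \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(score - u)^2}{2\sigma^2}}}{p \cdot \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(score - u)^2}{2\sigma^2}} + (1 - p)\cdot \lambda\, e^{-\lambda\,(score - c)}}
\end{aligned}$$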
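The posterior above is a direct application of Bayes' rule to the two densities. Here is a minimal sketch; the function name and the parameter values in the usage note are hypothetical and serve only as an illustration.

```python
import math


def posterior_relevant(score, u, sigma, lam, c, p):
    """P(r | score) for the normal / exponential score model.

    Assumes score >= c so that the exponential density is defined.
    """
    rel = p * math.exp(-((score - u) ** 2) / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )
    non = (1.0 - p) * lam * math.exp(-lam * (score - c))
    return rel / (rel + non)
```

For example, `posterior_relevant(0.46, u=0.45, sigma=0.02, lam=50.0, c=0.40, p=0.1)` gives the probability of relevance at score 0.46 under these made-up parameters; a document is delivered when this value exceeds the cut-off implied by the utility weights.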
7
Optimize for F Measure: From Score Distribution to Precision and Recall
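If the threshold is set at θ:

$$\mathrm{Precision}(\theta) = \frac{p\int_{\theta}^{\infty} \mathrm{norm}(u, \sigma, x)\,dx}{p\int_{\theta}^{\infty} \mathrm{norm}(u, \sigma, x)\,dx + (1 - p)\int_{\theta}^{\infty} \mathrm{exp}(\lambda, c, x)\,dx}$$

$$\mathrm{Recall}(\theta) = \int_{\theta}^{\infty} \mathrm{norm}(u, \sigma, x)\,dx$$

$$\theta^{*} = \arg\max_{\theta} F = \arg\max_{\theta} \frac{(1 + \beta^2)\cdot P \cdot R}{\beta^2 \cdot P + R}$$

Here norm(u, σ, x) and exp(λ, c, x) are the normal and exponential score densities of the model, P = Precision(θ), and R = Recall(θ).

[Figure: score density curves for relevant and non-relevant documents, plotted against score]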
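The precision and recall integrals above have closed-form tails, so the F-optimal threshold can be located numerically. The sketch below uses a plain grid search, which is an assumption made here for simplicity: the slides do not say how the arg max is computed. The search range and all names are illustrative.

```python
import math


def normal_tail(theta, u, sigma):
    """Integral of the normal density from theta to infinity."""
    return 0.5 * math.erfc((theta - u) / (sigma * math.sqrt(2.0)))


def exp_tail(theta, lam, c):
    """Integral of the shifted exponential density from theta to infinity."""
    return math.exp(-lam * max(theta - c, 0.0))


def precision_recall(theta, u, sigma, lam, c, p):
    rel = p * normal_tail(theta, u, sigma)
    non = (1.0 - p) * exp_tail(theta, lam, c)
    precision = rel / (rel + non) if rel + non > 0.0 else 0.0
    recall = normal_tail(theta, u, sigma)
    return precision, recall


def f_beta(precision, recall, beta=1.0):
    denom = beta ** 2 * precision + recall
    return (1.0 + beta ** 2) * precision * recall / denom if denom > 0.0 else 0.0


def best_f_threshold(u, sigma, lam, c, p, beta=1.0, lo=0.40, hi=0.50, steps=1000):
    """Grid search for the threshold in [lo, hi] that maximizes F(beta)."""
    best_theta, best_f = lo, -1.0
    for k in range(steps + 1):
        theta = lo + (hi - lo) * k / steps
        prec, rec = precision_recall(theta, u, sigma, lam, c, p)
        f = f_beta(prec, rec, beta)
        if f > best_f:
            best_theta, best_f = theta, f
    return best_theta, best_f
```

A finer grid or a one-dimensional optimizer could replace the loop; the grid keeps the example dependency-free.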
8
What We Have Now?
• A model for score distribution
• Algorithms to find the optimal threshold for different evaluation measures, given the model
• Learning task: find the parameters for the model?
9
Bias Problem for Parameter Estimation while Filtering
We only receive feedback for documents delivered
Parameter estimation based on random sampling assumption is biased
The sampling criterion depends on the threshold, which changes over time
Solution: maximum likelihood principle, which is guaranteed to be unbiased
[Plot: number of documents against document score, with two fitted curves: estimation based on all relevant documents, and estimation based on documents delivered]
Figure: Estimation of parameters for relevant document scores of TREC9 OHSU Topic 3 with a fixed dissemination threshold 0.4435
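The bias is easy to reproduce with synthetic data. The sketch below draws relevant-document scores from a normal distribution, keeps only those above the fixed threshold 0.4435 from the figure, and compares the sample statistics. The mean 0.45 and standard deviation 0.02 are hypothetical values chosen for illustration, not the fitted TREC9 parameters.

```python
import random
import statistics


def simulate_bias(u=0.45, sigma=0.02, threshold=0.4435, n=20000, seed=0):
    """Compare statistics of all scores with those of delivered scores only."""
    rng = random.Random(seed)
    scores = [rng.gauss(u, sigma) for _ in range(n)]
    # Feedback is only available for documents scoring above the threshold.
    delivered = [s for s in scores if s > threshold]
    return (
        statistics.mean(scores),
        statistics.mean(delivered),
        statistics.pstdev(scores),
        statistics.pstdev(delivered),
    )
```

The mean estimated from delivered documents is shifted upward and the spread is underestimated, because the lower part of the distribution is never observed.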
10
Unbiased Estimation of Parameters Based on Maximum Likelihood Principle (1)
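ML: the best estimate of the parameters is the one that maximizes the probability of the training data:

$$\begin{aligned}
(u^{*}, \sigma^{*}, \lambda^{*}, p^{*}) &= \arg\max_{\theta} P(D \mid \theta) \\
&= \arg\max_{\theta} \prod_{i=1}^{N} P(D_i \mid \theta) \\
&= \arg\max_{\theta} \sum_{i=1}^{N} \log P(D_i \mid \theta) \\
&= \arg\max_{\theta} \sum_{i=1}^{N} \log P(Score = Score_i, R_i \mid \theta, Score > \theta_i)
\end{aligned}$$

where $\theta = (u, \sigma, \lambda, p)$ is the parameter vector and $\theta_i$ is the threshold in effect for document $i$.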
11
Unbiased Estimation of Parameters Based on Maximum Likelihood Principle (2)
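For each item inside the sum operation of the previous formula, with parameters $\theta = (u, \sigma, \lambda, p)$ and document threshold $\theta_i$:

$$\begin{aligned}
P(Score = Score_i, R_i \mid \theta, Score > \theta_i) &= \frac{P(Score = Score_i, R_i, Score > \theta_i \mid \theta)}{P(Score > \theta_i \mid \theta)} \\
&= \frac{P(Score = Score_i, Score > \theta_i \mid R_i, \theta)\,P(R_i \mid \theta)}{P(Score > \theta_i \mid \theta)} \\
&= \frac{P(Score = Score_i \mid R_i, \theta)\,P(R_i \mid \theta)}{P(Score > \theta_i \mid \theta)}
\end{aligned}$$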
12
Unbiased Estimation of Parameters Based on Maximum Likelihood Principle (3)
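Calculating the denominator:

$$\begin{aligned}
g(u, \sigma, \lambda, p, \theta_i) &= P(Score > \theta_i \mid \theta) \\
&= P(R = r \mid \theta)\,P(Score > \theta_i \mid R = r, \theta) + P(R = nr \mid \theta)\,P(Score > \theta_i \mid R = nr, \theta) \\
&= p\int_{\theta_i}^{\infty} P(Score = x \mid R = r)\,dx + (1 - p)\int_{\theta_i}^{\infty} P(Score = x \mid R = nr)\,dx \\
&= p\int_{\theta_i}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - u)^2}{2\sigma^2}}\,dx + (1 - p)\int_{\theta_i}^{\infty} \lambda\, e^{-\lambda\,(x - c)}\,dx \\
&= p\int_{\theta_i}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - u)^2}{2\sigma^2}}\,dx + (1 - p)\, e^{-\lambda\,(\theta_i - c)}
\end{aligned}$$

[Figure: score density curves for relevant and non-relevant documents, plotted against score]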
13
Unbiased Estimation of Parameters Based on Maximum Likelihood Principle (4)
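$$(u^{*}, \sigma^{*}, \lambda^{*}, p^{*}) = \arg\max_{(u, \sigma, \lambda, p)} \sum_{i=1}^{N} LP_i$$

• For a relevant document delivered:

$$LP_i = -\frac{(Score_i - u)^2}{2\sigma^2} + \ln\frac{p}{\sigma\, g(u, \sigma, \lambda, p, \theta_i)}$$

• For a non-relevant document delivered:

$$LP_i = -\lambda\,(Score_i - c) + \ln\frac{(1 - p)\,\lambda}{g(u, \sigma, \lambda, p, \theta_i)}$$

The constant $-\ln\sqrt{2\pi}$ of the normal density is omitted from the relevant case because it does not affect the maximization.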
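The objective for delivered documents can be written down in a few lines. The sketch below implements g and the sum of the LP terms, then checks on synthetic data that maximizing it recovers the mean u better than the naive average of delivered relevant scores. Several simplifications are assumptions made here: the "true" parameters are invented, the threshold is fixed, only u is estimated while the other parameters are held at their true values, and a grid search stands in for the conjugate gradient optimizer used in the talk.

```python
import math
import random


def g(u, sigma, lam, c, p, theta):
    """P(Score > theta) under the mixture model; assumes theta >= c."""
    normal_tail = 0.5 * math.erfc((theta - u) / (sigma * math.sqrt(2.0)))
    exp_tail = math.exp(-lam * (theta - c))
    return p * normal_tail + (1.0 - p) * exp_tail


def log_likelihood(delivered, u, sigma, lam, c, p):
    """Sum of LP_i over delivered documents.

    delivered: list of (score, is_relevant, theta_i) tuples.
    """
    total = 0.0
    for score, is_relevant, theta in delivered:
        gi = g(u, sigma, lam, c, p, theta)
        if is_relevant:
            total += -((score - u) ** 2) / (2.0 * sigma ** 2)
            total += math.log(p / (sigma * gi))
        else:
            total += -lam * (score - c) + math.log((1.0 - p) * lam / gi)
    return total


def demo(seed=0, n=5000):
    # Hypothetical "true" parameters, chosen only for illustration.
    u, sigma, lam, c, p, theta = 0.45, 0.02, 50.0, 0.40, 0.3, 0.4435
    rng = random.Random(seed)
    delivered = []
    for _ in range(n):
        is_relevant = rng.random() < p
        if is_relevant:
            score = rng.gauss(u, sigma)
        else:
            score = c + rng.expovariate(lam)
        if score > theta:  # feedback exists only for delivered documents
            delivered.append((score, is_relevant, theta))
    rel_scores = [s for s, r, _ in delivered if r]
    naive_u = sum(rel_scores) / len(rel_scores)
    # Grid search over u only; the other parameters stay at their true values.
    grid = [0.40 + 0.0005 * k for k in range(201)]
    ml_u = max(
        grid, key=lambda cand: log_likelihood(delivered, cand, sigma, lam, c, p)
    )
    return naive_u, ml_u
```

Calling `demo()` returns the naive and the maximum likelihood estimate of u; a full implementation would optimize all four parameters jointly and let the threshold vary per document.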
14
Relationship to Arampatzis’s Estimation
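If no threshold exists:

$$g(u, \sigma, \lambda, p, \theta_i) = P(Score > \theta_i \mid \theta) = 1$$

The previous formula becomes:

$$(u^{*}, \sigma^{*}, \lambda^{*}, p^{*}) = \arg\max_{(u, \sigma, \lambda, p)} \sum_{i=1}^{N} LP_i$$

• For a relevant document delivered:

$$LP_i = -\frac{(Score_i - u)^2}{2\sigma^2} + \ln\frac{p}{\sigma}$$

• For a non-relevant document delivered:

$$LP_i = -\lambda\,(Score_i - c) + \ln\big((1 - p)\,\lambda\big)$$

The corresponding result will be the same as Arampatzis’s.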
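When g = 1 the objective separates by parameter and can be maximized in closed form. The following is a short derivation sketch, assuming c is treated as a known constant and writing $N_r$ and $N_{nr}$ for the numbers of relevant and non-relevant documents with feedback (notation introduced here, not in the slides).

```latex
\begin{aligned}
L &= \sum_{i \in r}\Big[-\frac{(Score_i - u)^2}{2\sigma^2} + \ln p - \ln\sigma\Big]
   + \sum_{i \in nr}\Big[-\lambda\,(Score_i - c) + \ln(1 - p) + \ln\lambda\Big] \\[4pt]
\frac{\partial L}{\partial u} = 0 &\;\Rightarrow\; u^{*} = \frac{1}{N_r}\sum_{i \in r} Score_i \\
\frac{\partial L}{\partial \sigma} = 0 &\;\Rightarrow\; \sigma^{*2} = \frac{1}{N_r}\sum_{i \in r} (Score_i - u^{*})^2 \\
\frac{\partial L}{\partial p} = 0 &\;\Rightarrow\; p^{*} = \frac{N_r}{N_r + N_{nr}} \\
\frac{\partial L}{\partial \lambda} = 0 &\;\Rightarrow\; \lambda^{*} = \frac{N_{nr}}{\sum_{i \in nr} (Score_i - c)}
\end{aligned}
```

These are the ordinary sample mean, sample variance, relevant fraction, and reciprocal mean excess score, which is what one obtains under a random sampling assumption.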
15
Unbiased Estimation of Parameters Based on Maximum Likelihood Principle (5)
• Optimization using the conjugate gradient descent algorithm
• Smoothing using conjugate priors:
  Prior for p: beta distribution: $p^{\epsilon_1}(1 - p)^{\epsilon_2}$
  Prior for variance: $e^{-\frac{\nu^2}{2\sigma^2}}$
  Set: $\epsilon_1 = 0.001,\ \epsilon_2 = 0.001,\ \nu = 0.005$
16
Experimental Methodology (1)
• Optimization goal (similar to the measure used by TREC9):
  T9U’ = 2*Relevant_Retrieved - Non_Relevant_Retrieved = 2RR - NR
• Corresponding rule: deliver if $P(R = r \mid score) > 0.33$
• Datasets:
  OHSUMED data (348566 articles from 1987 to 1991; 63 OHSUMED queries and 500 MeSH headings to simulate user profiles)
  FT data (210158 articles from the Financial Times, 1991 to 1994; TREC topics 351-400 to simulate user profiles)
• Each profile begins with 2 relevant documents and an initial user profile
• No profile updating, for simplicity
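The 0.33 cut-off in the delivery rule follows from the T9U’ weights. Writing $P = P(R = r \mid score)$ and taking the utility of not delivering a document to be 0, a one-line derivation is:

```latex
\begin{aligned}
E[\text{utility} \mid \text{deliver}] &= 2\cdot P - 1\cdot(1 - P) = 3P - 1 \\
3P - 1 > 0 &\;\Longleftrightarrow\; P > \tfrac{1}{3} \approx 0.33
\end{aligned}
```

Delivering a document therefore has positive expected utility exactly when its probability of relevance exceeds one third.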
17
Experimental Methodology (2)
Four runs for each profile:
• Run 1: biased estimation of parameters, because sampling bias was not considered
• Run 3: maximum likelihood estimation
Both runs will stop delivering documents if the threshold is set too high, especially in the early stages of filtering. We introduced a minimum delivery ratio: if a profile has not achieved the minimum delivery ratio, its threshold is decreased automatically.
• Run 2: biased estimation + minimum delivery ratio
• Run 4: maximum likelihood estimation + minimum delivery ratio
Time: 21 minutes for the whole process of 63 OHSU topics on 4 years of OHSUMED data (ML algorithm)
18
Results: OHSUMED Data
Run 1: biased estimation
Run 2: biased estimation + min. delivery ratio
Run 3: unbiased estimation
Run 4: unbiased estimation + min. delivery ratio

                                     Run 1    Run 2    Run 3    Run 4
OHSU topics
  T9U’ utility                        1.84     3.25     2.7      8.17
  Avg. docs delivered per profile     3.83     9.65     5.73    18.40
  Precision                           0.37     0.29     0.36     0.32
  Recall                              0.036    0.080    0.052    0.137
MeSH topics
  T9U’ utility                        1.89     4.28     2.44    13.10
  Avg. docs delivered per profile     3.51    11.82     6.22    27.91
  Precision                           0.42     0.39     0.40     0.34
  Recall                              0.018    0.046    0.025    0.068
19
Results: Financial Times
Run 1: biased estimation
Run 2: biased estimation + min. delivery ratio
Run 3: unbiased estimation
Run 4: unbiased estimation + min. delivery ratio

                                     Run 1    Run 2    Run 3    Run 4
  T9U’ utility                        1.44    -0.209    0.65     0.84
  Avg. docs delivered per profile     9.58    10.44     9.05    12.27
  Precision                           0.20     0.17     0.22     0.26
  Recall                              0.161    0.167    0.15     0.193
20
Result Analysis: Difference Between Run 4 and Run 2 on TREC9 OHSU Topics
• For most of the topics, ML (Run 4) delivered more documents than Run 2
• For some of the topics, ML (Run 4) has a much higher utility than Run 2, while the two runs are similar on most of the other topics
[Plots, one point per topic (horizontal axis: Topics). Left panel: Utility: ML - Biased. Right panel: Docs delivered: ML - Biased]
21
Conclusion
• Score density distribution
  Relevant documents: normal distribution
  Non-relevant documents: exponential distribution
• Bias problem due to non-random sampling can be solved based on the maximum likelihood principle
• Significant improvement in the TREC-9 filtering task
Future work
• Thresholding while updating profiles
• Non-random sampling problem in other tasks