7/30/2019 Analysis of Precision and Recall
A Statistical Analysis of the Precision-Recall Graph
Ralf Herbrich
Microsoft Research
UK
Joint work with Hugo Zaragoza and Simon Hill
Overview
The Precision-Recall Graph
A Stability Analysis
Main Result
Discussion and Applications
Conclusions
Features of Ranking Learning
We cannot take differences of ranks.
We cannot ignore the order of ranks.
Point-wise loss functions do not capture the ranking performance!
ROC or precision-recall curves do capture the ranking performance.
We need generalisation error bounds for ROC and precision-recall curves!
Precision and Recall
Given: a sample z = ((x_1, y_1), ..., (x_m, y_m)) ∈ (X × {0,1})^m with k positive y_i, together with a function f: X → R.
Ranking the sample: re-order so that f(x_(1)) ≥ ... ≥ f(x_(m)).
Record the indices i_1, ..., i_k of the positive y_(j).
Precision p_j and recall r_j at the j-th positive example:
  r_j = j / k,    p_j = j / i_j
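The formulas on this slide were an image; under the standard definitions, recall r_j = j/k and precision p_j = j/i_j at the j-th positive example, they can be computed as in this sketch (function and variable names are mine, not from the slides):

```python
# Compute the precision/recall point at each positive example of a ranked
# sample, per r_j = j/k and p_j = j/i_j (i_j = rank of the j-th positive).

def precision_recall_points(scores, labels):
    """Return a list of (recall, precision) pairs, one per positive example."""
    # Re-order the sample so that f(x_(1)) >= ... >= f(x_(m)).
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked_labels = [labels[i] for i in order]
    k = sum(ranked_labels)               # number of positive examples
    points = []
    j = 0                                # positives seen so far
    for rank, y in enumerate(ranked_labels, start=1):
        if y == 1:                       # rank is i_j for this positive
            j += 1
            points.append((j / k, j / rank))   # (r_j, p_j)
    return points

# Toy example: positives end up at ranks 1, 3, and 5.
print(precision_recall_points([0.9, 0.8, 0.7, 0.6, 0.5], [1, 0, 1, 0, 1]))
```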
Precision-Recall: An Example
After reordering:
[Figure: the sample values f(x_(i)) in decreasing order, with the positive examples marked.]
Break-Even Point
[Figure: precision-recall curve, recall on the x-axis and precision on the y-axis (both 0 to 1), with the break-even point marked where precision equals recall.]
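A tiny sketch of how one might locate the break-even point from discrete precision-recall points; since a finite curve rarely hits the precision = recall diagonal exactly, taking the point with the smallest gap is my assumption, not the slides' convention:

```python
# The break-even point is where precision equals recall; with finitely many
# points, pick the (recall, precision) pair nearest the diagonal.

def break_even(points):
    """points: list of (recall, precision) pairs; return the pair with the
    smallest |precision - recall| gap."""
    return min(points, key=lambda rp: abs(rp[1] - rp[0]))

# Example: three points of a precision-recall curve.
print(break_even([(1/3, 1.0), (2/3, 2/3), (1.0, 0.6)]))
```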
Average Precision
[Figure: precision-recall curve, recall on the x-axis and precision on the y-axis (both 0 to 1), illustrating average precision.]
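The slide's own formula was an image; under the standard definition, consistent with the notation on the earlier slides, average precision is the mean of the precision values at the k positive examples:

```latex
A(f, z) \;=\; \frac{1}{k} \sum_{j=1}^{k} p_j \;=\; \frac{1}{k} \sum_{j=1}^{k} \frac{j}{i_j}.
```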
Stability Analysis
Case 1: y_i = 0
Case 2: y_i = 1
Proof
Case 1: y_i = 0
Case 2: y_i = 1
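The per-case bounds were images and did not survive. As a hedged illustration of what the stability analysis measures, this sketch (example data and naming are mine, not the slides' derivation) perturbs one example's score and records the largest resulting change in average precision, with the labels, and hence k, held fixed:

```python
# Empirical stability check: how much can average precision A(f, z) move when
# a single example is changed?

def average_precision(scores, labels):
    """A(f, z) = (1/k) * sum_{j=1}^{k} j / i_j, i_j = rank of j-th positive."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked = [labels[i] for i in order]
    k = sum(ranked)
    total, j = 0.0, 0
    for rank, y in enumerate(ranked, start=1):
        if y == 1:
            j += 1
            total += j / rank            # precision at the j-th positive
    return total / k

scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 0]              # k = 3 positives out of m = 6
base = average_precision(scores, labels)

worst = 0.0
for i in range(len(scores)):
    for new_score in (0.95, 0.05):       # push example i to the top or bottom
        perturbed = scores[:i] + [new_score] + scores[i + 1:]
        worst = max(worst, abs(average_precision(perturbed, labels) - base))
print(f"A(f, z) = {base:.3f}, worst single-example change = {worst:.3f}")
```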
Main Result
Theorem: For all probability measures, for all λ > 1/m, and for all f: X → R, with probability at least 1 − δ over the IID draw of a training and a test sample, both of size m: if both the training sample z and the test sample z̃ contain at least ⌈λm⌉ positive examples, then the average precisions on z and z̃ differ by at most a term that vanishes as m → ∞.
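The bound itself was an image that did not survive. Given the McDiarmid argument on the proof slide, with stability of order 1/(λm) applied to n = 2m points, it should take roughly the shape below, up to an unspecified constant C; this is a reconstruction under those assumptions, not the slide's exact statement:

```latex
\bigl| A(f, z) - A(f, \tilde{z}) \bigr|
  \;\le\; \frac{C}{\lambda} \sqrt{\frac{\ln(2/\delta)}{m}} .
```

Equivalently, the deviation behaves as if the sample size were λ²m.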
Proof
1. McDiarmid's inequality: for any function g: Z^n → R with stability c, for all probability measures P, with probability at least 1 − δ over the IID draw of Z, g(Z) deviates from its expectation by at most c √((n/2) ln(1/δ)).
2. Set n = 2m and call the two m-halves Z_1 and Z_2. Define g_i(Z) := A(f, Z_i). Then, by the IID assumption, g_1 and g_2 have the same expectation, so McDiarmid applies to their difference.
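The displayed inequality in step 1 was lost in extraction; the standard two-sided statement of McDiarmid's inequality for a function with uniform bounded differences (stability) c is:

```latex
P\Bigl( \bigl| g(Z) - \mathbb{E}[g(Z)] \bigr| \ge \varepsilon \Bigr)
  \;\le\; 2 \exp\!\left( - \frac{2\varepsilon^2}{n c^2} \right),
```

i.e. with probability at least 1 − δ, |g(Z) − E[g(Z)]| ≤ c √((n/2) ln(2/δ)).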
Discussions
First bound showing that, asymptotically (m → ∞), training and test set performance (in terms of average precision) converge!
The effective sample size is only the number of positive examples; in fact, only λ²m.
The proof can be generalised to arbitrary test sample sizes.
The constants can be improved.
Applications
Cardinality bounds
Compression bounds (TREC 2002)
No VC bounds!
No margin bounds!
Union bound:
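The equation after "Union bound:" was lost. For a finite function class F, summing the failure probabilities of the main theorem over each f ∈ F gives the standard step (my reconstruction):

```latex
P\Bigl( \exists f \in F :\; \bigl| A(f,z) - A(f,\tilde{z}) \bigr| > \varepsilon \Bigr)
  \;\le\; \sum_{f \in F} P\Bigl( \bigl| A(f,z) - A(f,\tilde{z}) \bigr| > \varepsilon \Bigr)
  \;\le\; |F| \, \delta ,
```

so a statement uniform over F costs a factor |F| in δ (equivalently, run the theorem at confidence δ/|F| per function); this is what yields the cardinality bounds above.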
Conclusions
Ranking learning requires non-point-wise loss functions.
In order to study the complexity of algorithms, we need large deviation inequalities for ranking performance measures.
McDiarmid's inequality is a powerful tool.
Future work is focused on ROC curves.