PRanking with Ranking
Based on joint work with Yoram Singer at the Hebrew University of Jerusalem
Koby Crammer Technion – Israel Institute of Technology
Problem
[Figure: the machine's predictions next to the user's ratings (values shown: 3, 3, 0, 1); ranking loss: 3]
Ranking – Formal Description
• Instances
• Labels
• Structure
• Ranking rule
• Ranking Loss
Online Framework
• Algorithm works in rounds
• On each round the ranking algorithm:
  – Gets an input instance
  – Outputs a rank as prediction
  – Receives the correct rank-value
  – Computes loss
  – Updates the rank-prediction rule
Problem Setting
1 2 3 4 5
x1 is preferred over x2
Goal
• Algorithm's loss: $L_t = \sum_{i=1}^{t} |y_i - \hat{y}_i|$
• Loss of a fixed function $f$: $L_t(f) = \sum_{i=1}^{t} |y_i - f(x_i)|$
• Regret: $L_t - \inf_{f \in F} L_t(f)$
• No statistical assumptions over data
• The algorithm should do well irrespective of the specific sequence of inputs and target labels
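The three quantities in the Goal can be computed directly for a toy sequence. The sketch below is only illustrative: the data, the predictions, and the finite `function_class` (which stands in for the class F, so the infimum becomes a minimum) are hypothetical and not taken from the talk.

```python
def cumulative_loss(predictions, labels):
    """Sum of |y_i - y_hat_i| over all rounds seen so far."""
    return sum(abs(y - y_hat) for y_hat, y in zip(predictions, labels))


def regret(predictions, instances, labels, function_class):
    """Algorithm's loss minus the loss of the best fixed function.

    Assumption: function_class is a finite list, so the infimum is a minimum.
    """
    algorithm_loss = cumulative_loss(predictions, labels)
    best_fixed_loss = min(
        cumulative_loss([f(x) for x in instances], labels)
        for f in function_class
    )
    return algorithm_loss - best_fixed_loss


# Hypothetical toy data: scalar instances, ranks in {1, 2, 3}.
instances = [0.1, 0.5, 0.9, 0.4]
labels = [1, 2, 3, 2]
predictions = [1, 3, 3, 1]  # outputs of some online algorithm

function_class = [
    lambda x: 1,                                # always rank 1
    lambda x: 2,                                # always rank 2
    lambda x: 1 + int(x > 0.3) + int(x > 0.7),  # two fixed thresholds
]

algorithm_loss = cumulative_loss(predictions, labels)
algorithm_regret = regret(predictions, instances, labels, function_class)
print(algorithm_loss, algorithm_regret)
```

Regret is measured against the best function chosen in hindsight, which is why no statistical assumption on the sequence is needed.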
Background
Binary Classification
The Perceptron Algorithm (Rosenblatt, 1958)
• Hyperplane w
• Get new instance x
• Classify x: sign(w · x)
• Update (in case of a mistake)
[Figure: a hyperplane w separating classes 1 and 2, with a new labeled instance (x, 1)]
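A minimal sketch of the Perceptron round described above, assuming labels in {-1, +1} (the slide figure labels the classes 1 and 2) and the standard additive update w ← w + y·x on a mistake. The toy data and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def perceptron_predict(w, x):
    """Classify x by the sign of w . x (a score of 0 counts as -1)."""
    return 1 if dot(w, x) > 0 else -1


def perceptron_round(w, x, y):
    """One online round: predict, then update w only if the prediction is wrong."""
    y_hat = perceptron_predict(w, x)
    if y_hat != y:
        w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, y_hat


# Hypothetical linearly separable toy data.
data = [([1.0, 2.0], 1), ([-1.0, -1.5], -1), ([2.0, 0.5], 1), ([-2.0, -0.5], -1)]

w = [0.0, 0.0]
mistakes = 0
for _ in range(5):  # a few passes over the same sequence
    for x, y in data:
        w, y_hat = perceptron_round(w, x, y)
        mistakes += int(y_hat != y)

print(w, mistakes)
```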
A Function Class for Ranking
Our Approach to Ranking
• Project
• Apply Thresholds
[Figure: projected values compared (<, >) against thresholds that separate ranks 1 to 4]
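The two steps, project and apply thresholds, can be sketched as a small prediction function. This assumes ranks grow along the direction w and that the predicted rank is the first threshold the projection falls below, with an implicit final threshold of +infinity. The name `predict_rank` and the example values are hypothetical.

```python
import math


def predict_rank(w, thresholds, x):
    """Project x onto w, then return the smallest rank r with w . x < b_r.

    thresholds holds the k - 1 finite thresholds b_1 <= ... <= b_{k-1};
    the k-th threshold is taken to be +infinity, so a rank is always found.
    """
    score = sum(wi * xi for wi, xi in zip(w, x))
    for r, b in enumerate(list(thresholds) + [math.inf], start=1):
        if score < b:
            return r


w = [1.0, 0.0]
thresholds = [-1.0, 0.0, 1.0]  # three thresholds give four ranks
for x in ([-2.0, 0.0], [-0.5, 0.0], [0.5, 0.0], [3.0, 0.0]):
    print(x, predict_rank(w, thresholds, x))
```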
Update of a Specific Algorithm
• If it's not broken, don't fix it
• As little change as possible
• One step at a time
PRank – Summary of Update
• Direction w, Thresholds
• Rank a new instance x
• Get the correct rank y
• Compute Error-Set E
• Update:
  – the direction w
  – the thresholds
[Figure: instances projected onto the direction w; thresholds split the line into rank levels 1 to 5, with the correct rank interval marked; the update step is drawn with the vectors w and x]
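The PRank Algorithm
• Maintain: direction w and thresholds
• Get an instance x
• Predict a rank
• Get the true rank y
• Compute the error set
• Error set non-empty?
  – No: continue to the next instance
  – Yes: update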
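The loop above can be sketched in a few lines. The formulas on the slides were lost in extraction, so the update below follows the PRank rule as published by Crammer and Singer and should be read as an assumption about what the slides showed: for each threshold r, set y_r = -1 if y ≤ r and +1 otherwise; r is in the error set when (w · x − b_r)·y_r ≤ 0; then w moves by (sum of the erring y_r)·x and each erring b_r moves by −y_r. The class name, the toy data, and the number of epochs are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class PRank:
    """Sketch of an online ranker with one direction w and k - 1 thresholds."""

    def __init__(self, dim, k):
        self.k = k
        self.w = [0.0] * dim
        self.b = [0.0] * (k - 1)  # the k-th threshold (+infinity) is implicit

    def predict(self, x):
        """Smallest rank r with w . x < b_r, or k if no finite threshold exceeds it."""
        score = dot(self.w, x)
        for r, b_r in enumerate(self.b, start=1):
            if score < b_r:
                return r
        return self.k

    def update(self, x, y):
        """One round: predict, then correct w and the erring thresholds if needed."""
        y_hat = self.predict(x)
        if y_hat != y:
            score = dot(self.w, x)  # computed once, before anything changes
            taus = []
            for r, b_r in enumerate(self.b, start=1):
                y_r = -1 if y <= r else 1
                # tau_r is non-zero only for thresholds in the error set
                taus.append(y_r if (score - b_r) * y_r <= 0 else 0)
            step = sum(taus)
            self.w = [wi + step * xi for wi, xi in zip(self.w, x)]
            self.b = [b_r - tau for b_r, tau in zip(self.b, taus)]
        return y_hat


# Hypothetical toy sequence: one feature, three ranks.
data = [([0.0], 1), ([1.0], 2), ([2.0], 3)]
model = PRank(dim=1, k=3)
total_loss = 0
for _ in range(10):
    for x, y in data:
        total_loss += abs(model.update(x, y) - y)

print(model.w, model.b, total_loss)
```

Thresholds outside the error set are left untouched, which is the conservative behavior the talk describes: change as little as possible, and only on a mistake.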
Analysis Two Lemmas
Consistency
• Can the following happen?
[Figure: thresholds b_1, …, b_4 placed out of order along the direction w]
• No. The order of the thresholds is preserved after each round of PRank: $b_1 \le b_2 \le \dots \le b_{k-1}$.
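The consistency claim can be checked empirically. The sketch below runs a PRank-style update (the published Crammer and Singer rule, assumed here because the slide formulas did not survive) on random instances with arbitrary labels and verifies after every round that the thresholds stay sorted. The function name, seed, and data distribution are hypothetical.

```python
import random


def prank_step(w, b, x, y):
    """One PRank round; b holds the k - 1 finite thresholds. Returns updated (w, b)."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    # Predicted rank: first threshold above the score, else the top rank k.
    y_hat = next((r for r, b_r in enumerate(b, start=1) if score < b_r), len(b) + 1)
    if y_hat == y:
        return w, b
    signs = [-1 if y <= r else 1 for r in range(1, len(b) + 1)]
    taus = [s if (score - b_r) * s <= 0 else 0 for s, b_r in zip(signs, b)]
    w = [wi + sum(taus) * xi for wi, xi in zip(w, x)]
    b = [b_r - tau for b_r, tau in zip(b, taus)]
    return w, b


rng = random.Random(0)
w, b = [0.0, 0.0], [0.0] * 4  # k = 5 ranks
always_ordered = True
for _ in range(2000):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    y = rng.randint(1, 5)  # arbitrary labels: no structure is assumed
    w, b = prank_step(w, b, x, y)
    always_ordered = always_ordered and all(
        b[i] <= b[i + 1] for i in range(len(b) - 1)
    )

print(always_ordered, b)
```

A run like this is only a sanity check; the lemma itself holds for every input sequence.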
Regret Bound
Given:
• Arbitrary input sequence
Easy case:
• Assume there exists a model that ranks all the input instances correctly
  – The total loss the algorithm suffers is bounded
Hard case:
• In general
  – A “regret” is bounded: $L_t - \inf_{f \in F} \tilde{L}_t(f)$
Ranking Margin
• Margin(x, y) = $\min_r \,(w \cdot x - b_r)\, y_r$, where $y_r = +1$ if $r < y$ and $y_r = -1$ otherwise
• Margin = $\min_{(x, y)}$ Margin(x, y)
[Figure: rank levels 1 to 5 along the direction w]
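A small sketch of the margin computation, assuming the definition used in the published PRank analysis: the margin of an example is the smallest value of (w · x − b_r)·y_r over the thresholds, with y_r = +1 for r < y and −1 otherwise, and the margin of a ranker is the minimum over all examples. The normalization by the joint norm of (w, b) is also an assumption about what "normalized ranker" means. All names and data are hypothetical.

```python
import math


def example_margin(w, b, x, y):
    """Smallest signed distance from the projection w . x to any threshold."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return min(
        (score - b_r) * (1 if r < y else -1) for r, b_r in enumerate(b, start=1)
    )


def ranking_margin(w, b, examples):
    """Margin of the ranker (w, b) on a set of examples: the worst example margin."""
    return min(example_margin(w, b, x, y) for x, y in examples)


def normalized_margin(w, b, examples):
    """Margin after scaling (w, b) to unit Euclidean norm (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in w) + sum(v * v for v in b))
    return ranking_margin(w, b, examples) / norm


examples = [([0.0], 1), ([1.0], 2), ([2.0], 3)]
w, b = [1.0], [0.5, 1.5]
print(ranking_margin(w, b, examples), normalized_margin(w, b, examples))
```

A positive margin means every example falls strictly inside its correct rank interval.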
Mistake Bound
Given:
• Input sequence $(x_1, y_1), \dots, (x_t, y_t)$
• Norm of instances is bounded
• Ranked correctly by a normalized ranker with Margin > 0
Then:
• The number of mistakes PRank makes is bounded
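The formula for the bound did not survive extraction. As a hedged reconstruction, the bound stated in the published PRank analysis (Crammer and Singer) has the following form, where k is the number of ranks, R bounds the norm of the instances, and γ is the margin of a unit-norm ranker (w*, b*) that ranks the whole sequence correctly. The indexing is adapted to the notation used earlier in this talk and should be checked against the paper.

```latex
\sum_{i=1}^{t} \lvert \hat{y}_i - y_i \rvert
  \;\le\; \frac{(k-1)\,(R^2 + 1)}{\gamma^2},
\qquad
R^2 = \max_i \lVert x_i \rVert^2,
\qquad
\gamma = \min_{i,\,r}\,\bigl(w^\ast \cdot x_i - b^\ast_r\bigr)\, y_{i,r} > 0
```

Here $y_{i,r} = +1$ if $r < y_i$ and $-1$ otherwise. The left-hand side is the cumulative ranking loss, which is at least the number of rounds with a mistake.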
Exploit Structure (loss, range, and structure per task)
• Classification: structure None (under-constrained)
• Regression: structure Metric (over-constrained)
• Ranking: structure Order
Other Approaches
• Treat ranking as classification or regression
  E.g. Basu, Hirsh, Cohen 1998
• Reduce a ranking problem to a classification problem over pairs of examples
  E.g. Freund, Iyer, Schapire, Singer 1998; Herbrich, Graepel, Obermayer 2000
  – Not simple to combine preference predictions over pairs into a single consistent ordering
  – No simple adaptation for online settings
Empirical Study
An Illustration
• Five concentric ellipses
• Training set of 50 points
• Three approaches
  – PRank (ranking)
  – MC-Perceptron (classification)
  – Widrow-Hoff (regression)
Each-Movie database
• 74424 registered users
• 1648 listed movies
• Users' rankings of movies
• 7451 users saw >100 movies
• 1801 users saw >200 movies
[Figure: Ranking Loss, 100 Viewers. Rank loss vs. round for WH (regression, over-constrained), MC-Perceptron (classification, under-constrained), and PRank (accurately constrained)]
[Figure: Ranking Loss, 200 Viewers. Rank loss vs. round for WH (regression), MC-Perceptron (classification), and PRank]
Demonstration
(1) The user chooses movies from this list
(2) Movies chosen and ranked by the user
(3) Press the ‘learn’ key. The system learns the user’s taste
(4) The system re-ranks the training set
(5) The system re-ranks a fresh set of as-yet unseen movies
(6) Press the ‘flip’ button to see which movies you should not view
(7) The flipped list
• Many alternatives to formulate ranking
• Choose the one that best models your problem
• Exploit and incorporate structure
• Specifically:
  – Online algorithm for ranking problems via projections and conservative updates of the projection’s direction and the threshold values
  – Experiments on a synthetic dataset and on the Each-Movie dataset indicate that the PRank algorithm performs better than algorithms for classification and regression