Date posted: 19-Dec-2015
On the Limits of Dictatorial Classification
Reshef Meir, School of Computer Science and Engineering, Hebrew University
Joint work with Shaull Almagor, Assaf Michaely and Jeffrey S. Rosenschein
Strategy-Proof Classification
• An Example
• Motivation
• Our model and previous results
• Filling the gap: proving a lower bound
• The weighted case
Introduction

Strategic labeling: an example

(Figure: a labeled dataset and the classifier chosen on it, which makes 5 errors.)
"There is a better classifier! (for me…)" — one agent would prefer a different classifier.
"If I just change the labels…" — after the manipulation, the chosen classifier makes 2 + 5 = 7 errors.
Classification

The supervised classification problem:
• Input: a set of labeled data points {(xi, yi)}i=1..m
• Output: a classifier c from some predefined concept class C (e.g., functions of the form f : X → {−, +})
• We usually want c not only to classify the sample correctly, but to generalize well, i.e., to minimize the expected error w.r.t. the distribution D (the 0/1 loss function):
  R(c) ≡ E(x,y)~D[ c(x) ≠ y ]
Classification (cont.)
• A common approach is to return the ERM (Empirical Risk Minimizer), i.e., the concept in C that best fits the given samples (has the lowest number of errors)
• It generalizes well under some assumptions on the concept class C (e.g., linear classifiers tend to generalize well)
• With multiple experts, we can't trust our ERM!
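The ERM approach above can be sketched in a few lines; the threshold concept class and the sample values below are assumptions for illustration, not from the talk:

```python
from typing import Callable, List, Tuple

Sample = Tuple[float, int]  # (point x, label y in {-1, +1})

def empirical_errors(c: Callable[[float], int], samples: List[Sample]) -> int:
    """Number of samples the classifier labels incorrectly (0/1 loss)."""
    return sum(1 for x, y in samples if c(x) != y)

def erm(concepts: List[Callable[[float], int]], samples: List[Sample]):
    """Return the concept in C with the fewest empirical errors."""
    return min(concepts, key=lambda c: empirical_errors(c, samples))

# Hypothetical finite concept class: threshold classifiers on the real line.
C = [lambda x, t=t: 1 if x >= t else -1 for t in (0.0, 1.0, 2.0, 3.0)]

S = [(0.5, -1), (1.5, 1), (2.5, 1), (0.2, -1)]
best = erm(C, S)
print(empirical_errors(best, S))  # the t = 1.0 threshold fits S perfectly -> 0
```

With a single truthful expert this is exactly what a learner would do; the rest of the talk asks what happens when the labels come from several self-interested agents.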
Motivation

Where do we find "experts" with incentives?

Example 1: A firm learning purchase patterns
– Information is gathered from local retailers
– The resulting policy affects them: "the best policy is the policy that fits my pattern"

(Figure: users report a dataset; the classification algorithm outputs a classifier.)
Example 2: Internet polls / polls of experts
Motivation from other domains
• Aggregating partitions
• Judgment aggregation
• Facility location (on the binary cube)

For example, in judgment aggregation:

  Agent   A   B   A & B   A | ~B
    1     T   F     F       T
    2     F   T     F       F
    3     F   F     F       T
Model

A problem instance is defined by:
• A set of agents I = {1,...,n}
• A set of data points X = {x1,...,xm} ⊆ X
• For each xk ∈ X, agent i has a label yik ∈ {−, +}
  – Each pair sik = ⟨xk, yik⟩ is a sample
  – All samples of a single agent compose the labeled dataset Si = {si1,...,si,m(i)}
• The joint dataset S = ⟨S1, S2,…, Sn⟩ is our input; m = |S|
• We denote the dataset with the reported labels by S′
Input: Example

(Figure: agents 1, 2, and 3 label the same points, each with its own mix of + and − labels.)
• X ∈ Xm; Y1 ∈ {−,+}m, Y2 ∈ {−,+}m, Y3 ∈ {−,+}m
• S = ⟨S1, S2,…, Sn⟩ = ⟨(X,Y1),…, (X,Yn)⟩
Mechanisms
• A mechanism M receives a labeled dataset S and outputs c = M(S) ∈ C
• Private risk of i: Ri(c,S) = |{k : c(xik) ≠ yik}| / mi (the fraction of errors on Si)
• Global risk: R(c,S) = |{⟨i,k⟩ : c(xik) ≠ yik}| / m (the fraction of errors on S)
• We allow non-deterministic mechanisms, and measure the expected risk
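The two risk definitions translate directly to code; the data layout (each agent as a list of (x, y) samples) and the example labels below are assumptions for illustration:

```python
from fractions import Fraction

def private_risk(c, S_i):
    """R_i(c, S): fraction of agent i's own samples that c misclassifies."""
    return Fraction(sum(1 for x, y in S_i if c(x) != y), len(S_i))

def global_risk(c, S):
    """R(c, S): fraction of all samples, over all agents, misclassified."""
    m = sum(len(S_i) for S_i in S)
    return Fraction(sum(1 for S_i in S for x, y in S_i if c(x) != y), m)

all_pos = lambda x: +1
S = [[(1, +1), (2, +1)],   # agent 1: two positive labels
     [(1, -1), (2, +1)]]   # agent 2: one negative, one positive
print(private_risk(all_pos, S[0]))  # 0
print(private_risk(all_pos, S[1]))  # 1/2
print(global_risk(all_pos, S))      # 1/4
```

Exact fractions keep the later approximation-ratio comparisons free of floating-point noise.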
ERM

We compare the outcome of M to the ERM:
  c* = ERM(S) = argmin c∈C R(c,S)
  r* = R(c*,S)

Can our mechanism simply compute and return the ERM?
Requirements
1. Good approximation: ∀S, R(M(S),S) ≤ α∙r*
2. Strategy-proofness (SP): ∀i, S, Si′:  Ri(M(S−i, Si′), S) (lying)  ≥  Ri(M(S), S) (truth)

• ERM(S) is 1-approximating but not SP
• ERM(S1) is SP but gives a bad approximation

MOST IMPORTANT SLIDE: Are there any mechanisms that guarantee both SP and good approximation?
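A minimal sketch of why ERM(S) is not SP, using the two-concept class ("all positive" / "all negative") that appears later in the talk; the specific label counts are hypothetical:

```python
from fractions import Fraction

all_pos = lambda x: +1
all_neg = lambda x: -1
C = [all_neg, all_pos]

def errors(c, labels):
    # Only labels matter here: both concepts ignore the point itself.
    return sum(1 for y in labels if c(None) != y)

def erm(C, reports):
    """ERM over the pooled reported labels (one label list per agent)."""
    flat = [y for r in reports for y in r]
    return min(C, key=lambda c: errors(c, flat))

truth_1 = [+1, +1, +1, -1]   # agent 1's true labels
truth_2 = [-1, -1, -1]       # agent 2's true labels

honest = erm(C, [truth_1, truth_2])           # 3 '+' vs 4 '-': all-negative wins
risk_honest = Fraction(errors(honest, truth_1), len(truth_1))   # 3/4 for agent 1

lie_1 = [+1, +1, +1, +1]     # agent 1 exaggerates: reports all-positive
manip = erm(C, [lie_1, truth_2])              # now 4 '+' vs 3 '-': all-positive wins
risk_lying = Fraction(errors(manip, truth_1), len(truth_1))     # 1/4 for agent 1

print(risk_lying < risk_honest)  # True: lying strictly lowers agent 1's risk
```

Agent 1 drops its private risk from 3/4 to 1/4 by exaggerating, which is exactly the manipulation ruled out by the SP requirement.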
Related work
• A study of SP mechanisms in regression learning [supervised learning]:
  O. Dekel, F. Fischer and A. D. Procaccia, SODA 2008; JCSS 2009
• No SP mechanisms for clustering [unsupervised learning]:
  J. Perote-Peña and J. Perote, Economics Bulletin 2003
Results

A simple case (previous work: Meir, Procaccia and Rosenschein, AAAI 2008)
• Tiny concept class: |C| = 2
• Either "all positive" or "all negative"

Theorem:
• There is an SP 2-approximation mechanism
• There is no SP α-approximation mechanism for any α < 2
General concept classes (previous work: Meir, Procaccia and Rosenschein, IJCAI 2009)

Theorem: Selecting a dictator at random is SP and guarantees a (3 − 2/n)-approximation
– True for any concept class C
– Generalizes well from sampled data when C has a bounded VC dimension

Open question #1: are there better mechanisms?
Open question #2: what if agents are weighted?
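A sketch of the random-dictator mechanism on a small hypothetical instance, checking the claimed approximation guarantee (taken here as 3 − 2/n); the concept class is the two-concept one from the simple case, and all numbers below are illustrative:

```python
from fractions import Fraction

all_neg, all_pos = (lambda x: -1), (lambda x: +1)
C = [all_neg, all_pos]

def errors(c, labels):
    return sum(1 for y in labels if c(None) != y)

def global_risk(c, datasets):
    m = sum(len(d) for d in datasets)
    return Fraction(sum(errors(c, d) for d in datasets), m)

S = [[+1, +1, +1, -1],   # agent 1's labels
     [-1, -1, -1, -1]]   # agent 2's labels
n = len(S)

c_star = min(C, key=lambda c: global_risk(c, S))
r_star = global_risk(c_star, S)               # all-negative: r* = 3/8

# Random dictator: pick an agent uniformly, return that agent's own ERM.
dictators = [min(C, key=lambda c: errors(c, d)) for d in S]
expected = sum(global_risk(c, S) for c in dictators) / n   # (5/8 + 3/8)/2 = 1/2

print(expected <= (3 - Fraction(2, n)) * r_star)  # True: within the bound
```

On this instance the ratio is (1/2)/(3/8) = 4/3, comfortably inside the 3 − 2/2 = 2 guarantee for n = 2; the lower-bound construction that follows shows the guarantee is tight in the worst case.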
A lower bound

Our main result:
Theorem: There is a concept class C (with |C| = 3) for which any SP mechanism has an approximation ratio of at least 3 − 2/n
o Matching the upper bound from IJCAI-09
o Proof is by a careful reduction to a voting scenario
o We will see the proof sketch
Proof sketch

Gibbard [1977] proved that every (randomized) SP voting rule for 3 candidates must be a lottery over dictators*.

We define X = {x, y, z}, and C as follows:

        x   y   z
  cx    +   −   −
  cy    −   +   −
  cz    −   −   +

We also restrict the agents, so that each agent can have mixed labels on just one point.

(Figure: an instance with many agents' labels over x, y, z.)
Proof sketch (cont.)

Suppose that M is SP. Then:
1. M must be monotone on the mixed point
2. M must ignore the mixed point
3. M is a (randomized) voting rule

(Figure: the same instance; each agent's labels induce a preference order over C, e.g. cz > cy > cx for one agent and cx > cz > cy for another.)
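The reduction hinges on reading each agent's labels as a vote over C. A sketch of how a label profile induces a preference order, using the three-concept class from the slide; the agent's labels below are a hypothetical example:

```python
# The concept class C = {cx, cy, cz} over X = {x, y, z}, as on the slide:
# each concept labels exactly one point positive.
C = {"cx": {"x": +1, "y": -1, "z": -1},
     "cy": {"x": -1, "y": +1, "z": -1},
     "cz": {"x": -1, "y": -1, "z": +1}}

def errors(concept, samples):
    """Agent's private errors under a concept; samples are (point, label) pairs."""
    return sum(1 for p, y in samples if C[concept][p] != y)

# A hypothetical agent whose positive labels concentrate on z (mixed on y only).
agent = [("x", -1), ("y", -1), ("z", +1), ("z", +1), ("y", +1)]

# Ranking concepts by private risk yields this agent's "vote" over C.
ranking = sorted(C, key=lambda c: errors(c, agent))
print(ranking)  # ['cz', 'cy', 'cx'], i.e. cz > cy > cx
```

An SP mechanism that ignores the mixed point sees only these rankings, which is what lets Gibbard's characterization of SP voting rules apply.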
Proof sketch (cont.)
4. By Gibbard [1977], M is a random dictator
5. We construct an instance where random dictators perform poorly
Weighted agents
• We must select a dictator randomly
• However, the probability may be based on weight
• Naïve approach: select agent i with probability pr(i) proportional to wi
  o Only gives a 3-approximation
• An optimal SP algorithm: adjusts the selection probabilities
  o Matches the lower bound of 3 − 2/n
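The naïve weighted selection can be sketched with the standard library; the weights below are hypothetical, and the optimal SP rule mentioned above would use adjusted probabilities rather than weights directly:

```python
import random

def pick_dictator(weights, rng):
    """Select an agent index with probability proportional to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

w = [0.5, 0.3, 0.2]          # hypothetical agent weights
rng = random.Random(0)        # seeded for reproducibility
counts = [0, 0, 0]
for _ in range(10000):
    counts[pick_dictator(w, rng)] += 1
print([c / 10000 for c in counts])  # empirical frequencies, roughly [0.5, 0.3, 0.2]
```

The selected agent's own ERM is then returned, exactly as in the unweighted random-dictator mechanism.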
Future work
• Other concept classes
• Other loss functions (linear loss, quadratic loss, …)
• Alternative assumptions on the structure of the data
• Other models of strategic behavior
• …