
Simultaneous Active Learning of Classifiers & Attributes via Relative Feedback
Arijit Biswas (University of Maryland, College Park) and Devi Parikh (Virginia Tech)

Attributes-based Feedback for Training Classifiers

[Figure: attributes-based feedback, from Parkash & Parikh, ECCV 2012. Left: the learner predicts "I think this is a forest." The supervisor rejects: "No, it is TOO OPEN for a forest." All images more open than the query must not be forests either. Right: the supervisor accepts: "Yes, this is a forest." The query is a forest and it is not anything else.]

+ Leads to better discriminative performance than label-based feedback
- Needs pre-trained attribute models
- All negative examples treated as equally likely
- Query image not selected intelligently

Contributions

- Learn attribute & category models simultaneously on the fly; do not require pre-trained attributes → more flexible & practical
- Actively select the query image to maximize expected gain from attributes-based feedback → faster learning
- Intelligently weigh instances based on feedback → robust learning

- Large Relative Face Attributes Dataset created!
  - 60 categories [PubFig, Kumar et al. 2009]
  - 29 attributes
  - Available online

- Large vocabulary of categories; users can only verify
- Realistic in surveillance, bird, or leaf recognition

Attributes: 'White', 'DarkHair', 'StraightHair', 'Beard', 'GoodLooking', 'young', 'BagsUnderEyes', 'Baldness', 'Chubbiness', 'BushyEyebrows', 'EyesOpen', 'HighCheekbones', 'MasculineLooking/Male', 'MouthOpen', 'Mustache', 'NarrowEyes', 'PointyNose', 'BigNose', 'Nose-to-MouthLines', 'RosyCheeks', 'RoundFace', 'RoundJaw', 'Shiny Skin', 'LongSideburns', 'Smiling', 'VisibleTeeth', 'VisibleForehead', 'WearingLipstick', 'BigLips'

Weighing Negative Examples

[Figure: weighing negative examples. The learner predicts "I think this is a forest"; the supervisor answers "No, it is TOO OPEN for a forest." Unlabeled images are ordered by the openness attribute, rightmost being most open: images only slightly more open than the query are less likely to not be forests, while images far more open are more likely to not be forests.]

- $w^l_Q(x)$ captures the likelihood, at iteration $Q$, that unlabeled image $x$ is not from class $l$
- It is computed from the attributes-based feedback received over past iterations:

$$w^l_Q(x) = \sum_{q=1}^{Q} n_q(x) \quad (1)$$

- $n_q(x)$ is 0 if: $l$ was not the predicted label for $x_q$, OR $l$ was correctly predicted, OR $x$ does not have more of $a_{m_q}$ (the attribute in the feedback at iteration $q$) than $x_q$, i.e. attribute strength $r_{m_q}(x) < r_{m_q}(x_q)$
- Otherwise, $n_q(x)$ is the number of images between $x_q$ and $x$ when sorted by attribute $a_{m_q}$
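A minimal sketch of how these weights could be accumulated, assuming feedback records of the form (query index, predicted label, correctness, attribute index) and a precomputed attribute-strength matrix; all names here are illustrative, not from the paper:

```python
import numpy as np

def update_weights(w, feedback, r, images):
    """Accumulate w^l_Q(x) (Eq. 1) over past feedback rounds.

    w:        dict mapping class l -> weight array, one entry per unlabeled image
    feedback: list of (q_idx, pred_label, was_correct, m) tuples, one per iteration
    r:        (num_images, num_attributes) array of attribute strengths r_m(x)
    images:   global indices of the unlabeled images being weighed
    """
    for q_idx, pred_label, was_correct, m in feedback:
        if was_correct:
            continue  # n_q(x) = 0 when l was correctly predicted
        order = np.argsort(r[:, m])                    # images sorted by attribute a_m
        rank = {img: pos for pos, img in enumerate(order)}
        for i, x in enumerate(images):
            # n_q(x) = 0 unless x has MORE of a_m than the rejected query x_q
            if r[x, m] < r[q_idx, m]:
                continue
            # otherwise: number of images between x_q and x in the a_m ordering;
            # weight accrues only to the class l that was (wrongly) predicted,
            # which covers the "l was not the predicted label" zero case
            w[pred_label][i] += abs(rank[x] - rank[q_idx])
    return w
```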

Simultaneous Attribute and Category Learning

- Does not need pre-trained attributes!
- The user can introduce any attribute at any time; highly flexible
- If the user says "$x_q$ is too $a_m$ to be $l$", the learner fetches images labeled as $l$ and appends the constraints $\hat{O}_m = O_m \cup \{(x_q, x_j)\}$, where the $x_j$ have been labeled as $l$ and $O_m$ is the set of ordering constraints used to train $a_m$
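A minimal sketch of that update, assuming $O_m$ is stored as a list of (greater, lesser) pairs feeding a relative-attribute ranker; the ranker training is a stub, not the paper's implementation:

```python
def retrain_ranker(attr_m, constraints):
    """Stub: train a ranking function for attr_m (e.g., a rank-SVM as in
    relative attributes) on (hi, lo) ordering pairs."""
    pass

def handle_too_feedback(query, attr_m, label_l, labels, O):
    """User says: "query is too attr_m to be label_l".

    labels: dict mapping image -> class label (the labeled set so far)
    O:      dict mapping attribute -> list of (hi, lo) ordering constraints,
            meaning hi has strictly more of the attribute than lo
    """
    # Every image already labeled l should have LESS of attr_m than the query:
    # O_m <- O_m U {(x_q, x_j)} for all x_j labeled l
    for x_j, l in labels.items():
        if l == label_l:
            O.setdefault(attr_m, []).append((query, x_j))
    # Retrain the relative-attribute ranker on the grown constraint set
    retrain_ranker(attr_m, O[attr_m])
```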

[Figure: after the feedback "No, it is TOO OPEN for a forest", images more open than the query must not be forests either; images labeled as forest should be less open than the query. The resulting ordering constraints are used to train the attribute models, and the inferred negative examples are used to train the category models.]

Active Selection of Images

- Select the image that reduces the expected system entropy the most
- A novel active learning algorithm in the attributes-based feedback setup
- Present entropy of the system:

$$H = -\sum_{i=1}^{N} \sum_{k=1}^{K} p_k(x_i) \log(p_k(x_i)) \quad (2)$$

- $p_k(x_i)$ is the probability of image $x_i$ belonging to class $k$ (according to classifier $h_k$) and $N$ is the number of images in the unlabeled set

Which image should I ask about?
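A one-function sketch of Eq. (2), assuming the per-class classifier outputs for the N unlabeled images are stacked into a NumPy array (illustrative names only):

```python
import numpy as np

def system_entropy(P):
    """Total entropy of the unlabeled pool (Eq. 2).

    P: (N, K) array with P[i, k] = p_k(x_i), rows normalized over classes.
    """
    eps = 1e-12                          # guard against log(0)
    return -np.sum(P * np.log(P + eps))
```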

Select the image which reduces the expected entropy of the system.

- Expected change in the entropy of the system:

$$\Delta H(i) = H - \left[ p^0 H^0 + p^1 \left( \sum_{m=1}^{M} p_m^{1+} H_m^{1+} + \sum_{m=1}^{M} p_m^{1-} H_m^{1-} \right) \right] \quad (3)$$

- $p^0$ is the probability that the user accepts the label for $x_i$; $H^0$ is the resultant entropy of the system
- $p^1 = 1 - p^0$ is the probability that the user rejects the label and provides attributes-based feedback; the resultant entropy then depends on which feedback statement is given
- There are $2M$ possible feedback statements ($M$ attributes, each with "too" or "not enough")
- The chance of the supervisor picking attribute $m$ with a "too" response is $p_m^{1+}$ and with a "not enough" response $p_m^{1-}$; the resultant entropies of the system are $H_m^{1+}$ and $H_m^{1-}$ respectively
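A brute-force sketch of the selection criterion, reusing system_entropy from the sketch above. Here simulate_accept and simulate_feedback stand in for retraining the classifiers under each hypothetical answer and returning the updated (N, K) probability matrix; they and the probability estimates are illustrative placeholders, not the paper's code:

```python
def expected_entropy_drop(i, P, p_accept, p_too, p_notenough,
                          simulate_accept, simulate_feedback):
    """Expected entropy reduction Delta H(i) (Eq. 3) for querying image i.

    P:              current (N, K) probability matrix; H is its entropy
    p_accept:       estimated p^0 that the user accepts the predicted label
    p_too[m]:       p_m^{1+}, chance of "too <attribute m>" feedback
    p_notenough[m]: p_m^{1-}, chance of "not enough <attribute m>" feedback
    """
    H = system_entropy(P)
    # Entropy if the user accepts the predicted label (H^0)
    H0 = system_entropy(simulate_accept(i, P))
    # Expected entropy over the 2M possible rejection feedbacks
    H_reject = 0.0
    for m in range(len(p_too)):
        H_reject += p_too[m] * system_entropy(simulate_feedback(i, m, "too", P))
        H_reject += p_notenough[m] * system_entropy(simulate_feedback(i, m, "not_enough", P))
    p_reject = 1.0 - p_accept  # p^1
    return H - (p_accept * H0 + p_reject * H_reject)
```

The query would then be the unlabeled image maximizing this quantity, which is exactly what makes the brute-force version expensive.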

Efficient Active Selection

- The brute-force method for finding the best image has a high computational cost: it requires learning $2NM$ ranking functions at each iteration
- We propose a fast approximation by clustering: train a ranking function only for every cluster center instead of every image (a sketch follows the figure below)
- Only 5-7% of the ranking functions need to be trained

[Figure: images labeled with class l and unlabeled images predicted to be from class l, arranged by increasing attribute strength and grouped into clusters 1..C; a cluster representative stands in for every image in its cluster.]
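A minimal sketch of picking those representatives, assuming 1-D k-means over attribute strengths; k-means is an assumption here, the poster only specifies clustering:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_representatives(strengths, num_clusters):
    """Pick one representative image per cluster of attribute strengths.

    strengths: (N,) attribute strengths of the candidate images for one attribute.
    Returns indices of the images closest to each cluster center; the expected-
    entropy simulation is then run only for these representatives, so only a
    small fraction of the 2NM ranking functions is trained per iteration.
    """
    km = KMeans(n_clusters=num_clusters, n_init=10).fit(strengths.reshape(-1, 1))
    reps = []
    for c in range(num_clusters):
        members = np.where(km.labels_ == c)[0]
        center = km.cluster_centers_[c, 0]
        # representative = cluster member nearest the center
        reps.append(members[np.argmin(np.abs(strengths[members] - center))])
    return reps
```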

Collecting Relative Attribute Data from Mturk

- Exhaustive data collection to run experiments automatically while still using feedback from real users
- Show example images from a pair of categories to 10 workers on Mturk and ask which category has a stronger presence of an attribute
- Two interfaces used for experiments: (1) Mturk workers provide free-form attribute feedback; (2) Mturk workers choose an attribute from a list that corresponds to the most obvious difference between the two categories

[Example Mturk prompt: images of Miley Cyrus and Hugh Laurie with the question "Who is older?"]

Experimental Results

Name of Method                              | Attribute Feedback Used | Weighting Scheme Used | Attributes Learned | Query Image Selection
Baseline passive                            | no                      | N/A                   | N/A                | random
Baseline active                             | no                      | N/A                   | N/A                | max-entropy
Parkash & Parikh - active                   | yes                     | no                    | pre-trained        | max-entropy
Parkash & Parikh - passive                  | yes                     | no                    | pre-trained        | random
Proposed passive-pre-trained-weights        | yes                     | yes                   | pre-trained        | random
Proposed passive-on-the-fly-without-weights | yes                     | no                    | on-the-fly         | random
Proposed passive-on-the-fly-weights         | yes                     | yes                   | on-the-fly         | random
Proposed active-maxent-on-the-fly-weights   | yes                     | yes                   | on-the-fly         | max-entropy
Proposed                                    | yes                     | yes                   | on-the-fly         | proposed

Table: Summary of the algorithms that we compare

- We run experiments in two different domains: faces and shoes
- Weighing negative samples improves performance, especially with pre-trained attributes
- Learning attributes on the fly lets us add more correct negative examples to the classifier
- No pre-training cost, yet better classification with attributes learned on the fly!

[Figure: impact of our proposed weighting scheme and of learning attribute models on the fly. Panels: (a) Shoes-750-10, (b) Pubfig-772-8. Axes: Number of Iterations vs. Classification Accuracy. Curves: Parkash and Parikh-passive, Proposed passive-pre-trained-weights, Proposed passive-on-the-fly-without-weights, Proposed passive-on-the-fly-weights.]

[Figure: comparing our proposed approach to the baselines in the table above. Panels: (a) Pubfig-772-8, (b) Pubfig-900-60, (c) Shoes-750-10. Axes: Number of Iterations vs. Classification Accuracy. Curves: Baseline passive, Baseline active, Parkash and Parikh-active [17], Parkash and Parikh-passive, Proposed passive-pre-trained-weights, Proposed passive-on-the-fly-weights, Proposed active-maxent-on-the-fly-weights, Proposed.]

- Our active method outperforms the passive and traditional maximum-entropy image selection methods

Additional Experiments

- The fast active approach is not much worse than brute force (tested on a smaller dataset)
- Attribute models learned on the fly are worse as attribute predictors in general
- We compare our two interfaces for data collection: ideally, a system would let people provide free-form attribute feedback, but that involves natural language processing and is left for future work

[Figure panels: (a) Quality of attribute predictors: Attribute Model Accuracy vs. Number of Iterations for on-the-fly vs. pretrained attributes. (b) Comparison to brute force: Classification Accuracy vs. Number of Iterations for Baseline passive, Proposed passive-on-the-fly-weights, Proposed (fast), and Proposed (brute-force, slow). (c) Free-form attributes-feedback: free-form feedback interface vs. multiple-choice feedback procedure vs. Baseline passive.]

Figure: (a) Attribute models learned on the fly are worse attribute models per se, but are better suited for providing classifier feedback than pre-trained attribute models. (b) Our clustering-based fast active learning approach does not perform significantly worse than the brute-force version of our approach, which would be prohibitively slow. (c) A comparison between two interfaces for collecting attributes-based feedback.

Conclusion

- We extend the relative attributes-based feedback setup for learning classifiers, making it more accurate, robust, and practical
- Classification accuracy improves by a significant margin
- We collected and made available a Relative Face Attributes Dataset for 60 classes

University of Maryland, College Park and Virginia Tech | email: [email protected] | WWW: http://filebox.ece.vt.edu/~parikh/attribute_feedback/
