Page 1: Learning Instance Specific Distance Using Metric Propagation

http://lamda.nju.edu.cn

Learning Instance Specific Distance Using Metric Propagation

De-Chuan Zhan, Ming Li, Yu-Feng Li, Zhi-Hua Zhou

LAMDA Group, National Key Lab for Novel Software Technology

Nanjing University, China

{zhandc, lim, liyf, zhouzh}@lamda.nju.edu.cn

Page 2: Learning Instance Specific Distance Using Metric Propagation

Distance based classification

K-nearest neighbor classification

SVM with Gaussian kernels

Is the distance reliable?

Distance metric learners are introduced…

Are there any more natural measurements?
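The distance-based classifiers above all plug a distance into the same template; a minimal k-nearest-neighbor sketch with a swappable distance function (Euclidean by default), which is exactly the knob an instance-specific distance would tune:

```python
import numpy as np

def knn_predict(x, X_train, y_train, k=3, dist=None):
    """Classify x by majority vote among its k nearest training points.
    `dist` is any distance function between two vectors."""
    if dist is None:
        dist = lambda a, b: np.linalg.norm(a - b)  # Euclidean fallback
    d = np.array([dist(x, xi) for xi in X_train])
    nearest = np.argsort(d)[:k]                    # indices of k smallest distances
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]               # majority label
```

Whether the prediction is reliable depends entirely on whether `dist` is reliable, which is the question the slide poses.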

Page 3: Learning Instance Specific Distance Using Metric Propagation


Any more natural measurements?

When a sky image is compared to other pictures…

… our work

Can we assign a specific distance measurement for each instance, both labeled and unlabeled?

When Phelps II is compared to other athletes…

Color, probably texture features

Speed of swimming, shape of feet…

Page 4: Learning Instance Specific Distance Using Metric Propagation

Outline

Introduction

Our Methods

Experiments

Conclusion

Page 5: Learning Instance Specific Distance Using Metric Propagation


Introduction

Distance Metric Learning

Many machine learning algorithms rely on the distance metric for input data patterns.

• Classification

• Clustering

• Retrieval

There are many metric learning algorithms developed [Yang, 2006]

Problem:

Focus on learning a uniform Mahalanobis distance for ALL instances
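That uniform metric is typically a Mahalanobis distance: one positive semidefinite matrix M shared by every instance. A minimal sketch of the shared-metric assumption the slide flags as the problem:

```python
import numpy as np

def mahalanobis(x, y, M):
    """Distance under a single PSD matrix M shared by ALL instances.
    With M = I this reduces to the ordinary Euclidean distance."""
    d = x - y
    return float(np.sqrt(d @ M @ d))
```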

Page 6: Learning Instance Specific Distance Using Metric Propagation


• Instead of applying a uniform distance metric for every example, it is more natural to measure distances according to specific properties of data

• Some researchers define distance from sample’s own perspective

• QSim [Zhou and Dai, ICDM’06] [Athitsos et al., TDS’07]

• Local distance functions [Frome et al., NIPS’06, ICCV’07]

Introduction

Other distance functions

Page 7: Learning Instance Specific Distance Using Metric Propagation


Introduction

Query sensitive similarity

Actually, instance specific similarities or query specific similarities have been studied in other fields before:

The problem:

Query similarity is based on pure heuristics.

In content-based image retrieval, there has been a study which tries to compute query sensitive similarities.

The similarities among different images are decided after receiving a query image. [Zhou and Dai, ICDM’06]

Page 8: Learning Instance Specific Distance Using Metric Propagation


• [Frome et al. NIPS’06]

Introduction

Local distance functions

D_ji > D_jk

The distance from the j-th instance to the i-th instance is larger than that from the j-th instance to the k-th instance

1. Cannot generalize directly
2. The local distances defined are not directly comparable.

• [Frome et al. ICCV’07]

All constraints can be tied together. Requires more heuristics for testing.

The problem:

Local distance functions for unlabeled data are N/A.

Page 9: Learning Instance Specific Distance Using Metric Propagation


Introduction

Our Work

Can we assign a specific distance measurement for each instance, both labeled and unlabeled?

Yes, we learn Instance Specific Distance via Metric Propagation

Page 10: Learning Instance Specific Distance Using Metric Propagation

Outline

Introduction

Our Methods

Experiments

Conclusion

Page 11: Learning Instance Specific Distance Using Metric Propagation


• Focus on learning instance specific distance for both labeled and unlabeled data.

Our Methods

Intuition

• For labeled data: pairs of examples from the same class should be closer to each other

• For unlabeled data: metric propagation on a relationship graph

Page 12: Learning Instance Specific Distance Using Metric Propagation


Our Methods

The ISD Framework

• Instead of directly conducting metric propagation while learning the distances for labeled examples, we formulate the metric propagation within a regularized framework.

The loss function for labeled data: induced by the labels of instances, it provides the side information

A regularization term responsible for the implicit metric propagation

The loss is a convex function, such as the hinge loss in classification or the least-squares loss in regression

The j-th instance belongs to a class other than the i-th, or the j-th instance is a neighbor of the i-th instance, i.e., all cannot-links and some of the must-links are considered

Inspired by [Zhu 2003], the regularization term can be defined as:
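The slide's formula itself is missing from this transcript; a plausible reconstruction of such a graph-regularized objective, with w_i the per-instance metric parameters, ℓ the convex loss over the side-information pairs S, G_ij the graph weights, and λ a trade-off parameter (all symbols are assumptions, not taken from the paper):

```latex
\min_{\{w_i\}} \;\; \sum_{(i,j)\in \mathcal{S}} \ell\big(d_{w_i}(x_i, x_j),\, y_{ij}\big)
\;+\; \lambda \sum_{i,j} G_{ij}\, \lVert w_i - w_j \rVert^2
```

The second term is the Zhu-style graph regularizer: it pulls the metric parameters of neighboring instances together, which is what propagates metrics from labeled to unlabeled instances.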

Page 13: Learning Instance Specific Distance Using Metric Propagation


Our Methods

The ISD Framework – relationship to FSM

Replaced with higher-order side information, such as triplet information

L is set to the identity matrix

FSM [Frome et al. NIPS’06] is a special case of ISD

Although only pairwise side information is investigated in our work, the ISD framework is a general framework…

Page 14: Learning Instance Specific Distance Using Metric Propagation


Our Methods

The ISD Framework – update graph

Predefined graph (given structure) → initial graph weights → initialize ISD → updated graph weights (in the new ISD space) → final ISD

Page 15: Learning Instance Specific Distance Using Metric Propagation


Our Methods

ISD with L1-loss

Introducing slack variables

Solving it with respect to all w simultaneously is a great challenge: the computational cost is too expensive.

Since the problem is convex, we employ the alternating descent method to solve it, i.e., sequentially solve the w of one instance at a time, fixing the other w's, until convergence or the maximum number of iterations is reached.
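As a toy illustration of that alternating scheme (not the paper's actual objective: a squared loss and targets t_i are assumed here so that each per-instance subproblem has a closed form):

```python
import numpy as np

def alternating_descent(T, G, lam=1.0, max_iters=1000, tol=1e-9):
    """Cycle over instances; for each i, with all other rows fixed, solve
        min_{w_i} ||w_i - t_i||^2 + lam * sum_j G_ij ||w_i - w_j||^2
    in closed form. T is (n, d) targets; G is (n, n) nonnegative graph
    weights with zero diagonal."""
    W = T.copy()
    for _ in range(max_iters):
        change = 0.0
        for i in range(len(W)):
            # closed-form minimizer of the i-th subproblem
            new_wi = (T[i] + lam * G[i] @ W) / (1.0 + lam * G[i].sum())
            change = max(change, np.abs(new_wi - W[i]).max())
            W[i] = new_wi
        if change < tol:   # converged: no row moved more than tol
            break
    return W
```

Each sweep only touches one row at a time, so the per-step cost stays small; convergence follows because each update is a contraction toward the graph-weighted average of the neighbors.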

Page 16: Learning Instance Specific Distance Using Metric Propagation


Our Methods

ISD with L1-loss (cont’d)

Primal:

Dual:

Page 17: Learning Instance Specific Distance Using Metric Propagation


Our Methods

Acceleration: ISD with L2-loss

The alternating descent method is used to solve the problem. However, the number of inequality constraints may be large. For acceleration, we reduce the number of constraints by considering only some must-links.

Inspired by ν-SVM, we can probably obtain a more efficient method:

Page 18: Learning Instance Specific Distance Using Metric Propagation


Our Methods

Acceleration: ISD with L2-loss

drop

Dual:

A linear equality constraint

We will project the solution back to the feasible region after we get the optimization results. Thus, this dual variable can be efficiently solved using Sequential Minimal Optimization (SMO).
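Projection onto a single linear equality constraint Σ_i α_i = c has a one-line closed form (a generic sketch only; the paper's feasible region may carry further constraints, such as nonnegativity, which this ignores):

```python
import numpy as np

def project_onto_sum(alpha, c=0.0):
    """Euclidean projection onto {a : sum(a) = c}.
    The constraint normal is the all-ones vector, so the projection
    just shifts every coordinate by the same amount."""
    return alpha - (alpha.sum() - c) / alpha.size
```

A point already satisfying the constraint is left unchanged, which is what makes it safe to apply after the unconstrained solve.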

Page 19: Learning Instance Specific Distance Using Metric Propagation

Outline

Introduction

Our Methods

Experiments

Conclusion

Page 20: Learning Instance Specific Distance Using Metric Propagation


• Data sets:
  – 15 UCI data sets
  – COREL image dataset (20 classes, 100 images/class)

• 2/3 labeled training set; 1/3 unlabeled for testing; 30 runs

• Compared methods:
  – ISD-L1/L2
  – FSM/FSSM (Frome et al., 2006 & 2007)
  – LMNN (Weinberger et al., 2005)
  – DNE (Zhang et al., 2007)

• Parameters are selected via cross validation

Experiments

Configurations

Page 21: Learning Instance Specific Distance Using Metric Propagation


Experiments

Classification Performance

Comparison of test error rates (mean±std.)

Win/tie/loss counts of ISD vs. other methods (t-test, 95% significance level)

Page 22: Learning Instance Specific Distance Using Metric Propagation


Experiments

Influence of the number of iteration rounds

Updating rounds

Starting from Euclidean

The error rates of ISD-L1 are reduced on most datasets as the number of updates increases

The error rates of ISD-L2 are reduced on some datasets. However, on others, the performance degenerates.

Overfitting – L2-loss is more sensitive to noise

Page 23: Learning Instance Specific Distance Using Metric Propagation


Experiments

Influence of the amount of labeled data

ISD is less sensitive to the influence of the amount of labeled data

When the amount of labeled samples is limited, the superiority of ISD is more apparent

Page 24: Learning Instance Specific Distance Using Metric Propagation

Conclusion

Main contribution: a method for learning instance-specific distances for labeled as well as unlabeled instances.

Future work: The construction of the initial graph

Label propagation, metric propagation, … any more properties to propagate?

Thanks!

