
Empirical Evaluation of Dissimilarity Measures for Color and Texture

Presented by:

Dave Kauchak

Department of Computer Science

University of California, San Diego

dkauchak@cs.ucsd.edu

Jan Puzicha, Joachim M. Buhmann, Yossi Rubner & Carlo Tomasi

The Problem: Image Dissimilarity

D(image 1, image 2) = ?

Where does this problem arise in computer vision?

– Image classification
– Image retrieval
– Image segmentation

Classification


Retrieval

Jeremy S. De Bonet, Paul Viola (1997). Structure Driven Image Database Retrieval. Neural Information Processing 10 (1997).

Segmentation

http://vizlab.rutgers.edu/~comanici/segm_images.html

Histograms for image dissimilarity

Examine the distribution of features, rather than the features themselves

General purpose (i.e. any distribution of features)

Resilient to variations (shadowing, changes in illumination, shading, etc.)

Can use previous work in statistics, etc.

Histogram Example

Histogramming Image Features

Features: color, texture, shape, others…

Create a histogram through binning or some other procedure to get a distribution

Color

Which is more similar?

L*a*b* was designed to be uniform in that perceptual “closeness” corresponds to Euclidean distance in the space.

L*a*b*

L – lightness (white to black)

a – red-greenness

b – yellow-blueness
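Since perceptual closeness in L*a*b* corresponds to Euclidean distance, comparing two colors amounts to converting them to L*a*b* and measuring that distance. A minimal sketch using scikit-image; the sample colors and helper name are illustrative, not from the original slides:

```python
import numpy as np
from skimage.color import rgb2lab

def lab_distance(rgb1, rgb2):
    """Euclidean distance in L*a*b* space between two RGB colors in [0, 1]."""
    lab1 = rgb2lab(np.array(rgb1, dtype=float).reshape(1, 1, 3))
    lab2 = rgb2lab(np.array(rgb2, dtype=float).reshape(1, 1, 3))
    return np.linalg.norm(lab1 - lab2)

print(lab_distance((1, 0, 0), (0.8, 0, 0)))   # two reds: small perceptual difference
print(lab_distance((1, 0, 0), (0, 1, 0)))     # red vs. green: large difference
```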

Texture

Texture is not pointwise like color

Texture involves a local neighborhood

Gabor Filters are commonly used to identify texture features

Gabor Filters

Gabor filters are Gaussians modulated by sinusoids

They can be tuned in both the scale (size) and the orientation

A filter is applied to a region and is characterized by some feature of the energy distribution (often mean and standard deviation)
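A rough sketch of how such a filter can be built and applied: a Gaussian envelope modulated by a cosine, convolved with a region, and summarized by the mean and standard deviation of the response energy. The specific parameter values are arbitrary choices for illustration, not the filter bank used in the paper:

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(size, sigma, theta, wavelength):
    """Gaussian envelope modulated by a sinusoid (real part only)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)        # rotate to the filter orientation
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * xr / wavelength)
    return envelope * carrier

def gabor_features(region, size=15, sigma=3.0, theta=np.pi / 4, wavelength=6.0):
    """Summarize the filter response over a 2-D region by mean and std. dev."""
    response = convolve2d(region, gabor_kernel(size, sigma, theta, wavelength),
                          mode='same', boundary='symm')
    energy = np.abs(response)
    return energy.mean(), energy.std()
```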

Examples of Gabor Filters

[Filter examples: scale 3 at 72°, scale 4 at 108°, scale 5 at 144°]

Creating Histograms from Features

Regular binning
– Simple
– Choosing the bins is important: bins may be too large or too small

Adaptive binning (see the sketch below)
– Bins are adapted to the distribution (usually using some form of k-means)
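A small sketch contrasting the two binning strategies for a one-dimensional feature. The use of scikit-learn's KMeans for the adaptive bins is one plausible reading of the "some form of K-means" mentioned above, not the authors' exact procedure:

```python
import numpy as np
from sklearn.cluster import KMeans

def regular_histogram(values, bins=8, value_range=(0.0, 1.0)):
    """Fixed, equally spaced bins over the feature range."""
    counts, _ = np.histogram(values, bins=bins, range=value_range)
    return counts / counts.sum()

def adaptive_histogram(values, bins=8):
    """Bin centers adapted to the data distribution via k-means."""
    km = KMeans(n_clusters=bins, n_init=10).fit(values.reshape(-1, 1))
    counts = np.bincount(km.labels_, minlength=bins)
    return counts / counts.sum()
```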

Marginal Histograms

Marginal histograms only deal with a single feature

[Figure: normal (joint) binning vs. marginal binning resulting in 2 one-dimensional histograms]

Cumulative Histogram

[Figure: a normal histogram and its corresponding cumulative histogram]

Dissimilarity Measures Using the Histograms

– Heuristic histogram distances
– Non-parametric test statistics
– Information-theoretic divergences
– Ground distance measures

Notation

D(I,J) is the dissimilarity of images I and J

f(i;J) is entry i in the histogram of image J

f_r(i;J) is entry i in the marginal histogram of image J for feature channel r

F_r(i;J) is the corresponding cumulative histogram

Heuristic Histogram Distances

Minkowski-form distance Lp

Special cases:
– L1: absolute, cityblock, or Manhattan distance
– L2: Euclidean distance
– L∞: maximum value distance

$D(I,J) = \left( \sum_i \left| f(i;I) - f(i;J) \right|^p \right)^{1/p}$
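A direct transcription of the L_p formula as a sketch, assuming `h1` and `h2` are same-length, normalized numpy histograms:

```python
import numpy as np

def minkowski_distance(h1, h2, p=1.0):
    """L_p distance between two histograms of equal length."""
    if np.isinf(p):
        return np.max(np.abs(h1 - h2))            # L_inf: maximum value distance
    return np.sum(np.abs(h1 - h2) ** p) ** (1.0 / p)
```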

More heuristic distances

Weighted-Mean-Variance (WMV)

– Only includes minimal information about distribution

$D_r(I,J) = \frac{\left| \mu_r(I) - \mu_r(J) \right|}{\left| \sigma(\mu_r) \right|} + \frac{\left| \sigma_r(I) - \sigma_r(J) \right|}{\left| \sigma(\sigma_r) \right|}$

where $\mu_r$ and $\sigma_r$ are the mean and standard deviation of feature channel r, and $\sigma(\cdot)$ denotes the standard deviation of that statistic over the image database.
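A sketch of the WMV computation for a single feature channel. The normalization terms are assumed to be database-wide standard deviations of the channel means and standard deviations, pre-computed offline; the function layout is illustrative:

```python
import numpy as np

def wmv_distance(feats_i, feats_j, mu_norm, sigma_norm):
    """Weighted-Mean-Variance distance for one feature channel.
    feats_i, feats_j   : raw feature values sampled from the two images.
    mu_norm, sigma_norm: database-wide std. devs. of the channel means and
                         std. devs. (pre-computed normalization terms)."""
    d_mu = abs(feats_i.mean() - feats_j.mean()) / mu_norm
    d_sigma = abs(feats_i.std() - feats_j.std()) / sigma_norm
    return d_mu + d_sigma
```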

Non-parametric Test Statistics

Kolmogorov-Smirnov distance (K-S)

$D_r(I,J) = \max_i \left| F_r(i;I) - F_r(i;J) \right|$

Cramer/von Mises type (CvM)

$D_r(I,J) = \sum_i \left( F_r(i;I) - F_r(i;J) \right)^2$
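Both statistics operate on cumulative marginal histograms, which can be obtained with a cumulative sum. A minimal sketch, assuming `h1` and `h2` are normalized marginal histograms:

```python
import numpy as np

def ks_distance(h1, h2):
    """Kolmogorov-Smirnov: maximum absolute difference of cumulative histograms."""
    return np.max(np.abs(np.cumsum(h1) - np.cumsum(h2)))

def cvm_distance(h1, h2):
    """Cramer/von Mises: sum of squared differences of cumulative histograms."""
    return np.sum((np.cumsum(h1) - np.cumsum(h2)) ** 2)
```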

Cumulative Difference Example

[Figure: Histogram 1 − Histogram 2 = Difference, with the resulting K-S and CvM values]

Non-parametric Test Statistics (cont.)

χ²-statistic (chi-square)
– Simple statistical measure to decide whether two samples came from the same underlying distribution

$D(I,J) = \sum_i \frac{\left( f(i;I) - f(i;J) \right)^2}{f(i;I) + f(i;J)}$
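A sketch of the χ² statistic as written above; the small epsilon guards against empty bins and is not part of the original definition:

```python
import numpy as np

def chi_square_distance(h1, h2, eps=1e-10):
    """Chi-square statistic between two normalized histograms."""
    return np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```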

Information-Theoretic Divergences

How well can one distribution be coded using the other as a codebook?

Kullback-Leibler divergence (KL)

Jeffrey-divergence (JD)

Kullback-Leibler:

$D(I,J) = \sum_i f(i;I) \log \frac{f(i;I)}{f(i;J)}$

Jeffrey divergence:

$D(I,J) = \sum_i \left( f(i;I) \log \frac{f(i;I)}{\hat{f}(i)} + f(i;J) \log \frac{f(i;J)}{\hat{f}(i)} \right), \qquad \hat{f}(i) = \frac{f(i;I) + f(i;J)}{2}$
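Sketches of both divergences, assuming normalized histogram arrays; the epsilon terms are added only to keep the logarithms finite for empty bins:

```python
import numpy as np

def kl_divergence(h1, h2, eps=1e-10):
    """Kullback-Leibler divergence of h1 with respect to h2 (asymmetric)."""
    return np.sum(h1 * np.log((h1 + eps) / (h2 + eps)))

def jeffrey_divergence(h1, h2, eps=1e-10):
    """Jeffrey divergence: symmetric, using the bin-wise mean as reference."""
    m = 0.5 * (h1 + h2)
    return (np.sum(h1 * np.log((h1 + eps) / (m + eps))) +
            np.sum(h2 * np.log((h2 + eps) / (m + eps))))
```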

Ground Distance Measure

Based on some metric of distance between individual features

Earth Mover's Distance (EMD)
– Minimal cost to transform one distribution into the other
– The only measure that works on distributions with a different number of bins

EMD

One distribution can be seen as a mass of earth properly spread in space, the other as a collection of holes in that same space

Distributions are represented as a set of clusters and an associated weight

Computing the dissimilarity then becomes the transportation problem

Transportation Problem

Some number of suppliers with goods Some other number of consumers

wanting goods Each consumer-supplier pair has an

associated cost to deliver one unit of the goods

Find least expensive flow of goods from supplier to consumer
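The transportation problem is a linear program, so a small EMD sketch can be written with scipy's generic LP solver. This assumes both signatures carry the same total weight (so the optimal cost itself is the EMD) and is meant only to illustrate the formulation, not to be an efficient or faithful reimplementation of the authors' solver:

```python
import numpy as np
from scipy.optimize import linprog

def emd(weights_p, weights_q, cost):
    """Earth Mover's Distance via the transportation linear program.
    weights_p, weights_q : cluster weights of the two signatures (equal totals).
    cost[i, j]           : ground distance between cluster i of P and cluster j of Q."""
    m, n = cost.shape
    c = cost.reshape(-1)                       # one flow variable f_ij per pair
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):                         # supplier i ships exactly weights_p[i]
        A_eq[i, i * n:(i + 1) * n] = 1.0
    for j in range(n):                         # consumer j receives exactly weights_q[j]
        A_eq[m + j, j::n] = 1.0
    b_eq = np.concatenate([weights_p, weights_q])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun
```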

Various properties of the metrics

K-S, CvM and WMV are only defined for marginal distributions

Lp, WMV, K-S, CvM and, under constraints, EMD all obey the triangle inequality

WMV is particularly fast because the calculation is simple and the values can be pre-computed offline

EMD is the most computationally expensive

Key Components for Good Comparison

Meaningful quality measure
– Subdivision into various tasks/applications (classification, retrieval and segmentation)

A wide range of parameters should be measured

An uncontroversial "ground truth" should be established

Data Set: Color

Randomly chose 94 images from a set of 2000
– The 94 images represent separate classes

Randomly select disjoint sets of pixels from the images
– Set sizes of 4, 8, 16, 32, 64 pixels
– 16 disjoint samples per set size per image

Data Set: Texture

Brodatz album
– Collection of a wide range of textures (e.g. cork, lawn, straw, pebbles, sand, etc.)

Each image is considered a class (as in color)

Extract sets of 16 non-overlapping blocks
– Sizes 8x8, 16x16, …, 256x256

Setup: Classification

k-Nearest Neighbor classifier is used
– Nearest Neighbor classification: given a collection of labeled points S and a query point q, which point belonging to S is closest to q?
– k-nearest is a majority vote of the k closest points
– k = 1, 3, 5 and 7

Average misclassification rate (percentage) using leave-one-out (see the sketch below)
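A sketch of this evaluation protocol, assuming a precomputed dissimilarity matrix `D` (from any of the measures above) and integer class labels; the function name and layout are illustrative, not from the paper:

```python
import numpy as np

def loo_knn_error(D, labels, k=3):
    """Leave-one-out k-NN misclassification rate from a precomputed
    dissimilarity matrix D, where D[i, j] compares samples i and j."""
    n = len(labels)
    errors = 0
    for q in range(n):
        d = D[q].copy()
        d[q] = np.inf                        # leave the query sample out
        neighbors = np.argsort(d)[:k]        # k most similar samples
        votes = np.bincount(labels[neighbors])
        errors += int(votes.argmax() != labels[q])
    return 100.0 * errors / n                # misclassification rate in percent
```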

Setup: Classification (cont.)

Bins: {4, 8, 16, 32, 64, 128, 256}

In the texture case, three sets of filters were used, of sizes 12, 24 and 40 filters

1000 CPU hours of computation

Results: Classification, color data set

Results: Classification, texture data set

Results: Classification

For small sample sizes, the WMV measure performs best in the texture case
– WMV only estimates means and variances
– Less sensitive to sampling noise

EMD also performs well for small sample sizes
– Local binning provides additional information

For large sample sizes, the χ² test performs best

Results: Classification (cont.)

For texture classification, marginal distributions do better than multidimensional distributions except for very large sample sizes (256x256)
– Binning is not well adapted to the data since it is fixed for all 94 classes
– EMD, which uses local adaptation, does much better

For multidimensional histograms, the more bins, the better the performance

For texture, usually 12 filters is enough

Setup: Image Retrieval

Vary sample size

Vary number of images retrieved

Performance measured based on precision (i.e. percent correct of the images retrieved) vs. the number of images retrieved (see the sketch below)
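A sketch of the precision computation for a single query under the same assumptions as the classification sketch (precomputed dissimilarity matrix, integer labels):

```python
import numpy as np

def precision_at_k(D, labels, query, k):
    """Fraction of the k most similar images (excluding the query itself)
    that share the query image's class."""
    d = D[query].copy()
    d[query] = np.inf
    retrieved = np.argsort(d)[:k]
    return np.mean(labels[retrieved] == labels[query])
```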

Results: Image Retrieval

Results: Image Retrieval (cont.)

Similar to classification
– EMD, WMV, CvM and K-S performed well for small sample sizes
– JD, χ² and KL perform better for larger sizes

Setup: Segmentation

100 images

Each image consists of 5 different Brodatz textures

For the multivariate case, the bins are adapted to the specific image

Setup: Segmentation (cont.)

Image is divided into 16384 sites (128 x 128 grid)

A histogram is calculated for each site

Each site histogram is then compared with 80 randomly selected sites

Image sites with high average similarity are then grouped (see the sketch below)
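A rough sketch of the per-site comparison step; the reference-site sampling and the `dissim` function are placeholders for whichever measure is being evaluated, and the final grouping step is not shown:

```python
import numpy as np

def site_dissimilarity_profile(site_hists, dissim, n_ref=80, seed=0):
    """Average dissimilarity of every site histogram to a fixed random set of
    reference sites; sites with similar profiles can then be grouped."""
    rng = np.random.default_rng(seed)
    refs = rng.choice(len(site_hists), size=n_ref, replace=False)
    return np.array([np.mean([dissim(h, site_hists[r]) for r in refs])
                     for h in site_hists])
```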

Results: Segmentation

Results: Segmentation (cont.)

Measure        Median   20% Quantile
L1 marginal     8.2%        12%
χ² marginal     8.1%        13%
JD marginal     8.1%        12%
K-S marginal   10.8%        20%
CvM marginal   10.9%        22%
L1 full         6.8%         9%
χ² full         6.6%        10%
JD full         6.8%        10%

Results: Segmentation (cont.)

Binning can be adapted to the image
– Increased accuracy in representing multidimensional distributions
– Adaptive multivariate outperforms marginal

Best results were obtained by χ²

EMD suffers from high computational complexity

Conclusions

No measure is best overall

Marginal histograms and aggregate measures are best for large feature spaces

Multivariate histograms perform well with large sample sizes

EMD performs generally well for both classification and retrieval, but with a high computational cost

χ² is a good overall metric (particularly for larger sample sizes)