
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)

Date post: 08-Sep-2014
Upload: uchida-yusuke
Description:
Recently, the Fisher vector representation of local features has attracted much attention because of its effectiveness in both image classification and image retrieval. Another trend in image retrieval is the use of binary features such as ORB, FREAK, and BRISK. The Fisher vector of continuous feature descriptors significantly improves accuracy in both image classification and retrieval, so applying the Fisher vector to binary features should bring the same benefits to binary-feature-based retrieval and classification. In this paper, we derive a closed-form approximation of the Fisher vector of binary features, which are modeled by a Bernoulli mixture model. Experiments show that the Fisher vector representation improves the accuracy of image retrieval by 25% compared with a bag-of-binary-words approach.
Image Retrieval with Fisher Vectors of Binary Features KDDI R&D Laboratories, Inc. Yusuke Uchida, Shigeyuki Sakazawa
Transcript
Page 1: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)

Image Retrieval with Fisher Vectors of Binary Features

KDDI R&D Laboratories, Inc. Yusuke Uchida, Shigeyuki Sakazawa

Page 2: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)

2014/8/1 2

Image retrieval using local features

• Local invariant features:
– Robust against occlusion, illumination change, viewpoint change, and so on

• Applications:
– Product search (Amazon Flow), landmark recognition (Google Goggles), augmented reality (Qualcomm Vuforia), …

Page 3: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Trends in image retrieval using local features

• 1999: SIFT [Lowe,ICCV’99]

• 2003: SIFT + Bag-of-visual words [Sivic+,ICCV’03]

• 2007: SIFT + Fisher vector [Perronnin+,CVPR’07,ECCV’10]
– New effective image representation

• 2011: Local binary features (ORB [Rublee+,ICCV’11], FREAK, BRISK)
– Efficient alternatives to SIFT or SURF

• In this presentation:
– Propose Fisher vector of binary features for image retrieval
– Model binary features by Bernoulli mixture model (BMM)
– Derive closed-form approximation of Fisher vector of BMM
– Apply a new normalization method to the Fisher vector

Page 4: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Pipeline of image retrieval using local features

[Figure: pipeline. Region detection → Feature description → a set of feature vectors X → Aggregation → a single vector representation of the image → classifier (e.g. SVM) or similarity search]

Page 5: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Position of this research

Columns: aggregation methods. Rows: descriptor type.

Descriptor type              | Bag-of-visual words | Fisher vector
Continuous (SIFT, SURF)      | [1]                 | [2, 3]
Binary (ORB, FREAK, BRISK)   | [4]                 | This research

Figure annotations: "Accurate", "Fast".

[1] J. Sivic and A. Zisserman, "Video google: A text retrieval approach to object matching in videos," in Proc. of ICCV’03.
[2] F. Perronnin and C. Dance, "Fisher kernels on visual vocabularies for image categorization," in Proc. of CVPR’07.
[3] F. Perronnin, et al., "Improving the fisher kernel for large-scale image classification," in Proc. of ECCV’10.
[4] D. Galvez-Lopez and J. D. Tardos, "Real-time loop detection with bags of binary words," in Proc. of IROS’11.

Page 6: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Fisher kernel [Jaakkola+, NIPS’98]

• The generation process of X is modeled by a probability density function p(X|λ) with a parameter set λ

• Describe X by the gradient of the log-likelihood function L(X|λ) = log p(X|λ) (the Fisher score)

• Similarity between X and X’ is defined by the Fisher kernel K(X,X’):

$$K(X, X') = \nabla_\lambda L(X|\lambda)^{T} \, F_\lambda^{-1} \, \nabla_\lambda L(X'|\lambda)$$

– Fisher score (gradient of the log-likelihood function): $\nabla_\lambda L(X|\lambda)$
– Fisher information matrix: $F_\lambda = E_x\left[\nabla_\lambda L(x|\lambda) \, \nabla_\lambda L(x|\lambda)^{T}\right]$

[5] T. Jaakkola and D. Haussler, "Exploiting generative models in discriminative classifiers," in Proc. of NIPS'98.

Page 7: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Fisher vector [Perronnin+, CVPR’07]

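• Explicit feature mapping for the Fisher kernel

$$K(X, X') = \nabla_\lambda L(X|\lambda)^{T} \, F_\lambda^{-1} \, \nabla_\lambda L(X'|\lambda)$$

– As the Fisher information matrix (FIM) $F_\lambda$ is positive semidefinite and symmetric, it has a Cholesky decomposition:

$$F_\lambda^{-1} = L_\lambda^{T} L_\lambda$$

– Thus the Fisher kernel can be rewritten as a dot product between the Fisher vectors $z_X$ and $z_{X'}$:

$$K(X, X') = z_X^{T} z_{X'}, \quad \text{where } z_X = L_\lambda \nabla_\lambda L(X|\lambda)$$

Here $L_\lambda$ (with subscript) is the decomposed FIM and $\nabla_\lambda L(X|\lambda)$ is the Fisher score.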
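The identity above can be checked numerically. The following is a minimal sketch, not the authors' code: a random symmetric positive-definite matrix stands in for the FIM and random vectors stand in for the Fisher scores of two images (all names here are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 5

# Hypothetical stand-in for the FIM: a random symmetric positive-definite matrix.
A = rng.normal(size=(dim, dim))
F = A @ A.T + dim * np.eye(dim)

# Hypothetical stand-ins for the Fisher scores of two images X and X'.
g_x = rng.normal(size=dim)
g_y = rng.normal(size=dim)

# Fisher kernel computed directly: g_x^T F^{-1} g_y.
F_inv = np.linalg.inv(F)
k_direct = g_x @ F_inv @ g_y

# np.linalg.cholesky returns a lower-triangular C with C @ C.T == F_inv,
# so L = C.T satisfies F^{-1} = L^T L, matching the slide's convention.
L = np.linalg.cholesky(F_inv).T

# Fisher vectors: decomposed FIM times Fisher score.
z_x = L @ g_x
z_y = L @ g_y
k_dot = z_x @ z_y

print(np.isclose(k_direct, k_dot))
```

The practical benefit is that, once the vectors z are stored, similarity search needs only dot products and no longer touches the FIM.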

Page 8: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Fisher vector of GMM [Perronnin+, CVPR’07]

• SIFT features are modeled by a Gaussian mixture model (GMM):

$$p(X|\lambda) = \prod_{t=1}^{T} p(x_t|\lambda), \qquad p(x_t|\lambda) = \sum_{i=1}^{N} w_i \, \mathcal{N}(x_t; \mu_i, \Sigma_i)$$

• Closed-form approximation of the Fisher vector of GMM under the following assumptions:
1. The Fisher information matrix F is diagonal
2. The number of features extracted from an image is constant and equal to T
3. The posterior probability γ_t(i) is peaky

• Compared with bag-of-visual words, the Fisher vector contains higher-order information

Page 9: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Local binary features [Rublee+, ICCV’11]

• Local binary features: ORB, BRISK, FREAK, and many others
– One or two orders of magnitude faster than SIFT or SURF
– Multi-scale FAST detector or its variants
– Binary descriptor based on binary tests on pixel luminance

• Resulting in a binary vector (0, 0, 1, 0, 1, …, 1)

• Binary tests are defined by pairs of positions
• If the first position is brighter than the second position, the test generates bit ‘1’

[Figure: a part of the binary tests of ORB, drawn over the local feature region; label: 256]
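The binary-test rule can be sketched in a few lines. This is an illustrative sketch only: it uses randomly chosen position pairs and a random patch, whereas ORB uses its own fixed set of tests; the patch size, the pair sampling, and the function name `binary_descriptor` are all assumptions.

```python
import numpy as np

def binary_descriptor(patch, pairs):
    """Return one bit per test: 1 if the first position is brighter than the second.

    patch: 2-D array of luminance values.
    pairs: integer array of shape (n_tests, 2, 2) holding (row, col) for both positions.
    """
    first = patch[pairs[:, 0, 0], pairs[:, 0, 1]]
    second = patch[pairs[:, 1, 0], pairs[:, 1, 1]]
    return (first > second).astype(np.uint8)

rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(31, 31))    # hypothetical grey-level patch
pairs = rng.integers(0, 31, size=(256, 2, 2))  # 256 random position pairs (not ORB's tests)
bits = binary_descriptor(patch, pairs)         # binary vector of length 256
```

Because each test is a single comparison, the whole descriptor costs only a few hundred pixel reads, which is where the speed advantage over SIFT-like descriptors comes from.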

Page 10: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Modeling binary features by BMM

• Model binary features by Bernoulli mixture model (BMM)

– Naïve Bayes assumption:

$$p(X|\lambda) = \prod_{t=1}^{T} p(x_t|\lambda)$$

– Each feature is generated from one of the N components:

$$p(x_t|\lambda) = \sum_{i=1}^{N} w_i \, p_i(x_t|\lambda)$$

– Single multivariate Bernoulli distribution:

$$p_i(x_t|\lambda) = \prod_{d=1}^{D} \mu_{id}^{x_{td}} (1 - \mu_{id})^{1 - x_{td}}$$

Notations
– X: a set of T binary features X = (x_1, …, x_T), each with D dimensions (D bits)
– λ: a set of parameters, $\lambda = \{w_i, \mu_{id}, \; i = 1..N, \; d = 1..D\}$
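The three equations above translate directly into code. The sketch below evaluates log p(X|λ) for a BMM; the function name, the clipping constant `eps`, and the log-sum-exp step are implementation choices made here for numerical stability, not something stated on the slides.

```python
import numpy as np

def bmm_log_likelihood(X, w, mu, eps=1e-10):
    """log p(X | lambda) for T binary features X (T x D) under an N-component BMM.

    w: mixture weights, shape (N,). mu: Bernoulli parameters mu_id, shape (N, D).
    """
    X = np.asarray(X, dtype=float)
    mu = np.clip(mu, eps, 1.0 - eps)  # avoid log(0)
    # log p_i(x_t | lambda) for every (t, i): sum_d x log(mu) + (1 - x) log(1 - mu)
    log_p = X @ np.log(mu).T + (1.0 - X) @ np.log(1.0 - mu).T  # shape (T, N)
    # log sum_i w_i p_i(x_t | lambda), computed with log-sum-exp
    a = log_p + np.log(w)
    m = a.max(axis=1, keepdims=True)
    log_px = m[:, 0] + np.log(np.exp(a - m).sum(axis=1))
    # Naive Bayes assumption over features: the product over t becomes a sum of logs
    return log_px.sum()

# Tiny example with T = 2 features, D = 2 bits, N = 2 components.
X = np.array([[1, 0], [0, 1]])
w = np.array([0.5, 0.5])
mu = np.array([[0.9, 0.1], [0.1, 0.9]])
ll = bmm_log_likelihood(X, w, mu)
```

In this example each feature has probability 0.5·0.81 + 0.5·0.01 = 0.41, so the log-likelihood is 2·log(0.41).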

Notations

Page 11: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Visualizing clustering results of BMM

• The parameters λ are estimated by the EM algorithm (N = 32) using 1M training ORB features

• The mixture model successfully captures the underlying bit correlation

[Figure: all binary tests defined in ORB, and four components (clusters) out of the N = 32 components. For each component, the binary tests with the top 5 probabilities of generating bit “0” and the top 5 probabilities of generating bit “1” are shown.]
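For readers who want to reproduce this kind of estimate, here is a minimal sketch of textbook EM for a Bernoulli mixture. It is not the authors' implementation: the initialization, the iteration count, the clipping constant, and the name `fit_bmm` are assumptions, and the toy data below is synthetic rather than ORB features.

```python
import numpy as np

def fit_bmm(X, n_components, n_iter=50, seed=0, eps=1e-6):
    """Estimate BMM weights w (N,) and Bernoulli parameters mu (N, D) by EM."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    T, D = X.shape
    w = np.full(n_components, 1.0 / n_components)
    mu = rng.uniform(0.25, 0.75, size=(n_components, D))  # assumed random initialization
    for _ in range(n_iter):
        # E-step: posterior gamma_t(i) of each component for each feature
        log_p = X @ np.log(mu).T + (1.0 - X) @ np.log(1.0 - mu).T + np.log(w)
        log_p -= log_p.max(axis=1, keepdims=True)
        gamma = np.exp(log_p)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and per-bit probabilities
        n_i = gamma.sum(axis=0)
        w = np.clip(n_i / T, eps, None)
        w /= w.sum()
        mu = np.clip((gamma.T @ X) / (n_i[:, None] + eps), eps, 1.0 - eps)
    return w, mu

# Synthetic toy data: two groups of 16-bit vectors with different bit statistics.
rng = np.random.default_rng(1)
X_toy = np.vstack([rng.random((100, 16)) < 0.9,
                   rng.random((100, 16)) < 0.1]).astype(float)
w_hat, mu_hat = fit_bmm(X_toy, n_components=2, n_iter=20)
```

Each row of `mu_hat` can be read like the figure: bits with `mu_id` close to 1 or 0 are the tests a component most reliably sets to “1” or “0”.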

Page 12: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Fisher vector of BMM

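• Definition of Fisher vector (decomposed FIM $L_\lambda$ times Fisher score):

$$z_X = L_\lambda \nabla_\lambda L(X|\lambda)$$

with

$$L(X|\lambda) = \sum_{t} L(x_t|\lambda), \qquad L(x_t|\lambda) = \log p(x_t|\lambda)$$

$$p(x_t|\lambda) = \sum_{i=1}^{N} w_i \, p_i(x_t|\lambda), \qquad p_i(x_t|\lambda) = \prod_{d=1}^{D} \mu_{id}^{x_{td}} (1 - \mu_{id})^{1 - x_{td}}$$

• Fisher score w.r.t. μ_id:

$$\frac{\partial L(x_t|\lambda)}{\partial \mu_{id}} = \gamma_t(i) \, \frac{(-1)^{1 - x_{td}}}{\mu_{id}^{x_{td}} (1 - \mu_{id})^{1 - x_{td}}}$$

• Posterior probability:

$$\gamma_t(i) = p(i|x_t, \lambda) = \frac{w_i \, p_i(x_t|\lambda)}{\sum_{j=1}^{N} w_j \, p_j(x_t|\lambda)}$$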
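The Fisher score with respect to μ_id can be computed as sketched below. This covers only the raw score (the gradient of the log-likelihood), not the FIM-normalized Fisher vector; the function names are hypothetical.

```python
import numpy as np

def posteriors(x, w, mu):
    """gamma(i) = w_i p_i(x) / sum_j w_j p_j(x) for one binary feature x (length D)."""
    log_p = np.log(mu) @ x + np.log(1.0 - mu) @ (1.0 - x) + np.log(w)
    p = np.exp(log_p - log_p.max())
    return p / p.sum()

def fisher_score_mu(X, w, mu):
    """Gradient of L(X | lambda) w.r.t. every mu_id, summed over the T features (N x D)."""
    score = np.zeros_like(mu)
    for x in X:
        gamma = posteriors(x, w, mu)
        sign = np.where(x == 1, 1.0, -1.0)      # (-1)^(1 - x_td)
        denom = np.where(x == 1, mu, 1.0 - mu)  # mu_id^x_td (1 - mu_id)^(1 - x_td)
        score += gamma[:, None] * sign / denom
    return score

# Small random example: T = 5 features, D = 4 bits, N = 2 components.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(5, 4)).astype(float)
w = np.array([0.3, 0.7])
mu = rng.uniform(0.2, 0.8, size=(2, 4))
score = fisher_score_mu(X, w, mu)  # shape (2, 4)
```

A finite-difference check of the log-likelihood against `score` is a quick way to confirm the sign convention (−1)^(1 − x_td).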

Page 13: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Fisher vector of BMM

• Fisher information w.r.t. μid (fμid)

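[Derivation of f_μid shown as an image in the original slide. Surviving annotations: a term marked "= 0", the assumption "Posterior probability is peaky", and the label "Fisher score".]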

Page 14: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Posterior probability p(i|x_t, λ)

• Histogram of max_i p(i|x_t, λ) (N = 256)

• Peaky!

[Figure: histogram of max_i p(i|x_t, λ); both axes range from 0 to 1]

Page 15: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Vector normalization

• Normalization is an essential part of the Fisher vector representation [3]

• Power normalization [3] (originally proposed for FV)
– $z_{id} \leftarrow \mathrm{sgn}(z_{id}) \, |z_{id}|^{\alpha}$ (α = 0.5)

• L2 normalization [3] (originally proposed for FV)
– $z_{id} \leftarrow z_{id} / \sqrt{\sum_{j,e} z_{je}^{2}}$

• Intra normalization [6] (originally proposed for VLAD, not for FV)
– Perform L2 normalization within each BMM component
– $z_{id} \leftarrow z_{id} / \sqrt{\sum_{e} z_{ie}^{2}}$

[3] F. Perronnin, et al., "Improving the fisher kernel for large-scale image classification," in Proc. of ECCV’10.
[6] R. Arandjelovic and A. Zisserman, "All about VLAD," in Proc. of CVPR'13.
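The three normalizations can be sketched as follows, assuming the Fisher vector is held as an N × D array with one row per BMM component. The function names and the small `eps` guard against division by zero are assumptions; the slides do not specify the order in which the normalizations are combined, so none is imposed here.

```python
import numpy as np

def power_normalize(z, alpha=0.5):
    """Element-wise sgn(z) |z|^alpha."""
    return np.sign(z) * np.abs(z) ** alpha

def l2_normalize(z, eps=1e-12):
    """Scale the whole vector to unit L2 norm."""
    return z / (np.linalg.norm(z) + eps)

def intra_normalize(z, eps=1e-12):
    """L2-normalize each row (one BMM component) of an N x D Fisher vector separately."""
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    return z / (norms + eps)

z = np.array([[4.0, -9.0],
              [0.0, 16.0]])
z_power = power_normalize(z)        # [[2, -3], [0, 4]]
z_l2 = l2_normalize(z)              # whole array has unit norm
z_intra = intra_normalize(z_power)  # every row has unit norm
```

Intra normalization gives every component the same total energy, which limits the influence of components that collect many features in one image.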

Page 16: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Experimental setup

• Dataset: Stanford Mobile Visual Search
– http://www.stanford.edu/~dmchen/mvs.html
– CD class is used for evaluation
– 100 reference images, 400 query images

• Performance measure: mean average precision (MAP)

• Binary feature: ORB (OpenCV implementation, 4 scales, 900 features/image)
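Mean average precision can be computed as in the sketch below, which uses the standard definition of average precision over a ranked list. The function names and the toy rankings are hypothetical and are not taken from the experiments.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@k over the ranks k at which a relevant item appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precisions = []
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant)

def mean_average_precision(rankings, ground_truth):
    """Average of per-query average precision."""
    scores = [average_precision(r, g) for r, g in zip(rankings, ground_truth)]
    return sum(scores) / len(scores)

# Toy example with two queries; reference image IDs are made up.
rankings = [[3, 1, 2], [5, 4]]
ground_truth = [[1], [5]]
map_score = mean_average_precision(rankings, ground_truth)
```

In the toy example the first query finds its relevant image at rank 2 (AP = 0.5) and the second at rank 1 (AP = 1.0), giving a MAP of 0.75.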

Page 17: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Experimental results (1)

• Compare the proposed Fisher vector with BoVW (N=1024)
• Evaluate normalization methods (P = power, In = intra normalization)

[Figure: results plotted against the number of mixture components. Labels: BoVW, Pure FV, Imp. FV, In Norm FV, and an arrow marked "better".]

• Fisher vector without any normalization achieves poor results
• Power and/or L2 normalization significantly improves FV
• Intra normalization outperforms the others for all N

Page 18: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Experimental results (2)

• Add independent images to the database as distractors

[Figure: results for increasing database sizes; legend: proposed FV]

• The Fisher vector achieves better performance at all database sizes
• The degradation of the Fisher vector is relatively small

Page 19: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Summary

• Proposed Fisher vector of binary features for image retrieval
– Model binary features by Bernoulli mixture model (BMM)
– Derive closed-form approximation of Fisher vector of BMM
– Apply new normalization method to Fisher vector

• Future work
– Encode the Fisher vector into a compact code for efficiency (the method proposed in [7] seems promising)
– Apply the proposed Fisher vector to other binary features (e.g., audio fingerprints)

[7] Y. Gong et al., "Learning Binary Codes for High-Dimensional Data Using Bilinear Projections," in Proc. of CVPR'13.


Page 21: Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)


Fisher vector of BMM (Fisher score)

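• Definition of Fisher vector (decomposed FIM $L_\lambda$ times Fisher score):

$$z_X = L_\lambda \nabla_\lambda L(X|\lambda), \qquad L(x_t|\lambda) = \log p(x_t|\lambda), \qquad p(x_t|\lambda) = \sum_{i=1}^{N} w_i \, p_i(x_t|\lambda)$$

• Fisher score w.r.t. μ_id

– Derivative of a single component:

$$\frac{\partial p_i(x_t|\lambda)}{\partial \mu_{id}} = (-1)^{1 - x_{td}} \prod_{e=1, e \neq d}^{D} \mu_{ie}^{x_{te}} (1 - \mu_{ie})^{1 - x_{te}}$$

– Derivative of the log-likelihood:

$$\frac{\partial L(x_t|\lambda)}{\partial \mu_{id}} = \frac{1}{p(x_t|\lambda)} \frac{\partial p(x_t|\lambda)}{\partial \mu_{id}} = \gamma_t(i) \, \frac{(-1)^{1 - x_{td}}}{\mu_{id}^{x_{td}} (1 - \mu_{id})^{1 - x_{td}}}$$

– Occupancy probability (posterior probability):

$$\gamma_t(i) = p(i|x_t, \lambda) = \frac{w_i \, p_i(x_t|\lambda)}{\sum_{j=1}^{N} w_j \, p_j(x_t|\lambda)}$$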

