MIT CSAIL Vision interfaces

Towards efficient matching with random hashing methods…

Kristen Grauman, Gregory Shakhnarovich, Trevor Darrell

Motivation: Content-based image retrieval

Data set of 30 scenes in Boston
• 1,079 database images
• 89 query images

Features:
• Harris-Affine detector (max m=3,595)
• MSER detector (max m=1,707)
• SIFT-PCA descriptors

[Figure: query image]

Content-based image retrieval

Pyramid match: ~1 second / query

Optimal match: ~2 hours / query

[Plot: accuracy vs. number of top retrievals]

Even this is far too slow for any web-scale application!

Sub-linear time image search

[Figure: a linear scan touches all N database items; hashing with h into binary keys (0111101, 0110111, 0110101) touches << N items]

Randomized hashing techniques are useful for sub-linear query time on very large image databases.

Pyramid match hashing

• For fixed-size sets, Locality-Sensitive Hashing [Indyk & Motwani 1998] provides bounded approximate similarity search over bijective matching [Indyk & Thaper 2003]; [Grauman & Darrell CVPR 2004, 2005]

• For varying set sizes, embedding of pyramid match (with product normalization) makes random hyperplane hashing possible under set intersection hash family of [Charikar 2002]. [Grauman PhD 2006]
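A minimal sketch of random hyperplane hashing in the style of [Charikar 2002], assuming each set has already been embedded as a fixed-length vector (the pyramid match embedding itself is not reproduced here). The function names and toy vectors are illustrative assumptions, not taken from the cited work.

```python
import random

def make_hyperplanes(num_bits, dim, seed=0):
    # One random Gaussian direction per hash bit (seeded for repeatability).
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def hash_bits(vec, hyperplanes):
    # Each bit records which side of a random hyperplane the vector falls on.
    return tuple(
        1 if sum(r * v for r, v in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )

def hamming(a, b):
    # Number of bit positions where two hash keys disagree.
    return sum(x != y for x, y in zip(a, b))

# Toy vectors (hypothetical): `near` points in almost the same direction as x.
planes = make_hyperplanes(num_bits=64, dim=4, seed=0)
x = [1.0, 2.0, 0.5, 0.0]
near = [1.1, 1.9, 0.6, 0.1]
far = [-1.0, 0.3, -2.0, 1.5]

d_near = hamming(hash_bits(x, planes), hash_bits(near, planes))
d_far = hamming(hash_bits(x, planes), hash_bits(far, planes))
```

For a single random hyperplane, the probability that two vectors receive different bits is their angle divided by π, so the Hamming distance between keys grows with the angle between the embedded vectors.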

Single Frame Pose Estimation via Approximate Nearest Neighbor regression

• Obtain large DB of pose-appearance mappings
• Exploit fast methods for approximate nearest neighbor search in high-dim. spaces (e.g., LSH [Indyk and Motwani ’98-’00]).
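A minimal sketch of nearest-neighbor regression for pose, assuming the database is a list of (appearance feature vector, pose vector) pairs. An exact brute-force search stands in here for the approximate (LSH) search named above, and `knn_regress` and the toy data are hypothetical.

```python
import math

def knn_regress(query, database, k=3):
    """Estimate a pose as the mean pose of the k nearest database examples.

    database: list of (feature_vector, pose_vector) pairs.
    Brute-force exact search; an approximate index would replace the sort.
    """
    ranked = sorted(database, key=lambda pair: math.dist(query, pair[0]))
    neighbors = ranked[:k]
    pose_dim = len(neighbors[0][1])
    return [
        sum(pose[i] for _, pose in neighbors) / len(neighbors)
        for i in range(pose_dim)
    ]

# Toy database: 2-D appearance features, 1-D pose (e.g., a joint angle).
toy_db = [
    ([0.0, 0.0], [10.0]),
    ([0.1, 0.0], [12.0]),
    ([5.0, 5.0], [90.0]),
]
estimate = knn_regress([0.05, 0.0], toy_db, k=2)
```

Averaging is the simplest choice of local model; a weighted average or locally fitted regression could be substituted without changing the retrieval step.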

Approximate nearest neighbor techniques

[Figure: the input is passed through hash functions and looked up in the rendered (and hashed) pose DB]

Similar examples fall into the same bucket in one or more hash tables.
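The bucket lookup can be sketched as follows, assuming random hyperplane bits as the hash family (the slide does not say which family is used). The class name `HyperplaneLSH` and its parameters are illustrative.

```python
import random
from collections import defaultdict

class HyperplaneLSH:
    """Several hash tables, each keyed by a short random-hyperplane code."""

    def __init__(self, dim, bits_per_table=8, num_tables=4, seed=0):
        rng = random.Random(seed)
        self.tables = []
        for _ in range(num_tables):
            planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                      for _ in range(bits_per_table)]
            self.tables.append((planes, defaultdict(list)))

    @staticmethod
    def _key(vec, planes):
        # Bucket key: the side of each hyperplane the vector falls on.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0
                     for plane in planes)

    def add(self, item_id, vec):
        # Store the item in one bucket per table.
        for planes, buckets in self.tables:
            buckets[self._key(vec, planes)].append(item_id)

    def candidates(self, query):
        # Union of the query's buckets across tables; only these items
        # need an exact comparison, instead of the whole database.
        found = set()
        for planes, buckets in self.tables:
            found.update(buckets.get(self._key(query, planes), []))
        return found

index = HyperplaneLSH(dim=3, seed=1)
index.add("a", [1.0, 0.0, 0.5])
index.add("b", [-1.0, 0.2, -0.5])
hits = index.candidates([2.0, 0.0, 1.0])  # same direction as "a"
```

More bits per table make buckets smaller and more selective; more tables raise the chance that a true neighbor shares at least one bucket with the query.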

Single Frame Pose Estimation via Approximate Nearest Neighbor regression

• Render large DB of pose-appearance mappings
• Exploit fast methods for approximate nearest neighbor search in high-dim. spaces (e.g., LSH [Indyk and Motwani ’98-’00]).

Problem: signal distance dominated by nuisance variables

Idea: find an embedding (i.e., hash functions for LSH) most relevant to parameter (pose) similarity… [Shakhnarovich et al. ’03, Shakhnarovich ’05]

Pose estimation and Similarity-sensitive hashing

[Figure: the input is passed through pose-sensitive hash functions and looked up in the rendered (and hashed) pose DB]

Nearest neighbors are similar in pose, not in image appearance.

[Shakhnarovich et al. ’03, Shakhnarovich ’05]

SSE / BoostPro

Similarity Sensitive Embedding

- Compute embedding $H: I \to \{0, 1\}^N$ such that

$\| H(I(\theta_1)) - H(I(\theta_2)) \|$ is small if $\theta_1$ is close to $\theta_2$

$\| H(I(\theta_1)) - H(I(\theta_2)) \|$ is large otherwise

- Use the embedding with approximate nearest neighbors retrieval (LSH)

- Find H by training a boosted classifier to learn the “same-pair” relation, then concatenating the resulting weak learners …

[Shakhnarovich 2005]
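A simplified sketch of the idea, not BoostPro itself: an AdaBoost-style loop picks threshold stumps whose output bits agree on same-pose pairs and disagree on different-pose pairs, and the chosen stumps are concatenated into a binary embedding. All names, the stump form, and the toy data are assumptions made for illustration.

```python
import math

def stump_bit(x, dim, thresh):
    # One embedding bit: threshold a single feature dimension.
    return 1 if x[dim] > thresh else 0

def learn_embedding(pairs, labels, candidates, num_bits):
    """pairs: list of (x1, x2); labels: +1 same pose, -1 different pose;
    candidates: list of (dim, thresh) stumps. Returns the chosen stumps."""
    n = len(pairs)
    weights = [1.0 / n] * n
    chosen = []
    for _ in range(num_bits):
        best, best_err, best_preds = None, None, None
        for dim, thresh in candidates:
            # A stump votes "same" (+1) when both images get the same bit.
            preds = [1 if stump_bit(a, dim, thresh) == stump_bit(b, dim, thresh)
                     else -1 for a, b in pairs]
            err = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
            if best_err is None or err < best_err:
                best, best_err, best_preds = (dim, thresh), err, preds
        if best_err >= 0.5:
            break  # no remaining stump beats chance on the weighted pairs
        err = max(best_err, 1e-9)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        chosen.append(best)
        # Standard AdaBoost reweighting: emphasize misclassified pairs.
        weights = [w * math.exp(-alpha * y * p)
                   for w, p, y in zip(weights, best_preds, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return chosen

def embed(x, stumps):
    # Concatenate the selected stump bits into the binary code H(x).
    return tuple(stump_bit(x, d, t) for d, t in stumps)

# Toy data: dimension 0 tracks pose, dimension 1 is a nuisance variable.
toy_pairs = [([0.1, 0.9], [0.2, 0.1]), ([0.8, 0.2], [0.9, 0.8]),
             ([0.1, 0.5], [0.9, 0.5]), ([0.2, 0.1], [0.8, 0.1])]
toy_labels = [1, 1, -1, -1]
stumps = learn_embedding(toy_pairs, toy_labels, [(0, 0.5), (1, 0.5)], num_bits=1)
```

On the toy data the loop selects the stump on the pose-related dimension and ignores the nuisance dimension, which is the behavior the embedding is meant to have.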

PSH results

~200,000 examples in DB; 2 sec

[Shakhnarovich et al. 2003, 2005]

Conclusions

• Random Hashing techniques allow broad search; well suited for very high dimensional spaces

• Useful in domains where there is no prior knowledge about how to cluster or model data…

• Similarity- (parameter-) sensitive hashing can find a distance related to the task… effectively learning a problem-dependent distance measure and an efficient means to index.

