VisualRank: Applying PageRank to Large-Scale Image Search


Yushi Jing, Member, IEEE, and Shumeet Baluja, Member, IEEE

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, NOVEMBER 2008

[24] Y. Jing, S. Baluja, and H. Rowley, “Canonical Image Selection from the Web,” Proc. Sixth Int’l Conf. Image and Video Retrieval, pp. 280-287, 2007.

Outline

• Introduction
• Similarity graph [24]
• PageRank & VisualRank
• Hashing
• Experiments
• Conclusion


Search results for “d80” and “coca cola” from a traditional search engine

Introduction

• Visual theme, e.g., the “coca cola” logo
• CBIR: content-based image retrieval
  – Pure
  – Composite
• “Visual filter” via probabilistic graphical models (PGMs) [7]

• Compare:
  – Object category learner
  – Image search engine

[7] R. Fergus, P. Perona, and A. Zisserman, “A Visual Category Filter for Google Images,” Proc. Eighth European Conf. Computer Vision, pp. 242-256, 2004.

Introduction

• Combine [24]
  – Pairwise visual similarity among images
  – Nonvisual signals

• VisualRank
  – Based on PageRank
  – Scales to a large number of queries and images

• Goal
  – More accurate search ranking



Features generation

• Local descriptors
  – SIFT; see the comparison of local descriptors in [29]

[29] K. Mikolajczyk and C. Schmid, “A Performance Evaluation of Local Descriptors,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, Oct. 2005.

Similarity graph

• Pairwise visual similarity between images

Similarity graph

• Similarity graph built from the top 1,000 search results for “Mona-lisa”


PageRank

• Concept
  – Links as votes
  – Eigenvector centrality

PR(A) = PR(B) + PR(C) + PR(D)

PageRank

• Random walk
• q = 0.15

PageRank

• Markov matrix
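The random-walk view with q = 0.15 is usually written as below. This is the standard PageRank formula, given here as an assumption about what the slides' figures showed: N is the number of pages, M(p_i) the set of pages linking to p_i, and L(p_j) the number of outgoing links of p_j. The second line is the same update in matrix form, where the Markov matrix P has entries P_ij = 1/L(p_j) if p_j links to p_i and 0 otherwise.

```latex
PR(p_i) = \frac{q}{N} + (1 - q) \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)}, \qquad q = 0.15

\mathbf{R} = \frac{q}{N}\,\mathbf{1} + (1 - q)\, P\, \mathbf{R}
```

With q = 0 and pages B, C, D each linking only to A, the first line reduces to PR(A) = PR(B) + PR(C) + PR(D).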

VisualRank

• Damping factor d: usually d > 0.8
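A minimal sketch of how a VisualRank score could be computed by power iteration, assuming the usual damped update VR = d · S* · VR + (1 − d) · p, where S* is the column-normalized similarity matrix and p is a uniform damping vector. The function name `visual_rank`, the toy matrix, and the handling of empty columns are illustrative assumptions, not taken from the slides.

```python
def visual_rank(S, d=0.85, p=None, tol=1e-12, max_iter=1000):
    """Power iteration for VR = d * S_star * VR + (1 - d) * p."""
    n = len(S)
    # Column-normalize S so that each column sums to 1 (S_star).
    col_sums = [sum(S[i][j] for i in range(n)) for j in range(n)]
    # Assumption: a column with no similarity mass becomes uniform.
    S_star = [[S[i][j] / col_sums[j] if col_sums[j] > 0 else 1.0 / n
               for j in range(n)] for i in range(n)]
    if p is None:
        p = [1.0 / n] * n  # uniform damping vector
    vr = [1.0 / n] * n
    for _ in range(max_iter):
        new_vr = [d * sum(S_star[i][j] * vr[j] for j in range(n)) + (1 - d) * p[i]
                  for i in range(n)]
        delta = sum(abs(a - b) for a, b in zip(new_vr, vr))
        vr = new_vr
        if delta < tol:
            break
    return vr


# Toy similarity matrix: image 0 is strongly similar to images 1 and 2,
# image 3 is only weakly connected to the rest.
S = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.0],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0],
]
scores = visual_rank(S)
ranking = sorted(range(len(S)), key=lambda i: scores[i], reverse=True)
```

Because S* is column-stochastic and p sums to 1, the scores keep summing to 1 at every iteration, so they can be read as the stationary distribution of the damped random walk.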

Link spam

• Well-connected images vs. VisualRank, e.g., “Nemo”


Matching

• Precluster
  – e.g., “Paris”, “Eiffel Tower”, and “Arc de Triomphe”

• Select the top-N results and compute VisualRank on them
• Hashing
  – Locality Sensitive Hashing (LSH)
  – Feature descriptor as the key

Locality Sensitive Hashing (LSH)

• An approximate k-NN technique
• Hash function:

  – a is a d-dimensional random vector
  – b is a real number chosen from a given range
  – W defines the quantization of the features
  – V is the original feature vector
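The symbols above fit the standard random-projection LSH hash h(V) = floor((a · V + b) / W). The sketch below assumes that form. The Gaussian entries of a and the uniform range [0, W] for b follow the common p-stable LSH construction and are assumptions, since the slide does not state them. `make_hash_function` is a hypothetical helper name.

```python
import math
import random


def make_hash_function(dim, W, rng):
    """Return h(V) = floor((a . V + b) / W) for one random (a, b) pair."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # assumed Gaussian entries
    b = rng.uniform(0.0, W)                        # assumed range [0, W]

    def h(V):
        projection = sum(a_i * v_i for a_i, v_i in zip(a, V))
        return math.floor((projection + b) / W)

    return h


rng = random.Random(0)
h = make_hash_function(dim=128, W=100.0, rng=rng)

v = [rng.uniform(0, 255) for _ in range(128)]  # stand-in for a 128-D descriptor
v_near = [x + 0.01 for x in v]                 # a slightly perturbed copy
key = h(v)
key_near = h(v_near)
```

A larger W makes buckets wider, so more distinct descriptors share a key. A smaller W makes the hash more selective.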

Flow(1/3)

1. Resize images to 500 × 500 pixels and extract local features; 1,000 web images yield about 300,000 to 700,000 feature vectors

2. Build L hash tables H = H_1, H_2, …, H_L, each with K hash functions (L = 40, W = 100, K = 3)

Flow(2/3)

3. Matched descriptors
  – Two descriptors match if they share the same key in more than C = 3 hash tables
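Steps 2 and 3 can be sketched as follows, using L = 40, K = 3, W = 100, and C = 3 from the slides. Each table's key is taken to be the tuple of its K hash values. That choice, the Gaussian and uniform sampling, and the helper names `make_table_hash` and `match_descriptors` are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict
from itertools import combinations


def make_table_hash(dim, K, W, rng):
    """Key for one hash table: a tuple of K quantized random projections."""
    params = [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, W))
              for _ in range(K)]

    def key(V):
        return tuple(
            math.floor((sum(a_i * v_i for a_i, v_i in zip(a, V)) + b) / W)
            for a, b in params
        )

    return key


def match_descriptors(descriptors, L=40, K=3, W=100.0, C=3, seed=0):
    """Return index pairs that share a key in more than C of the L tables."""
    rng = random.Random(seed)
    dim = len(descriptors[0])
    collisions = defaultdict(int)  # (i, j) -> number of tables with equal keys
    for _ in range(L):
        key = make_table_hash(dim, K, W, rng)
        buckets = defaultdict(list)
        for idx, V in enumerate(descriptors):
            buckets[key(V)].append(idx)
        for members in buckets.values():
            for i, j in combinations(members, 2):
                collisions[(i, j)] += 1
    return {pair for pair, count in collisions.items() if count > C}


rng = random.Random(1)
base = [rng.uniform(0, 255) for _ in range(16)]
near = [x + 0.5 for x in base]     # almost identical descriptor
far = [x + 5000.0 for x in base]   # very different descriptor
matches = match_descriptors([base, near, far])
```

Only pairs that land in the same bucket are ever compared, which is what avoids the all-pairs comparison over hundreds of thousands of feature vectors.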

4. Hough transform to enforce loose geometric consistency among the matched descriptors

Flow(3/3)

5. Similarity
  – Two images are matched if they share more than 3 matching features
  – Similarity = number of matches divided by the average number of local features in the two images
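The similarity rule in step 5 is simple arithmetic, sketched below. Returning 0.0 for image pairs at or below the match threshold is an assumption, as is the function name `image_similarity`.

```python
def image_similarity(num_matches, num_features_a, num_features_b, min_matches=3):
    """Matches divided by the average number of local features of the two images."""
    if num_matches <= min_matches:
        return 0.0  # assumption: pairs that are not "matched" get zero similarity
    avg_features = (num_features_a + num_features_b) / 2.0
    return num_matches / avg_features


# 30 matched features between images with 400 and 200 local features:
# 30 / ((400 + 200) / 2) = 0.1
sim = image_similarity(30, 400, 200)
too_few = image_similarity(3, 400, 200)
```

Dividing by the average feature count keeps images with very many local features from dominating the graph just because they produce more raw matches.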

6. Given the similarity matrix S, compute VisualRank


Experiments

• 2,000 most popular product queries on Google, e.g., “ipod”, “Xbox”

• Top 1,000 Google search results for each query, collected in July 2007

• Filter
  – Drop queries where fewer than 5% of the images have at least one connection
  – 1,000 queries remain

Experiment 1

• Evaluate
  – “Irrelevancy” of our ranking

• Mix the top 10 VisualRank results with the top 10 Google results, remove duplicates, and ask “which are least relevant?”

• Asked 150 evaluators, using 50 randomly selected queries

Experiment 2

• VisualRank_bias: VisualRank with a biased damping vector p

• p^T = [v_j]^T = [1/m, …, 1/m, 0, …, 0]
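A small sketch of the biased damping vector, assuming the images are indexed so that the first m entries are the ones to favor (for example, the top m results of an existing ranking). The helper name `biased_damping_vector` is hypothetical.

```python
def biased_damping_vector(n, m):
    """p = [1/m, ..., 1/m, 0, ..., 0]: mass 1/m on the first m of n images."""
    if not 0 < m <= n:
        raise ValueError("m must satisfy 0 < m <= n")
    return [1.0 / m] * m + [0.0] * (n - m)


p = biased_damping_vector(n=5, m=2)
```

Using this p in place of the uniform vector means the random walk, when it jumps, always restarts at one of the m favored images, which pulls the final scores toward images similar to them.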

• HeuristicRank – a pure CBIR system

Experiment 3

• Collected click counts from Google for the top 40 images of each query

• Compare
  – Sum of click counts for the VisualRank top 20
  – Sum of click counts for the default ranking top 20

• Images in the VisualRank top 20 received 17.5% more clicks than those in the default Google ranking

Landmarks

• 80 common landmarks, e.g., “Eiffel Tower,” “Big Ben,” “Coliseum,” and “Lincoln Memorial.”


Conclusion

• VisualRank applies the PageRank concept and combines
  – The default Google ranking
  – A similarity graph between images

• VisualRank can outperform the default Google ranking on the vast majority of queries

• It efficiently reduces the number of irrelevant images
