Mathias Lux
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
What is LIRE?
• Library for content-based image retrieval (CBIR)
• Easy access & instant “success”
• Few lines of code to index & search
It’s based on Lucene
• Java text retrieval framework – based on inverted lists
• Top level Apache project
• Extends to Solr
Modular Feature Architecture
LireFeature as the basic Interface
• Extraction
• Distance function
• Serialization (byte[] based)
• toString(), field name, …
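The four responsibilities above can be sketched as one small interface plus a toy implementation. This is only an illustration: `SketchFeature`, `GrayHistogram`, the method names and the field name `sketch_gray_hist` are hypothetical and are not LIRE's actual LireFeature signature.

```java
import java.awt.image.BufferedImage;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical interface for illustration only; not LIRE's LireFeature.
interface SketchFeature {
    void extract(BufferedImage image);        // extraction
    double getDistance(SketchFeature other);  // distance function
    byte[] toBytes();                         // serialization (byte[] based)
    void fromBytes(byte[] data);
    String getFieldName();                    // name of the index field
}

// Toy global feature: a 4-bin histogram of gray levels, normalized to sum to 1.
class GrayHistogram implements SketchFeature {
    private final float[] bins = new float[4];

    @Override
    public void extract(BufferedImage image) {
        Arrays.fill(bins, 0f);
        int pixels = image.getWidth() * image.getHeight();
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                int rgb = image.getRGB(x, y);
                int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
                bins[gray / 64] += 1f / pixels; // gray is 0..255, so the bin is 0..3
            }
        }
    }

    @Override
    public double getDistance(SketchFeature other) {
        // L1 distance between the two histograms
        float[] o = ((GrayHistogram) other).bins;
        double d = 0;
        for (int i = 0; i < bins.length; i++) d += Math.abs(bins[i] - o[i]);
        return d;
    }

    @Override
    public byte[] toBytes() {
        ByteBuffer buf = ByteBuffer.allocate(bins.length * Float.BYTES);
        for (float b : bins) buf.putFloat(b);
        return buf.array();
    }

    @Override
    public void fromBytes(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        for (int i = 0; i < bins.length; i++) bins[i] = buf.getFloat();
    }

    @Override
    public String getFieldName() {
        return "sketch_gray_hist"; // hypothetical field name
    }
}
```

A feature restored with fromBytes() from another feature's toBytes() output has distance 0 to the original, which is the property an index relies on.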
Fast Access & Linear Search
• Efficient coding of serialization
  – transformation to byte[]
  – run length coding for sparse vectors
• Custom Lucene codec
  – Lucene field compression
  – update to DocValues in v1.0
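Run length coding of a sparse vector can be sketched as follows. The byte layout used here (a zero run stored as the pair 0, run length) is an assumption chosen for illustration and is not taken from LIRE's code; the class name `SparseRunLength` is hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Illustrative run length coder for sparse byte vectors (assumed layout, not LIRE's).
class SparseRunLength {
    // Zero runs become the pair (0, runLength) with runLength in 1..255;
    // non-zero bytes are copied unchanged.
    static byte[] encode(byte[] vector) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < vector.length) {
            if (vector[i] == 0) {
                int run = 0;
                while (i < vector.length && vector[i] == 0 && run < 255) {
                    run++;
                    i++;
                }
                out.write(0);
                out.write(run);
            } else {
                out.write(vector[i]);
                i++;
            }
        }
        return out.toByteArray();
    }

    static byte[] decode(byte[] encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < encoded.length) {
            if (encoded[i] == 0) {
                int run = encoded[i + 1] & 0xFF; // read the run length as unsigned
                for (int k = 0; k < run; k++) out.write(0);
                i += 2;
            } else {
                out.write(encoded[i]);
                i++;
            }
        }
        return out.toByteArray();
    }
}
```

The scheme only pays off when zeros dominate; a dense vector grows slightly because every isolated zero costs two bytes.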
Search with Sub-Linear Time Complexity
• Hashing based approach for global features
  – Locality sensitive hashing
    • bit sampling
  – Proximity based hashing
    • nearest neighbors as “buckets”
    • cf. work of G. Amato
• Local features supported
  – SIFT, SURF, k-means, VLAD
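As a rough illustration of bit sampling, the sketch below shows the textbook variant for binary vectors: each hash function reads a fixed random subset of bit positions and packs them into a bucket id. It is not a copy of LIRE's implementation, and the class name and parameters are hypothetical.

```java
import java.util.Random;

// Textbook bit sampling for binary vectors; an illustration, not LIRE's code.
class BitSamplingSketch {
    private final int[] positions; // sampled bit positions, fixed per hash function

    BitSamplingSketch(int dimensions, int bitsPerHash, long seed) {
        if (bitsPerHash < 1 || bitsPerHash > 31) {
            throw new IllegalArgumentException("bitsPerHash must be in 1..31");
        }
        Random rnd = new Random(seed);
        positions = new int[bitsPerHash];
        for (int i = 0; i < bitsPerHash; i++) {
            positions[i] = rnd.nextInt(dimensions);
        }
    }

    // Packs the sampled bits into an int that serves as the bucket id.
    int hash(boolean[] bits) {
        int h = 0;
        for (int p : positions) {
            h = (h << 1) | (bits[p] ? 1 : 0);
        }
        return h;
    }
}
```

Vectors with a small Hamming distance agree on most sampled positions and therefore tend to share a bucket. A search would collect the candidates from the query's buckets and re-rank them with the real distance function.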
Tools
• Parallel Indexing
  – consumer-producer based
  – up to the capabilities of the VM / HDD
• Intermediate byte based data format
  – small footprint, efficient, relative paths
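The consumer-producer pattern behind parallel indexing can be sketched with a bounded queue and a thread pool. Everything named here (`ParallelIndexingSketch`, `extractFeature`, the in-memory map standing in for the index) is a hypothetical placeholder rather than LIRE's indexer.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative producer-consumer indexing; names and the "index" are placeholders.
class ParallelIndexingSketch {
    // Sentinel that tells a consumer to stop; compared by reference on purpose.
    private static final String POISON = new String("__END__");

    static Map<String, Integer> index(List<String> paths, int consumers) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(100);
        Map<String, Integer> store = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(consumers);
        for (int c = 0; c < consumers; c++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String path = queue.take();
                        if (path == POISON) return;
                        store.put(path, extractFeature(path));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        try {
            // The calling thread acts as the producer.
            for (String p : paths) queue.put(p);
            for (int c = 0; c < consumers; c++) queue.put(POISON);
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing timed out");
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return store;
    }

    // Placeholder for the expensive step (image decoding and feature extraction).
    static int extractFeature(String path) {
        return path.length();
    }
}
```

The bounded queue keeps the producer from reading files faster than the consumers can process them, so throughput is limited by whichever of CPU and disk saturates first.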
Extending LIRE
• Implement a global feature
  – extraction, distance function, serialization
• LIRE takes care of the rest
  – parallel indexing, hashing, search
Using Parts of LIRE
Take what you need …
• Feature implementations
  – cf. work of Xinchao Li et al. at MediaEval 2013
• Image processing
  – Canny Edge Detector, SWT (coming soon)
• Tools & code base
  – FastMap, Suffix Tree Clustering, …
UCID Data Set
Feature                       MAP     Precision at 10   ER
CEDD                          0.431   0.420             0.553
Color Correlogram             0.586   0.480             0.370
Color Layout                  0.277   0.285             0.679
Edge Histogram                0.180   0.202             0.813
FCTH                          0.447   0.415             0.531
JCD                           0.470   0.435             0.508
Joint Histogram               0.348   0.313             0.603
LBP Opponent Joined           0.266   0.267             0.729
Local Binary Patterns (LBP)   0.228   0.221             0.714
Opponent Histogram            0.319   0.309             0.649
PHOG                          0.232   0.235             0.725
RGB Color Histogram           0.403   0.358             0.550
Rotation Invariant LBP        0.165   0.174             0.813
Scalable Color                0.172   0.183             0.840
SPCEDD                        0.575   0.487             0.366
SPLBP                         0.264   0.251             0.683
SURF BoVW                     0.348   0.313             0.634
VLAD-SURF                     0.370   0.356             0.603
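For readers unfamiliar with the three columns, the sketch below computes them from a ranked result list. It assumes that ER denotes the error rate in the usual sense of one minus the precision at rank 1, averaged over all queries; the slides do not define it. The class and method names are hypothetical.

```java
// Illustrative retrieval metrics; relevant[i] is true if the result at rank i+1 is relevant.
class RetrievalMetrics {
    // Average precision of one query; the mean over all queries gives MAP.
    static double averagePrecision(boolean[] relevant, int totalRelevant) {
        double sum = 0;
        int hits = 0;
        for (int i = 0; i < relevant.length; i++) {
            if (relevant[i]) {
                hits++;
                sum += (double) hits / (i + 1);
            }
        }
        return totalRelevant == 0 ? 0.0 : sum / totalRelevant;
    }

    // Share of relevant results among the first k ranks.
    static double precisionAt(boolean[] relevant, int k) {
        int hits = 0;
        for (int i = 0; i < k && i < relevant.length; i++) {
            if (relevant[i]) hits++;
        }
        return (double) hits / k;
    }

    // 1 if the first result is wrong, else 0 (assumed definition of ER per query).
    static double firstResultError(boolean[] relevant) {
        return relevant.length > 0 && relevant[0] ? 0.0 : 1.0;
    }

    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) sum += v;
        return values.length == 0 ? 0.0 : sum / values.length;
    }
}
```

Higher is better for MAP and precision at 10, while lower is better for ER under this definition.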
SIMPLIcity Data Set
Feature                       MAP     Precision at 10   ER
CEDD                          0.513   0.706             0.193
Color Correlogram             0.498   0.740             0.159
Color Layout                  0.439   0.612             0.303
Edge Histogram                0.333   0.500             0.401
FCTH                          0.499   0.703             0.207
JCD                           0.520   0.730             0.183
Joint Histogram               0.449   0.689             0.197
LBP Opponent Joined           0.418   0.569             0.347
Local Binary Patterns (LBP)   0.358   0.587             0.295
Opponent Histogram            0.450   0.635             0.270
PHOG                          0.365   0.547             0.355
RGB Color Histogram           0.450   0.704             0.191
Rotation Invariant LBP        0.338   0.520             0.375
Scalable Color                0.305   0.470             0.464
SPCEDD                        0.599   0.772             0.144
SPLBP                         0.395   0.556             0.348
SURF BoVW                     0.338   0.464             0.475
VLAD-SURF                     0.365   0.518             0.407
Hashing - BitSampling
100k images from Flickr, 50 results compared to linear search
[Chart: vertical axis 0.0 to 1.0, horizontal axis 0 to 3000; series: JCD, CEDD, FCTH, ACC, PHOG, OPH, ColHist, ColLay, EH, SPCEDD]
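Comparing hashed results to linear search can be made concrete with a small overlap measure. The slide does not state what the vertical axis shows, so treating it as the share of the linear search results that the hashed search also returns is an assumption; `ResultOverlap` is a hypothetical name.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative overlap measure between exact and approximate result lists.
class ResultOverlap {
    // Fraction of the linear search results that also appear in the hashed results.
    static double overlap(List<String> linearTopK, List<String> hashedTopK) {
        Set<String> truth = new HashSet<>(linearTopK);
        if (truth.isEmpty()) return 0.0;
        int shared = 0;
        for (String id : new HashSet<>(hashedTopK)) {
            if (truth.contains(id)) shared++;
        }
        return (double) shared / truth.size();
    }
}
```

With 50 results per query, a value of 1.0 would mean the hashed search returned exactly the documents that linear search found.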
Hashing - Proximity
100k images from Flickr, 50 results compared to linear search
[Chart: vertical axis 0.0 to 1.0, horizontal axis 0 to 3000; series: JCD, CEDD, FCTH, ACC, PHOG, OPHIST, ColHist, ColLay, EH, SPCEDD]
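The idea of using nearest neighbors as buckets can be sketched as follows: every feature vector is described by the ids of its closest reference points, and those ids can then be indexed like words in a text. This is a simplified illustration of the general metric-space idea, assuming Euclidean distance and hypothetical names; it does not reproduce LIRE's or G. Amato's implementation.

```java
import java.util.Arrays;
import java.util.Comparator;

// Illustrative proximity based hashing with reference points as "buckets".
class ProximityHashSketch {
    // Returns the indices of the k reference points closest to the feature, nearest first.
    static int[] nearestReferences(double[] feature, double[][] references, int k) {
        Integer[] order = new Integer[references.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> l2(feature, references[i])));
        int[] result = new int[Math.min(k, order.length)];
        for (int i = 0; i < result.length; i++) result[i] = order[i];
        return result;
    }

    // Euclidean distance; any metric could be substituted.
    static double l2(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }
}
```

Two features that are close in the metric space tend to share most of their nearest reference points, so a text index over these ids yields candidates without scanning every document.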
Apache Solr Integration
• Motivation
  – Use a search and retrieval server with all its tools
• Objectives
  – indexing & management
  – efficient content based image search
  – content based ranking of results
Solr Plugin
• Custom Request Handler
  – Uses Solr’s request and response framework
  – Allows for content based retrieval
• Custom ValueSourceFunction
  – Added to text based search queries
  – Allows for ranking based on the distance function
Solr Plugin
• Custom type of index field
  – DocValue based binary field
  – transmission base64 encoded
• Custom Indexer
  – XML documents to be uploaded to Solr
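How a binary feature might travel to Solr as base64 inside an XML update document is sketched below. The `<add><doc><field>` structure is Solr's standard XML update format, while the field name `feature_bin` and the class `SolrDocSketch` are hypothetical and not the plugin's real schema.

```java
import java.util.Base64;

// Illustrative builder for a Solr XML update document with a base64 feature field.
class SolrDocSketch {
    static String toSolrXml(String id, String fieldName, byte[] feature) {
        // Base64 output contains only A-Z, a-z, 0-9, '+', '/' and '=', so it needs no escaping.
        String encoded = Base64.getEncoder().encodeToString(feature);
        return "<add><doc>"
                + "<field name=\"id\">" + escape(id) + "</field>"
                + "<field name=\"" + escape(fieldName) + "\">" + encoded + "</field>"
                + "</doc></add>";
    }

    // Minimal XML escaping for text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // The receiving side would reverse the encoding before storing the bytes.
    static byte[] decodeFeature(String base64) {
        return Base64.getDecoder().decode(base64);
    }
}
```

Calling `SolrDocSketch.toSolrXml("img/1.jpg", "feature_bin", bytes)` yields one document that could be posted to a Solr update endpoint; base64 adds roughly one third to the payload size.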
Solr Plugin
• http://demo-itec.uni-klu.ac.at/liredemo/wipo.html
• Local demo
Future Work
• DocValues based indexing
  – make linear search faster
• Proximity hashing
  – metric spaces approach
  – more accurate
• Release version 1.0
  – adding docs & feature freeze
Acknowledgements
I’d like to thank Anna-Maria Pasterk, Arthur Li, Arthur Pitman, Bastian Hösch, Benjamin Sznajder, Christian Penz, Christine Keim, Christoph Kofler, Dan Hanley, Daniel Pötzinger, Fabrizio Falchi, Franz Graf, Giuseppe Amato, Glenn Macstravic, James Charters, Janine Lachner, Katharina Tomanec, Lukas Esterle, Manuel Oraze, Marian Kogler, Marko Keuschnig, Michael Riegler, Rodrigo Carvalho Rezende, Roman Divotkey, Roman Kern, Savvas Chatzichristofis and Sandeep Gupta.
Lecture Book