Date post: | 07-Aug-2015 |
REGULARISED CROSS-MODAL HASHING
SEAN MORAN†, VICTOR LAVRENKO†
CROSS-MODAL HASHING: FAST RETRIEVAL WITH HASHCODES
• Encourage collisions between similar images and documents
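[Figure: a query and the database items are each mapped by a hash function H to binary hashcodes (e.g. 110101, 010111, 010101, 111101) stored in a hashtable. Items whose codes collide with the query's are returned as nearest neighbours, and similarity is then computed against those candidates. Example labels: Tiger, Camera, Flowers, Car.]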
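The collision idea above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the item names and six-bit codes are made up for the example, and `build_hashtable` and `lookup` are hypothetical helper names.

```python
from collections import defaultdict


def build_hashtable(database_codes):
    """Group database item ids into buckets keyed by their hashcode."""
    table = defaultdict(list)
    for item_id, code in database_codes.items():
        table[code].append(item_id)
    return table


def lookup(table, query_code):
    """Return the ids of database items whose hashcode collides with the query's."""
    return list(table.get(query_code, []))


# Hypothetical codes: in cross-modal hashing the images and the text query
# are hashed by different functions into the same Hamming space.
image_codes = {
    "tiger_photo_1": "110101",
    "tiger_photo_2": "110101",
    "car_photo": "010111",
    "flower_photo": "111101",
}
table = build_hashtable(image_codes)

text_query_code = "110101"  # assumed code for the text query "tiger"
candidates = lookup(table, text_query_code)
print(candidates)
```

Only the colliding bucket then needs to be compared with the query, rather than the whole database.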
REGULARISED CROSS-MODAL HASHING (RCMH)
• Step A: Randomly initialise word modality bits $B \in \{-1, 1\}^{N \times K}$, where $N$ is the number of data-points and $K$ is the number of bits
• Repeat M times:
– Step B: Graph Regularisation
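$B \leftarrow \mathrm{sgn}\left(\alpha S D^{-1} B + (1-\alpha) B\right)$
∗ $S \in \{0, 1\}^{N \times N}$: adjacency matrix, $D \in \mathbb{Z}_{+}^{N \times N}$: diagonal degree matrix, $B \in \{-1, 1\}^{N \times K}$: bits, $\alpha \in [0, 1]$: interpolation parameter, $\mathrm{sgn}$: sign function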
– Step C: Word Data-Space Partitioning
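for $k = 1 \ldots K$: $\min \|f_k\|^2 + C \sum_{i=1}^{N} \xi_{ik}$
s.t. $B_{ik}(f_k^{\top} a_i + b_k) \geq 1 - \xi_{ik}$ for $i = 1 \ldots N$
∗ $f_k \in \mathbb{R}^D$: word hyperplane, $b_k \in \mathbb{R}$: bias, $a_i \in \mathbb{R}^D$: word descriptor, $B_{ik}$: bit $k$ for word data-point $a_i$, $\xi_{ik}$: slack variable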
– Step C: Visual Data-Space Partitioning
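for $k = 1 \ldots K$: $\min \|g_k\|^2 + C \sum_{i=1}^{N} \xi_{ik}$
s.t. $B_{ik}(g_k^{\top} v_i + t_k) \geq 1 - \xi_{ik}$ for $i = 1 \ldots N$
∗ $g_k \in \mathbb{R}^D$: visual hyperplane, $t_k \in \mathbb{R}$: bias, $v_i \in \mathbb{R}^D$: visual descriptor, $B_{ik}$: bit $k$ for word data-point $a_i$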
– Step D: Update matrix $B$ using word hyperplanes: $B_{ik} = \mathrm{sgn}(f_k^{\top} a_i + b_k)$
• Use hyperplanes $\{f_k, g_k\}_{k=1}^{K}$ to encode unseen data-points
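Encoding an unseen data-point amounts to taking the sign of its projection onto each learned hyperplane. The sketch below illustrates this for both modalities. The hyperplane values and descriptors are invented for the example, `encode` and `hamming` are hypothetical helper names, and mapping sgn(0) to +1 is an assumption the poster does not specify.

```python
def sgn(x):
    """Sign function; mapping 0 to +1 is an assumption of this sketch."""
    return 1 if x >= 0 else -1


def encode(descriptor, hyperplanes, biases):
    """Return one bit in {-1, +1} per hyperplane: sgn(w . x + bias)."""
    bits = []
    for w, bias in zip(hyperplanes, biases):
        projection = sum(w_d * x_d for w_d, x_d in zip(w, descriptor))
        bits.append(sgn(projection + bias))
    return bits


def hamming(code_a, code_b):
    """Number of bit positions in which two codes differ."""
    return sum(1 for p, q in zip(code_a, code_b) if p != q)


# Hypothetical learned hyperplanes for K = 2 bits and 2-D descriptors.
word_hyperplanes, word_biases = [(1.0, 0.0), (0.0, 1.0)], [0.0, 0.0]       # f_k, b_k
visual_hyperplanes, visual_biases = [(2.0, 1.0), (-1.0, 2.0)], [0.0, 0.0]  # g_k, t_k

word_code = encode((2.0, -2.0), word_hyperplanes, word_biases)       # unseen word descriptor
image_code = encode((3.0, -1.0), visual_hyperplanes, visual_biases)  # unseen visual descriptor
print(word_code, image_code, hamming(word_code, image_code))
```

Because both modalities are trained against the same bit matrix B, a word query and a related image should land on nearby codes, which is what makes their Hamming distance meaningful.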
STEP B: GRAPH REGULARISATION
[Figure: neighbourhood graph over data-points a to e, each annotated with a two-bit hashcode and its word labels (Car, Tiger).]
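A single Step B update can be written out directly from the formula B ← sgn(α S D⁻¹ B + (1−α) B). The following is a small pure-Python sketch under stated assumptions: S is symmetric with a zero diagonal, D holds the row sums of S, sgn(0) is mapped to +1, and the three-node graph is a toy example rather than the one in the poster figure.

```python
def sgn(x):
    """Sign function; mapping 0 to +1 is an assumption of this sketch."""
    return 1 if x >= 0 else -1


def graph_regularise(S, B, alpha):
    """One synchronous update B <- sgn(alpha * S D^-1 B + (1 - alpha) * B)."""
    n, num_bits = len(B), len(B[0])
    degree = [sum(row) for row in S]  # diagonal of D (row sums of S)
    new_B = []
    for i in range(n):
        row = []
        for k in range(num_bits):
            # (S D^-1 B)[i][k] = sum_j S[i][j] * B[j][k] / degree[j]
            neighbour_term = sum(
                B[j][k] / degree[j] for j in range(n) if S[i][j]
            )
            row.append(sgn(alpha * neighbour_term + (1 - alpha) * B[i][k]))
        new_B.append(row)
    return new_B


# Toy graph: three mutually connected data-points.
S = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
# The third point disagrees with its neighbours on the first bit.
B = [
    [1, -1],
    [1, -1],
    [-1, -1],
]
smoothed = graph_regularise(S, B, alpha=0.8)
print(smoothed)
```

With α = 0.8 the disagreeing first bit of the third point is pulled towards its neighbours, while bits the neighbourhood already agrees on are left unchanged.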
STEP C: DATASET PARTITIONING IN WORD AND VISUAL SPACE (LEARN HASHING HYPERPLANES)
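[Figure: two panels. Word Space (axes a1, a2): data-points a to e with two-bit codes and word labels (Car, Tiger), partitioned by hyperplanes f1 and f2. Visual Space (axes v1, v2): the same data-points a to e with two-bit codes, partitioned by hyperplanes g1 and g2.]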
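Step C fits one soft-margin hyperplane per bit in each modality, using the bits in B as labels. The sketch below minimises the equivalent hinge-loss form of the objective, ||w||² + C Σ max(0, 1 − B_ik (wᵀx_i + bias)), with plain subgradient descent. The choice of solver, the learning rate, the epoch count and the toy descriptors are all assumptions made for illustration; the poster does not say which SVM solver was used.

```python
def train_hyperplane(X, y, C=1.0, lr=0.01, epochs=200):
    """Subgradient descent on ||w||^2 + C * sum_i max(0, 1 - y_i (w . x_i + bias))."""
    dim = len(X[0])
    w, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [2.0 * w_d for w_d in w]  # gradient of the ||w||^2 term
        grad_bias = 0.0
        for x_i, y_i in zip(X, y):
            margin = y_i * (sum(w_d * x_d for w_d, x_d in zip(w, x_i)) + bias)
            if margin < 1.0:  # the slack variable is active for this point
                for d in range(dim):
                    grad_w[d] -= C * y_i * x_i[d]
                grad_bias -= C * y_i
        w = [w_d - lr * g_d for w_d, g_d in zip(w, grad_w)]
        bias -= lr * grad_bias
    return w, bias


def predict_bit(x, w, bias):
    """Bit assigned by a hyperplane; sgn(0) is mapped to +1 by assumption."""
    return 1 if sum(w_d * x_d for w_d, x_d in zip(w, x)) + bias >= 0 else -1


def learn_hyperplanes(X, B):
    """Train one hyperplane per bit column of B."""
    num_bits = len(B[0])
    return [train_hyperplane(X, [row[k] for row in B]) for k in range(num_bits)]


# Toy data: N = 4 data-points, K = 2 bits, 2-D descriptors in each modality.
word_descriptors = [(2.0, 2.0), (2.0, -2.0), (-2.0, 2.0), (-2.0, -2.0)]    # a_i
visual_descriptors = [(1.0, 3.0), (3.0, -1.0), (-3.0, 1.0), (-1.0, -3.0)]  # v_i
B = [[1, 1], [1, -1], [-1, 1], [-1, -1]]

word_planes = learn_hyperplanes(word_descriptors, B)      # (f_k, b_k) pairs
visual_planes = learn_hyperplanes(visual_descriptors, B)  # (g_k, t_k) pairs

word_bits = [[predict_bit(a, w, b) for w, b in word_planes] for a in word_descriptors]
visual_bits = [[predict_bit(v, w, t) for w, t in visual_planes] for v in visual_descriptors]
print(word_bits, visual_bits)
```

On this separable toy data both sets of hyperplanes reproduce the bits in B, so the word and visual descriptors of the same data-point receive the same code.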
QUANTITATIVE RESULTS (NUS-WIDE, 64 BITS, HAMMING RANKING MEAN AVERAGE PRECISION (MAP))
[Figure: two bar charts of mAP (y-axis) per Model (x-axis: CRH, CVH, CMSSH, IMH, PDH, RCMH). Panel "Text Database - Image Query": annotated "+6% vs PDH". Panel "Image Query - Text Database": annotated "+9% vs PDH".]
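For readers unfamiliar with the metric, Hamming ranking MAP can be computed as follows: rank the database by Hamming distance to each query's hashcode, compute average precision over that ranking, then average across queries. This is a generic sketch of the metric, not the authors' evaluation code. The codes and relevance sets are made up, and breaking distance ties by database order is an assumption.

```python
def hamming(code_a, code_b):
    """Number of bit positions in which two codes differ."""
    return sum(1 for p, q in zip(code_a, code_b) if p != q)


def average_precision(query_code, database_codes, relevant_ids):
    """Rank the database by Hamming distance to the query and compute AP."""
    # sorted() is stable, so ties keep database order (an assumption of this sketch).
    ranking = sorted(
        range(len(database_codes)),
        key=lambda idx: hamming(query_code, database_codes[idx]),
    )
    hits, precision_sum = 0, 0.0
    for rank, idx in enumerate(ranking, start=1):
        if idx in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant item
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(queries, database_codes):
    """Average the per-query AP over (query_code, relevant_ids) pairs."""
    aps = [average_precision(code, database_codes, rel) for code, rel in queries]
    return sum(aps) / len(aps)


# Hypothetical 4-bit database codes and two queries with their relevant items.
database = [
    [1, 1, -1, -1],
    [1, -1, -1, -1],
    [-1, -1, -1, -1],
    [-1, -1, 1, 1],
]
queries = [
    ([1, 1, -1, -1], {0, 2}),
    ([-1, -1, 1, 1], {3}),
]
print(mean_average_precision(queries, database))
```

For the first query the relevant items are ranked first and third, giving AP = (1 + 2/3) / 2; the second query ranks its only relevant item first, giving AP = 1.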
• State-of-the-art retrieval effectiveness on standard datasets
• Graph regularisation can produce high-quality cross-modal hashcodes
• No need to solve an intermediate eigenvalue problem (saves $O(D^3)$)
• Future work: extend to fast cross-lingual (e.g. Spanish-English) retrieval