Embedding and Similarity Search for Point Sets under Translation
Minkyoung Cho and David M. Mount, University of Maryland
SoCG 2008
Point Pattern Matching

Given two point sets P and Q, find Q' ⊆ Q minimizing
Dist(P, Q') = min_t dist(tP, Q'),
where t ranges over a group of geometric transformations (e.g., translation, rotation, …).

[Figure: point sets P and Q]
Point Pattern Similarity Search

A collection of point sets S = {P1, P2, …, PN} has been preprocessed. Given a query set Q, find the (approximate) nearest Pi with respect to a distance function and a transformation group.

[Figure: query Q matched against the collection S = {P1, P2, …, PN}]
Results

Method                                       Transformation              Space            Index  Note
Geometric Hashing                            Translation/Rotation/       O(Nn^(k+1))      YES    High space complexity
[Wolfson & Rigoutsos 97]                     Affine/…                    (k: frame size)
EMD embedding into Euclidean space           None                        O(Nn)            YES    Embeds EMD into L1
[Indyk & Thaper 03]
EMD under transformation sets                Scaling/Translation         O(Nn)            NO     Brute force, heuristic
[Cohen & Guibas 99]
Ours                                         Translation                 O(Nn log^2 n)    YES    Embeds SD into L1

EMD: Earth Mover's Distance; SD: Symmetric Difference Distance
Problem Definition

Point Pattern Similarity Search:
• Distance measure: symmetric difference distance
• Error model: outliers (but no noise)
• Transformation: translation
• Restriction: coordinates are integers

Example (symmetric difference): if P = {p1, p2, p3, p4} and Q = {p1, p2, p5, p6}, then
P Δ Q = (P \ Q) ∪ (Q \ P) = {p3, p4} ∪ {p5, p6}, so |P Δ Q| = 4.

Example (with translation):
P = {0, 12, 14, 23, 35, 54, 59, 64}
Q = {15, 17, 20, 26, 38, 57, 65, 67}
The translation t = 3 aligns P + 3 with Q on six of the eight points, so |(P + 3) Δ Q| = 4.
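The example above can be checked with a small brute-force sketch (the function name and the restriction of candidate translations to pairwise differences q − p are my own, not from the talk):

```python
def sd_translation_dist(P, Q):
    """Brute-force <P Δ Q> = min over t of |(P + t) Δ Q| for 1-d integer point sets.
    Only translations aligning at least one pair can beat the empty alignment."""
    P, Q = set(P), set(Q)
    best = len(P) + len(Q)                    # a translation matching nothing
    for t in {q - p for q in Q for p in P}:   # candidate translations
        best = min(best, len({p + t for p in P} ^ Q))
    return best

P = {0, 12, 14, 23, 35, 54, 59, 64}
Q = {15, 17, 20, 26, 38, 57, 65, 67}
assert sd_translation_dist(P, Q) == 4         # t = 3 matches 6 of the 8 points
```

This exhaustive search takes O(n^3) set operations and serves only as a correctness baseline for the embedding that follows.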
Motivation: Sources of Complexity

• The hard case is the combination of translation and outliers.
• Translation only: trivial matching — translate each point set so that its leftmost point lies at the origin.
• Outliers only: reduces to nearest-neighbor search in the Hamming cube (by hashing or random sampling).
Intuition

[Figure: an embedding f maps each stored point set P1, …, PN and the query Q into a common metric space, where nearest-neighbor search can be performed.]
Embedding: Basic Definitions

Given metric spaces (X, d) and (X', d'), a map f: X → X' is called an embedding.

The contraction of f is the maximum factor by which distances are shrunk:
  max_{x,y ∈ X} d(x, y) / d'(f(x), f(y)).

The expansion (or stretch) of f is the maximum factor by which distances are stretched:
  max_{x,y ∈ X} d'(f(x), f(y)) / d(x, y).

The distortion of f is the product of its contraction and expansion.
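These definitions can be made concrete on a toy embedding — the identity map from the plane with the Euclidean metric into the plane with the L1 metric (the point list and names below are illustrative, not from the talk):

```python
import itertools
import math

# Toy embedding f = identity, from (R^2, Euclidean) into (R^2, L1).
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 5), (3, 3)]

def d(p, q):   # source metric: Euclidean
    return math.hypot(p[0] - q[0], p[1] - q[1])

def d1(p, q):  # target metric: L1
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

pairs = list(itertools.combinations(points, 2))
expansion = max(d1(p, q) / d(p, q) for p, q in pairs)    # max stretch factor
contraction = max(d(p, q) / d1(p, q) for p, q in pairs)  # max shrink factor
distortion = expansion * contraction
# In the plane, L1 never shrinks Euclidean distances and stretches them by
# at most sqrt(2), attained by diagonal pairs such as (1,0)-(0,1):
assert abs(distortion - math.sqrt(2)) < 1e-9
```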
Main Result: Preliminaries

• Main result: there exists a randomized embedding that maps a point set, under the symmetric difference distance with respect to translation, into the metric space L1 with distortion O(log^2 n).
• Assumptions:
  – Each point set has at most n elements and lies in dimension d.
  – Coordinates are integers of magnitude polynomial in n.
• Distance function: symmetric difference with respect to translation,
  <P Δ Q> = min_t |(P + t) Δ Q|.
• Target metric: L1, i.e., for x, y ∈ R^d, ||x − y||_1 = Σ_{i=1}^{d} |x_i − y_i|.
Outline of Algorithm

1. Transform d-dimensional points into 1-dimensional points. (Distortion: 1)
2. Reduce the domain size using a linear hash function. (Distortion: O(1))
3. Make the representation invariant under translation. (Distortion: O(log^2 n))
4. Reduce the target domain size using a universal hash function. (Distortion: O(1))

[Figure: pipeline example — the point set {3, 6, 10, 14, 22} becomes a characteristic bit vector of length O(n log n), which is then turned into translation-invariant probe patterns.]
Translation Invariant

[Figure: a bit vector hP of length s is probed at ρ = 4 positions under every cyclic shift, producing the multiset of bit patterns ΦρP = {1101, 0001, 0000, 0010, 1100, 1010, …}.]
Intuition

[Figure: bit vectors hP and hQ of length s, with their probe-pattern multisets for probe sizes ρ = 2 and ρ = 4:
Φ2P = {10, 01, 00, 10, 01, 00, 10, 00, 00, 01, 00}
Φ2Q = {10, 00, 01, 00, 11, 00, 10, 01, 00, 11, 00}
Φ4P = {1101, 0000, 0010, 1100, 0000, 0001, 1000, 0010, 0101, 0000, 0010}
Φ4Q = {1011, 0100, 0010, 0101, 1000, 0011, 1100, 0010, 0100, 1001, 0000}]

If one of the probes hits a mismatched position, the generated bit patterns may differ. The probability that some probe hits a mismatched position increases as the probe size increases.
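This monotonicity is easy to check empirically. The simulation below (parameters and names are mine, chosen for illustration) estimates the probability that a probe of size ρ, with positions drawn uniformly with replacement, touches at least one of δ mismatched positions among s:

```python
import random

s, delta, trials = 64, 4, 20000
mismatched = set(range(delta))        # wlog, the first delta positions disagree
rng = random.Random(1)

def hit_prob(rho):
    """Estimate Pr[a random probe of size rho touches a mismatched position]."""
    hits = 0
    for _ in range(trials):
        probe = [rng.randrange(s) for _ in range(rho)]   # positions with replacement
        if any(j in mismatched for j in probe):
            hits += 1
    return hits / trials

assert hit_prob(2) < hit_prob(16)     # larger probes separate P and Q more often
```

The exact probability under this model is 1 − (1 − δ/s)^ρ, which the estimates approach for large trial counts.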
Relationship between ρ (probe size) and δ*

[Figure: "Distance of Invariants" — as the probe size ρ = 2^i increases from 1 toward s, the expected invariant distance E|ΦρP Δ ΦρQ| grows toward an upper bound; at an appropriate (unknown) probe size it is within an O(ln s) factor of δ*.]

δ: estimated distance; δ*: original distance.
Embedding

[Figure: "Distance of Invariants" across probe sizes ρ = 2^0, 2^1, …, 2^L, …, 2^H, where 2^H = 2^{log 2n} = 2n. The embedding Ψ concatenates the (suitably scaled) invariants over all these probe sizes; summing the per-level contributions bounds ||ΨP − ΨQ||_1 between δ* and O(log n) · δ* in expectation.]

δ: estimated distance; δ*: original distance.
Build Time

The expensive operations are building the invariant and hashing over a large domain.

Building the invariant: (# of probes) × (# of translations).
Trivially: O(s) × O(s) = O(n log n) × O(n log n) = O(n^2 log^2 n).

Universal hash function: (# of elements) × (matrix operation) = (# of elements) × (input size) × (output size).
Trivially: O(s) × O(s) × O(log s) = O(s^2 log s) = O(n^2 log^3 n).

We can improve this to O(n log^3 n) by merging the two operations. Surprise!
Merge Two Operations

[Figure: the bit vector hP of length s is combined with a random row bit vector r0 of the hash matrix H, restricted to the probe positions; the values r_i · (shifted hP) for all s shifts, i = 0, …, log s, form a convolution Conv((r0 f), hP).]

A convolution can be computed in O(n log n) time, where n is the size of the array.
Main Result: Formal Statement

Given a failure probability β, there exists a randomized embedding from a point set P into a vector ΨP of dimension O(n (log^2 n) log(1/β)) such that for any P, Q:
(i) ||ΨP − ΨQ||_1 ≤ 2 (log n) · <P Δ Q>, and
(ii) ||ΨP − ΨQ||_1 ≥ <P Δ Q> / (17 log n), with probability at least 1 − β.

This embedding can be computed in time O(n (log^4 n) log(1/β)).
Open Problems

• Q1. Can we improve the distortion bound (currently O(log^2 n))? Cormode & Muthukrishnan show how to embed a string under edit distance with moves into L1 with O(log n log* n) distortion.
• Q2. Can we derandomize the algorithm? Cormode & Muthukrishnan's algorithm is deterministic.
• Q3. Can we improve the space/time complexities?
Other Extensions

• Q1. Can we support a distance measure that is robust to noisy data (e.g., Hausdorff distance)?
• Q2. Can we handle other transformation groups?
  – integer scaling?
  – integer scaling + translation?
  – affine transformations over finite vector spaces?
Thank You!
Translation Invariant

Example: P = {3, 6, 10, 14, 22} with h(x) = x mod s (e.g., s = 11) gives the characteristic bit vector hP of length s.

With probe size ρ = 4, reading hP at the probe positions under each of the s cyclic shifts yields the bit patterns {1101, 0001, 0000, 0010, 1100, 1010, …}, i.e., as integers, ΦρP = {13, 0, 2, 12, 1, …, 10}.

A second hash h'(x) (for simplicity, x mod 10) buckets these pattern values into a histogram over {0, 1, …, 9}.
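The construction above can be sketched directly; the probe offsets below are an arbitrary choice of mine, and the check confirms that the multiset of probe patterns is unchanged when P is translated (a translation only cyclically shifts hP, and the invariant ranges over all shifts):

```python
from collections import Counter

def char_vector(P, s):
    """Characteristic bit vector of h(P), with h(x) = x mod s."""
    v = [0] * s
    for p in P:
        v[p % s] = 1
    return v

def invariant(P, s, probe):
    """Multiset of bit patterns read at the probe offsets under all s cyclic shifts."""
    v = char_vector(P, s)
    pats = Counter()
    for t in range(s):
        pats[tuple(v[(j + t) % s] for j in probe)] += 1
    return pats

s = 11
probe = [0, 2, 5, 8]                   # rho = 4 probe offsets (chosen arbitrarily)
P = [3, 6, 10, 14, 22]
assert invariant(P, s, probe) == invariant([p + 4 for p in P], s, probe)
```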
Trial 1: Geometric Hashing for Translation

• Naïve version:
  – Space complexity is O(Nn^2), since the frame size is 1.
  – With outliers in a query, the number of queries will increase.
• Adaptive version: to reduce space complexity, if we store only c transformed sets, then the number of queries will increase.
• Outliers may lead to false matches, and thus increase the probability of false positives.
Geometric Hashing with Outliers (delete)

Given r outliers and frame size k, the number of queries must increase to obtain a correct result:
  method 1: Pr[choose a valid frame set] = (1 − r/n)^k
  method 2: (r + 1) different trials (deterministic)
  method 3: pigeonhole principle; Pr[choose a valid frame set] = 1 − r/(n/k)

[Grimson & Huttenlocher 90]: outliers lead to false matches and increase the probability of false positives.
d-Dimension → 1-Dimension

Let u be the maximum coordinate value of each point. Then we can map a d-dimensional point set to a 1-dimensional point set, with coordinates of size at most (3u)^d, without changing the symmetric difference distance under translation.

[Figure: the points (1,1) and (5,3) are encoded by concatenating per-coordinate blocks of a bit vector, with padding between blocks.]
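The claim can be checked on a tiny example. The sketch below uses one natural encoding — writing each point in base 3u, so the padding factor 3 keeps digits from interacting under bounded translations; the talk's exact map may differ, and the helper names are mine:

```python
def sd_dist(P, Q, diff, add):
    """Brute-force min over candidate translations t of |(P + t) Δ Q|."""
    P, Q = set(P), set(Q)
    best = len(P) + len(Q)
    for t in {diff(q, p) for q in Q for p in P}:
        best = min(best, len({add(p, t) for p in P} ^ Q))
    return best

u = 8                       # assumed bound on coordinate magnitude
B = 3 * u                   # base 3u: padding keeps digits from interacting
to1d = lambda p: p[0] * B + p[1]

P2 = {(1, 1), (5, 3), (2, 2)}
Q2 = {(2, 3), (6, 5), (3, 4), (0, 7)}   # P2 shifted by (1, 2), plus one outlier

d2 = sd_dist(P2, Q2,
             lambda q, p: (q[0] - p[0], q[1] - p[1]),
             lambda p, t: (p[0] + t[0], p[1] + t[1]))
d1 = sd_dist({to1d(p) for p in P2}, {to1d(q) for q in Q2},
             lambda q, p: q - p, lambda p, t: p + t)
assert d2 == d1 == 1        # the 1-d image preserves the distance here
```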
# of Primes & Collision Probability

• Collision probability: let h(x) = x mod s, where s is a prime number chosen uniformly at random from a range of size Θ(n log n).
  For x ≠ y:
  Pr[h(x) = h(y)] = Pr[(x mod s) = (y mod s)] = Pr[(x − y) mod s = 0].
  Since x, y ∈ Z_{n^c}, we have |x − y| < n^c, so at most c of the candidate primes can divide x − y, and
  Pr[h(x) = h(y)] ≤ c / (# of primes) = O(1/n).
• Prime Number Theorem: there are Θ(m / log m) prime numbers between 1 and m.
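The counting argument can be demonstrated concretely: among all primes larger than n, fewer than c of them can divide any fixed number below n^c. The parameters and the example x below are mine:

```python
def primes_between(lo, hi):
    """Simple sieve of Eratosthenes, returning the primes in [lo, hi]."""
    sieve = bytearray([1]) * (hi + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(hi ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, hi + 1, i)))
    return [p for p in range(lo, hi + 1) if sieve[p]]

n, c = 100, 2
S = primes_between(n + 1, 10 * n)     # candidate moduli: primes larger than n
x, y = 101 * 37, 0                    # |x - y| < n**c
colliding = [s for s in S if (x - y) % s == 0]
# Fewer than c primes > n can divide a number below n**c, so collisions are rare:
assert len(colliding) < c and len(S) > 100
```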
Distance Distortion by Hashing

We can achieve O(1) distortion with a hash function whose collision probability is O(1/n). Note that the distance can only be contracted by collisions, never expanded.
Linear Hash Function (X)

• h(x) = x mod s, where s is a prime number in Θ(n log n).
• Linearity: h(x + t) = h(x) + h(t) (mod s), so the invariant is preserved under translation: ΦρP = Φρ(P + t).

[Figure: P = {3, 6, 10, 14, 22} mapped to a bit vector of length s.]
Universal Hash Function for a Large Domain

Since the maximum probe size is O(n log n), the input domain of the hash function has size 2^{O(n log n)}; however, only Θ(n log n) of its elements actually occur.

• H: {0,1}^s → {0,1}^k
  H(x) = R x + b (mod 2), where R is a random k × s matrix and b is a random k-bit row vector.
• Time complexity: computing one value costs O(k s) = O((log n) · n log n) = O(n log^2 n); over all s = O(n log n) values, the total time is O(n^2 log^3 n).
Relationship between ρ and δ*

[Figure: backup version of the earlier slide — as the probe size ρ = 2^i grows, the expected invariant distance E|ΦρP Δ ΦρQ| grows toward an upper bound; at an appropriate (unknown) probe size it is within an O(ln s) factor of δ*.]

δ is a guess distance; δ* is an optimal distance.
Effect of Hash Functions

[Figure: the same diagram as above, annotated with the hash functions h and h'; after hashing, the estimate remains within an O(log s) factor of δ*.]
Merge Two Operations using FFT & Convolution

Π = random_probe(ρ, s)
for t = 1, …, s:  x(t) = (hP + t)[Π]                  // make an invariant
for t = 1, …, s:  x'(t) = H x(t) + b (mod 2)          // H: an O(log s) × ρ matrix
                  ΦρP[x'(t)]++

Time complexity: O(s) × O(matrix multiplication) = O(s) × O(s log s).
------------------------------------------------------------------------
H = [r1, r2, …, r_{O(log s)}]'        // r_i: a binary row bit vector
H x(t) = [r1 x(t), r2 x(t), r3 x(t), …, r_{O(log s)} x(t)]'
r_i x(t) = r_i (hP + t)[Π] = (hP + t)[Π r_i]
[r_i x(0), r_i x(1), …, r_i x(s)] = fliplr(hP) ⊛ [Π r_i]

Time complexity: O(log s) × O(convolution) = O(log s) × O(s log s).
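The key identity — computing the masked sum r_i · x(t) for all s shifts at once as a single cyclic cross-correlation — can be verified numerically. In the numpy sketch below (variable names are mine), `mask` plays the role of the probe positions combined with one hash row r_i:

```python
import numpy as np

rng = np.random.default_rng(0)
s = 16
hP = rng.integers(0, 2, s).astype(float)    # characteristic bit vector
mask = rng.integers(0, 2, s).astype(float)  # probe positions AND a hash row r_i

# Direct: dot product of the mask with hP cyclically shifted by t, for every t.
direct = np.array([mask @ np.roll(hP, -t) for t in range(s)])

# All s shifts at once via one cyclic cross-correlation (FFT, O(s log s)).
corr = np.fft.irfft(np.fft.rfft(hP) * np.conj(np.fft.rfft(mask)), s)
assert np.allclose(corr, direct)
```

One FFT-based correlation per hash row replaces s separate dot products, which is exactly the O(log s) × O(s log s) accounting above.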
Build Time

                                Trivial running time    Ours
d-dimension → 1-dimension       O(dn)                   O(dn)
Linear hashing                  O(n)                    O(n)
Invariant under translation     O(n^2 log^2 n)          O(n log^3 n)
Universal hashing*              O(n^2 log^4 n)          (merged with the row above)

* due to the domain size, we need to use matrix multiplication