Source: kaiminghe.com/publications/cvpr12nnf.pdf

Computing Nearest-Neighbor Fields via Propagation-Assisted KD-Trees

Kaiming He Jian Sun

Microsoft Research Asia

Abstract

Matching patches between two images, also known as computing nearest-neighbor fields, has been proven a useful technique in various computer vision/graphics algorithms. But this is a computationally challenging nearest-neighbor search task, because both the query set and the candidate set are of image size. In this paper, we propose Propagation-Assisted KD-Trees to quickly compute an approximate solution. We develop a novel propagation search method for kd-trees, in which the tree nodes checked by each query are propagated from the nearby queries. This method not only avoids the time-consuming backtracking in traditional tree methods, but is also more accurate. Experiments on public data show that our method is 10-20 times faster than the PatchMatch method [4] at the same accuracy, or reduces its error by 70% at the same running time. Our method is also 2-5 times faster and more accurate than Coherency Sensitive Hashing [22], a recent state-of-the-art method.

1. Introduction

Non-parametric patch sampling is of central importance in various computer vision/graphics algorithms, including image inpainting/retargeting [31, 28], denoising [8, 10], texture synthesis [13, 30], super-resolution [17], matching self-similarities [26], and rendering [21]. A common step in non-parametric patch sampling is to compute approximate nearest-neighbor fields (ANNF) [4, 22]: given two images A and B, find for every patch in A a similar patch in B. ANNF computation is a special nearest-neighbor search problem, where the query/candidate set consists of all (overlapping) patches in the image A/B. It is challenging to compute ANNF quickly, because the sizes of both sets are very large.

Traditional approximate nearest-neighbor (ANN) algorithms organize the candidates to facilitate searching. A popular and effective way is through trees, such as the kd-tree [15] and other varieties [30, 11, 23, 24]. Tree-based methods organize the candidates adaptively to their distribution in the search space. A query can find its ANN by checking a small portion of the candidates. Tree-based methods are still among the state-of-the-art ANN solutions [24, 29]. However, the backtracking behavior [15, 2] of tree methods prevents interactive or real-time computation of ANN fields, where the number of queries is massive.

The queries are treated individually in traditional ANN methods. But they are strongly dependent in ANNF problems, because they are overlapping patches from the same image. A milestone method called PatchMatch [4] observes that images are coherent, so the matching result of a query patch can be propagated to the nearby queries and thus reused. This method is much faster than kd-trees, enabling ANNF computation at interactive rates. Nevertheless, PatchMatch does not consider the candidate distribution and visits many implausible candidates. Also, its results tend to be trapped in local optima due to the short-distance propagation.

In this paper, we propose Propagation-Assisted KD-Trees for efficient ANNF computation. Our key insight is that we can jointly exploit the distribution of the candidates (all patches in B) and the dependency of the queries (all patches in A). After organizing the candidates in a kd-tree, our method checks for each query its own leaf¹ and an extra leaf propagated from nearby queries. The candidates in the propagated leaf need not be spatially close, so the algorithm can jump out of local optima. Our algorithm is very fast because it checks only a small number of candidates (e.g., 20 per query) and has no backtracking. In addition to efficiency, our algorithm is highly accurate thanks to the data-adaptive structure of the kd-tree. In experiments on the public data set [9], we observe a 10-20 times speedup versus PatchMatch at the same accuracy, or 70% less error at the same running time. Very recently, a method called Coherency Sensitive Hashing (CSH) [22] improved PatchMatch by introducing a hashing scheme. We find that our method is 2 to 5 times faster than this latest state-of-the-art method and is more accurate.

Interestingly, our experiments also show that a traditional kd-tree (where each query is treated independently) combined with a suitable representation has performance comparable with PatchMatch. This is in contrast to the results reported in [4], which finds kd-trees inferior. Our discovery indicates that candidate distributions (exploited by kd-trees) and query dependency (exploited by PatchMatch) can contribute almost equally to ANNF computation. Thus combining these two aspects, as in our method, can make a significant improvement.

¹In our experiments a typical leaf contains 8 candidates.

Figure 1. Propagation in PatchMatch. Each square represents the top-left pixel of a patch, so the nearby patches actually overlap. The solid arrow indicates a good match. The dashed arrow indicates a candidate to be checked.

2. Related Work

We review the works most related to our method: tree-based methods, and PatchMatch and its improvements.

Tree-based methods. A classical method for ANN search is the kd-tree [15]. It is a binary tree where each node denotes a subset of the candidate data with a partitioning boundary of the data space. The partitioning is adaptive to the distribution of the data. Given a query, the search methods [15, 2] descend the tree to a leaf and backtrack to other leaves that are close to the query. The search accuracy depends on the amount of backtracking [2, 27].

To better organize the candidates and reduce backtracking, various types of trees have been proposed, including the k-means tree [16, 24], bbd-tree [3], TSVQ [30], vp-tree [23], rp-tree [11], multiple randomized kd-trees [27, 24], and so on. Though all these methods can improve accuracy, they spend extra effort building the trees; e.g., the building time of a k-means tree [24] is 10-20 times more than that of a kd-tree. This is unfavorable in ANNF computation, because we often have to build the tree on-line.

Recent studies [24, 29] show that tree-based methods, including the simple kd-tree, are still among the state-of-the-art ANN methods. But it is worth noticing that in these evaluations the queries are treated independently.

PatchMatch. The PatchMatch method [4] utilizes the dependency among the queries and performs searching collaboratively. It is observed that images are coherent: the patches in a neighborhood in image A are likely to match the patches in a neighborhood in image B. Typically, if a pair of patches are similar, they are likely to remain similar when shifted by one or a few pixels simultaneously (see Fig. 1). Thus the matching result of a query patch can be propagated to the next query, providing a good initial guess which is updated by some randomly sampled candidates. The random search in turn provides better sources for propagation. This process is iterated. PatchMatch has been shown to be 20 to 100 times faster than kd-trees plus Principal Component Analysis (PCA) [4]. But because PatchMatch does not organize the data beforehand, most randomly sampled candidates are unlikely to be good matches. And its local propagation often leads to over-smoothed results.

Figure 2. The first 16 Walsh-Hadamard Transform (WHT) bases (white = 1, black = -1). Each basis is a 4n×4n-pixel kernel.

Most recently, a method called Coherency Sensitive Hashing (CSH) [22] improves PatchMatch by incorporating Locality Sensitive Hashing (LSH) [12]. All the patches (queries and candidates) are hashed into bins, and similar patches have a good chance of falling into the same bin. Random candidates are sampled from the bins that most potentially contain good matches. This hashing method is combined with propagation. The method is also iterative, switching the hashing function in each pass. It shows a 3-4 times speedup versus PatchMatch. But this binning scheme is in general unbalanced and not aware of the data distribution, and the bin sizes need to be adjusted carefully.

Random sampling is required in PatchMatch/CSH to reduce the number of candidates checked. Our method has no randomness, and reduces the candidates checked via data organization.

3. Algorithm

We observe that the distribution of the candidates and the dependency of the queries can be exploited jointly. We use a kd-tree with a suitable representation to organize the candidates. Then we propose a novel propagation-assisted search method for fast and accurate querying.

3.1. Patch Representation

A p-by-p patch in a color image can be represented by a vector in a 3p²-dimensional space. The similarity between two patches is described by the L2 distance in this space. Because the kd-tree is less effective for high-dimensional data, it is recommended to reduce dimensionality via PCA [27, 4]. But projecting each patch on each PCA basis requires slow O(3p²)-time pre-computation.

Figure 3. The joint distribution P(e0, e1) under different search strategies: (a) standard search (err: 11.13); (b) priority search (err: 10.71); (c) randomized kd-trees (err: 7.80); (d) propagation from GT (err: 2.77); (e) our propagation search (err: 3.04). A darker color means higher probability. Leaf #0 is the same in all these figures, so the marginal distribution P(e0) is identical; we only change Leaf #1. In brackets is the average error of the best candidate in the union of Leaf #0 and #1. The error in Leaf #0 alone is 13.84. The error is reduced only when e1 < e0, i.e., below the diagonal line. The distributions are computed using the images in [25]. Each leaf contains m = 8 candidates in this test. Here the error is computed through the 24-d representation.

Instead we use the Walsh-Hadamard Transform (WHT) [20] as the bases (Fig. 2). It is a Fourier-like orthogonal transform. In natural images the first few WHT bases contribute a large portion of the L2 distance [20], just like the PCA bases. Speed-wise, projecting on each WHT basis requires only 2 operations per patch [6]. We use the first 16 WHT bases for the Y channel² and 4 for each chrominance channel (Cb/Cr) throughout this paper. Thus we represent each patch by a 24-d vector. In CSH [22] the WHT is used for constructing hashing functions.
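As a concrete sketch of this representation, the snippet below projects a single-channel patch onto low-sequency 2-D WHT bases. It is our own illustration, not the paper's implementation: it assumes p is a power of 2, builds the Hadamard matrix explicitly, and uses a naive O(p²)-per-basis projection rather than the fast 2-operation kernels of [6]; all names are hypothetical.

```python
import numpy as np

def wht_patch_descriptor(patch, n_bases=16):
    """Project a p-by-p (single-channel) patch onto the first n_bases
    low-sequency 2-D Walsh-Hadamard bases. Naive sketch; assumes p is
    a power of 2."""
    p = patch.shape[0]
    H = np.array([[1]])
    while H.shape[0] < p:                 # Sylvester construction of the
        H = np.block([[H, H], [H, -H]])   # p-by-p Hadamard matrix
    # order rows by sequency (number of sign changes), as in Fig. 2
    order = np.argsort([int((np.diff(row) != 0).sum()) for row in H])
    H = H[order]
    coeffs = H @ patch @ H.T              # full 2-D WHT of the patch
    k = int(np.ceil(np.sqrt(n_bases)))    # keep a low-sequency k-by-k block
    return coeffs[:k, :k].ravel()[:n_bases]
```

Concatenating 16 such coefficients for the Y channel with 4 for each of Cb/Cr would give the paper's 24-d vector.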

3.2. Building a KD-Tree

After computing the WHT, we build a traditional kd-tree [15] in the 24-d representation space. Given any candidate set, we choose the dimension with the maximum spread³ and split the space at the median value of the candidate data in this dimension. The median split ensures balance. The candidate set is divided recursively until each terminal node (leaf) contains at most m candidates. We test m = 8 to 64 in this paper.
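The build procedure above (max-spread dimension, median split, leaves of at most m candidates) can be sketched as follows. This is a minimal reconstruction with hypothetical names and a nested-dict tree, not the authors' C++ implementation; it assumes the candidates are already 24-d WHT descriptors.

```python
import numpy as np

def build_kdtree(points, idx=None, m=8):
    """Split on the dimension of maximum spread at the median; recurse
    until each leaf holds at most m candidates. `points` is an (n, d)
    array of descriptors; returns a nested-dict tree."""
    if idx is None:
        idx = np.arange(len(points))
    if len(idx) <= m:
        return {"leaf": idx}                      # terminal node (leaf)
    sub = points[idx]
    dim = int(np.argmax(sub.max(axis=0) - sub.min(axis=0)))  # max spread
    order = idx[np.argsort(sub[:, dim], kind="stable")]
    mid = len(order) // 2                          # median split keeps balance
    split_val = points[order[mid], dim]
    return {"dim": dim, "split": split_val,
            "left": build_kdtree(points, order[:mid], m),
            "right": build_kdtree(points, order[mid:], m)}

def descend(tree, q):
    """Greedy descent to Leaf #0 (no backtracking)."""
    while "leaf" not in tree:
        tree = tree["left"] if q[tree["dim"]] < tree["split"] else tree["right"]
    return tree["leaf"]
```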

3.3. Analysis of Search Strategies

In the search step, a query greedily descends the tree from the root to a leaf by checking the partitioning boundary in each node. We denote this leaf as Leaf #0. The query lies in the hyper-cube determined by this leaf, so any search strategy should check all candidates in Leaf #0. But there is a chance that better candidates lie in other leaves, e.g., when the query is very close to the boundary. One may find a better result by checking other leaves. A common question in tree-based methods is: how do we determine the next leaves to be checked?

The standard search method [15] backtracks in depth-first order and examines those leaves close to the query. Denote the first leaf visited in backtracking as Leaf #1. We define e1 as the error of the best match in this leaf:

e1(pA) = min_{pB ∈ Leaf #1} ‖pA − pB‖2 − ‖pA − p̂B‖2,    (1)

where pA is a query patch, pB is a candidate patch, and p̂B is the true nearest neighbor in the whole image. We can define e0 as the error of the best match in Leaf #0 in a similar way. To see how well Leaf #1 improves the result, we test on the images in [25] and plot the joint probability distribution P(e0, e1) in Fig. 3(a). Notice that Leaf #1 can improve the result only when e1 < e0, i.e., below the line e1 = e0. But the distribution is mostly above this line. In this example the average error in Leaf #0 alone (mean(e0)) is 13.84, and combining Leaf #1 reduces the error to 11.13 (mean(min(e0, e1))). The improvement is minor.

²Computing 16 WHT bases requires p to be a multiple of 4. But we can relax this constraint by interpolation and allow any patch size.
³The spread is defined as the difference between the largest and smallest values in the dimension [3].

The priority search method [2] backtracks through the sub-trees in order of their distance to the query, so better leaves are expected to be visited earlier. In this case, the distribution P(e0, e1) is shown in Fig. 3(b). We see that most points are still located above the line e1 = e0, and the improvement is not large (error = 10.71).

In [27] it is observed that the leaves visited via backtracking are mutually dependent, and the improvement from visiting more leaves is diminishing. The method in [27] instead builds multiple randomized kd-trees, so the searches are largely independent among trees. Here we generate Leaf #1 by descending a second kd-tree with random rotation [27]. We find the joint distribution P(e0, e1) is fairly symmetric (Fig. 3(c)): almost half of the results are improved by Leaf #1. The error is reduced to 7.80. This is perhaps the best outcome we can expect for any individual query after checking two leaves.

But the queries in ANNF tasks can be handled cooperatively. If a query has found a good match, it can provide information for the nearby queries. Suppose a query patch pA(x−1, y) has found a similar patch pB(x′−1, y′) (Fig. 4). We use it to improve the result of pA(x, y): we take the leaf that contains pB(x′, y′) as Leaf #1. This is an extension of the propagation step in PatchMatch: we propagate a group of candidates instead of a single one. These candidates appear similar but need not be spatially nearby (see the shaded patches in Fig. 4). We name this strategy propagation-assisted kd-tree search.

Figure 4. Propagation-assisted kd-tree. The symbols are analogous to those in Fig. 1. The dotted arrow indicates the propagated leaf. Given the matching result pA(x − 1, y) → pB(x′ − 1, y′), the candidates for pA(x, y) are all the patches in the leaf that contains the patch pB(x′, y′) (like the shaded ones).

In the very special case that pB(x′ − 1, y′) is the ground-truth best match of pA(x − 1, y), we plot the error distribution of pA(x, y) in Fig. 3(d). We observe that over half of the queries are improved by Leaf #1 (below the line e1 = e0). The error is significantly reduced, to 2.77. This test suggests that finding a very good result without backtracking is possible when the queries can propagate their information.

Though the propagation search using the ground-truth best match of pA(x − 1, y) to help the search of pA(x, y) is not practical, we can expect similar performance if the match of pA(x − 1, y) is good enough. Actually, we can find a good match of pA(x − 1, y) by another propagation search (propagated from pA(x − 2, y)), and so on. We describe the details in the next subsection. The distribution P(e0, e1) in Fig. 3(e) is generated by the Leaf #1 obtained in our algorithm. It is not much different from Fig. 3(d), and the error (3.04) is only slightly larger.

3.4. Propagation-Assisted KD-Tree Search

Next we describe our search algorithm, driven by the above analysis. Our algorithm scans the image A in raster order (from left to right, top to bottom). For a query patch pA(x, y) being scanned, we do the following 3 steps:

Step 1: Descend the tree to Leaf #0.

Step 2: Propagate a leaf from the left, i.e., along the x-axis (as in Fig. 4). Specifically, denote pB(x′ − 1, y′) as the result found by pA(x − 1, y). We pick out the patch pB(x′, y′) and retrieve the leaf containing it. This leaf can be retrieved without any traversal⁴. Similarly, we also propagate a leaf from the top, i.e., along the y-axis.

Step 3: Find the nearest neighbor of pA(x, y) in all the leaves obtained in Steps 1 & 2.

Our algorithm is non-iterative and finishes in one scan. It need not randomly sample candidates. The above algorithm alone performs quite well, but we can further improve its speed and accuracy by three more operations, namely enrichment, pruning, and re-ranking.
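The three steps can be sketched as a single raster-order pass. This is our own illustrative reconstruction, not the authors' code: the helpers (desc_A, descend, leaf_of, leaf_members) are hypothetical, image B is assumed to be the same size as A for the bounds checks, and the enrichment/pruning/re-ranking refinements described next are omitted.

```python
import numpy as np

def propagation_assisted_search(desc_A, leaf_of, leaf_members, descend, H, W):
    """One raster-order pass of Steps 1-3. Hypothetical helpers:
      desc_A[y, x]    : 24-d descriptor of query patch pA(x, y)
      descend(q)      : leaf id reached by greedily descending the kd-tree
      leaf_of[y, x]   : leaf id containing candidate patch pB(x, y)
      leaf_members[l] : list of (x, y, descriptor) candidates in leaf l
    """
    nnf = np.zeros((H, W, 2), dtype=int)   # nearest-neighbor field (x', y')
    for y in range(H):
        for x in range(W):
            q = desc_A[y, x]
            leaves = {descend(q)}                    # Step 1: own Leaf #0
            for dx, dy in ((-1, 0), (0, -1)):        # Step 2: propagate from
                px, py = x + dx, y + dy              # left and top neighbors
                if 0 <= px < W and 0 <= py < H:
                    mx, my = nnf[py, px]             # neighbor's match
                    gx, gy = mx - dx, my - dy        # shifted "guide" patch
                    if 0 <= gx < W and 0 <= gy < H:
                        leaves.add(leaf_of[gy, gx])  # leaf containing the guide
            best, best_d = (0, 0), np.inf            # Step 3: best candidate
            for l in leaves:                         # over collected leaves
                for cx, cy, cdesc in leaf_members[l]:
                    d2 = np.sum((q - cdesc) ** 2)
                    if d2 < best_d:
                        best_d, best = d2, (cx, cy)
            nnf[y, x] = best
    return nnf
```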

Enrichment. The kd-tree is a good structure for searching k nearest neighbors (k > 1). We exploit this property to further improve quality. We maintain the best k candidates for each query. Although this does not impact the outcome of the current query, the extra k − 1 candidates can propagate (at most) k − 1 leaves to the following query, enriching its candidate pool. The propagation is just like that in Step 2. In all the experiments we set k = 2. Combined with the "pruning" operation introduced below, the enrichment operation reduces the error by 15% with almost no extra time cost.
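Keeping the best k matches per query is a standard bounded-size heap. A minimal sketch (our own helper name, with k = 2 as in the paper; ties in distance fall back to comparing positions):

```python
import heapq

def update_topk(best, dist, pos, k=2):
    """Keep the k smallest-distance candidates seen so far for one query.
    `best` is a list used as a max-heap via negated distances."""
    heapq.heappush(best, (-dist, pos))
    if len(best) > k:
        heapq.heappop(best)   # evict the current worst candidate
    return best
```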

Pruning. For each query we obtain at most 2k propagated leaves (k from the left and k from the top) besides the query's own Leaf #0. To reduce time cost, we select only one leaf from these 2k leaves. This is achieved by first checking the patch which "guides" us to each leaf (e.g., pB(x′, y′) in Fig. 4). We have 2k such guide patches and compute their similarities to the query, approximately measuring how good the leaves may be. We keep only the single leaf with the best guide patch, and ignore the others. The plot in Fig. 3(e) is given by the Leaf #1 obtained in this way.

With this pruning technique our algorithm checks at most 2k + 2m candidates per query: 2k guide patches, and 2m patches from Leaf #0 and #1 (each leaf has m patches).
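The guide-patch test amounts to a single pass over the propagated leaves (hypothetical names; descriptors are the 24-d vectors):

```python
def prune_propagated_leaves(query, guides):
    """Select one propagated leaf via its guide patch.
    `guides`: list of (guide_descriptor, leaf_id) pairs, at most 2k entries."""
    best_leaf, best_d = None, float("inf")
    for gdesc, leaf_id in guides:
        d = sum((a - b) ** 2 for a, b in zip(query, gdesc))  # squared L2
        if d < best_d:
            best_d, best_leaf = d, leaf_id
    return best_leaf
```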

Candidate checking and re-ranking. Given the candidates in Leaf #0 and #1, the simplest way to find an ANN of a query is to compute its L2 distance to each candidate using the 24-d representation. Although this strategy may lose accuracy due to dimensionality reduction, in experiments we find that its time-error tradeoff is satisfactory.

We can obtain a better solution by computing the L2 distance in the original 3p²-dimensional space, but this is much slower. A compromise is to find K nearest neighbors using the 24-d representation, and re-rank these K candidates in the original space. Re-ranking not only improves the result of the current query, but also impacts the propagation quality. We set K = 2 in this paper⁵. We compare both strategies (without and with re-ranking) in experiments.

⁴We maintain a leaf pointer for each candidate pB when building the tree.

3.5. Complexity

Given an N-pixel image, the tree-building time is O(dN log N), where d = 24 is the dimensionality. The factor d arises because all dimensions of all data in each node are scanned to find the maximum spread. To reduce time cost, we sample only 1/d of the data in the spread computation (in other operations we still use all data). This simplification rarely changes the first several branches, and has negligible influence on quality in experiments. The tree-building time is thus reduced to O(N log N).

Suppose both images are of N pixels. The search time is O(N log N) + O(Nmd), where the first term is for descending the tree and the second term is for candidate checking. In practice the linear part O(Nmd) is dominant, e.g., when N = 1Mp (log N = 20), m = 8, and d = 24. Table 1 shows the typical running time of each stage.

  image size   WHT      tree building   search                 total
  N            O(Nd)    O(N log N)      O(N log N) + O(Nmd)
  0.5Mp        0.08s    0.2s            0.32s                  0.6s
  2Mp          0.3s     0.9s            1.5s                   2.7s

Table 1. Typical running time of each stage (m = 8, no re-ranking).
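A back-of-envelope check of the claim that the O(Nmd) candidate-checking term dominates, using the example values above (our own operation counts, ignoring constant factors):

```python
import math

N = 2 ** 20           # 1Mp image
m, d = 8, 24          # leaf size and descriptor dimensionality
descend_ops = N * math.log2(N)   # tree descent: ~ log N comparisons per query
check_ops = N * m * d            # candidate checking: ~ m*d operations per query
print(check_ops / descend_ops)   # ratio = 192 / 20 = 9.6
```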

The memory complexity of our method is about O(Nd), mainly for storing the WHT coefficients. In practice, it takes around 100MB of memory when N = 1Mp.

4. Experiments

We compare our algorithm with PatchMatch (PM) and CSH. Our algorithm is implemented in C++. The PM and CSH codes are from the authors' websites [25, 9], both essentially in C++. All algorithms are run on a PC with an Intel Core i7 3.0GHz CPU and 8GB RAM. Here we compare single-core implementations⁶. We experiment on the public data set VidPairs [9]. This is a challenging data set where each pair of images has very large displacement.

Time-accuracy tradeoffs. In Fig. 5 we show the time-accuracy curves using 8-by-8 patches and 2Mp images. The accuracy is described by the average L2 distance between each query patch and its nearest neighbor found by an algorithm, using the original RGB colors of the p-by-p pixels. The running time is averaged over all 133 image pairs in the data set. We test our method using m ∈ {8, 16, 32, 64}, both without and with re-ranking. All pre-computation time (WHT and tree building) is included in all reports.

⁵We have tested K > 2 in re-ranking (and also k > 2 in enrichment) and found that the gain is not much given the extra running time.
⁶Our method can be parallelized just like PM. In dual/quad-core configurations we observe similar speedup versus PM (parallel CSH has not been provided in [9]).

In Fig. 5 it is clear that our algorithm significantly outperforms PM and CSH. At the same accuracy, our fastest setting (no re-ranking, m = 8) is 18 times faster than PM and 3 times faster than CSH; a more precise setting (re-ranking, m = 8) is 4-5 times faster than CSH at similar accuracy. Given the same running time, our method reduces the error (i.e., the discrepancy from the ground truth) of PM by 70% and the error of CSH by 50% in most settings (e.g., on our "re-ranking" curve). Our algorithm manages to achieve very good accuracy (e.g., re-ranking, m = 64) that is not attained by PM/CSH even after 30 iterations.

When m = 8 our method checks about 20 candidates per query. In comparison, this number is ∼60 in PM and 50 in CSH after 5 iterations (their default). Our method is more accurate even though it checks far fewer candidates. This is because the accuracy of PM/CSH can only be guaranteed given a sufficiently large number of random samples.

It is worth noticing that our better time-accuracy tradeoff versus PM is not purely due to the shorter WHT representation. Though PM uses the original representation, the average effective dimensionality checked is only 20-40% of 3p² due to "early stop" [4] (stopping when the partial sum exceeds the current best sum), while the WHT hardly benefits from early stop due to its compact energy. Moreover, though the shorter WHT representation leads to faster candidate checking, it also impacts quality. In experiments we find PM+WHT generally has worse time-accuracy tradeoffs than PM alone.

We also compare our method with PM/CSH using various image sizes and patch sizes, as shown in Fig. 6. Similar comparisons are observed: our method is about 10-20 times faster than PM and 2-5 times faster than CSH at the same accuracy.

Comparisons with traditional kd-trees. We also compare with the traditional kd-tree method (no propagation) in Fig. 5. We fix the parameter m = 8 when building this tree. The marker "⊗" (leftmost on the curve) shows the performance without any backtracking: it is better than the first iteration of PM/CSH, even though PM/CSH have already exploited propagation. This means that a kd-tree with WHT representations is a good way to organize the candidates. Although CSH also groups the candidates before searching, its binning structure is in general unbalanced and less data-adaptive. On the contrary, a kd-tree is fully balanced and well adapted to the data.


Figure 5. Time-accuracy tradeoffs averaged over 133 image pairs (2Mp images, 8-by-8 patches; axes: running time in seconds vs. average L2 distance). Each marker on PM/CSH's curve represents the performance after each iteration. Each marker on our method's curve represents its performance at each m value (m = 8, 16, 32, 64, from faster to slower). Each marker "×" on the traditional kd-tree's curve represents its performance at each C value, where C = 16, 32, 64, 128, 256 is the maximum number of candidates visited per query. The marker "⊗" represents the traditional kd-tree without backtracking. The dashed line is the ground-truth average L2 distance.

The markers "×" in Fig. 5 show the performance of a traditional kd-tree with different strengths of backtracking. We use the standard backtracking implementation of the ANN library [1]. We set the "error bound" ε = 3 and vary the maximum number C of candidates visited by any query (C=16 to 256). Fig. 5 shows that its performance is slightly better than PM. This is in contrast to the result reported in [4], where the kd-tree plus PCA is found inferior. Our result is reasonable: the traditional kd-tree exploits the candidate distribution and PM exploits the query dependency, and these two aspects can contribute equally. All patches in the candidate/query set come from one image or the other, so the data relations inside each set should have similar strength. Thus a traditional kd-tree can indeed be comparable with PM.
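For concreteness, a kd-tree query with a bounded candidate budget C can be sketched as a best-bin-first search that pops tree nodes in order of their distance lower bound and stops after C nodes have been evaluated. This is our own illustrative Python/NumPy sketch (the names `KDNode`, `build`, and `query_bounded` are ours), not the ANN library's actual implementation:

```python
import heapq
import numpy as np

class KDNode:
    """One kd-tree node: a point, its split axis, and two children."""
    def __init__(self, point, axis, left, right):
        self.point, self.axis, self.left, self.right = point, axis, left, right

def build(points, depth=0):
    """Build a balanced kd-tree by median split, cycling through axes."""
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]
    points = points[points[:, axis].argsort()]
    mid = len(points) // 2
    return KDNode(points[mid], axis,
                  build(points[:mid], depth + 1),
                  build(points[mid + 1:], depth + 1))

def query_bounded(root, q, C=16):
    """Best-bin-first search: evaluate at most C nodes, nearest bins first."""
    best, best_d = None, np.inf
    counter = 1                      # tiebreaker so the heap never compares nodes
    heap = [(0.0, 0, root)]          # (lower bound on squared distance, tiebreak, node)
    visited = 0
    while heap and visited < C:
        bound, _, node = heapq.heappop(heap)
        if node is None or bound >= best_d:
            continue                 # pruned bins do not consume the budget
        d = np.sum((node.point - q) ** 2)
        visited += 1
        if d < best_d:
            best, best_d = node.point, d
        diff = q[node.axis] - node.point[node.axis]
        near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
        heapq.heappush(heap, (bound, counter, near)); counter += 1
        # diff**2 is a valid lower bound on the squared distance to the far subtree
        heapq.heappush(heap, (diff * diff, counter, far)); counter += 1
    return best, best_d
```

With C at least the number of candidates the search degenerates to an exact nearest-neighbor query; smaller C trades accuracy for speed, which is the tradeoff the "×" markers in Fig. 5 sweep.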

ANN fields and error maps In Fig. 7 we demonstrate the

ANNFs and the error maps. The ANNF is given by a map

of the coordinates of the matched patches. In Fig. 7 we find

that the ANNF of PM is much smoother than the ground

truth. Our method and CSH overcome this problem because both methods allow non-local propagation. But our ANNF appears much more similar to the ground truth than CSH's (e.g., see the zoomed-in regions).

On the error maps, we compute the error e of a query

patch pA by:

e(pA) = ‖pA − pB‖2 − ‖pA − p̂B‖2, (2)

where pB is the matched patch given by the algorithm under evaluation, p̂B is the ground-truth match, and ‖ · ‖2 is computed in the

original RGB representation. The error map of the ground

truth matching is an all-zero map. Fig. 7 shows our matched

patches have smaller error in general, typically on the edges

and texture regions.
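As a concrete illustration, the error map of Eq. (2) could be computed as follows for a grayscale image pair. This is our own hedged sketch: `nnf` and `nnf_gt` are assumed to be (h, w, 2) arrays holding the (y, x) coordinates in B of each algorithm match and ground-truth match, respectively.

```python
import numpy as np

def error_map(A, B, nnf, nnf_gt, p=8):
    """Per-patch error e(pA) = ||pA - pB||_2 - ||pA - p_hat_B||_2 (Eq. 2)."""
    h = A.shape[0] - p + 1          # number of patch positions vertically
    w = A.shape[1] - p + 1          # and horizontally
    e = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            pa = A[y:y + p, x:x + p]
            by, bx = nnf[y, x]      # match returned by the algorithm
            gy, gx = nnf_gt[y, x]   # ground-truth nearest neighbor
            d_alg = np.linalg.norm(pa - B[by:by + p, bx:bx + p])
            d_gt = np.linalg.norm(pa - B[gy:gy + p, gx:gx + p])
            e[y, x] = d_alg - d_gt
    return e
```

If `nnf_gt` truly holds the nearest neighbors, every entry of `e` is non-negative, and the map is all zero exactly when the algorithm matches the ground truth everywhere.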

[Figure 6 plots: three panels of average L2 distance (y-axis) vs. running time in seconds (x-axis) for PM, CSH, Ours (no re-ranking), Ours (re-ranking), and the ground truth, in the settings 0.4Mp/8x8, 2Mp/12x12, and 0.4Mp/4x4; CSH is marked N/A in one panel.]

Figure 6. Comparison with PM/CSH under different settings. CSH is not available in the third plot because it does not support patch sizes other than 2^n x 2^n.

Reconstruction One can reconstruct the image A by the

matched patches found in image B. This is a main step in

image inpainting, retargeting, and reshuffling [31, 28]. Any

pixel in the reconstructed image is “voted” by all the patches covering this pixel.

Figure 7. ANN fields and the error maps. 1st row: inputs (images A and B). 2nd-4th rows: the ANN fields (left) and the error maps (right) of PM, CSH, and our method. 5th row: ground-truth ANNF (left) and zoom-ins of all the ANNFs. The ANNFs show only the x-coordinates, visualized in hue. In the error maps, a darker pixel means a larger error (a GT error map is all white). Here we use 0.4Mp images and 8x8 patches. PM and CSH are run for 5 iterations, and ours uses m=64 with re-ranking (running time: PM 2.3s, CSH 1.4s, ours 1.2s). The images are from the VidPairs set.

In Fig. 8 we show the reconstruction

results. The “GT” reconstruction is the result voted by the

ground-truth matched patches. Our results are visually better than PM/CSH in various cases such as thin structures (Fig. 8 top), textures (Fig. 8 middle), edges (Fig. 8 bottom), and faces (Fig. 8 bottom; note the eyes and mouth). Our results are visually comparable to the ground truth.
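The voting step described above can be sketched as a simple average over the overlapping matched patches. This is a minimal grayscale illustration under our own assumptions: `nnf[y, x]` holds the (y, x) coordinate in B of the patch matched to position (y, x) in A, and every pixel averages the contributions of all p-by-p patches covering it.

```python
import numpy as np

def reconstruct(B, nnf, p=8):
    """Rebuild A by averaging ("voting") the matched patches from B."""
    h, w = nnf.shape[:2]                     # grid of patch positions in A
    out = np.zeros((h + p - 1, w + p - 1))   # reconstructed image accumulator
    cnt = np.zeros_like(out)                 # number of votes per pixel
    for y in range(h):
        for x in range(w):
            by, bx = nnf[y, x]
            out[y:y + p, x:x + p] += B[by:by + p, bx:bx + p]
            cnt[y:y + p, x:x + p] += 1
    return out / cnt
```

As a sanity check, an identity field (each patch matched to its own location in B) reconstructs B exactly, since every pixel then averages identical values.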

When the image A is iteratively reconstructed using the

same image B (as in [31, 28]), we only need to build the kd-

tree using the patches in B once. Thus our speedups versus

PM/CSH can be even greater in these applications.

Scalability Fig. 9 shows our running time at 10 different image sizes. The time is almost linear in the image size, since the candidate-checking time is dominant.

5. Discussions and Future Work

We have shown our method is faster and more accurate

than existing methods including PatchMatch. But it is worth


Figure 8. Visual comparisons of the reconstructed images. On the

top of each group are the images A and B. The images are 0.4Mp

and the patches are 8x8. PM and CSH are run for 1 iteration, and our method uses m=8 without re-ranking. Thus the running time of

all methods is nearly the same (∼0.5s). The images are from the

VidPairs set. This figure is best viewed in the electronic version.

Page 8: Computing Nearest-Neighbor Fields via Propagation-Assisted ...kaiminghe.com/publications/cvpr12nnf.pdf · 3.2. Building a KDTree After computingthe WHT, we builda traditionalkd-tree

[Figure 9 plot: running time in seconds (y-axis, 0–3) vs. image size in Mp (x-axis, 0–2).]

Figure 9. Scalability of our method (m=8, no re-ranking).

mentioning that PatchMatch has other advantages. First,

PatchMatch has been generalized to search across scales

and rotations [5] or color transformations [18]. Second, the

similarity function (L2) in PatchMatch can be extended to

special forms other than Lp norms, such as those tailored

for image matting [19] or stereo vision [7]. In addition, PatchMatch is more memory-efficient. In the future we will improve our method in these directions.

The key idea of our method is that querying can be performed collaboratively in traditional ANN methods if the queries are dependent. This motivates us to study other similar scenarios. For example, in object categorization [14], overlapping patches are quantized by searching for nearest neighbors in a codebook. This search may be accelerated by propagation. We will study this topic in the future.
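To make the idea concrete, here is a toy sketch (our own illustration, not an algorithm from [14]) of propagation-assisted codebook quantization: each patch on the image grid checks only the codewords propagated from its left and top neighbors plus one random codeword, instead of scanning the whole codebook.

```python
import numpy as np

def propagation_quantize(patches, codebook):
    """Assign each patch on an h-by-w grid a codeword index, checking only
    candidates propagated from the left/top neighbors plus one random draw."""
    rng = np.random.default_rng(0)
    h, w, d = patches.shape
    k = len(codebook)
    idx = np.zeros((h, w), dtype=int)
    for y in range(h):
        for x in range(w):
            cands = {int(rng.integers(k))}       # one random candidate
            if x > 0:
                cands.add(int(idx[y, x - 1]))    # propagate from the left
            if y > 0:
                cands.add(int(idx[y - 1, x]))    # propagate from above
            idx[y, x] = min(
                cands,
                key=lambda c: np.sum((patches[y, x] - codebook[c]) ** 2))
    return idx
```

By construction the assignment at each position is at least as good as the neighbors' codewords, so coherent regions converge to shared codewords at a cost of O(1) candidates per patch rather than O(k).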

References

[1] ANN library. www.cs.umd.edu/~mount/ANN/.

[2] S. Arya and D. M. Mount. Algorithms for fast vector quantization. In Proc. DCC '93: Data Compression Conf., pages 381-390, 1993.

[3] S. Arya, D. M. Mount, N. S. Netanyahu, R. Silverman, and A. Y. Wu. An optimal algorithm for approximate nearest neighbor searching in fixed dimensions. J. ACM, 1998.

[4] C. Barnes, E. Shechtman, A. Finkelstein, and D. B. Goldman. PatchMatch: a randomized correspondence algorithm for structural image editing. In SIGGRAPH, 2009.

[5] C. Barnes, E. Shechtman, D. B. Goldman, and A. Finkelstein. The generalized PatchMatch correspondence algorithm. In ECCV, pages 29-43, 2010.

[6] G. Ben-Artzi, H. Hel-Or, and Y. Hel-Or. The gray-code filter kernels. TPAMI, pages 382-393, 2007.

[7] M. Bleyer, C. Rhemann, and C. Rother. PatchMatch stereo - stereo matching with slanted support windows. In BMVC, 2011.

[8] A. Buades, B. Coll, and J.-M. Morel. A non-local algorithm for image denoising. In CVPR, pages 60-65, 2005.

[9] CSH website. www.eng.tau.ac.il/~simonk/CSH/.

[10] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. TIP, pages 2080-2095, 2007.

[11] S. Dasgupta and Y. Freund. Random projection trees and low dimensional manifolds. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing, 2008.

[12] M. Datar, N. Immorlica, P. Indyk, and V. S. Mirrokni. Locality-sensitive hashing scheme based on p-stable distributions. In Proceedings of the Twentieth Annual Symposium on Computational Geometry, pages 253-262, 2004.

[13] A. A. Efros and T. K. Leung. Texture synthesis by non-parametric sampling. In ICCV, page 1033, 1999.

[14] L. Fei-Fei and P. Perona. A Bayesian hierarchical model for learning natural scene categories. In CVPR, pages 524-531, 2005.

[15] J. H. Friedman, J. L. Bentley, and R. A. Finkel. An algorithm for finding best matches in logarithmic expected time. ACM Trans. Math. Softw., pages 209-226, 1977.

[16] K. Fukunaga and P. M. Narendra. A branch and bound algorithm for computing k-nearest neighbors. IEEE Trans. Comput., pages 750-753, 1975.

[17] D. Glasner, S. Bagon, and M. Irani. Super-resolution from a single image. In ICCV, 2009.

[18] Y. HaCohen, E. Shechtman, D. B. Goldman, and D. Lischinski. Non-rigid dense correspondence with applications for image enhancement. In SIGGRAPH, 2011.

[19] K. He, C. Rhemann, C. Rother, X. Tang, and J. Sun. A global sampling method for alpha matting. In CVPR, pages 2049-2056, 2011.

[20] Y. Hel-Or and H. Hel-Or. Real time pattern matching using projection kernels. In ICCV, pages 1430-1445, 2003.

[21] A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H. Salesin. Image analogies. In SIGGRAPH, 2001.

[22] S. Korman and S. Avidan. Coherency sensitive hashing. In ICCV, 2011.

[23] N. Kumar, L. Zhang, and S. Nayar. What is a good nearest neighbors algorithm for finding similar patches in images? In ECCV, pages 364-378, 2008.

[24] M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISAPP International Conference on Computer Vision Theory and Applications, 2009.

[25] PatchMatch website. http://gfx.cs.princeton.edu/pubs/Barnes_2009_PAR/.

[26] E. Shechtman and M. Irani. Matching local self-similarities across images and videos. In CVPR, pages 1-8, 2007.

[27] C. Silpa-Anan and R. Hartley. Optimised kd-trees for fast image descriptor matching. In CVPR, pages 1-8, 2008.

[28] D. Simakov, Y. Caspi, E. Shechtman, and M. Irani. Summarizing visual data using bidirectional similarity. In CVPR, pages 1-8, 2008.

[29] A. Vedaldi and B. Fulkerson. VLFeat: an open and portable library of computer vision algorithms. In ACM Multimedia '10, 2010.

[30] L.-Y. Wei and M. Levoy. Fast texture synthesis using tree-structured vector quantization. In SIGGRAPH, 2000.

[31] Y. Wexler, E. Shechtman, and M. Irani. Space-time video completion. In CVPR, pages 120-127, 2004.

