ROBUST COLOR EDGE DETECTION THROUGH TENSOR VOTING
Rodrigo Moreno1, Miguel Angel Garcia2, Domenec Puig1, Carme Julia1∗
1 Rovira i Virgili University, Intelligent Robotics and Computer Vision Group, Dept. of Computer Science and Mathematics, Av. Països Catalans 26, 43007 Tarragona, Spain
2 Autonomous University of Madrid, Dept. of Informatics Engineering, Cra. Colmenar Viejo Km 15, 28049 Madrid, Spain
ABSTRACT
This paper presents a new method for color edge detection
based on the tensor voting framework, a robust perceptual
grouping technique used to extract salient information from
noisy data. The tensor voting framework is adapted to en-
code color information via tensors in order to propagate them
into a neighborhood through a voting process specifically de-
signed for color edge detection by taking into account percep-
tual color differences, region uniformity and edginess accord-
ing to a set of intuitive perceptual criteria. Perceptual color
differences are estimated by means of an optimized version
of the CIEDE2000 formula, while uniformity and edginess
are estimated by means of saliency maps obtained from the
tensor voting process. Experiments show that the proposed
algorithm is more robust and has a similar performance in
precision when compared with the state-of-the-art.
Index Terms— Image edge analysis, tensor voting,
CIELAB, CIEDE2000.
1. INTRODUCTION
The performance of many computer vision applications di-
rectly depends on the effectiveness of a previous edge de-
tection process. The final goal of edge detection is to find
“meaningful discontinuities” in a digital image. Although
many edge detectors have been proven effective (e.g. [1], [2]),
their performance decreases for noisy images.
This paper proposes a new edge detector that has a sim-
ilar performance to the state-of-the-art methods for noiseless
images and, in addition, a better one for noisy images. The
proposed detector is based on an adaptation to edge detection
(Section 2) of the tensor voting framework (TVF) [3]. First,
an encoding process specifically designed to encode color,
uniformity and edginess into tensors is introduced (Section
2.1). Second, a voting process specifically tailored to the edge
∗This research has been partially supported by the Spanish Ministry of
Science and Technology under project DPI2007-66556-C03-03, by the Com-
missioner for Universities and Research of the Department of Innovation,
Universities and Companies of Catalonia’s Government and by the European
Social Fund.
detection problem is also presented (Sections 2.2 and 2.3). Al-
though every color channel is processed independently, possi-
ble correlations between channels are also taken into account
by the proposed method. A comparison of the proposed de-
tector with state-of-the-art methods is shown in Section 3.
2. TENSOR VOTING FRAMEWORK FOR COLOR EDGE DETECTION
The input of the proposed method is the set of pixels of a color
image. Thus, positional and color information is available for
every input pixel. Positional information is used to determine
the neighborhood of every pixel, while color information is
used to define the tensors in the encoding step. The next sub-
sections describe the details of the proposed edge detector.
2.1. Encoding of Color Information
Before applying the proposed method, color is converted to
the CIELAB space. Every CIELAB channel is then normal-
ized in the range [0, π/2]. In the first step of the method, the
information of every pixel is encoded through three second or-
der 2D tensors, one for each normalized CIELAB color chan-
nel. These tensors are represented by 2×2 symmetric positive
semidefinite matrices that can be graphically represented by
2D ellipses. There are two extreme cases for the proposed
tensors: stick tensors, which are stick-shaped ellipses with a
single eigenvalue, λ1, different from zero, and ball tensors,
which are circumference-shaped ellipses whose λ1 and λ2
eigenvalues are equal to each other. Three perceptual mea-
sures are encoded in the tensors associated with every input
pixel, namely: the most likely normalized noiseless color at
the pixel (in the specific channel), a metric of local unifor-
mity (how edgeless its neighborhood is), and an estimation
of edginess (how likely finding edges or texture at the pixel’s
location is). The most likely normalized noiseless color is en-
coded by the angle α between the x axis, which represents the
lowest possible color value in the corresponding channel, and
the eigenvector corresponding to the largest eigenvalue. For
example, in channel L, a tensor with α = 0 encodes black,
whereas a tensor with α = π/2 encodes white. In addition,
local uniformity and edginess are encoded by means of the
978-1-4244-5654-3/09/$26.00 ©2009 IEEE, ICIP 2009
Fig. 1. Encoding process for channel L. Color, uniformity and edginess are encoded by means of α and the normalized saliencies s1 = (λ1 − λ2)/λ1 and s2 = λ2/λ1, respectively.
normalized saliencies s1 = (λ1 − λ2)/λ1 and s2 = λ2/λ1,
respectively. Thus, a pixel located at a completely uniform
region is represented by means of three stick tensors, one for
each color channel. In contrast, a pixel located at an ideal
edge is represented by means of three ball tensors, one for
every color channel. Figure 1 shows the graphical interpreta-
tion of a tensor for channel L.
Before applying the voting process, it is necessary to ini-
tialize the tensors associated with every pixel. The most
likely noiseless colors can be initialized with the colors of
the input pixels encoded by means of the angle α between
the x axis and the principal eigenvector, as described be-
fore. However, since metrics of uniformity and edginess
are usually unavailable at the beginning of the voting pro-
cess, normalized saliency s1 is initialized to one and nor-
malized saliency s2 is initialized to zero. These initializa-
tions allow the method to estimate more appropriate values
of the normalized saliencies for the next stages, as described
in the next subsection. Hence, the initial color information
of a pixel is encoded through three stick tensors oriented
along the directions that represent that color in the normalized CIELAB channels: Tc(p) = t̂c(p) t̂c(p)^T, where Tc(p) is the tensor of the c-th color channel (L, a and b) at pixel p, t̂c(p) = [cos(Cc(p)), sin(Cc(p))]^T, and Cc(p) is the normalized value of the c-th color channel at p.
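As an illustration, the encoding of a single channel value into a stick tensor can be sketched as follows (a minimal NumPy sketch; the function name and the explicit [lo, hi] channel range are our assumptions, not from the paper):

```python
import numpy as np

def encode_channel(value, lo, hi):
    """Encode a CIELAB channel value as a 2D stick tensor.

    The value is first normalized to an angle alpha in [0, pi/2]
    (0 = lowest channel value, pi/2 = highest), then tensorized as
    T = t t^T with t = [cos(alpha), sin(alpha)]^T.
    """
    alpha = (value - lo) / (hi - lo) * (np.pi / 2.0)
    t = np.array([np.cos(alpha), np.sin(alpha)])
    return np.outer(t, t)  # 2x2 symmetric positive semidefinite

# Example for channel L, assuming it ranges over [0, 100] in CIELAB:
T_black = encode_channel(0.0, 0.0, 100.0)    # stick tensor along the x axis
T_white = encode_channel(100.0, 0.0, 100.0)  # stick tensor along the y axis
```

Both initial tensors are stick tensors (λ1 = 1, λ2 = 0), which is consistent with the initialization s1 = 1 and s2 = 0 described above.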
2.2. Voting Process
The voting process requires three measurements for every pair
of pixels p and q: the perceptual color difference, ΔEpq; the
joint uniformity measurement, Uc(p, q), used to determine if
both pixels belong to the same region; and the likelihood of a
pixel being impulse noise, ηc(p). ΔEpq is calculated through
CIEDE2000 [4], while Uc(p, q) = s1c(p) s1c(q), and ηc(p) =
s2c(p) − μŝ2c(p) if p is located at a local maximum and zero
otherwise, where μŝ2c(p) represents the mean of s2c over the
neighborhood of p.
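For concreteness, the saliencies and the joint uniformity measurement can be computed from the tensors as follows (a NumPy sketch; function names are ours):

```python
import numpy as np

def saliencies(T):
    """Normalized saliencies of a 2x2 second-order tensor.

    s1 = (lambda1 - lambda2) / lambda1 measures local uniformity;
    s2 = lambda2 / lambda1 measures edginess (lambda1 >= lambda2 >= 0).
    """
    l2, l1 = np.linalg.eigvalsh(T)  # eigenvalues in ascending order
    return (l1 - l2) / l1, l2 / l1

def joint_uniformity(Tp, Tq):
    """U_c(p, q) = s1_c(p) * s1_c(q): are p and q in the same region?"""
    return saliencies(Tp)[0] * saliencies(Tq)[0]

# A stick tensor is perfectly uniform; a ball tensor is an ideal edge.
stick = np.array([[1.0, 0.0], [0.0, 0.0]])
ball = np.eye(2)
```

With these definitions, a stick tensor yields (s1, s2) = (1, 0) and a ball tensor yields (0, 1), matching the two extreme cases of Section 2.1.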
In the second step of the method, the tensors associated
with every pixel are propagated to their neighbors through
a convolution-like process. This step is independently ap-
plied to the tensors of every channel (L, a and b). The voting
process is carried out by means of specially designed tenso-
rial functions referred to as propagation functions, which take
into account not only the information encoded in the tensors
but also the local relations between neighbors. Two propagation functions are proposed for edge detection: a stick and a ball propagation function. The stick propagation function is used to propagate the most likely noiseless color of a
pixel, while the ball propagation function is used to increase
edginess where required. The application of the first func-
tion leads to stick votes, while the application of the second
function produces ball votes. Stick votes are used to elimi-
nate noise and increase the edginess where the color of the
voter and the voted pixels are different. Ball votes are used
to increase the relevance of the most important edges. The
voting process described in [3] cannot directly be applied to
edge detection, since a pixel cannot appropriately propagate
its information to its neighbors without taking into account
the local relations between that pixel and its neighbors.
A stick vote can be seen as a stick-shaped tensor, STc(p), with a strength modulated by three scalar factors. The pro-
posed stick propagation function, Sc(p, q), which allows a
pixel p to cast a stick vote to a neighboring pixel q for channel
c is given by:
Sc(p, q) = GS(p, q) η̄c(p) SV′c(p, q) STc(p), (1)
with STc(p), GS(p, q), η̄c(p) and SV′c(p, q) being defined
as follows. First, the tensor STc(p) encodes the most likely
normalized noiseless color at p. Thus, STc(p) is defined as
the tensorized eigenvector corresponding to the largest eigenvalue of the voter pixel, that is, STc(p) = ê1c(p) ê1c(p)^T, where ê1c(p) is the eigenvector with the largest eigenvalue of
the tensor associated with channel c at p. Second, the three
scalar factors in (1), each ranging between zero and one, are
defined as follows. The first factor, GS(p, q), models the in-
fluence of the distance between p and q in the vote strength.
Thus, GS(p, q) = Gσs(||p − q||), where Gσs(·) is a de-
caying Gaussian function with zero mean and a user-defined
standard deviation σs. The second factor, η̄c(p), defined as η̄c(p) = 1 − ηc(p), is introduced in order to prevent a pixel
p previously classified as impulse noise from propagating its
information. The third factor, SV ′c, takes into account the in-
fluence of the perceptual color difference, the uniformity and
the noisiness of the voted pixel. This factor is given by:
SV′c(p, q) = η̄c(q) SVc(p, q) + ηc(q), (2)
where SVc(p, q) = [Gσd(ΔEpq) + Uc(p, q)]/2 and η̄c(q) = 1 − ηc(q). SVc(p, q) allows a pixel p to cast a stronger stick vote to q either if both pixels belong to the same uniform re-
gion, or if the perceptual color difference between them is
small. That behavior is achieved by means of the factors
Uc(p, q) and the decaying Gaussian function on ΔEpq with
a user-defined standard deviation σd. A normalizing factor of two is used in order to make SVc(p, q) vary from zero to
one. The term ηc(q) in (2) makes noisy voted pixels, q, adopt the color of their voting neighbors, p, disregarding local uniformity measurements and perceptual color differences between p and q. The term η̄c(q) in (2) makes SV′c vary from zero to one. The effect of η̄c(q) and ηc(q) on the strength of the stick vote received at a noiseless pixel q is null.
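The stick propagation function (1)-(2) can be sketched as follows, assuming ΔEpq, Uc(p, q) and the η values have already been computed (a NumPy sketch; the function names and signatures are ours):

```python
import numpy as np

def gaussian(x, sigma):
    """Decaying Gaussian with zero mean, used for GS and G_sigma_d."""
    return np.exp(-x ** 2 / (2.0 * sigma ** 2))

def stick_vote(p, q, Tp, dE, U, eta_p, eta_q, sigma_s, sigma_d):
    """Stick vote cast by pixel p on pixel q for one channel.

    p, q: pixel coordinates; Tp: voter's tensor; dE: CIEDE2000
    difference; U: joint uniformity U_c(p, q); eta_p, eta_q:
    impulse-noise likelihoods of the voter and voted pixel.
    """
    # ST_c(p): tensorized principal eigenvector of the voter.
    w, v = np.linalg.eigh(Tp)
    e1 = v[:, -1]                  # eigenvector of the largest eigenvalue
    ST = np.outer(e1, e1)
    GS = gaussian(np.linalg.norm(np.asarray(p) - np.asarray(q)), sigma_s)
    SV = (gaussian(dE, sigma_d) + U) / 2.0   # same region or similar color
    SVp = (1.0 - eta_q) * SV + eta_q         # noisy voted pixels adopt color
    return GS * (1.0 - eta_p) * SVp * ST

# Two noiseless pixels with identical color in a uniform region:
vote = stick_vote((0, 0), (1, 0), np.array([[1.0, 0.0], [0.0, 0.0]]),
                  dE=0.0, U=1.0, eta_p=0.0, eta_q=0.0,
                  sigma_s=1.3, sigma_d=2.5)
```

In this case the vote is the voter's own stick tensor attenuated only by the spatial Gaussian, as expected for neighbors that fully agree.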
In turn, a ball vote can be seen as a ball-shaped tensor, BT(p), with a strength controlled by the scalar factors GS(p, q), η̄c(p) and BVc(p, q), each varying between zero
and one. The ball propagation function, Bc(p, q), which al-
lows a pixel p to cast a ball vote to a neighboring pixel q for
channel c is given by:
Bc(p, q) = GS(p, q) η̄c(p) BVc(p, q) BT(p), (3)
with BT(p), GS(p, q), η̄c(p) and BVc(p, q) being defined as
follows. First, the ball tensor, represented by the identity ma-
trix, I, is the only possible tensor for BT(p), since it is the
only tensor that complies with the two main design restric-
tions: a ball vote must be equivalent to casting stick votes for
all possible colors using the hypothesis that all of them are
equally likely, and the normalized s1 saliency must be zero when only ball votes are received at a pixel. Second, GS(p, q) and η̄c(p) are the same as the factors introduced in (1) for the
stick propagation function. They are included for similar rea-
sons to those given in the definition of the stick propagation
function. Finally, the scalar factor BV c(p, q) is given by:
BVc(p, q) = [Ḡσd(ΔEpq) + Ūc(p, q) + Ḡσd(ΔEcpq)] / 3, (4)
where Ḡσd(·) = 1 − Gσd(·) and Ūc(p, q) = 1 − Uc(p, q).
BVc(p, q) models the fact that a pixel p must reinforce the
edginess at the voted pixel q either if there is a big percep-
tual color difference between p and q, or if p and q are not
in a uniform region. This behavior is modeled by means of
Ḡσd(ΔEpq) and Ūc(p, q). The additional term Ḡσd(ΔEcpq) is
introduced in order to increase the edginess of pixels in which
the only noisy channel is c, where ΔEcpq denotes the per-
ceptual color difference only measured in the specific color
channel c. The normalizing factor of three in (4) allows the
ball propagation function to cast ball votes with a strength
between zero and one.
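The ball propagation function (3)-(4) can likewise be sketched as follows (a NumPy sketch; names and signatures are ours, and the per-channel difference dE_c is assumed precomputed):

```python
import numpy as np

def gaussian(x, sigma):
    """Decaying Gaussian with zero mean."""
    return np.exp(-x ** 2 / (2.0 * sigma ** 2))

def ball_vote(p, q, dE, dE_c, U, eta_p, sigma_s, sigma_d):
    """Ball vote cast by pixel p on pixel q for one channel.

    dE: full CIEDE2000 difference; dE_c: difference measured in
    channel c alone; U: joint uniformity; eta_p: impulse-noise
    likelihood of the voter p.
    """
    GS = gaussian(np.linalg.norm(np.asarray(p) - np.asarray(q)), sigma_s)
    # Complemented terms grow with color difference and non-uniformity.
    BV = ((1.0 - gaussian(dE, sigma_d)) + (1.0 - U)
          + (1.0 - gaussian(dE_c, sigma_d))) / 3.0
    return GS * (1.0 - eta_p) * BV * np.eye(2)  # BT(p) is the identity

# Identical colors in a fully uniform region produce a null ball vote:
v0 = ball_vote((0, 0), (1, 0), dE=0.0, dE_c=0.0, U=1.0, eta_p=0.0,
               sigma_s=1.3, sigma_d=2.5)
```

This reflects the design intent: edginess is only reinforced where colors differ or uniformity breaks down.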
The proposed voting process at every pixel is carried out
by adding all the tensors propagated towards it from its neigh-
bors by applying the above propagation functions. Thus,
the total vote received at a pixel q for each color channel
c, TVc(q), is given by: TVc(q) = ∑p∈neigh(q) [Sc(p, q) + Bc(p, q)]. The voting process is applied twice. The first ap-
plication is used to obtain an initial estimation of the normal-
ized s1 and s2 saliencies, as they are necessary to calculate
Uc(p, q) and ηc(p). For this first estimation, only perceptual
color differences and spatial distances are taken into account.
At the second application, the tensors at every pixel are ini-
tialized with the tensors obtained after the first application.
After this initialization, (1) and (3) can be applied in their full
definition, since all necessary data are available.
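The accumulation step can be sketched as follows (names are ours; the two propagation functions are passed in as callables, e.g. versions of the functions sketched earlier with their factors bound in):

```python
import numpy as np

def total_vote(q, neighbors, stick_vote, ball_vote):
    """Accumulate all stick and ball votes received at pixel q
    for one channel: TV_c(q) = sum over p of S_c(p, q) + B_c(p, q)."""
    TV = np.zeros((2, 2))
    for p in neighbors:
        TV += stick_vote(p, q) + ball_vote(p, q)
    return TV

# Toy example: two noiseless identical neighbors, each casting a
# half-strength stick vote and a null ball vote.
stick = lambda p, q: np.array([[0.5, 0.0], [0.0, 0.0]])
ball = lambda p, q: np.zeros((2, 2))
TV = total_vote((0, 0), [(1, 0), (0, 1)], stick, ball)
```

Running this once with color-only factors and a second time with the estimated saliencies reproduces the two-pass scheme described above.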
After applying the voting process described above, it is
necessary to obtain the eigenvectors and eigenvalues of TVL(p), TVa(p) and TVb(p) at every pixel p in order to analyze its
local perceptual information. The voting results can be in-
terpreted as follows: uniformity increases with the normal-
ized s1 saliency and edginess increases as the normalized
s2 saliency becomes greater than the normalized s1 saliency.
Hence, the map of normalized s2 saliencies can be used di-
rectly as an edginess map. Standard post-processing steps
such as non-maximum suppression, hysteresis or thresholding
can then be applied to the normalized s2 saliency map in or-
der to obtain binary edge maps. The results can be improved
by reducing the noise in the image. This denoising step can be
achieved by replacing the pixel’s color by the most likely nor-
malized noiseless color encoded in its tensors. The method
can then be applied to the denoised images iteratively, which
improves the final performance of the edge detector.
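Decoding an accumulated tensor into uniformity, edginess and a denoised channel value can be sketched as follows (a NumPy sketch; the function name is ours):

```python
import numpy as np

def interpret_tensor(TV):
    """Decode a total-vote tensor into (s1, s2, alpha).

    s1: uniformity; s2: edginess (used directly as the edginess map);
    alpha: angle of the principal eigenvector in [0, pi/2], i.e. the
    most likely normalized noiseless channel value, usable for denoising.
    """
    w, v = np.linalg.eigh(TV)   # eigenvalues in ascending order
    l2, l1 = w
    e1 = v[:, -1]               # eigenvector of the largest eigenvalue
    alpha = np.arctan2(abs(e1[1]), abs(e1[0]))
    return (l1 - l2) / l1, l2 / l1, alpha

# A mostly-uniform pixel: dominant eigenvalue along the x axis.
s1, s2, alpha = interpret_tensor(np.array([[2.0, 0.0], [0.0, 0.5]]))
```

The s2 values over all pixels form the edginess map to which non-maximum suppression, hysteresis or thresholding is then applied.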
2.3. Parameters of the CIEDE2000 formula
The CIEDE2000 formula [4], which estimates the percep-
tual color difference between two pixels p and q, ΔEpq, has
three parameters, kL, kC and kH , to weight the differences in
CIELAB luminance, chroma and hue respectively. They can
be adjusted to make the CIEDE2000 formula more suitable
for every specific application by taking into account factors
such as noise or background luminance, since those factors
were not explicitly taken into account in the definition of the
formula. These parameters must be greater than or equal to
one. Based on the formulation given in [5], the following
equations for these parameters are proposed:
kL = FBL FηL, kC = FBC FηC, kH = FBh Fηh, (5)
where FBm are factors that take into account the influence of
the background color on the calculation of color differences
for the color component m (L, C and h) and Fηm are factors
that take into account the influence of noise on the calculation
of color differences in component m. On the one hand, big
color differences in chromatic channels become less percep-
tually visible as background luminance decreases. Thus, the
influence of the background on the CIEDE2000 formula can
be modeled by FBL = 1 and FBC = FBh = 1 + 3 (1 − YB),
where YB is the mean background luminance. On the other
hand, big color differences become less perceptually visible
as noise increases. The influence of noise on CIEDE2000 can
be modeled by means of Fηm = MAD(I)m − MAD(G)m, where I is the image, G is a Gaussian blurred version of I, and MAD(·)m is the median absolute difference (MAD) calculated on component m. Fηm is set to 1 in noiseless regions.
3. RESULTS
Fifteen outdoor images from the Berkeley segmentation data
set [6] and their corresponding ground truths have been used
                         Image 1                       Image 2
                 LGC      Compass   TVED       LGC      Compass   TVED
PSNR (dB)        16.74    16.05     21.13      16.93    20.20     22.28
FOMO, FOMN       0.45, 0.38  0.45, 0.43  0.45, 0.43  0.45, 0.43  0.44, 0.40  0.46, 0.44
Fig. 2. First row: original image and the edginess maps generated by the LGC, Compass and TVED methods respectively for two different
images. Second row: noisy version of the same images and their corresponding edginess maps (LGC, Compass and TVED). PSNR and FOM
for the original (FOMO) and the noisy image (FOMN) are indicated below the images.
in the experiments. The methods proposed by Maire et al. [2],
referred to as the LGC method, and by Ruzon and Tomasi
[1], referred to as the Compass method, have been used in
the comparisons, since they are representative of the state-of-
the-art in edge detection. The default parameters of the LGC
method have been used. The Compass algorithm has been ap-
plied with σ = 2, since the best overall performance of this
algorithm has been attained with this standard deviation. Five
iterations of the proposed method, referred to as TVED, have
been run with parameters σs = 1.3 and σd = 2.5. Gaus-
sian noise with a standard deviation of 30 has been added to
the images for the robustness analysis in order to simulate
very noisy scenarios. Performance has been evaluated by us-
ing two metrics: the Pratt’s Figure of Merit (FOM) [7] in or-
der to measure precision, and the Peak Signal to Noise Ratio
(PSNR) in order to measure robustness by comparing differ-
ences between two edginess maps: those generated for both
the noiseless and the noisy version of the same image.
Figure 2 shows the edginess maps detected for two of
the tested images1. It can be seen that LGC generates fewer edges than the other methods, misses some important edges, and its edge strength is reduced for the noisy images. The Compass
operator generates too many edges and the number of edges
increases with noise. TVED has a better behavior, since it
only detects the most important edges and is less influenced
by noise. The PSNR confirms that TVED is the most robust
detector, whereas the FOM indicates that the three methods
have a similar performance in precision, with TVED being
slightly better.
1All the images are available at http://deim.urv.cat/˜rivi/tved.html
4. CONCLUDING REMARKS
A new method for edge detection based on an adaptation
of the TVF has been proposed. An optimized version of
CIEDE2000 has been used to measure perceptual color differ-
ences in non-controlled environments by modifying its orig-
inal parameters. Experimental results show that the use of
a specific voting process makes the TVF a powerful tool for
edge detection. PSNR and FOM have been used to compare
the performance of the TVED against two of the most rep-
resentative state-of-the-art edge detectors. TVED has been
found to be more robust and slightly more precise than the
other algorithms.
5. REFERENCES
[1] M. Ruzon and C. Tomasi, “Edge, junction, and corner detection
using color distributions,” IEEE Trans. PAMI, vol. 23, no. 11,
pp. 1281–1295, 2001.
[2] M. Maire, P. Arbelaez, C. Fowlkes, and J. Malik, “Using con-
tours to detect and localize junctions in natural images,” in Proc. CVPR, 2008, pp. 1–8.
[3] G. Medioni, M. S. Lee, and C. K. Tang, A Computational Framework for Feature Extraction and Segmentation, Elsevier Science, 2000.
[4] M. R. Luo, G. Cui, and B. Rigg, “The development of the CIE
2000 colour-difference formula: CIEDE2000,” Color Res. and Application, vol. 26, no. 5, pp. 340–350, 2001.
[5] C-H Chou and K-C Liu, “A fidelity metric for assessing visual
quality of color images,” in Proc. ICCCN, 2007, pp. 1154–1159.
[6] D. Martin, C. Fowlkes, D. Tal, and J. Malik, “A database of hu-
man segmented natural images and its application to evaluating
segmentation algorithms and measuring ecological statistics,” in
Proc. ICCV, 2001, pp. II:416–423.
[7] W. K. Pratt, Digital Image Processing: PIKS Scientific Inside,
Wiley-Interscience, fourth edition, 2007.