[IEEE 21st International Conference on Advanced Information Networking and Applications Workshops (AINAW'07) - Niagara Falls, ON, Canada (2007.05.21-2007.05.23)]

WaveQ: Combining Wavelet Analysis and Clustering for Effective Image Retrieval

Dany Gebara*

Abstract

This paper proposes WaveQ, a content-based image retrieval system that classifies images as texture or non-texture, then uses a Daubechies wavelet decomposition to extract feature vector information from the images, and finally applies the OPTICS clustering algorithm to cluster the extracted data into groups of similar images. Queries are submitted to WaveQ in the form of an example image. WaveQ has been thoroughly tested and the results are very promising. The achieved results demonstrate that the classification of images is extremely fast and accurate.

Keywords: classification, clustering, image mining, image retrieval, wavelet analysis.

1 Introduction

The basic concept behind a content-based image retrieval (CBIR) system is to compare the content of a query image to that of images within a database and return the closest matches. Some of the earlier methods used to achieve this were color histograms, which were fast to implement and execute. More recently, systems based on wavelet decomposition have appeared; these overcome some limitations of the color histogram.

Another factor to consider is that there are different classes of images. Texture images form an important class, where an object within the image is repeated periodically throughout the image. Some medical images such as X-rays and some topographic images fall under this category. Non-texture images tend to have objects of interest clustered in one or more regions of an image.

Reda Alhajj†

In order to be able to compare images by content, a feature vector (or representative signature) needs to be calculated for each image. This feature vector is the description of the image to the CBIR system, which will then conduct its search based on these calculated vectors. Generally, the algorithms used to calculate these feature vectors perform well on some classes of images and poorly on others. It therefore follows that a CBIR system should classify an image first, then use an appropriate algorithm based on the classification. In terms of querying speed, a faster system is naturally preferred. Hence, if there is a way to avoid scanning the entire database every time a query is submitted, this should result in faster responses to the user. As a solution, clustering can be applied to the calculated feature vectors, where the signatures for similar images are grouped as one cluster. When querying, a CBIR system needs only to look at a representative for each cluster to narrow the search.

To handle the classification and querying of images in a more concise and effective way, this paper proposes WaveQ as a system that combines both wavelet analysis and clustering into the image retrieval process. The WaveQ system is proposed with two distinct phases. During the learning phase, WaveQ will learn what all the images in the database look like. The learning phase is comprised of three parts. Each image is first classified as texture or non-texture, then one of two algorithms, based on the classification result, is executed to calculate a feature vector for the image. These vectors are then clustered so that similar images are grouped together. In the querying phase, a query image selected by the user is presented to WaveQ. The image is first classified, and a feature vector is calculated for the image. This is then compared to the clusters (representative vectors) only, instead of every

stored image in the database. When WaveQ determines which cluster represents the closest match, the query image is compared to the images within that cluster and the best matches are returned to the user.

*Dany Gebara: Department of Computer Science, University of Calgary, Calgary, Alberta, Canada, [email protected]

†Reda Alhajj: Department of Computer Science, University of Calgary, Calgary, Alberta, Canada, [email protected]; he is also affiliated with the Department of Computer Science, Global University, Beirut, Lebanon.

The balance of this paper is organized as follows.

21st International Conference on Advanced Information Networking and Applications Workshops (AINAW'07) 0-7695-2847-3/07 $20.00 © 2007 IEEE. IEEE Computer Society.


The WaveQ system is described in detail in Section 2. Section 3 covers the experimental runs that were carried out to test the performance of WaveQ. Section 4 presents the summary and conclusions.

2 The WaveQ Model

Like most content-based image retrieval (CBIR) systems, WaveQ consists of two major phases: the learning phase and the querying phase. In the learning phase, all the images in the database go through an indexing stage, where each image is classified as either texture or non-texture. Then, clustering is applied to texture and non-texture images separately. For each cluster, a representative feature vector is calculated to be used in the querying phase.

2.1 Feature Extraction

To classify images, WaveQ uses the LUV color space, which is device-independent and has "good perception correlation properties" [10, 9], i.e., the Euclidean distance between two colors is approximately equal to the perceived color difference [6]. While L encodes luminance, which corresponds to the perceived brightness (or gray scale level) of a color [4], U and V encode the color information of a pixel. U corresponds approximately to the red-green color from the RGB spectrum, and V to the blue-yellow color [6]. The values of L, U and V for each pixel can be calculated from the RGB pixel components as shown in Equation 1.

where (Y_n, U_n, V_n) = (1.0, 0.2009, 0.4610) for the white point reference.
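Equation 1 itself is not legible in this scan. As a hedged illustration, the sketch below applies the standard CIE L\*u\*v\* conversion, which matches the description above; the sRGB-to-XYZ matrix is an assumption (one common choice), and the white-point values quoted are treated as the chromaticities (u'_n, v'_n).

```python
# Sketch of an RGB -> LUV conversion consistent with the description above.
# Assumes linear RGB components in [0, 1] and the white point quoted in the
# paper: (Y_n, u'_n, v'_n) = (1.0, 0.2009, 0.4610).
YN, UN, VN = 1.0, 0.2009, 0.4610

def rgb_to_luv(r, g, b):
    # Linear RGB -> CIE XYZ (sRGB primaries; this matrix is an assumption).
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    # Lightness L (standard CIE piecewise definition).
    t = y / YN
    L = 116.0 * t ** (1.0 / 3.0) - 16.0 if t > (6.0 / 29.0) ** 3 else (29.0 / 3.0) ** 3 * t
    # Chromaticities u', v' (guarding against the all-black pixel).
    d = x + 15.0 * y + 3.0 * z
    u_p = 4.0 * x / d if d else UN
    v_p = 9.0 * y / d if d else VN
    return L, 13.0 * L * (u_p - UN), 13.0 * L * (v_p - VN)
```

Note that because the matrix above uses sRGB primaries while the quoted white point is a different illuminant, a pure white pixel gives L = 100 but slightly nonzero U and V.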

The next step is to calculate the energy of one of the color components. Experiments conducted on both U and V components showed that the U component is better suited for the application under consideration. The energy formula is given in Equation 2:

E = (1 / (M × N)) Σ_{m=1}^{M} Σ_{n=1}^{N} |pixel(m, n)|    (2)

where M and N are the width and height of the image, respectively, and pixel(m, n) is the value of the pixel located at index (m, n).
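As a small plain-Python illustration of the energy computation (the nested-list image representation is an assumption):

```python
def energy(pixels):
    """Average absolute value of a 2-D channel (Equation 2):
    E = (1 / (M * N)) * sum of |pixel(m, n)| over all pixels."""
    m = len(pixels)          # number of rows
    n = len(pixels[0])       # number of columns
    return sum(abs(v) for row in pixels for v in row) / (m * n)
```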


The standard deviation is a statistic that indicates how tightly the objects in a data set are packed about the mean. If the objects are closely packed, the standard deviation is small; if they are scattered, it is relatively large. Based on the definition of texture, the standard deviation of the U component is calculated over all pixels. In textured images, a large number of pixels have values close to the image energy, which leads to a smaller standard deviation. Thus, if the standard deviation is smaller than a certain threshold, the image is classified as texture. In WaveQ, the threshold is set to 4.0 based on experimentation.
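The threshold rule above can be sketched as follows; the nested-list representation of the U channel is an assumption:

```python
from statistics import pstdev

def is_texture(u_channel, threshold=4.0):
    """Classify as texture when the standard deviation of the U component
    is below the experimentally chosen threshold (4.0 in WaveQ)."""
    values = [v for row in u_channel for v in row]
    return pstdev(values) < threshold
```

A nearly uniform channel (tightly packed values) classifies as texture; a channel with widely scattered values does not.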

For feature extraction, Daubechies wavelet decomposition is applied to the L component of the image. The L component is used because it corresponds to the gray scale of the image without considering the color information. WaveQ can read colored images, but it converts them to gray scale, and uses the L component to classify and extract features. The feature vectors calculated for all the images are stored and used in the clustering step.

Applying the Daubechies low and high pass filters along the rows and columns of the image yields a Daubechies wavelet decomposed image with four sub-bands: LL, HL, LH, and HH. While LL is an approximation of the original image at a lower resolution, HL and LH correspond to the vertical and horizontal wavelet coefficients, respectively. The fourth sub-band HH corresponds to the diagonal coefficients.
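A minimal sketch of one decomposition level, written out explicitly with the standard Daubechies-4 (db2) filter coefficients; periodic boundary handling is an assumption, and sub-band naming conventions vary between authors. A library such as PyWavelets provides the same operation.

```python
import math

# Daubechies-4 analysis filters (low pass LO and its quadrature mirror HI).
S3 = math.sqrt(3.0)
LO = [(1 + S3) / (4 * math.sqrt(2.0)), (3 + S3) / (4 * math.sqrt(2.0)),
      (3 - S3) / (4 * math.sqrt(2.0)), (1 - S3) / (4 * math.sqrt(2.0))]
HI = [LO[3], -LO[2], LO[1], -LO[0]]

def analyze_1d(x, f):
    """Correlate with filter f and downsample by 2 (periodic boundaries)."""
    n = len(x)
    return [sum(f[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dwt2(image):
    """One db2 level: filter rows, then columns; returns LL, HL, LH, HH."""
    rows_lo = [analyze_1d(r, LO) for r in image]
    rows_hi = [analyze_1d(r, HI) for r in image]
    ll = transpose([analyze_1d(c, LO) for c in transpose(rows_lo)])
    lh = transpose([analyze_1d(c, HI) for c in transpose(rows_lo)])
    hl = transpose([analyze_1d(c, LO) for c in transpose(rows_hi)])
    hh = transpose([analyze_1d(c, HI) for c in transpose(rows_hi)])
    return ll, hl, lh, hh
```

Because the db2 filter bank is orthonormal, the total energy of the four sub-bands equals the energy of the input image, which makes the transform easy to sanity-check.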

After completing the Daubechies decomposition, the next step is to calculate the feature vectors. The texture feature vector is made up of several components. There are some statistical features including energy (Equation 2) and standard deviation. There are also some wavelet co-occurrence features, such as contrast, energy, entropy and local homogeneity. Wavelet decomposition is carried out to the third level, and statistical and wavelet co-occurrence features are recorded from these sub-bands. The wavelet statistical features are only calculated for the high frequency sub-bands because this is often where the most important information of a texture image appears [3]. The level 1 low frequency sub-band LL1 is further decomposed by Daubechies wavelet and the energy and standard deviation are calculated for the level 2 high frequency sub-bands. This process is repeated for a third level decomposition, where LL2 is decomposed and the wavelet statistical features are calculated for LH3, HL3 and HH3.
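The wavelet co-occurrence features named above (contrast, energy, entropy, local homogeneity) can be sketched for a single displacement as follows; the helper names and the single-displacement simplification are illustrative, not from the paper, which averages over four orientations:

```python
import math
from collections import Counter

def cooccurrence(band, dx, dy):
    """Normalized grey-level co-occurrence matrix of an integer-valued
    sub-band for one displacement (dx, dy)."""
    h, w = len(band), len(band[0])
    pairs = Counter()
    for y in range(h):
        for x in range(w):
            if 0 <= x + dx < w and 0 <= y + dy < h:
                pairs[(band[y][x], band[y + dy][x + dx])] += 1
    total = sum(pairs.values())
    return {k: v / total for k, v in pairs.items()}

def glcm_features(p):
    """Contrast, energy, entropy and local homogeneity of a GLCM p."""
    contrast = sum(((i - j) ** 2) * v for (i, j), v in p.items())
    energy = sum(v * v for v in p.values())
    entropy = -sum(v * math.log2(v) for v in p.values() if v > 0)
    homogeneity = sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items())
    return contrast, energy, entropy, homogeneity
```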

Texture features can be extracted in several ways. One of the earliest methods developed uses statistical features of a pixel's grey level (L component) [7]. Haralick proposed the use of grey level co-occurrence



matrices to extract second order statistics from an image [5]. These have been very effective in texture classification. In WaveQ, instead of using the co-occurrence matrix on the original image, it is calculated for the level 1 Daubechies decomposed sub-bands. The size of the co-occurrence matrix is determined by the highest grey level in the sub-band. Finally, the texture feature vector also contains the average of contrast, energy, entropy and local homogeneity over all four orientations: horizontal, vertical and the two diagonals.

For calculating the non-texture feature vector, we adopted the method of Kubo et al. [8], with the difference of using the Daubechies wavelet transform instead of the Haar wavelet. Daubechies wavelets are used because they are better than Haar wavelets when working with general purpose images [11].

The non-texture feature vector method takes a level 1 Daubechies decomposed image as input and uses the LH and HL sub-bands to construct an edge image. The HL sub-band embodies the vertical edges and the LH sub-band contains the coefficients of the horizontal edges. The edge image is built by combining the elements of the LH and HL sub-bands using the following equation:

e_{m,n} = sqrt(v_{m,n}^2 + h_{m,n}^2)    (3)

where 1 ≤ m ≤ height of the LH and HL sub-bands, 1 ≤ n ≤ width of the LH and HL sub-bands, and v_{m,n}, h_{m,n} and e_{m,n} are elements of the HL sub-band, LH sub-band and the edge image, respectively.

The constructed edge image is used to calculate the higher order autocorrelation features, which are the primitive edge features [8]. Due to the large number of autocorrelation functions, Kubo et al. limited them to the second order, with the range of displacement restricted to a 3 × 3 window. With these limitations, 25 features are extracted.

R_N(a_1, a_2, ..., a_N) = Σ_r I(r) I(r + a_1) I(r + a_2) ... I(r + a_N)    (4)

Equation 4 is used to calculate the Nth order autocorrelation features, where I(r) represents the image and I(r + a_i) is the translation of I(r) by a displacement vector a_i.

Higher order autocorrelation features are shift-invariant, i.e., the features do not depend on the positions of objects within an image. However, these features are not invariant to scale and grey levels. That is, if two images were presented to a content-based image retrieval system, with the only difference being scaling or brightness, the system would not return a match. A remedy for this is normalization using the following equation:

R'_N = ( R_N(a_1, ..., a_N) / (w × h) )^{1/N}    (5)

To make the features scale-invariant, Equation 4 is divided by the size of the original image, w × h; and to make them grey-level invariant, Equation 4 is raised to the power 1/N, where N is the order of autocorrelation. Thus, two similar images of different size or brightness should match during a query.

2.2 Clustering

In WaveQ, the OPTICS clustering algorithm [1] is used to cluster the images within each category: texture and non-texture. OPTICS orders the points in the database following the concept of density-based clustering algorithms. This results in a set of ordered points that are useful to determine clusters at different densities. In a density-based clustering algorithm, each point in a cluster must have a minimum number of points that are within a specified distance ε from it. Clusters that have a higher density, i.e., a smaller ε value, can be totally included (or nested) in clusters of lower density, i.e., a larger ε value. Taking this into consideration, OPTICS orders the points in the database so that clusters of different densities can be derived concurrently. Finally, some basic concepts to be used in the sequel are introduced next.

Given two objects p and q, object p is in the ε-neighborhood of q if the distance from p to q is less than ε. An object is said to be core if and only if it has a minimum number of points (MinPts) in its ε-neighborhood. Object p is directly density-reachable from q if q is a core object and p is in the ε-neighborhood of q. Object p is density-reachable from object q if there is a chain of objects p_1, ..., p_n, where p_1 = q and p_n = p, such that p_{j+1} is directly density-reachable from p_j. The core-distance is the smallest distance ε' between a given object p and an object in its ε-neighborhood such that p would be a core object with respect to ε'. The reachability-distance of a given object p is the smallest distance such that p is density-reachable from a core object o.
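A hedged sketch of the ε-neighborhood and core-distance definitions above; the function names are illustrative, and counting the point itself toward MinPts follows the usual density-based-clustering convention:

```python
import math

def eps_neighborhood(points, p, eps):
    """Indices of all points within distance eps of points[p]
    (the point itself included)."""
    return [q for q in range(len(points))
            if math.dist(points[p], points[q]) <= eps]

def core_distance(points, p, eps, min_pts):
    """Distance to the min_pts-th nearest neighbour inside the
    eps-neighborhood, or None when p is not a core object."""
    dists = sorted(math.dist(points[p], points[q])
                   for q in eps_neighborhood(points, p, eps))
    return dists[min_pts - 1] if len(dists) >= min_pts else None
```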

The only parameter needed by OPTICS is MinPts, the minimum number of points that should be in the ε-neighborhood of a point p in order for it to be considered a core object. Previous research [1] has shown that good results can be obtained by using any value for MinPts between 10 and 20. In WaveQ, MinPts is set to 15. The higher the value of MinPts, the more continuous and even the reachability curve is; the lower the value, the rougher the curve is.

The value of ε is an important consideration in this clustering system. There are many ways to calculate ε. It is important to use a sensible value, because if ε is too small, many potential clusters may be lost.



Therefore, it is desired to obtain the smallest value for ε that would result in one large cluster containing almost all points in the database. Then, all the smaller clusters can be assumed to be contained within this large cluster.

Using simple heuristics, and assuming a random distribution of points within the database (i.e., no clusters), the k-nearest-neighbor distance is calculated, i.e., the distance ε for a point to have k objects within its neighborhood, where k = MinPts. This value is computed for a data space DS containing N points. ε is equal to the radius of a d-dimensional hypersphere S within DS, where S contains k points [1].

To calculate ε, consider the following equations for the volume of a hypersphere S:

Volume_S = (Volume_DS / N) × k,    Volume_S(r) = (sqrt(π^d) / Γ(d/2 + 1)) × r^d    (6)

The first part of Equation 6 holds because we assume that the points are randomly distributed; N and k are the total number of points in DS and S, respectively. The second part, Volume_S(r), is the volume of a hypersphere S with radius r and dimension d, and Γ is the Gamma function defined in Equation 8. Equating these two expressions (since they both represent the volume of S), we can calculate the value of r(ε):

r = ( (Volume_DS × k × Γ(d/2 + 1)) / (N × sqrt(π^d)) )^{1/d}    (7)

Thus, ε is calculated by setting its value to r in Equation 7.

Γ(n/2 + 1) = (n/2)!                            if n is even
Γ(n/2 + 1) = sqrt(π) × n!! / 2^{(n+1)/2}       if n is odd    (8)
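The radius of Equation 7 can be computed directly with the standard library's Gamma function; the function and parameter names below are illustrative:

```python
import math

def epsilon(volume_ds, n_points, k, dim):
    """Radius of a dim-dimensional hypersphere expected to contain k of the
    n_points uniformly distributed points in a data space of volume
    volume_ds (Equation 7)."""
    return (volume_ds * k * math.gamma(dim / 2.0 + 1.0)
            / (n_points * math.sqrt(math.pi ** dim))) ** (1.0 / dim)
```

As a sanity check, in two dimensions the formula reduces to sqrt(V·k / (N·π)): a data space of area π holding a single point needs a circle of radius 1 to expect one point.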

In WaveQ, the value for ε is calculated as described above. After calculating ε, OPTICS sorts the set of d-dimensional points being clustered. It takes a d-dimensional Origin point (0, ..., 0), and calculates the Euclidean distance, given in Equation 9, between all points and the Origin. The list is sorted in ascending order based on the Euclidean distance. This is done to facilitate the extraction of points later in the algorithm. The Euclidean distance is given by:

dist(p, q) = sqrt( Σ_{i=1}^{N} (p_i − q_i)^2 )    (9)

where p and q are points of dimension N, and p_i and q_i are their respective coordinates in dimension i.

It is required to determine whether p is a core object or not, and to calculate its core and reachability distances. This is achieved by considering points that are in the ε-neighborhood of p. If there are more than MinPts points in p's ε-neighborhood, then p is a core object with a core distance equal to the distance of its nth neighbor, where n = MinPts. Second, the reachability-distance for each point q in p's ε-neighborhood is updated. This is done by first checking whether q has been processed or not. If not, then q is added to NeighborList, which is arranged to preserve the heap properties based on q's reachability-distance. However, if q has been processed, this means that it is already in NeighborList, because it is in another point's ε-neighborhood. So, q's reachability in NeighborList has to be modified to correspond to the smallest reachability of q. This is the only way to ensure that clusters with higher densities, i.e., smaller ε values, are completed first. Finally, the point currently being processed is written, with its core and reachability distances, to a file called OutputFile.

Looking at the reachability distances should be sufficient to determine which points belong to a cluster. To extract the clusters from the ordered list of reachabilities, every object o in the list is checked, and the clusters are created or updated based on the reachability and core distance values. If the reachability of o is greater than ε', this means that o is not reachable from any object before it in the OrderedFile. If it were reachable, then it would have been assigned a reachability-distance that is less than or equal to ε'. So, the core distance of o is checked. If o is a core object with respect to ε' and MinPts, then a new cluster is created and o is added to it. Otherwise, o belongs to the NOISE cluster. If the reachability of o is at most ε', then o is reachable from the points that precede it in the OrderedFile, and thus o is added to the current cluster. After performing some experiments, ε' was taken to be the average of all non-infinity reachabilities multiplied by 0.7.
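The extraction procedure above can be sketched as follows, assuming the OPTICS output is available as a list of (reachability, core-distance) pairs in cluster order; the names are illustrative:

```python
def extract_clusters(ordering, eps_prime):
    """Walk the OPTICS ordering: a reachability above eps_prime starts a new
    cluster when the object is core w.r.t. eps_prime, otherwise the object
    is NOISE; a reachability at most eps_prime extends the current cluster.
    ordering is a list of (reachability, core_distance) pairs; core_distance
    is None for non-core objects."""
    clusters, noise, current = [], [], None
    for idx, (reach, core) in enumerate(ordering):
        if reach > eps_prime:
            if core is not None and core <= eps_prime:
                current = [idx]
                clusters.append(current)
            else:
                noise.append(idx)
                current = None
        elif current is not None:
            current.append(idx)
        else:
            noise.append(idx)  # reachable object with no open cluster
    return clusters, noise
```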

While adding objects to clusters, a feature vector is calculated for each cluster except NOISE to represent that cluster. The representative feature vector is the mean of all feature vectors belonging to the same cluster. This representative feature vector is used later on in the querying phase to speed up the process.

2.3 The Querying Phase

Once the learning phase has been completed, the database can be queried. By implementing classification and clustering, WaveQ can query the database



much faster because the images in the database have already been grouped. WaveQ operates on the principle of query by example.

When a query image is presented to WaveQ, it undergoes processing similar to other images in the learning phase. First, it is classified as either a texture or a non-texture image. Then, based on this classification, the appropriate feature vector is extracted. This feature vector is then compared to the representative feature vectors calculated for each cluster in the clustering part of the learning phase.

The cluster with the closest match on its representative feature vector is then chosen for closer examination. The query feature vector is then compared to the feature vectors of images within the chosen cluster, and the closest matches are returned.
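The two-stage lookup can be sketched as follows; representing each cluster by the mean of its member vectors follows Section 2.2, while the data layout and function names here are assumptions:

```python
import math

def query(query_vec, clusters, top_k=3):
    """Two-stage lookup: pick the cluster whose representative (mean) vector
    is closest to the query, then rank only that cluster's members.
    clusters maps a cluster id to a list of feature vectors."""
    def representative(vecs):
        return [sum(component) / len(vecs) for component in zip(*vecs)]
    best = min(clusters,
               key=lambda cid: math.dist(query_vec, representative(clusters[cid])))
    ranked = sorted(clusters[best], key=lambda v: math.dist(query_vec, v))
    return best, ranked[:top_k]
```

Only one cluster's members are distance-ranked, which is the search-space reduction the paper attributes to clustering.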

3 Experimental Analysis

The development of WaveQ was undertaken in a Java-based software development environment. The testing has been carried out on a desktop PC equipped with a 3.0 GHz CPU; the operating system installed is Microsoft Windows XP SP2.

It was decided to test the performance on texture images to demonstrate the ability of WaveQ in classifying and querying any image belonging to this category. A second aim of the conducted experiments is to compare the performance of WaveQ to other content-based image retrieval (CBIR) systems already described in the literature. The data set was chosen to minimize the number of experimental runs required while still achieving both objectives stated above.

The dataset consisted of texture images taken from the Brodatz album [2]. This is a popular album that has been extensively used for testing purposes. Thirty of the Brodatz images, of size 640 × 640 and in gif format, were used to derive a set of 3000 images of size 256 × 256 that were used in the experiments. This was achieved by randomly extracting 100 sample images of size 256 × 256 from each original image.

The first criterion analyzed is the effectiveness and speed of WaveQ's classification method. The classification method was also compared to the one used by Wang et al. [9, 10] on both sets of images, along with the time needed to classify an image. It was easy to analyze the effectiveness of WaveQ because it was known in advance what the total number of correct texture and non-texture images would be.

The effectiveness of the classification process is important. As the first step in both the learning and the querying phases, correct classification is vital in order to extract feature vectors correctly and cluster them effectively. If the classification rate is low, then fewer matching images would be returned as a result of a query, because either the query image was classified incorrectly, or in the learning phase the images that were grouped together did not belong to the same image type. The time taken to classify is also important, because as the database size increases, the run time of the learning phase would also increase significantly.

The second parameter being considered is the precision of WaveQ's querying technique based on the feature vectors extracted; the higher the system's precision, the higher the percentage of matching images returned to a query. The most favorable condition would be 100% matching of images, since the idea behind a content-based image retrieval system is to retrieve from the database the maximum number of images that match a query image. For the evaluation of WaveQ, the definition given in Equation 10 is used for precision.

Precision = (# of correct images returned) / (# of images returned)    (10)

The third factor under consideration is determining the effectiveness of incorporating clustering into WaveQ on its performance. This was tested by comparing the querying results obtained by applying clustering with those obtained by using the feature vectors of all the images in the database. The hypothesis being tested is that clustering would be useful due to the fact that it reduces the search space without significantly affecting the accuracy of the query results. This is demonstrated in Equation 11.

Clustering Precision = (# of correct images returned from Clusters) / (# of correct images returned from entire database)    (11)

To test the benefits of the WaveQ classification method, the Brodatz image set was first processed according to the algorithm used by Wang et al. [9, 10]. The images were then processed again by WaveQ. It was known that the 3000 Brodatz images should all be classified as texture. Running WaveQ on the 3000 Brodatz images, 64 of those were classified as non-texture, i.e., false-negatives, and 2936 images were classified as texture, i.e., true-positives. So the percentage of true-positive results is high (97.8%). Whereas, when the classification algorithm of Wang et al. [9, 10] was executed on the same 3000 Brodatz images, 123 images were classified as non-texture, i.e., 123 out of the 3000 images are false-negatives; and 2877 images were reported as true-positives because they are texture images and they were also classified as texture. So the percentage of true-positives is 95.9%, which is slightly lower than the result obtained from WaveQ.



Another significant factor in classification is the time required to perform the task. The size of image databases is usually large, and a fast learning phase is naturally preferred. Classification is part of the learning phase, and the faster the classification can be done the better. When executing the algorithm of Wang et al. [9, 10], the average time needed to classify one Brodatz image is 13.4 seconds, whereas the average time achieved using the WaveQ classification algorithm to classify a Brodatz texture is 0.10 seconds. It is clear that the WaveQ algorithm is significantly faster. Overall, it takes almost 13 hours to classify the 3000 Brodatz images using Wang et al.'s algorithm [9, 10], while WaveQ needs less than two hours to accomplish the same task, with a better rate of true-positives.

4 Summary and Conclusions

The research scope for this paper focused on the development of a general purpose content-based image retrieval system. A classification method was adapted, and this proved to be very efficient at classifying images. Image retrieval was carried out in WaveQ by comparing texture features or shape features within images. Two separate algorithms based on the Daubechies wavelet were used to extract features from images, each one suited to a class of images. Clustering was used to reduce the search space by grouping similar images together in order to make querying faster. It is worth mentioning the several possible applications and benefits of WaveQ. First, the underlying framework for a general purpose CBIR system has been developed such that it can operate on a home computer. This opens up the possibility of creating software for personal home usage to classify photos and search them. WaveQ also has the potential to be applied to larger commercial databases. Such applications require the classification and analysis of images from different domains, including medical, criminal, spatial, etc., to identify the class of a new image under investigation. This is all supported by the testing phase, which reported quite good execution times that could possibly be further improved with some adjustments to the constructs and parameters utilized by the system.

References

[1] Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, and Jörg Sander. OPTICS: Ordering points to identify the clustering structure. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pages 49-60, June 1999.


[2] P. Brodatz. Textures: A Photographic Album for Artists & Designers, 1966. Accessed on 15/6/2004. Available at http://www.ux.his.no/tranden/brodatz.html.

[3] T. Chang and C.C. Jay Kuo. Texture analysis and classification with tree-structured wavelet transform. IEEE Transactions on Image Processing, 2(3):429-441, 1993.

[4] D. Duce (Editor). W3C Portable Network Graphics Specification Version 2.0. Accessed on 8/11/2005. Available at http://www.w3.org/TR/PNG-Glossary.html.

[5] R. Haralick. Statistical and structural approaches to texture. Proc. IEEE, 67:786-804, 1979.

[6] C.G. Healey and J.T. Enns. Large datasets at a glance: Combining textures and colors in scientific visualization. IEEE Transactions on Visualization and Computer Graphics, 5(2):145-167, 1999.

[7] P. Howarth and S. Rüger. Evaluation of texture features for content-based image retrieval. In International Conference on Image and Video Retrieval, pages 326-334, July 2004.

[8] M. Kubo, Z. Aghbari, and A. Makinouchi. Content-based image retrieval technique using wavelet-based shift and brightness invariant edge feature. International Journal of Wavelets, Multiresolution and Information Processing, 1(2):163-178, 2003.

[9] J. Li, J. Wang, and G. Wiederhold. Classification of textured and non-textured images using region segmentation. In Proc. of the 7th International Conference on Image Processing, pages 754-757, September 2000.

[10] J. Wang, J. Li, and G. Wiederhold. Simplicity: Semantics-sensitive integrated matching for picture libraries. IEEE Transactions On Pattern Analysis And Machine Intelligence, 23(9):947-963, 2001.

[11] J.Z. Wang, G. Wiederhold, O. Firschein, and S.X. Wei. Content-based image indexing and searching using Daubechies' wavelets. International Journal on Digital Libraries, 1:311-328, 1997.
