
Proceedings of the Second APSIPA Annual Summit and Conference, pages 280–283, Biopolis, Singapore, 14-17 December 2010.

A two-stage trademark retrieval system with invariant property

Lit-Hung Chan, Ngai-Fong Law and Wan-Chi Siu

Center for Signal Processing, Department of Electronic and Information Engineering, The Hong Kong Polytechnic University, Hung Hom, Hong Kong

E-mail: {[email protected], [email protected], [email protected]}

Abstract—A novel and fast trademark retrieval system is proposed in this paper to cope with the rapid increase in the number of trademarks. The proposed system uses a chain of pre-image processing steps to produce a standardized contour image for feature extraction. The extracted features consist of outer shape features and interior shape features. A two-stage matching scheme is used to retrieve similar trademarks: outer shape feature matching first retrieves trademarks with a similar outer shape, and interior shape feature matching is then combined with it to refine the first-stage results. The performance of the proposed retrieval system was evaluated on a database of 300 trademarks. The experimental results show that the proposed two-stage retrieval method can successfully retrieve trademarks whose shape features, extracted from the contour image and its interior regions, are similar to those of the query.

I. INTRODUCTION

Owing to the rapid increase in trademark registrations, copyright is a concern for a new company designing its trademark. With content-based retrieval methods, users can search for similar trademarks according to their shape. Companies can also use these methods to check whether similar trademarks already exist in the market before registering their design. Such a process provides a convenient way for a new company to design a distinctive trademark and avoid infringing copyright. The aim of this project is to design and build an effective content-based retrieval model for trademarks. Much research [1-6] has been done on trademark retrieval. As a trademark usually has a well-defined shape with a clear boundary, these algorithms rely on contour-based shape features to analyze trademarks. We propose a two-stage approach for trademark retrieval. The first stage retrieves trademarks according to their outer shape similarity, while the second stage refines the search results using interior shape features. The rationale behind the first stage is that a trademark can usually be characterized by its rough shape. Hence, irrelevant results can be filtered out after the first stage.

To verify the effectiveness of the proposed two-stage approach, a content-based retrieval system for trademarks was implemented. More than 300 trademarks and logos were used for testing. The trademarks were classified into different classes according to their shape features. Recall and precision were then used to evaluate the performance of the system.

The rest of the paper is organized as follows. Section II gives an overview of the system. Section III describes in detail the pre-image processing used to standardize trademarks. Sections IV and V present the extraction of the outer shape features and the interior shape features, respectively. Section VI describes the two-stage retrieval method. Section VII presents and discusses the results, and Section VIII concludes the paper.

II. AN OVERVIEW OF THE PROPOSED SYSTEM

Our trademark retrieval system consists of a database construction part and an image retrieval part. In constructing the database, trademark images undergo pre-image processing to produce standardized trademarks, so that translation-, rotation- and scaling-invariant features can be extracted and stored in the database. In the image retrieval part, the user first inputs a query image. The system then performs the same pre-image processing and feature extraction as in the database construction part. Afterwards, the system measures the similarity between the outer shape features of the query image and those of the images in the database for the first-stage ranking. This process classifies the database images into different sets according to the similarity of their outer shape features to those of the query image. For each set of retrieved images, a second-stage ranking is performed, based on the interior similarity between the query image and the images in that particular set. The two-stage approach thus makes the retrieval sensitive to both outer shape and interior region features.

III. PRE-IMAGE PROCESSING

In the scanning process, trademarks can be located at arbitrary positions, rotated by arbitrary angles, and captured at different resolutions. The aim of the pre-image processing is to produce standardized trademarks so that translation-, rotation- and scaling-invariant features can be extracted for further analysis. There are four steps in pre-image processing: 1) color to grayscale conversion, 2) denoising, 3) edge detection and 4) rotation, border chopping and scaling.

The trademark is first converted to grayscale. It is then filtered by a wavelet transform, which produces four subband images. Only the LL subband, which contains the approximation image, is retained as the noise-filtered image; the other three subbands, which contain the edge/noise information, are discarded. The contour of the trademark is detected by applying the Canny edge detector to the noise-filtered image. The centroid of the contour trademark is then calculated. The axis passing through the centroid along which the trademark shape has maximum length is used to rotate the trademark so that this axis lies horizontally. Afterwards, the unnecessary border of the trademark is chopped. Finally, the trademark is scaled proportionally so that its border has a pre-defined width of 300 pixels.

10-0102800283©2010 APSIPA. All rights reserved.
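The first two pre-processing steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not name the wavelet or the grayscale weights, so a one-level Haar transform and the BT.601 luma weights are assumed here, and the function names `to_grayscale` and `haar_ll` are hypothetical. Edge detection, rotation, border chopping and scaling would then be applied to the returned image.

```python
import numpy as np


def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to grayscale (BT.601 weights, an assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(float) @ weights


def haar_ll(gray):
    """Return the LL (approximation) subband of a one-level 2-D Haar transform.

    The three detail subbands are never formed, which mirrors the step of
    keeping only the LL subband as the noise-filtered image.
    """
    h, w = gray.shape
    g = gray[: h - h % 2, : w - w % 2]  # drop a trailing odd row/column
    # Orthonormal Haar: each LL coefficient is the sum of a 2x2 block over 2.
    return (g[0::2, 0::2] + g[0::2, 1::2] + g[1::2, 0::2] + g[1::2, 1::2]) / 2.0
```

Note that the LL subband has half the resolution of the input in each dimension, so the later scaling step operates on the reduced image.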

IV. OUTER SHAPE FEATURE EXTRACTION

The outer shape of a trademark image gives a rough skeleton of the trademark. To obtain the outer shape, the outer intersection point for each angle from the centroid is detected in the standardized contour image of the trademark. These points are transformed into a one-dimensional function for estimating the outer shape [7-9]. Let f(i,j) be the function representing the detected outer points. This function is transformed into $a(\theta)$ for $\theta = 1, 2, 3, \ldots, 360$, where $a(\theta)$ is the distance between the detected point at angle $\theta$ and the centroid of the contour. Besides the distance information of the outer points, we also consider the symmetry of the outer contour with respect to the centroid. Let

$$b(\theta) = a(\theta) - a(\theta + 180) \qquad (1)$$

for $\theta = 1, 2, \ldots, 180$. $b(\theta)$ measures the difference between $a(\theta)$ and $a(\theta + 180)$, i.e., its counterpart displaced by 180 degrees. To characterize the outer shape and the symmetry of the outer contour, the summation and the standard deviation of $a(\theta)$ and $b(\theta)$ are used. They are defined as

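$$\mathrm{Outer}_1 = \sum_{\theta=1}^{360} a(\theta), \qquad (2)$$

$$\mathrm{Outer}_2 = \mathrm{std}\left[ a(\theta) \right], \qquad (3)$$

$$\mathrm{Outer}_3 = \sum_{\theta=1}^{180} b(\theta), \qquad (4)$$

$$\mathrm{Outer}_4 = \mathrm{std}\left[ b(\theta) \right]. \qquad (5)$$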
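The four outer shape features of (2)-(5) can be sketched in a few lines. The code below is an illustration only: the text does not say how the outer point for each angle is located, so it assumes one-degree angular bins around the centroid with the farthest contour pixel in each bin taken as the outer point, and the function names are hypothetical. Array index 0 corresponds to the first angular bin, so the paper's $\theta = 1, \ldots, 360$ maps to indexes 0 to 359.

```python
import numpy as np


def radial_signature(contour, n_angles=360):
    """Distance a(theta) from the centroid to the outermost contour pixel per angle bin.

    `contour` is a 2-D boolean/0-1 array. Bins with no contour pixel stay at 0
    (an assumption; a real system might interpolate instead).
    """
    ys, xs = np.nonzero(contour)
    dy, dx = ys - ys.mean(), xs - xs.mean()
    dist = np.hypot(dx, dy)
    ang = np.degrees(np.arctan2(dy, dx)) % 360.0
    bins = np.floor(ang * n_angles / 360.0).astype(int) % n_angles
    a = np.zeros(n_angles)
    np.maximum.at(a, bins, dist)  # keep the largest distance seen in each bin
    return a


def outer_features(a):
    """Return [Outer1, Outer2, Outer3, Outer4] from a(theta), following (1)-(5)."""
    half = len(a) // 2
    b = a[:half] - a[half:]  # b(theta) = a(theta) - a(theta + 180)
    return np.array([a.sum(), a.std(), b.sum(), b.std()])
```

For a contour that is symmetric about its centroid, $b(\theta)$ is close to zero everywhere, so the third and fourth features are small.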

V. INTERIOR SHAPE FEATURE EXTRACTION

In contrast to the outer shape features, the interior shape features capture the fine details of the trademark patterns. Wavelets [10-14] are used to characterize the directional information of the trademark patterns. In addition, statistical features of the edge images are extracted.

A. Directionality Information

For the standardized contour image, a two-dimensional wavelet transform is used to separate the image into subbands that carry different directional information. For example, the HL subband contains the horizontal edge information, the LH subband contains the vertical edge information and the HH subband contains the diagonal edge information. The diagonal edge information is further analyzed in the pi/4 and 3pi/4 directions.

For the horizontal directionality, let h(j,k) be the HL subband, where j and k are the row and column indexes respectively. The horizontal directionality is defined as [1],

$$\mathrm{Dir}_H = \sum_{j=1}^{M} \sum_{k=1}^{N} \left( \frac{h(j,k)}{\sum_{k'=1}^{N} h(j,k')} \right)^{2}. \qquad (6)$$

Similarly, for the vertical directionality, let v(j,k) be the values of the LH subband, where j and k are the row and column indexes respectively. The vertical directionality is defined as [1],

$$\mathrm{Dir}_V = \sum_{j=1}^{M} \sum_{k=1}^{N} \left( \frac{v(j,k)}{\sum_{k'=1}^{N} v(j,k')} \right)^{2}. \qquad (7)$$
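Equations (6) and (7) share the same form, so one helper can serve both subbands. The sketch below reads the definition as the sum, over every coefficient, of the squared ratio between the coefficient and the sum of its row. That reading, the function name `directionality`, and the handling of rows whose sum is zero (they are skipped) are assumptions made for illustration.

```python
import numpy as np


def directionality(subband, eps=1e-12):
    """Sum of squared (coefficient / row sum) over an M x N subband.

    Rows whose sum is numerically zero are skipped to avoid division by zero
    (an assumption; the text does not discuss this case).
    """
    sub = np.asarray(subband, dtype=float)
    row_sum = sub.sum(axis=1, keepdims=True)
    valid = np.abs(row_sum[:, 0]) > eps
    ratio = sub[valid] / row_sum[valid]
    return float((ratio ** 2).sum())
```

Under this reading, a row whose energy is spread evenly over N coefficients contributes 1/N, while a row dominated by a single coefficient contributes close to 1.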

In the HH subband, the image passes through a high-pass filter along both rows and columns, which extracts the diagonal edge information. To further analyze this subband in the pi/4 and 3pi/4 directions, the diagonal directionality of 3pi/4 is calculated by summing the squared value of every point in the HH subband over the diagonal sum in the pi/4 direction, while the diagonal directionality of pi/4 is calculated by summing the squared value of every point over the diagonal sum in the 3pi/4 direction.
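One possible reading of the diagonal case is sketched below: each coefficient is divided by the sum of the diagonal it lies on, and the squared ratios are accumulated over the whole subband. This interpretation is an assumption, as is the hypothetical function `diagonal_directionality`. Which of the two diagonal orientations corresponds to pi/4 and which to 3pi/4 depends on the image axis convention, which the text does not state, so the code only distinguishes main diagonals from anti-diagonals.

```python
import numpy as np


def diagonal_directionality(subband, anti=False, eps=1e-12):
    """Sum of squared (coefficient / diagonal sum) over all diagonals of a subband.

    anti=False walks top-left to bottom-right diagonals; anti=True walks the
    anti-diagonals. Diagonals whose sum is numerically zero are skipped.
    """
    sub = np.asarray(subband, dtype=float)
    if anti:
        sub = np.fliplr(sub)  # anti-diagonals become main diagonals
    rows, cols = sub.shape
    total = 0.0
    for offset in range(-(rows - 1), cols):
        diag = np.diagonal(sub, offset=offset)
        s = diag.sum()
        if abs(s) > eps:
            total += float(((diag / s) ** 2).sum())
    return total
```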

B. Moment invariants

To characterize the interior shape pattern of the trademark, the first- and second-order improved moment invariants on the boundary [15] are used in the retrieval system. The improved version is very similar to the Hu moment invariants [16], but it is computed on the boundary image, whereas the Hu moment invariants are computed on the massive image, i.e., the original image that contains both the contour and the interior region. More information can be found in [17].
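A rough sketch of two boundary moment invariants is given below. Several points are assumptions rather than details taken from the paper: the two values are taken to be the first two Hu-type invariants, each contour pixel is treated as a unit sample of the boundary, and the central moments are normalized by the zeroth moment raised to the power p + q + 1, which is the boundary normalization as commonly described for the improved invariants (region moments use (p + q)/2 + 1 instead). The function name is hypothetical.

```python
import numpy as np


def boundary_moment_invariants(contour):
    """Return (phi1, phi2) computed from the contour pixels of a 2-D binary image."""
    ys, xs = np.nonzero(contour)
    x = xs - xs.mean()  # central coordinates give translation invariance
    y = ys - ys.mean()
    m00 = float(len(xs))  # boundary length approximated by the pixel count

    def eta(p, q):
        # Assumed boundary normalization: divide by m00 ** (p + q + 1).
        return float((x ** p * y ** q).sum()) / m00 ** (p + q + 1)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4.0 * eta(1, 1) ** 2
    return phi1, phi2
```

Because the boundary is sampled on a pixel grid, scale invariance holds only approximately in this sketch.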

VI. TWO-STAGE FEATURE MATCHING

During a query, the input trademark undergoes the same processes as the trademarks used to build the database, namely pre-image processing and feature extraction. In total, there are ten indexes for retrieval: four for the outer shape description and six for the interior shape description. A two-stage feature matching is adopted.

A. Outer shape similarity

There are four outer shape features for each trademark. Let the outer shape indexes of image m in the database and of the query trademark be database_mO(k) and inO(k) respectively, for k = 1, 2, 3, 4. Their similarity is calculated as,

$$D_{outer} = \frac{1}{4} \sum_{k=1}^{4} \frac{\left( inO(k) - database_m O(k) \right)}{\max(database_m O(k)) - \min(database_m O(k))} \qquad (8)$$

where max(databasemO(k)) and min(databasemO(k)) are the maximum and minimum values of that feature descriptor over all the images in the database.
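The measure in (8) is a range-normalized difference averaged over the features, and the same form applies to the six interior shape features. A minimal sketch follows. It assumes that the per-feature difference is taken in absolute value, so that the result is non-negative; the extracted equation does not show this explicitly. The function name and the guard for features that are constant over the database are also assumptions.

```python
import numpy as np


def normalized_distance(query, database):
    """Mean range-normalized absolute difference between a query and each database row.

    query:    array of shape (K,)   - feature vector of the query image
    database: array of shape (n, K) - one feature vector per database image
    Returns an array of shape (n,), one distance per database image.
    """
    query = np.asarray(query, dtype=float)
    database = np.asarray(database, dtype=float)
    span = database.max(axis=0) - database.min(axis=0)
    span = np.where(span == 0, 1.0, span)  # avoid dividing by zero for constant features
    return np.mean(np.abs(query - database) / span, axis=1)
```

Dividing by the range of each feature over the database keeps features with large numeric values, such as the summation in (2), from dominating the result.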

B. Interior shape similarity

There are six interior shape features for each trademark. Let the interior shape indexes of image m in the database and of the query trademark be database_mI(k) and inI(k) respectively, for k = 1, 2, …, 6. Their similarity is calculated as,

$$D_{interior} = \frac{1}{6} \sum_{k=1}^{6} \frac{\left( inI(k) - database_m I(k) \right)}{\max(database_m I(k)) - \min(database_m I(k))} \qquad (9)$$

where max(databasemI(k)) and min(databasemI(k)) are the maximum and minimum values of that feature descriptor over all the images in the database.

Douter is used to classify the trademarks into sets according to their outer shape similarity. Eight ranges of Douter are used. The first seven sets each span an interval of 0.5, i.e., set 1 ranges from 0 to 0.5, set 2 from 0.5 to 1, and so on. Images with Douter larger than 3.5 are classified into set 8. Within each set, the trademarks undergo a second-stage similarity matching: the sum of Douter and Dinterior is used to rank the trademarks in the set.
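The two-stage ranking can be sketched as a sort with two keys: the set index derived from Douter, then the sum of Douter and Dinterior within each set. The code below is an illustration with a hypothetical function name. The text does not say which set a value exactly on a boundary (for example 0.5) belongs to, so the sketch assigns it to the higher set.

```python
import numpy as np


def two_stage_rank(d_outer, d_interior, width=0.5, n_sets=8):
    """Return database indexes ordered by set (stage 1), then by Douter + Dinterior (stage 2)."""
    d_outer = np.asarray(d_outer, dtype=float)
    d_interior = np.asarray(d_interior, dtype=float)
    # Sets 0..6 cover [0, 0.5), [0.5, 1.0), ...; everything beyond goes to the last set.
    set_idx = np.minimum((d_outer // width).astype(int), n_sets - 1)
    combined = d_outer + d_interior
    # np.lexsort uses the LAST key as the primary key.
    return np.lexsort((combined, set_idx))
```

A single-stage ranking for comparison is simply `np.argsort(d_outer + d_interior)`, which lets an image with a very similar interior overtake one with a more similar outer shape.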

VII. EXPERIMENTAL RESULTS

In this section, we first introduce the trademark database used. The results of the proposed method are then presented in three parts: the results for outer shape matching only, the results for the two-stage feature matching, and the results for combined feature matching without the two-stage scheme. Recall and precision were computed, and some query examples are shown.

A. Trademark Database

The database consists of 308 trademark images, which are classified into 7 classes: circle, ellipse, rectangle, triangle, square, shield and words. The resolution of the database images is larger than 300x300 pixels. For low-resolution images, Canny edge detection may fail to detect the contour of the trademark correctly.

B. Outer Shape Matching Results

Our proposed retrieval system consists of two stages. The first stage is mainly responsible for classifying the trademarks into sets according to the similarity of their outer shape. Table I shows some retrieval examples based on the first-stage outer shape matching.

C. Two-stage Matching Results

As can be seen from Table I, trademarks are matched on their outer shape similarity only in the first stage, and detailed trademark patterns are ignored. To refine the retrieval results, the trademarks in each set are re-ranked using both their outer and interior shape similarities. In this way, the two-stage approach makes the retrieved results similar to the query in both outer shape and interior shape. Table II shows an example. With outer shape feature matching, the first six retrieved images are all circular, but the images at ranks 3, 4 and 6 clearly differ greatly from the query image in interior pattern. The two-stage retrieval approach improves the results, as shown in the second row of Table II: images with a large difference in interior shape features are moved to lower ranks, while images with similar interior features are moved to higher ranks.

A precision-recall graph was plotted to evaluate the performance of the proposed system. As can be seen in Fig. 1, the performance for the classes circle, rectangle, triangle, square and words is good. For the ellipse class, the width-to-height ratio of the ellipse varies within the class. This reduces the similarity among images of the same class and leads to a relatively poor result. For the shield class, the outer shape of the trademarks varies considerably within the class, and this large variation degrades the retrieval performance.
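For reference, precision and recall for one query can be computed from the ranked list of retrieved images, using the standard definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that have been retrieved. The helper below, with the hypothetical name `precision_at_recall`, reports the precision at the first rank where a target recall is reached; treating images of the same class as the relevant ones is an assumption about the evaluation protocol.

```python
def precision_at_recall(ranked_relevant, n_relevant, target_recall):
    """Precision at the first rank where recall reaches target_recall.

    ranked_relevant: iterable of booleans, True where the image at that rank
                     is relevant to the query.
    n_relevant:      total number of relevant images in the database.
    target_recall:   recall level as a fraction in (0, 1].
    Returns None if the target recall is never reached.
    """
    hits = 0
    for rank, is_relevant in enumerate(ranked_relevant, start=1):
        hits += bool(is_relevant)
        if hits / n_relevant >= target_recall:
            return hits / rank
    return None
```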

D. Results for Single Stage Approach using Combined Feature Matching

To show the effectiveness of the proposed two-stage approach, a comparative study was carried out against a single-stage retrieval system using both outer shape features and interior shape features. The features used in the two-stage and single-stage approaches were the same. The only difference was in the similarity measure: the single-stage approach simply combined Douter and Dinterior to rank the trademark images, whereas the two-stage approach first used Douter to classify the trademarks into sets and then used both Douter and Dinterior to rank the trademarks within each set.

TABLE I

SOME RETRIEVAL EXAMPLES BASED ON OUTER SHAPE MATCHING

TABLE II

EXAMPLE ON COMPARING THE RETRIEVAL RESULTS ON OUTER SHAPE MATCHING AND TWO-STAGE MATCHING

Fig. 1 Precision-recall graph for the proposed two-stage feature matching approach


Table III compares the precision of the two-stage matching and the single-stage matching approaches at a recall of 20. The precision of the single-stage approach is lower than that of the two-stage approach in six of the seven classes, the square class being the exception. On average, the two-stage approach has a precision of 70.1% while the single-stage approach has a precision of 66.6%. In the single-stage approach, the similarity differences for the outer shape features and the interior shape features may interfere with each other in producing the final retrieved trademarks, which deteriorates the retrieval performance.

TABLE III

PRECISION OF VARIOUS CLASSES AT A RECALL OF 20

     Class name    Two-stage matching    Single-stage matching    Improvement
                   precision (%)         precision (%)
1    Circle        86                    79                       8.9%
2    Ellipse       55                    54                       1.9%
3    Rectangle     72                    66                       9.1%
4    Triangle      72                    64                       12.5%
5    Square        73                    75                       -2.7%
6    Shield        52                    50                       4%
7    Words         81                    79                       2.5%
     Average       70.1                  66.6                     5.3%

VIII. CONCLUSIONS

In this paper, we have proposed a novel two-stage approach for content-based trademark retrieval. A chain of pre-image processing steps makes the retrieval system invariant to rotation, translation and scaling. The features used for matching are of two types: outer shape descriptors and interior shape descriptors. For the outer shape descriptors, four values are extracted from the outer points detected at each angle with respect to the centroid. These four values give a general description of the outer shape boundary. For the interior shape features, the contour of the trademark is analyzed by wavelets to extract shape features in four directions, i.e., horizontal, vertical, pi/4 diagonal and 3pi/4 diagonal. In addition, two moment invariant values are extracted from the standardized contour to measure the distribution and symmetry of the trademark. With the two-stage retrieval method, trademarks that are similar in both outer shape and interior shape can be retrieved effectively.

ACKNOWLEDGMENT

This work is supported by the Centre for Signal Processing (1-BB9F), Department of Electronic and Information Engineering, the Hong Kong Polytechnic University, and the CERG Grant (PolyU 5215/08E) of the Hong Kong SAR Government. Chan Lit Hung acknowledges the research studentship provided by the University.

REFERENCES

[1] Muwei Jian, Liang Xu, "Trademark Image Retrieval Using Wavelet-based Shape Features", 2008 International Symposium on Intelligent Information Technology Application Workshops, pp 496-500, 2008.

[2] Yong-jiao Wang, Chun-feng Zheng, "Trademark Image Retrieval Based on Shape and Key Local Color Features", Second International Conference on Information and Computing Science, vol. 2, pp 325-328, 2009.

[3] P. Kwan, K. Toraichi, H. Kitagawa and K. Kameyama, "Approximate Query Processing for a Content-Based Image Retrieval Method", In: V. Malik et al. (Eds.): International Conference on Database and Expert Systems Applications 2003, Springer-Verlag, pp 517-526, 2003.

[4] M.H. Huang, C.H. Hsieh, C.M. Kuo, "An Efficient Two-Stage Trademark Retrieval System", International Computer Symposium, pp 214-219, 2004.

[5] Yong-Sung Kim, Whoi-Yul Kim, "Content-based trademark retrieval system using visually salient features", Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition (CVPR '97), pp 307-312, June 1997.

[6] Wing Ho Leung and Tsuhan Chen, "Trademark Retrieval Using Contour-Skeleton Stroke Classification", IEEE International Conference on Multimedia and Expo, vol. 2, pp 517-520, 2002.

[7] Chotirat Ratanamahatana, Eamonn Keogh, Vit Niennattrakul, "Making Image retrieval and Classification more accurate using time series and learned constraints", Artificial Intelligence for Maximizing Content Based Image Retrieval, pp 145-166, 2009.

[8] M. Munich and P. Perona, "Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification", in International Conference on Computer Vision (ICCV), pp 108-115, 1999.

[9] R. Manmatha and T. M. Rath, "Indexing of Handwritten Historical Documents - Recent Progress", Proc. of the 2003 Symposium on Document Image Understanding Technology (SDIUT), Greenbelt, pp 77-85, April 2003.

[10] Stefania Ardizzoni, Ilaria Bartolini, Marco Patella, "Windsurf: Region-Based Image Retrieval Using Wavelets", Proceedings of the 10th International Workshop on Database & Expert Systems Applications, pp 167-173, 1999.

[11] B.F. Guo and J.M. Jiang, "A modified shape descriptor in wavelets compressed domain", International Conference on Image Processing, pp 936-939, 2002.

[12] H-M. Zhang, Q-H. Wang, Y-X. Kan, J-H. Liu and Y-W. Gong, "Researches on Hierarchical Image Retrieval Model Based on Wavelet Descriptor and Indexed by Half-Axes-Angles using R-tree", International Conference on Machine Learning and Cybernetics, Dalian, 2006.

[13] AV Sutagundar, "Wavelet Based Image Indexing and Retrieval", Emerging Trends in Engineering and Technology, pp 52-55, 2008.

[14] M. Jian, "New Texture Features Based on Wavelet Transform Coinciding with Human Visual Perception", Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, vol. 1, pp 369-373, 2007.

[15] C.C. Chen, "Improved moment invariants for shape discrimination", Pattern Recognition, pp 683-686, 1993.

[16] M.K. Hu, "Visual Pattern Recognition by Moment Invariants", IRE Trans. Info. Theory, vol. 8, pp 179-187, 1962.

[17] Javier Montenegro Joo, "Improved Moment Invariants Know How, Why And When", Revista de Investigacion de Fisica, vol. 8, pp 82-90, 2005.

