
Skew correction of business card images acquired in PDA


J.H. Park, I.H. Jang and N.C. Kim

Abstract: An efficient algorithm for rotational skew correction of business card images acquired by a PDA (personal digital assistant) camera is presented. The proposed method is composed of four parts: block adaptive binarisation (BAB), stripe generation, skew angle calculation and image rotation. In BAB, an input image is binarised block by block so as to lessen the effects of irregular illumination and shadow over the input image. In stripe generation, character string clusters are generated by merging adjacent characters and their strings, and then only clusters useful for skew angle calculation are output as stripes. In skew angle calculation, the direction angles of the stripes are calculated using their central moments and then the skew angle of the input image is determined by averaging the direction angles. In image rotation, the input image is rotated by the skew angle. Experimental results show that the proposed method yields a root mean square error of 0.44° for test images of several types of business cards acquired by a PDA under various surrounding conditions.

1 Introduction

Business cards are now widely used as a means of advertising, not only by businessmen but also by other people such as teachers and engineers. It has therefore become common for these people to exchange business cards when they first meet. Accordingly, they accumulate many business cards and need to manage them efficiently instead of carrying all of them. Up to now, most of these people keep a business card by putting it directly in a book of business cards, making a note of its information in a memo pad, or manually inputting its information into a computer. Such a management approach, however, is quite inefficient because it requires time and effort for manual searching among many business cards or manual inputting of the information.

Recently, the use of mobile Internet terminals such as personal digital assistants (PDAs) has increased since their mobility and portability are well suited to modern life. A PDA can easily obtain an image of a business card by digitising it with its built-in camera. Furthermore, recognising characters in the image on the PDA can make the management of the business card information efficient. For example, if required or permitted, the PDA can transmit the information of the business card immediately via a wireless network and receive information related to the owner of the business card.

However, the images obtained using a PDA may be severely skewed because handheld photographing makes it difficult to align the camera with its object. It is well known that the presence of skew in a digitised document image makes it difficult to analyse the document and recognise the characters in the image [1]. The accuracy of document analysis and character recognition decreases as the skew angle increases, and it deteriorates seriously if the angle is greater than 5°. It therefore follows that skew correction of a document image may be an essential preprocessing step for document analysis and character recognition.

Many algorithms have been proposed for skew correction of document images [2–8]. Dengel [2] proposed a method of skew correction for document images based on a left margin search. In the method, the position of the first black pixel in each row of a document image is searched. Using these positions, a straight line that represents the right boundary of the left margin of the document is determined. However, this method may not detect the skew angle correctly if the image has picture regions in its left margin.

Le et al. [3] proposed a skew correction method using a Hough transform. One of its features is to reduce the effects of non-textual data on the skew angle detection in order to increase the detection accuracy. For this purpose, the entire image is divided into squares of a certain size and then each square is classified as either a textual square or a non-textual one. All black pixels are removed except the pixels in the vertically lowest position of each character in the textual squares. The Hough transform is applied to these remaining pixels to obtain the skew angle. This method yields better performance than those without classification of textual and non-textual regions. However, it cannot determine the skew angle correctly for documents in languages where the components of each character are separated, as in Korean.

Yan [4] proposed a skew angle detection method using cross-correlation between pairs of vertical lines over an entire document image. Chaudhuri et al. [5] improved Yan’s method. Instead of finding the correlation over an entire image, they calculated the correlations over small regions selected randomly and found the maximum correlation among them. Then they calculated the skew angle from the relation between the distance between two lines and the distance corresponding to the maximum correlation. The method yields good performance on document images with many character strings. Its performance, however, deteriorates for document images with pictures and/or tables or for those with textured backgrounds. Also, a skew correction method for document images using a projection is found in [6], one using a Fourier transform is found in [7], and one based on straight-line fitting is found in [8]. The methods [2–8] mentioned above are mainly for rotational skew correction.

Most conventional skew correction methods have been studied for document images. Since document images are usually acquired by high-resolution scanners and have many character strings, it is relatively easy to perform their skew correction. On the other hand, business card images acquired by low-resolution cameras in handheld PDAs may often have irregular illumination and shadow. In addition, they have a number of character strings in irregular positions and sometimes contain pictures, patterns and textures in the background. It is thus not easy to perform skew correction of business card images acquired in a PDA by using conventional skew correction methods for document images.

Related to our study, some recent methods of skew correction for business card images acquired in scanners and those for document images in handheld cameras can be found in [9–14]. In [9], a method of skew correction for business card images based on a two-stage Hough transform was proposed. Examples of products that can do skew correction for business card images are found in [10] and [11]. Perspective skew corrections of document images acquired in handheld cameras were proposed in [12] and [13] and that of texts in 3D scenes with handheld cameras in [14].

In this paper, we propose an efficient algorithm for rotational skew correction of business card images acquired in a PDA, which may be captured in an ill-conditioned environment. The proposed method is composed of four parts: block adaptive binarisation (BAB), stripe generation, skew angle calculation and image rotation. In BAB, an input image is binarised block by block so as to lessen the effect of irregular illumination and shadow over the input image. The input image is first partitioned into 8 × 8 blocks and each block is then classified as a character block (CB) or a background block (BB). Each CB is grouped with its eight adjacent blocks so that a 24 × 24 block is formed, and a threshold for binarisation is obtained by applying Otsu’s method [15] to the 24 × 24 block. The CB is binarised with the threshold.

In stripe generation, clusters of character strings are generated by merging adjacent characters and their strings. Then only clusters above a certain size and eccentricity, which represents how elongated an object is, are output as stripes. In skew angle calculation, the direction angles of the stripes are calculated using their central moments and then the skew angle of the input image is determined by averaging the direction angles. In image rotation, the input image is rotated by the skew angle. To evaluate the performance of the proposed method, it is applied to test images of several types of business cards acquired by a PDA camera under various surrounding conditions, and its performance is compared with that of Chaudhuri’s method [5]. Simulation results show that the proposed method yields good skew correction.

© IEE, 2005

IEE Proceedings online no. 20045015

doi: 10.1049/ip-vis:20045015

J.H. Park is with Samsung Electronics Co. Ltd., S/W Laboratory, R&D Group 3, Mobile Communication Division, Gumi, 730-350, Korea

I.H. Jang is with the Kyungwoon University, Department of Electronic Engineering, Gumi, 730-850, Korea

N.C. Kim is with the Kyungpook National University, Department of Electronic Engineering, Daegu, 702-701, Korea

E-mail: [email protected]

Paper first received 30th April 2004 and in revised form 22nd February 2005

IEE Proc.-Vis. Image Signal Process., Vol. 152, No. 6, December 2005, pp. 668–676

2 Proposed skew correction method

Business cards generally contain character strings such as name, affiliation, address, phone number and e-mail address, which are parallel to the major axes of the business cards. In the proposed method, we regard the angle of a character string to the horizontal axis as the skew angle of an input image and try to find that angle. Since one does not usually skew a business card intentionally during digitisation, we assume that an input image is rotationally skewed by an angle whose absolute value is less than or equal to 35° and has no serious perspective skew. Figure 1 shows a block diagram of the proposed skew correction method. The proposed method is composed of four parts: BAB, stripe generation, skew angle calculation and image rotation.

2.1 Block adaptive binarisation

When an image is binarised with just one threshold, it is called global binarisation (GB). Figure 2 shows business card images of 640 × 480 pixels obtained by a PDA camera and the images resulting from GB with a threshold chosen by Otsu’s threshold selection method [15]. The business card images in Figs. 2a and b are ill-conditioned images with irregular illumination and shadow, respectively. As shown in the binary images in Figs. 2c and d, when a business card image is binarised by GB, some character strings may not appear because of irregular illumination and shadow. To overcome this problem, a BAB that can perform well even for ill-conditioned images is proposed.

Figure 3 shows the BAB procedure. First, an input image is partitioned into 8 × 8 blocks and then the blocks are classified into CBs and BBs. The key idea of the block classification is that most CBs have higher activities than most BBs. A block is determined as a CB if its activity is higher than a threshold. In the decision, the block activity is defined as the sum of the absolute values of the low-frequency DCT coefficients, that is:

$$E_k = \sum_{u} \sum_{v} \left| D^k_{uv} \right|, \qquad 0 \le u + v \le 3, \quad (u, v) \ne (0, 0) \tag{1}$$

where $D^k_{uv}$ denotes the DCT coefficient of the frequency $(u, v)$ at the kth block. So the classification of the kth block can be represented as

$$O_C = \{ k \mid E_k \ge Th_B \} \tag{2}$$

$$O_B = O - O_C \tag{3}$$

where $O$, $O_C$ and $O_B$ denote the index sets of all blocks, CBs and BBs, respectively, and $Th_B$ denotes a threshold. In this paper, $Th_B$ is determined as the average of $E_k$ over the entire image. Figure 4 shows the resulting images of the block classification for the images in Figs. 2a and b. In Fig. 4, grey parts represent CBs and black ones BBs. We can see from Fig. 4 that the blocks in the images are well classified.
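The block classification of equations (1) to (3) can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the orthonormal DCT-II normalisation and the names `dct2_coeff`, `block_activity` and `classify_blocks` are assumptions, since the paper does not specify them.

```python
import math

def dct2_coeff(block, u, v):
    """Orthonormal 2-D DCT-II coefficient (u, v) of a square block.
    The normalisation is an assumption; the paper does not state one."""
    n = len(block)
    cu = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    cv = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += (block[i][j]
                      * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                      * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
    return cu * cv * total

def block_activity(block):
    """E_k of Eq. (1): sum of |D_uv| over 0 <= u + v <= 3, excluding (0, 0)."""
    return sum(abs(dct2_coeff(block, u, v))
               for u in range(4) for v in range(4 - u)
               if (u, v) != (0, 0))

def classify_blocks(image, size=8):
    """Return (cb, activity); cb[r][c] is True when block (r, c) is a CB."""
    rows, cols = len(image) // size, len(image[0]) // size
    activity = [[block_activity([row[c * size:(c + 1) * size]
                                 for row in image[r * size:(r + 1) * size]])
                 for c in range(cols)] for r in range(rows)]
    flat = [e for row in activity for e in row]
    th_b = sum(flat) / len(flat)            # Th_B: mean activity over the image
    cb = [[e >= th_b for e in row] for row in activity]   # Eq. (2)
    return cb, activity
```

Excluding the DC term makes the activity insensitive to the mean brightness of a block, which is why a flat background block scores close to zero even under uneven illumination.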

For the kth CB, we find a threshold $Th_k$ for the 24 × 24 block into which the CB and its eight adjacent blocks are grouped. The original kth 8 × 8 block $u_k(i, j)$ is binarised with $Th_k$ as

$$b_k(i, j) = \begin{cases} 1, & \text{if } u_k(i, j) \le Th_k \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

Finally, a binary image is obtained by tiling the binarised 8 × 8 blocks in their positions. Figure 5 shows the images obtained by BAB for the images in Figs. 2a and b. Comparing the binary images in Figs. 2c and d with those in Fig. 5, one can see that the character strings in the binary images by BAB are detected much better than those by GB, regardless of irregular illumination and shadow.
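The per-block thresholding step can be sketched as below. This is a hedged illustration, not the published implementation: the function names, the clipping of the 24 × 24 neighbourhood at the image border, and the `dark_text` flag (which marks pixels at or below the threshold as 1, on the assumption that characters are darker than the card) are all choices made here for the example.

```python
def otsu_threshold(pixels, levels=256):
    """Otsu's method: grey level t that maximises the between-class variance
    when pixels <= t form one class and pixels > t form the other."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(g * h for g, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue                      # one class is empty at this level
        mean0, mean1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarise_block(image, r, c, size=8, dark_text=True):
    """Binarise the size x size block at block index (r, c) with an Otsu
    threshold computed over the block and its (up to) eight neighbours."""
    h, w = len(image), len(image[0])
    r0, r1 = max(0, (r - 1) * size), min(h, (r + 2) * size)
    c0, c1 = max(0, (c - 1) * size), min(w, (c + 2) * size)
    th = otsu_threshold([image[i][j]
                         for i in range(r0, r1) for j in range(c0, c1)])
    out = []
    for i in range(r * size, (r + 1) * size):
        row = image[i][c * size:(c + 1) * size]
        out.append([int(p <= th) if dark_text else int(p > th) for p in row])
    return out
```

Computing the threshold over the larger neighbourhood while binarising only the central block gives Otsu's method enough pixels of both classes even when a single 8 × 8 block is almost entirely stroke or entirely background.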

2.2 Stripe generation

In stripe generation, separated character strings are first converted into clusters of character strings. This procedure is called cluster generation. Then only elongated clusters appropriate for the direction angle calculation are selected as stripes. In cluster generation, horizontal subsampling of the binary image is first performed. Characters and their strings in the subsampled binary image are then merged into elongated clusters by using morphological dilation [16] with a structuring element of size 5 × 5. Even though the line spacings between vertically adjacent clusters are usually larger than the spaces between the characters and their strings, the clusters may be merged by the dilation. To avoid such unwanted merging, morphological erosion [16] with a structuring element of size 3 × 3, whose effect is weaker than that of the dilation, is applied. Vertical subsampling is next performed to maintain the ratio of the horizontal size to the vertical size of the input image. Figure 6 shows the resulting images of cluster generation for the binary images in Fig. 5. As shown in Fig. 6, most character strings are converted to clusters. However, there are some undesired clusters that are not elongated, such as marks and noise blocks.
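The cluster generation steps above can be sketched in plain Python as follows. The subsampling factor `step=2`, the square shape of the structuring elements, and the treatment of pixels outside the image as 0 are assumptions made for this example; the paper gives only the element sizes.

```python
def morph(img, k, op):
    """Apply a k x k square structuring element to a 0/1 image.
    op=max gives dilation, op=min gives erosion; pixels outside the
    image are treated as 0 (an assumption)."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = op(
                img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                for yy in range(y - r, y + r + 1)
                for xx in range(x - r, x + r + 1))
    return out

def generate_clusters(binary, step=2):
    """Horizontal subsampling, 5 x 5 dilation, 3 x 3 erosion, then
    vertical subsampling, in the order described in the text."""
    sub_h = [row[::step] for row in binary]   # horizontal subsampling
    dilated = morph(sub_h, 5, max)            # merge neighbouring characters
    eroded = morph(dilated, 3, min)           # pull text lines apart again
    return eroded[::step]                     # vertical subsampling
```

Subsampling horizontally before the dilation makes the square element act like a wider one along the text direction, so characters on the same line merge more readily than adjacent lines do.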

Fig. 1 Block diagram of the proposed skew correction method

Fig. 2 Business card images obtained by PDA camera and resulting GB images
a Input image 1
b Input image 2
c Binary image for input image 1
d Binary image for input image 2

Fig. 3 Procedure for BAB

To remove the undesired clusters, we adopt the criteria of cluster size $m_{00}$ and eccentricity $e$ (which represents how elongated an object is), which are defined as follows [16]:

$$m_{pq} = \sum_{x} \sum_{y} (x - \bar{x})^p (y - \bar{y})^q, \qquad p, q = 0, 1, 2, \ldots \tag{5}$$

$$e = \frac{4 m_{11}^2 + (m_{20} - m_{02})^2}{(m_{20} + m_{02})^2} \tag{6}$$

where $\bar{x}$ and $\bar{y}$ denote the horizontal and vertical centroids, respectively. A cluster is selected as a stripe if its eccentricity $e$ and size $m_{00}$ satisfy the conditions $e \ge e_{th}$ and $m_L \le m_{00} \le m_H$, where $e_{th}$ denotes a threshold for the eccentricity $e$, and $m_L$ and $m_H$ are thresholds for the cluster size $m_{00}$. Figure 7 shows the resulting images where stripes are found from the images in Fig. 6. In Fig. 7, undesired clusters such as marks and noise blocks are well removed.
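The stripe selection test can be written compactly as below, assuming each cluster is available as a list of (x, y) pixel coordinates. The threshold values `e_th`, `m_low` and `m_high` are hypothetical placeholders; the paper does not report the values it used.

```python
def central_moment(pixels, p, q):
    """m_pq of Eq. (5) for a cluster given as a list of (x, y) coordinates."""
    n = len(pixels)
    xc = sum(x for x, _ in pixels) / n
    yc = sum(y for _, y in pixels) / n
    return sum((x - xc) ** p * (y - yc) ** q for x, y in pixels)

def eccentricity(pixels):
    """e of Eq. (6): close to 1 for a thin line, close to 0 for a square."""
    m20 = central_moment(pixels, 2, 0)
    m02 = central_moment(pixels, 0, 2)
    m11 = central_moment(pixels, 1, 1)
    denom = (m20 + m02) ** 2
    return (4 * m11 ** 2 + (m20 - m02) ** 2) / denom if denom else 0.0

def is_stripe(pixels, e_th=0.9, m_low=20, m_high=5000):
    """Keep a cluster when e >= e_th and m_low <= m_00 <= m_high.
    The three threshold defaults are illustrative assumptions."""
    m00 = central_moment(pixels, 0, 0)      # equals the pixel count
    return eccentricity(pixels) >= e_th and m_low <= m00 <= m_high
```

For example, a 20-pixel horizontal line has $m_{02} = m_{11} = 0$ and therefore $e = 1$, while a filled 5 × 5 square has $m_{20} = m_{02}$ and $m_{11} = 0$, giving $e = 0$.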

2.3 Skew angle calculation

Fig. 4 Resulting images of block classification for images in Figs. 2a and b
a Result for Fig. 2a
b Result for Fig. 2b, where grey parts represent CBs and black ones BBs

Fig. 5 Resulting images of BAB for images in Figs. 2a and b
a Result for Fig. 2a
b Result for Fig. 2b

Fig. 6 Resulting images of cluster generation for images in Fig. 5
a Result for Fig. 5a
b Result for Fig. 5b

In skew angle calculation, the direction angle of each stripe is calculated as [16]:

$$\theta = \frac{1}{2} \arctan\left( \frac{2 m_{11}}{m_{20} - m_{02}} \right) \tag{7}$$

where $\theta$ denotes the direction angle, whose range is $0^\circ \le \theta < 180^\circ$. To estimate the skew angle, the direction angle $\theta$ is first rounded off as:

$$\theta_n = \mathrm{round}(\theta) \bmod 180 \tag{8}$$

where $\theta_n$ denotes the rounded direction angle. The histogram of $\theta_n$, $h(\theta_n)$, is then obtained to find the most frequent occurrence among the rounded direction angles of all the stripes. Next, the most frequent direction angle $\theta_f$ is found as

$$\theta_f = \arg\max_{0 \le \theta_n < 180} \left\{ \frac{1}{3} \sum_{\theta_t = \theta_n - 1}^{\theta_n + 1} h(\theta_t \bmod 180) \right\} \tag{9}$$

Finally, the skew angle $\theta_s$ is estimated as

$$\theta_s = \langle \theta \rangle, \qquad \theta_f - 0.5 \le \theta < \theta_f + 0.5 \tag{10}$$

where $\langle \cdot \rangle$ denotes the average operation.
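A possible reading of equations (7) to (10) is sketched below. Several details are assumptions of this sketch rather than statements from the paper: `atan2` is used to resolve the quadrant so that the angle covers the full 0° to 180° range, rounding is half-up so that bin n collects angles in [n − 0.5, n + 0.5), ties in equation (9) are broken by the centre bin count, offsets are wrapped so that angles just below 180° average correctly with angles just above 0°, and the bin centre is returned when no angle falls in the averaging window.

```python
import math
from collections import Counter

def direction_angle(m11, m20, m02):
    """Eq. (7) in degrees; atan2 resolves the quadrant, result in [0, 180)."""
    return math.degrees(0.5 * math.atan2(2.0 * m11, m20 - m02)) % 180.0

def skew_angle(angles):
    """Eqs (8)-(10) for direction angles given in degrees within [0, 180)."""
    # Eq. (8): round half up so bin n collects angles in [n - 0.5, n + 0.5)
    hist = Counter(int(math.floor(a + 0.5)) % 180 for a in angles)

    def score(n):
        # Eq. (9): 3-bin moving sum (the 1/3 factor does not change the
        # arg max); ties are broken by the centre bin count (an assumption)
        return (sum(hist[(n + d) % 180] for d in (-1, 0, 1)), hist[n])

    theta_f = max(range(180), key=score)
    # Signed offsets from theta_f, wrapped into [-90, 90)
    offsets = [((a - theta_f + 90.0) % 180.0) - 90.0 for a in angles]
    near = [d for d in offsets if -0.5 <= d < 0.5]
    if not near:
        return float(theta_f)               # fallback when the window is empty
    return theta_f + sum(near) / len(near)  # Eq. (10)
```

With stripe angles of 2.2°, 2.4°, 1.9°, 2.0°, 45.0° and 90.3°, the most frequent bin is 2° and the estimate is the mean of the four angles near it, 2.125°; the two outliers do not affect the result.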

2.4 Image rotation

Fig. 7 Resulting images where stripes are found from images in Fig. 6
a Result for Fig. 6a
b Result for Fig. 6b

Fig. 8 Resulting images of image rotation for images in Figs. 2a and b
a Result for Fig. 2a
b Result for Fig. 2b

Fig. 9 Resulting images of corner filling for images in Fig. 8
a Result for Fig. 8a
b Result for Fig. 8b

In image rotation, the input image is rotated by $-\theta_s$ when the estimated skew angle is $\theta_s$. Such an image rotation can be implemented by forward mapping [17]. This rotation, however, may create holes inside the skew-corrected image because it is not a one-to-one mapping. Instead of forward mapping, we adopt the following inverse mapping [18]. In inverse mapping, each pixel $(x', y')$ of an output image is mapped into the point $(x, y)$ of an input image, that is:

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta_r & \sin\theta_r \\ -\sin\theta_r & \cos\theta_r \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} \tag{11}$$

where $\theta_r$ denotes the rotation angle. One problem of this method is that a point $(x, y)$ corresponding to a pixel $(x', y')$ may not fall on a pixel position because of the trigonometric functions. The intensity value of a pixel $(x', y')$ in an output image $g$ can be obtained by bilinear interpolation [19] with the four neighbouring pixels of the point $(x, y)$ in an input image $f$ as follows:

$$\begin{aligned} g(x', y') = {} & (1 - \Delta x)(1 - \Delta y)\, f(m, n) + (1 - \Delta x)\, \Delta y\, f(m, n + 1) \\ & + \Delta x\, (1 - \Delta y)\, f(m + 1, n) + \Delta x\, \Delta y\, f(m + 1, n + 1) \end{aligned} \tag{12}$$

where $m = \mathrm{int}(x)$, $n = \mathrm{int}(y)$, $\Delta x = x - m$, and $\Delta y = y - n$.

Figure 8 shows the rotated images for the images in Figs. 2a and b. As shown in Fig. 8, the rotation creates blank regions at each corner. For better appearance, each blank pixel in the four corners is filled with the nearest non-blank pixel along the horizontal line. Figure 9 shows the resulting images of corner filling for the images in Fig. 8. We can see that the corner-filled images look more natural than the unfilled ones.
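The inverse mapping of equation (11) combined with the interpolation of equation (12) can be sketched as follows. Rotating about the image centre, clamping the neighbour index at the last row and column, and leaving out-of-range pixels at a `blank` value are assumptions of this sketch; corner filling is not included.

```python
import math

def bilinear(f, x, y):
    """Eq. (12): interpolate image f (indexed f[row][col]) at point (x, y)."""
    h, w = len(f), len(f[0])
    m, n = int(math.floor(x)), int(math.floor(y))
    dx, dy = x - m, y - n
    m1, n1 = min(m + 1, w - 1), min(n + 1, h - 1)   # clamp at the border
    return ((1 - dx) * (1 - dy) * f[n][m] + (1 - dx) * dy * f[n1][m]
            + dx * (1 - dy) * f[n][m1] + dx * dy * f[n1][m1])

def rotate_inverse(f, theta_r_deg, blank=0):
    """Rotate f by inverse mapping: every output pixel looks up its source
    point in f, so no holes appear inside the output image."""
    h, w = len(f), len(f[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0   # assumption: rotate about centre
    c = math.cos(math.radians(theta_r_deg))
    s = math.sin(math.radians(theta_r_deg))
    g = [[blank] * w for _ in range(h)]
    for yo in range(h):
        for xo in range(w):
            xp, yp = xo - cx, yo - cy
            x = c * xp + s * yp + cx        # Eq. (11), shifted to the centre
            y = -s * xp + c * yp + cy
            if 0 <= x <= w - 1 and 0 <= y <= h - 1:
                g[yo][xo] = bilinear(f, x, y)
    return g
```

Because the loop runs over output pixels, each one receives exactly one value, which is the property that makes inverse mapping preferable to forward mapping here. Output pixels whose source point lies outside the input stay blank, producing the empty corners mentioned in the text.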

3 Experimental results

To evaluate the performance of the proposed skew correction method, test images of several types of business cards were acquired under various surrounding conditions. For the acquisition of the test images, a PDA, the iPAQ h3950 by Compaq, which has a 400 MHz XScale processor and 64 MB of RAM, was used with its built-in camera, the Nexicam by Navicom, which can obtain images of up to 800 × 600 pixels. The test business cards include ordinary business cards, special business cards with textured surfaces, and those with patterns or pictures. The surrounding conditions can be divided into good conditions and ill conditions, the latter involving irregular illumination, shadow, and complex backgrounds outside the business cards such as wood grain. As a performance measure, the RMSE (root mean square error) between the manually measured skew angle and the estimated skew angle was adopted. In the manual measurement, we drew a line parallel to the major axis of the business card in an input image and calculated its angle by using an image processing tool.

Fig. 10 Images of several types of business cards acquired by PDA under various surrounding conditions
a Ordinary business card on white paper
b Ordinary business card on wood grain desk
c, d Special business cards with patterns on wood grain desk
e Special business card of textured surface on wood grain desk
f Ordinary business card on two-tone desk

Figure 10 shows 640 × 480 images of several types of business cards acquired by the PDA under various surrounding conditions. Figures 10a and b show ordinary business cards on white paper and directly on a wood grain desk, respectively. Figures 10c and d show special business cards with patterns on a wood grain desk, Fig. 10e a special business card of textured surface on a wood grain desk, and Fig. 10f an ordinary business card on a two-tone desk. There are also irregular illuminations and/or shadows in the business card images in Figs. 10b–d. Figure 11 shows the resulting images of the skew correction for the images in Fig. 10 by the proposed skew correction method. It is shown that the proposed method gives good skew correction regardless of the type of business cards and their surrounding conditions.

Fig. 11 Resulting images of skew correction for images in Fig. 10

Next, we compare the performance of the proposed skew correction method with that of Chaudhuri’s method [5]. As mentioned before, Chaudhuri’s method is intended for document images obtained by high-resolution scanners, in which at least half of each image line generally contains characters. Unlike an image acquired by a scanner, a business card image from a PDA camera is affected by its surrounding conditions, so that some character strings may not appear in its binary image obtained by GB, as shown in Fig. 2. For fair comparison, two types of binary images, obtained with GB and with BAB, were adopted as input images for Chaudhuri’s method. Table 1 shows the RMSE performances of Chaudhuri’s method with GB, that with BAB, the proposed method with GB, and the proposed method itself for 200 test business card images. As shown in Table 1, the proposed method itself yields an RMSE of 0.44°, which is an improvement of 15.05–15.66° in RMSE over Chaudhuri’s method for the test images. The proposed method itself also yields an improvement of 11.91° in RMSE over the proposed method with GB.

Table 1: RMSE performances of Chaudhuri’s method and proposed method

| Type of business card | Surrounding condition | Chaudhuri with GB, deg | Chaudhuri with BAB, deg | Proposed with GB¹, deg | Proposed with BAB², deg |
| --- | --- | --- | --- | --- | --- |
| Ordinary | Good | 14.81 | 13.72 | 12.95 | 0.42 |
| Ordinary | Ill | 16.47 | 14.83 | 12.11 | 0.39 |
| Special | Good | 14.62 | 16.84 | 12.55 | 0.42 |
| Special | Ill | 18.24 | 16.95 | 13.83 | 0.52 |
| Total | | 16.10 | 15.49 | 12.35 | 0.44 |

¹ The case in which the BAB is replaced with the GB in the proposed method
² The case of the proposed method itself

To evaluate the effect of the proposed skew correction on character recognition, we measured the character recognition rate, which is defined as

$$\text{character recognition rate} = \frac{\text{number of correctly recognised characters}}{\text{total number of characters}} \tag{13}$$

BAB using a modified quadratic filter [20] was used for binarisation and the OCR (optical character recognition) software FineReader 5.0 [21] for recognition. Table 2 shows the character recognition rates on binary images skew corrected by Chaudhuri’s method with GB, that with BAB, the proposed method with GB, and the proposed method itself for 60 test business card images. As shown in Table 2, the proposed method itself gives a character recognition rate of 88.0%, which is an improvement of 86.9 percentage points over the case of no skew correction. It also gives an improvement of 31.7–38.5 percentage points in character recognition rate over Chaudhuri’s method and 7.6 percentage points over the proposed method with GB. From these results, one can see that the proposed skew correction has a very good effect on character recognition.

We also implemented the proposed algorithm in C++ on the PDA mentioned before under the Windows CE environment. Unlike a PC, the PDA processor does not have an FPU (floating-point unit), so all floating-point operations in the algorithm take a long time on the PDA. To solve this problem, all floating-point operations are converted into integer operations by using rounding and shifting. The processing time of the skew correction for a business card image on the PDA was about 0.7 s. Figure 12 shows an input business card image and its skew-corrected image on the PDA.
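The conversion of floating-point operations into integer operations by rounding and shifting can be illustrated with the coordinate mapping of equation (11). The sketch below is written in Python for consistency with the other examples, whereas the implementation described in the paper was in C++; the 16-bit fractional format and the names `to_fixed` and `rotate_point_fixed` are assumptions, since the paper does not give its number format.

```python
import math

SHIFT = 16                  # assumed number of fractional bits (Q16)
ONE = 1 << SHIFT

def to_fixed(value):
    """Convert a real number to Q16 fixed point with rounding.
    This is done once per image, so the per-pixel loop is integer only."""
    return int(round(value * ONE))

def rotate_point_fixed(xp, yp, cos_q, sin_q):
    """Integer-only version of Eq. (11) for integer coordinates (xp, yp).
    Adding half of the scale before shifting rounds to the nearest integer."""
    half = 1 << (SHIFT - 1)
    x = (cos_q * xp + sin_q * yp + half) >> SHIFT
    y = (-sin_q * xp + cos_q * yp + half) >> SHIFT
    return x, y

# Usage: precompute the two trigonometric constants for a 10 degree rotation
cos_q = to_fixed(math.cos(math.radians(10.0)))
sin_q = to_fixed(math.sin(math.radians(10.0)))
print(rotate_point_fixed(100, 50, cos_q, sin_q))
```

Only the two constants require floating-point evaluation (or a lookup table on a device without an FPU); every per-pixel operation is then a multiply, an add and a shift.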

Table 2: Character recognition rates (%) given by the OCR software FineReader 5.0 on binary images obtained by applying skew correction with Chaudhuri’s method and the proposed method, followed by binarisation with BAB using the modified quadratic filter, to test business card images

| Type of business card | Surrounding condition | No skew correction, % | Chaudhuri with GB, % | Chaudhuri with BAB, % | Proposed with GB, % | Proposed with BAB, % |
| --- | --- | --- | --- | --- | --- | --- |
| Ordinary | Good | 2.4 | 56.9 | 72.6 | 90.9 | 93.1 |
| Ordinary | Ill | 0.7 | 46.4 | 54.3 | 82.3 | 90.1 |
| Special | Good | 1.1 | 45.4 | 45.3 | 85.1 | 85.8 |
| Special | Ill | 0.0 | 49.3 | 53.1 | 63.4 | 83.1 |
| Total | | 1.1 | 49.5 | 56.3 | 80.4 | 88.0 |

Fig. 12 Input business card image and its skew-corrected image on PDA
a Input image
b Skew-corrected image

4 Conclusions

An efficient algorithm has been presented for rotational skew correction of business card images acquired in a PDA, which may be captured under ill-conditioned surroundings. In the proposed method, an input image is binarised block by block so as to lessen the effect of irregular illumination and shadow over the input image, and then character string clusters are generated by merging adjacent characters and their strings. Then only clusters useful for skew angle calculation are output as stripes and the direction angles of the stripes are calculated using central moments. Next, the skew angle of the input image is determined by averaging the most frequent direction angles. Finally, the input image is rotated by the skew angle. The performance of the proposed method has been evaluated for test images of several types of business cards acquired by a PDA camera under various surrounding conditions. Experimental results have demonstrated that the proposed method yields an RMSE of 0.44° for the test images, so that it gives good skew correction regardless of the types of business cards and their surrounding conditions. The extension of our method to the correction of serious perspective skew is left for further study.

5 References

1 Kwag, H.K., Kim, S.H., Jeong, S.H., and Lee, G.S.: ‘Efficient skew estimation and correction algorithm for document images’, Image Vis. Comput., 2002, 20, (1), pp. 25–35

2 Dengel, A.: ‘ANASTASIL: A system for low-level and high-level geometric analysis of printed documents’, in Baird, H.S. et al. (Eds.): ‘Structured document image analysis’ (Springer-Verlag, New York, 1992), pp. 70–99

3 Le, D.X., Thoma, G., and Wechsler, H.: ‘Automated page orientation and skew angle detection for binary document images’, Pattern Recognit., 1994, 27, (10), pp. 1325–1344

4 Yan, H.: ‘Skew correction of document images using interline cross-correlation’, Graph. Models Image Process., 1993, 55, (6), pp. 538–543

5 Chaudhuri, A., and Chaudhuri, S.: ‘Robust detection of skew in document images’, IEEE Trans. Image Process., 1997, 6, (2), pp. 344–349

6 Ciardiello, G., et al.: ‘An experimental system for office document handling and text recognition’. Proc. Int. Conf. on Pattern Recognition, 1988, pp. 739–743

7 Postl, W.: ‘Detection of linear oblique structures and skew scan in digitized documents’. Proc. Int. Conf. on Pattern Recognition, 1986, pp. 687–689

8 Cao, Y., Wang, S., and Li, H.: ‘Skew detection and correction in document images based on straight-line fitting’, Pattern Recognit. Lett., 2003, 24, (12), pp. 1871–1879

9 Pan, W., Jin, J., Shi, G., and Wang, Q.R.: ‘A system for automatic Chinese business card recognition’. Proc. IEEE Int. Conf. on Document Analysis and Recogn., 2001, pp. 577–581

10 Canon USA Inc., 1999, DR-5020 High Speed Document Scanner (Online); available at: http://www.usa.canon.com/html/ibeCCtdItemDetail.jsp?section=10042 &item=517

11 Xerox Corp., 2004, Xerox DocuMate 262 (Online); available at: http://www.xeroxscanners.com/default.asp?pageid=128

12 Pilu, M.: ‘Extraction of illusory linear clues in perspectively skewed documents’. Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition, 2001, pp. 363–368

13 Pilu, M., and Pollard, S.: ‘A light-weight text image processing method for handheld embedded cameras’. Proc. British Machine Vision Conf., 2002, pp. 547–556

14 Clark, P., and Mirmehdi, M.: ‘Rectifying perspective views of text in 3D scenes using vanishing points’, Pattern Recognit., 2003, 36, (11), pp. 2673–2686

15 Otsu, N.: ‘A threshold selection method from gray-level histograms’, IEEE Trans. Syst. Man Cybern., 1979, SMC-9, (1), pp. 62–66

16 Gonzalez, R.C., and Woods, R.E.: ‘Digital image processing’ (Prentice-Hall, Upper Saddle River, NJ, USA, 2002)

17 Shapiro, L.G., and Stockman, G.C.: ‘Computer vision’ (Prentice-Hall, Upper Saddle River, NJ, USA, 2001)

18 Jahne, B., Haußecker, H., and Geißler, P.: ‘Handbook of computer vision and applications’ (Academic Press, San Diego, CA, USA, 1999)

19 Lim, J.S.: ‘Two-dimensional signal and image processing’ (Prentice-Hall, Englewood Cliffs, NJ, USA, 1990)

20 Shin, K.T., Jang, I.H., Kim, N.C., Kim, C.H., and Kim, T.S.: ‘Block adaptive binarization of business card images using modified quadratic filter’. Proc. SPIE Conf. on Electronic Imaging, 2004, pp. 92–101

21 Abbyy software house, 2001, FineReader 5.0 (Online); available at: http://www.abbyy.com/ocr_products.asp?param=1613
