
Arabic Handwriting Recognition using Concavity Features and Classifier Fusion

Sherif Abdel Azeem
Electronics Engineering Department
American University in Cairo
Email: [email protected]

Maha El Meseery
Electronics Engineering Department
American University in Cairo
Email: [email protected]

Abstract—This paper presents a simple and effective technique for the recognition of writer-independent offline handwritten Arabic digits. The system is based on labeling the white pixels in a digit's image into nine different concavity categories. Four different feature vectors are extracted from these labeled concavities. Each feature vector is then fed to a linear SVM classifier. The final decision of the system is obtained using classifier fusion methods. The system has been tested on a database of 10,000 Arabic handwritten digits. The presented method achieves a recognition rate of 99.36%, which outperforms all reported results on that Arabic digits database using a linear SVM classifier.

I. INTRODUCTION

Handwritten digit recognition has applications such as office automation, check verification, and postal address reading and sorting. While the recognition of handwritten Latin digits has been extensively investigated using various techniques [1], relatively little work has been done on handwritten Arabic digits.

Al-Omari et al. [2] used a scale-, translation-, and rotation-invariant feature vector to train a probabilistic neural network (PNN). Their database was composed of 720 digits for training and 480 digits for testing, written by 120 persons. They achieved 99.75% accuracy. Said et al. [3] used pixel values of the normalized digit images as features. They fed these values to an Artificial Neural Network (ANN). They used a training set of 2400 digits and a testing set of 200 digits written by 20 persons to achieve 94% accuracy. In [4], El-Sherif and Abdelazeem devised a two-stage system for recognizing Arabic digits. The first stage is an ANN fed with a short feature vector to handle easy-to-classify digits. Ambiguous digits are rejected to the more powerful second stage, an SVM fed with a long feature vector. The system had good timing performance and achieved 99.15% accuracy. Note that the results of different works cannot be compared directly because the databases used are not the same. Abdelazeem and El-Sherif [5] conducted a comprehensive study on the problem of Arabic handwritten digit recognition. They reported the performance of different classification methods and several universal feature extraction techniques on the recognition of handwritten Arabic digits. Abdelazeem [6] used several features based on the topology of the Arabic digits and achieved a 99.25% recognition rate.

In this paper we introduce a system based on structural concavity features. The concavity features are extracted from the digit in nine different concavity categories.

Next, the features are extracted in four different orientations: vertical, horizontal, left diagonal, and right diagonal. For each orientation, the image is divided into strips and the features are extracted from each strip. A different classifier is used for each orientation, thus producing four different decisions for each digit. A decision fusion method is used to generate the final recognized digit from the input image.

II. FEATURE EXTRACTION

The most important step in any recognition system is feature extraction. A recognition system's performance depends on the quality of the features as much as it depends on the classifier used to make the final classification decision [7]. For a good recognition system, the extracted feature set should capture the properties of the input digit. Structural features tend to describe the visual, topological, and geometrical properties of the digit, which makes them close to how humans distinguish the different classes. The feature extraction used in the proposed system is based on extracting nine different concavity features from the input image. First, the image is preprocessed to normalize the size of the digit; this is done by resizing the bounding box of the digit to height h and width w, where h is chosen empirically as 40 pixels and the aspect ratio is preserved. Second, the image is traversed to label each white pixel with the correct concavity configuration, as explained in the next section.

A. Concavity configurations

The concavity features are designed to highlight the topological and geometrical properties of the input digit. A white pixel in the input digit image is assigned a certain concavity feature depending on the nature of the surrounding black pixels. The neighborhood of each white pixel is searched in the four Freeman directions [8] until either a black pixel is found or one of the corners of the bounding box is encountered. The white pixel is then assigned to one of nine possible concavity configurations, depending on the surrounding black pixels and bounding box corners, as follows.

1) Concavity 1: A white pixel surrounded by black pixels in all four Freeman directions, as shown in Fig. 1(a), belongs to concavity configuration 1 (C1).



Fig. 1. The nine concavity configurations, (a) C1 through (i) C9. The blue pixel indicates an encounter with a bounding box corner.

2) Concavity 2: A white pixel confined by the bottom corner of the bounding box and by black pixels in the three Freeman directions shown in Fig. 1(b) belongs to concavity configuration 2 (C2).

3) Concavity 3: A white pixel confined by the top corner of the bounding box and by black pixels in the three Freeman directions shown in Fig. 1(c) belongs to concavity configuration 3 (C3).

4) Concavity 4: A white pixel confined by the left corner of the bounding box and by black pixels in the three Freeman directions shown in Fig. 1(d) belongs to concavity configuration 4 (C4).

5) Concavity 5: A white pixel confined by the right corner of the bounding box and by black pixels in the three Freeman directions shown in Fig. 1(e) belongs to concavity configuration 5 (C5).

6) Concavity 6: A white pixel confined by the top and right corners of the bounding box and by black pixels in the two Freeman directions shown in Fig. 1(f) belongs to concavity configuration 6 (C6).

7) Concavity 7: A white pixel confined by the right and bottom corners of the bounding box and by black pixels in the two Freeman directions shown in Fig. 1(g) belongs to concavity configuration 7 (C7).

8) Concavity 8: A white pixel confined by the top and left corners of the bounding box and by black pixels in the two Freeman directions shown in Fig. 1(h) belongs to concavity configuration 8 (C8).

9) Concavity 9: A white pixel confined by the bottom and left corners of the bounding box and by black pixels in the two Freeman directions shown in Fig. 1(i) belongs to concavity configuration 9 (C9).
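A minimal sketch of this labeling procedure is given below, assuming a binary NumPy image in which ink pixels are 0 (black) and background pixels are 1 (white); the helper names and the exact mapping from the set of open directions to a configuration number are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Freeman search directions: up, right, down, left (row, col offsets).
DIRECTIONS = {"up": (-1, 0), "right": (0, 1), "down": (1, 0), "left": (0, -1)}

def hits_black(img, r, c, dr, dc):
    """Walk from (r, c) in direction (dr, dc); True if a black pixel is met
    before leaving the bounding box, False if the border is reached first."""
    h, w = img.shape
    r, c = r + dr, c + dc
    while 0 <= r < h and 0 <= c < w:
        if img[r, c] == 0:          # black (ink) pixel found
            return True
        r, c = r + dr, c + dc
    return False                     # ran off the bounding box

def concavity_label(img, r, c):
    """Assign one of the nine concavity configurations (1..9) to a white pixel,
    following the C1..C9 definitions above. The mapping from the set of
    'open' directions to a configuration number is an assumption."""
    blocked = {name: hits_black(img, r, c, dr, dc) for name, (dr, dc) in DIRECTIONS.items()}
    open_dirs = {name for name, b in blocked.items() if not b}
    if not open_dirs:
        return 1                          # C1: black pixels in all four directions
    if open_dirs == {"down"}: return 2    # C2: open toward the bottom border
    if open_dirs == {"up"}: return 3      # C3: open toward the top border
    if open_dirs == {"left"}: return 4    # C4: open toward the left border
    if open_dirs == {"right"}: return 5   # C5: open toward the right border
    if open_dirs == {"up", "right"}: return 6    # C6
    if open_dirs == {"right", "down"}: return 7  # C7
    if open_dirs == {"up", "left"}: return 8     # C8
    if open_dirs == {"down", "left"}: return 9   # C9
    return 0                              # open in three or more directions: left unlabeled here

def concavity_map(img):
    """Label every white pixel of a binary digit image with its configuration."""
    cmap = np.zeros_like(img, dtype=np.int8)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            if img[r, c] == 1:            # white (background) pixel
                cmap[r, c] = concavity_label(img, r, c)
    return cmap
```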

B. Computing features from concavity maps

The concavity configurations 1 to 9 are used to label each white pixel in the input digit. Figure 2 shows an example of an input digit 5 and the concavity map produced for the white pixels in the image. In the concavity map, each white pixel is labeled with its concavity configuration number from 1 to 9. The black pixels are labeled with 0 as they do not belong to any configuration.

After producing the concavity map, features are extracted by dividing the image into strips in four directions (vertical, horizontal, right diagonal, and left diagonal); the count of each of the nine concavity configurations in each strip is considered a feature, as follows.

Fig. 2. The concavity features of digit 5: (a) original image, (b) bitmap, (c) concavity features.

Fig. 3. The different feature orientations: (a) vertical, (b) horizontal, (c) left diagonal, (d) right diagonal.

1) Vertical: The concavity map is divided into vertical strips. The count of each concavity configuration in the strip is considered a feature. Figure 3(a) shows an example of a vertical strip on a digit 5 concavity map. The feature vector for this strip is [8 0 0 0 0 3 2 0 0] because there are 8 pixels labeled C1, 3 pixels labeled C6, 2 pixels labeled C7, and no pixels belonging to any other configuration.

2) Horizontal: The concavity map is divided into horizontal strips. The count of each concavity configuration in the strip is considered a feature. Figure 3(b) shows an example of a horizontal strip on a digit 5 concavity map. The feature vector for this strip is [8 0 0 0 0 5 0 2 0] because there are 8 pixels labeled C1, 5 pixels labeled C6, 2 pixels labeled C8, and no pixels belonging to any other configuration.

3) Left Diagonal: The concavity map is divided into left diagonal strips. The count of each concavity configuration in the strip is considered a feature. Figure 3(c) shows an example of a left diagonal strip on a digit 5 concavity map. The feature vector for this strip is [11 0 0 0 0 0 3 3 0] because there are 11 pixels labeled C1, 3 pixels labeled C7, 3 pixels labeled C8, and no pixels belonging to any other configuration.

4) Right Diagonal: The concavity map is divided into right diagonal strips. The count of each concavity configuration in the strip is considered a feature. Figure 3(d) shows an example of a right diagonal strip on a digit 5 concavity map. The feature vector for this strip is [10 0 0 0 0 5 0 0 2] because there are 10 pixels labeled C1, 5 pixels labeled C6, 2 pixels labeled C9, and no pixels belonging to any other configuration.

It has to be noted that the width of the strips used to extract the features in the different directions could be any number of pixels. The strips used in the example shown in Figure 3 have widths of one pixel. A validation set has been used to find the best empirical width for the strips, as explained in Section IV. Note also that the feature vector obtained for each strip is normalized by dividing the count of every concavity configuration by the total number of white pixels in the strip.

Fig. 4. Different samples of the Arabic digit '0': (a) different styles, (b) different sizes.

Fig. 5. The block diagram of the proposed system.
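A minimal sketch of the strip-based feature extraction described in this section, assuming a NumPy concavity map with labels 0 to 9 and the per-strip normalization noted above; the way diagonal strips are indexed here (by r + c and r − c) is an illustrative choice rather than the paper's exact scheme.

```python
import numpy as np

def strip_features(cmap, n_strips=5, orientation="vertical"):
    """Count the nine concavity labels per strip of a concavity map and
    normalize each strip's counts by its number of white (labeled) pixels."""
    h, w = cmap.shape
    rows, cols = np.indices((h, w))
    # Index that decides which strip a pixel falls into, per orientation.
    if orientation == "vertical":
        idx, span = cols, w
    elif orientation == "horizontal":
        idx, span = rows, h
    elif orientation == "left_diagonal":              # assumption: lines r + c = const
        idx, span = rows + cols, h + w - 1
    elif orientation == "right_diagonal":             # assumption: lines r - c = const
        idx, span = rows - cols + (w - 1), h + w - 1
    else:
        raise ValueError(orientation)
    strip_of = (idx * n_strips) // span               # map pixel index to strip 0..n_strips-1

    features = []
    for s in range(n_strips):
        labels = cmap[(strip_of == s) & (cmap > 0)]   # labeled white pixels in this strip
        counts = np.bincount(labels, minlength=10)[1:10].astype(float)  # counts of C1..C9
        total = labels.size
        features.extend(counts / total if total else counts)            # normalize per strip
    return np.asarray(features)                       # length = 9 * n_strips
```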

C. Digit Zero Problem

The Arabic digit zero ('0') is very difficult to recognize because there is no specific way to write it; it is essentially just a dot, like a decimal point. Figure 4(a) shows different samples of the Arabic digit '0'. The figure clearly shows that the digit has no specific writing style, and it is often confused with other digits, especially the Arabic digits '1' and '5'. The most discriminating feature of the digit '0' over the other digits is its small size. Figure 4(b) shows the average size of digit '0' versus the other Arabic digits. Thus, to solve the digit zero problem we have added an area feature that describes the size of the bounding box of the digit. Another feature, height/width, has been added to distinguish digit '0' from digit '1'. These two features have significantly increased the recognition rate of the zero.
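As a hedged illustration, the two extra features could simply be appended to each orientation's feature vector; the raw (unscaled) area and ratio used here are assumptions, not the authors' exact encoding.

```python
def append_size_features(features, box_height, box_width):
    """Append the two digit-'0' features described above: the bounding box
    area and the height/width ratio (illustrative scaling)."""
    area = box_height * box_width
    aspect = box_height / box_width if box_width else 0.0
    return list(features) + [area, aspect]
```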

III. CLASSIFICATION

In the proposed system a classifier is used for the concavity features obtained from each of the four orientations: horizontal, vertical, left diagonal, and right diagonal. Figure 5 shows the block diagram of the system. First, the feature vector for every orientation is computed and then fed to the corresponding classifier. To get a single decision from the system, we employ decision fusion methods to reach the final classification. The next sections describe the classifiers used and the fusion methods.

A. Single Classifiers

The output of the feature extraction stage is four different feature vectors for the four different orientations used. A classifier is used for each orientation. To simplify the classification we use a simple one-versus-one (OVO) Support Vector Machine (SVM) with a linear kernel. Since the SVM is a binary classifier, one-versus-one classification requires n(n−1)/2 classifiers for n classes (45 classifiers for the 10 digit classes). The final decision of a one-versus-one classifier can be obtained using simple majority voting. Unfortunately, this method does not generate the per-class confidence measure needed in the classifier fusion step. Various methods can be used to generate such a probability or confidence measure from pairwise classifiers. One popular method is introduced in [9], where the probability P_i(x) of a sample x belonging to class i is computed using the following equation:

P_i(x) = \frac{1}{\sum_{j=1, j \neq i}^{K} \frac{1}{r_{ij}(x)} - (K - 2)}

where r_ij(x) is the probability that the tested pattern x belongs to class i as estimated by the classifier responsible for separating class i from class j, and K is the number of classes.
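The sketch below applies the formula above to a K × K matrix of pairwise outputs; it assumes the values r_ij(x) are already available (e.g., as probability outputs of the pairwise SVMs) with r_ji = 1 − r_ij, and the final renormalization is an added convenience, not part of the original formula.

```python
import numpy as np

def pairwise_coupling(r):
    """Compute P_i(x) = 1 / (sum_{j != i} 1/r_ij(x) - (K - 2)) for each class i,
    given a K x K matrix r of pairwise probabilities with r[j, i] = 1 - r[i, j]."""
    K = r.shape[0]
    p = np.empty(K)
    for i in range(K):
        s = sum(1.0 / max(r[i, j], 1e-12) for j in range(K) if j != i)  # guard against zeros
        p[i] = 1.0 / (s - (K - 2))
    return p / p.sum()   # optional renormalization so the confidences sum to 1

# Toy example with K = 3 classes (illustrative pairwise probabilities, not real data):
r = np.array([[0.0, 0.9, 0.8],
              [0.1, 0.0, 0.6],
              [0.2, 0.4, 0.0]])
print(pairwise_coupling(r))   # class 0 receives the highest confidence
```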

B. Fusion Methods

Fusion methods can be divided into three main categories: label-based, rank-based, and soft-margin or fuzzy-based methods. Label-based methods usually rely on the final labels of the base classifiers, as in majority voting fusion. Rank-based methods use the decision values of the base classifiers to rank the classes using various fixed rules and obtain the final decision of the system. Soft-margin and fuzzy-based methods treat the output decision values of the base classifiers as a pattern recognition problem and try to detect the correct pattern. Methods like Bayes and neural network fusion are the most widely used fusion methods of this type [10]–[12].

The four base classifiers generate posterior probabilities P_ij(x), i = 1, ..., c, j = 1, ..., K, for c classifiers and K classes, computed for a test sample x. A fusion method combines them into a new set of probabilities q_j(x), which can be used for the final classification. Let q_j(x) be computed by q_j(x) = rule_i(P_ij(x)); the final classification is then made by \phi(x) = \arg\max_j q_j(x). The rules we used to compute q_j(x) are majority vote, maximum, minimum, mean, median, sum, weighted sum, product, and Borda count.
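A compact sketch of these fixed fusion rules applied to the c × K matrix of posteriors P_ij(x); the specific definitions of the weighted sum (e.g., validation-accuracy weights) and Borda count (summed ranks) shown here are common conventions assumed where the paper does not spell them out.

```python
import numpy as np

def fuse(P, rule="mean", weights=None):
    """Combine a c x K matrix of per-classifier class probabilities P[i, j]
    into fused scores q[j]; the predicted class is argmax_j q[j]."""
    if rule == "mean":
        q = P.mean(axis=0)
    elif rule == "median":
        q = np.median(P, axis=0)
    elif rule == "max":
        q = P.max(axis=0)
    elif rule == "min":
        q = P.min(axis=0)
    elif rule == "sum":
        q = P.sum(axis=0)
    elif rule == "product":
        q = P.prod(axis=0)
    elif rule == "weighted_sum":                      # weights, e.g., validation accuracies
        q = np.average(P, axis=0, weights=weights)
    elif rule == "majority_vote":
        votes = P.argmax(axis=1)                      # each classifier votes for its top class
        q = np.bincount(votes, minlength=P.shape[1]).astype(float)
    elif rule == "borda_count":
        ranks = P.argsort(axis=1).argsort(axis=1)     # higher probability -> higher rank
        q = ranks.sum(axis=0).astype(float)
    else:
        raise ValueError(rule)
    return int(np.argmax(q))
```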

IV. EXPERIMENTS AND RESULTS

The experiments are performed on the ADBase database [5], which consists of 70,000 handwritten digits collected from 700 different writers. The dataset is divided into 60,000 samples for the training set and 10,000 samples for the test set. The classifiers' parameters are optimized using a validation set of 10,000 samples taken from the training set. The validation set has been used to obtain the best number of strips n (5, empirically) in each of the four orientations, the classifiers' parameters, and for training the neural network used for fusion of the classifiers.

The number of features in the feature vector obtained for every orientation is the number of features per strip times the number of strips. This amounts to 45 features (9 concavity features × 5 strips), in addition to the two size features used to distinguish digit '0' from all other digits. Thus, the number of features used for each of the four classifiers is 47 (45 + 2).

TABLE I
SINGLE CLASSIFIER RESULTS

Classifier        Recognition rate
Horizontal        99.24
Vertical          99.07
Right diagonal    98.79
Left diagonal     98.85

TABLE II
RESULTS OF DIFFERENT FUSION METHODS

Fusion Method     Recognition Rate
Majority Voting   99.32
Sum               99.32
Product           99.29
Weighted sum      99.31
Borda Count       99.32
Mean              99.32
Median            99.33
Max               99.18
Min               99.21
Neural Network    99.36

A. Experiments on Separate Classifiers

Table I shows the results of the single classifiers on each feature orientation. The results show that the horizontal orientation gives the highest accuracy. This may be because Arabic digits have more visually discriminative features in the horizontal orientation than in any other orientation.

B. Experiment for Fusion Methods

Table II shows the results of the different fusion methods used to combine the outputs of the four base classifiers. The table shows the final system results after the fusion rules are applied. The results show that the mean, median, majority vote, Borda count, and neural network fusion methods give comparable results. The neural network result is the best, but it needs a separate training stage, which increases the complexity of the system. The neural network used is a multilayer perceptron (MLP) trained to learn the final label from the output decisions of the four classifiers; the output of the neural network is taken as the final label of the system. The MLP consists of one hidden layer with 80 hidden neurons and is trained using the back-propagation algorithm.
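As a hedged illustration of the neural network fusion, a scikit-learn MLP with one hidden layer of 80 neurons could be trained on the concatenated outputs of the four base classifiers (4 × 10 = 40 values per sample) from the validation set; scikit-learn, the variable names, and the solver settings are assumptions, not the authors' setup.

```python
from sklearn.neural_network import MLPClassifier

# X_val_probs: (n_samples, 40) array, the concatenated class probabilities of the
# four base classifiers on the validation set; y_val: the true digit labels.
fusion_net = MLPClassifier(hidden_layer_sizes=(80,),   # one hidden layer, 80 neurons
                           solver="sgd",               # back-propagation via SGD (assumption)
                           max_iter=500)
# fusion_net.fit(X_val_probs, y_val)
# final_digits = fusion_net.predict(X_test_probs)      # fused decision on the test set
```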

C. Comparison to Other Systems

We have compared the results obtained by the proposed system with previous results on the same Arabic handwritten digits database using a linear SVM, as reported in [5], [6]. The results are given in Table III for the proposed system and for several other feature extraction techniques. Comparing the results clearly reveals the superiority of the proposed system, with a recognition rate of 99.36%, over all previously reported methods. Moreover, concavity features in the horizontal orientation alone, without any classifier fusion, produce a 99.24% recognition rate (see Table I), which is superior to most previous feature extraction methods.

TABLE III
ACCURACIES OF DIFFERENT FEATURES USING LINEAR SVM

Feature           Recognition Rate
Gradient          99.18
Gradient+size     99.22
Kirsch            99.06
Local Chain       98.32
Wavelet           97.73
Domain specific   99.25
Proposed System   99.36

V. CONCLUSION

In this paper we introduced a system that recognizes Arabic handwritten digits with 99.36% accuracy. The system uses nine different concavity features. The features are extracted in four orientations: horizontal, vertical, right diagonal, and left diagonal. Four different classifiers are then used, one for each orientation. The classifier outputs are then passed through a classifier fusion system. Different types of fusion have been evaluated to achieve a final recognition accuracy of 99.36%.

REFERENCES

[1] C. L. Liu, K. Nakashima, H. Sako, and H. Fujisawa, "Handwritten digit recognition: benchmarking of state-of-the-art techniques," Pattern Recognition, vol. 36, pp. 2271–2285, 2003.

[2] F. A. Al-Omari and O. M. Al-Jarrah, "Handwritten Indian numerals recognition system using probabilistic neural networks," Advanced Engineering Informatics, vol. 18, no. 1, pp. 9–16, 2004.

[3] F. N. Said, R. A. Yacoub, and C. Y. Suen, "Recognition of English and Arabic numerals using a dynamic number of hidden neurons," in ICDAR, 1999, pp. 237–240.

[4] E. El-Sherif and S. Abdelazeem, "A two-stage system for Arabic handwritten digit recognition tested on a new large database," in International Conference on Artificial Intelligence and Pattern Recognition (AIPR-07), Orlando, FL, USA, July 2007.

[5] S. Abdelazeem and E. El-Sherif, "Arabic handwritten digit recognition," International Journal on Document Analysis and Recognition (IJDAR), vol. 11, no. 3, pp. 127–141, Dec. 2008.

[6] S. Abdelazeem, "A novel domain-specific feature extraction scheme for Arabic handwritten digits recognition," in Proceedings of the 2009 International Conference on Machine Learning and Applications (ICMLA '09), Miami Beach, Florida, 13-15 December 2009, pp. 247–252.

[7] F. Lauer, C. Y. Suen, and G. Bloch, "A trainable feature extractor for handwritten digit recognition," Pattern Recognition, vol. 40, no. 6, pp. 1816–1824, Jun. 2007.

[8] H. Freeman, "On the encoding of arbitrary geometric configurations," IRE Transactions on Electronic Computers, vol. EC-10, pp. 260–268, 1961.

[9] D. Price, S. Knerr, L. Personnaz, and G. Dreyfus, "Pairwise neural network classifiers with probabilistic outputs," in Advances in Neural Information Processing Systems 7 (NIPS-94), 1995, pp. 1109–1116.

[10] N. Farah, M. T. Khadir, and M. Sellami, "Artificial neural network fusion: application to Arabic words recognition," in European Symposium on Artificial Neural Networks (ESANN 2005), Apr. 2005, pp. 27–29.

[11] D. Ruta and B. Gabrys, "An overview of classifier fusion methods," Computing and Information Systems, vol. 7, no. 1, pp. 1–10, 2000.

[12] T. Denoeux, "A neural network classifier based on Dempster-Shafer theory," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 30, no. 2, pp. 131–150, Mar. 2000.
