Research Article

The Application of Pattern Recognition in Electrofacies Analysis
Huan Li,1 Xiao Yang,2 and Wenhong Wei1

1 Dongguan University of Technology, Dongguan 523808, China
2 School of Information Science and Technology, Tsinghua University, Beijing 100084, China

Correspondence should be addressed to Wenhong Wei; weiwh@dgut.edu.cn

Received 24 February 2014; Accepted 26 April 2014; Published 4 June 2014

Academic Editor: Guiming Luo
Copyright © 2014 Huan Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Pattern recognition is an important analytical tool in electrofacies analysis. In this paper we study several commonly used clustering and classification algorithms. On the basis of the advantages and disadvantages of the existing algorithms, we introduce the KMRIC algorithm, which improves the initial centers of K-means. We also propose the AKM algorithm, which automatically determines the number of clusters, and we apply the support vector machine to classification. Finally, we apply these algorithms to electrofacies analysis, where experiments on real-world datasets are carried out to compare the merits of the various algorithms.
1. Introduction
The basic principle of electrofacies analysis is to determine the lithological types corresponding to electrofacies according to the known lithological and underlying parameters in the key well. Then we conduct clustering and discriminant analysis of key wells and noncoring wells to automatically judge the lithofacies of the noncoring wells.
Clustering is the process of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to each other with respect to a certain similarity measure and dissimilar to the objects of other groups. Cluster analysis has been applied in a wide range of areas, including artificial intelligence, bioinformatics, biology, computer vision, data compression, image analysis, information retrieval, machine learning, marketing, medicine, pattern recognition, spatial database analysis, statistics, recommendation systems, and web mining.
Dong et al. [1] proposed an improvement of K-means which obtains optimized initial centers from a group of candidate clustering centers. The K-means algorithm is one of the most popular and widespread partitioning clustering algorithms because of its feasibility and efficiency in dealing with large amounts of data. The main drawback of the K-means algorithm is that the clustering result is sensitive to the selection of the initial cluster centers and may converge to a local optimum. At present, the development tendency of clustering methods is to seek a global optimal solution by combining global optimization methods, such as simulated annealing and particle swarm optimization, with local methods like K-means [2-6]. Pelleg and Moore [7] proposed an algorithm which can automatically determine the optimal number of clusters during clustering. The challenge of clustering high-dimensional data has also emerged in recent years. Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional data spaces are often encountered in areas such as medicine, biology, bioinformatics, and the clustering of text documents, where, if a word-frequency vector is used, the number of dimensions equals the size of the dictionary. In high-dimensional clustering, generally the original space is first transformed by PCA, SVD, the K-L transformation, or another dimensionality reduction method, and then the clustering is performed in the low-dimensional space. Bertini et al. [8] introduced a high-dimensional visualization technology showing multidimensional data on a two-dimensional plane.
Hindawi Publishing Corporation, Journal of Applied Mathematics, Volume 2014, Article ID 640406, 8 pages, http://dx.doi.org/10.1155/2014/640406
K-means [9, 10] is the clustering method most widely used in science and engineering today. However, it has the following five deficiencies [3, 5]:

(1) The results are sensitive to the initial centers.
(2) Only a local optimal solution can be obtained, rather than the global optimal solution.
(3) The number of clusters k must be set in advance, manually.
(4) Outliers (error points) impose serious impacts on the clustering results.
(5) The algorithm lacks scalability.

This paper introduces improved algorithms that address these deficiencies of K-means.
2. Improving the Initial Centers of K-Means
To address disadvantages (1) and (4) of the K-means algorithm, we propose a K-means algorithm with refined initial centers (KMRIC for short), based on the work of predecessors [1]:

(1) Randomly extract J sample subsets S_i, i = 1, 2, ..., J.
(2) Conduct K-means clustering on each of the J sample subsets over the whole data field, respectively, to get J center sets CM_i, i = 1, 2, ..., J, and let CM = ∪_{i=1}^{J} CM_i, so CM contains at most K × J points.
(3) Conduct K-means clustering on CM, taking each CM_i in turn as the initial clustering centers, to get J clustering center sets FM_i, i = 1, 2, ..., J.
It can be seen from Figure 1 that the clustering centers obtained from the different subsample sets lie near the real clustering centers, and a clustering is formed by each subsample set. In step (3), selecting the result with the minimum sum of squared deviations as the improved initial clustering centers reduces the randomness brought by random selection. In step (2), to eliminate the influence of outliers, a modified K-means algorithm (KmeansMod) is adopted. KmeansMod modifies the standard K-means as follows: when the standard K-means algorithm has completed, the number of data points contained in each cluster is checked. If a cluster contains zero data points, its original center is replaced by the data point farthest from that cluster's center, and the K-means algorithm is then rerun.
The KMRIC algorithm eliminates the sensitivity of the K-means algorithm to the data input order and the initial centers, which is an obvious improvement over the standard K-means. When applied to large-scale data, KMRIC can reduce the number of iterations and improve execution efficiency.
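A minimal sketch of the KMRIC procedure, including the KmeansMod empty-cluster repair, is given below. The subset size, the value of J, and the random seeding are assumptions for illustration; the paper does not fix them.

```python
import numpy as np

def kmeans(X, centers, n_iter=100):
    """Standard K-means from given initial centers, with the KmeansMod
    repair: an empty cluster's center is replaced by the point farthest
    from it, and iteration continues."""
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(len(centers)):
            pts = X[labels == k]
            if len(pts) == 0:
                new_centers[k] = X[d[:, k].argmax()]  # KmeansMod repair
            else:
                new_centers[k] = pts.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

def kmric(X, k, J=5, subset_frac=0.3, seed=0):
    """Refined initial centers: cluster J random subsets, pool the centers
    into CM, re-cluster CM starting from each CM_i, and keep the start
    whose final clustering of X has the minimum sum of squared deviations."""
    rng = np.random.default_rng(seed)
    n = len(X)
    cm = []  # the CM_i center sets; their union has at most K*J points
    for _ in range(J):
        idx = rng.choice(n, size=max(k, int(subset_frac * n)), replace=False)
        init = X[rng.choice(idx, size=k, replace=False)]
        centers, _ = kmeans(X[idx], init)
        cm.append(centers)
    pool = np.vstack(cm)
    best, best_sse = None, np.inf
    for init in cm:
        centers, _ = kmeans(pool, init)      # cluster the pooled centers
        final, labels = kmeans(X, centers)   # then refine on the whole data
        sse = sum(((X[labels == j] - final[j]) ** 2).sum() for j in range(k))
        if sse < best_sse:
            best, best_sse = final, sse
    return best
```

On two well-separated blobs, `kmric(X, 2)` recovers centers near the true blob means regardless of the input order of the points.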
3. Adaptive K-Means
The number of clusters k of the K-means algorithm must be set in advance, manually. In practice, however, we do not know the value of k, especially for high-dimensional data, where selecting the correct value of k is even more difficult.
Figure 1: Multicombination clustering centers obtained from multiple sample subsets (one marker stands for the real clustering centers; the other stands for the clusters obtained from the different subsample sets).
X-means, put forward by Pelleg and Moore [7], can automatically determine the number of clusters. However, X-means is prone to splitting the data into more clusters than actually exist, which is particularly obvious when the data do not strictly follow the normal distribution. We therefore introduce the Anderson-Darling statistic [11] as the standard for measuring normality and propose an adaptive K-means (AKM for short).
The AKM algorithm first assumes that all data are in the same cluster; the number of clusters is then gradually increased in subsequent iterations. In each iteration, every cluster is checked against the normal distribution; if a cluster fails the check, it is split into two clusters. After each splitting, K-means clustering is carried out over the whole data field to improve the clustering result. The iteration ends when no cluster is split, and the final clustering result is obtained. The schematic diagram of the AKM algorithm is shown in Figure 2. In Figure 2, the data are first divided into three categories, then each category is split into two subclasses, and finally the result after one splitting is used to judge whether each subclass follows the Gaussian distribution.
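The iteration described above can be sketched as follows. Here `looks_gaussian` stands in for the normality judgment detailed in the steps below, and the way the two child centers are seeded after a split is an assumption for illustration.

```python
import numpy as np

def nearest(X, centers):
    """Index of the nearest center for each point."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.argmin(axis=1)

def kmeans(X, centers, n_iter=100):
    """Plain K-means refinement from given centers."""
    for _ in range(n_iter):
        labels = nearest(X, centers)
        new = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                        else centers[j] for j in range(len(centers))])
        if np.allclose(new, centers):
            break
        centers = new
    return centers

def akm(X, looks_gaussian, max_k=20):
    """Adaptive K-means skeleton: start from one cluster, split every
    cluster that fails the normality test, refine globally, repeat."""
    centers = X.mean(axis=0, keepdims=True)
    while len(centers) < max_k:
        labels = nearest(X, centers)
        out, split_any = [], False
        for j, c in enumerate(centers):
            members = X[labels == j]
            if len(members) < 2 or looks_gaussian(members):
                out.append(c)                # keep the original center
            else:
                v = members.std(axis=0)      # assumed seeding of the split
                out.extend([c + v, c - v])
                split_any = True
        centers = kmeans(X, np.array(out))   # global refinement after splits
        if not split_any:
            break
    return centers
```

With two well-separated blobs and a spread-based stand-in test, the skeleton starts from one cluster, splits once, and stops at two clusters.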
The judgment of splitting is as follows.

(1) Select the confidence level α.

(2) Run the KMRIC procedure to split X into two clusters and obtain the two clustering centers c_1, c_2.

(3) Let v = c_1 − c_2 be the N-dimensional vector connecting the two centers; it is the main direction along which normality is judged. X is projected onto v: x′_i = ⟨x_i, v⟩/‖v‖². X′ is then transformed to have mean 0 and variance 1.

(4) Let z_i = F(x′_(i)), where F is the standard normal cumulative distribution function and x′_(i) are the sorted projections. If the statistic A²∗(Z) is not significant at confidence level α, accept H_0, keep the original clustering center c, and abandon c_1 and c_2. Otherwise, reject H_0 and replace the original clustering center c by c_1 and c_2.
(a) The data are divided into three categories. (b) Each category is split into two subclasses. (c) The result after one splitting, used to judge whether each subclass follows the Gaussian distribution.

Figure 2: Schematic diagram of the AKM algorithm.
A²∗(Z) is the Anderson-Darling statistic:

A²(Z) = −(1/n) Σ_{i=1}^{n} (2i − 1)[log(z_i) + log(1 − z_{n+1−i})] − n.  (1)
Figure 3 shows the two cases. In Figure 3(a) the subclass follows the Gaussian distribution, but in Figure 3(b) it does not. The AKM algorithm can judge whether each subclass follows the Gaussian distribution.
AKM integrates the determination of the number of clusters with the clustering process itself, so it can automatically determine the optimal number of clusters, avoiding both the subjectivity in selecting the number of clusters and the blindness of initialization, and it can also distinguish outliers.
4. Discriminant Methods
4.1. Fisher Classification. The Fisher method is essentially dimension compression. Samples that are easily separated in a higher-dimensional space may become difficult to identify when projected onto an arbitrary straight line, because the different types mix together. Generally, however, a best direction can be found such that the samples separate well when projected onto it. How to find that best direction and how to realize the projection onto it are exactly the two problems solved by the Fisher algorithm. Figure 4 shows the schematic diagram of Fisher linear discriminant analysis. In Figure 4(a) the samples cannot be identified when projected onto a coordinate axis, while in Figure 4(b) the projected samples can be identified by finding a suitable direction.
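The two problems (finding the best direction and projecting onto it) have the classical closed form w = S_w^{-1}(m_1 − m_2) for two classes. A sketch follows; the midpoint threshold is a simplification, and the pseudoinverse is used so a singular S_w (the case noted for the electrofacies data in Section 6.3) does not abort the computation.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher's best projection direction: w = S_w^{-1} (m1 - m2),
    where S_w is the pooled within-class scatter matrix."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.pinv(Sw) @ (m1 - m2)   # pinv tolerates singular S_w
    return w / np.linalg.norm(w)

def fisher_classify(x, X1, X2):
    """Project x onto w and threshold at the midpoint of the projected
    class means (a simple choice of decision point)."""
    w = fisher_direction(X1, X2)
    t = 0.5 * (w @ X1.mean(axis=0) + w @ X2.mean(axis=0))
    return 1 if w @ x >= t else 2
```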
4.2. Potential Function Classification. The potential function, a common method used in nonlinear classifiers, solves pattern classification problems via the concept of an electric field. In potential function classification, the samples belonging to one category are treated as positive charges, while the samples belonging to the other category are treated as negative charges, thus turning the pattern classification problem into the matter of superposing the fields of the positive
(a) The subclass follows the Gaussian distribution. (b) The subclass does not follow the Gaussian distribution.

Figure 3: Judging whether each subclass follows the Gaussian distribution.

(a) The samples cannot be identified when projected onto a coordinate axis. (b) The projected samples can be identified by finding a suitable direction.

Figure 4: Schematic diagram of Fisher linear discriminant analysis.
charges and the negative charges; the equipotential line where the electric potential is zero is the decision boundary. The training course of the potential function algorithm is a process of accumulating electric potential as the samples are input one after another through the potential function.
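The zero-potential decision rule can be sketched as follows. Note this is the simple batch superposition of charges with an assumed Gaussian-style potential function; the training procedure described above is incremental, accumulating potential sample by sample.

```python
import math

def potential(x, xk, alpha=1.0):
    """An assumed potential function: K(x, x_k) = exp(-alpha * ||x - x_k||^2),
    the 'charge' that sample x_k contributes at point x."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, xk))
    return math.exp(-alpha * d2)

def classify(x, positives, negatives):
    """Positive samples contribute positive charge, negatives contribute
    negative charge; the zero-potential surface is the decision boundary."""
    p = sum(potential(x, s) for s in positives) \
        - sum(potential(x, s) for s in negatives)
    return 1 if p >= 0 else -1
```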
4.3. Least Squares Support Vector Machine (LS-SVM). Based on the VC dimension theory of statistical learning and the structural risk minimization principle, the support vector machine method [12] maps the practical problem to a high-dimensional feature space through a nonlinear transformation and realizes a nonlinear discriminant function in the original space by constructing a linear discriminant function in the higher-dimensional space. LS-SVM is an extension of the traditional support vector machine: it introduces a least squares linear system into the support vector machine to replace the quadratic programming traditionally used to solve classification and estimation problems.
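The least squares replacement reduces training to a single symmetric linear system (Suykens' classifier formulation). A sketch follows, with an assumed RBF kernel and illustrative parameter choices; the paper does not specify its kernel or hyperparameters.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """RBF kernel matrix between row sets A and B (assumed kernel choice)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def lssvm_train(X, y, C=10.0, gamma=1.0):
    """LS-SVM training: solve the linear system
        [ 0    y^T       ] [b]     [0]
        [ y    Omega+I/C ] [alpha] [1]
    with Omega_ij = y_i y_j K(x_i, x_j), instead of a QP."""
    n = len(y)
    Omega = np.outer(y, y) * rbf(X, X, gamma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / C
    rhs = np.concatenate([[0.0], np.ones(n)])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]          # bias b, multipliers alpha

def lssvm_predict(x, X, y, b, alpha, gamma=1.0):
    """Decision: sign( sum_i alpha_i y_i K(x, x_i) + b )."""
    k = rbf(x[None, :], X, gamma)[0]
    return np.sign((alpha * y * k).sum() + b)
```

Unlike the standard SVM, every training point gets a (generally nonzero) multiplier, which is the price paid for replacing the QP with one linear solve.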
5. Procedures of Electrofacies Analysis

The procedure of electrofacies analysis is shown in Figure 5.
5.1. Feature Extraction of Log Data. The primary step in establishing electrofacies is to extract a set of log data features that can reflect the lithologic character of the sedimentary rock. Generally there are 9 or more types of well-logging items, and those logging items are interrelated. There are two ways to eliminate redundancy, simplify the control methods, and reduce the amount of calculation: (1) principal component analysis; (2) selecting logging items manually. The extracted logging items are recorded in the table stdlogdata as the data source for clustering analysis.
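The principal component route can be sketched as below; rows are depth samples and columns are logging items (the item names and the number of retained components are illustrative choices, not fixed by the paper).

```python
import numpy as np

def pca(X, n_components):
    """Principal component analysis via SVD of the centered log matrix.
    Returns the reduced features (scores) used as the clustering input,
    plus the fraction of variance each retained component explains."""
    Xc = X - X.mean(axis=0)                  # center each logging item
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:n_components].T        # projections onto the top PCs
    explained = (S ** 2) / (S ** 2).sum()    # variance ratio per component
    return scores, explained[:n_components]
```

When one logging item dominates the variance, the first component captures almost all of it, and the remaining items contribute little to the reduced feature set.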
5.2. Clustering Analysis. To find the electrofacies of the same type and establish a standard library for electrofacies analysis, clustering analysis must be conducted on the strata. Finally, the classification results acquired by clustering
Figure 5: Flow diagram of electrofacies analysis (typical wells: select well-logging items, principal component analysis, set up dynamic data, clustering analysis, lithology dictionary, set up the electrofacies-lithofacies database; unknown wells: select well-logging items, set up dynamic data, discriminant analysis, determine lithofacies).
should be recorded in the "Category" column of the table stdlogdata, and the lithology recorded in the "Lithology" column according to the lithology dictionary.
5.3. Discriminant Analysis. After establishing the lithofacies database, namely, the electrofacies of the type well, it is possible to discriminate the lithofacies of other wells. After discrimination, the data and discriminant results are written into the table anylogdata, and the logging items are written into the table anylogitem.
6. Comparison and Analysis of the Results of the Algorithms
6.1. Experimental Data. The Iris dataset [13] usually serves as the benchmark testing dataset; each record contains 4 attributes, with 150 samples in total. The correct classification result is 50 samples of each of the three types. The electrofacies dataset contains 8 attributes per record, with 177 samples in total. For the real electrofacies data there is no strictly accurate number of categories or standard classification; judging by experience, 8 classes may be rational.
6.2. Analysis of the Clustering Experimental Results
6.2.1. Iris Dataset. It can easily be seen from Figures 6-9 that the clusters obtained by the standard K-means algorithm differ considerably from the standard results, while the clustering results obtained by ISODATA and KMRIC come close to the standard ones and match the results of Matlab's built-in K-means. AKM finds only two categories: the second and third categories are deemed to belong to the same normal distribution and are never separated, since they lie close to each other and partly overlap (see Tables 1 and 2).
Table 1: Clustering method comparison on the Iris dataset.

            K-means   ISODATA   KMRIC    AKM     Matlab
Type I      30        50        50       53      50
Type II     24        39        39       97      38
Type III    96        61        61       0       62
Accuracy    69.3%     92.6%     92.6%    66.7%   92%
Table 2: Clustering method comparison on the electrofacies dataset.

            K-means   ISODATA   KMRIC    AKM     Matlab
Type I      56        35        47       46      47
Type II     38        30        46       40      45
Type III    36        26        26       26      23
Type IV     18        23        14       23      23
Type V      11        17        13       13      13
Type VI     9         13        12       12      12
Type VII    8         12        10       10      11
Type VIII   1         10        9        7       3
Type IX     0         8         0        0       0
Type X      0         3         0        0       0
6.2.2. Electrofacies Dataset. It can be seen from Figures 10-13 that the clustering results obtained by K-means have large errors, while the clusters obtained by KMRIC and AKM are relatively rational and basically reflect the right classification; AKM also identifies the correct number of clusters automatically. Compared with ISODATA, AKM is more accurate in determining the number of clusters, and its clustering results are more rational as well. This also suggests that AKM's hypothesis-testing way of judging the number of clusters is more universal than ISODATA's judgment based on the between-class distance.
Figure 6: Clustering results of the dataset obtained by Matlab.
Figure 7: Clustering results obtained by standard K-means.
Table 3: Number of misclassifications and accuracy of the discriminant methods on the Iris dataset.

           Fisher   Potential function   LS-SVM
Type I     0        0                    0
Type II    1        0                    0
Type III   0        0                    0
Total      1        0                    0
Accuracy   96.7%    100%                 100%
6.3. Experimental Results and Analysis of Classification

6.3.1. Iris Dataset. See Table 3.
6.3.2. Electrofacies Dataset. It can be seen from Tables 3 and 4 that all three classification methods work well on the Iris data, since the Iris data structure is quite simple and low-dimensional. For the electrofacies data, Fisher discriminant analysis is not applicable due to the singular
Figure 8: Clustering results obtained by ISODATA and KMRIC.
Figure 9: Clustering results obtained by AKM.
Table 4: Number of misclassifications of the discriminant methods on the electrofacies dataset.

            Fisher   Potential function   LS-SVM
Type I      —        0                    0
Type II     —        0                    2
Type III    —        0                    0
Type IV     —        1                    2
Type V      —        0                    0
Type VI     —        0                    0
Type VII    —        0                    3
Type VIII   —        0                    2
Total       —        1                    9
Accuracy    —        94.9%                76.9%
within-class scatter matrix S_w, while the potential function and LS-SVM still classify with good accuracy. The application of LS-SVM to multiclass classification remains for further study.
Figure 10: Clustering results obtained by standard K-means.
Figure 11: Clustering results obtained by ISODATA.
7. Conclusion

On the basis of analyzing the strengths and weaknesses of the existing main clustering algorithms, this paper proposed the KMRIC algorithm for refining the initial centers and the AKM algorithm for automatically determining the number of clusters. The support vector machine was also used for classification. Finally, the algorithms were applied to electrofacies analysis, and the algorithms were compared through experimental analysis. According to the experimental results, the KMRIC algorithm removes the sensitivity of K-means to the data input order and the initial centers, achieving an obvious improvement over K-means and ISODATA. The AKM algorithm merges the determination of the number of clusters with the clustering process itself, avoiding the subjectivity in selecting the number of clusters and the blindness of the initial partition; under general conditions, the correct number of clusters and rational clusters can be found.

Some problems remain open. The volatility of results caused by the randomness
Figure 12: Clustering results obtained by KMRIC.
Figure 13: Clustering results obtained by AKM.
of selecting the initial points still exists in KMRIC and AKM. To address this problem, we can lower the randomness by repeatedly selecting optimal initial points. Hierarchical clustering is a very stable method, but its disadvantage is its massive computational cost. How to combine hierarchical clustering with the methods above may be a direction for future improvement.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The research of the authors was supported by the National Natural Science Foundation of China (no. 61103037), the Natural Science Foundation of Guangdong Province (no. S2013010011858), the Project of Guangdong University Outstanding Young Talents Cultivation (no. 2012LYM_0125), and the Dongguan Science and Technology Project (no. 2012108102007).
References

[1] S. Dong, D. D. Zhou, and W. Ding, "Flow cluster algorithm based on improved K-means method," IETE Journal of Research, vol. 59, no. 4, pp. 326-333, 2013.
[2] J. Q. He, H. Dai, and X. Song, "The combination stretching function technique with simulated annealing algorithm for global optimization," Optimization Methods and Software, vol. 29, no. 3, pp. 629-645, 2014.
[3] J. Liu and T. Z. Liu, "Detecting community structure in complex networks using simulated annealing with k-means algorithms," Physica A: Statistical Mechanics and Its Applications, vol. 389, no. 11, pp. 2300-2309, 2010.
[4] S. H. Kim and L. Li, "Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, pp. 413-432, 2014.
[5] S. Kalyani and K. S. Swarup, "Particle swarm optimization based K-means clustering approach for security assessment in power systems," Expert Systems with Applications, vol. 38, no. 9, pp. 10839-10846, 2011.
[6] D. H. Wang, J. F. Wang, and X. Y. Xu, "A relevance vector machine and bare-bones particle swarm optimization hybrid algorithm for PD pattern recognition of XLPE cable," Journal of Computational Information Systems, vol. 8, no. 2, pp. 451-458, 2012.
[7] D. Pelleg and A. W. Moore, "X-means: extending K-means with efficient estimation of the number of clusters," in Proceedings of the 17th International Conference on Machine Learning, pp. 727-734, 2000.
[8] E. Bertini, A. Tatu, and D. Keim, "Quality metrics in high-dimensional data visualization: an overview and systematization," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2203-2212, 2011.
[9] L. M. Li and Z. S. Wang, "Method of redundant features eliminating based on k-means clustering," Applied Mechanics and Materials, vol. 488, pp. 1023-1026, 2014.
[10] C. H. Lin, C. C. Chen, H. L. Lee, et al., "Fast K-means algorithm based on a level histogram for image retrieval," Expert Systems with Applications, vol. 41, no. 7, pp. 3276-3283, 2014.
[11] P. A. W. Lewis, "Distribution of the Anderson-Darling statistic," Annals of Mathematical Statistics, vol. 32, pp. 1118-1124, 1961.
[12] M. Z. Tang and C. H. Yang, "Excellent operational pattern recognition based on simultaneously optimizing cost-sensitive support vector machine," CIESC Journal, vol. 64, no. 12, pp. 4509-4514, 2013.
[13] C. L. Blake and C. J. Merz, "UCI repository of machine learning databases," 1998.
2 Journal of Applied Mathematics
K-means [9 10] is a clustering method most widely usedin science and engineering nowadays However it has thefollowing 5 deficiencies [3 5]
(1) The results are initial center initiative(2) Only local optimal solution can be obtained rather
than global optimal solution(3) The number of clustering k should be set in advanced
artificially(4) The error point imposes serious impacts on the results
of clustering(5) The algorithm lacks scalability
The paper introduces an improved algorithm accordingto the deficiencies of K-means
2 Improve K-Means Method of Initial Center
Aimed at the disadvantages (1) and (4) inK-means algorithmwe propose a K-means algorithm with refined initial centers(KMRIC for short) based on the works of predecessors [1]
(1) Randomly extract J sample subsets 119878119894 119894 = 1 2 119869
(2) Conduct K-means clustering of 119869 sample subsetsrespectively on the whole data field to get J setsCM119894 119894 = 1 2 119869 CM = ⋃119869
119894=1CM119894 in which there
are119870 times 119869 points for CM at most(3) Conduct K-means clustering on CM by taking CM as
the initial clustering center to get J clustering centersets FM
119894119894 = 1 2 119869
It can be seen from Figure 1 that the clustering center isobtained fromdifferent subsample set near the real clusteringcenter and clustering is formed by different subsample setIn (3) selecting the one with the minimum sum of squaresof deviations as the improved initial clustering center canreduce the randomness brought by random selection In (2)to eliminate the influence of error point the modified K-means algorithm (KmeansMod) is adoptedKmeansMod hasthe following modification based on the standard K-meanswhen the standardK-means algorithm is completed the datapoint contained in each clustering will be checked If the datapoint contained in a clustering is zero the original center willbe replaced by taking the data point furthest to the clusteringcenter as a new center and then the K-means algorithm isreran
KMRPIC algorithm eliminates the sensitivity ofK-meansalgorithm to data input consequence and initial centerswhich is an obvious improvement compared with K-meanseffect When applied to large-scale data KMRIC can reducethe iterations and improve the execution efficiency
3 Adaptive K-Means
The number of clusters k of K-means algorithm should be setin advance manually However actually we do not know thevalue of k especially in the case of high dimension of data soit is more difficult to select the correct value of k
stands for the real clusteringstands for the cluster that is obtained by different subsample set
Figure 1 Multicombination clustering center obtained from multi-ple sample subsets
X-means put forward by Pelleg and Moore [7] canautomatically determine the number of clusters HoweverX-means is prone to split data into more clusters than theactual ones which is particularly obvious when the data isnot strictly subject to the normal distribution Lewis [11]statistics are introduced as the standard of measuring thenormal distribution and propose an adaptiveK-means (AKMfor short)
The AKM algorithm first assumes that all data are inthe same cluster then the number of clustering is graduallyincreased in the subsequent iterations In each iterationwhether each cluster satisfied the normal distribution isjudged at once if not the cluster should be split into twoclusters After each splitting K-means clustering is carriedout in the whole data field to improve the clustering resultsThe iteration ends until there is no splitting and then the finalclustering results will be obtained The schematic diagram ofAKM algorithm is shown in Figure 2 In Figure 2 clusteringis divided into three categories firstly then each categoryis split into two subclasses At last the results are got afterone splitting to judge whether each subclass follows Gaussiandistribution
The judgment of splitting is as follows
(1) Select the confidence level 120572
(2) Run KMRIC program and split X into two to get twoclustering centers 119888
1 1198882
(3) Let ] = 1198881minus1198882be anN-dimensional vector connecting
the two centers which is themain direction of judgingthe normal distribution X is projected on ] 1199091015840
119894=
(⟨119909119894 ]⟩]2)1198831015840 is transformed to make its mean as 0
and variance as 1
(4) Suppose that 119911119894= 119865(119909
1015840
(119894)) The results 1198602
lowast(119885) with
respect to confidence level 120572 are not significant soaccept119867
0 reserve the original clustering center 119888 and
abandon 1198881and 1198882 Otherwise reject 119867
0 and replace
the original clustering center 119888 by 1198881and 1198882
Journal of Applied Mathematics 3
(a) It is divided into three categories (b) Each category is split into two subclasses
(c) Get the results after one splitting to judge whether eachsubclass follows Gaussian distribution
Figure 2 Schematic diagram of AKM algorithm
1198602
lowast(119885) is the statistics of Anderson Darling
1198602(119885) = minus
1
119899
119899
sum
119894=1
(2119894 minus 1) [log (119911119894) + log (1 minus 119911119899+1minus119894)] minus 119899
(1)
Figure 3 shows two distribution circumstances InFigure 3(a) the subclass follows Gaussian distributionbut in Figure 3(b) the subclass does not follow Gaussiandistribution AKM algorithm can judge whether eachsubclass follows Gaussian distribution
AKM integrates the determination process of the numberof clusters and the clustering process which can automati-cally determine the optimal number of clusters thus avoidingthe subjectivity in the selection of number of clusters and theblindness of initialization and can also distinguish the errors
4 Discriminant Method
41 Fisher Classification Fisher method actually is about thedimension compression Projecting the samples which can
be easily separated in higher space on a straight line arbitrarilymay be difficult to be identified for different types mixedtogether Generally the best direction can always be foundto separate the samples when projected on that directionBut how to find out the best direction and how to realizethe transformations of projection toward the best directionare the very two problems to be solved by Fisher algorithmFigure 4 shows analysis schematic diagram of Fisher algo-rithm using linear discriminant In Figure 4(a) the samplecannot be identifiedwhen being projected on coordinate axisand in Figure 4(b) the projection samples can be identified bylooking for a direction
4.2. Potential Function Classification. The potential function method, commonly used in nonlinear classifiers, solves pattern classification problems through an analogy with the electric field. The samples belonging to one category are treated as positive charges and the samples belonging to the other category as negative charges, which turns the pattern classification problem into the problem of placing positive and negative charges; the equipotential line where the electric potential is zero is the decision boundary. The training course of the potential function algorithm is a process of accumulating electric potential, via the potential function, as the samples are input one after another.

Figure 3: Judging whether each subclass follows a Gaussian distribution. (a) The subclass follows a Gaussian distribution; (b) the subclass does not follow a Gaussian distribution.

Figure 4: Schematic diagram of Fisher linear discriminant analysis. (a) The samples cannot be separated when projected onto a coordinate axis; (b) the projected samples can be separated by finding a suitable direction.
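The charge analogy can be sketched as follows: each training sample deposits positive or negative "charge", and a query point is classified by the sign of the accumulated potential. This is a minimal illustration of the idea, not the paper's exact algorithm; the Gaussian kernel and its width are assumed choices.

```python
import math
import random

def kernel(x, y, alpha=1.0):
    # Gaussian potential of a unit charge at y, evaluated at x (assumed form)
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-alpha * d2)

class PotentialClassifier:
    def __init__(self):
        self.charges = []  # (sample, +1 or -1)

    def train(self, samples, labels):
        # potential accumulates as samples are presented one after another
        for x, y in zip(samples, labels):
            self.charges.append((x, y))

    def potential(self, x):
        return sum(y * kernel(x, c) for c, y in self.charges)

    def classify(self, x):
        # the zero-potential contour is the decision boundary
        return 1 if self.potential(x) >= 0 else -1

random.seed(2)
pos = [(random.gauss(2, 0.5), random.gauss(2, 0.5)) for _ in range(30)]
neg = [(random.gauss(-2, 0.5), random.gauss(-2, 0.5)) for _ in range(30)]
clf = PotentialClassifier()
clf.train(pos + neg, [1] * 30 + [-1] * 30)
```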
4.3. Least Squares Support Vector Machine (LS-SVM). Based on the VC-dimension theory of statistical learning and the structural risk minimization principle, the support vector machine method [12] maps the practical problem into a high-dimensional feature space through a nonlinear transformation and realizes a nonlinear discriminant function in the original space by constructing a linear discriminant function in the higher space. LS-SVM is an extension of the traditional support vector machine: it replaces the quadratic programming traditionally used to solve classification and estimation problems with a least squares linear system.
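The "least squares linear system" replacement can be made concrete. In Suykens' standard LS-SVM formulation (an assumption here, since the paper does not spell out its variant), training reduces to solving one symmetric linear system instead of a quadratic program; the RBF kernel, the regularization constant gamma, and the synthetic data below are illustrative choices.

```python
import numpy as np

def rbf(a, b, s=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * s * s))

def lssvm_train(X, y, gamma=10.0):
    # Solve [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]
    # where Omega_kl = y_k y_l K(x_k, x_l)  (Suykens' formulation)
    n = len(y)
    K = np.array([[rbf(X[i], X[j]) for j in range(n)] for i in range(n)])
    omega = np.outer(y, y) * K
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, support values alpha

def lssvm_predict(X, y, alpha, b, x):
    s = sum(alpha[k] * y[k] * rbf(x, X[k]) for k in range(len(y)))
    return 1 if s + b >= 0 else -1

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(2, 0.5, (20, 2)), rng.normal(-2, 0.5, (20, 2))])
y = np.array([1.0] * 20 + [-1.0] * 20)
b, alpha = lssvm_train(X, y)
```

Every training sample obtains a nonzero alpha (the equality constraints lose sparsity), which is the usual price of trading the QP for a linear solve.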
5. Procedures of Electrofacies Analysis
The procedure of electrofacies analysis is shown in Figure 5
5.1. Feature Extraction of Log Data. The first step in establishing electrofacies is to extract a set of log data features that reflect the lithologic character of the sedimentary rock. Generally there are nine or more types of well-logging items, and these items are correlated. There are two ways to eliminate redundant information, simplify the subsequent processing, and reduce the amount of computation: (1) principal component analysis; (2) selecting logging items manually. The extracted logging items are recorded in Table stdlogdata as the data source for clustering analysis.
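Option (1) can be sketched as follows. The synthetic 177 x 9 matrix stands in for the correlated logging items of Table stdlogdata (its low-rank structure and the 95% variance threshold are illustrative assumptions); the retained principal component scores are what would feed the clustering step.

```python
import numpy as np

rng = np.random.default_rng(4)
base = rng.normal(size=(177, 3))              # three underlying factors
mixing = rng.normal(size=(3, 9))
# nine correlated "logging items" plus a little measurement noise
logs = base @ mixing + 0.01 * rng.normal(size=(177, 9))

centered = logs - logs.mean(axis=0)
# eigen-decomposition of the covariance matrix of the logging items
cov = np.cov(centered, rowvar=False)
vals, vecs = np.linalg.eigh(cov)
order = np.argsort(vals)[::-1]                # sort components by variance
vals, vecs = vals[order], vecs[:, order]

explained = vals / vals.sum()
# keep the smallest number of components explaining 95% of the variance
k = int(np.searchsorted(np.cumsum(explained), 0.95)) + 1
features = centered @ vecs[:, :k]             # scores used for clustering
```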
5.2. Clustering Analysis. In order to find electrofacies of the same type and establish a standard library for electrofacies analysis, clustering analysis must be conducted on the stratum. Finally, the classification results acquired by clustering should be recorded in the "Category" column of Table stdlogdata, and the lithology recorded in the "Lithology" column according to the lithology dictionary.

Figure 5: Flow diagram of electrofacies analysis (for typical wells: set up the dynamic data, select well-logging items or apply principal component analysis, run clustering analysis, and set up the electrofacies-lithofacies calibration via the lithology dictionary; for unknown wells: set up the dynamic data, select well-logging items, run discriminant analysis, and determine the result).
5.3. Discriminant Analysis. After establishing the lithofacies database, namely the electrofacies of the type well, it is possible to discriminate the lithofacies of other wells. After discrimination, the data and discriminant results are written into Table anylogdata, and the logging items into Table anylogitem.
6. Comparison and Analysis of Algorithm Results
6.1. Experimental Data. The Iris dataset [13] usually serves as a benchmark testing dataset; each record contains 4 attributes of an iris, totaling 150 samples. The correct classification result is that each of the three types has 50 samples. Each record of the electrofacies dataset contains eight attributes, totaling 177 samples. For the real electrofacies data there is no strictly accurate number of categories or standard classification; judging by experience, 8 classes may be rational.
6.2. Analysis of Experimental Clustering Results
6.2.1. Iris Dataset. It can easily be seen from Figures 6-9 that the clusters obtained by the standard K-means algorithm differ considerably from the standard results, while the clustering results obtained by ISODATA and KMRIC come close to the standard ones and match the results of Matlab's built-in K-means algorithm. AKM produces only two categories: the second and third categories are deemed to belong to the same normal distribution and are never split apart, for they lie close to each other and partly overlap (see Tables 1 and 2).
Table 1: Clustering method comparison under the Iris dataset.

            K-means  ISODATA  KMRIC    AKM   Matlab
Type I         30       50      50      53      50
Type II        24       39      39      97      38
Type III       96       61      61       0      62
Accuracy     69.3%    92.6%   92.6%   66.7%    92%

Table 2: Clustering method comparison under the electrofacies dataset.

            K-means  ISODATA  KMRIC    AKM   Matlab
Type I         56       35      47      46      47
Type II        38       30      46      40      45
Type III       36       26      26      26      23
Type IV        18       23      14      23      23
Type V         11       17      13      13      13
Type VI         9       13      12      12      12
Type VII        8       12      10      10      11
Type VIII       1       10       9       7       3
Type IX         0        8       0       0       0
Type X          0        3       0       0       0
6.2.2. Electrofacies Dataset. It can be seen from Figures 10-13 that the clustering results obtained by K-means have large errors, while the clusters obtained by KMRIC and AKM are relatively rational and basically reflect the correct classification; AKM also identifies the correct number of clusters automatically. Compared with ISODATA, AKM is more accurate in determining the number of clusters, and its clustering results are more rational as well. Besides, this shows that AKM's hypothesis-testing approach to judging the number of clusters is more universal than ISODATA's judgment based on between-class distance.
Figure 6: Clustering results of the dataset obtained by Matlab.

Figure 7: Clustering results obtained by standard K-means.
Table 3: Number of misclassifications and accuracy of various discriminant methods under the Iris dataset.

            Fisher  Potential function  LS-SVM
Type I         0            0              0
Type II        1            0              0
Type III       0            0              0
Total          1            0              0
Accuracy     96.7%        100%           100%
6.3. Experimental Results and Analysis of Classification
6.3.1. Iris Dataset. See Table 3.
Figure 8: Clustering results obtained by ISODATA and KMRIC.

Figure 9: Clustering results obtained by AKM.

Table 4: Number of misclassifications of various discriminant methods under the electrofacies dataset.

            Fisher  Potential function  LS-SVM
Type I         -            0              0
Type II        -            0              2
Type III       -            0              0
Type IV        -            1              2
Type V         -            0              0
Type VI        -            0              0
Type VII       -            0              3
Type VIII      -            0              2
Total          -            1              9
Accuracy       -          94.9%          76.9%

6.3.2. Electrofacies Dataset. It can be seen from Tables 3 and 4 that all three classification methods work well on the Iris data, for the structure of the Iris data is quite simple and low in dimension. For the electrofacies data, Fisher discriminant analysis is not applicable because the within-class scatter matrix S_w is singular, while the potential function and LS-SVM still classify with good accuracy. The application of LS-SVM to multiclass classification remains for further study.
Figure 10: Clustering results obtained by standard K-means.

Figure 11: Clustering results obtained by ISODATA.
7. Conclusion
On the basis of analyzing the strengths and weaknesses of the main existing clustering algorithms, this paper proposed the KMRIC algorithm for improving initial centers and the AKM algorithm for determining the number of clusters; the support vector machine was also used for classification. Finally, the algorithms were applied to electrofacies analysis and compared experimentally. According to the experimental results, the KMRIC algorithm removes the sensitivity of K-means to the data input order and the initial centers, achieving an obvious improvement over K-means and ISODATA. The AKM algorithm merges the determination of the number of clusters with the clustering process itself, avoiding the subjectivity in selecting the number of clusters and the blindness of the initial partitions. Under general conditions, the correct number of clusters and rational clusters can be found.
There are some other problems that remain open. Volatility of the results, caused by the randomness of selecting initial points in KMRIC, exists in both KMRIC and AKM. To address this problem, the randomness can be lowered by repeatedly selecting optimized initial points. Hierarchical clustering is a very stable method, but its disadvantage is its massive computational cost; how to combine hierarchical clustering with the methods above may be a direction for future improvement.

Figure 12: Clustering results obtained by KMRIC.

Figure 13: Clustering results obtained by AKM.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
Research of the authors was supported by the National Natural Science Foundation of China (no. 61103037), the Natural Science Foundation of Guangdong Province (no. S2013010011858), the Project of Guangdong University Outstanding Young Talents Cultivation (no. 2012LYM 0125), and the Dongguan Science and Technology Project (no. 2012108102007).
8 Journal of Applied Mathematics
References
[1] S. Dong, D. D. Zhou, and W. Ding, "Flow cluster algorithm based on improved K-means method," IETE Journal of Research, vol. 59, no. 4, pp. 326-333, 2013.
[2] J. Q. He, H. Dai, and X. Song, "The combination stretching function technique with simulated annealing algorithm for global optimization," Optimization Methods and Software, vol. 29, no. 3, pp. 629-645, 2014.
[3] J. Liu and T. Z. Liu, "Detecting community structure in complex networks using simulated annealing with k-means algorithms," Physica A: Statistical Mechanics and Its Applications, vol. 389, no. 11, pp. 2300-2309, 2010.
[4] S. H. Kim and L. Li, "Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, pp. 413-432, 2014.
[5] S. Kalyani and K. S. Swarup, "Particle swarm optimization based K-means clustering approach for security assessment in power systems," Expert Systems with Applications, vol. 38, no. 9, pp. 10839-10846, 2011.
[6] D. H. Wang, J. F. Wang, and X. Y. Xu, "A relevance vector machine and bare-bones particle swarm optimization hybrid algorithm for PD pattern recognition of XLPE cable," Journal of Computational Information Systems, vol. 8, no. 2, pp. 451-458, 2012.
[7] D. Pelleg and A. W. Moore, "X-means: extending K-means with efficient estimation of the number of clusters," in Proceedings of the 17th International Conference on Machine Learning, pp. 727-734, 2000.
[8] E. Bertini, A. Tatu, and D. Keim, "Quality metrics in high-dimensional data visualization: an overview and systematization," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2203-2212, 2011.
[9] L. M. Li and Z. S. Wang, "Method of redundant features eliminating based on k-means clustering," Applied Mechanics and Materials, vol. 488, pp. 1023-1026, 2014.
[10] C. H. Lin, C. C. Chen, H. L. Lee, et al., "Fast K-means algorithm based on a level histogram for image retrieval," Expert Systems with Applications, vol. 41, no. 7, pp. 3276-3283, 2014.
[11] P. A. W. Lewis, "Distribution of the Anderson-Darling statistic," Annals of Mathematical Statistics, vol. 32, pp. 1118-1124, 1961.
[12] M. Z. Tang and C. H. Yang, "Excellent operational pattern recognition based on simultaneously optimizing cost-sensitive support vector machine," CIESC Journal, vol. 64, no. 12, pp. 4509-4514, 2013.
[13] C. L. Blake and C. J. Merz, "UCI repository of machine learning databases," 1998.
Figure 2: Schematic diagram of the AKM algorithm. (a) The data are divided into three categories; (b) each category is split into two subclasses; (c) after one split, judge whether each subclass follows a Gaussian distribution.

$A^2(Z)$ is the Anderson-Darling statistic:

$$A^2(Z) = -\frac{1}{n}\sum_{i=1}^{n}(2i-1)\bigl[\log(z_i) + \log(1 - z_{n+1-i})\bigr] - n. \quad (1)$$
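Equation (1) can be evaluated directly. The sketch below fits a normal distribution to a sample, maps the sorted values through its CDF to obtain the $z_i$, and computes $A^2$ using only the standard library; the decision threshold is omitted (critical values for the statistic are tabulated in [11]).

```python
import math
import random

def anderson_darling(x):
    """A^2 statistic of Eq. (1) for testing whether sample x is
    consistent with a Gaussian fitted to x itself."""
    n = len(x)
    mu = sum(x) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in x) / (n - 1))
    # z_i: sample values mapped through the fitted normal CDF, sorted ascending
    z = sorted(0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
               for v in x)
    # sum_{i=1..n} (2i - 1) [log z_i + log(1 - z_{n+1-i})]
    s = sum((2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
            for i in range(1, n + 1))
    return -s / n - n

random.seed(0)
gaussian = [random.gauss(0.0, 1.0) for _ in range(500)]
uniform = [random.uniform(-2.0, 2.0) for _ in range(500)]
print(anderson_darling(gaussian))  # small: consistent with a Gaussian
print(anderson_darling(uniform))   # larger: departs from a Gaussian
```

In AKM's splitting step, a subclass whose statistic exceeds the chosen critical value would be judged non-Gaussian, as in Figure 3(b), and the split would be kept.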
Figure 3 shows two distribution circumstances InFigure 3(a) the subclass follows Gaussian distributionbut in Figure 3(b) the subclass does not follow Gaussiandistribution AKM algorithm can judge whether eachsubclass follows Gaussian distribution
AKM integrates the determination process of the numberof clusters and the clustering process which can automati-cally determine the optimal number of clusters thus avoidingthe subjectivity in the selection of number of clusters and theblindness of initialization and can also distinguish the errors
4 Discriminant Method
41 Fisher Classification Fisher method actually is about thedimension compression Projecting the samples which can
be easily separated in higher space on a straight line arbitrarilymay be difficult to be identified for different types mixedtogether Generally the best direction can always be foundto separate the samples when projected on that directionBut how to find out the best direction and how to realizethe transformations of projection toward the best directionare the very two problems to be solved by Fisher algorithmFigure 4 shows analysis schematic diagram of Fisher algo-rithm using linear discriminant In Figure 4(a) the samplecannot be identifiedwhen being projected on coordinate axisand in Figure 4(b) the projection samples can be identified bylooking for a direction
42 Potential Function Classification Potential function acommon method used in nonlinear classifier is a way tosolve the classification problems of pattern via the conceptionof electric field In the potential function classification thesamples belonging to one category are treated as positivecharge while the samples belonging to another category aretreated as the negative charge thus turning the classificationproblems of pattern to the matter of transferring the positive
4 Journal of Applied Mathematics
(a) The subclass follows Gaussian distribution (b) The subclass does not follow Gaussian distribution
Figure 3 Judge whether each subclass follows Gaussian distribution
(a) The sample cannot be identified when beingprojected on coordinate axis
(b) Theprojection samples can be identified by lookingfor a direction
Figure 4 Schematic diagram of Fisher linear discriminant analysis
charge and negative charge and the equipotential line whereits electric potential is zero is the decision boundary Thetraining course of potential function algorithm is a processof accumulating electric potential when the samples are inputone after another by exploiting the potential function
43 Least Squares Support Vector Machine (LS-SVM) Basedon the VC dimension theory of statistical learning theoryand the structural riskminimization principle support vectormachinesmethod [12] converts the practical problem to high-dimensional feature space through nonlinear transform-ation and realizes the nonlinear discriminant function in theoriginal space by constructing linear discriminant functionin higher space By means of introducing the least squareslinear system into support vector machine to replace thetraditional one quadratic programming method which isadopted to settle the problems of classification and estima-tion is a kind of extension of traditional support vectormachine
5 Procedures of Electrofacies Analysis
The procedure of electrofacies analysis is shown in Figure 5
51 Feature Extraction of Log Data The primary step toestablish electrofacies is to extract a set of log data featuresthat can reflect the lithologic character of sedimentary rockGenerally there are 9 types of well-logging items or moreand those logging items are interrelated There are two waysto eliminate gibberish simplify control methods and reducecalculated amount (1) principal component analysis (2)Select logging items manually The extracted logging itemswill be recorded in Table stdlogdata as the data source forclustering analysis
52 Clustering Analysis In order to find out the electrofaciesof the same type and establish a standard library in electro-facies analysis clustering analysis must be conducted to stra-tum Finally the classification results acquired by clustering
Journal of Applied Mathematics 5
Input number of typical
Set up dynamic data
Principal component
analysis
Select well-
logging item
Clustering analysis
Lithology dictionary
Set up electrofacies-lithofa
Input number of unknown
Set up dynamic data
Select well-logging item
Discriminate analysis
Determine
Figure 5 Flow diagram of electrofacies analysis
should be recorded in the column of ldquoCategoryrdquo in Tablestdlogdata and the lithology be recorded in the column ofldquoLithologyrdquo according to the lithology dictionary
53 Discriminant Analysis After establishing lithofaciesdatabase namely the electrofacies of type well it is possibleto discriminate the lithofacies of other wells After discrim-ination the data and discriminant results will be written inTable anylogdata and the logging items bewritten in the Tableanylogitem
6 Comparison and Analysis ofResults of Algorithm
61 Experimental Data The Iris dataset [13] usually serves asthe testing dataset for benchmark function in which eachrecord contains 4 attributes of Iris totaling 150 samplesThe correct classification result is that each type of datahas 50 samples Eight attributes are included in each set ofdata of electrofacies totaling 177 samples As for the realdata in electrofacies there is no strictly accurate number ofcategories and standard classification Judging by experience8 classifications may be rational
62 Analysis of Experimental Results of Cluster
621 Iris Dataset It can be easily seen from Figures 6ndash9that the cluster obtained by standard K-means algorithm ispretty different from the standard results while the clusteringresults obtained by ISODATA and KMRIC come near to thestandard ones and are the same as the results obtained bybuilt-in K-means algorithm of Matlab AKM has only twocategories The second and the third categories are deemedas belonging to the same normal distribution that are neverapart for they are approximate to each other and have someparts overlapped (see Table 1 and 2)
Table 1 Clustering method comparison under Iris dataset
119870-means ISODATA KMRIC AKM MatlabType I 30 50 50 53 50Type II 24 39 39 97 38Type III 96 61 61 0 62Accuracy 693 926 926 667 92
Table 2 Clustering method comparisons under Iris dataset
119870-means ISODATA KMRIC AKM MatlabType I 56 35 47 46 47Type II 38 30 46 40 45Type III 36 26 26 26 23Type IV 18 23 14 23 23Type V 11 17 13 13 13Type VI 9 13 12 12 12Type VII 8 12 10 10 11Type VIII 1 10 9 7 3Type IX 0 8 0 0 0Type X 0 3 0 0 0
622 Electrofacies Dataset It can be seen from Figures 10ndash13that the clustering results obtained by K-means have largeerror while the cluster obtained by KMRIC and AKM isrelatively rational and can basically reflect the right classifi-cation and AKM can also identify the accurate number ofclustering automatically Compared with ISODATA AKMis more accurate in determining the number of clusteringand its clustering results are more rational as well Besidesit proves that the hypothesis testing way to judge the numberof clustering of AKM is more universal than that by judgingit based on the between-class distance of ISODATA
6 Journal of Applied Mathematics
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 6 Clustering results of dataset by Matlab figure
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 7 Clustering results obtained by standard K-means
Table 3 Number of misclassification and accuracy of variousdiscriminant methods under Iris dataset
Fisher Potential function LS-SVMType I 0 0 0Type II 1 0 0Type III 0 0 0Total 1 0 0Accuracy 967 100 100
63 Experimental Results and Analysis of Classification
631 Iris Dataset See Table 3
632 Electrofacies Dataset It can be seen fromTables 3 and 4that these three classification methods all work well whenprocessing the Iris data for the data structure of Iris is quitesimple and low in dimension As for electrofacies data Fisherdiscriminant analysis is not applicable due to the singular
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 8 Clustering results obtained by ISODATA and KMRIC
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 9 Clustering results obtained by AKM
Table 4 Number of misclassification of various discriminantmethods under electrofacies dataset
Fisher Potential function LS-SVMType I mdash 0 0Type II mdash 0 2Type III mdash 0 0Type IV mdash 1 2Type V mdash 0 0Type VI mdash 0 0Type VII mdash 0 3Type VIII mdash 0 2Total mdash 1 9Accuracy mdash 949 769
within-class scatter 119878119908
matrix while the potential func-tion and LS-SVM still have better accuracy to classificationThe multiclassification of LS-SVM application remains forfurther study
Journal of Applied Mathematics 7
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 10 Clustering results obtained by standard K-means
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 11 Clustering results obtained by ISODATA
7 Conclusion
On the basis of analyzing the strengths and weaknesses of theexisting main algorithms for clustering this paper proposedthe KMRIC algorithm for improving initial points and theAKM algorithm for determining the number of clusters Thesupport vector machine has also been used for classificationFinally the algorithms are applied to electrofacies analysisThrough the experimental analysis comparison was madeamong algorithms According to the experimental resultsthe KMRIC algorithm erases the sensibility of K-meansalgorithm to data input sequence and initial centers and itachieves an obvious improvement relative to K-means andISODATA AKMalgorithmmixes the process of determiningthe number of clusters and the clustering process togetherto avoid the subjectivity in selecting the number of clustersand the blindness in initial divisions Under general condi-tion the number of clusters and rational clusters can be foundcorrectly
There are some other problems that remain open Thevolatility of results which was caused by the randomness
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 12 Clustering results obtained by KMRIC
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 13 Clustering results obtained by AKM
of selecting initial points in KMRIC existed in KMRIC andAKM To address this problem we can lower the randomnessby selecting the optimal initial points repeatedly Hierarchicalclustering is a very stable method but its disadvantage is themassive calculation cost How to combine the hierarchicalclustering and the abovementionedmethods may be taken asthe improvement direction in future
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
Research of the authors was supported by theNationalNatureScience Foundation of China (no 61103037) Nature ScienceFoundation of Guangdong Province (no S2013010011858)Project of Guangdong University of Outstanding Young Tal-ents Cultivation (no 2012LYM 0125) and Dongguan Scienceand Technology Project (no 2012108102007)
8 Journal of Applied Mathematics
References
[1] S Dong D D Zhou and W Ding ldquoFlow cluster algorithmbased on improvedK-meansmethodrdquo IETE Journal of Researchvol 59 no 4 pp 326ndash333 2013
[2] J Q He H Dai and X Song ldquoThe combination stretching fun-ction technique with simulated annealing algorithm for globaloptimizationrdquo Optimization Methods and Software vol 29 no3 pp 629ndash645 2014
[3] J Liu and T Z Liu ldquoDetecting community structure in complexnetworks using simulated annealing with k-means algorithmsrdquoPhysica A StatisticalMechanics and Its Applications vol 389 no11 pp 2300ndash2309 2010
[4] S H Kim and L Li ldquoStatistical identifiability and convergenceevaluation for nonlinear pharmacokinetic models with particleswarm optimizationrdquo Computer Methods and Programs inBiomedicine vol 113 no 2 pp 413ndash432 2014
[5] S Kalyani andK S Swarup ldquoParticle swarmoptimization basedK-means clustering approach for security assessment in powersystemsrdquo Expert Systems with Applications vol 38 no 9 pp10839ndash10846 2011
[6] D H Wang J F Wang and X Y Xu ldquoA relevance vectormachine and bare-bones particle swarm optimization hybridalgorithm for PD pattern recognition of XLPE cablerdquo Journal ofComputational Information Systems vol 8 no 2 pp 451ndash4582012
[7] D Pelleg and AWMoore ldquoX-means extending K-means withefficient estimation of the number of clustersrdquo in Proceedings ofthe 17th International Conference onMachine Learning pp 727ndash734 2000
[8] E Bertini A Tatu and D Keim ldquoQuality metrics in high-dimensional data visualization an overview and systemati-zationrdquo IEEE Transactions on Visualization and ComputerGraphics vol 17 no 12 pp 2203ndash2212 2011
[9] L M Li and Z S Wang ldquoMethod of redundant features elimi-nating based on k-means clusteringrdquo Applied Mechanics andMaterials vol 488 pp 1023ndash1026 2014
[10] C H Lin C C Chen H L Lee et al ldquoFast K-means algorithmbased on a level histogram for image retrievalrdquo Expert Systemswith Applications vol 41 no 7 pp 3276ndash3283 2014
[11] P AW Lewis ldquoDistribution of the Anderson-Darling statisticrdquoAnnals of Mathematical Statistics vol 32 pp 1118ndash1124 1961
[12] M Z Tang and C H Yang ldquoExcellent operational patternrecognition based on simultaneously optimizing cost-sensitivesupport vector machinerdquo CIESC Journal vol 64 no 12 pp4509ndash4514 2013
[13] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
4 Journal of Applied Mathematics
(a) The subclass follows Gaussian distribution (b) The subclass does not follow Gaussian distribution
Figure 3 Judge whether each subclass follows Gaussian distribution
(a) The sample cannot be identified when beingprojected on coordinate axis
(b) Theprojection samples can be identified by lookingfor a direction
Figure 4 Schematic diagram of Fisher linear discriminant analysis
charge and negative charge and the equipotential line whereits electric potential is zero is the decision boundary Thetraining course of potential function algorithm is a processof accumulating electric potential when the samples are inputone after another by exploiting the potential function
43 Least Squares Support Vector Machine (LS-SVM) Basedon the VC dimension theory of statistical learning theoryand the structural riskminimization principle support vectormachinesmethod [12] converts the practical problem to high-dimensional feature space through nonlinear transform-ation and realizes the nonlinear discriminant function in theoriginal space by constructing linear discriminant functionin higher space By means of introducing the least squareslinear system into support vector machine to replace thetraditional one quadratic programming method which isadopted to settle the problems of classification and estima-tion is a kind of extension of traditional support vectormachine
5 Procedures of Electrofacies Analysis
The procedure of electrofacies analysis is shown in Figure 5
51 Feature Extraction of Log Data The primary step toestablish electrofacies is to extract a set of log data featuresthat can reflect the lithologic character of sedimentary rockGenerally there are 9 types of well-logging items or moreand those logging items are interrelated There are two waysto eliminate gibberish simplify control methods and reducecalculated amount (1) principal component analysis (2)Select logging items manually The extracted logging itemswill be recorded in Table stdlogdata as the data source forclustering analysis
52 Clustering Analysis In order to find out the electrofaciesof the same type and establish a standard library in electro-facies analysis clustering analysis must be conducted to stra-tum Finally the classification results acquired by clustering
Journal of Applied Mathematics 5
Input number of typical
Set up dynamic data
Principal component
analysis
Select well-
logging item
Clustering analysis
Lithology dictionary
Set up electrofacies-lithofa
Input number of unknown
Set up dynamic data
Select well-logging item
Discriminate analysis
Determine
Figure 5 Flow diagram of electrofacies analysis
should be recorded in the "Category" column of Table stdlogdata, and the lithology recorded in the "Lithology" column according to the lithology dictionary.
5.3. Discriminant Analysis. After establishing the lithofacies database, namely the electrofacies of the type well, it is possible to discriminate the lithofacies of other wells. After discrimination, the data and discriminant results are written into Table anylogdata and the logging items into Table anylogitem.
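The three steps of Sections 5.1–5.3 can be sketched end to end. The snippet below uses plain K-means and a nearest-centroid rule as simple stand-ins for the clustering and discriminant steps (the paper's own methods are KMRIC/AKM and Fisher/potential-function/LS-SVM, respectively).

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    # Plain Lloyd's K-means: a stand-in for the KMRIC/AKM clustering step.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        new = np.array([X[labels == j].mean(0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def discriminate(unknown, centers):
    # Nearest-centroid assignment: a simple stand-in for the
    # Fisher / potential-function / LS-SVM discriminant step.
    return np.argmin(((unknown[:, None] - centers) ** 2).sum(-1), axis=1)
```

The cluster labels of the key well would populate the "Category" column of Table stdlogdata, and the labels returned by `discriminate` for a noncoring well would be written into Table anylogdata.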
6. Comparison and Analysis of Algorithm Results
6.1. Experimental Data. The Iris dataset [13] commonly serves as a benchmark testing dataset; each record contains 4 attributes of an iris flower, totaling 150 samples. The correct classification result is that each of the three types has 50 samples. Each record of the electrofacies data contains 8 attributes, totaling 177 samples. For the real electrofacies data there is no strictly accurate number of categories or standard classification; judging by experience, 8 classes may be rational.
6.2. Analysis of Experimental Clustering Results
6.2.1. Iris Dataset. It can easily be seen from Figures 6-9 that the clusters obtained by the standard K-means algorithm differ considerably from the standard results, while the clustering results obtained by ISODATA and KMRIC come near to the standard ones and match those of Matlab's built-in K-means algorithm. AKM produces only two categories: the second and the third categories are deemed to belong to the same normal distribution and are never separated, for they lie close to each other and partly overlap (see Tables 1 and 2).
Table 1: Clustering method comparison on the Iris dataset.

            K-means   ISODATA   KMRIC     AKM    Matlab
Type I         30        50       50       53      50
Type II        24        39       39       97      38
Type III       96        61       61        0      62
Accuracy     69.3%     92.6%    92.6%    66.7%     92%
Table 2: Clustering method comparison on the electrofacies dataset (each column sums to the 177 samples).

            K-means   ISODATA   KMRIC     AKM    Matlab
Type I         56        35       47       46      47
Type II        38        30       46       40      45
Type III       36        26       26       26      23
Type IV        18        23       14       23      23
Type V         11        17       13       13      13
Type VI         9        13       12       12      12
Type VII        8        12       10       10      11
Type VIII       1        10        9        7       3
Type IX         0         8        0        0       0
Type X          0         3        0        0       0
6.2.2. Electrofacies Dataset. It can be seen from Figures 10-13 that the clustering results obtained by K-means have large errors, while the clusters obtained by KMRIC and AKM are relatively rational and basically reflect the right classification; AKM can also identify the correct number of clusters automatically. Compared with ISODATA, AKM is more accurate in determining the number of clusters, and its clustering results are more rational as well. Besides, this shows that judging the number of clusters by hypothesis testing, as AKM does, is more universal than judging it by the between-class distance, as ISODATA does.
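The hypothesis-testing idea can be illustrated with an Anderson-Darling normality test [11]: a candidate cluster is projected onto its principal axis, and the 1-D projection is tested for normality; a cluster that fails the test is a candidate for splitting. The small-sample correction and the 5% critical value 0.752 below follow commonly published tables and are assumptions here, not values taken from the paper.

```python
import numpy as np
from math import erf, sqrt

def anderson_darling_normal(x):
    # A^2 statistic for normality with estimated mean and variance,
    # with a common small-sample correction (table values assumed).
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)
    cdf = 0.5 * (1.0 + np.array([erf(zi / sqrt(2.0)) for zi in z]))
    cdf = np.clip(cdf, 1e-12, 1 - 1e-12)
    i = np.arange(1, n + 1)
    a2 = -n - np.mean((2 * i - 1) * (np.log(cdf) + np.log(1 - cdf[::-1])))
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)

def looks_gaussian(cluster, crit=0.752):
    # Project the cluster onto its principal axis and test the 1-D
    # projection; crit is an assumed ~5% critical value.
    X = cluster - cluster.mean(axis=0)
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return anderson_darling_normal(X @ vt[0]) < crit
```

Two well-separated clusters merged into one group produce a strongly bimodal projection and a large statistic, so the test rejects normality and the group is split; a single Gaussian-looking cluster is left alone.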
Figure 6: Clustering results of the Iris dataset obtained by Matlab's built-in K-means.
Figure 7: Clustering results obtained by standard K-means.
Table 3: Number of misclassifications and accuracy of the discriminant methods on the Iris dataset.

            Fisher   Potential function   LS-SVM
Type I         0              0              0
Type II        1              0              0
Type III       0              0              0
Total          1              0              0
Accuracy     96.7%          100%           100%
6.3. Experimental Results and Analysis of Classification
6.3.1. Iris Dataset. See Table 3.
6.3.2. Electrofacies Dataset. It can be seen from Tables 3 and 4 that all three classification methods work well on the Iris data, for the data structure of Iris is quite simple and low in dimension. For the electrofacies data, Fisher discriminant analysis is not applicable due to the singular
Figure 8: Clustering results obtained by ISODATA and KMRIC.
Figure 9: Clustering results obtained by AKM.
Table 4: Number of misclassifications of the discriminant methods on the electrofacies dataset.

            Fisher   Potential function   LS-SVM
Type I        —               0              0
Type II       —               0              2
Type III      —               0              0
Type IV       —               1              2
Type V        —               0              0
Type VI       —               0              0
Type VII      —               0              3
Type VIII     —               0              2
Total         —               1              9
Accuracy      —             94.9%          76.9%
within-class scatter matrix S_w, while the potential function and LS-SVM still classify with good accuracy. The application of LS-SVM to multiclass classification remains for further study.
Figure 10: Clustering results obtained by standard K-means.
Figure 11: Clustering results obtained by ISODATA.
7. Conclusion
On the basis of analyzing the strengths and weaknesses of the main existing clustering algorithms, this paper proposed the KMRIC algorithm for improving the initial points and the AKM algorithm for determining the number of clusters. The support vector machine has also been used for classification. Finally, the algorithms were applied to electrofacies analysis, and a comparison was made among them through experimental analysis. According to the experimental results, the KMRIC algorithm removes the sensitivity of K-means to the data input sequence and the initial centers, and it achieves an obvious improvement over K-means and ISODATA. The AKM algorithm merges the determination of the number of clusters with the clustering process itself, avoiding the subjectivity of selecting the number of clusters and the blindness of initial divisions. Under general conditions, the correct number of clusters and rational clusters can be found.
There are some other problems that remain open. The volatility of results, caused by the randomness
Figure 12: Clustering results obtained by KMRIC.
Figure 13: Clustering results obtained by AKM.
of selecting initial points, exists in both KMRIC and AKM. To address this problem, we can lower the randomness by selecting the optimal initial points repeatedly. Hierarchical clustering is a very stable method, but its disadvantage is its massive calculation cost. How to combine hierarchical clustering with the abovementioned methods may be taken as a direction for future improvement.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
Research of the authors was supported by the National Natural Science Foundation of China (no. 61103037), the Natural Science Foundation of Guangdong Province (no. S2013010011858), the Project of Guangdong University Outstanding Young Talents Cultivation (no. 2012LYM_0125), and the Dongguan Science and Technology Project (no. 2012108102007).
References
[1] S. Dong, D. D. Zhou, and W. Ding, "Flow cluster algorithm based on improved K-means method," IETE Journal of Research, vol. 59, no. 4, pp. 326-333, 2013.
[2] J. Q. He, H. Dai, and X. Song, "The combination stretching function technique with simulated annealing algorithm for global optimization," Optimization Methods and Software, vol. 29, no. 3, pp. 629-645, 2014.
[3] J. Liu and T. Z. Liu, "Detecting community structure in complex networks using simulated annealing with k-means algorithms," Physica A: Statistical Mechanics and Its Applications, vol. 389, no. 11, pp. 2300-2309, 2010.
[4] S. H. Kim and L. Li, "Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, pp. 413-432, 2014.
[5] S. Kalyani and K. S. Swarup, "Particle swarm optimization based K-means clustering approach for security assessment in power systems," Expert Systems with Applications, vol. 38, no. 9, pp. 10839-10846, 2011.
[6] D. H. Wang, J. F. Wang, and X. Y. Xu, "A relevance vector machine and bare-bones particle swarm optimization hybrid algorithm for PD pattern recognition of XLPE cable," Journal of Computational Information Systems, vol. 8, no. 2, pp. 451-458, 2012.
[7] D. Pelleg and A. W. Moore, "X-means: extending K-means with efficient estimation of the number of clusters," in Proceedings of the 17th International Conference on Machine Learning, pp. 727-734, 2000.
[8] E. Bertini, A. Tatu, and D. Keim, "Quality metrics in high-dimensional data visualization: an overview and systematization," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2203-2212, 2011.
[9] L. M. Li and Z. S. Wang, "Method of redundant features eliminating based on k-means clustering," Applied Mechanics and Materials, vol. 488, pp. 1023-1026, 2014.
[10] C. H. Lin, C. C. Chen, H. L. Lee et al., "Fast K-means algorithm based on a level histogram for image retrieval," Expert Systems with Applications, vol. 41, no. 7, pp. 3276-3283, 2014.
[11] P. A. W. Lewis, "Distribution of the Anderson-Darling statistic," Annals of Mathematical Statistics, vol. 32, pp. 1118-1124, 1961.
[12] M. Z. Tang and C. H. Yang, "Excellent operational pattern recognition based on simultaneously optimizing cost-sensitive support vector machine," CIESC Journal, vol. 64, no. 12, pp. 4509-4514, 2013.
[13] C. L. Blake and C. J. Merz, "UCI repository of machine learning databases," 1998.
Journal of Applied Mathematics 5
Input number of typical
Set up dynamic data
Principal component
analysis
Select well-
logging item
Clustering analysis
Lithology dictionary
Set up electrofacies-lithofa
Input number of unknown
Set up dynamic data
Select well-logging item
Discriminate analysis
Determine
Figure 5 Flow diagram of electrofacies analysis
should be recorded in the column of ldquoCategoryrdquo in Tablestdlogdata and the lithology be recorded in the column ofldquoLithologyrdquo according to the lithology dictionary
53 Discriminant Analysis After establishing lithofaciesdatabase namely the electrofacies of type well it is possibleto discriminate the lithofacies of other wells After discrim-ination the data and discriminant results will be written inTable anylogdata and the logging items bewritten in the Tableanylogitem
6 Comparison and Analysis ofResults of Algorithm
61 Experimental Data The Iris dataset [13] usually serves asthe testing dataset for benchmark function in which eachrecord contains 4 attributes of Iris totaling 150 samplesThe correct classification result is that each type of datahas 50 samples Eight attributes are included in each set ofdata of electrofacies totaling 177 samples As for the realdata in electrofacies there is no strictly accurate number ofcategories and standard classification Judging by experience8 classifications may be rational
62 Analysis of Experimental Results of Cluster
621 Iris Dataset It can be easily seen from Figures 6ndash9that the cluster obtained by standard K-means algorithm ispretty different from the standard results while the clusteringresults obtained by ISODATA and KMRIC come near to thestandard ones and are the same as the results obtained bybuilt-in K-means algorithm of Matlab AKM has only twocategories The second and the third categories are deemedas belonging to the same normal distribution that are neverapart for they are approximate to each other and have someparts overlapped (see Table 1 and 2)
Table 1 Clustering method comparison under Iris dataset
119870-means ISODATA KMRIC AKM MatlabType I 30 50 50 53 50Type II 24 39 39 97 38Type III 96 61 61 0 62Accuracy 693 926 926 667 92
Table 2 Clustering method comparisons under Iris dataset
119870-means ISODATA KMRIC AKM MatlabType I 56 35 47 46 47Type II 38 30 46 40 45Type III 36 26 26 26 23Type IV 18 23 14 23 23Type V 11 17 13 13 13Type VI 9 13 12 12 12Type VII 8 12 10 10 11Type VIII 1 10 9 7 3Type IX 0 8 0 0 0Type X 0 3 0 0 0
622 Electrofacies Dataset It can be seen from Figures 10ndash13that the clustering results obtained by K-means have largeerror while the cluster obtained by KMRIC and AKM isrelatively rational and can basically reflect the right classifi-cation and AKM can also identify the accurate number ofclustering automatically Compared with ISODATA AKMis more accurate in determining the number of clusteringand its clustering results are more rational as well Besidesit proves that the hypothesis testing way to judge the numberof clustering of AKM is more universal than that by judgingit based on the between-class distance of ISODATA
6 Journal of Applied Mathematics
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 6 Clustering results of dataset by Matlab figure
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 7 Clustering results obtained by standard K-means
Table 3 Number of misclassification and accuracy of variousdiscriminant methods under Iris dataset
Fisher Potential function LS-SVMType I 0 0 0Type II 1 0 0Type III 0 0 0Total 1 0 0Accuracy 967 100 100
63 Experimental Results and Analysis of Classification
631 Iris Dataset See Table 3
632 Electrofacies Dataset It can be seen fromTables 3 and 4that these three classification methods all work well whenprocessing the Iris data for the data structure of Iris is quitesimple and low in dimension As for electrofacies data Fisherdiscriminant analysis is not applicable due to the singular
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 8 Clustering results obtained by ISODATA and KMRIC
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 9 Clustering results obtained by AKM
Table 4 Number of misclassification of various discriminantmethods under electrofacies dataset
Fisher Potential function LS-SVMType I mdash 0 0Type II mdash 0 2Type III mdash 0 0Type IV mdash 1 2Type V mdash 0 0Type VI mdash 0 0Type VII mdash 0 3Type VIII mdash 0 2Total mdash 1 9Accuracy mdash 949 769
within-class scatter 119878119908
matrix while the potential func-tion and LS-SVM still have better accuracy to classificationThe multiclassification of LS-SVM application remains forfurther study
Journal of Applied Mathematics 7
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 10 Clustering results obtained by standard K-means
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 11 Clustering results obtained by ISODATA
7 Conclusion
On the basis of analyzing the strengths and weaknesses of theexisting main algorithms for clustering this paper proposedthe KMRIC algorithm for improving initial points and theAKM algorithm for determining the number of clusters Thesupport vector machine has also been used for classificationFinally the algorithms are applied to electrofacies analysisThrough the experimental analysis comparison was madeamong algorithms According to the experimental resultsthe KMRIC algorithm erases the sensibility of K-meansalgorithm to data input sequence and initial centers and itachieves an obvious improvement relative to K-means andISODATA AKMalgorithmmixes the process of determiningthe number of clusters and the clustering process togetherto avoid the subjectivity in selecting the number of clustersand the blindness in initial divisions Under general condi-tion the number of clusters and rational clusters can be foundcorrectly
There are some other problems that remain open Thevolatility of results which was caused by the randomness
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 12 Clustering results obtained by KMRIC
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 13 Clustering results obtained by AKM
of selecting initial points in KMRIC existed in KMRIC andAKM To address this problem we can lower the randomnessby selecting the optimal initial points repeatedly Hierarchicalclustering is a very stable method but its disadvantage is themassive calculation cost How to combine the hierarchicalclustering and the abovementionedmethods may be taken asthe improvement direction in future
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
Research of the authors was supported by theNationalNatureScience Foundation of China (no 61103037) Nature ScienceFoundation of Guangdong Province (no S2013010011858)Project of Guangdong University of Outstanding Young Tal-ents Cultivation (no 2012LYM 0125) and Dongguan Scienceand Technology Project (no 2012108102007)
8 Journal of Applied Mathematics
References
[1] S Dong D D Zhou and W Ding ldquoFlow cluster algorithmbased on improvedK-meansmethodrdquo IETE Journal of Researchvol 59 no 4 pp 326ndash333 2013
[2] J Q He H Dai and X Song ldquoThe combination stretching fun-ction technique with simulated annealing algorithm for globaloptimizationrdquo Optimization Methods and Software vol 29 no3 pp 629ndash645 2014
[3] J Liu and T Z Liu ldquoDetecting community structure in complexnetworks using simulated annealing with k-means algorithmsrdquoPhysica A StatisticalMechanics and Its Applications vol 389 no11 pp 2300ndash2309 2010
[4] S H Kim and L Li ldquoStatistical identifiability and convergenceevaluation for nonlinear pharmacokinetic models with particleswarm optimizationrdquo Computer Methods and Programs inBiomedicine vol 113 no 2 pp 413ndash432 2014
[5] S Kalyani andK S Swarup ldquoParticle swarmoptimization basedK-means clustering approach for security assessment in powersystemsrdquo Expert Systems with Applications vol 38 no 9 pp10839ndash10846 2011
[6] D H Wang J F Wang and X Y Xu ldquoA relevance vectormachine and bare-bones particle swarm optimization hybridalgorithm for PD pattern recognition of XLPE cablerdquo Journal ofComputational Information Systems vol 8 no 2 pp 451ndash4582012
[7] D Pelleg and AWMoore ldquoX-means extending K-means withefficient estimation of the number of clustersrdquo in Proceedings ofthe 17th International Conference onMachine Learning pp 727ndash734 2000
[8] E Bertini A Tatu and D Keim ldquoQuality metrics in high-dimensional data visualization an overview and systemati-zationrdquo IEEE Transactions on Visualization and ComputerGraphics vol 17 no 12 pp 2203ndash2212 2011
[9] L M Li and Z S Wang ldquoMethod of redundant features elimi-nating based on k-means clusteringrdquo Applied Mechanics andMaterials vol 488 pp 1023ndash1026 2014
[10] C H Lin C C Chen H L Lee et al ldquoFast K-means algorithmbased on a level histogram for image retrievalrdquo Expert Systemswith Applications vol 41 no 7 pp 3276ndash3283 2014
[11] P AW Lewis ldquoDistribution of the Anderson-Darling statisticrdquoAnnals of Mathematical Statistics vol 32 pp 1118ndash1124 1961
[12] M Z Tang and C H Yang ldquoExcellent operational patternrecognition based on simultaneously optimizing cost-sensitivesupport vector machinerdquo CIESC Journal vol 64 no 12 pp4509ndash4514 2013
[13] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
6 Journal of Applied Mathematics
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 6 Clustering results of dataset by Matlab figure
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 7 Clustering results obtained by standard K-means
Table 3 Number of misclassification and accuracy of variousdiscriminant methods under Iris dataset
Fisher Potential function LS-SVMType I 0 0 0Type II 1 0 0Type III 0 0 0Total 1 0 0Accuracy 967 100 100
63 Experimental Results and Analysis of Classification
631 Iris Dataset See Table 3
632 Electrofacies Dataset It can be seen fromTables 3 and 4that these three classification methods all work well whenprocessing the Iris data for the data structure of Iris is quitesimple and low in dimension As for electrofacies data Fisherdiscriminant analysis is not applicable due to the singular
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 8 Clustering results obtained by ISODATA and KMRIC
15
1
05
0
minus05
minus1
minus15
minus4 minus3 minus2 minus1 0 1 2 3 4
Figure 9 Clustering results obtained by AKM
Table 4 Number of misclassification of various discriminantmethods under electrofacies dataset
Fisher Potential function LS-SVMType I mdash 0 0Type II mdash 0 2Type III mdash 0 0Type IV mdash 1 2Type V mdash 0 0Type VI mdash 0 0Type VII mdash 0 3Type VIII mdash 0 2Total mdash 1 9Accuracy mdash 949 769
within-class scatter 119878119908
matrix while the potential func-tion and LS-SVM still have better accuracy to classificationThe multiclassification of LS-SVM application remains forfurther study
Journal of Applied Mathematics 7
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 10 Clustering results obtained by standard K-means
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 11 Clustering results obtained by ISODATA
7 Conclusion
On the basis of analyzing the strengths and weaknesses of theexisting main algorithms for clustering this paper proposedthe KMRIC algorithm for improving initial points and theAKM algorithm for determining the number of clusters Thesupport vector machine has also been used for classificationFinally the algorithms are applied to electrofacies analysisThrough the experimental analysis comparison was madeamong algorithms According to the experimental resultsthe KMRIC algorithm erases the sensibility of K-meansalgorithm to data input sequence and initial centers and itachieves an obvious improvement relative to K-means andISODATA AKMalgorithmmixes the process of determiningthe number of clusters and the clustering process togetherto avoid the subjectivity in selecting the number of clustersand the blindness in initial divisions Under general condi-tion the number of clusters and rational clusters can be foundcorrectly
There are some other problems that remain open Thevolatility of results which was caused by the randomness
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 12 Clustering results obtained by KMRIC
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 13 Clustering results obtained by AKM
of selecting initial points in KMRIC existed in KMRIC andAKM To address this problem we can lower the randomnessby selecting the optimal initial points repeatedly Hierarchicalclustering is a very stable method but its disadvantage is themassive calculation cost How to combine the hierarchicalclustering and the abovementionedmethods may be taken asthe improvement direction in future
Conflict of Interests
The authors declare that there is no conflict of interestsregarding the publication of this paper
Acknowledgments
Research of the authors was supported by theNationalNatureScience Foundation of China (no 61103037) Nature ScienceFoundation of Guangdong Province (no S2013010011858)Project of Guangdong University of Outstanding Young Tal-ents Cultivation (no 2012LYM 0125) and Dongguan Scienceand Technology Project (no 2012108102007)
8 Journal of Applied Mathematics
References
[1] S Dong D D Zhou and W Ding ldquoFlow cluster algorithmbased on improvedK-meansmethodrdquo IETE Journal of Researchvol 59 no 4 pp 326ndash333 2013
[2] J Q He H Dai and X Song ldquoThe combination stretching fun-ction technique with simulated annealing algorithm for globaloptimizationrdquo Optimization Methods and Software vol 29 no3 pp 629ndash645 2014
[3] J Liu and T Z Liu ldquoDetecting community structure in complexnetworks using simulated annealing with k-means algorithmsrdquoPhysica A StatisticalMechanics and Its Applications vol 389 no11 pp 2300ndash2309 2010
[4] S H Kim and L Li ldquoStatistical identifiability and convergenceevaluation for nonlinear pharmacokinetic models with particleswarm optimizationrdquo Computer Methods and Programs inBiomedicine vol 113 no 2 pp 413ndash432 2014
[5] S Kalyani andK S Swarup ldquoParticle swarmoptimization basedK-means clustering approach for security assessment in powersystemsrdquo Expert Systems with Applications vol 38 no 9 pp10839ndash10846 2011
[6] D H Wang J F Wang and X Y Xu ldquoA relevance vectormachine and bare-bones particle swarm optimization hybridalgorithm for PD pattern recognition of XLPE cablerdquo Journal ofComputational Information Systems vol 8 no 2 pp 451ndash4582012
[7] D Pelleg and AWMoore ldquoX-means extending K-means withefficient estimation of the number of clustersrdquo in Proceedings ofthe 17th International Conference onMachine Learning pp 727ndash734 2000
[8] E Bertini A Tatu and D Keim ldquoQuality metrics in high-dimensional data visualization an overview and systemati-zationrdquo IEEE Transactions on Visualization and ComputerGraphics vol 17 no 12 pp 2203ndash2212 2011
[9] L M Li and Z S Wang ldquoMethod of redundant features elimi-nating based on k-means clusteringrdquo Applied Mechanics andMaterials vol 488 pp 1023ndash1026 2014
[10] C H Lin C C Chen H L Lee et al ldquoFast K-means algorithmbased on a level histogram for image retrievalrdquo Expert Systemswith Applications vol 41 no 7 pp 3276ndash3283 2014
[11] P AW Lewis ldquoDistribution of the Anderson-Darling statisticrdquoAnnals of Mathematical Statistics vol 32 pp 1118ndash1124 1961
[12] M Z Tang and C H Yang ldquoExcellent operational patternrecognition based on simultaneously optimizing cost-sensitivesupport vector machinerdquo CIESC Journal vol 64 no 12 pp4509ndash4514 2013
[13] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Journal of Applied Mathematics 7
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 10 Clustering results obtained by standard K-means
15
10
5
0
minus5
minus10
minus15
minus30 minus20 minus10 0 10 20 30
Figure 11 Clustering results obtained by ISODATA
7 Conclusion
On the basis of analyzing the strengths and weaknesses of theexisting main algorithms for clustering this paper proposedthe KMRIC algorithm for improving initial points and theAKM algorithm for determining the number of clusters Thesupport vector machine has also been used for classificationFinally the algorithms are applied to electrofacies analysisThrough the experimental analysis comparison was madeamong algorithms According to the experimental resultsthe KMRIC algorithm erases the sensibility of K-meansalgorithm to data input sequence and initial centers and itachieves an obvious improvement relative to K-means andISODATA AKMalgorithmmixes the process of determiningthe number of clusters and the clustering process togetherto avoid the subjectivity in selecting the number of clustersand the blindness in initial divisions Under general condi-tion the number of clusters and rational clusters can be foundcorrectly
Figure 12: Clustering results obtained by KMRIC.

Figure 13: Clustering results obtained by AKM.

Some problems remain open. The volatility of results caused by the randomness of selecting initial points in KMRIC persists in both KMRIC and AKM. To address this problem, the randomness can be lowered by repeatedly selecting the optimal initial points. Hierarchical clustering is a very stable method, but its disadvantage is its massive computational cost. How to combine hierarchical clustering with the methods above may be taken as a direction for future improvement.
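The suggested remedy, repeatedly selecting initial points and keeping the best run, can be sketched as a multi-restart K-means that retains the lowest-inertia solution. This is a minimal illustration under our own naming (`kmeans_run`, `kmeans_best_of`), not code from the paper:

```python
import numpy as np

def kmeans_run(X, k, rng, iters=100):
    # One run of Lloyd's algorithm from random initial centers.
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.linalg.norm(X[:, None] - C[None], axis=2).argmin(1)
        newC = np.array([X[labels == j].mean(0) if (labels == j).any() else C[j]
                         for j in range(k)])
        if np.allclose(newC, C):
            break
        C = newC
    # Within-cluster sum of squares measures the quality of this run.
    inertia = float(((X - C[labels]) ** 2).sum())
    return C, labels, inertia

def kmeans_best_of(X, k, restarts=10, seed=0):
    # Repeat K-means from fresh random starts and keep the run with the
    # lowest inertia, damping the volatility of any single initialization.
    rng = np.random.default_rng(seed)
    return min((kmeans_run(X, k, rng) for _ in range(restarts)),
               key=lambda r: r[2])
```

Each restart is independent, so the chance that all of them land in a poor local optimum decays quickly as the number of restarts grows; this is the standard way to trade extra computation for stability.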
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
Research of the authors was supported by the National Natural Science Foundation of China (no. 61103037), the Natural Science Foundation of Guangdong Province (no. S2013010011858), the Project of Guangdong University Outstanding Young Talents Cultivation (no. 2012LYM_0125), and the Dongguan Science and Technology Project (no. 2012108102007).
References

[1] S. Dong, D. D. Zhou, and W. Ding, "Flow cluster algorithm based on improved K-means method," IETE Journal of Research, vol. 59, no. 4, pp. 326–333, 2013.
[2] J. Q. He, H. Dai, and X. Song, "The combination stretching function technique with simulated annealing algorithm for global optimization," Optimization Methods and Software, vol. 29, no. 3, pp. 629–645, 2014.
[3] J. Liu and T. Z. Liu, "Detecting community structure in complex networks using simulated annealing with k-means algorithms," Physica A: Statistical Mechanics and Its Applications, vol. 389, no. 11, pp. 2300–2309, 2010.
[4] S. H. Kim and L. Li, "Statistical identifiability and convergence evaluation for nonlinear pharmacokinetic models with particle swarm optimization," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, pp. 413–432, 2014.
[5] S. Kalyani and K. S. Swarup, "Particle swarm optimization based K-means clustering approach for security assessment in power systems," Expert Systems with Applications, vol. 38, no. 9, pp. 10839–10846, 2011.
[6] D. H. Wang, J. F. Wang, and X. Y. Xu, "A relevance vector machine and bare-bones particle swarm optimization hybrid algorithm for PD pattern recognition of XLPE cable," Journal of Computational Information Systems, vol. 8, no. 2, pp. 451–458, 2012.
[7] D. Pelleg and A. W. Moore, "X-means: extending K-means with efficient estimation of the number of clusters," in Proceedings of the 17th International Conference on Machine Learning, pp. 727–734, 2000.
[8] E. Bertini, A. Tatu, and D. Keim, "Quality metrics in high-dimensional data visualization: an overview and systematization," IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2203–2212, 2011.
[9] L. M. Li and Z. S. Wang, "Method of redundant features eliminating based on k-means clustering," Applied Mechanics and Materials, vol. 488, pp. 1023–1026, 2014.
[10] C. H. Lin, C. C. Chen, H. L. Lee et al., "Fast K-means algorithm based on a level histogram for image retrieval," Expert Systems with Applications, vol. 41, no. 7, pp. 3276–3283, 2014.
[11] P. A. W. Lewis, "Distribution of the Anderson-Darling statistic," Annals of Mathematical Statistics, vol. 32, pp. 1118–1124, 1961.
[12] M. Z. Tang and C. H. Yang, "Excellent operational pattern recognition based on simultaneously optimizing cost-sensitive support vector machine," CIESC Journal, vol. 64, no. 12, pp. 4509–4514, 2013.
[13] C. L. Blake and C. J. Merz, "UCI repository of machine learning databases," 1998.