
International Journal of Mechanical Engineering and Technology (IJMET)

Volume 9, Issue 1, January 2018, pp. 980–994 Article ID: IJMET_09_01_105

Available online at http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=9&IType=1

ISSN Print: 0976-6340 and ISSN Online: 0976-6359

© IAEME Publication Scopus Indexed

HEART DISEASE PREDICTION USING HYBRID HARMONY SEARCH ALGORITHM WITH LEVI DISTRIBUTION

Prasad Koti, Dhavachelvan P, Kalaipriyan T, Sariga Arjunan, Uthayakumar J and Pothula Sujatha

Department of Computer Science, Pondicherry University, Puducherry, India

ABSTRACT

Prediction of Heart Disease (HD) is gaining importance in the field of medical diagnosis. Generally, experts are required to classify the data to identify whether the disease is present. HD has previously been predicted with exact algorithms, and some heuristic algorithms have also been used to produce precise results in less computation time. Initially, data mining algorithms were widely used to identify HD. After bio-inspired algorithms evolved for solving combinatorial optimization problems, HD prediction attracted a number of researchers. Feature Selection (FS), in turn, is a main research area in data classification; it is used to find a smaller set of features from the training dataset with predefined goals. Several techniques, including machine learning algorithms and biologically inspired algorithms, have been utilized for feature selection. This motivated us to design an intelligent-algorithm-based HD prediction method that uses hybrid models for an efficient local search procedure. This paper proposes a hybrid Harmony Search (HM-L) algorithm with Levi distribution to predict HD accurately and in a timely manner. In this research work, Correlation-based Feature Selection (CFS) is used as the feature selection technique. The effectiveness of the HM-L algorithm is validated by applying it to a set of datasets. The results obtained without and with feature selection are compared with one another. The simulation results indicate that the HM-L algorithm achieves better results than existing methods such as Harmony Search (HM), Biogeography-Based Optimization (BBO), Grey Wolf Optimization (GWO), AL Particle Swarm Optimization (ALPSO) and Artificial Bee Colony (ABC).

Keywords: Feature selection, Heart disease prediction, Harmony search algorithm,

intelligent algorithms, Levi distribution

Cite this Article: Prasad Koti, Dhavachelvan P, Kalaipriyan T, Sariga Arjunan,

Uthayakumar J and Pothula Sujatha, Heart Disease Prediction using Hybrid Harmony

Search Algorithm with Levi Distribution, International Journal of Mechanical

Engineering and Technology 9(1), 2018. pp. 980–994.

http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=9&IType=1



1. INTRODUCTION

An important problem for medical institutions such as hospitals is providing quality services at reasonable cost. Quality service implies the proper diagnosis of patients; improper decisions can have serious and unacceptable consequences. Clinical decisions are often made using the doctor's perception and experience rather than the knowledge concealed in the database. This practice leads to unnecessary biases, errors and excessive costs, which affect the Quality of Service (QoS) given to patients. Wu et al. [1] presented a system that combines a clinical decision support system (CDSS) with computer-based patient records to reduce the chance of mistakes, improve safety and improve patient outcomes. In recent days, many hospitals employ several kinds of hospital information systems to handle their medical data [2].

HD has increased sharply in recent years and has become a major cause of death in several parts of the world. Numerous factors influence the functioning of the heart, so it is very hard to predict HD both precisely and quickly. Hence, computer-based systems are needed to help doctors diagnose HD quickly. At present, various HD prediction systems based on soft computing techniques are being developed. In particular, different soft computing methods are combined to achieve better results than an individual method. Such a model contains two levels: in the first level, FS methods are employed to choose a subset of features; the selected features are then given as input to the classification methods in the next level. Heart disease datasets have assorted characteristics and comprise relevant as well as irrelevant and redundant features. An irrelevant feature does not influence the description of the target class. A redundant feature adds nothing new but contributes noise to the description of the target class [3]. These features reduce both the classification accuracy and the computational speed, so eliminating them before applying the classifier is essential. To achieve this goal, FS is required in the HD diagnosis system. Several techniques, including machine learning algorithms and biologically inspired algorithms, have been utilized for feature selection. Biologically inspired algorithms such as the Genetic Algorithm (GA) and swarm-based approaches like Particle Swarm Optimization (PSO) have been used successfully.
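To make the distinction between irrelevant and redundant features concrete, the sketch below scores a feature subset with the merit heuristic commonly associated with correlation-based feature selection, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. It is only an illustration under assumptions: Pearson correlation is used as the correlation measure, the toy data are invented, and the function names `pearson` and `cfs_merit` are ours rather than the paper's.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def cfs_merit(features, target):
    """Merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(features)
    # Mean absolute feature-class correlation (relevance)
    r_cf = sum(abs(pearson(f, target)) for f in features) / k
    # Mean absolute feature-feature correlation (redundancy)
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(features[i], features[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Toy data (hypothetical, not taken from the paper)
target = [0, 0, 0, 0, 1, 1, 1, 1]
relevant = [1, 2, 1, 2, 3, 4, 3, 4]    # correlated with the class
redundant = list(relevant)             # exact copy of `relevant`
irrelevant = [1, 2, 2, 1, 1, 2, 2, 1]  # uncorrelated with the class

merit_single = cfs_merit([relevant], target)
merit_with_copy = cfs_merit([relevant, redundant], target)
merit_with_noise = cfs_merit([relevant, irrelevant], target)
```

On this toy data, adding the redundant copy leaves the merit unchanged, while adding the irrelevant feature lowers it, which is why both kinds of feature are candidates for removal.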

This paper proposes a Hybrid Harmony Search (HM-L) algorithm with Levi distribution to predict HD accurately and in a timely manner. The effectiveness of the HM-L algorithm is validated by applying it to a set of datasets. The simulation results indicate that the HM-L algorithm achieves better results than state-of-the-art methods such as HM, BBO, GWO, ALPSO and ABC.

The rest of the paper is structured as follows. State-of-the-art techniques for HD prediction are reviewed in Section 2. The HS algorithm is outlined in Section 3. The proposed HM-L algorithm is discussed in Section 4. The simulation of the HM-L algorithm and its results are presented in Section 5. Section 6 concludes the paper.

2. RELATED WORK

In this section, we discuss state-of-the-art techniques for predicting HD using data mining algorithms, machine learning algorithms and so on.

Latha Parthiban et al. developed a method to predict HD named Coactive Neuro-Fuzzy Inference System (CANFIS) [4]. CANFIS combines the adaptive characteristics of neural networks (NN) with the fuzzy logic method and is then integrated with a GA to identify the presence of HD. CANFIS is a dependable and robust method that identifies nonlinear relationships and mappings among various attributes. Fuzzy logic is useful because it correlates the linguistic nature of the rules (membership functions, MFs) with the features of the NN. The GA is used to tune the CANFIS parameters automatically and to select the feature set optimally. The results imply that CANFIS achieves better results in terms of training and classification accuracy.

Sudha et al. 2012 presented a model to predict stroke disease using classification algorithms [5]. It uses Decision Tree, Naive Bayes (NB) and NN to predict the presence of stroke disease. As medical datasets are massive in size, the Principal Component Analysis algorithm is employed to reduce the dimensionality. Next, the relevant data are grouped to form clusters. Because high-valued attributes may otherwise dominate low-valued attributes, the attribute values are standardized prior to clustering. The reduced subset of attributes is then used as input. The proposed method enables the user to enter the details of a patient (e.g. blood pressure level, sugar level, etc.) and identify the patient's stroke status. The experimental analysis shows that NN achieves better accuracy than the decision tree and NB methods.

Parthiban et al. 2012 proposed a method to estimate the probability of developing HD using attributes from diabetes diagnosis [6]. The method assesses the vulnerability to HD using 500 records gathered from diabetic patients. It finds the probability of a diabetic person developing HD using attributes such as age, sex, blood pressure and blood sugar. Using the obtained results, patients can be warned to alter their habits and lifestyles. The method helps, at low cost, to prevent diabetic persons from being affected by HD. In a comparison of NB and SVM, SVM is found to be more efficient, with better prediction accuracy.

In Anooj 2012, the author formulated a weighted fuzzy rule-based clinical decision support system (CDSS) that predicts HD by automatically acquiring knowledge from patient data [7]. It operates on two levels: (1) automatic creation of weighted fuzzy rules and (2) construction of a fuzzy rule-based CDSS. In the first level, data mining techniques, attribute selection and an attribute weighting method are employed to generate the weighted fuzzy rules. Next, the fuzzy system is built from the obtained rules and the selected attributes. Finally, the weighted fuzzy rules are provided to the fuzzy inference system (FIS), which learns from them and makes predictions. The effectiveness of the proposed fuzzy system is assessed by comparing its results with an NN-based method on the same datasets. The comparison reveals that the fuzzy-based CDSS produces better results than NN in terms of accuracy, sensitivity and specificity.

Long et al. 2015 developed a less complex HD diagnosis system using chaos firefly algorithm and rough sets based attribute reduction (CFARS-AR) together with an interval type-2 fuzzy logic system (IT2FLS). The goal of this study is to devise an efficient diagnosis model that identifies HD precisely with fewer attributes. It uses the chaos firefly algorithm combined with rough sets to decrease the number of attributes. The remaining subset of attributes is given as input to IT2FLS. Two comparisons are made to assess CFARS-AR and IT2FLS. First, CFARS-AR is compared with BPSO rough sets based attribute reduction (BPSORS-AR). Next, IT2FLS is compared with traditional classifiers, namely Naive Bayes (NB), Support Vector Machine (SVM) and Artificial Neural Network (ANN). CFARS-AR discovers a minimal attribute reduction from the high-dimensional dataset, which improves the output of the classification system. The use of fuzzy logic handles the uncertainty and noise present in the dataset. Though the proposed method has several benefits, it has some limitations: CFARS-AR becomes unmanageable in the presence of a large number of attributes, and the training process of IT2FLS is very slow.



Syed Umar Amin et al. 2013 proposed an intelligent HD prediction system using a GA-optimized NN with the help of several risk factors [9]. According to the authors, no existing technique identifies HD using risk factors such as age, heredity, diabetes, stress, cholesterol, tobacco smoking, alcohol intake, obesity and physical inactivity. An HD patient typically has several risk factors, which makes diagnosis easier. The proposed method uses NN and GA, where GA is employed for NN weight initialization. Its advantages are a quicker learning process and greater stability and precision than back propagation. To evaluate the methodology, the risk factors of 50 patients were collected; the experimental analysis shows training and validation accuracies of 96.2% and 89%, respectively. The proposed system helps doctors as well as patients become aware of the probable presence of HD without a hospital visit or medical checkup.

In Deekshatulu, B.L. and Chandra, P., 2013, the authors presented an algorithm that integrates KNN with GA for classification [10]. The proposed method consists of two steps: (1) GA-based attribute evaluation and (2) building the classifier and determining its accuracy. GA performs a global search for an optimal solution in a large search space. Redundant and irrelevant attributes lead to poor classification results, so the GA-based search is used to prune them and to rank the attributes that contribute most to classification. The lowest-ranked attributes are eliminated, and the classification algorithm is built on the remaining attributes. After training, the classifier labels each record as healthy or sick. The effectiveness of the proposed method is evaluated on six medical datasets and one non-medical dataset. The results show that combining GA with KNN achieves better accuracy than other methods. It helps doctors predict heart disease with fewer attributes.

Rajathi et al. 2016 developed a prediction method that calculates the probability of developing HD using KNN combined with Ant Colony Optimization (ACO) [11]. The aim is to identify HD with high accuracy and low error rates. The proposed method operates in two stages. The first stage classifies the test data using the KNN algorithm: KNN selects the training dataset and classifies it. In the second stage, ACO initializes the population and searches for an optimized solution; it is applied to the classified results from KNN to generate the final output. The training dataset holds 1500 instances with 15 attributes. The instances represent the outputs of various tests used to determine the level of HD. The proposed method achieves better results than SVM and KNN; it predicts HD with an accuracy of 70.26% and an error rate of 0.526%.

In Ahmed Fawzi Otoom et al. 2015, the authors developed a simple and precise mobile application for real-time diagnosis and monitoring of HD [12]. Existing healthcare systems concentrate only on the data acquisition and monitoring components, with little attention given to real-time diagnosis. The proposed method builds an intelligent classifier using machine learning, which allows the user to predict HD by entering the patient data. It continuously observes the patient's data in real time and raises an alarm in emergency situations. A diagnostic component included in the application gives accurate and quick results to doctors or patients, identifying whether the user suffers from HD. Three classifiers, namely BN, SVM and FT, are evaluated to select the diagnosis component. The proposed method is found to be efficient, obtaining an accuracy of 88.3%, and greater than 85% in the cross-validation test. In addition, the monitoring algorithm achieves a detection rate of 100%.



Vendansamy et al. 2015 presented an HD prediction system (HDPS) to diagnose HD using the NB algorithm [13]. The algorithm classifies the dataset precisely, leading to proper identification of HD. For the applied dataset, the NB algorithm attains a maximum accuracy of 86.4198% with a short response time.

Shashikant et al. 2012 employed an SVM-based Sequential Minimal Optimization learning algorithm to diagnose HD [14]. Initially, the data are preprocessed and the features are extracted. Then, SVM is applied to classify the data; it performs well on pattern classification problems. The automatic diagnosis system is evaluated on the basis of classification accuracy, sensitivity and specificity. To analyze its performance, SVM is tested on an India-centric dataset with 214 instances and 3 types of HD. The results of SVM are compared with MLP, RBF, BN, J48 and ORule under 5-fold and 10-fold cross-validation. The results imply that SVM produces a better accuracy, 85.05%, than the other methods.

Azhar Hussein Alkeshuosh et al. 2017 used the PSO algorithm to create rules for HD [15]. Initially, random rules are encoded, and PSO is used to optimize the rules to achieve high accuracy. The individuals are encoded using the Michigan method, in which each individual represents a single rule. The Michigan method comprises at least two ways to identify HD. The results imply that rules can be classified precisely using the PSO algorithm, with an average accuracy of 87%, whereas C4.5 attains a lower accuracy of 63%.

3. HARMONY SEARCH

Harmony search is a meta-heuristic optimization algorithm based on a music orchestra. It is inspired by the process of finding the best state of harmony. The process of selecting the harmony is similar to attaining the optimal solution in an optimization process: the search for an optimal solution can be mapped to a jazz musician's improvisation process. Generally, the best harmony is identified by its audio-aesthetic standard, and the musician aims to produce music with great harmony. Likewise, the solution to an optimization problem needs to be the best for the given objectives and constraints. Both processes aim at the best, or optimum, solution. This resemblance can be utilized to design new algorithms for effective optimization.

HS is a good example of converting such a qualitative process into quantitative rules by idealization, using the search for the best musical harmony as an optimization process; the result is called Harmony Search (HS) or the Harmony Search algorithm. In the HS algorithm, every musician (decision variable) plays (creates) a note (a value) to identify the best harmony (global optimum) overall. HS can quickly find the high-performance regions of the solution space, but it performs poorly in local search. To overcome this issue, several enhancements of the HS algorithm have been proposed in the literature to improve precision and convergence rate.

Mahdavi et al. 2007 developed an improved HS algorithm, IHS, which tunes the key parameters dynamically [16]. IHS performs better than the traditional HS algorithm and can be used in several engineering optimization problems. Another variant, the global-best HS (GHS) algorithm developed by Omran et al. 2008 [17], borrows ideas from swarm intelligence. GHS is also found to be more efficient than the HS algorithm.



The procedure involved in the HS algorithm is listed here:

Step 1: Initializing the problem and algorithm parameters
Step 2: Initializing the Harmony Memory (HM)
Step 3: Improvising a new harmony
Step 4: Updating the HM
Step 5: Testing the termination condition

1. Initialization of problem and algorithm parameters

In step 1, the optimization problem is defined as

Minimize $f(x)$ subject to $x_i \in X_i$, $i = 1, 2, \ldots, N$

where $f(x)$ is the objective function, $x$ is the set of decision variables ($x_i$), $N$ is the total number of decision variables and $X_i$ is the set of possible values for each decision variable. The HS algorithm parameters are also defined in this step: Harmony Memory Size (HMS), Harmony Memory Considering Rate (HMCR), Pitch Adjusting Rate (PAR) and Number of Improvisations (NI), i.e. the stopping condition. The Harmony Memory (HM) is the memory location where all the solution vectors (sets of decision variables) are stored. HMCR and PAR are used to improve the solution vector in Step 3.

2. Initialization of harmony memory

In step 2, the randomly created solution vectors are entered in the HM matrix.

3. Improvisation of a new harmony

Generally, three rules are used to generate a new harmony vector $x' = (x'_1, x'_2, \ldots, x'_N)$: memory consideration, pitch adjustment and random selection. In memory consideration, the value of the first decision variable $x'_1$ of the new vector is selected from any of the values stored for that variable in the HM, $(x_1^1, \ldots, x_1^{HMS})$. The values of the other decision variables are chosen in the same way. HMCR lies between 0 and 1: it is the rate of selecting a value from those stored in the HM, while (1 – HMCR) is the rate of randomly selecting a value from the possible range of values.

Each component produced by memory consideration is then examined to decide whether it should be pitch-adjusted. This operation uses the Pitch Adjusting Rate (PAR): a component is pitch-adjusted with probability PAR and left unchanged with probability (1 – PAR).
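The three improvisation rules can be read as a probability split over each decision variable. The short sketch below makes that split explicit, assuming (as in the description above) that pitch adjustment is only considered for components that came from memory consideration. The function names and the example values HMCR = 0.9 and PAR = 0.3 are illustrative choices of ours.

```python
import random

def rule_probabilities(hmcr, par):
    """Probability that each improvisation rule produces a given component."""
    return {
        "memory": hmcr * (1.0 - par),  # taken from HM and left unchanged
        "pitch": hmcr * par,           # taken from HM, then pitch-adjusted
        "random": 1.0 - hmcr,          # drawn at random from the allowed range
    }

def choose_rule(hmcr, par, rng):
    """Sample which rule fires for one component."""
    if rng.random() < hmcr:
        return "pitch" if rng.random() < par else "memory"
    return "random"

# Illustrative parameter values (not taken from the paper)
probs = rule_probabilities(hmcr=0.9, par=0.3)
rng = random.Random(0)
draws = [choose_rule(0.9, 0.3, rng) for _ in range(20000)]
observed_random = draws.count("random") / len(draws)
```

With these values about 63% of components are copied from memory unchanged, 27% are pitch-adjusted and 10% are drawn at random, which shows how strongly HMCR limits exploration.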

4. Updating the HM

When the new harmony vector is better than the worst harmony (Xw) in the HM, as judged by the objective function value that serves as its fitness, the new harmony is added to the HM and the worst harmony is removed from it.

5. Test termination condition

When the termination condition is satisfied, the computation ends. Otherwise, Steps 3 and 4 are repeated.

Computational Procedure

Step 1: Initialize HMS, HMCR, PAR, BW and NI.
Step 2: Initialize the HM and determine the objective function value of every harmony vector.
Step 3: Improvise a new harmony Xnew using memory consideration, pitch adjustment and random selection.
Step 4: Update the HM as Xw = Xnew if fitness(Xnew) > fitness(Xw).
Step 5: When the stopping condition (NI) is reached, return the best harmony vector XB in the HM; otherwise go to Step 3.
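The five steps can be sketched as a compact Python routine. This is a minimal illustration of basic Harmony Search under assumptions, not the authors' MATLAB implementation: it minimizes the objective directly (so "higher fitness" in Step 4 corresponds to a lower objective value), treats BW as the maximum pitch-adjustment distance, clamps values to box bounds, and uses default parameter values chosen by us.

```python
import random

def harmony_search(f, lower, upper, hms=10, hmcr=0.9, par=0.3, bw=0.1, ni=2000, seed=0):
    """Minimise f over box bounds with basic Harmony Search (Steps 1-5)."""
    rng = random.Random(seed)
    n = len(lower)
    # Step 2: fill the harmony memory with random vectors and evaluate them
    hm = [[rng.uniform(lower[i], upper[i]) for i in range(n)] for _ in range(hms)]
    cost = [f(x) for x in hm]
    for _ in range(ni):  # Step 5: stop after NI improvisations
        # Step 3: improvise a new harmony, one decision variable at a time
        new = []
        for i in range(n):
            if rng.random() < hmcr:
                value = hm[rng.randrange(hms)][i]         # memory consideration
                if rng.random() < par:
                    value += bw * rng.uniform(-1.0, 1.0)  # pitch adjustment
            else:
                value = rng.uniform(lower[i], upper[i])   # random selection
            new.append(min(max(value, lower[i]), upper[i]))
        # Step 4: replace the worst harmony if the new one is better
        worst = max(range(hms), key=cost.__getitem__)
        new_cost = f(new)
        if new_cost < cost[worst]:
            hm[worst], cost[worst] = new, new_cost
    best = min(range(hms), key=cost.__getitem__)
    return hm[best], cost[best]

def sphere(x):
    """Simple test objective with its minimum at the origin."""
    return sum(v * v for v in x)

best_x, best_cost = harmony_search(sphere, [-5.0] * 3, [5.0] * 3, seed=1)
```

Because only the worst member of the memory is ever replaced, and only by a better vector, the best cost in the memory can never get worse from one improvisation to the next.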



LIMITATIONS OF HARMONY SEARCH

Harmony Search is good at exploiting a search space, since it works on each dimension of the search space: each variable in the solution is subject to change with respect to its efficiency. In the exploration phase, however, Harmony Search relies on random initialization, which produces only minimal deflection from the current exploitation phase. When exploration is weaker than exploitation, the algorithm tends to become trapped in local optima.

4. HYBRID HARMONY SEARCH ALGORITHM

To address the limitations stated above, this paper proposes a hybrid Harmony Search algorithm that explores the given search space with an efficient mathematical distribution function. The Levy distribution [18] is a mathematical model used for initiating a sudden drift. A Levy flight is a random walk in which the step length of the search process is perturbed by an unpredictable deviation. The Levy distribution can be expressed as follows.

(1)

where the random variable lies within the interval (0, 1] and the stability index controls the shape of the distribution: the model reduces to the Gaussian and Cauchy distributions when the stability index is set to 2 and 1, respectively. For implementation in a search space, the model has been refined [19] to

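$$ s = \frac{u\,\sigma}{|v|^{1/\beta}} \qquad (2) $$

where $s$ is the step length, $u$ and $v$ are normal distribution values, $\beta$ is the Levy exponent and $\sigma$ is defined as

$$ \sigma = \left[ \frac{\Gamma(1+\beta)\,\sin\left(\frac{\pi\beta}{2}\right)}{\Gamma\left(\frac{1+\beta}{2}\right)\,\beta\,2^{(\beta-1)/2}} \right]^{1/\beta} \qquad (3) $$

where $\beta$ can be fixed at 1.5 [19], and $u$ and $v$ are random values drawn from a normal distribution with a mean of 0 and a standard deviation of 1. The Hybrid Harmony Search algorithm with the Levy distribution is given in Algorithm 1. In Hybrid Harmony Search, after the computational variables and the population are initialized, a random individual is generated in the improvisation phase of each individual as follows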

{

(4)

where represents a random variable, are two random individuals. When

the random initialization of two individuals will follow the computation as mentioned in Eq.

(7). The opposition concept is also incorporated in the proposed Hybrid Harmony Search. When a feature of an individual has been modified neither through harmony memory consideration nor through pitch adjustment, the newly generated Levy-imposed solution contributes that unmodified feature of the current individual.
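The Levy step of Eqs. (2) and (3) can be sketched in Python as follows. This is a minimal sketch assuming the widely used form s = σ·u / |v|^(1/β) with u and v drawn from a standard normal distribution (a construction often credited to Mantegna); the function names `levy_sigma` and `levy_step` are ours, and β = 1.5 follows the value stated in the text.

```python
import math
import random

def levy_sigma(beta=1.5):
    """Scale factor sigma of Eq. (3) for a given Levy exponent beta."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)

def levy_step(beta=1.5, rng=random):
    """One Levy-distributed step length, s = sigma * u / |v|^(1/beta)."""
    u = rng.gauss(0.0, 1.0)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero (practically never happens)
        v = rng.gauss(0.0, 1.0)
    return levy_sigma(beta) * u / abs(v) ** (1 / beta)

# Draw a few steps: most are small, but occasional very large jumps occur
rng = random.Random(1)
steps = [levy_step(1.5, rng) for _ in range(1000)]
largest = max(abs(s) for s in steps)
```

For β = 1.5 the scale factor evaluates to roughly 0.6966. The occasional large values in `steps` are the "sudden drift" that lets the search leave a local optimum.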



Algorithm 1: Levy based HS
Input: Features (S), objective function
Begin
  Population Initialization ( ), Maximum Iterations ( )
  Initialize Harmony Memory (HM) using randomization
  Initialize and
  while ( ) do
    {
    | |
    if ( )
      Opt from HM
    elseif ( )
      Modify using Eq. (1)
    else
    endif
    end for
    if (
    endif
  end while
End
Output:
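Since several symbols of Algorithm 1 are not legible in this copy, the following Python sketch illustrates only the idea described in the text: components that are not produced through harmony memory or pitch adjustment are taken from a Levy-imposed individual instead of a plain random value. Everything specific here is an assumption of ours and not the paper's formula: the Levy-imposed individual is built as current + step × (r1 − r2) from two random memory members as a stand-in for Eq. (4), pitch adjustment is nested inside memory consideration, values are clamped to box bounds, and the names `levy_step` and `hybrid_improvise` are hypothetical.

```python
import math
import random

def levy_step(beta, rng):
    """One Levy-distributed step, s = sigma * u / |v|^(1/beta), with u, v ~ N(0, 1)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u, v = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    while v == 0.0:
        v = rng.gauss(0.0, 1.0)
    return sigma * u / abs(v) ** (1 / beta)

def hybrid_improvise(hm, current, lower, upper, hmcr, par, bw, beta=1.5, rng=random):
    """Improvise one harmony; unmodified components come from a Levy-imposed individual."""
    # ASSUMPTION: stand-in for Eq. (4), which is not recoverable from the source
    r1, r2 = rng.sample(hm, 2)
    levy_ind = [c + levy_step(beta, rng) * (a - b) for c, a, b in zip(current, r1, r2)]
    new = []
    for i in range(len(current)):
        if rng.random() < hmcr:
            value = rng.choice(hm)[i]                 # memory consideration
            if rng.random() < par:
                value += bw * rng.uniform(-1.0, 1.0)  # pitch adjustment
        else:
            value = levy_ind[i]                       # Levy-imposed contribution
        new.append(min(max(value, lower[i]), upper[i]))
    return new

# Tiny illustrative memory (hypothetical values)
memory = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
candidate = hybrid_improvise(memory, memory[0], [0.0, 0.0], [5.0, 5.0],
                             hmcr=0.9, par=0.3, bw=0.1, rng=random.Random(3))
```

Replacing the uniform random draw with a heavy-tailed Levy move is one way to strengthen exploration without changing the memory consideration and pitch adjustment rules.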

5. PERFORMANCE EVALUATION

Dataset description

Benchmark datasets from the UCI repository are used in this study. The heart disease dataset consists of 267 instances and 22 features. For training, 80 instances are used, and the remaining 187 instances are used for testing.

Parameter setting

The proposed method has been implemented in MATLAB 9.0 on a system with a 6th-generation Intel Core i7 processor running at 3.2 GHz and 4 GB of RAM. To tune the performance of the proposed method, the ANN has been hand-coded instead of being taken from a toolbox. The simulation parameters of the proposed algorithm are tabulated in Table 1.



Table 1 Parameter setting

Parameters Values

Population size 100

Maximum iterations 500

0.1~0.5

0.7~0.95

[-1, 1]

0.1

Table 2 tabulates the performance metrics obtained before feature selection. The results are compared with existing algorithms, namely HM, BBO, GWO, ALPSO and ABC. Figure 1 compares the Type 1 error rate of Heart Disease dataset classification before and after feature selection. From Figure 1, it is evident that the proposed HM-L achieves a lower error rate than the existing algorithms in all cases. Moreover, the Type 1 error rate of HM-L after feature selection (3.33) is close to the result obtained before feature selection (4.07) using the same methodology.

Table 2 Result of Simulation w.r.t. stated performance metrics for Heart Disease Dataset before Feature Selection

Algorithm  Type 1 Error Rate  Type 2 Error Rate  Sensitivity  Specificity  Accuracy  Error Rate  F-Score  Kappa
HM-L       4.07               4.81               89.34        92.56        91.11     8.99        90.08    82.03
HM         4.13               7.89               83.33        92.14        87.96     12.03       86.77    75.78
BBO        5.18               7.03               84.8         90.34        87.77     12.22       86.53    75.35
GWO        6.29               9.62               79.84        87.94        84.07     15.93       82.73    67.99
ALPSO      7.037              11.85              75.93        86.13        81.11     18.89       79.84    62.16
ABC        8.88               14.44              71.11        82.22        76.66     23.33       75.29    53.33

Table 3 Result of Simulation w.r.t. stated performance metrics for Heart Disease Dataset after Feature Selection

Algorithm  Type 1 Error Rate  Type 2 Error Rate  Sensitivity  Specificity  Accuracy  Error Rate  F-Score  Kappa
HM-L       3.33               4.07               90.98        93.91        92.59     7.41        91.73    85.02
HM         4.44               6.66               85.71        91.66        88.88     11.11       87.80    77.61
BBO        4.07               5.92               87.2         92.41        90        10          88.97    79.83
GWO        5.18               8.51               82.17        90.07        86.29     13.7        85.14    72.46
ALPSO      5.92               10.74              78.19        88.32        83.33     16.67       82.21    66.61
ABC        7.77               13.33              73.33        84.44        78.88     21.11       77.64    57.78
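The paper does not spell out how the tabulated metrics are computed, so the sketch below shows one plausible reading using standard confusion-matrix definitions. Two points are assumptions of ours: the Type 1 and Type 2 error rates are taken as false positives and false negatives divided by the total number of instances (inferred from the fact that the two columns add up to the error-rate column), and the confusion-matrix counts are hypothetical values back-derived so that they reproduce the HM-L row before feature selection; the paper itself lists no confusion matrices.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return the eight tabulated metrics, in percent, from confusion-matrix counts."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    # Agreement expected by chance, needed for Cohen's kappa
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e)
    values = {
        "type1_error": fp / n,  # assumed: false positives over all instances
        "type2_error": fn / n,  # assumed: false negatives over all instances
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,
        "f_score": f_score,
        "kappa": kappa,
    }
    return {name: 100 * value for name, value in values.items()}

# Hypothetical counts chosen by us to match the HM-L row before feature selection
m = classification_metrics(tp=109, fn=13, fp=11, tn=137)
```

With these counts the function reproduces the tabulated Type 1 and Type 2 error rates, sensitivity, specificity, accuracy, F-Score and Kappa of that row to within rounding. The error rate comes out as 8.89 (100 minus the accuracy), against the 8.99 printed in the table.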



Figure 1 Comparison of Type 1 Error Rate before Feature Selection Vs after Feature Selection

Figure 2 Comparison of Type 2 Error Rate before Feature Selection Vs after Feature Selection

Figure 2 compares the Type 2 error rate of Heart Disease dataset classification before and after feature selection. From Figure 2, it is evident that the proposed HM-L achieves a lower error rate than the existing algorithms in all cases. Moreover, the Type 2 error rate of HM-L after feature selection (4.07) is close to the result obtained before feature selection (4.81) using the same methodology.

Figure 3 Comparison on Sensitivity - Before Feature Selection Vs After Feature Selection




Figure 3 compares the sensitivity of Heart Disease dataset classification before and after feature selection. From Figure 3, it is evident that the proposed HM-L achieves better sensitivity than the existing algorithms in all cases. Moreover, the sensitivity of HM-L after feature selection (90.98) is close to the result obtained before feature selection (89.34) using the same methodology. Figure 4 compares the specificity of Heart Disease dataset classification before and after feature selection. From Figure 4, it is evident that the proposed HM-L achieves better specificity than the existing algorithms in all cases. Moreover, the specificity of HM-L after feature selection (93.91) is close to the result obtained before feature selection (92.56) using the same methodology.

Figure 4 Comparison on Specificity - Before Feature Selection Vs After Feature Selection

Figure 5 Comparison on Accuracy - Before Feature Selection Vs After Feature Selection

Figure 5 compares the accuracy of Heart Disease dataset classification before and after feature selection. From Figure 5, it is evident that the proposed HM-L achieves better accuracy than the existing algorithms in all cases. Moreover, the accuracy of HM-L after feature selection (92.59) is close to the result obtained before feature selection (91.11) using the same methodology.





Figure 6 Comparison on Error Rate - Before Feature Selection Vs After Feature Selection

Figure 6 compares the error rate of Heart Disease dataset classification before and after feature selection. From Figure 6, it is evident that the proposed HM-L achieves a lower error rate than the existing algorithms in all cases. Moreover, the error rate of HM-L after feature selection (7.41) is close to the result obtained before feature selection (8.99) using the same methodology.


Figure 7 Comparison on F-Score - Before Feature Selection Vs After Feature Selection

Figure 7 compares the F-Score of Heart Disease dataset classification before and after feature selection. From Figure 7, it is evident that the proposed HM-L achieves a better F-Score than the existing algorithms in all cases. Moreover, the F-Score of HM-L after feature selection (91.73) is close to the result obtained before feature selection (90.08) using the same methodology.




Figure 8 Comparison on Kappa - Before Feature Selection Vs After Feature Selection

Figure 8 compares the Kappa of Heart Disease dataset classification before and after feature selection. From Figure 8, it is evident that the proposed HM-L achieves a better Kappa than the existing algorithms in all cases. Moreover, the Kappa of HM-L after feature selection (85.02) is close to the result obtained before feature selection (82.03) using the same methodology.

6. CONCLUSION

This paper proposes a Hybrid Harmony Search (HM-L) algorithm with Levi distribution to predict HD accurately and in a timely manner. In this research work, Correlation-based Feature Selection is used as the feature selection technique. The effectiveness of the HM-L algorithm is validated by applying it to a set of datasets. The results obtained without and with feature selection are compared with one another. The simulation results indicate that the HM-L algorithm achieves better results than existing methods such as Harmony Search, Biogeography-Based Optimization, Grey Wolf Optimization, AL Particle Swarm Optimization and Artificial Bee Colony. From the comparison results, it is clear that efficiency is increased by the use of feature selection.

[Figure 8 chart: y-axis Kappa (ticks 0 to 90); x-axis Algorithms; legend BF, AF. Bar labels in extraction order: 85.02, 77.61, 66.61, 72.46, 79.83, 57.78, 82.03, 75.78, 62.16, 67.99, 75.35, 53.33]
