Advances in Fuzzy Mathematics.
ISSN 0973-533X Volume 12, Number 3 (2017), pp. 333-345
© Research India Publications
http://www.ripublication.com
Detection of Heart Disease Severity using A Novel
Multilayer Perceptron Model: Validation through
Major Datasets
N Satyanandam
Associate Professor, Dept. of CSE, Bhoj Reddy Engineering College for Women,
Hyderabad, Telangana, India
Dr. Ch Satyanarayana
Professor, Dept. of CSE, JNTUK University College of Engineering,
Kakinada, Andhra Pradesh, India.
Abstract
Recent advances in medical science have motivated researchers to develop
predictive methods for disease analysis so that disease can be prevented.
Machine learning and data mining have been among the most actively studied
fields for many years. However, the complexity of neural networks, especially
the multilayer perceptron (MLP), is still under study, and the full
capabilities of the MLP are yet to be explored. Hence this work deploys a
multilayer perceptron model for predictive detection of heart disease severity
based on various parameters. The deployed multilayer perceptron uses
back-propagation for supervised learning. The work also deploys a novel
principal attribute analysis to understand the orientation of the attributes
affecting the results. The final outcome of this effort is an analysis of heart
disease severity based on the proposed multilayer perceptron model.
Keywords: Disease Detection, MLP, Random Forest, Random Tree,
Improved MLP
I. INTRODUCTION
Heart (cardiovascular) diseases have a huge impact on death rates [1] worldwide,
especially in developing countries. Celtia et al. reported in 2000 that
cardiovascular diseases cause 25% of deaths. The World Bank country-group figures
for 2001 likewise put the death rate from heart disease at around 25%. However,
Mathers et al. in 2004 estimated the death rate at 46%, a notable increase in the
span of 4 years. It is predicted that in 2020 approximately 2.5 million people in
India are likely to be severely affected by heart diseases. In spite of the best
clinical practices and available medications, the death rates are increasing and are
expected to reach 55% in India by the end of 2020. The focus of this work is to
demonstrate a novel multilayer perceptron model to detect heart disease severity [2].
To that end, this work analyses recent research outcomes from parallel works [8-32].
Present research trends point towards a more specific and focused study of
predictive models for determining the severity of heart diseases based on
clinical results and the best available computing techniques [3]. Huyan Wang et al.
proposed a computing model for diagnosis in traditional Chinese medical practice
based on a Bayesian model. Work carried out from the perspective of generic
programming to produce expert systems is also notable for the prediction and
diagnosis of heart diseases [4]. Assanelli et al. in 1993 demonstrated the use of
ECG data to predict heart diseases. Meanwhile, Ng, G. and Ong, K. developed a chest
pain expert system that diagnoses the cause of chest pain leading to cardiac
attacks.
Text classification techniques combined with a Naive Bayes classifier and relational
learning algorithms were the methods [5] used by Craven in 1999. Hidden Markov
Models were used in Craven's 2001 work, but, as in the work of Rosario and Hearst
in 2004, the research focus was entity recognition. A context-based approach using
MeSH term co-occurrences was used by Srinivasan and Rindflesch for relationship
discrimination between diseases and drugs [6]. Much work has focused on building
rules for relation extraction. Feldman et al. demonstrated in 2002 a rule-based
system to extract relations involving genes, proteins, drugs, and diseases.
Friedman et al. went deeper into building a rule-based system by hand-crafting a
semantic grammar and a set of semantic constraints in order to recognize a range of
biological and molecular relations [7]. This work can therefore be viewed as
presenting potential findings and guidelines for the performance of a framework
capable of finding relevant information about diseases and treatments in a medical
domain repository. The results obtained show that it is realistic to use NLP and ML
techniques [31] to build a tool capable of identifying and disseminating textual
information related to diseases and treatments [32].
II. PROPOSED MULTILAYER PERCEPTRON MODEL
The proposed multilayer perceptron model [27] is designed with the sole purpose of
reducing the misclassifications recorded in the confusion matrix and increasing the
accuracy of clustering diseases by severity. The proposed multilayer perceptron
model is shown in [Figure – 1].
Figure 1: Proposed 3 + 2 Layer Multilayer Perceptron Model
The proposed MLP is arranged as follows: the input layer processes the inputs
during training, the hidden layers carry out the weight adjustment, and the five
output nodes cluster the results into five distinct categories. The details of the
MLP are listed in [Table – 1].
Table 1: MLP Characteristics
Attribute of MLP | Detail Description
Back Propagation Learning Rule, number of hidden layers | 2 to 4 layers
Random Number Seed | 0
Learning Rate | 1
Learning Rate Function | Static learning rate
Constant Bias Input | 1.0
Training Iterations | 500
Training Mode | Batch training: weight changes are applied at the end of each epoch
Transfer Function | Sigmoid (logistic), S-shaped function between 0 and +1
Momentum | 0.2
Weight Decay | 0.1
Bias Input Value | 1.0
Inputs | 36
Output Layer | 5
Total Neurons | 5
Total Nodes | 226
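As an illustration of how the settings in Table 1 fit together, the following minimal sketch builds a sigmoid MLP with 36 inputs, a constant bias input of 1.0 and five output nodes, and runs one forward pass. The hidden-layer widths and the initial weight range are assumptions, since the paper does not state them, and the function names are hypothetical.

```python
import math
import random

N_INPUTS = 36            # Table 1
N_OUTPUTS = 5            # Table 1
BIAS_INPUT = 1.0         # Table 1
SEED = 0                 # Table 1
HIDDEN_SIZES = (20, 20)  # assumption: hidden-layer widths are not given in the paper


def sigmoid(x):
    """Logistic transfer function; its output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(layer_sizes, seed=SEED):
    """weights[layer][node] holds one weight per incoming value plus one for the bias."""
    rng = random.Random(seed)
    # Assumption: small uniform initial weights in [-0.5, 0.5].
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(inputs, weights):
    """Propagate one input vector through every layer and return the output activations."""
    activation = list(inputs)
    for layer in weights:
        extended = activation + [BIAS_INPUT]
        activation = [sigmoid(sum(w * a for w, a in zip(node, extended))) for node in layer]
    return activation


layer_sizes = (N_INPUTS, *HIDDEN_SIZES, N_OUTPUTS)
weights = init_weights(layer_sizes)
outputs = forward([0.0] * N_INPUTS, weights)  # dummy all-zero patient record
```

Each of the five values in `outputs` corresponds to one output node; training would adjust `weights` so that the node for the correct severity level produces the largest value.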
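The precise model of the proposed MLP is set out here.
The first layer, or input layer, of the MLP is neutralized:
[Equation (1): relation between $a_i$ and the inputs $In_i$ involving $\sum_{i=0}^{n}$; the remaining symbols are not recoverable from the source]
The second layer, or first hidden layer:
[Equation (2): node $k$ expressed through $a_i$; the remaining symbols are not recoverable from the source]
In the second layer, or first hidden layer, the weight is calculated as follows:
[Equation (3): weight update relating step $(t+1)$ to step $(t)$; the remaining symbols are not recoverable from the source]
The third layer, or second hidden layer:
[Equation (4): node $k$ expressed through $\sum_{i=0}^{n}$ of terms indexed by $i$; the remaining symbols are not recoverable from the source]
In the third layer, or second hidden layer, the weight is calculated as follows:
[Equation (5): weight update relating step $(t+1)$ to step $(t)$ through $\sum_{i=0}^{n}$; the remaining symbols are not recoverable from the source]
The fourth layer, or third hidden layer:
[Equation (6): node $k$ expressed through $\sum_{i=0}^{n}$ of terms indexed by $i$; the remaining symbols are not recoverable from the source]
In the fourth layer, or third hidden layer, the weight is calculated as follows:
[Equation (7): weight update relating step $(t+1)$ to step $(t)$ through $\sum_{i=0}^{n}$; the remaining symbols are not recoverable from the source]
The results are discussed in the next section of the work.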
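For orientation, the sketch below shows a textbook batch weight update with momentum and weight decay, using the learning rate, momentum and weight decay values from Table 1. It is a generic formulation assumed for illustration and is not claimed to be identical to Equations (3), (5) and (7); the function name and the way weight decay enters the step are assumptions.

```python
LEARNING_RATE = 1.0  # Table 1
MOMENTUM = 0.2       # Table 1
WEIGHT_DECAY = 0.1   # Table 1


def update_weight(weight, gradient, previous_step):
    """Return (new_weight, step) for one batch update of a single weight.

    gradient is the error gradient for this weight accumulated over one epoch;
    previous_step is the change applied to the weight in the previous epoch.
    """
    # Assumption: weight decay is added to the gradient as an L2 penalty term.
    step = -LEARNING_RATE * (gradient + WEIGHT_DECAY * weight) + MOMENTUM * previous_step
    return weight + step, step


# Two consecutive epochs for one weight, with made-up gradients.
w1, step1 = update_weight(0.5, 0.2, 0.0)
w2, step2 = update_weight(w1, 0.1, step1)
```

In batch mode, as specified in Table 1, this update would be applied once per epoch, after the gradients of all training instances have been accumulated.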
III. RESULTS AND DISCUSSION
The objective of this work is to accurately identify and cluster the dataset
[28] into multiple levels of disease severity [25,30]. Thus, the severity
categories are identified first [Table – 2].
Table 2: Clusters Information
Disease Category | Severity | Predicted Diagnosis | Cluster Name
No Disease (A) | None | None | 0
Disease – 1 (B) | First Level | 1 Major Blood Vessel Blocked | 1
Disease – 2 (C) | Second Level | 2 Major Blood Vessels Blocked | 2
Disease – 3 (D) | Third Level | 3 Major Blood Vessels Blocked | 3
Disease – 4 (E) | Fourth Level | 4 Major Blood Vessels Blocked | 4
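The cluster definitions of Table 2 can be held in a small lookup, as in the sketch below. Decoding the five output nodes by taking the node with the largest activation is an assumption made for illustration; the paper does not state the decoding rule, and the names `CLUSTERS` and `describe` are hypothetical.

```python
# Table 2 as a lookup: cluster name -> (category, severity, predicted diagnosis).
CLUSTERS = {
    0: ("No Disease (A)", "None", "None"),
    1: ("Disease - 1 (B)", "First Level", "1 Major Blood Vessel Blocked"),
    2: ("Disease - 2 (C)", "Second Level", "2 Major Blood Vessels Blocked"),
    3: ("Disease - 3 (D)", "Third Level", "3 Major Blood Vessels Blocked"),
    4: ("Disease - 4 (E)", "Fourth Level", "4 Major Blood Vessels Blocked"),
}


def describe(node_outputs):
    """Pick the output node with the largest activation and describe its cluster."""
    cluster = max(range(len(node_outputs)), key=lambda i: node_outputs[i])
    category, severity, diagnosis = CLUSTERS[cluster]
    return cluster, category, severity, diagnosis


example = describe([0.1, 0.2, 0.9, 0.3, 0.1])
```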
The predictive model analysis is then carried out. This paper compares the
clustering performance of Random Tree and Random Forest with that of the
proposed MLP in order to quantify the improvement in performance. Firstly,
the comparative study is carried out on the Cleveland dataset [Table – 3].
Table 3: Performance Analysis on Cleveland Dataset
Analysis Type | Correctly Classified Instances (%) | Incorrectly Classified Instances (%) | Kappa statistic | Mean absolute error | Root mean squared error | Relative absolute error (%) | Root relative squared error (%) | Confusion Matrix
Random Tree | 38.0952 | 61.9048 | 0.1381 | 0.2476 | 0.4976 | 85.3761 | 130.0106 | Matrix - 1
Random Forest | 47.619 | 52.381 | 0.253 | 0.2514 | 0.3632 | 86.6896 | 94.8893 | Matrix - 2
Proposed MLP | 61.9048 | 38.0952 | 0.4628 | 0.1534 | 0.3458 | 52.8948 | 90.3359 | Matrix - 3
Improvement (%) = abs(Proposed – Min(Existing1, Existing2)) / Min(Existing1, Existing2) * 100 | 62.50 | 27.27 | 235.12 | 38.05 | 4.79 | 38.04 | 4.80 | -
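The improvement row of Table 3 can be reproduced directly from the three result rows. The sketch below applies the stated rule, the absolute difference between the proposed value and the smaller of the two existing values, divided by that smaller value and multiplied by 100; the function and variable names are hypothetical.

```python
def improvement(proposed, existing1, existing2):
    """abs(proposed - min(existing)) / min(existing) * 100, as defined in Table 3."""
    baseline = min(existing1, existing2)
    return abs(proposed - baseline) / baseline * 100.0


# Cleveland results from Table 3: (Random Tree, Random Forest, Proposed MLP).
cleveland = {
    "Correctly classified (%)": (38.0952, 47.619, 61.9048),
    "Incorrectly classified (%)": (61.9048, 52.381, 38.0952),
    "Kappa statistic": (0.1381, 0.253, 0.4628),
    "Mean absolute error": (0.2476, 0.2514, 0.1534),
    "Root mean squared error": (0.4976, 0.3632, 0.3458),
    "Relative absolute error (%)": (85.3761, 86.6896, 52.8948),
    "Root relative squared error (%)": (130.0106, 94.8893, 90.3359),
}

improvements = {
    metric: round(improvement(mlp, tree, forest), 2)
    for metric, (tree, forest, mlp) in cleveland.items()
}
```

Because the rule takes an absolute value, it reports the size of the change but not its direction; whether a change is a gain has to be read from the result rows themselves.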
Confusion Matrix – 1
A B C D E
0 1 1 0 1
1 5 6 1 1
0 4 5 3 0
0 2 4 6 0
0 0 0 1 0
Confusion Matrix – 2
A B C D E
1 2 0 0 0
1 11 0 2 0
1 8 2 1 0
0 4 2 6 0
0 0 0 1 0
Confusion Matrix – 3
A B C D E
1 2 0 0 0
1 13 0 0 0
0 4 3 5 0
0 0 4 8 0
0 0 0 0 1
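The accuracy and kappa statistic reported for the proposed MLP on the Cleveland dataset can be recomputed from Confusion Matrix – 3. The sketch below uses the standard definition of Cohen's kappa; both quantities are unchanged if rows and columns are swapped, so no assumption about which axis holds the actual classes is needed. The function name is hypothetical.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy in percent, Cohen's kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return 100.0 * observed, kappa


# Confusion Matrix - 3 (proposed MLP, Cleveland dataset), classes A to E.
matrix_3 = [
    [1, 2, 0, 0, 0],
    [1, 13, 0, 0, 0],
    [0, 4, 3, 5, 0],
    [0, 0, 4, 8, 0],
    [0, 0, 0, 0, 1],
]

accuracy, kappa = accuracy_and_kappa(matrix_3)
```

With 26 of the 42 instances on the diagonal, this gives an accuracy of about 61.90% and a kappa of about 0.4628, in agreement with the Proposed MLP row of Table 3.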
The proposed MLP improves on both Random Tree and Random Forest for every
reported metric. Secondly, the comparative study is carried out on the Hungarian
dataset [Table – 4].
Table 4: Performance Analysis on Hungarian Dataset
Analysis Type | Correctly Classified Instances (%) | Incorrectly Classified Instances (%) | Kappa statistic | Mean absolute error | Root mean squared error | Relative absolute error (%) | Root relative squared error (%) | Confusion Matrix
Random Tree | 65 | 35 | 0.3923 | 0.14 | 0.3742 | 61.3873 | 109.2505 | Matrix - 1
Random Forest | 71 | 29 | 0.4703 | 0.1332 | 0.2571 | 58.4056 | 75.0576 | Matrix - 2
Proposed MLP | 78 | 22 | 0.6042 | 0.083 | 0.2437 | 36.4086 | 71.1702 | Matrix - 3
Improvement (%) = abs(Proposed – Min(Existing1, Existing2)) / Min(Existing1, Existing2) * 100 | 20.00 | 24.14 | 54.01 | 37.69 | 5.21 | 37.66 | 5.18 | -
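The paper does not define the four error measures in the tables. The sketch below shows one common set of definitions, assumed here for illustration: errors are measured between the predicted class probabilities and 0/1 class indicators, and the two relative measures divide by the error of a baseline that always predicts fixed class priors. The function name and the toy data are hypothetical.

```python
import math


def error_metrics(probabilities, targets, priors):
    """Compute MAE, RMSE, relative absolute error and root relative squared error.

    probabilities: one list of class probabilities per instance
    targets: index of the true class for each instance
    priors: class probabilities used by the baseline predictor (an assumption)
    """
    abs_err = sq_err = base_abs = base_sq = 0.0
    count = 0
    for probs, target in zip(probabilities, targets):
        for cls, p in enumerate(probs):
            actual = 1.0 if cls == target else 0.0
            abs_err += abs(p - actual)
            sq_err += (p - actual) ** 2
            base_abs += abs(priors[cls] - actual)
            base_sq += (priors[cls] - actual) ** 2
            count += 1
    return {
        "mae": abs_err / count,
        "rmse": math.sqrt(sq_err / count),
        "rae_percent": 100.0 * abs_err / base_abs,
        "rrse_percent": 100.0 * math.sqrt(sq_err / base_sq),
    }


# Toy two-class example: one confident correct prediction, one undecided prediction.
metrics = error_metrics(
    probabilities=[[1.0, 0.0], [0.5, 0.5]],
    targets=[0, 1],
    priors=[0.5, 0.5],
)
```

Under these definitions a relative error above 100% means the classifier does worse than the prior-only baseline, which is one way to read the Random Tree value of 109.2505% in Table 4.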
Confusion Matrix – 1
A B C D E
55 6 1 0 0
4 3 2 0 0
2 2 2 2 0
0 4 5 2 1
2 0 3 1 3
Confusion Matrix – 2
A B C D E
61 1 0 0 0
3 4 2 0 0
1 3 2 2 0
3 1 1 4 0
2 0 3 1 3
Confusion Matrix – 3
A B C D E
59 3 0 0 0
4 4 1 0 0
0 3 3 2 0
0 1 1 10 0
3 0 0 4 2
The proposed MLP again improves on both Random Tree and Random Forest for every
reported metric. Thirdly, the comparative study is carried out on the Switzerland
dataset [Table – 5].
Table 5: Performance Analysis on Switzerland Dataset
Analysis Type | Correctly Classified Instances (%) | Incorrectly Classified Instances (%) | Kappa statistic | Mean absolute error | Root mean squared error | Relative absolute error (%) | Root relative squared error (%) | Confusion Matrix
Random Tree | 38.0952 | 61.9048 | 0.1381 | 0.2476 | 0.4976 | 85.3761 | 130.0106 | Matrix - 1
Random Forest | 47.619 | 52.381 | 0.253 | 0.2514 | 0.3632 | 86.6896 | 94.8893 | Matrix - 2
Proposed MLP | 61.9048 | 38.0952 | 0.4628 | 0.1534 | 0.3458 | 52.8948 | 90.3359 | Matrix - 3
Improvement (%) = abs(Proposed – Min(Existing1, Existing2)) / Min(Existing1, Existing2) * 100 | 62.50 | 27.27 | 235.12 | 38.05 | 4.79 | 38.04 | 4.80 | -
Confusion Matrix – 1
A B C D E
0 1 1 0 1
1 5 6 1 1
0 4 5 3 0
0 2 4 6 0
0 0 0 1 0
Confusion Matrix – 2
A B C D E
1 2 0 0 0
1 11 0 2 0
1 8 2 1 0
0 4 2 6 0
0 0 0 1 0
Confusion Matrix – 3
A B C D E
1 2 0 0 0
1 13 0 0 0
0 4 3 5 0
0 0 4 8 0
0 0 0 0 1
The proposed MLP again improves on both Random Tree and Random Forest for every
reported metric. Fourthly, the comparative study is carried out on the V.A.
dataset [Table – 6].
Table 6: Performance Analysis on V.A. Dataset
Analysis Type | Correctly Classified Instances (%) | Incorrectly Classified Instances (%) | Kappa statistic | Mean absolute error | Root mean squared error | Relative absolute error (%) | Root relative squared error (%) | Confusion Matrix
Random Tree | 47.0588 | 52.9412 | 0.2984 | 0.2118 | 0.4602 | 68.5953 | 117.3321 | Matrix - 1
Random Forest | 39.7059 | 60.2941 | 0.1931 | 0.2576 | 0.3688 | 83.4576 | 94.0285 | Matrix - 2
Proposed MLP | 41.1765 | 58.8235 | 0.2102 | 0.2287 | 0.4338 | 74.0927 | 110.6122 | Matrix - 3
Improvement (%) = abs(Proposed – Min(Existing1, Existing2)) / Min(Existing1, Existing2) * 100 | 3.70 | 11.11 | 8.86 | 7.98 | 17.62 | 8.01 | 17.64 | -
Confusion Matrix – 1
A B C D E
12 4 1 0 1
3 16 5 1 3
0 3 4 5 0
0 2 4 3 1
1 0 2 0 0
Confusion Matrix – 2
A B C D E
11 7 0 0 0
12 6 2 5 0
1 4 5 2 0
0 1 5 4 0
1 0 1 0 1
Confusion Matrix – 3
A B C D E
11 6 0 0 1
11 8 4 2 0
2 4 3 2 1
0 0 4 6 0
0 2 0 1 0
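On the V.A. dataset the proposed MLP is more accurate than Random Forest
(41.1765% against 39.7059%) but less accurate than Random Tree (47.0588%). The
overall improvement across the datasets is summarized in [Table – 7].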
Table 7: Improvement (%) over All Datasets
Dataset | Correctly Classified Instances (%) | Incorrectly Classified Instances (%) | Kappa statistic | Mean absolute error | Root mean squared error | Relative absolute error (%) | Root relative squared error (%)
Cleveland Dataset | 62.50 | 27.27 | 235.12 | 38.05 | 4.79 | 38.04 | 4.80
Hungarian Dataset | 20.00 | 24.14 | 54.01 | 37.69 | 5.21 | 37.66 | 5.18
Switzerland Dataset | 62.50 | 27.27 | 235.12 | 38.05 | 4.79 | 38.04 | 4.80
V.A. Dataset | 3.70 | 11.11 | 8.86 | 7.98 | 17.62 | 8.01 | 17.64
The improvement is visualized graphically in [Figure – 2].
Figure 2: Improvement over All Datasets
IV. CONCLUSION
This work analyses current progress [29,30,31] in the study of cardiovascular
diseases. Current advances leave two clear requirements: a technique to identify
the most appropriate set of parameters to be processed during predictive analysis,
and a method for finding the optimal neural network organization for predictive
analysis. The first part of the work presents genetic algorithm based search
techniques to find the optimal set of attributes for better and timely prediction
by the clustering methods. The construction of the most suitable attribute set is
automated for any given dataset. The work also results in an MLP-based 5-layered
algorithm for correct and accurate clustering of the data. The work demonstrates
zero overlapping of the data during clustering.
REFERENCES
[1] Asha Gowda Karegowda and M.A. Jayaram. 2009. Cascading GA & CFS for Feature
Subset Selection in Medical Data Mining. IEEE International Advance Computing
Conference (IACC'09), March 6-7, 2009, Thapar University, Patiala, Punjab, India.
[2] D. Goldberg. 1989. Genetic Algorithms in Search, Optimization, and Machine
Learning. Addison Wesley.
[3] I. H. Witten, E. Frank. 2005. Data Mining: Practical machine learning tools
and techniques. 2nd Edition, Morgan Kaufmann, San Francisco.
[4] J. Han and M. Kamber. 2001. Data Mining: Concepts and Techniques. Morgan
Kaufmann Publishers, San Francisco.
[5] Jennifer G. Dy. 2004. Feature Selection for Unsupervised Learning. Journal of
Machine Learning Research, pp. 845-889.
[6] M.A. Jayaram and Asha Gowda Karegowda. 2007. Integrating Decision Tree and
ANN for Categorization of Diabetics Data. International Conference on
Computer Aided Engineering, December 13-15, 2007, IIT Madras, Chennai,
India.
[7] Mark A. Hall. Correlation-based Feature Selection for Machine Learning.
Dept. of Computer Science, University of Waikato.
http://www.cs.waikato.ac.nz/ mhall/thesis.pdf
[8] Manoranjan Dash, Kiseok Choi, Peter Scheuermann, Huan Liu. 2002. Feature
Selection for Clustering – a Filter Solution. In Proceedings of the Second
International Conference on Data Mining.
[9] M. Dash and H. Liu. 1997. Feature Selection for Classification. Intelligent
Data Analysis, 1, pp. 131–156. www.elsevier.com/locate/ida
[10] Ron Kohavi, George H. John. 1997. Wrappers for Feature Subset Selection.
Artificial Intelligence, Vol. 97, No. 1-2, pp. 273-324.
[11] Shyamala Doraisamy, Shahram Golzari, Noris Mohd. Norowi, Md. Nasir B
Sulaiman, Nur Izura Udzir. 2008. A Study on Feature Selection and
Classification Techniques for Automatic Genre Classification of Traditional
Malay Music. ismir2008.ismir.net/papers/ISMIR2008 256.pdf
[12] Volker Roth and Tilman Lange. 2003. Feature Selection in Clustering
Problems. In Advances in Neural Information Processing Systems 16.
[13] Y. Saeys, I. Inza, and P. Larrañaga. 2007. A review of feature selection
techniques in bioinformatics. Bioinformatics, 23(19), pp. 2507-2517.
[14] Z. Haiyang, "A Short Introduction to Data Mining and Its Applications", IEEE,
2011
[15] J. Han and M. Kamber, "Data Mining: Concepts and Techniques", Morgan
Kaufmann, 2nd ed., 2006
[16] R. Agrawal, T. Imielinski, and A.N. Swami, "Database Mining: A
Performance Perspective," IEEE Trans. Knowledge and Data Engineering, vol.
5, no. 6, pp. 914-925, Dec. 1993.
[17] J.R. Quinlan, "Induction of Decision Trees," Machine Learning, vol. 1, no. 1,
pp. 81-106, 1986.
[18] J.R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
[19] Y. Bengio, J. M. Buhmann, M. Embrechts, and J. M. Zurada, "Introduction to
the special issue on neural networks for data mining and knowledge
discovery," IEEE Trans. Neural Networks, vol. 11, pp. 545-549, 2000.
[20] D. Michie, D.J. Spiegelhalter, and C.C. Taylor, "Machine Learning, Neural
and Statistical Classification", Ellis Horwood Series in Artificial Intelligence,
1994.
[21] J.R. Quinlan, "Comparing Connectionist and Symbolic Learning Methods,"
S.J. Hanson, G.A. Drastall, and R.L. Rivest, eds., Computational Learning
Theory and Natural Learning Systems, vol. 1, pp. 445-456. A Bradford Book,
MIT Press, 1994.
[22] J.W. Shavlik, R.J. Mooney, and G.G. Towell, "Symbolic and Neural Learning
Algorithms: An Experimental Comparison," Machine Learning, vol. 6, no. 2,
pp. 111-143, 1991.
[23] P. Clark and T. Niblett, "The CN2 induction algorithm," Machine Learning,
3(4):261-283, 1989.
[24] Y. Freund and L. Mason. The alternating decision tree algorithm. In
Proceedings of the 16th International Conference on Machine Learning, pages
124-133, 1999.
[25] UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/datasets.html
[26] Weka: http://www.cs.waikato.ac.nz/~ml/weka/
[27] I. H. Witten, E. Frank, and M. A. Hall, Data Mining: Practical Machine
Learning Tools and Techniques, 3rd ed. Morgan Kaufmann, 2011
[28] P. J. Werbos, "Backpropagation Through Time: What It Does and How to Do
It", IEEE, 1990
[29] H. Lu, R. Setiono, and H. Liu, "Effective Data Mining Using Neural
Networks", IEEE, 1996
[30] UCI Heart Disease Dataset
[31] N. Satyanandam and Dr. Ch. Satyanarayana, "A New Multilayer Perceptron
Model to Detect Heart Disease Severity", IJSER, 2016
[32] Stefan Bohn, Michael Lessnau and Oliver Burgert, "Monitoring and Diagnosis
of Networked Medical Hardware and Software for the Integrated Operating
Room", Innovation Center Computer Assisted Surgery (ICCAS), University of
Leipzig, Germany, 2010
ABOUT THE AUTHORS
N Satyanandam is working as an Associate Professor in the Department
of Computer Science and Engineering, Bhoj Reddy Engineering
College for Women, Hyderabad, Telangana, India. He received a
B.Tech (CSE) in 1996 and an MBA (MM) in 1999, both from Andhra
University, Visakhapatnam, and an M.Tech (Computer Science &
Engineering) in 2004 from JNTU, Hyderabad. He has 17 years of
teaching experience. He is pursuing a Ph.D at JNTUH, Hyderabad. He has published 9
research papers in national and international journals and conferences. His research
interests are Data Mining & Warehousing, Machine Learning, Neural
Networks, and Digital Image Processing. He is a Life Member of ISTE.
Dr. Ch Satyanarayana is working as a Professor in the Department of
Computer Science & Engineering, University College of Engineering
JNTUK, Kakinada, Andhra Pradesh, India. He received a B.Tech (CSE) in
1996 and an M.Tech (CST) in 1998, both from Andhra University,
Visakhapatnam. He has 17 years of teaching experience at JNTUKUCE.
His research interests are Pattern Recognition, Image Processing, Speech
Processing, Computer Graphics, Data Mining & Warehousing, Machine Learning and
Compiler Writing. He has published more than 100 papers in national and
international journals and conferences. He is a member of several technical bodies,
including ISTE, IETE and CSI.