Date post: | 27-Jan-2017 |
Category: | Technology |
Upload: | flavio-luiz-seixas |
October, 2015
FLAVIO LUIZ SEIXAS, PHD.
SIADE PROJECT
Participating Institutions
• Center for Studies and Research on Aging (CEPE-Rio), Vital Brazil Institute, Rio de Janeiro.
• Center for Alzheimer's Disease and Related Disorder (CDA-IPUB-UFRJ), Institute of Psychiatry, Federal University of Rio de Janeiro.
• Institute of Computing, Federal Fluminense University (IC-UFF), Niterói.
• Midiacom Lab, Federal Fluminense University, Niterói.
• Medical Sciences College, Rio de Janeiro State University, Rio de Janeiro.
• National Laboratory for Scientific Computing (INCT), Brazil.
• Pontifical Catholic University of Rio de Janeiro (PUC-Rio), Rio de Janeiro.
• King’s College London (KCL).
Agenda
• Motivation
• Objectives
• Clinical decision modeling
• Achievements
• Principal challenges and future works
Motivation
• Alzheimer’s disease accounts for 50-80% of dementia cases.
• Dementia had a prevalence of 7.8% among the elderly of a local community in São Paulo. Herrera et al. (2002)
• Another survey found a prevalence of 6.9% among the elderly in São Paulo, with Alzheimer’s disease accounting for 59% of dementia cases. Bottino et al. (2006)
• Dementia prevalence among the elderly ranges from 4.6% to 9.7%. Rodriguez et al. (2008)
• By 2020, Brazil is expected to rank sixth worldwide in elderly population.
Motivation
Decision support systems are designed to help physicians with clinical decision making.
Benefits:
• Ability to address the information overload that
physicians face.
• Integrating evidence-based knowledge.
Objective
Design and develop a clinical decision support system for the diagnosis of Dementia, Alzheimer's Disease and Mild Cognitive Impairment.
Why?
• World-wide population aging.
• High prevalence of Dementia among elderly.
• Early diagnosis of Alzheimer’s Disease can improve treatment efficiency and patient quality of life, and reduce costs for public health systems.
CDSS - Principal Components
• Physician: uses a mobile application to ask for decision support for a diagnosis.
• Mobile application: exchanges HTTP messages with the clinical decision support system over the Internet.
• Clinical decision support system: communication interface, inference engine and knowledge base. It provides suggestions for possible diagnoses that match a patient's signs and symptoms.
• Knowledge acquisition: published references related to diagnosis criteria, and clinical records of normal controls and patients.
Decision Modeling Process
Start: decision modeling for a disease.
1. Identify the diagnosis guideline for the disease (output: diagnosis criteria for the disease).
2. Preprocess the clinical records of patients and normal controls (output: training database).
3. Build a Bayesian network structure.
4. Perform Bayesian parameter learning.
5. Evaluate the Bayesian learning.
6. Acceptable performance measures?
• Yes: deploy the decision model (end: decision model modeled).
• No: review the decision model with additional attributes and additional clinical records, then repeat the process.
Diagnosis Process for Dementia, AD and MCI
1. Patient care is requested.
2. Take the patient's medical history and/or carry out clinical examinations for dementia screening.
3. Does the patient have possible dementia?
• No: carry out treatment for other diseases, with treatment follow-up (*).
• Yes: carry out neuropsychological tests for Dementia.
4. Is the diagnosis of Dementia confirmed?
• Yes: carry out neuropsychological tests and exams for Dementia due to Alzheimer's Disease.
• No: carry out psychological tests and exams for Mild Cognitive Impairment.
5. Is the diagnosis of Alzheimer's Disease confirmed?
• Yes: treatment for Dementia due to Alzheimer's Disease, with follow-up (*).
• No: treatment follow-up (*).
6. Is the diagnosis of Mild Cognitive Impairment confirmed?
• Yes: treatment for Mild Cognitive Impairment, with follow-up (*).
• No: treatment follow-up (*).
(*) A treatment should be defined by a physician.
Preprocessing the Health Records
Start: preprocess the patients’ health records.
1. Integrate the patients’ health records, spread across multiple spreadsheets, into one training database.
2. Database balancing.
3. Attribute selection.
4. Discretize numerical attributes.
End: training database preprocessed.
Training Database
| Decision model | Positive | Negative |
|---|---|---|
| Dementia | 180 | 67 |
| Alzheimer’s Disease | 135 | 45 |
| Mild Cognitive Impairment | 32 | 35 |
Composed of:
• Health records of normal controls and patients provided by the Center for Alzheimer's Disease and Related Disorder, Institute of Psychiatry, UFRJ. Project approved by the Research Ethics Committee (2012).
Data Balancing
Method: SMOTE (Synthetic Minority Over-sampling Technique) (1)
| Decision model | Positive (before) | Negative (before) | Positive (after) | Negative (after) |
|---|---|---|---|---|
| Dementia | 180 | 67 | 180 | 134 |
| Alzheimer’s Disease | 135 | 45 | 135 | 90 |
| Mild Cognitive Impairment | 32 | 35 | 32 | 35 |
1: Chawla, N. V.; Bowyer, K. W.; Hall, L. O.; Kegelmeyer, W. P. SMOTE: Synthetic Minority Over-Sampling Technique. Journal of Artificial Intelligence Research, v. 16, p. 321-357, 2002.
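The balancing step can be illustrated with a minimal sketch of the SMOTE idea: each synthetic record is an interpolation between a minority-class record and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the two-dimensional toy points are assumptions made for illustration; the project's actual tooling and attributes are not stated on the slides.

```python
import math
import random


def smote(minority, per_sample=1, k=5, seed=0):
    """Return synthetic points interpolated between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for i, base in enumerate(minority):
        # k nearest minority-class neighbours of the base point (itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        for _ in range(per_sample):
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical toy data: 45 minority ("negative") records with two numeric attributes.
toy_rng = random.Random(42)
negatives = [(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(45)]
balanced_negatives = negatives + smote(negatives, per_sample=1)
print(len(negatives), "->", len(balanced_negatives))  # 45 -> 90
```

With one synthetic record per original record, the minority class doubles, which mirrors the 45 to 90 and 67 to 134 changes in the negative classes.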
Attributes Selection
| Attribute | MD (%) | Entropy |
|---|---|---|
| Mini-mental state examination score | 5 | 0.2791 |
| Clinical Dementia Rating scale | 11 | 0.2441 |
| Pfeffer questionnaire score | 12 | 0.2074 |
| Verbal fluency test score | 8 | 0.1665 |
| Clock drawing test scale | 12 | 0.0881 |
| Trail making test scale | 40 | 0.0829 |
| Age | 4 | 0.0684 |
| Lawton scale | 58 | 0.0342 |
| IQCode score | 56 | 0.0324 |
| Stroop color word test | 60 | 0.0209 |
| Gender | 9 | 0.0001 |
| Depression | 16 | 0.0001 |
| Education level | 2 | 0.0423 |
| Rey Complex Figure | 78 | 0.0181 |
| Cambridge Cognitive Examination | 79 | 0.0000 |
| Digit symbol | 81 | 0.0000 |
| Neuropsychiatric inventory | 56 | 0.0000 |
| Cornell depression scale | 62 | 0.0000 |
| Timed Up and Go | 64 | 0.0000 |
| POMA | 85 | 0.0000 |
| Sit-to-Stand test | 97 | 0.0000 |
| Digit span test | 62 | 0.0000 |
| Rey Auditory-Verbal Learning | 93 | 0.0000 |
| Brain anatomical structures volume | 83 | 0.0000 |
Criteria: attributes are kept when the missing data ratio is low (MD < 60%) AND the information gain is positive (Entropy > 0.00001).
MD = missing data ratio: the number of records in which the attribute is missing divided by the total number of records for that attribute.
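The two selection criteria (MD < 60% and information gain > 0.00001) can be sketched as a simple filter. This is an illustrative sketch only: the function names, the toy attribute table and the choice to compute information gain over observed records are assumptions, not the project's code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def missing_ratio(values):
    """MD: share of records in which the attribute is missing (None)."""
    return sum(v is None for v in values) / len(values)


def information_gain(values, labels):
    """Class entropy minus the entropy remaining after splitting on the attribute."""
    observed = [(v, y) for v, y in zip(values, labels) if v is not None]
    if not observed:
        return 0.0
    groups = {}
    for v, y in observed:
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / len(observed) * entropy(g) for g in groups.values())
    return entropy([y for _, y in observed]) - remainder


def select_attributes(table, labels, max_md=0.60, min_gain=0.00001):
    """Keep attributes with MD below max_md and information gain above min_gain."""
    return [name for name, values in table.items()
            if missing_ratio(values) < max_md
            and information_gain(values, labels) > min_gain]


# Hypothetical toy records (already discretized); None marks a missing value.
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
table = {
    "mmse_band": ["low", "low", "low", "high", "high", "high"],  # informative
    "gender": ["F", "M", "F", "F", "M", "F"],                    # no information gain
    "digit_span": [None, None, None, None, "low", "high"],       # MD = 67%
}
selected = select_attributes(table, labels)
print(selected)  # ['mmse_band']
```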
Bayes’ Rule
Bayes’ rule:
$$P(h \mid e) = \frac{P(e \mid h) \cdot P(h)}{P(e)}$$
The probability of a hypothesis h conditioned upon some evidence e is equal to its likelihood P(e | h) times its probability prior to any evidence P(h), normalized by dividing by P(e).
Definition: after applying Bayes’ theorem to obtain P(h | e), adopt that as your posterior degree of belief in h, or Bel(h) = P(h | e).
Given dichotomous random variables (variables that take one of only two possible values when observed or measured):
$$P(h \mid e) = \frac{P(e \mid h) \cdot P(h)}{P(e \mid h) \cdot P(h) + P(e \mid \neg h) \cdot P(\neg h)}$$
Bayesian Network
Bayesian network:
A Bayesian network is a graphical structure that allows us to represent and reason about an uncertain domain. The nodes in a Bayesian network represent a set of random variables X = X1, …, Xi, …, Xn. A set of directed arcs (or links) connects pairs of nodes Xi → Xj, representing the direct dependencies between variables.
Figure: an arc Xi → Xj. Predictive reasoning follows the direction of the arc (from Xi to Xj); diagnostic reasoning runs against it (from Xj to Xi).
Example:
Suppose that we have this very simple model of flu (Flu) causing a high temperature (Hi), with the following prior and conditional probability values.
| Probability | Value |
|---|---|
| Pr(Flu=True) | 5% |
| Pr(Flu=False) | 95% |
| Pr(Hi=True \| Flu=True) | 90% |
| Pr(Hi=False \| Flu=True) | 10% |
| Pr(Hi=True \| Flu=False) | 20% |
| Pr(Hi=False \| Flu=False) | 80% |
If an individual has a high temperature (i.e., the evidence available is Hi=True), the computation for this diagnostic reasoning is as follows:
$$Bel(Flu = T) = \alpha \cdot P(Hi = T \mid Flu = T) \cdot P(Flu = T) = \alpha \cdot 0.9 \cdot 0.05 = \alpha \cdot 0.045$$
$$Bel(Flu = F) = \alpha \cdot P(Hi = T \mid Flu = F) \cdot P(Flu = F) = \alpha \cdot 0.2 \cdot 0.95 = \alpha \cdot 0.19$$
Bel(Flu = T) + Bel(Flu = F) = 1, given that the variable states are mutually exclusive. So:
$$\alpha \cdot 0.045 + \alpha \cdot 0.19 = 1 \;\therefore\; \alpha = \frac{1}{0.045 + 0.19}$$
$$Bel(Flu = T) = \frac{0.045}{0.19 + 0.045} \approx 0.19$$
$$Bel(Flu = F) = \frac{0.19}{0.19 + 0.045} \approx 0.81$$
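The flu example can be checked numerically with a short sketch. The function name `posterior` and its parameter names are chosen here for illustration; the probabilities are the ones given on the slide.

```python
def posterior(prior, likelihood_true, likelihood_false):
    """Return (P(h | e), P(not h | e)) for a binary hypothesis via Bayes' rule."""
    joint_true = likelihood_true * prior             # P(e | h) * P(h)
    joint_false = likelihood_false * (1.0 - prior)   # P(e | not h) * P(not h)
    alpha = 1.0 / (joint_true + joint_false)         # normalizing constant
    return alpha * joint_true, alpha * joint_false


# Evidence: Hi=True. Prior P(Flu=True)=0.05, P(Hi=True|Flu=True)=0.90, P(Hi=True|Flu=False)=0.20.
bel_true, bel_false = posterior(prior=0.05, likelihood_true=0.90, likelihood_false=0.20)
print(round(bel_true, 2), round(bel_false, 2))  # 0.19 0.81
```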
Generic Bayesian Network Structure
• B1, B2, …, Bn: background information (predisposing factors).
• Q: query node (disease).
• F1, F2, …, Fm: findings (symptoms, signs, neuropsychological test results).
• D: decision box.
• U: utility function.
Bayesian Learning
Objective: learn the most probable hypothesis h given data $D = \{X_i; d_i\}$.
For each h ∈ H, calculate:
$$P(h \mid D) \propto P(D \mid h) \cdot P(h)$$
Bayesian estimators:
Maximum A Posteriori probability:
$$h_{MAP} = \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} P(D \mid h) \cdot P(h)$$
Maximum Likelihood:
$$h_{ML} = \arg\max_{h \in H} P(D \mid h)$$
Discretize Numerical Attributes
Minimum Description Length (MDL) (1): Occam’s razor: choose the shortest explanation for the observed data.
$$h_{MAP} = \arg\max_{h \in H} P(D \mid h) \cdot P(h)$$
$$h_{MAP} = \arg\max_{h \in H} \left[ \log_2 P(D \mid h) + \log_2 P(h) \right]$$
$$h_{MAP} = \arg\min_{h \in H} \left[ -\log_2 P(D \mid h) - \log_2 P(h) \right]$$
This equation can be interpreted as a statement that short hypotheses are preferred.
Let $L_C(i)$ denote the description length of message i with respect to code C.
$L_{C_{D|h}}(D \mid h) = -\log_2 P(D \mid h)$, where $C_{D|h}$ is the optimal code for describing data D given hypothesis h.
$L_{C_H}(h) = -\log_2 P(h)$, where $C_H$ is the optimal code for hypothesis space H.
So:
$$h_{MAP} = \arg\min_{h \in H} \left[ L_{C_{D|h}}(D \mid h) + L_{C_H}(h) \right]$$
1: Kononenko, I. On biases in estimating multi-valued attributes. International Joint Conference on Artificial Intelligence, 1995. Lawrence Erlbaum Associates. p. 1034-1040.
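The equivalence between the MAP hypothesis and the shortest total description length can be checked on a toy hypothesis space. The three hypotheses and their probabilities below are made up purely for illustration, and the sketch does not implement the discretization procedure itself.

```python
import math

# Hypothetical hypothesis space: name -> (prior P(h), likelihood P(D | h)).
hypotheses = {
    "h1": (0.70, 0.02),
    "h2": (0.25, 0.10),
    "h3": (0.05, 0.30),
}


def description_length(prior, likelihood):
    """Bits needed to encode the hypothesis plus the data given the hypothesis."""
    return -math.log2(likelihood) - math.log2(prior)


# MAP: maximize P(D | h) * P(h).
h_map = max(hypotheses, key=lambda h: hypotheses[h][1] * hypotheses[h][0])
# MDL: minimize L(D | h) + L(h).
h_mdl = min(hypotheses, key=lambda h: description_length(*hypotheses[h]))
print(h_map, h_mdl)  # h2 h2
```

Because the logarithm is monotonic, both criteria always pick the same hypothesis; here `h2` wins even though `h1` has the larger prior and `h3` the larger likelihood.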
Bayesian Learning: EM Algorithm
Expectation-Maximization algorithm (1) (1/3):
• Finds maximum likelihood estimates for θ when the given dataset is incomplete.
• Starts with random probability distributions.
• Alternates between two steps:
• Expectation step: “complete” the data set by using the current parameter estimates θ̂ (calculate expectations for the missing values).
• Maximization step: use the “completed” data set to find a new maximum likelihood estimate θ̂' for the parameters.
1: Dempster, A. P.; Laird, N. M.; Rubin, D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), v. 39, n. 1, p. 1-38, 1977. ISSN 0035-9246.
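Bayesian Learning: EM Algorithm
Expectation-Maximization algorithm (2/3):
Let:
$y_i$: observable variables.
$z_i$: latent variables.
$\theta$: all possible parameters in the model.
The goal is to find:
$$\hat{\theta} = \arg\max_{\theta} P(\theta \mid D)$$
$$P(\theta \mid y_1, \ldots, y_n) \propto P(y_1 \ldots y_n \mid \theta) \cdot P(\theta) \propto P(y_1 \ldots y_n \mid \theta)$$
The last step takes the prior P(θ) to be uniform. The likelihood is obtained by integrating out the latent variables:
$$P(y_1 \ldots y_n \mid \theta) = \int P(y_1 \ldots y_n, z_1 \ldots z_n \mid \theta) \, dz$$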
Bayesian Learning: EM Algorithm
Expectation-Maximization algorithm (3/3):
Using the auxiliary function:
$$Q(\theta \mid \theta_t) = \int P(z_1 \ldots z_n \mid \theta_t, y_1 \ldots y_n) \, \log P(\theta, z \mid y_1 \ldots y_n) \, dz$$
What the EM algorithm does is:
$$\theta_{t+1} = \arg\max_{\theta} Q(\theta \mid \theta_t)$$
with a random starting point.
E-Step: find the probabilities for z1…zn with all parameters fixed to θt.
M-Step: now that P(z1...zn | θt, y1...yn) is fixed, find the θ that maximizes the integral.
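The alternation of E and M steps can be sketched on a two-node network Flu → Hi in which some Flu values are missing. Everything in this sketch is an assumption made for illustration: the function name `em_flu`, the toy records, and the fixed starting point (the slides mention a random start; a fixed one is used here so that the run is reproducible).

```python
import math


def em_flu(records, iters=20, init=(0.5, 0.6, 0.4)):
    """EM for P(Flu), P(Hi|Flu=1), P(Hi|Flu=0); flu is None when unobserved."""
    p_f, p_h1, p_h0 = init
    history = []  # observed-data log-likelihood before each update
    for _ in range(iters):
        # E-step: expected value of Flu for every record under current parameters.
        weights, loglik = [], 0.0
        for flu, hi in records:
            like1 = p_f * (p_h1 if hi else 1 - p_h1)
            like0 = (1 - p_f) * (p_h0 if hi else 1 - p_h0)
            if flu is None:
                weights.append(like1 / (like1 + like0))
                loglik += math.log(like1 + like0)
            else:
                weights.append(float(flu))
                loglik += math.log(like1 if flu else like0)
        history.append(loglik)
        # M-step: maximum likelihood estimates from the "completed" data.
        n1 = sum(weights)
        n0 = len(records) - n1
        p_f = n1 / len(records)
        p_h1 = sum(w for w, (_, hi) in zip(weights, records) if hi) / n1
        p_h0 = sum(1 - w for w, (_, hi) in zip(weights, records) if hi) / n0
    return (p_f, p_h1, p_h0), history


# Hypothetical records (flu, high_temperature); None marks a missing Flu value.
RECORDS = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (0, 0),
           (None, 1), (None, 0), (None, 1), (None, 0), (None, 0)]
params, history = em_flu(RECORDS)
print([round(p, 3) for p in params])
```

A useful sanity check on any EM implementation is that the observed-data log-likelihood stored in `history` never decreases from one iteration to the next.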
Bayesian network for Dementia (figure). Utility node: Utility. Decision box: Dementia? Node state probabilities shown in the figure:
| Node | State: probability |
|---|---|
| Education | >13: 6%; 0-13: 94% |
| Gender | Female: 82%; Male: 18% |
| Age | >72: 56%; 0-72: 44% |
| Diagnosis | Positive: 58%; Negative: 42% |
| Clock Drawing Test (CDT) scale | 5: 1%; 4: 1%; 3: 16%; 2: 21%; 1: 50%; 0: 12% |
| Mini Mental State Exam (MMSE) score | 27-30: 20%; 18-26: 41%; 0-17: 39% |
| Verbal Fluency Test (VFT) score | >11: 51%; 5-11: 46%; 0-4: 16% |
| Clinical Dementia Rating (CDR) scale | 3-severe: 19%; 2-moderate: 15%; 1-mild: 32%; 0.5-very mild: 29%; 0-normal control: 6% |
| IQCode (Informant Questionnaire on Cognitive Decline in the Elderly) score | >3.55: 72%; 0-3.55: 28% |
| Lawton scale | >9: 74%; 0-9: 26% |
| Stroop color word test | >15: 71%; 0-15: 29% |
| Trail Making Test (TMT) | >59: 72%; 17-59: 18%; 0-16: 10% |
| Berg balance scale | >51: 39%; 0-51: 61% |
| Pfeffer questionnaire | >2: 78%; 1-2: 8%; 0: 14% |
| Depression | Presence: 32%; Absence: 68% |
Bayesian network for Alzheimer’s Disease (figure). Utility node: Utility. Decision box: Alzheimer’s Disease? Node Dementia? set to Positive. Node state probabilities shown in the figure:
| Node | State: probability |
|---|---|
| Education | >13: 4%; 0-13: 96% |
| Gender | Female: 77%; Male: 23% |
| Age | >72: 66%; 0-72: 34% |
| Diagnosis | Positive: 59%; Negative: 41% |
| Clock Drawing Test (CDT) scale | 5: 0%; 4: 1%; 3: 11%; 2: 12%; 1: 60%; 0: 17% |
| Mini Mental State Exam (MMSE) score | 27-30: 4%; 18-26: 32%; 0-17: 64% |
| Verbal Fluency Test (VFT) score | >11: 14%; 5-11: 60%; 0-4: 26% |
| Clinical Dementia Rating (CDR) scale | 3-severe: 23%; 2-moderate: 22%; 1-mild: 54%; 0.5-very mild: 1%; 0-normal control: 0% |
| Trail Making Test (TMT) | >59: 71%; 17-59: 9%; 0-16: 20% |
| Stroop color word test | >15: 73%; 0-15: 27% |
| Lawton scale | >9: 77%; 0-9: 23% |
| IQCode (Informant Questionnaire on Cognitive Decline in the Elderly) score | >3.55: 81%; 0-3.55: 19% |
| Berg balance scale | >51: 33%; 0-51: 67% |
| Pfeffer questionnaire | >2: 97%; 1-2: 3%; 0: 0% |
| Depression | Presence: 37%; Absence: 63% |
Bayesian network for Mild Cognitive Impairment (figure). Utility node: Utility. Decision box: Mild Cognitive Impairment? (also labeled "Mild Cognitive Disorder?" in the figure). Node Dementia? set to Negative. Node state probabilities shown in the figure:
| Node | State: probability |
|---|---|
| Education | >15: 14%; 0-15: 86% |
| Gender | Female: 77%; Male: 23% |
| Age | >69: 48%; 0-69: 52% |
| Diagnosis | Positive: 59%; Negative: 41% |
| Clock Drawing Test (CDT) scale | 5: 0%; 4: 0%; 3: 44%; 2: 28%; 1: 28%; 0: 0% |
| Mini Mental State Exam (MMSE) score | 29-30: 39%; 0-28: 61% |
| Verbal Fluency Test (VFT) score | >15: 37%; 0-15: 63% |
| Clinical Dementia Rating (CDR) scale | 3-severe: 2%; 2-moderate: 2%; 1-mild: 32%; 0.5-very mild: 23%; 0-normal control: 41% |
| Trail Making Test (TMT) | >36: 78%; 0-36: 22% |
| Stroop color word test | >17: 47%; 0-17: 53% |
| Lawton scale | >14: 64%; 0-14: 36% |
| IQCode (Informant Questionnaire on Cognitive Decline in the Elderly) score | >3.32: 69%; 0-3.32: 31% |
| Berg balance scale | >55: 41%; 55: 22%; 0-54: 36% |
| Pfeffer questionnaire | >1: 43%; 0-1: 57% |
| Depression | Presence: 45%; Absence: 55% |
Bayesian Learning: Results Evaluation
1. Using 4-fold cross-validation, we compared the performance of the Bayesian network with other well-known classifiers:
• Naïve Bayes
• Logistic Regression
• Multilayer Perceptron
• Decision Table
• Decision Stump using a boosting algorithm
• J48 Decision Tree
2. Qualitative evaluation of the sensitivity analysis results.
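The 4-fold evaluation loop can be sketched as follows. The stratified split, the threshold classifier and the toy MMSE-like scores are assumptions for illustration; the slides do not say how the folds were drawn or which toolkit ran the comparison.

```python
import random


def stratified_folds(labels, k=4, seed=0):
    """Split record indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def cross_validate(features, labels, fit, predict, k=4):
    """Train on k-1 folds, test on the held-out fold, return one accuracy per fold."""
    scores = []
    for fold in stratified_folds(labels, k):
        held_out = set(fold)
        train = [i for i in range(len(labels)) if i not in held_out]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, features[i]) == labels[i] for i in fold)
        scores.append(hits / len(fold))
    return scores


# Hypothetical stand-in classifier: threshold halfway between the class means.
def fit_threshold(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == "pos"]
    neg = [x for x, y in zip(xs, ys) if y == "neg"]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return "pos" if x < threshold else "neg"


# Hypothetical toy scores: low values for positive cases, high values for negatives.
scores_in = [12, 14, 15, 17, 18, 16, 13, 11, 27, 28, 29, 30, 26, 25, 28, 29]
classes = ["pos"] * 8 + ["neg"] * 8
fold_accuracy = cross_validate(scores_in, classes, fit_threshold, predict_threshold, k=4)
print(fold_accuracy)  # [1.0, 1.0, 1.0, 1.0]
```

Any of the listed classifiers could be plugged in through the `fit` and `predict` arguments, so every model is scored on the same folds.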
Bayesian Learning: Results Evaluation
Classification performance measures:
| Performance measure | Acronym | Domain | Best score |
|---|---|---|---|
| Area under ROC curve | AUC | [0, 1] | 1 |
| Harmonic mean of precision and recall | F1 | [0, 1] | 1 |
| Mean square error | MSE | [0, 1] | 0 |
| Mean cross-entropy | MXE | [0, ∞) | 0 |
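The four measures can be computed from true labels and predicted probabilities as in the sketch below. The function names, the 0.5 decision threshold for F1 and the toy predictions are assumptions for illustration, not the evaluation code used in the project.

```python
import math


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(y_true, scores, threshold=0.5):
    """Harmonic mean of precision and recall at the given threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(1 for p, y in zip(pred, y_true) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(pred, y_true) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, y_true) if p == 0 and y == 1)
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def mse(y_true, scores):
    """Mean squared difference between predicted probability and true label."""
    return sum((s - y) ** 2 for y, s in zip(y_true, scores)) / len(y_true)


def mxe(y_true, scores, eps=1e-12):
    """Mean cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, s in zip(y_true, scores):
        s = min(max(s, eps), 1 - eps)
        total -= y * math.log(s) + (1 - y) * math.log(1 - s)
    return total / len(y_true)


# Hypothetical predictions: 1 = positive diagnosis, score = predicted probability.
y_true = [1, 1, 0, 0]
y_score = [0.9, 0.6, 0.4, 0.2]
print(auc(y_true, y_score), f1(y_true, y_score), round(mse(y_true, y_score), 4))  # 1.0 1.0 0.0925
```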
Bayesian Learning: Sensitivity Analysis
Next Challenges
1. Design and develop a prototype application: http://siade.midiacom.uff.br
Future Works
2. Evaluate the decision support system in routine clinical practice.
3. Improve the decision model with a continuous Bayesian network learning process.
4. Extend the clinical decision model to other domains.
Questions
About Bayesian modeling:
1. How can we establish a method for continuously adjusting the parameters of Bayesian models?
2. A higher missing data ratio may cause bias, imprecision or confounding. Is it possible to model the missing data? What would be a reasonable missing data ratio?
3. Independence between random variables with the same parent is an assumption of Bayesian-based models. What is the best way to deal with it? How does it affect the Bayesian results?
Questions
About Dementia and other related mental disorders:
4. How could we define a health cost-effectiveness analysis for the utility node?
5. Is there any other patient database with normal controls that could be used as a training database for Bayesian learning?
6. How could we integrate the decision points identified in the current clinical guidelines with the decision boxes of Bayesian networks?
Questions
About the Decision-Support System:
7. Is there any health information system that we could integrate with our decision-support model?
8. Depending on (7), how could we ensure semantic interoperability between the knowledge base mapped onto the decision-support model and the health information system?
9. Our decision-support system has focused on the clinical diagnosis process. Is there another health care area where designing and developing a similar decision-support system would be relevant? (e.g., patient-centered treatment planning, health monitoring systems...)
Acknowledgements
This research was partially supported by:
• FAPERJ (Research Support Foundation of the State of Rio de Janeiro).
• CNPQ (National Council for Scientific and Technological Development).
Acknowledgements
I would like to thank Robin Morris and Daniel Stahl (King’s College London), Jerson Laks (Federal University of Rio de Janeiro), and Daniel Mograbi (Pontifical Catholic University of Rio de Janeiro) for this opportunity.