Artificial Intelligence in Medicine 60 (2014) 133–149
Contents lists available at ScienceDirect
Journal homepage: www.elsevier.com/locate/aiim

Temporal abstraction and temporal Bayesian networks in clinical domains: A survey

Kalia Orphanou a,∗, Athena Stassopoulou b, Elpida Keravnou c

a Department of Computer Science, University of Cyprus, P.O. Box 20537, 1678 Nicosia, Cyprus
b Department of Computer Science, University of Nicosia, P.O. Box 24005, 1700 Nicosia, Cyprus
c Department of Electrical Engineering, Computer Engineering and Informatics, Cyprus University of Technology, 30 Archbishop Kyprianou Street, 3036 Limassol, Cyprus

Article history: Received 6 February 2012; Received in revised form 15 November 2013; Accepted 27 December 2013

Keywords: Temporal abstraction; Temporal reasoning; Bayesian networks; Temporal Bayesian networks; Medical knowledge-based systems

Abstract

Objectives: Temporal abstraction (TA) of clinical data aims to abstract and interpret clinical data into meaningful higher-level interval concepts. Abstracted concepts are used for diagnostic, prediction and therapy planning purposes. On the other hand, temporal Bayesian networks (TBNs) are temporal extensions of the known probabilistic graphical models, Bayesian networks. TBNs can represent temporal relationships between events and their state changes, or the evolution of a process, through time. This paper offers a survey on techniques/methods from these two areas that were used independently in many clinical domains (e.g. diabetes, hepatitis, cancer) for various clinical tasks (e.g. diagnosis, prognosis). A main objective of this survey, in addition to presenting the key aspects of TA and TBNs, is to point out important benefits from a potential integration of TA and TBNs in medical domains and tasks. The motivation for integrating these two areas is their complementary function: TA provides clinicians with high-level views of data while TBNs serve as a knowledge representation and reasoning tool under uncertainty, which is inherent in all clinical tasks.

Methods: Key publications from these two areas of relevance to clinical systems, mainly circumscribed to the latest two decades, are reviewed and classified. TA techniques are compared on the basis of: (a) knowledge acquisition and representation for deriving TA concepts and (b) methodology for deriving basic and complex temporal abstractions. TBNs are compared on the basis of: (a) representation of time, (b) knowledge representation and acquisition, (c) inference methods and the computational demands of the network, and (d) their applications in medicine.

Results: The survey performs an extensive comparative analysis to illustrate the separate merits and limitations of various TA and TBN techniques used in clinical systems with the purpose of anticipating potential gains through an integration of the two techniques, thus leading to a unified methodology for clinical systems. The surveyed contributions are evaluated using frameworks of respective key features. In addition, for the evaluation of TBN methods, a unifying clinical domain (diabetes) is used.

Conclusion: The main conclusion transpiring from this review is that techniques/methods from these two areas, that so far are being largely used independently of each other in clinical domains, could be effectively integrated in the context of medical decision-support systems. The anticipated key benefits of the perceived integration are: (a) during problem solving, the reasoning can be directed at different levels of temporal and/or conceptual abstractions since the nodes of the TBNs can be complex entities, temporally and structurally, and (b) during model building, knowledge generated in the form of basic and/or complex abstractions can be deployed in a TBN.

© 2014 Elsevier B.V. All rights reserved.

∗ Corresponding author. Tel.: +357 22 892673; fax: +357 22 892701. E-mail address: [email protected] (K. Orphanou).
0933-3657/$ – see front matter © 2014 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.artmed.2013.12.007

1. Introduction

Time is an integral aspect of medical problem solving. Various time-oriented systems have been developed focusing on different clinical tasks (e.g. prognosis) and areas (e.g. oncology). An important characteristic of these systems is the methodologies that they have used for the analysis, representation, interpretation and
reasoning with longitudinal data [1]. Temporal abstraction (TA) and temporal Bayesian networks (TBNs) have become two topics of much interest and research in such systems. TA is a crucial process, especially in clinical monitoring, therapy planning and exploration of clinical databases. TA methods deal with the management and abstraction of time-oriented clinical data, providing high-level views of such data under given contexts. Their principal use so far has been in therapy planning and in the summarization and interpretation of patient records.

TBNs are temporal extensions of Bayesian networks, which are graphical models representing explicitly probabilistic relationships among variables. They can perform knowledge representation and reasoning under uncertainty and this makes their application highly suitable for the medical domain which is inherently uncertain. Uncertainty is present in all medical tasks and it can be in the form of uncertainty in the data, measurement uncertainty, as well as uncertainty in the medical knowledge. TBNs have been proposed in the literature to incorporate the explicit or implicit representation of time. TBNs have many applications in medicine in tasks such as medical diagnosis, forecasting, and medical decision making. They have been applied in various clinical domains to interpret and reason with large amounts of longitudinal data [2–6].

Although TA methods and many temporal extensions of BNs have been successfully used independently of each other in many clinical systems, they are complementary to each other. TA methods are used to create high-level temporal concepts from data while TBNs are used for reasoning and decision-making under uncertainty. An integration of these two methodologies with respect to clinical tasks could potentially offer substantial benefits to clinicians. For instance, it could provide them with a concrete understanding of how causal dependencies between temporal abstract concepts influence a particular disease outcome. Knowing how state changes and trends in variables affect a patient's state or the occurrence of a disease can facilitate treatment choices. Furthermore, the interpretation of high-level data can be effectively achieved by constructing a causal temporal model that would be able to explain the temporal patterns observed in the data. For example, the representation of temporal abstracted concepts in the nodes of a complex network such as TreatBN [7] would have resulted in a simpler and more comprehensible network, where the represented abstracted concepts and the prognostic results are compatible with experts' knowledge and medical literature. Moreover, the network can give a concrete understanding of causal dependencies between abstract concepts occurring in a synchronous or asynchronous fashion, utilizing temporal relations besides precedence (e.g. meets, during, overlaps). Taking this development into consideration, the current paper provides a broad review of what has been achieved until today independently in these two areas and discusses the anticipated benefits of their integration, which is what motivates our research in the direction of integrating TA with TBNs.

This survey reviews key aspects of TA methods with respect to clinical systems, such as the acquisition of knowledge driving the derivation of temporal abstractions and the methodologies for deriving basic and complex temporal abstractions. The types of temporal abstractions (basic, complex), and consequently the knowledge required for their derivation, constitute the linkage with the Bayesian networks. In other words, the derived temporal abstractions will be given as input to the Bayesian network. Both basic and complex abstractions should be used in the inference process. Acquiring the knowledge for deriving the abstractions (knowledge acquisition) is an essential part of a TA process which is also useful for the construction of a TBN. With respect to TBNs, we have chosen to focus on the categories of those networks that have applications in clinical domains. These categories are reviewed based on the following criteria: (a) representation of time, (b) knowledge representation and acquisition, (c) inference methods and the computational demands of the network, and (d) their applications in medicine.

A simple example, drawn from the domain of diabetes, is used for illustrating the different categories of TBNs. The same example is used in Section 4 to discuss the potential integration of temporal abstraction with each of the surveyed TBNs. The diabetes domain was chosen due to the fact that many TA methodologies and TBN model categories have applications in this domain [8–12].

A comprehensive review on temporal abstraction in intelligent clinical data analysis was published in 2007 [13]. That review provided information on temporal abstraction techniques applied to clinical systems and their main features in general. However, it did not address the potential integration of TA methods with respect to a particular problem solving engine such as the TBNs, which is the focus of our study. As such, the aforementioned survey is broader in context with respect to its coverage of TA methods and their features (information about the data that are abstracted, complex temporal abstractions, dimensionality of TA, etc.) in comparison to our survey, but it addresses TA methods in isolation of any higher decision making per se. Our discussion of TA is on the one hand from a narrower perspective, namely how to deploy TA methods effectively in conjunction with TBNs, but on the other hand has a specific focus. As expected, there is some overlap between the two surveys regarding their discussion of TA methods, but our focus is distinct and is based on the possible integration of TA with TBNs.

The rest of the paper is organized as follows: Section 2 provides a comparative analysis of temporal abstraction techniques (basic and complex TA) and knowledge acquisition methods for creating abstractions in the most well-known clinical systems. Section 3 presents those temporal extensions of Bayesian networks that have been applied in medical domains. Section 4 discusses the possible integration of TA with each of the presented TBN models using the diabetes example, pointing out the advantages and disadvantages of each pairwise integration. Finally, Section 5 concludes the discussion by outlining the benefits of the proposed integration and giving pointers for further research.

2. Temporal abstraction

Time can be represented either as time points or as time intervals. Time points (e.g. instants) are related to distinct events whereas intervals are related to situations lasting for a period of time. The most influential, and by now classical, logical formalisms for representing temporal information are those developed by Allen [14], McDermott [15] and Kowalski and Sergot [16]. Another formalism proposed by Shoham [17] is an improvement of those three formalisms, having precise syntax and semantics.
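Allen's interval relations can be computed directly from interval endpoints. The following sketch is our own illustration, not code from any surveyed system; it covers the seven basic relations, and the remaining six are obtained as their inverses by swapping the arguments:

```python
def allen_relation(a_start, a_end, b_start, b_end):
    """Return the Allen relation of interval A to interval B.

    Covers the seven basic relations; the other six are their
    inverses (obtained by swapping A and B).
    """
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start and a_end < b_end:
        return "starts"
    if a_start > b_start and a_end == b_end:
        return "finishes"
    if a_start > b_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_start < a_end < b_end:
        return "overlaps"
    return "inverse"  # A stands in the inverse of one of the above to B

# e.g. a fever episode (days 2-4) during a drug administration (days 1-7)
print(allen_relation(2, 4, 1, 7))  # during
```

For instance, an abstraction "fever during AZT administration" would hold when the fever interval's endpoints fall strictly inside the administration interval.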

In real life, data (and knowledge), especially in medical domains, are riddled with uncertainty and incompleteness, or are unduly low-level and voluminous for direct reasoning with. Various forms of abstraction are therefore called for. A basic abstraction mechanism is used for mapping data using a particular temporal granularity (or time unit) or a set of temporal granularities, thus having multiple granularities and means of traversing from one to the other. When there are multiple temporal granularities, these are often hierarchically organized, thus talking about finer/coarser granularities, but there can be structural relations between distinct granularities. A granularity functions to convert a dense time-axis to discrete chunks (granules) of time, enabling data to be expressed with respect to conceptually meaningful units of time, instead of infinitesimal references to time. A temporal granularity is therefore used to specify the temporal qualification of a set of data. Temporal qualifications are always chronologically ordered and relate the occurrence time of an event to another event [18].
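The conversion of a dense timestamp into a granule can be illustrated with a small sketch; the granularity names and output formats here are our own illustrative choices, not a formalism from the surveyed literature:

```python
from datetime import datetime

def to_granule(ts: datetime, granularity: str) -> str:
    """Map a dense timestamp onto a discrete granule label
    at the requested (calendar-based) granularity."""
    if granularity == "hour":
        return ts.strftime("%Y-%m-%d %H:00")
    if granularity == "day":
        return ts.strftime("%Y-%m-%d")
    if granularity == "week":
        year, week, _ = ts.isocalendar()
        return f"{year}-W{week:02d}"
    raise ValueError(f"unknown granularity: {granularity}")

ts = datetime(2014, 1, 15, 9, 27)
print(to_granule(ts, "hour"))  # 2014-01-15 09:00
print(to_granule(ts, "day"))   # 2014-01-15
print(to_granule(ts, "week"))  # 2014-W03
```

Traversing from a finer to a coarser granularity then amounts to re-labelling the same timestamp with a coarser granule.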
As mentioned, the main use of a temporal granularity is to hide details that are not known to be relevant for an application. Examples of temporal granularities are the ones related to the Gregorian calendar, such as day, minute, and second, as well as the evolution and specialization of these granularities for specific contexts or applications: trading days, banking days, academic semesters, etc. The choice of a temporal granularity (or set of temporal granularities) depends on the application. For example, for the diagnosis of a disease, some symptoms are identified if a given sequence of events is occurring each day for a week or so. Other times, the symptoms can be associated to a season, e.g. when diagnosing hay fever [19]. Most of the decision support systems developed for clinical domains can support a single time granularity for all the events represented in the system. However, some decision support systems have been developed to support different granularities [20–23].

The capability of providing a high-level view of the temporal data is a valuable step in processing data, for example through monitoring the evolution of a patient [19]. Temporal abstraction is a knowledge-based and heuristic process trying to bring out useful information from the data [24]. For example, state or merge abstractions, i.e. deriving maximal validity intervals for some property/parameter, as well as single trend abstractions, are basic temporal abstractions, while a chain of trends, periodic or other higher-level patterns are complex abstractions. Complex abstractions can entail multiple properties/parameters as well as other relations apart from temporal relations, e.g. causal or structural relations. The derived abstracted data (high-level concepts) can be matched against domain knowledge. In general, the TA task aims to describe a set of time series data and external events through sequences of context-specific temporal intervals. The generated abstractions are meaningful only within the duration of a particular context interval, such as administration of the drug 'AZT' as part of the therapy of AIDS [25].

The TA task can be distinguished into two subtasks: (a) basic temporal abstractions, which abstract events (time-point data) within episodes (interval data), and (b) complex temporal abstractions, which abstract episodes into other episodes using existing abstractions.

Below, a number of intelligent clinical systems employing TA techniques are overviewed. The systems are compared based on the TA techniques applied to the data, the method that they used for acquiring the knowledge that is needed for deriving the given abstractions and whether they are able to perform abstractions for single or multiple patients. It is noted that in machine learning, or more generally, the automatic or semi-automatic generation of new medical knowledge, multiple cases/patients need to be processed. The generated knowledge forms the model for solving problems with respect to particular patients. In the context of medical problem solving/decision making, TA mainly concerns the generation of abstractions from the records of a single patient. However, the process is of direct relevance to knowledge generation as well, since the abstractions from a representative set of individual patients can lead (be abstracted) to useful knowledge. The ability to perform abstractions for multiple patients, irrespective of purpose, entails the generalization of abstractions from single patients, or the computation of abstracted concepts directly from the records of a population of patients, or a combination of such techniques.

The ventilator management system (VM) [26] is one of the first temporal reasoning systems in medicine which uses temporal abstraction techniques to interpret online patients' data for monitoring purposes in intensive care units (ICUs). The TOPAZ [27] and IDEFIX [28] systems are also two of the earlier clinical systems developed. The TOPAZ system summarizes patient data by utilizing an integrated interpretation model approach. The IDEFIX system is a knowledge-based system which examines low-level data such as laboratory values or symptoms from patients who have systemic lupus erythematosus (ARAMIS database) [29]. It infers high-level data such as renal failure using TA methods.

In addition, the RÉSUMÉ system [20] implements the knowledge-based temporal abstraction method (KBTA) [30] to create temporal abstractions. For the TA process, time-stamped patient data, clinical events and the domain knowledge base are given as input. The RÉSUMÉ system was evaluated for the purpose of summarizing patient data in many clinical domains such as oncology [20], monitoring of children's growth [31] and management of insulin-dependent diabetes [9]. An extension of the RÉSUMÉ system is the RASTA system [21], which uses a distribution algorithm that allows the task of generating abstractions to be distributed over many computers. As a result, RASTA is able to work on very large data sets and to abstract data streams for multiple patients. RASTA is used as a temporal reasoning component of clinical decision support systems. Another methodology that was developed to create abstractions on multiple patients is called probabilistic temporal abstraction (PTA) [32]. The PTA is an extension of the KBTA method that was implemented in the RÉSUMÉ system.

The system developed by Ho et al. [22] is a clinical system applied to the hepatitis domain. In order to derive abstract descriptions and temporal patterns of hepatitis data, the system uses a new unsupervised temporal abstraction technique, called temporal relation extraction [22]. This approach is effective for detecting temporal patterns from irregular temporal sequences and temporal relations amongst detected patterns. Furthermore, Batal et al. [33] proposed a framework for generating multivariate time series features suitable for classification tasks. This method is used to automatically mine temporal abstraction patterns and use them to learn a classification model. The framework was applied to the heparin-induced thrombocytopenia dataset.

2.1. Acquisition of knowledge

The above systems are now examined in chronological order on the basis of the aforementioned criteria. Knowledge acquisition (KA) tools are designed either based on expert domain knowledge or using machine learning methods. The main problem of medical knowledge acquisition is that clinicians cannot understand the formal specification languages that knowledge engineers use to describe a problem, while knowledge engineers do not completely understand the semantics of medical knowledge. Below we outline various methodologies dealing with this problem.

In the VM program [26], knowledge acquisition is a data-based process, collecting information from current patients' data and patients' medical history. A data-based knowledge acquisition process is also used in many medical prediction problems. In such problems, survival outcome variables 1 are often dichotomized and a threshold value should be selected for the dichotomization process. In such systems, a data-driven knowledge acquisition method is often used for threshold selection. In [34], the optimal threshold value to dichotomize the outcome length of stay at the ICU is selected based on the maximal model calibration of a constructed tree model. The tree model is constructed from the data.

Manual acquisition of abstraction knowledge based on expert knowledge is applied in the TOPAZ system [27]. Domain expert's knowledge is required to define the clinical concepts that are represented in the physiological model. The knowledge is represented in a declarative way [13].

1 Survival outcomes are the ones that describe the time until a specific event occurs, such as the length of hospitalization or the length of mechanical ventilation.


Similar to TOPAZ, the IDEFIX temporal reasoning mechanism requires knowledge of the time of validity of the attributes in the database and knowledge of the temporal evolution of diseases. Knowledge is acquired from an expert and stored in a knowledge base.

Moreover, in the RÉSUMÉ system [20], the authors designed a graphical knowledge acquisition (KA) tool that acquires the knowledge required for forming high-level concepts from raw time-oriented clinical data. This tool was designed using Protégé-II [35], a tool for building knowledge-based systems. The resultant KA tool is a semi-automated graphical tool, since it requires (unlike, for example, machine-learning techniques that learn only from data) interaction with a human domain expert (e.g. an expert physician) who enters the domain-specific knowledge manually. The knowledge engineer of a KA tool selects the temporal-abstraction method by defining the domain's basic ontology (concepts such as 'drugs' and 'protocols') and the relationships between the domain's ontology and the method's internal terminology (events).

Abstraction knowledge in the RASTA system [21] is entered by a domain expert. The knowledge is stored in a knowledge base which contains abstraction hierarchies and a detailed specification for each abstraction. The PTA methodology also implements knowledge acquisition from experts.

A more recent TA knowledge acquisition approach was introduced in [36] by Hatsek et al., who present and evaluate the GESHER tool for graphical acquisition of KBTA ontological knowledge, to support medical experts and knowledge engineers in specifying and maintaining the clinical knowledge. Expert physicians of a particular clinical domain, clinical editors who understand the medical knowledge and are familiar with the formal specification language, and knowledge engineers are all involved in the KA methodology.

A comparison of two KA methods was presented in [37], where qualitative and quantitative TA techniques were applied to a prediction problem from intensive care monitoring data. In conclusion, KA is by and large a knowledge-driven process, and as such the acquisition of the relevant domain and other knowledge forms an integral part of the given TA methodologies.

2.2. Basic temporal abstraction techniques

As mentioned, there are two types of basic temporal abstractions: state and trend [38]. State abstraction construction involves the determination of the state of an individual parameter based on predefined categories (e.g. high glucose, low glucose). States represent the time period during which a piece of information is valid (e.g. heart rate being stable during 5 min). Predefined states used for state abstractions are based on domain expert knowledge, such as: low, high, stable and unstable. Trend TAs, on the other hand, represent increase, decrease and stationary patterns in time series.

The VM system [26] accepts as input quantitative patient measurements in the ICU collected through sensors, while the interpretation goal is stated in terms of clinical contexts. In order to provide clinicians with a summary of a patient's status, basic abstractions are created using production rules within the given goal (clinical contexts). The IDEFIX [28] system represents abstracted concepts by three entities of increasing complexity: abnormal primary attributes (APAs), abnormal states and abnormal diseases. State abstractions, such as abnormal states and abnormal primary attributes, are derived directly from database attribute values and the knowledge of their range of normality. Moreover, the TOPAZ [27] system generates abstractions by detecting significant temporal features in time-ordered data and by combining features into one concept in order to create interval-based abstractions.
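The basic state-abstraction step described above, classifying point values against predefined categories and merging contiguous points of equal state into maximal validity intervals, can be sketched as follows; the glucose thresholds are illustrative assumptions, not values from any surveyed system:

```python
def classify(value, low=70, high=180):
    """Map a raw value to a predefined state (illustrative glucose
    thresholds in mg/dL; real thresholds come from domain experts)."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"

def state_abstraction(series):
    """series: chronologically ordered (timestamp, value) pairs.
    Returns maximal (start, end, state) validity intervals."""
    intervals = []
    for t, v in series:
        s = classify(v)
        if intervals and intervals[-1][2] == s:
            # same state as the previous point: extend the current interval
            intervals[-1] = (intervals[-1][0], t, s)
        else:
            intervals.append((t, t, s))
    return intervals

readings = [(8, 95), (10, 110), (12, 210), (14, 190), (16, 100)]
print(state_abstraction(readings))
# [(8, 10, 'normal'), (12, 14, 'high'), (16, 16, 'normal')]
```

The resulting intervals are the point-to-episode step of basic TA; complex abstractions would then combine such episodes further.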

In the RÉSUMÉ system [20], basic abstractions are derived using five subtasks: (i) temporal-context restriction, (ii) vertical temporal inference, (iii) horizontal temporal inference, (iv) temporal interpolation, and (v) temporal pattern matching. Overall, five temporal mechanisms are used to solve these subtasks. Shahar's approach uses four of the mechanisms for deriving basic abstractions: the context forming mechanism is used to create domain-specific interpretation contexts using combinations of domain-specific events, abstraction goals of the abstraction process and also combinations of existing interpretation contexts. The contemporaneous abstraction mechanism abstracts one or more parameters that occurred on simultaneous time points or time intervals, into a value of a new abstract parameter (state abstraction). The temporal inference mechanism supports temporal semantic inference and temporal horizontal inference. Temporal semantic inference infers interval-based logical conclusions given interval-based propositions and temporal horizontal inference joins two abstractions to determine the value of a new abstraction (e.g. joining two touching intervals which are decreasing into a decreasing superinterval). Finally, the temporal interpolation mechanism [39] connects gaps between time points or time intervals, using domain-specific dynamic knowledge about the parameters involved.
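The interpolation and horizontal-inference ideas, i.e. joining two same-valued abstractions when the gap between them is small enough, can be pictured with a simple sketch; the maximal-gap parameter stands in for the domain-specific dynamic knowledge and is our own assumption, not part of the KBTA mechanisms themselves:

```python
def join_intervals(intervals, max_gap=2):
    """intervals: chronologically sorted (start, end, value) abstractions.
    Join consecutive intervals with equal value when the gap between
    them does not exceed max_gap (a stand-in for domain knowledge
    about how long the value can plausibly persist unobserved)."""
    joined = []
    for iv in intervals:
        if joined and joined[-1][2] == iv[2] and iv[0] - joined[-1][1] <= max_gap:
            # bridge the gap: merge into a superinterval with the same value
            joined[-1] = (joined[-1][0], iv[1], iv[2])
        else:
            joined.append(iv)
    return joined

abstractions = [(0, 4, "decreasing"), (5, 9, "decreasing"), (15, 20, "decreasing")]
print(join_intervals(abstractions))
# [(0, 9, 'decreasing'), (15, 20, 'decreasing')]
```

The first two decreasing episodes are close enough to be joined into a decreasing superinterval, while the third remains separate.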

The distributed algorithm of the RASTA system [21] allows independent evaluation of each abstraction in an abstraction hierarchy. Each abstraction process can be configured as a separate process running on different parallel machines and it is used for a single patient. Moreover, the PTA [32] approach performs abstractions for multiple patients. It derives basic abstractions for a single patient using the temporal interpolation subtask and one interpolation-dependent subtask, temporal coarsening. The temporal interpolation subtask estimates the distribution of a stochastic process 2 state in a similar way as in the RÉSUMÉ system. For example, it estimates the distribution of raw hematological data during a week in which raw data were not measured, using the distribution of values before and after that week. The temporal coarsening subtask calculates the stochastic process at a coarser time granularity. Moreover, for deriving abstractions for multiple patients, another two subtasks are used: temporal aggregation and temporal correlation, which can derive both basic and complex abstractions. The temporal aggregation subtask applies an aggregation function (such as minimum, maximum, average, etc.) to the states of stochastic processes of the same sample. The temporal correlation subtask compares two patient populations and outputs a series of correlation factors between states of the stochastic processes of the two populations.
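The temporal aggregation subtask can be pictured as applying the chosen aggregation function across the per-patient states of each aligned period. A toy sketch follows; the data layout and function choice are our own assumptions, not the PTA implementation:

```python
def temporal_aggregation(patients, fn=max):
    """patients: list of per-patient value sequences, aligned by period.
    Applies the aggregation function fn across patients, one result
    per aligned period."""
    return [fn(values) for values in zip(*patients)]

# three patients, four aligned weekly measurements each (hypothetical values)
cohort = [[4.1, 4.3, 4.0, 3.9],
          [4.5, 4.6, 4.4, 4.2],
          [3.8, 4.0, 4.1, 4.0]]
print(temporal_aggregation(cohort, fn=max))  # [4.5, 4.6, 4.4, 4.2]
```

Swapping `fn` for `min` or a mean function yields the other aggregations mentioned above.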

The Ho et al. [22] system converts quantitative data from the hepatitis dataset to qualitative data based on the following state primitives: {normal, low, very low, extreme low, high, very high, and extremely high}. The given system also generates trend abstractions based on the following trend primitives: {stable, increasing, fast increasing, decreasing and fast decreasing}. The most frequently used method for deriving trend abstractions is a piecewise linear representation of raw temporal data. Segmentation algorithms are divided into three groups [40]: sliding-window, top-down and bottom-up algorithms. The choice of algorithm depends on the characteristics of the dataset, such as the length of the time series, the expected level of noise and so on. Batal et al. [33] use the sliding window method to derive trend abstractions, giving them the labels {increasing, decreasing, steady}.
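The sliding-window derivation of trend labels can be sketched as follows. The least-squares slope test, the thresholds and the window width are invented for illustration; in practice they would come from domain knowledge:

```python
def trend_label(window, threshold=0.1):
    """Label a window of equally spaced measurements as 'increasing',
    'decreasing' or 'steady' from the slope of a least-squares line
    fit (illustrative thresholds, not those of any cited system)."""
    n = len(window)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(window) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, window))
    den = sum((x - x_mean) ** 2 for x in xs)
    slope = num / den
    if slope > threshold:
        return 'increasing'
    if slope < -threshold:
        return 'decreasing'
    return 'steady'

def sliding_window_trends(series, width=3):
    """Slide a fixed-width window over the series and label each window."""
    return [trend_label(series[i:i + width])
            for i in range(len(series) - width + 1)]
```

For instance, `sliding_window_trends([1.0, 2.0, 3.0, 3.0, 3.0])` labels the first two windows 'increasing' and the last one 'steady'.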

Most of the approaches for deriving trend abstractions described above assume that the data contain no noise. Salatian and Hunter [41] overcome this limitation by applying a median filtering method to remove noise. It uses an interval merging algorithm for

2 A stochastic process {Xt : t ∈ T} is a set of random variables. The index t is often interpreted as time; therefore Xt is referred to as the state of the process at time t. The set T is called the index set of the process.



Fig. 1. Allen’s interval relationships [14].

increasing, steady and decreasing. Increasing and decreasing trends can be classified into slow, moderate or rapid, depending on their rate of change. In contrast to Shahar's approach, in which intervals are classified as steady if all values in an interval have the same value, the Hunter et al. approach derives steady intervals by merging two distinct increasing and decreasing trends using temporal inferencing rules.

2.3. Complex temporal abstraction techniques

Complex TAs take as input two or more interval sets (episodes) and combine them into a new interval set associated with a new TA. The goal is to discover temporal relationships between discovered patterns of basic TAs or other complex TAs. Each complex pattern (time interval) is defined using the starting and ending points of the involved episodes. Bellazzi et al. [8] state that complex abstractions can be used in two different ways: (a) to represent the persistence of complex clinical situations and (b) to detect complex temporal patterns which cannot be detected using basic abstractions.

A popular technique for discovering complex temporal patterns is using Allen's 13 temporal relationships, as shown in Fig. 1, to express any relationship held between time intervals. Some of the relations are mirror images of each other, e.g. "Y started by X" is the same as the relation "X starts Y". Temporal relations give some insight into causal relationships, and the derivation of temporal patterns is very helpful for inspecting temporal dependencies between the basic abstracted concepts. The causal dependencies can then be represented through a TBN.
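A classifier for Allen's relations over closed intervals is straightforward to sketch; the relation names follow Fig. 1, while the tuple representation of intervals is an assumption of this example:

```python
def allen_relation(x, y):
    """Return the Allen relation that interval x = (start, end) bears
    to interval y = (start, end): seven base relations plus their
    inverses, 13 in total, with 'equals' being its own inverse."""
    xs, xe = x
    ys, ye = y
    if xe < ys: return 'before'
    if ye < xs: return 'after'
    if xe == ys: return 'meets'
    if ye == xs: return 'met-by'
    if xs == ys and xe == ye: return 'equals'
    if xs == ys: return 'starts' if xe < ye else 'started-by'
    if xe == ye: return 'finishes' if xs > ys else 'finished-by'
    if xs > ys and xe < ye: return 'during'
    if xs < ys and xe > ye: return 'contains'
    return 'overlaps' if xs < ys else 'overlapped-by'
```

The mirror-image property mentioned above shows up directly: `allen_relation((0, 3), (0, 5))` is 'starts' while `allen_relation((0, 5), (0, 3))` is 'started-by'.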

One of the first developed clinical systems that incorporated the derivation of complex abstractions is the TOPAZ system [27]. Abstractions are generated using an object-oriented temporal network, called ETNET [42], and a temporal query language, TQuery [42]. ETNET uses temporal matching algorithms to associate important temporal intervals with observations from the patient's record, while TQuery retrieves data from the ETNET by constraining queries to specific contextual properties. Consequently, the inference engine of TOPAZ searches for patterns in the rules that match patterns in the data. In the IDEFIX system [28], complex abstractions represent the presence of a disease and they are derived using deductive reasoning.

Moreover, in the RÉSUMÉ system [20], temporal patterns are abstracted using the temporal pattern matching mechanism, which matches predefined complex temporal patterns or runtime temporal queries with the abstractions created by other TA mechanisms. The output is a parameter of the pattern abstraction type, such as "rebound hyperglycemia". Abstractions can be joined together to create new complex abstractions using the temporal interpolation and temporal inference mechanisms and taking into consideration

the interpretation context intervals over which they hold [25]. In the RASTA system [21], complex abstractions are also generated using temporal interpolation and temporal inference mechanisms. Using the PTA approach [32], complex abstractions for a single patient are derived using temporal interpolation mechanisms, but also using temporal transformation mechanisms, which create new stochastic processes given current stochastic processes of a lower abstraction level. Complex abstractions for multiple patients are derived in a similar way as basic abstractions.

In the Batal et al. [33] system, complex abstractions represent temporal patterns. The sliding window method is used to generate temporal patterns by finding correlations between events. More specifically, the width of the sliding window (w) is the maximum pattern duration, as specified by the user. The algorithm only considers temporal relationships which can be observed within this window. This algorithm is based on the assumption that events that occur far enough from each other have no temporal relationship. The derived temporal patterns hold in a time interval T within w. Temporal relationships are derived based on Allen's interval relations. Similarly, in the Ho et al. system [22], complex abstractions are temporal patterns describing the temporal relations between state and trend abstractions using the following relation primitives: 'change state to', 'and', 'and then', 'majority/minority' (e.g. X/Y for glucose levels means that the majority of points are in state X = abnormal and the minority of points are in state Y = normal).
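The window-constrained pairing of events can be sketched as follows. The event representation, the coarse three-relation labelling and the assumption that events are sorted by start time are ours, for illustration only, and much simpler than the full set of Allen relations used by Batal et al.:

```python
def window_patterns(events, w):
    """Pair events (label, start, end), sorted by start time, whose
    spans jointly fit inside a window of width w, and record a coarse
    temporal relation between them; events farther apart than w are
    assumed to have no temporal relationship (sketch)."""
    def relation(a, b):
        if a[2] < b[1]:
            return 'before'
        if a[1] <= b[1] and a[2] >= b[2]:
            return 'contains'
        return 'overlaps'
    patterns = []
    for i, a in enumerate(events):
        for b in events[i + 1:]:
            if b[2] - a[1] <= w:   # both events fit in one window
                patterns.append((a[0], relation(a, b), b[0]))
    return patterns
```

With hypothetical abstracted episodes such as `[('low_BG', 0, 2), ('meal', 3, 4), ('high_BG', 9, 11)]` and `w = 6`, only the first pair is close enough to yield a pattern.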

A comparison of TA methods as they are applied in the different clinical systems described above is summarized in Table 1. Temporal abstraction techniques therefore extract qualitative discrete data utilizing domain/expert knowledge. For the integration of TA with TBNs, the constructed TBN will be used as an inference engine and the temporally abstracted concepts as its input. In the next section, temporal extensions of Bayesian networks applied in clinical domains are reviewed.

3. Bayesian networks and time

Bayesian networks (BNs) [43–45], also known as belief networks, are graphs that belong to the family of probabilistic models. They were introduced as a knowledge representation and inference approach under uncertainty through the use of probability theory. They have been successfully applied in diverse fields such as medical diagnosis [46], forecasting [47–50], image recognition [51,52], language understanding [53], risk assessment and management [54–56], speech recognition [57], and troubleshooting [58], amongst others.

A BN is a directed acyclic graph with an associated set of probability distributions. The graph consists of vertices (nodes) and edges (arcs). Nodes on the graph represent random variables which denote an attribute, a feature or a hypothesis. Each node has a mutually exclusive and exhaustive number of values (states). These nodes represent the variables of interest for a specific problem, such as a disease or a symptom. Arcs represent direct dependencies (or cause–effect relationships) among variables. An arc (arrow) from A to B indicates that the value taken by variable B depends on the value taken by variable A (or that B is influenced, or caused, by A).3 The strength of these dependencies is quantified by conditional probabilities. As an example, a BN applied in the medical domain could represent cause–effect relationships between a disease, its causes and its symptoms. Given some symptoms and some of the disease causes (risk factors), the network can be used to compute the probability of the presence of a disease.

3 If there is an arrow from node A to node B, A is said to be the parent of B.


138 K. Orphanou et al. / Artificial Intelligence in Medicine 60 (2014) 133–149

Table 1
Comparison of TA methods applied in clinical systems.

System | Basic TA methods | Complex TA methods | Knowledge acquisition (KA) tool | Single/multiple data stream
RESUME | KBTA method: temporal-context restriction, temporal inference | KBTA method: temporal interpolation, temporal pattern matching | Semi-automated | Single
RASTA | Temporal-context restriction, temporal inference | Temporal interpolation | Semi-automated | Multiple
TOPAZ | Selecting significant temporal features | No | Expert | Single
IDEFIX | State abstractions based on normal range chart | Deductive reasoning to combine basic abstractions | Expert | Single
PTA method | Single patient: temporal interpolation and temporal coarsening; multiple patients: temporal aggregation and temporal correlation | Single patient: temporal interpolation and temporal transformation; multiple patients: temporal aggregation and temporal correlation | Expert | Multiple
Batal et al. | Sliding window method for trend abstractions; state abstractions based on expert's knowledge | Apply Allen's relations between basic and/or other complex abstractions | Expert | Single
Ho et al. | State abstractions (low, normal, very low, extreme low, high, very high and extreme high) | Looks for relations between basic abstractions using the relation primitives: change state to, and, and then, majority/minority | Expert | Single
Larizza et al. | Sliding window method for trend abstractions; state abstractions based on expert's knowledge | Apply Allen's relations between basic or other complex abstractions | Expert | Single
VM | Production rules | Production rules | Data | Single

More formally, a BN with a set of variables A = {A1, . . ., An} consists of:

1. A network structure which encodes the probabilistic dependencies among the variables.
2. The network parameters, which are a set of local probability distributions Pr(Ai|parents(Ai)) associated with each node. Each probability distribution quantifies the effect of the parents on the node. These are given as tables, in the case of discrete variables, which are called conditional probability tables (CPTs). For variables that do not have any parents (i.e. the roots), their prior probability distribution is defined.

The network structure and the set of local probability distributions define the joint probability distribution for A as: Pr(A1, . . ., An) = ∏(i=1 to n) Pr(Ai|parents(Ai)). As an example, let us consider three discrete variables represented as nodes A, B and C and the network structure shown in Fig. 2. The network is quantified with prior probabilities for the root nodes A and B and a conditional probability table for node C which defines Pr(C|A, B) over all possible combinations of values for A, B and C. The joint probability distribution is therefore defined as: Pr(A, B, C) = Pr(C|A, B) · Pr(A) · Pr(B), considering all the possible values of the variables.

Fig. 2. A BN model with three variables A, B, C where A and B are the parents of C.
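The factorization can be checked numerically with a toy version of the Fig. 2 network; all probability values below are invented for illustration:

```python
# Hypothetical numbers for the three-node network of Fig. 2:
# A and B are root nodes, C has parents A and B (all binary).
p_a = {True: 0.3, False: 0.7}                     # Pr(A)
p_b = {True: 0.6, False: 0.4}                     # Pr(B)
p_c = {(True, True): 0.9, (True, False): 0.5,     # Pr(C=true | A, B)
       (False, True): 0.4, (False, False): 0.1}

def joint(a, b, c):
    """Pr(A=a, B=b, C=c) = Pr(C=c|A,B) * Pr(A=a) * Pr(B=b)."""
    pc = p_c[(a, b)] if c else 1.0 - p_c[(a, b)]
    return pc * p_a[a] * p_b[b]

# The joint distribution sums to 1 over all eight value combinations.
total = sum(joint(a, b, c)
            for a in (True, False)
            for b in (True, False)
            for c in (True, False))
```

Because each local distribution is normalized, `total` comes out to 1, confirming that the product of CPT entries defines a proper joint distribution.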

A variety of extensions to these networks introduce temporality into BNs to model temporal phenomena and reasoning through time. In temporal extensions of BNs, the initial (prior) model (at time t0) should be constructed in cooperation with domain experts, as in atemporal BNs. The knowledge acquired from domain experts will be used to define domain variables and the dependencies between variables. The transition model includes the computation of parameters, which should be either in a probability density function form or in discrete form, depending on the time representation.

Temporal probabilistic networks were recently used in many medical problems such as diagnosis [6,59,60], treatment selection [59,61], therapy monitoring [4,62] and prognosis [6,60,63–65]. This is due to the fact that they can deal well with uncertainty in time-series medical data and they allow one to learn about causal relationships and dependencies of clinical features.

The most popular temporal extension of BNs is dynamic Bayesian networks (DBNs) [66], which use a discrete-time representation. Extensions of DBNs for decision making are dynamic influence diagrams (DIDs) [67,68] and partially observable Markov decision processes (POMDPs) [61,69]. Known models that use an interval-based representation of time are: networks of probabilistic events in discrete time (NPEDT) [3], modifiable temporal Bayesian networks (MTBNs) [70], probabilistic temporal networks (PTNs) [71] and temporal nodes Bayesian networks (TNBNs) [60]. Additionally, examples of models using a continuous time representation are the continuous time Bayesian networks (CTBNs) [72] and Berzuini's network of dates model [73]. Irregular time Bayesian networks (ITBNs) [74] are a new temporal extension of BNs which are able to deal with processes occurring irregularly through time.

In the following subsections, we will choose to refer only to those BNs that were applied to a clinical domain. Part of the diabetes example proposed in [75] is used for their comparative analysis. In this example, three variables will be represented through the Bayesian network models: Meal, CHO and BG. Meal represents the amount of meal intake, which can take values from 0 to 100 g, whereas CHO represents the rate of carbohydrate remaining in the gut after taking a meal, which can take values from 0 to 8 mmol/kg. BG represents the predicted blood glucose concentration at the end of the hour, which can take values from 0 to 20 mmol/L. Edges are introduced to represent interactions that occur within the model for predicting the blood glucose concentration. The specific edges and the network topology, in general, are different for each of the temporal BN models described in this section. Hence, more details about network connectivity will be given in the respective section of each BN model.

Fig. 3. Dynamic Bayesian network with six time slices (N = 5) representing the diabetes example. The duration interval between two consecutive time slices is 4 h.

DBNs are described in Section 3.1. Section 3.2 presents NPEDTs, whereas Section 3.3 describes CTBNs. Irregular time Bayesian networks are presented in Section 3.4.

3.1. Dynamic Bayesian networks

DBNs are the most widely used temporal BNs. They are able to model stochastic processes in discrete time [66,76] and utilize a representation of a dynamic process via a set of stochastic variables in a sequence of time slices. In this subsection, we discuss the DBN model with respect to the four chosen criteria.

3.1.1. Knowledge representation and acquisition
A DBN is a network with the repeated structure of a BN for each time slice over a certain interval [77]. A DBN is a tuple (B0, B1), where B0 is a Bayesian network that represents the prior distribution for the variables in the first time slice and B1 represents the transition model for the variables in two consecutive time slices. DBNs represent the change of variable states at different time points. A node can be either a hidden node, whose values are never observed, or an observed node (with a known value). Arcs represent the local or transitional dependencies among variables. Intra-slice arcs represent the dependencies within the same time slice, as in an atemporal BN. Inter-slice arcs connect nodes between time slices and represent their temporal evolution. Every node in the second slice, and each node in the first slice that is not a root node, has an associated conditional probability table (CPT). DBNs are assumed to be time invariant, which means that the network structure per time slice and across time slices does not change. Therefore, the word 'dynamic' in their name is used to describe a dynamic system, not a network that changes structure over time. Furthermore, it is assumed that DBNs satisfy the Markov property: the conditional probability distribution of each variable at time n, for all n > 1, depends only on its parents from the same time slice or from the previous time slice, but not from earlier time slices [77].

Let A_n = (A^1_n, A^2_n, . . ., A^m_n), m ≥ 1, denote the set of variables at time n, with m being the number of variables. The parameters of a DBN are defined as [66]:

• The initial state distribution Pr(A_0) at time slice zero, such as Pr(BG_0) in the case of the diabetes example as introduced in Fig. 3.
• The transition probability for the variables in two consecutive time slices:

Pr(A_n|A_{n−1}) = ∏(i=1 to m) Pr(A^i_n | parents_n(A^i_n), parents_{n−1}(A^i_n))

where parents_n(A^i_n) denotes the parent set of A^i_n from the same time slice n, and parents_{n−1}(A^i_n) denotes the parent set of A^i_n from the previous time slice.
• The joint probability distribution (JPD) for N consecutive time slices is defined by:

Pr(A_0, . . ., A_N) = ∏(n=0 to N) ∏(i=1 to m) Pr(A^i_n | parents(A^i_n))

In the diabetes example of Fig. 3, an edge is introduced between variables Meal and CHO within the same time slice, to indicate the effect of the amount of meal intake on the rate of carbohydrate remaining in the gut, and between CHO and BG at different time slices, indicating that the rate of carbohydrate remaining in the gut at a certain time t will affect the blood glucose levels at time t + 1. We assume for simplicity that each time slice represents a time period of 4 h; thus the total number of time slices is six and therefore n = [0, . . ., 5]. CHO_{n−1} and Meal_{n−1}, for n > 1, are the observable variables that affect the probability distribution of their children BG_n and CHO_{n−1} respectively. BG_n is the hidden node at time n, represented in the corresponding time slice. Thus, the JPD is computed by:

Pr(Meal_0, . . ., BG_5) = ∏(n=0 to 5) Pr(Meal_n) · Pr(CHO_n|CHO_{n−1}, Meal_n) · Pr(BG_n|BG_{n−1}, CHO_{n−1})
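The unrolled factorization above can be evaluated directly for one complete assignment of the variables. The CPT entries below are invented, and the variables are reduced to two states ('high'/'normal') purely for illustration:

```python
# Toy, hand-filled parameters for a two-slice version of the diabetes DBN.
# All numbers are invented, not taken from the survey or from [75].
pr_meal = {'high': 0.2, 'normal': 0.8}                       # Pr(Meal_n)
pr_cho = {('high', 'high'): 0.9, ('high', 'normal'): 0.6,    # Pr(CHO_n='high' | CHO_{n-1}, Meal_n)
          ('normal', 'high'): 0.7, ('normal', 'normal'): 0.1}
pr_bg = {('high', 'high'): 0.8, ('high', 'normal'): 0.5,     # Pr(BG_n='high' | BG_{n-1}, CHO_{n-1})
         ('normal', 'high'): 0.6, ('normal', 'normal'): 0.2}

def prob(p_high, value):
    return p_high if value == 'high' else 1.0 - p_high

def trajectory_probability(meals, chos, bgs, pr_cho0, pr_bg0):
    """Probability of one complete assignment over all slices, following
    Pr(Meal_n) * Pr(CHO_n|CHO_{n-1},Meal_n) * Pr(BG_n|BG_{n-1},CHO_{n-1})
    with separate prior terms for slice 0."""
    p = pr_meal[meals[0]] * prob(pr_cho0, chos[0]) * prob(pr_bg0, bgs[0])
    for n in range(1, len(meals)):
        p *= pr_meal[meals[n]]
        p *= prob(pr_cho[(chos[n - 1], meals[n])], chos[n])
        p *= prob(pr_bg[(bgs[n - 1], chos[n - 1])], bgs[n])
    return p
```

Summing `trajectory_probability` over all value combinations yields 1, since each factor of the product is a normalized conditional distribution.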

3.1.2. Time representation and granularity
A DBN cannot represent processes that evolve at different time granularities; it represents the whole system at the finest possible granularity. The hidden Markov model (HMM) is considered to be the simplest DBN: it represents hidden variables that progress over time and observable variables that are dependent on the hidden variables. The Kalman filter model is the simplest continuous DBN. It has the same topology as an HMM, but all the nodes are assumed to have linear Gaussian distributions.

3.1.3. Inference and computational demands
In many applications of DBNs, the underlying network structure is unknown, and the most important problem is to learn the structure and parameters from temporal data and prior domain knowledge. Observability is a significant factor in the solution of a learning problem, since in situations of partial observability, when nodes are hidden or when data are missing, it is harder to learn a BN [78]. The structure and the parameters of a DBN can be learned either from data or using expert knowledge.

The methods used for learning parameters from data are the following: Markov chain Monte Carlo (MCMC) [79] and the expectation maximization (EM) algorithm. These algorithms can be applied both to discrete and continuous variables. For a continuous variable, the likelihood of a particular value is obtained from the probability density function. Various smoothing and filtering algorithms are also methods for learning parameters for continuous variables, for instance, linear Gaussian distributions (such as Kalman filters,

linear update filters) [80], a mixture of Gaussian distributions, and many other sampling and approximation algorithms.

Learning DBN structures from data is a relatively new research direction. The first investigation of DBN structure learning was reported in [81], where human car driving conditions in a simulated environment were modeled. Learning was performed using two different scoring metrics: Bayes Dirichlet equivalent scoring (BDe) and the Bayesian information criterion (BIC) [82], assuming that the model followed a first-order Markov process. It consisted of two tasks: (a) learning the prior network and (b) learning the transition network. In the presence of hidden variables (missing data), the EM algorithm in combination with scoring metrics was used to learn the network structure.

Assuming y1:n = {y1, . . ., yn} represents the observations up to and including time n, inference in a DBN consists of three tasks:

• Monitoring (filtering), which is the task of computing the current belief state for a variable A^i_n given all evidence available up to and including time n. To achieve this, P(A^i_n|y1:n) needs to be calculated. It is used to keep track of the current state for making rational decisions.
• Smoothing, which is the task of computing a belief state in the past at time n given all evidence up to the present time N. Thus, P(A^i_n|y1:N) is computed, where N > n. Smoothing is useful for getting a better estimate of a past state, because more evidence is available at time N than at time n.
• Prediction, which is the task of predicting a future belief state at time n given all evidence from the past. This means that the probability P(A^i_n|y1:N) needs to be calculated given the observations available about the past up to time N, where N < n. Prediction can be used to evaluate the effect of possible actions on the future state.

In summary, the goal of inference in a DBN is to calculate P(A_n|y1:N), where N > n for smoothing, N < n for prediction and N = n for monitoring. Examples of algorithms for exact inference on DBNs are the forward–backward algorithm [83] and the interface algorithm [66]. Furthermore, algorithms for approximate reasoning, appropriate for large and complex DBNs, were proposed in the literature. Examples are the Boyen–Koller algorithm [84], stochastic sampling [85] and variable elimination [86,87].
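For a single hidden chain (the HMM special case mentioned above), the monitoring task reduces to the classic forward (filtering) recursion, sketched below; the two-state model and its numbers in the usage note are illustrative:

```python
def forward_filter(prior, transition, emission, observations):
    """Monitoring/filtering sketch for a discrete hidden chain:
    recursively computes P(X_n | y_1:n) by pushing the belief through
    the transition model and renormalising after each observation.
    transition[i][j] is P(X_n=j | X_{n-1}=i); emission[i][y] is
    P(y | X=i)."""
    belief = prior[:]
    for y in observations:
        # predict: propagate the belief through the transition model
        predicted = [sum(belief[i] * transition[i][j]
                         for i in range(len(belief)))
                     for j in range(len(belief))]
        # update: weight by the observation likelihood, renormalise
        weighted = [predicted[j] * emission[j][y]
                    for j in range(len(predicted))]
        z = sum(weighted)
        belief = [w / z for w in weighted]
    return belief
```

For instance, with a sticky two-state chain, `transition = [[0.9, 0.1], [0.2, 0.8]]` and emissions favouring symbol 'a' in state 0, observing 'a' shifts the belief towards state 0.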

3.1.4. Applications in medicine
There are several applications of DBNs in the medical domain, for performing various clinical tasks such as diagnosis and prognosis. They represent medical knowledge explicitly in terms of causes and effects, as obtained from data, domain experts, and medical literature. Considerable work on dynamic models in medicine was carried out by Cao and collaborators, who successfully used a combination of graphical models with Markov chains to solve problems in different medical domains such as colorectal cancer management [88], neurosurgical intensive care unit monitoring [5,59], and palate management [89]. Other applications of DBNs in medicine include forecasting sleep apnea [2], and diagnosis and decision making after monitoring patients suffering from renal failure and treated by hemodialysis [4].

In [90], a DBN was applied to predict the future state of patients with a carcinoid tumor. A pathophysiological model of the patient was built by representing the effects of various complications, starting with those that have the largest effect on the variables of interest. A full prognostic model, able to predict future patient states, was built by representing the treatments and their potential effects. For the purposes of constructing a DBN model, an initial time, a transition interval and an ending time were chosen according to the available domain knowledge. The carcinoid model used hospital admission as the initial time, since this is a logical point to


start the prognostic process, and a three-month period as the interval time, as the horizon was stated by the survival time of individual patients [90].

The Pittsburgh cervical cancer screening model (PCCSM) [63] is another DBN system, used for predicting the risk of cervical precancer and cervical cancer for patients undergoing cervical screening. PCCSM uses a combination of many sources of knowledge, including expert opinion, published medical literature, patient data collected for almost four years, and test results. PCCSM is currently used in the United States as a new cervical cancer screening technology, following approval from the Food and Drug Administration.

Furthermore, a DBN was applied to the diagnosis and treatment of ventilator-associated pneumonia (VAP) in patients admitted to an intensive care unit (ICU). A static network for diagnosing VAP was constructed by Visscher et al. [91] and was used for every patient for each day in the ICU separately, without taking into account the patient's characteristics from earlier days. However, the dynamic model [59] proved to have better performance at distinguishing between patients with and without VAP than the static model, and gave better prediction results, by taking into consideration all available history of a patient and by representing time explicitly. The dynamic model was constructed in cooperation with a domain expert; the transition interval for the model was chosen to be 24 h. The elicitation method proposed by van der Gaag et al. [92,93] was used in order to estimate the conditional probabilities of the transitional relations of the two hidden processes by the expert.

Scalzo et al. [94] proposed a probabilistic framework using a dynamic Markov network to track intracranial pressure pulses in real time. A dynamic Markov network was constructed to exploit temporal dependencies between successive peaks. An efficient non-parametric Bayesian inference algorithm was used for the inference process.

3.2. Network of probabilistic events in discrete time

Networks of probabilistic events in discrete time (NPEDT) is another temporal probabilistic model, proposed by Galán and Díez [3], which represents discrete-time events. In this subsection, we discuss the NPEDT model with respect to the four chosen criteria.

3.2.1. Knowledge representation and acquisition
Nodes in the model represent temporal random variables which denote the presence or absence of an event at each time instant. For example, if variable B represents the event "abnormal glucose levels", B[a] means that the patient had abnormal glucose levels at instant a. The links in the network represent the causal temporal relationships between events. The conditional probability table (CPT) of each variable represents the probability of occurrence of the child event given that its parent events occurred at any possible time point. Thus, the CPT represents the most probable delays between the occurrence of a parent event and the corresponding child event. In a family of n parents A1, . . ., An and one child B, the CPT of B given its parents is defined by [3]:

P(B[tB] | A1[t1], . . ., An[tn]), with tB ∈ [0, . . ., nB, never] and tA ∈ [0, . . ., nA, never]   (1)

where [0, . . ., nB] is the temporal range of B and [0, . . ., nA] is the temporal range of A.

Temporal noisy gates [95,96] facilitate knowledge acquisition and representation by modeling uncertain temporal knowledge. The noisy-OR gate [97] is a widely used canonical interaction model in Bayesian networks, used to model the interaction among n causes (parents) A1, A2, . . ., An and their common effect (child) Y. The


Fig. 4. The NPEDT model for the diabetes example with a temporal range of 24 h, divided into 4-h intervals. A general CPT for BG|CHO over all possible time instants is included.

probability of each cause Ai to produce the effect is independent of the presence of other causes. The binary noisy-OR is the simplest canonical model, where all possible causes Ai of an effect Y can take only two values. The main advantage of the noisy-OR gate is that it reduces the number of parameters for each family from exponential to linear in the number of parents.
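The parameter saving of the binary noisy-OR can be seen in a short sketch; the function and its argument names are illustrative, not taken from [97]:

```python
def noisy_or(p_effect_given_cause, present):
    """Binary noisy-OR sketch: each cause A_i, when present, fails to
    produce the effect Y independently with probability 1 - p_i, so
    P(Y | causes) = 1 - product over present causes of (1 - p_i).
    Only n per-cause parameters are needed, instead of a CPT with
    entries for all 2^n parent configurations."""
    q = 1.0
    for p_i, a_i in zip(p_effect_given_cause, present):
        if a_i:
            q *= 1.0 - p_i
    return 1.0 - q
```

With two causes of strengths 0.8 and 0.5, the effect probability is 0 when neither is present and 1 − 0.2·0.5 = 0.9 when both are.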

3.2.2. Time representation and granularity
In these models, each temporal random variable A can take on a set of values v[i], i ∈ {a, . . ., b, never}, indicating the presence or absence of an event at the particular time interval, where a and b are instants representing the limits of the temporal range of interest for A. Time is discretized and is divided into a discrete number of intervals of constant duration. The time granularity (seconds, minutes, weeks, etc.) depends on the particular problem.

The main difference with a DBN model is that in the NPEDT model, each value of a variable represents the instant at which a certain event may occur within a certain temporal range of interest, and not the state of the real-world property. Therefore, events that can take n state values are represented with n different variables.

3.2.3. Inference method and computational demands
In a temporal network with one parent A and one child B, where the temporal range of interest of event A is {0, . . ., tA} and that of B is {0, . . ., tB}, Pr(B[j]|A[i]) is the probability that B happens at j (tB = j) when A had happened at i (tA = i). The CPT for B given A is computed as follows: Pr(b[j]|a[i]) = 0 when j < i (since the effect cannot precede its cause) and when i = never (the effect never happened). For all other possible delays between parent and child, the probability can be estimated by a human expert or obtained from data by taking into account the delay between A and B.

In our diabetes example, displayed in Fig. 4, we assume for simplicity of the model that all variables apart from Meal can have only two states: {high, normal}. Let us assume that the discretized state values of Meal are a1: 0–30 g, a2: 30–60 g and a3: 60–90 g, represented by variables Meala1, Meala2 and Meala3 respectively. Pij, in our example, represents the probability Pr(BG[i]|CHO[j]), i.e. the probability that glucose levels are high at time instant i (BG[i]) given that the rates of carbohydrate remaining in the gut at instant j (CHO[j]) are high (e.g. >50 g). The selected temporal range of interest for the given example is one day (24 h). This period is divided into 4-h intervals.
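The zero-probability constraint for j < i can be sketched as a small CPT-column builder. The delay distribution passed in is a hypothetical input, e.g. as elicited from an expert; any probability mass not assigned to an instant goes to 'never':

```python
def delay_cpt(temporal_range_b, i, delay_probs):
    """One CPT column of an NPEDT family with a single parent A:
    Pr(B[j] | A[i]) is zero for j < i, because the effect cannot
    precede its cause; otherwise it is the (assumed) probability of a
    delay of j - i steps, with leftover mass assigned to 'never'."""
    column = {}
    used = 0.0
    for j in range(temporal_range_b + 1):
        p = 0.0 if j < i else delay_probs.get(j - i, 0.0)
        column[j] = p
        used += p
    column['never'] = 1.0 - used
    return column
```

For a parent occurring at i = 2 with assumed delay probabilities {0: 0.2, 1: 0.5, 2: 0.1}, instants 0 and 1 get probability zero and 'never' absorbs the remaining 0.2.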

3.2.4. Applications in medicine
To the best of our knowledge, the NasoNet system [62] is the only medical system developed using the NPEDT approach. It models the evolution of a nasopharyngeal cancer to assist oncologists in the diagnosis and prognosis of this type of cancer in a patient. Each arc represents a causal relation between one parent event and one child


event. For instance, the appearance of infection in the nasopharynx may produce rhinorrhea. Temporal noisy OR-gates are used to model causal interactions in the network. The property of time invariance is assumed for the computation of conditional probabilities. Therefore, one parameter for each possible delay between cause and effect is required for computing the conditional probability tables for each variable. The selected temporal range of interest for the NasoNet dataset is three years following the appearance of the primary tumor. This period is divided into trimesters, according to the experts. Each event represented in the network has its own typical period of occurrence within the temporal range of interest.

3.3. Continuous time Bayesian networks

Continuous time Bayesian networks (CTBNs) [72] are graphical models which represent structured stochastic processes whose states change continuously over time. In this subsection, we discuss the CTBN model with respect to the four chosen criteria.

3.3.1. Knowledge representation and acquisition
Let A be a set of local variables A1, . . ., An. Each Ai has a finite domain of values Val(Ai). A continuous time Bayesian network over A consists of two components: the first is an initial distribution J1, specified as a Bayesian network over A; the second is a continuous transition model, specified as:

• A directed graph G whose nodes are A1, . . ., An, where Par(Ai) denotes the parents of Ai in G.

• A conditional intensity matrix (CIM), QAi|Par(Ai), for each variable Ai ∈ A, which represents the state changes of the variable through time. If Ai = xi then it stays in state xi for an amount of time that is exponentially distributed with parameter qxi, where qxi = Σj≠i qxixj. Intuitively, the intensity qxi gives the probability of leaving state xi and the intensity qxixj gives the probability of transitioning from xi to xj.

Fig. 5 displays the CTBN model for the diabetes example. The structure of the CTBN is the same as in atemporal BNs. A conditional intensity matrix, QAi|Par(Ai), represents the transient behavior over time of a particular variable, e.g. BG.
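The exponential dwell-time semantics of a CIM can be sketched in a few lines. The rates below are invented for illustration (they are not clinical values), and only a single parent configuration (CHO = high) is shown.

```python
import random

# A CIM as a nested list: rows index the current state, columns the next
# state. States for BG: 0 = v1 (normal), 1 = v2 (high). Rates are per hour
# and purely illustrative, conditioned on a hypothetical parent state
# CHO = high.
Q_BG_given_CHO_high = [
    [-0.5, 0.5],  # leave v1 at rate q_v1 = 0.5
    [0.2, -0.2],  # leave v2 at rate q_v2 = 0.2
]

def is_valid_cim(Q):
    """A valid CIM has non-negative off-diagonal intensities, and each
    diagonal entry equals minus the sum of its row's off-diagonals
    (q_xi = sum over j != i of q_xi_xj)."""
    for i, row in enumerate(Q):
        off = [row[j] for j in range(len(row)) if j != i]
        if any(q < 0 for q in off):
            return False
        if abs(row[i] + sum(off)) > 1e-9:
            return False
    return True

def sample_dwell_time(Q, state, rng):
    # Dwell time in `state` ~ Exponential(rate = -Q[state][state])
    return rng.expovariate(-Q[state][state])

assert is_valid_cim(Q_BG_given_CHO_high)
rng = random.Random(0)
t = sample_dwell_time(Q_BG_given_CHO_high, 0, rng)  # time spent in v1
```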

3.3.2. Time representation and granularity
Time is explicitly represented in a CTBN and it is able to represent processes evolving at different granularities. Nodes in the network represent variables that evolve in continuous time, and the values of each variable depend on the state of its parents in the graph. Each variable in the network represents a finite state

142 K. Orphanou et al. / Artificial Intelligence in Medicine 60 (2014) 133–149

Fig. 5. Graph structure of a CTBN model for the diabetes example. The QAi|Par(Ai) matrix describes the transient behavior of a particular variable Ai in the example, such as BG, whose domain is two values (v1, v2), conditioned on a set of its parent variables CHO, each of which can also evolve over time.

continuous time Markov process, whose dynamics are given by a matrix of transition intensities between every pair of states.

3.3.3. Inference and computational demands
Inference in CTBNs can be performed by exact and approximate algorithms. Full amalgamation [72] is an exact algorithm that involves the combination of two CIMs to produce a single, larger CIM. As the number of states is exponential in the number of variables, exact inference in CTBNs is intractable, and thus approximate inference techniques have been proposed.

The clique tree inference algorithm [72], marginalization [72] and the expectation propagation (EP) algorithm [98] are some of the approximate inference techniques. In the clique tree algorithm, messages passed between cliques are distributions over the entire time-ordered set of states, represented as a homogeneous Markov process. In EP, the messages are passed over all the variables in each cluster graph. Sampling algorithms, such as Gibbs sampling, were also introduced for approximate inference [99].

3.3.4. Applications in medicine
One application of CTBNs in medicine is in the domain of colon cancer. The model was applied to extract information about recurrence of cancer from a sample of colon cancer patient records [100]. Domain knowledge is acquired both from physicians and the medical literature. The dwell time (the amount of time a variable remains in a state si) is assumed to be exponentially distributed with parameter qsi. Data for each patient monitored through time (T = {t1, t2, . . ., tn}) are converted into time series of observations Oi for each ti. For each time instant of interest, the conditional probability distribution for each variable is computed given all the observations at that time.

Another application of the CTBN was as a joint diagnostic and prognostic model for diagnosing cardiogenic heart failure and predicting its likely evolution [6]. The topology of the CTBN is based on the current medical literature in the field of cardiology. CIM parameters were computed based on a medical expert's knowledge. Conditional probabilities for each variable were computed using noisy OR-gate [101] and multivariate Gaussian [102] functions. The model is based on a time granularity of 10 s. Inference methods are used to predict the occurrence of heart failure and its future complications (shock, myocardial infarction) based on the evidence provided at each instant of time.

3.4. Irregular time Bayesian networks

Irregular-time Bayesian networks (ITBNs) [10] generalize DBNs such that each time slice may span a time interval. The goal of an ITBN is to model, learn and reason about structured stochastic processes that produce qualitative and quantitative data irregularly along time. In this subsection, we discuss the ITBN model with respect to the four chosen criteria.


3.4.1. Knowledge representation and acquisition
Variable states are represented through a random vector (Tj : j ∈ N) indexed by the order of the time points of interest. Any function X from T, such that Xt is a random variable for each t ∈ T, is called an irregular-time stochastic process. A vector of time offsets, denoted by (δt ∈ R : 1 ≤ t ≤ m), represents the time differences expressing a delayed effect between nodes of the same time slice. Knots (unused time points) in the network are used for changing dynamics between consecutive slices.

ITBNs have the ability to compute probabilities given evidence from the far past in one step, which expresses long-distance effects. The ITBN is the first model which applies semiparametric methods [103], and in particular time-varying coefficient models [104], to longitudinal data analysis. A semiparametric conditional probability distribution for each variable is computed by using vectors to parameterize the varying coefficient model of the predictor (see [105] for more details on varying coefficient models).

Knowledge about relations between nodes and time offsets is acquired only from data. Knowing the time offsets is essential for learning relations between nodes. Learning the structure of an ITBN when fixed time offsets between nodes are known, or when the structure is fully observed, is the same as learning the structure of any Bayesian network. In other cases, it is very hard to learn time offsets from irregular, incomplete data.

3.4.2. Time representation and granularity
ITBNs use a new method to represent random time points in a vector Tj, where only the time points of interest need to be considered and treated as evidence. Thus, ITBNs do not require a constant time granularity for representing the changes of process states.

Fig. 6 represents the ITBN model for the diabetes example. Each patient eats five times a day, thus five time slices are used, each one representing the process of meal intake and the estimates of blood glucose levels based on the rates of carbohydrate remaining in the gut. Time offsets (δ) represent the time taken for the carbohydrate CHO to be absorbed by the gut.
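As a small illustration of irregular slices and within-slice offsets, the sketch below lays out five meal-indexed slices. The meal times and the two-hour absorption delay δ are invented for illustration, not taken from the diabetes dataset.

```python
# Irregularly spaced observation times (hours since midnight) for one
# hypothetical patient's five daily meals; values are illustrative.
meal_times = [7.5, 10.0, 13.25, 17.0, 21.5]

# One ITBN time slice per meal; the offset delta within a slice is an
# assumed carbohydrate absorption delay between the Meal node and the
# BG node of the same slice (a modeling assumption, not from the data).
absorption_delay = 2.0  # hours (delta)

slices = [
    {"slice": i, "meal_time": t, "bg_time": t + absorption_delay}
    for i, t in enumerate(meal_times)
]

# The gaps between consecutive slices need not be constant:
gaps = [b - a for a, b in zip(meal_times, meal_times[1:])]
```

The non-constant `gaps` are precisely what a DBN with a fixed slice duration cannot express directly.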

3.4.3. Inference and computational demands
Inference in an ITBN may generally involve either estimating unobserved nodes in a time slice, or finding the representative time point of a time slice. Irregular-time models are more compact than discrete-time models. A compact representation results in less memory and time consumption during the learning and inference tasks. However, the computational demands for inference depend on the number of knots needed for learning the structure of the model.

Learning the parameters of an ITBN does not require specifying a temporal granularity or a constant observation rate. Since an irregular-time model may contain fewer points than a discrete-time model, it maintains a longer hindsight for parameter learning. Due to the longer hindsight for learning, the probabilistic model complexity is lower, which leads to a better fit of the learned model. In comparison with DBNs, ITBNs need fewer time slices for representing data, which means fewer transitions, which in turn implies better performance of inference algorithms for estimating the unobserved states. In order to find at what point in time a new time slice is likely to take place, stochastic root finding algorithms are used [106].

3.4.4. Applications in medicine
The ITBN model was applied to a diabetes dataset [107], monitoring the glucose levels of patients [10]. Glucose levels were collected at six time points in a day. The number of hours since the last meal had started (Time) is also included in each patient record.

The ITBN learning scheme selects the appropriate knot locations per variable: at most three knots (unused time points) between every two consecutive slices. An R tool [108] was created for learning and inference purposes. Another ITBN application concerns the

Fig. 6. An ITBN model for the diabetes example represented in five time slices, where δ is the time taken for the carbohydrate absorption from the gut.

investigation of the pharmacokinetics of the drug cefamandole. The observations of the dataset were taken at irregular time points and then represented through an ITBN with at most three knots between two consecutive slices.

The ITBN model was also applied to counts of large ovarian follicles detected in different mares at several times in their estrus cycles [10]. Both the measurements and the ovulations may occur at irregularly spaced time points, and thus were represented in an ITBN with no knots between two consecutive slices.

3.5. Comparison of temporal extensions of BNs

In the previous sections, we have discussed several alternative temporal probabilistic models that were proposed in the literature and applied to clinical domains. The models discussed differ in time representation, knowledge acquisition, inference methods and the computational demands of the network.

In general, Bayesian networks consist of a graphical structure and an associated set of probabilities (parameters). Regarding the structure of the proposed temporal extensions of BNs, in the ITBN models, structure and temporal range change dynamically, while in all other models the structure is invariant through time and the temporal range is constant. In the ITBN, time offsets represent the delay between cause and effect, but in the NPEDT, CTBN and DBN, the time delay between cause and effect is not represented through the structure. All models integrate domain expert knowledge with data to create the structure; however, the DBN, CTBN and ITBN can also learn the structure from data alone using learning algorithms.

DBNs and ITBNs represent discrete time data, capturing the evolution of processes in a sequence of time slices. Their main difference is that ITBN time slices represent the state of processes which change at irregular time points. Moreover, each time slice in a DBN represents the state of a real-life property at time t. The NPEDT also uses a discrete-time representation, but in the NPEDT each value of a variable represents the time instant at which a certain event occurs. On the other hand, CTBNs are able to represent processes that evolve in continuous time. Temporal data representation requires an appropriate choice of a time granularity for inference purposes, such as weeks, months or years. The CTBN and the ITBN are able to represent data with different granularities, thus it is not essential to choose a single granularity for these models.

Comparing the computational demands of the proposed models, the NPEDT approach uses temporal canonical models to compute the parameters and therefore the learning process is linear in the number of parents of each variable. The ITBN requires fewer time slices and fewer transitions than the DBN for representing data; consequently, the ITBN has better inference computational demands than the DBN. In addition, the computational demands for inference in the ITBN model depend on the number of knots per variable. The CTBN inference process requires constant monitoring of the hidden process for computing the parameters, which increases the inference computational demands. Furthermore, exact inference in CTBNs is intractable since the number of states is exponential in the number of variables.


Regarding their application domains, the NPEDT approach is more appropriate for domains that involve temporal fault diagnosis and prediction, since it leads to less complex networks than those obtained from the formalism of DBNs by assuming that each event occurs only once. On the other hand, DBNs are more appropriate for monitoring tasks, e.g. therapy monitoring, since they clearly define the state of a system at each time point and they can represent reversible processes with one event. For the prediction task, where future states are estimated given the present state, ITBNs are more appropriate, especially when the time point of interest is far into the future. In contrast to regular probabilistic models, irregular-time models have the ability to compute probabilities given evidence from the far past in one step, which expresses long-distance effects. CTBNs are appropriate models for domains where data have no natural time slices, such as the diagnosis of cardiogenic heart failure. A comparison of temporal extensions of Bayesian networks based on the features described above is displayed in Tables 2 and 3.

In the following section, we explore the integration of temporal abstraction with each of the temporal extensions of Bayesian networks discussed in this section.

4. Discussion: thoughts about the integration

The aim is to develop a methodology for integrating temporal abstractions with temporal Bayesian networks, and to show/explain why the integration has important benefits for various medical tasks: diagnosis/classification, prediction/prognosis, monitoring, etc. The key benefit is that the relevant reasoning can take place at higher and multiple levels of abstraction, thus controlling the computational overheads associated with reasoning at a single detailed level, as well as allowing for more conceptual modeling of the given reasoning processes.

In the previous sections, we provided a comparative analysis of the systems that use TA techniques and of the various temporal extensions of BNs that were applied in clinical domains, based on certain criteria. Considering the integration of the two areas, temporal abstraction methodologies will be used to extract both basic abstractions (i.e. state abstractions, where a given property/parameter is considered steady at a given qualitative value over a maximal validity interval, or single trend abstractions, where a given property/parameter is either increasing or decreasing over a maximal validity interval) and complex abstractions (i.e. any combination of basic and complex abstractions, involving temporal and structural relations and periodicity). A system of temporal abstractions can deal with continuous or discrete time. Although real time is continuous, discrete time is closer to the spirit of temporal abstractions, which is to yield (abstract) out of low level, uncertain and incomplete data the essential information. When discrete time is used, there is a basic time-unit (granularity) and/or other higher granularities, giving rise to a time model with multiple granularities.
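As a concrete sketch of a basic state abstraction, the following merges consecutive glucose readings sharing the same qualitative value into maximal validity intervals. The readings and the 10 mmol/L threshold are invented for illustration.

```python
# Timestamped glucose readings (hour, mmol/L); illustrative values only.
readings = [(0, 5.2), (4, 5.8), (8, 11.4), (12, 12.1), (16, 6.0), (20, 6.3)]

def qualitative(value, high_threshold=10.0):
    # Map a raw value to a qualitative state (assumed threshold).
    return "high" if value > high_threshold else "normal"

def state_abstractions(readings):
    """Merge consecutive readings with the same qualitative value into
    maximal validity intervals (basic state abstractions)."""
    intervals = []
    for t, v in readings:
        state = qualitative(v)
        if intervals and intervals[-1][2] == state:
            intervals[-1][1] = t             # extend the current interval
        else:
            intervals.append([t, t, state])  # open a new interval
    return [(start, end, state) for start, end, state in intervals]

abstractions = state_abstractions(readings)
# -> [(0, 4, 'normal'), (8, 12, 'high'), (16, 20, 'normal')]
```

Trend abstractions can be derived analogously by comparing consecutive values instead of thresholding them.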

The derived concepts will then be used for TBN model development and deployment. For model development, temporal abstractions involve multiple cases, since the aim is to induce knowledge from the generated temporal abstractions of a

representative set of cases. For model deployment, for purposes of problem solving, temporal abstractions refer to single cases, i.e. the cases under consideration. Learning parameters and inference algorithms will be applied to the constructed model to provide clinicians with probabilistic diagnosis, prediction or therapy classification results.

In the following paragraphs, we discuss some ideas as to how high-level abstracted concepts could be represented through temporal extensions of BNs, using the diabetes example for illustration. In this example, temporal abstracted concepts may express state abstractions (e.g. high glucose levels on four consecutive measurements) or trend abstractions (e.g. increasing glucose levels for four consecutive measurements) or even complex abstractions (e.g. hyperglycemia present 'meets' glycosuria absent). Let us assume that the model is to be used to advise on the adjustment of meal and carbohydrate intake in order to minimize the risk of hyperglycaemia (but not hypoglycaemia), and that abstracted concepts in the example network represent only temporal state abstractions. Therefore, the abstracted variables BG, Meal and CHO can take two values, v1 and v2, each. For BG, v1 represents that glucose levels are normal and v2 represents that glucose levels are high; v1 for Meal represents a high intake of meal (e.g. >50 g) and v2 represents a normal intake of meal (e.g. ≤50 g); v1 for CHO represents that high rates of carbohydrate remained in the gut (e.g. >4 mmol/kg) and v2 represents that normal rates of carbohydrate remained in the gut (e.g. ≤4 mmol/kg).

Table 2
Comparison of temporal extensions of Bayesian networks.

Time representation:
DBN: discrete time (time slices); absolute time.
NPEDT: interval-based; absolute time.
ITBN: irregular time (using a time vector); absolute time.
CTBN: continuous time; absolute time.

Computational demands:
DBN: high memory consumption in learning parameters and inference methods.
NPEDT: the parameter learning process is linear in the number of parents of each variable.
ITBN: inference depends on the number of knots per variable; fewer time slices than a DBN, thus less time and memory consumption.
CTBN: exact inference is intractable since the number of states is exponential in the number of variables.

Applications in medicine:
DBN: head injury management; colorectal cancer management (PCCM); ICU monitoring; palate management; forecasting sleep apnea; diagnosis of patients suffering from renal failure and treated by hemodialysis; predicting the future state of patients with cancer.
NPEDT: modeling the evolution of a nasopharyngeal cancer; prediction and diagnosis of patients with nasopharyngeal cancer (NasoNet system).
ITBN: monitoring patients with diabetes.
CTBN: prediction and management of recurrence of colon cancer; diagnosis of cardiogenic heart failure and prediction of its likely evolution.

Knowledge acquisition:
DBN: data (learning algorithms) and/or human expert.
NPEDT: database (compound parameters), expert (structure and net parameters).
ITBN: learning from data.
CTBN: learning from data and expert.

Knowledge representation:
DBN: each time slice represents the state of an event at each time point.
NPEDT: nodes represent the presence or absence of an event at each time instant.
ITBN: nodes represent processes that change state at irregular time points.
CTBN: nodes represent the state of processes and the CIM the state changes through time.

Granularity:
DBN: constant. NPEDT: constant. ITBN: multiple. CTBN: multiple.

Table 3
Comparison of learning algorithms used in temporal extensions of Bayesian networks.

Learning structure:
DBN: EM + scoring function (e.g. Brier score).
NPEDT: using expert knowledge.
ITBN: learning offsets, scoring function.
CTBN: scoring function.

Learning parameters:
DBN: expectation maximization algorithm (EM), smoothing and filtering algorithms, maximum likelihood (ML).
NPEDT: temporal canonical models.
ITBN: not stated.
CTBN: maximum likelihood.

Inference:
DBN: forward–backward algorithm, stochastic sampling algorithms, Boyen–Koller algorithm.
NPEDT: belief propagation algorithms.
ITBN: belief propagation algorithms, stochastic root finding algorithms.
CTBN: full amalgamation, clique tree inference, marginalization, expectation propagation.

A DBN can represent the abstracted concepts via a set of variables in a sequence of time slices. Nodes represent each state of the abstracted variables and time slices represent the transitions between states. Arcs represent the local or transitional dependencies between abstracted concepts. Since temporal abstracted concepts evolve at different time granularities, a DBN will represent the whole system at the finest possible granularity. The finest granularity is the smallest time interval during which the variable state value remains the same, and it can be acquired from experts' knowledge or abstracted data. In the diabetes example, as displayed in Fig. 7, by representing abstracted concepts, the number of time slices is reduced, since each time slice represents the time interval


Fig. 7. Dynamic Bayesian network in the diabetes example domain with three time-slices (N = 3). Each time slice represents the smallest time period needed for glucose levels to change state from normal to abnormal and vice versa (e.g. an interval period of 8 h).

Fig. 9. Graph structure of a CTBN representing abstracted concepts of the diabetes example. The QAi|Par(Ai) matrix describes the transient behavior of a particular variable, e.g. BG, whose domain is two values (v1, v2) conditioned on a set of its parent variables CHO, each of which can also evolve over time.

where the state of the glucose levels remains the same. Assuming that the finest granularity of the given example is 8 h, only three time slices are necessary. Thus, the DBN needs fewer time slices to represent high-level abstracted concepts than low-level data, and this reduces the inference complexity substantially. However, if the finest granularity of the problem is equal to or finer than the granularity representing the delay between two consecutive time points (e.g. 4 h), the number of time slices of a DBN representing low-level data will be the same as the number of time slices of a DBN representing high-level data. The integration of DBNs with TA is ideally applied in monitoring the evolution of a patient's disease through time or predicting the patient's state at a future time period.

By integrating NPEDT with TA, the network will represent basic temporal abstracted concepts at each of their possible times of occurrence. Nodes represent the temporal abstracted variables and each value of a variable (e.g. normal[a]) represents the instant (e.g. a) at which a certain event may occur within a certain temporal range of interest (24 h). In case the abstracted concepts represent trend abstractions, each variable can take three or more state values (e.g. increasing, decreasing, steady) and three or more different variables will be used to represent them, thus the model becomes more complex. Time instants represent a fixed number of time intervals with fixed duration for all variables; thus, for representing abstracted variables, a finer granularity will be used for the duration of the time intervals. Fig. 8 displays the NPEDT representing temporal state abstracted concepts for the diabetes example, where the duration of the time intervals is 8 h. Representing high-level abstracted concepts rather than low-level data reduces the inference computational demands. The integration of NPEDT with TA is more appropriate for diagnosing and predicting the evolution of a disease in domains where the events are temporal abstracted concepts that occur only once (irreversible processes).

A CTBN is a well-suited model for representing asynchronous temporal abstracted concepts, since each variable stays in a particular state for a different interval duration (different time granularities). Nodes in the network represent the temporal

Fig. 8. NPEDT for the diabetes example representing abstracted concepts of each event. A general CPT P(BG|CHO) where the temporal range of both variables is three (i.e. they can change state at three different time instants).


abstracted variables and their values represent the state of each variable at each time interval. It is assumed that a variable stays in a particular state (e.g. high glucose levels) for an amount of time exponentially distributed with a particular parameter. A conditional intensity matrix (CIM) will model the local dependence of one abstract variable on a set of others (e.g. the state or trend of glucose levels depends on the state/trend value of the rate of carbohydrate remaining in the gut). Fig. 9 displays the CTBN representing state temporal abstraction concepts of the diabetes example, where the units of time are hours. The CTBN structure for representing temporal abstracted concepts is the same as for representing low-level data for this particular example. The integration of CTBN with TA can be applied for diagnosing a disease or for predicting the evolution of a disease.

The integration of ITBN with TA involves the representation of temporal abstractions evolving at irregular time points. The ITBN does not require a constant time granularity, which makes the representation of temporal abstracted concepts easier. Abstract variables evolve through time slices, but each time slice spans a time interval where only the time points of interest are included, and the time interval duration may be different for each time slice. A delay between cause and effect is represented within the same time slice, which is beneficial for monitoring the effect of a drug or the success of a treatment. The states of abstracted variables are represented through a random vector. Time differences between two consecutive time slices are not constant, thus long-distance effects between abstracted concepts can be expressed given evidence from the far past in one step. Let us assume that the interval-based abstracted concepts of the diabetes example require only three time slices for their representation, with at most two knots (unused time points) between two consecutive slices, as represented in Fig. 10. The time points of interest are the time intervals when variables change state from high to normal and vice versa. Inference algorithms can be used to find the time points of specific interest. This integration will be preferable for prognosis, where future states are estimated given the present, especially if the time

Fig. 10. An ITBN model for the diabetes example representing temporal state abstracted concepts.

Fig. 11. Representing asynchronous temporal abstraction concepts in a TBN.

interval of interest is far into the future. It can also be applied to estimate the current state of a disease online (filtering), by adding new evidence incrementally. However, if the change of states of temporal abstracted concepts is constant in time, the ITBN structure and the inference computational demands are the same as for the DBN.

DBN and NPEDT networks represent the temporal abstractions in the network based on the finest granularity of the whole system; however, temporal abstractions have varying temporal spans. It would therefore be restrictive to constrain the temporal spans of all abstractions to intervals of fixed duration (e.g. an integral multiple of the basic time-unit), thereby imposing synchronicity amongst the temporal abstractions. In fact, it could be argued that synchronicity is counter-intuitive to the notion of temporal abstractions. This is an important point to be considered in the integration of temporal abstractions with BNs. BNs operate largely in a synchronous fashion on fixed time-slices, where the reasoning at the current time-slice cannot be influenced by past happenings unless they occur in the immediately preceding time-slice. To lift this restriction, thus allowing for asynchronous temporal abstractions, it would be necessary to expand the formalism of Bayesian networks, so that past happenings of varying temporal and structural complexity (within what can be defined as a relevant past for a given domain and task)

an, if necessary, influence the reasoning at the current time-slice.ikewise, if the outcome of the reasoning has to do with predictinghe future, such predictions cannot be restricted, in a synchronousashion, to the immediately following time-slice, but could be pre-ictions about asynchronous future happenings/observations ofarying temporal and structural complexity within a perceived rel-vant future. The above considerations are illustrated in Fig. 11:

• Let ts be a fixed time-slice and tsc be the current time-slice.
• Let rp be the relevant past, expressed as an integral number of ts, preceding tsc.
• Let rf be the relevant future, also expressed as an integral number of ts, following tsc.
• Finally, let tw be the current time-window. For the diagnosis and monitoring tasks, tw is the concatenation of rp and tsc, and for the prediction task, tw is the concatenation of rp, tsc and rf. This is a moving time-window, since all its constituents continually move forward by one ts (the basic granularity is an integral fraction of ts, and in fact the two could coincide; additional granularities can be expressed with respect to the basic time-unit and/or as granules of conceptual periods within rp or rf).
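The moving time-window described above can be sketched as follows, assuming integer slice indices with ts as the unit step (the function name and task labels are our own, introduced for illustration):

```python
def time_window(tsc, rp, rf, task):
    """Return the slice indices in the current time-window tw.
    rp and rf are the lengths (in multiples of ts) of the relevant
    past and relevant future around the current slice tsc."""
    past = list(range(tsc - rp, tsc))
    if task in ("diagnosis", "monitoring"):
        return past + [tsc]                                  # tw = rp + tsc
    if task == "prediction":
        future = list(range(tsc + 1, tsc + rf + 1))
        return past + [tsc] + future                         # tw = rp + tsc + rf
    raise ValueError(task)

# The window moves forward by one ts at a time:
w0 = time_window(tsc=10, rp=3, rf=2, task="prediction")  # [7, 8, ..., 12]
w1 = time_window(tsc=11, rp=3, rf=2, task="prediction")  # [8, 9, ..., 13]
```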


In order to integrate the above temporal abstractions model with a Bayesian model, the span of time for the repeated instances of the Bayesian network would be the concatenation of rp with tsc, meaning that the relevant inferencing is influenced by the present and the most recent relevant past, assuming a higher (>1) Markov order. Likewise, any derived inferences for the future are constrained by the immediately relevant future as defined by rf. As such, the nodes of the Bayesian network cannot be just atomic variables but complex concepts (temporal abstractions) as well.

In order to get more reliable results, it is important to model the causal relationships between abstracted variables based on prior knowledge. Knowledge is usually acquired through a domain expert, the medical literature, or a combination of learning algorithms applied to BNs with expert knowledge. For more valid results, it is better to integrate expert knowledge with empirical data. Moreover, the acquisition or automated construction of the probabilistic transition links among different interval-based TA concepts can be greatly assisted by current methods such as the temporal association chart (TAC) [109,110]. TAC is a user-driven, knowledge-based visualization technique that supports the investigation of probabilistic links over various temporal granularities within multiple patient records, among both raw and abstract temporal concepts.

Many automated methods were introduced for the discovery and enumeration of frequent interval relation patterns [111–115], and they can be useful for deriving complex TAs by discovering the temporal relationships between the time intervals of basic abstractions. The KarmaLego algorithm, proposed by Moskovitch and Shahar [115], is currently the fastest among the algorithms attempting to discover frequent temporal patterns from intervals, and in particular symbolic intervals (i.e., TAs). The KarmaLego algorithm gets as input symbolic time intervals (TAs), which are then mined to find the most frequent symbolic time interval related patterns (TIRPs). The TIRPs are constructed by mining all the possible temporal relations between a pair of symbols (TAs); then, the most frequent TIRPs are expanded to generate an explicit pattern tree, in which each pair of symbols is related by one or more of Allen's temporal relations. Each TIRP is described by two measures: the vertical support, which refers to the number of entities in which the TIRP occurs at least once, and the mean horizontal support, which refers to the number of its occurrences in each entity. These measures can be useful in computing the weight of the probabilistic links between temporal patterns.
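Detecting the temporal relation between a pair of symbolic intervals, the basic step underlying TIRP construction, can be sketched as follows. Only a subset of Allen's relations is handled, without the inverse relations, and this is not KarmaLego's actual implementation.

```python
def allen_relation(a, b):
    """Classify the Allen relation of interval a = (start, end) with
    respect to b = (start, end). Covers the relations mentioned in the
    text (before, meets, overlaps, starts, finishes, during, equal);
    inverses are omitted for brevity."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if a1 == b1 and a2 == b2:
        return "equal"
    if a1 == b1 and a2 < b2:
        return "starts"
    if a2 == b2 and a1 > b1:
        return "finishes"
    if a1 > b1 and a2 < b2:
        return "during"
    if a1 < b1 < a2 < b2:
        return "overlaps"
    return "other"

# The running example 'hyperglycemia present meets glycosuria absent',
# with hours as illustrative interval endpoints:
assert allen_relation((8, 12), (12, 20)) == "meets"
```

Counting how often each detected relation pattern occurs across patients (vertical support) and within a patient (horizontal support) then yields the TIRP measures described above.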

From the above discussion, it is apparent that the integration of temporal extensions of BNs with TAs is a promising research direction. As mentioned, the key benefit of the proposed integration, irrespective of the medical domain and task, is that the inferencing would be more conceptual, as it would be geared at multiple levels of temporal abstraction, at the same time controlling the computational overheads associated with inferencing restricted to a single, detailed level.

The choice of a TBN model for incorporating this integration with TA would depend on the clinical domain and the clinical task of the problem at hand. The integration will be investigated further in our future work.

5. Conclusions

In this paper we have described and reviewed two broad areas of research applied to medical decision support systems, namely temporal abstraction (TA) and temporal extensions of Bayesian networks (TBNs). The purpose of this review was to explore the potential integration of these two areas as a future research direction. Although a Bayesian network is a popular data representation and inference approach in clinical domains, it has not yet been integrated with temporal abstraction techniques. We propose to achieve this by representing basic and complex temporal abstractions through the nodes of a TBN. A unified example of diabetes was used in our study to illustrate the possible integration of TA with each TBN model presented in this paper. Our observation is that all TBN models reviewed here can be integrated with TA techniques, and that the choice of a particular TBN depends on the clinical domain and the clinical task of the problem at hand. The features of TBNs that will assist in selecting the most appropriate model are: the knowledge representation and acquisition methods used, the time representation, and the computational demands of the network. Furthermore, for the integration process it is also important to select the most appropriate knowledge acquisition method driving the process of deriving both basic and complex abstractions, as well as the techniques for deriving both basic and complex temporal abstractions from the data.

In addition, the integration can handle incomplete evidence in predicting disease outcomes, which is a usual challenge in clinical systems. The interpretation of probabilistic results will be context-based and will provide a concrete understanding of how causal dependencies and temporal relationships (e.g. meets, starts, finishes, overlaps) between abstract concepts influence a particular disease outcome. Our study demonstrates that the integration of TA and TBNs and its application to clinical problems and tasks is a promising research direction.

References

[1] Adlassnig K-P, Combi C, Das AK, Keravnou ET, Pozzi G. Temporal representation and reasoning in medicine: research directions and challenges. Artificial Intelligence in Medicine 2006;38(2):101–13.

[2] Dagum P, Galper A, Horvitz E. Dynamic network models for forecasting. In: Dubois D, Wellman MP, editors. Proceedings of the 8th conference on uncertainty in artificial intelligence, UAI 92. San Francisco, CA: Morgan Kaufmann Publishers Inc.; 1992. p. 41–8. ISBN 1-55860-258-5.

[3] Galán SF, Díez FJ. Networks of probabilistic events in discrete time. International Journal of Approximate Reasoning 2002;30(3):181–202.

[4] Rose C, Smaili C, Charpillet F. A dynamic Bayesian network for handling uncertainty in a decision support system adapted to the monitoring of patients treated by hemodialysis. In: Proceedings of the 17th IEEE international conference on tools with artificial intelligence. 2005. p. 594–8.

[5] Peelen L, de Keizer N, Jonge E, Bosman R, Abu-Hanna A, Peek N. Using hierarchical dynamic Bayesian networks to investigate dynamics of organ failure in patients in the intensive care unit. Journal of Biomedical Informatics 2010;43(2):273–86.

[6] Gatti E, Luciani D, Stella F. A continuous time Bayesian network model for cardiogenic heart failure. Flexible Services and Manufacturing Journal 2012;24(4):496–515. ISSN 1936-6582.

[7] Leibovici L, Paul M, Nielsen A, Tacconelli E, Andreassen S. The TREAT project. International Journal of Antimicrobial Agents 2007;30(Suppl. 1):93–102. ISSN 0924-8579.

[8] Bellazzi R, Larizza C, Riva A. Temporal abstractions for interpreting diabetic patients monitoring data. Intelligent Data Analysis 1998;2(1–4):97–122.

[9] Shahar Y, Musen M. Knowledge-based temporal abstraction in clinical domains. Artificial Intelligence in Medicine 1996;8(3):267–98.

[10] Ramati M. Irregular time Markov models. Ben Gurion University; 2010 [Ph.D. thesis].

[11] Berzuini C, Bellazzi R, Quaglini S, Spiegelhalter D. Bayesian networks for patient monitoring. Artificial Intelligence in Medicine 1992;4(3):243–60.

[12] Bellazzi R, Magni P, Nicolao GD. Bayesian analysis of blood glucose time series from diabetes home monitoring. IEEE Transactions on Biomedical Engineering 2000;47(7):971–5. ISSN 0018-9294.

[13] Stacey M, McGregor C. Temporal abstraction in intelligent clinical data analysis: a survey. Artificial Intelligence in Medicine 2007;39(1):1–24.

[14] Allen J. Towards a general theory of action and time. Artificial Intelligence 1984;23(2):123–54.

[15] McDermott D. A temporal logic for reasoning about processes and plans. Cognitive Science 1982;6(2):101–55. ISSN 1551-6709.

[16] Kowalski R, Sergot M. A logic based calculus of events. New Generation Computing 1986;4(1):67–95. ISSN 0288-3635.

[17] Shoham Y. Temporal logics in AI: semantical and ontological considerations. Artificial Intelligence 1987;33:89–104.

[18] Bettini C, Wang XS, Jajodia S. Temporal granularity. In: Liu L, Özsu MT, editors. Encyclopedia of database systems. US: Springer; 2009. p. 2968–73. ISBN 978-0-387-35544-3, 978-0-387-39940-9.

[19] Augusto JC. Temporal reasoning for decision support in medicine. Artificial Intelligence in Medicine 2005;33(1):1–24.

[20] Shahar Y, Musen M, et al. RÉSUMÉ: a temporal-abstraction system for patient monitoring. Computers and Biomedical Research 1993;26:255–73.

[21] O'Connor MJ, Grosso WE, Tu SW, Musen MA. RASTA: a distributed temporal abstraction system to facilitate knowledge-driven monitoring of clinical databases. Studies in Health Technology and Informatics 2001;1:508–12.

[22] Ho T, Nguyen C, Kawasaki S, Le S, Takabayashi K. Exploiting temporal relations in mining hepatitis data. New Generation Computing 2007;25(3):247–62.

[23] Keravnou ET. A multidimensional and multigranular model of time for medical knowledge-based systems. Journal of Intelligent Information Systems 1999;13(1–2):73–120.

[24] Lavrac N, Kononenko I, Keravnou E, Kukar M, Zupan B. Intelligent data analysis for medical diagnosis using machine learning and temporal abstraction. AI Communications 1998;11(3,4):191–218. ISSN 0921-7126.

[25] Shahar Y. Dynamic temporal interpretation contexts for temporal abstraction. Annals of Mathematics and Artificial Intelligence 1998;22(1–2):159–92. ISSN 1012-2443.

[26] Fagan LM, Kunz JC, Feigenbaum EA, Osborn JJ. Extensions to the rule-based formalism for a monitoring task. Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project 1984:397–423.

[27] Kahn M, Fagan L, Sheiner L. Combining physiologic models and symbolic methods to interpret time-varying patient data. Methods of Information in Medicine 1991;30(3):167–78.

[28] de Zegher-Geets I. IDEFIX: intelligent summarization of a time-oriented medical database. Stanford University; 1987 [Master's thesis].

[29] Fries JF, McShane DJ. ARAMIS (The American Rheumatism Association Medical Information System): a prototypical national chronic-disease data bank. Western Journal of Medicine 1986;145(6):798–804.

[30] Shahar Y. A framework for knowledge-based temporal abstraction. Artificial Intelligence 1997;90(1–2):79–133.

[31] Kuilboer MM, Shahar Y, Wilson DM, Musen MA. Knowledge reuse: temporal-abstraction mechanisms for the assessment of children's growth. In: Proceedings of the annual symposium on computer application in medical care. 1993. p. 449–53.

[32] Ramati M, Shahar Y. Probabilistic abstraction of multiple longitudinal electronic medical records. In: Proceedings of the 10th conference on artificial intelligence in medicine, AIME 05. 2005. p. 43–7.

[33] Batal I, Sacchi L, Bellazzi R, Hauskrecht M. Multivariate time series classification with temporal abstractions. In: Lane HC, Guesgen HW, editors. Florida artificial intelligence research society conference. AAAI Press; 2009. p. 344–9.

[34] Verduijn M, Peek N, Voorbraak F, De Jonge E, De Mol B. Dichotomization of ICU length of stay based on model calibration. Artificial Intelligence in Medicine 2005:67–76.

[35] Musen MA, Gennari JH, Eriksson H, Tu SW, Puerta AR. PROTEGE-II: computer support for development of intelligent systems from libraries of components. Medinfo 1995;8(Pt 1):766–70.

[36] Hatsek A, Shahar Y, Taieb-Maimon M, Shalom E, Klimov D, Lunenfeld E. A scalable architecture for incremental specification and maintenance of procedural and declarative clinical decision-support knowledge. Open Medical Informatics Journal 2010;4:255–77. ISSN 1874-4311.

[37] Verduijn M, Sacchi L, Peek N, Bellazzi R, de Jonge E, de Mol BA. Temporal abstraction for feature extraction: a comparative case study in prediction from intensive care monitoring data. Artificial Intelligence in Medicine 2007:1–12.

[38] Sacchi L, Larizza C, Combi C, Bellazzi R. Data mining with temporal abstractions: learning rules from time series. Data Mining and Knowledge Discovery 2007;15(2):217–47.

[39] Shahar Y. Knowledge-based temporal interpolation. Journal of Experimental and Theoretical Artificial Intelligence 1996;11:123–44.

[40] Keogh EJ, Chu S, Hart D, Pazzani M. Segmenting time series: a survey and novel approach. In: Last M, Kandel A, Bunke H, editors. Data mining in time series databases, vol. 57 of Series in machine perception and artificial intelligence, chap. 1. 2004. p. 1–22.

[41] Salatian A, Hunter J. Deriving trends in historical and real-time continuously sampled medical data. Journal of Intelligent Information Systems 1999;13(1):47–71.

[42] Kahn M, Tu S, Fagan L. TQuery: a context-sensitive temporal query language. Computers and Biomedical Research 1991;24(5):401–19.

[43] Pearl J. Probabilistic reasoning in intelligent systems: networks of plausible inference. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc; 1988. ISBN 0-934613-73-7.

[44] Koller D, Friedman N. Probabilistic graphical models: principles and techniques. Cambridge, USA: The MIT Press; 2009. ISBN 0262013193, 9780262013192.

[45] Weber P, Medina-Oliva G, Simon C, Iung B. Overview on Bayesian networks applications for dependability risk analysis and maintenance areas. Engineering Applications of Artificial Intelligence 2012;25(4):671–82.

[46] Spiegelhalter DJ, Franklin RCG, Bull K. Assessment, criticism and improvement of imprecise subjective probabilities for a medical expert system. In: Proceedings of the 5th annual conference on uncertainty in artificial intelligence, UAI'89. 1990. p. 285–94.

[47] Abramson B, Brown J, Edwards W, Murphy A, Winkler RL. Hailfinder: a Bayesian system for forecasting severe weather. International Journal of Forecasting 1996;12(1):57–71.

[48] Verduijn M, Peek N, Rosseel P, de Jonge E, de Mol B. Prognostic Bayesian networks. I: Rationale, learning procedure, and clinical use. Journal of Biomedical Informatics 2007;40(6):609–18.

[49] Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, et al. A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 2003;302(5644):449–53.

[50] Sun S, Zhang C, Yu G. A Bayesian network approach to traffic flow forecasting. IEEE Transactions on Intelligent Transportation Systems 2006;7(1):124–32.

[51] Booker LB, Hota N. Probabilistic reasoning about ship images. In: Lemmer JF, Kanal LN, editors. UAI'86: proceedings of the 2nd annual conference on uncertainty in artificial intelligence. Elsevier; 1986. p. 371–80.

[52] Stassopoulou A, Caelli T. Building detection using Bayesian networks. International Journal of Pattern Recognition and Artificial Intelligence 2000;14(6):715–33.

[53] Charniak E, Goldman RP. A semantics for probabilistic quantifier-free first-order languages, with particular application to story understanding. In: Sridharan NS, editor. Proceedings of the 11th international joint conference on artificial intelligence (IJCAI). Morgan Kaufmann Publishers Inc; 1989. p. 1074–9.

[54] Stassopoulou A, Petrou M, Kittler J. Application of a Bayesian network in a GIS based decision making system. International Journal of Geographical Information Science 1998;14:23–45.

[55] Lee E, Park Y, Shin JG. Large engineering project risk management using a Bayesian belief network. Expert Systems with Applications 2009;36(3):5880–7.

[56] Fenton NE, Neil MD. Risk assessment and decision analysis with Bayesian networks, 1st ed. Queen Mary University of London, UK: CRC Press; November 2012. ISBN 1439809100.

[57] Zweig G, Russell S. Probabilistic modeling with Bayesian networks for automatic speech recognition. Australian Journal of Intelligent Information Processing 1999;5(4):253–60.

[58] Horvitz E, Breese J, Heckerman D, Hovel D, Rommelse K. The Lumiere project: Bayesian user modeling for inferring the goals and needs of software users. In: Cooper GF, Moral S, editors. Proceedings of the 14th conference on uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc; 1998. p. 256–65.

[59] Charitos T, van der Gaag LC, Visscher S, Schurink KAM, Lucas PJF. A dynamic Bayesian network for diagnosing ventilator-associated pneumonia in ICU patients. Expert Systems with Applications 2009;36(2):1249–58. ISSN 0957-4174.

[60] Arroyo-Figueroa G, Sucar L. Temporal Bayesian network of events for diagnosis and prediction in dynamic domains. Applied Intelligence 2005;23(2):77–86.

[61] Hauskrecht M, Fraser HSF. Planning treatment of ischemic heart disease with partially observable Markov decision processes. Artificial Intelligence in Medicine 2000;18(3):221–44.

[62] Galán S, Aguado F, Díez FJ, Mira J. NasoNet: joining Bayesian networks and time to model nasopharyngeal cancer spread. In: Quaglini S, Barahona P, Andreassen S, editors. Artificial intelligence in medicine, lecture notes in computer science. Berlin, Heidelberg: Springer; 2001. p. 207–16.

[63] Austin R, Onisko A, Druzdzel M. The Pittsburgh cervical cancer screening model: a risk assessment tool. Archives of Pathology and Laboratory Medicine 2010;134(5):744–50.

[64] Sandilya S, Rao RB. Continuous-time Bayesian modeling of clinical data. In: Berry MW, Skillicorn D, Kamath C, Dayal U, editors. Proceedings of the 4th SIAM international conference on data mining (SDM). 2004. p. 22–4.

[65] Dagum P, Galper A. Forecasting sleep apnea with dynamic network models. In: Heckerman D, Mamdani EH, editors. Proceedings of the 9th annual conference on uncertainty in artificial intelligence (UAI-93). San Francisco, CA: Morgan Kaufmann Publishers Inc; 1993. p. 64–71.

[66] Murphy KP. Dynamic Bayesian networks: representation, inference and learning. UC Berkeley: Department of Computer Science; 2002 [Ph.D. thesis].

[67] Provan G. Tradeoffs in constructing and evaluating temporal influence diagrams. In: Heckerman D, Mamdani EH, editors. Proceedings of the 9th international conference on uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc; 1993. p. 40–7.

[68] Shachter RD. Probabilistic inference and influence diagrams. Operations Research 1988;36(4):589–604.

[69] Peek N. Explicit temporal models for decision-theoretic planning of clinical management. Artificial Intelligence in Medicine 1999;15:135–54.

[70] Aliferis C, Cooper G. A new formalism for temporal modeling in medical decision-support systems. In: Proceedings of the annual symposium on computer application in medical care. 1995. p. 213.

[71] Santos E, Young JD. Probabilistic temporal networks: a unified framework for reasoning with time and uncertainty. International Journal of Approximate Reasoning 1999;20:263–91.

[72] Nodelman U, Shelton C, Koller D. Continuous time Bayesian networks. In: Darwiche A, Friedman N, editors. Proceedings of the 18th conference on uncertainty in artificial intelligence (UAI). 2002. p. 378–87.

[73] Berzuini C. Representing time in causal probabilistic networks. In: Henrion M, Shachter RD, Kanal LN, Lemmer JF, editors. Proceedings of the 5th annual conference on uncertainty in artificial intelligence. North-Holland Publishing Co; 1990. p. 15–28.

[74] Ramati M, Shahar Y. Irregular-time Bayesian networks. In: Grünwald P, Spirtes P, editors. Proceedings of the 26th annual conference on uncertainty in artificial intelligence (UAI-10). Corvallis, OR: AUAI Press; 2010. p. 484–91.

[75] Andreassen S, Benn JJ, Hovorka R, Olesen KG, Carson ER. A probabilistic approach to glucose prediction and insulin dose adjustment: description of metabolic model and pilot evaluation study. Computer Methods and Programs in Biomedicine 1994;41(3–4):153–65. ISSN 0169-2607, doi:10.1016/0169-2607(94)90052-3.

[76] Dean T, Kanazawa K. A model for reasoning about persistence and causation. Computational Intelligence 1989;5(3):142–50. ISSN 0824-7935.

[77] Charitos T. Reasoning with dynamic networks in practice. Netherlands: Utrecht University; 2007 [Ph.D. thesis].

[78] Neapolitan R. Learning Bayesian networks, artificial intelligence. Pearson Prentice Hall; 2004. ISBN 9780130125347.

[79] Robert C, Richardson S. Markov chain Monte Carlo methods. Discretization and MCMC Convergence Assessment 1998:1–25.

[80] Doucet A, Godsill S, Andrieu C. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing 2000;10(3):197–208.

[81] Friedman N, Murphy K, Russell S. Learning the structure of dynamic probabilistic networks. In: Cooper GF, Moral S, editors. Proceedings of the 14th conference on uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc; 1998. p. 139–47.

[82] Heckerman D, Chickering DM. Learning Bayesian networks: the combination of knowledge and statistical data. Machine Learning 1995;20(3):197–243. ISSN 0885-6125.

[83] Baum LE, Petrie T, Soules G, Weiss N. A maximization technique occurring in the statistical analysis of probabilistic functions of Markov chains. The Annals of Mathematical Statistics 1970;41(1):164–71. ISSN 0003-4851.

[84] Boyen X, Koller D. Tractable inference for complex stochastic processes. In: Cooper GF, Moral S, editors. Proceedings of the 14th conference on uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc; 1998. p. 33–42.

[85] Gordon N, Salmond D, Smith A. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. In: Radar and signal processing, IEE proceedings F, vol. 140, IET. 1993. p. 107–13.

[86] Zhang N, Poole D. A simple approach to Bayesian network computations. In: Proceedings of the 10th Canadian Conference on Artificial Intelligence. 1994. p. 171–8.

[87] Zhang NL, Poole D. Exploiting causal independence in Bayesian network inference. Journal of Artificial Intelligence Research 1996;5:301–28.

[88] Cao C, Leong T, Leong A, Seow F. Dynamic decision analysis in medicine: a data-driven approach. International Journal of Medical Informatics 1998;51(1):13–28.

[89] Xiang Y, Poh K-L. Time-critical dynamic decision making. In: Proceedings of the 15th annual conference on uncertainty in artificial intelligence (UAI-99). 1999. p. 688–95.

[90] van Gerven M, Lucas P, van der Weide T. A generic qualitative characterization of independence of causal influence. International Journal of Approximate Reasoning 2008;48(1):214–36. ISSN 0888-613X, special section: Perception Based Data Mining and Decision Support Systems.

[91] Visscher S, Lucas P, Schurink K, Bonten M. Using a Bayesian-network model for the analysis of clinical time-series data. In: Proceedings of the 10th conference on artificial intelligence in medicine, AIME 05. 2005. p. 48–52.

[92] van der Gaag L, Renooij S, Witteman C, Aleman B, Taal B. Probabilities for a probabilistic network: a case study in oesophageal cancer. Artificial Intelligence in Medicine 2002;25(2):123–48. ISSN 0933-3657.

[93] van der Gaag L, Renooij S, Witteman C, Aleman B, Taal B. How to elicit many probabilities. In: Laskey KB, Prade H, editors. Proceedings of the 15th conference on uncertainty in artificial intelligence. Morgan Kaufmann Publishers; 1999. p. 647–54.

[94] Scalzo F, Asgari S, Kim S, Bergsneider M, Hu X. Bayesian tracking of intracranial pressure signal morphology. Artificial Intelligence in Medicine 2012;54(2):115–23. ISSN 0933-3657.

[95] Heckerman D, Nathwani B. Toward normative expert systems. Part II: Probability-based representations for efficient knowledge acquisition and inference. Methods of Information in Medicine 1992;31(2):106–16.

[96] Díez FJ, Galán SF. Efficient computation for the noisy max. International Journal of Intelligent Systems 2003;18:165–77.

[97] Díez F. Parameter adjustment in Bayes networks. The generalized noisy OR-gate. In: Proceedings of the 9th annual conference on uncertainty in artificial intelligence (UAI-93). 1993. p. 99–105.

[98] Nodelman U, Koller D, Shelton C. Expectation propagation for continuous time Bayesian networks. In: Proceedings of the 21st conference on uncertainty in artificial intelligence. 2005. p. 431–40.

[99] El-Hay T, Friedman N, Kupferman R. Gibbs sampling in factorized continuous-time Markov processes. In: McAllester DA, Myllymäki P, editors. UAI 2008, proceedings of the 24th conference in uncertainty in artificial intelligence. 2008. p. 169–78.

[100] Sandilya S, Rao RB. Continuous-time Bayesian modeling of clinical data. In: Berry MW, Dayal U, Kamath C, Skillicorn DB, editors. Proceedings of the 4th SIAM international conference on data mining (SDM). 2004. p. 22–4.

[101] Henrion M. Propagating uncertainty in Bayesian networks by probabilistic logic sampling. In: Lemmer JF, Kanal LN, editors. Annual conference on uncertainty in artificial intelligence (UAI-86). Amsterdam, NL: Elsevier Science; 1986. p. 149–63.

[102] Gauvain J, Lee C. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Transactions on Speech and Audio Processing 1994;2(2):291–8.

[103] Fahrmeir L, Raach A. A Bayesian semiparametric latent variable model for mixed responses. Psychometrika 2007;72(3):327–46.

[104] Fan J, Zhang W. Statistical estimation in varying coefficient models. The Annals of Statistics 1999;27(5):1491–518.

[105] Hastie T, Tibshirani R. Varying coefficient models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 1993:757–96.

[106] Chen H, Schmeiser B. Stochastic root finding via retrospective approximation. IIE Transactions 2001;33(3):259–75. ISSN 0740-817X.

[107] Hand DJ, Crowder MJ. Practical longitudinal data analysis, CRC texts in statistical science, 1st ed. UK: Chapman & Hall, Imperial College, University of London; 1996. ISBN 0412599406.

[108] R Development Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2005.

[109] Klimov D, Shahar Y, Taieb-Maimon M. Intelligent interactive visual exploration of temporal associations among multiple time-oriented patient records. Methods of Information in Medicine 2009;48(3):254–62.

[110] Kaminski A, Klimov D, Shahar Y. Dynamic exploration and analysis of temporal interrelations between medical concepts for multiple time-oriented patient records. In: Proceedings of the 15th international workshop on intelligent data analysis in medicine and pharmacology (IDAMAP-2011). 2011.

[111] Papapetrou P, Kollios G, Sclaroff S, Gunopulos D. Mining frequent arrangements of temporal intervals. Knowledge and Information Systems 2009;21(2):133–71.

[112] Winarko E, Roddick JF. ARMADA – an algorithm for discovering richer relative temporal association rules from interval-based data. Data & Knowledge Engineering 2007;63(1).

[113] Moerchen F. Algorithms for time series knowledge mining. In: Proceedings of the 12th international conference on knowledge discovery and data mining. 2006. p. 668–73.

[114] Morchen F, Fradkin D. Robust mining of time intervals with semi-interval partial order patterns. In: SDM, SIAM. 2010. p. 315–26.

[115] Moskovitch R, Shahar Y. Medical temporal-knowledge discovery via temporal abstraction. In: AMIA 2009. 2009.

