
Inference dynamic sistems

Date post: 01-Jun-2018
Inference for Nonlinear Dynamical Systems
Author(s): E. L. Ionides, C. Bretó and A. A. King
Source: Proceedings of the National Academy of Sciences of the United States of America, Vol. 103, No. 49 (Dec. 5, 2006), pp. 18438-18443
Published by: National Academy of Sciences
Stable URL: http://www.jstor.org/stable/30051132
Accessed: 11/02/2015 11:56


Inference for nonlinear dynamical systems

E. L. Ionides†‡, C. Bretó†, and A. A. King§

†Department of Statistics, University of Michigan, 1085 South University Avenue, Ann Arbor, MI 48109-1107; and §Department of Ecology and Evolutionary Biology, University of Michigan, 830 North University Avenue, Ann Arbor, MI 48109-1048

    Edited by Lawrence D. Brown, University of Pennsylvania, Philadelphia, PA, and approved September 21, 2006 (received for review April 19, 2006)

Nonlinear stochastic dynamical systems are widely used to model systems across the sciences and engineering. Such models are natural to formulate and can be analyzed mathematically and numerically. However, difficulties associated with inference from time-series data about unknown parameters in these models have been a constraint on their application. We present a new method that makes maximum likelihood estimation feasible for partially-observed nonlinear stochastic dynamical systems (also known as state-space models) where this was not previously the case. The method is based on a sequence of filtering operations which are shown to converge to a maximum likelihood parameter estimate. We make use of recent advances in nonlinear filtering in the implementation of the algorithm. We apply the method to the study of cholera in Bangladesh. We construct confidence intervals, perform residual analysis, and apply other diagnostics. Our analysis, based upon a model capturing the intrinsic nonlinear dynamics of the system, reveals some effects overlooked by previous studies.

maximum likelihood | cholera | time series

State space models have applications in many areas, including signal processing (1), economics (2), cell biology (3), meteorology (4), ecology (5), neuroscience (6), and various others (7-9). Formally, a state space model is a partially observed Markov process. Real-world phenomena are often well modeled as Markov processes, constructed according to physical, chemical, or economic principles, about which one can make only noisy or incomplete observations.

It has been noted repeatedly (1, 10) that estimating parameters for state space models is simplest if the parameters are time-varying random variables that can be included in the state space. Estimation of parameters then becomes a matter of reconstructing unobserved random variables, and inference may proceed by using standard techniques for filtering and smoothing. This approach is of limited value if the true parameters are thought not to vary with time, or to vary as a function of measured covariates rather than as random variables. A major motivation for this work has been the observation that the particle filter (9-13) is a conceptually simple, flexible, and effective filtering technique for which the only major drawback was the lack of a readily applicable technique for likelihood maximization in the case of time-constant parameters. The contribution of this work is to show how time-varying parameter algorithms may be harnessed for use in inference in the fixed-parameter case. The key result, Theorem 1, shows that an appropriate limit of time-varying parameter models can be used to locate a maximum of the fixed-parameter likelihood. This result is then used as the basis for a procedure for finding maximum likelihood estimates for previously intractable models.

We use the method to further our understanding of the mechanisms of cholera transmission. Cholera is a disease endemic to India and Bangladesh that has recently become reestablished in Africa, south Asia, and South America (14). It is highly contagious, and the direct fecal-oral route of transmission is clearly important during epidemics. A slower transmission pathway, via an environmental reservoir of the pathogen, Vibrio cholerae, is also believed to be important, particularly in the initial phases of epidemics (15). The growth rate of V. cholerae depends strongly on water temperature and salinity, which can fluctuate markedly on both seasonal and interannual timescales (16, 17). Important climatic fluctuations, such as the El Niño Southern Oscillation (ENSO), affect temperature and salinity, and operate on a time scale comparable to that associated with loss of immunity (18, 19). Therefore, it is critical to disentangle the intrinsic dynamics associated with cholera transmission through the two main pathways and with loss of immunity, from the extrinsic forcing associated with climatic fluctuations (20).

We consider a model for cholera dynamics that is a continuous-time version of a discrete-time model considered by Koelle and Pascual (20), who in turn followed a discrete-time model for measles (21). Discrete-time models have some features that are accidents of the discretization; working in continuous time avoids this, and also allows inclusion of covariates measured at disparate time intervals. Maximum likelihood inference has various convenient asymptotic properties: it is efficient, standard errors are available based on the Hessian matrix, and likelihood can be compared between different models. The transformation-invariance of maximum likelihood estimates allows modeling at a natural scale. Non-likelihood approaches typically require a variance-stabilizing transformation of the data, which may confuse scientific interpretation of results. Some previous likelihood-based methods have been proposed (22-25). However, the fact that non-likelihood-based statistical criteria such as least square prediction error (26) or gradient matching (27) are commonly applied to ecological models of the sort considered here is evidence that likelihood-based methods continue to be difficult to apply. Recent advances in nonlinear analysis have brought to the fore the need for improved statistical methods for dealing with continuous-time models with measurement error and covariates (28).

Maximum Likelihood via Iterated Filtering

A state space model consists of an unobserved Markov process, $x_t$, called the state process, and an observation process $y_t$. Here, $x_t$ takes values in the state space $\mathbb{R}^{d_x}$, and $y_t$ in the observation space $\mathbb{R}^{d_y}$. The processes depend on an (unknown) vector of parameters, $\theta$, in $\mathbb{R}^{d_\theta}$. Observations take place at discrete times, $t = 1, \dots, T$; we write the vector of concatenated observations as $y_{1:T} = (y_1, \dots, y_T)$; $y_{1:0}$ is defined to be the empty vector. A model is completely specified by the conditional transition density $f(x_t \mid x_{t-1}, \theta)$, the conditional distribution of the observation process $f(y_t \mid y_{1:t-1}, x_{1:t}, \theta) = f(y_t \mid x_t, \theta)$, and the initial density $f(x_0 \mid \theta)$. Throughout, we adopt the convention that $f(\cdot \mid \cdot)$ is a generic density specified by its arguments, and we assume that all densities exist. The likelihood is given by the identity $f(y_{1:T} \mid \theta) = \prod_{t=1}^{T} f(y_t \mid y_{1:t-1}, \theta)$. The state process, $x_t$, may be defined in continuous or discrete time, but only its distribution at the discrete times $t = 1, \dots, T$ directly affects the likelihood. The challenge is to find the maximum of the likelihood as a function of $\theta$.

Author contributions: E.L.I., C.B., and A.A.K. performed research, analyzed data, and wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS direct submission.

Abbreviations: ENSO, El Niño Southern Oscillation; MIF, maximum likelihood via iterated filtering; MLE, maximum likelihood estimate; EM, expectation-maximization.

‡To whom correspondence should be addressed. E-mail: [email protected].

© 2006 by The National Academy of Sciences of the USA

www.pnas.org/cgi/doi/10.1073/pnas.0603181103


The basic idea of our method is to replace the original model with a closely related model, in which the time-constant parameter $\theta$ is replaced by a time-varying process $\theta_t$. The densities $f(x_t \mid x_{t-1}, \theta)$, $f(y_t \mid x_t, \theta)$, and $f(x_0 \mid \theta)$ of the time-constant model are replaced by $f(x_t \mid x_{t-1}, \theta_{t-1})$, $f(y_t \mid x_t, \theta_t)$, and $f(x_0 \mid \theta_0)$. The process $\theta_t$ is taken to be a random walk in $\mathbb{R}^{d_\theta}$. Our main algorithm (Procedure 1 below) and its justification (Theorem 1 below) depend only on the mean and variance of the random walk, which are defined to be

$$E[\theta_t \mid \theta_{t-1}] = \theta_{t-1}, \qquad \mathrm{Var}(\theta_t \mid \theta_{t-1}) = \sigma^2 \Sigma,$$
$$E[\theta_0] = \theta, \qquad \mathrm{Var}(\theta_0) = \sigma^2 c^2 \Sigma.$$ [1]

In practice, we use the normal distributions specified by Eq. 1. Here, $\sigma$ and $c$ are scalar quantities, and the new model in Eq. 1 is identical to the fixed-parameter model when $\sigma = 0$. The objective is to obtain an estimate of $\theta$ by taking the limit as $\sigma \to 0$. $\Sigma$ is typically a diagonal matrix giving the respective scales of each component of $\theta$; more generally, it can be taken to be an arbitrary positive-definite symmetric matrix. Procedure 1 below is standard to implement, as the computationally challenging step 2(i) requires using only well studied filtering techniques (1, 13) to calculate

$$\hat\theta_t = \hat\theta_t(\theta, \sigma) = E[\theta_t \mid y_{1:t}],$$
$$V_t = V_t(\theta, \sigma) = \mathrm{Var}(\theta_t \mid y_{1:t-1})$$ [2]

for $t = 1, \dots, T$. We call this procedure maximum likelihood via iterated filtering (MIF).

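Procedure 1. (MIF)

1. Select starting values $\hat\theta^{(1)}$, a discount factor $0 < \alpha < 1$, an initial variance multiplier $\Sigma$, and the number of iterations $N$.

2. For $n$ in $1, \dots, N$:
   (i) Set $\sigma_n = \alpha^{n-1}$. For $t = 1, \dots, T$, evaluate $\hat\theta_t^{(n)} = \hat\theta_t(\hat\theta^{(n)}, \sigma_n)$ and $V_{t,n} = V_t(\hat\theta^{(n)}, \sigma_n)$.
   (ii) Set $\hat\theta^{(n+1)} = \hat\theta^{(n)} + V_{1,n} \sum_{t=1}^{T} V_{t,n}^{-1} (\hat\theta_t^{(n)} - \hat\theta_{t-1}^{(n)})$, where $\hat\theta_0^{(n)} = \hat\theta^{(n)}$.

3. Take $\hat\theta^{(N+1)}$ to be a maximum likelihood estimate of the parameter $\theta$ for the fixed parameter model.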

The quantities $\hat\theta_t^{(n)}$ can be considered local estimates of $\theta$, in the sense that they depend most heavily on the observations around time $t$. The updated estimate is a weighted average of the values $\hat\theta_t^{(n)}$, as explained below and in Supporting Text, which is published as supporting information on the PNAS web site. A weighted average of local estimates is a heuristically reasonable estimate for the fixed "global" parameter $\theta$. In addition, taking a weighted average and iterating to find a fixed point obviates the need for a separate optimization algorithm. Theorem 1 asserts that (under suitable conditions) the weights in Procedure 1 result in a maximum likelihood estimate in the limit as $\sigma \to 0$. Taking a weighted average is not so desirable when the information about a parameter is concentrated in a few observations: this occurs for initial value parameters, and modifications to Procedure 1 are appropriate for these parameters (Supporting Text).
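The iteration in Procedure 1 and the weighted-average reading of step 2(ii) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the particle filter is swapped for an exact Kalman filter on a hypothetical scalar toy model, $y_t \mid \theta_t \sim N(\theta_t, s^2)$ with no separate state process, so that $\hat\theta_t$ and $V_t$ are available in closed form. All function names, the data `y`, and the values `Sigma=0.002` and `s2=1.0` are invented for the example; `alpha=0.95`, `c2=20` and 50 iterations echo values reported elsewhere in the paper.

```python
def local_estimates(theta, sigma, y, Sigma=0.002, c2=20.0, s2=1.0):
    """Exact filter for the toy model y_t | theta_t ~ N(theta_t, s2), with
    theta_t the scalar random walk of Eq. 1.

    Returns theta_hat[0..T], where theta_hat[0] = theta and
    theta_hat[t] = E[theta_t | y_{1:t}], and V[1..T], where
    V[t] = Var(theta_t | y_{1:t-1}); V[0] is an unused placeholder.
    """
    theta_hat, V = [theta], [None]
    mean, post_var = theta, sigma**2 * c2 * Sigma   # moments of theta_0
    for obs in y:
        pred_var = post_var + sigma**2 * Sigma      # V_t: one random-walk step
        gain = pred_var / (pred_var + s2)           # Kalman gain
        mean += gain * (obs - mean)                 # filtered mean theta_hat_t
        post_var = pred_var * (1.0 - gain)          # filtered variance
        theta_hat.append(mean)
        V.append(pred_var)
    return theta_hat, V


def mif_update(theta_hat, V):
    """Step 2(ii): theta + V_1 * sum_t (theta_hat_t - theta_hat_{t-1}) / V_t."""
    T = len(theta_hat) - 1
    total = sum((theta_hat[t] - theta_hat[t - 1]) / V[t] for t in range(1, T + 1))
    return theta_hat[0] + V[1] * total


def mif_update_weighted(theta_hat, V):
    """The same update regrouped as a weighted average of theta_hat_1..theta_hat_T."""
    T = len(theta_hat) - 1
    weights = [V[1] * (1.0 / V[t] - 1.0 / V[t + 1]) for t in range(1, T)]
    weights.append(V[1] / V[T])
    return sum(w * th for w, th in zip(weights, theta_hat[1:])), weights


def mif(y, theta_start, alpha=0.95, n_iter=50):
    """Procedure 1 for the toy model, using the exact filter in step 2(i)."""
    theta = theta_start
    for n in range(1, n_iter + 1):
        sigma_n = alpha ** (n - 1)                  # step 2(i): cooling schedule
        theta_hat, V = local_estimates(theta, sigma_n, y)
        theta = mif_update(theta_hat, V)            # step 2(ii)
    return theta


y = [2.1, 3.4, 2.8, 3.9, 2.5, 3.1, 3.6, 2.7, 3.3, 2.9]   # made-up observations
estimate = mif(y, theta_start=10.0)
```

In this toy case the exact MLE is the sample mean of `y`, so the iterates started at 10 should move toward it. `mif_update_weighted` should return the same value as `mif_update`, with weights that sum to one.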

Fig. 1. Diagrammatic representation of a model for cholera population dynamics. Each individual is in S (susceptible), I (infected), or one of the classes $R^j$ (recovered). Compartments B, C, and D allow for birth, cholera mortality, and death from other causes, respectively. The arrows show rates, interpreted as described in the text.

Procedure 1, with step 2(i) implemented using a sequential Monte Carlo method (see ref. 13 and Supporting Text), permits flexible modeling in a wide variety of situations. The methodology requires only that Monte Carlo samples can be drawn from $f(x_t \mid x_{t-1})$, even if only at considerable computational expense, and that $f(y_t \mid x_t, \theta)$ can be numerically evaluated. We demonstrate this below with an analysis of cholera data, using a mechanistic continuous-time model. Sequential Monte Carlo is also known as "particle filtering" because each Monte Carlo realization can be viewed as a particle's trajectory through the state space. Each particle filtering step prunes particles in a way analogous to Darwinian selection. Particle filtering for fixed parameters, like natural selection without mutation, is rather ineffective. This explains heuristically why Procedure 1 is necessary to permit inference for fixed parameters via particle filtering. However, Procedure 1 and the theory given below apply more generally, and could be implemented using any suitable filter.
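To make the selection-with-mutation picture concrete, here is a minimal sketch of how a bootstrap particle filter could produce the quantities $\hat\theta_t$ and $V_t$ of Eq. 2. It is not the authors' code. The state model (a random walk with drift $\theta$), the noise levels, the particle count, and every name in the block are assumptions chosen for illustration; only the random-walk perturbation of the parameter follows Eq. 1.

```python
import math
import random


def particle_local_estimates(theta, sigma, y, n_particles=500, Sigma=0.01,
                             c2=20.0, proc_sd=0.5, obs_sd=1.0, seed=1):
    """Bootstrap particle filter for a hypothetical toy model:
        x_t = x_{t-1} + theta_{t-1} + N(0, proc_sd^2),   y_t = x_t + N(0, obs_sd^2),
    with theta_t following the scalar random walk of Eq. 1.

    Returns (theta_hat, V, loglik): theta_hat[t] approximates E[theta_t | y_{1:t}],
    V[t] approximates Var(theta_t | y_{1:t-1}); index 0 holds theta and None.
    """
    rng = random.Random(seed)
    # Draw theta_0 from its prior; start every state particle at x_0 = 0.
    thetas = [rng.gauss(theta, sigma * math.sqrt(c2 * Sigma))
              for _ in range(n_particles)]
    xs = [0.0] * n_particles
    theta_hat, V, loglik = [theta], [None], 0.0
    norm = obs_sd * math.sqrt(2.0 * math.pi)
    for obs in y:
        # Propagate the state with theta_{t-1}, then "mutate" the parameter.
        xs = [x + th + rng.gauss(0.0, proc_sd) for x, th in zip(xs, thetas)]
        thetas = [th + rng.gauss(0.0, sigma * math.sqrt(Sigma)) for th in thetas]
        # Prediction variance V_t: spread of theta_t before y_t is used.
        mean_pred = sum(thetas) / n_particles
        V.append(sum((th - mean_pred) ** 2 for th in thetas) / (n_particles - 1))
        # "Selection": weight each particle by the measurement density f(y_t | x_t).
        weights = [math.exp(-0.5 * ((obs - x) / obs_sd) ** 2) / norm for x in xs]
        total = sum(weights)
        loglik += math.log(total / n_particles)     # estimate of log f(y_t | y_{1:t-1})
        theta_hat.append(sum(w * th for w, th in zip(weights, thetas)) / total)
        # Multinomial resampling in proportion to the weights.
        idx = rng.choices(range(n_particles), weights=weights, k=n_particles)
        xs = [xs[i] for i in idx]
        thetas = [thetas[i] for i in idx]
    return theta_hat, V, loglik


y_obs = [0.4, 1.2, 1.3, 2.2, 2.4, 3.1]              # made-up observations
theta_hat, V, loglik = particle_local_estimates(theta=0.5, sigma=1.0, y=y_obs)
```

With `sigma = 0` the parameter particles never move after initialization, which is the "selection without mutation" situation described above. The outputs have the same shape as those consumed by step 2(ii) of Procedure 1.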

Example: A Compartment Model for Cholera

In a standard epidemiological approach (29, 30), the population is divided into disease status classes. Here, we consider classes labeled susceptible (S), infected and infectious (I), and recovered ($R^1, \dots, R^k$). The $k$ recovery classes allow flexibility in the distribution of immune periods, a critical component of cholera modeling (20). Three additional classes B, C, and D allow for birth, cholera mortality, and death from other causes, respectively. $S_t$ denotes the number of individuals in S at time $t$, with similar notation for other classes. We write $N_t^{SI}$ for the integer-valued process (or its real-valued approximation) counting transitions from S to I, with corresponding definitions of $N_t^{IR^1}$, $N_t^{SD}$, etc. The model is shown diagrammatically in Fig. 1. To interpret the diagram in Fig. 1 as a set of coupled stochastic equations, we write

$$dS_t = dN_t^{BS} - dN_t^{SI} - dN_t^{SD} + dN_t^{R^k S}$$
$$dI_t = dN_t^{SI} - dN_t^{IR^1} - dN_t^{IC} - dN_t^{ID}$$
$$dR_t^1 = dN_t^{IR^1} - dN_t^{R^1 R^2} - dN_t^{R^1 D}$$
$$\vdots$$
$$dR_t^k = dN_t^{R^{k-1} R^k} - dN_t^{R^k S} - dN_t^{R^k D}.$$

The population size $P_t$ is presumed known, interpolated from census data. Transmission is stochastic, driven by Gaussian white noise

$$dN_t^{SI} = \lambda_t S_t \, dt + \varepsilon (I_t / P_t) S_t \, dW_t,$$
$$\lambda_t = \beta_t I_t / P_t + \omega.$$ [3]

In Eq. 3, we ignore stochastic effects at a demographic scale (infinitesimal variance proportional to $S_t$). We model the remaining transitions deterministically

$$dN_t^{IR^1} = \gamma I_t \, dt; \quad dN_t^{R^{j-1} R^j} = rk R_t^{j-1} \, dt; \quad dN_t^{R^k S} = rk R_t^k \, dt;$$
$$dN_t^{SD} = m S_t \, dt; \quad dN_t^{ID} = m I_t \, dt; \quad dN_t^{R^j D} = m R_t^j \, dt;$$
$$dN_t^{IC} = m_c I_t \, dt; \quad dN_t^{BS} = dP_t + m P_t \, dt.$$ [4]

Fig. 2. Data and simulation, plotted against time (1890-1940). (A) One realization of the model using the parameter values in Table 1. (B) Historic monthly cholera mortality data for Dhaka, Bangladesh. (C) Southern oscillation index (SOI), smoothed with local quadratic regression (33) using a bandwidth parameter (span) of 0.12.

Time is measured in months. Seasonality of transmission is modeled by $\log(\beta_t) = \sum_{k=0}^{5} b_k s_k(t)$, where $\{s_k(t)\}$ is a periodic cubic B-spline basis (31) defined so that $s_k(t)$ has a maximum at $t = 2k$ and normalized so that $\sum_{k=0}^{5} s_k(t) = 1$; $\varepsilon$ is an environmental stochasticity parameter (resulting in infinitesimal variance proportional to $S_t^2$); $\omega$ corresponds to a non-human reservoir of disease; $\beta_t I_t / P_t$ is human-to-human transmission; $1/\gamma$ gives mean time to recovery; $1/r$ and $1/(kr^2)$ are respectively the mean and variance of the immune period; $1/m$ is the life expectancy excluding cholera mortality, and $m_c$ is the mortality rate for infected individuals. The equation for $dN_t^{BS}$ in Eq. 4 is based on cholera mortality being a negligible proportion of total mortality. The stochastic system was solved numerically using the Euler-Maruyama method (32) with time increments of 1/20 month. The data on observed mortality were modeled as $y_t \sim \mathcal{N}[C_t - C_{t-1}, \tau^2 (C_t - C_{t-1})^2]$, where $C_t = N_t^{IC}$. In the terminology given above, the state process $x_t$ is a vector representing counts in each compartment.
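A rough sketch of how Eqs. 3 and 4 could be stepped forward with the Euler-Maruyama scheme is given below. It simplifies the model in ways the paper does not: the seasonal $\beta_t$ is replaced by a constant `beta`, the population `P` is held constant (so $dP_t = 0$), and the initial compartment sizes are invented. The values of `omega`, `eps`, `tau`, `gamma`, `m_c`, `m`, `r` and `k` follow the $\theta^*$ column of Table 1; the function name and the values of `beta`, `P`, `S0` and `I0` are assumptions.

```python
import math
import random


def simulate_cholera(months, beta=5.0, omega=1.76e-4, eps=0.80, tau=0.25,
                     gamma=1 / 0.75, m_c=0.046, m=1 / 600, r=1 / 120, k=3,
                     P=1_000_000.0, S0=400_000.0, I0=1_000.0, dt=1 / 20, seed=0):
    """Euler-Maruyama simulation of Eqs. 3 and 4 with constant beta and P.

    Returns (deaths, observed): monthly increments of C_t = N_t^{IC} and noisy
    observations y_t ~ N(dC, tau^2 dC^2).
    """
    rng = random.Random(seed)
    S, I = S0, I0
    R = [(P - S0 - I0) / k] * k          # split the remainder evenly over R^1..R^k
    steps = round(1 / dt)                # Euler steps per month
    deaths, observed = [], []
    for _ in range(months):
        dC = 0.0
        for _ in range(steps):
            dW = rng.gauss(0.0, math.sqrt(dt))              # Brownian increment
            lam = beta * I / P + omega                      # force of infection
            dN_SI = lam * S * dt + eps * (I / P) * S * dW   # Eq. 3
            dN_IR1 = gamma * I * dt                         # Eq. 4 from here on
            dN_Rout = [r * k * Rj * dt for Rj in R]         # R^j -> R^{j+1}; last -> S
            dN_IC = m_c * I * dt
            dN_BS = m * P * dt                              # dP_t = 0 for constant P
            R_in = [dN_IR1] + dN_Rout[:-1]
            # All increments use the values from the start of the step.
            S_new = S + dN_BS - dN_SI - m * S * dt + dN_Rout[-1]
            I_new = I + dN_SI - dN_IR1 - dN_IC - m * I * dt
            R = [Rj + R_in[j] - dN_Rout[j] - m * Rj * dt for j, Rj in enumerate(R)]
            S, I = S_new, I_new
            dC += dN_IC
        deaths.append(dC)
        observed.append(rng.gauss(dC, tau * dC))            # measurement model
    return deaths, observed


deaths, observed = simulate_cholera(months=24)
```

Setting `eps=0.0` removes the environmental noise and gives a deterministic trajectory, which is a convenient check that the compartment bookkeeping is wired up as intended.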

Results

Testing the Method Using Simulated Data. Here, we provide evidence that the MIF methodology successfully maximizes the likelihood. Likelihood maximization is a key tool not just for point estimation, via the maximum likelihood estimate (MLE), but also for profile likelihood calculation, parametric bootstrap confidence intervals, and likelihood ratio hypothesis tests (34).

Table 1. Parameters used for the simulation in Fig. 2A together with estimated parameters and their SEs where applicable

Parameter              $\theta^*$    $\hat\theta$    SE($\hat\theta$)
$b_0$                  -0.58         -0.50           0.13
$b_1$                  4.73          4.66            0.15
$b_2$                  -5.76         -5.58           0.42
$b_3$                  2.37          2.30            0.14
$b_4$                  1.69          1.77            0.08
$b_5$                  2.56          2.47            0.09
$\omega \times 10^4$   1.76          1.81            0.26
$\tau$                 0.25          0.26            0.01
$\varepsilon$          0.80          0.78            0.06
$1/\gamma$             0.75
$m_c$                  0.046
$1/m$                  600
$1/r$                  120
$k$                    3
$\ell$                 -3,690.4      -3,687.5

Log likelihoods, $\ell$, evaluated with a Monte Carlo standard deviation of 0.1, are also shown.

Fig. 3. Diagnostic plots. (A-C) Convergence plots for four MIFs, shown for three parameters ($b_0$, $b_1$, and $\varepsilon$) against MIF iteration (0-50). The dotted line shows $\theta^*$. The parabolic lines give the sliced likelihood through $\hat\theta$, with the axis scale (log likelihood) at the top right. (D-F) Corresponding close-ups of the sliced likelihood. The dashed vertical line is at $\hat\theta$.

We present MIF on a simulated data set (Fig. 2A), with parameter vector $\theta^*$ given in Table 1, based on data analysis and/or scientifically plausible values. Visually, the simulations are comparable to the data in Fig. 2B. Table 1 also contains the resulting estimated parameter vector $\hat\theta$ from averaging four MIFs, together with the maximized likelihood. A preliminary indicator that MIF has successfully maximized the likelihood is that $\ell(\hat\theta) > \ell(\theta^*)$. Further evidence that MIF is closely approximating the MLE comes from convergence plots and sliced likelihoods (described below), shown in Fig. 3. The SEs in Table 1 were calculated via the sliced likelihoods, as described below and elaborated in Supporting Text. Because inference on initial values is not of primary relevance here, we do not present standard errors for their estimates. Were they required, we would recommend profile likelihood methods for uncertainty estimates of initial values. There is no asymptotic justification of the quadratic approximation for initial value parameters, since the information in the data about such parameters is typically concentrated in a few early time points.
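The sliced-likelihood route to an SE can be illustrated on a toy problem. The sketch below regresses each conditional log likelihood on the slice offset $h$ to get a per-observation derivative, then combines the derivatives into an information estimate. Two things in it are assumptions rather than the paper's method. First, the model is a hypothetical i.i.d. normal with known variance, so the conditional log likelihoods are exact instead of Monte Carlo estimates. Second, the information is taken to be the sum of squared slopes (an outer-product estimate); the paper defers its exact formula to Supporting Text.

```python
import math


def cond_loglik(y_t, theta, s2=1.0):
    """log f(y_t | theta) for the toy model y_t ~ N(theta, s2)."""
    return -0.5 * math.log(2.0 * math.pi * s2) - (y_t - theta) ** 2 / (2.0 * s2)


def slice_slopes(y, theta_hat, offsets):
    """For each t, regress log f(y_t | theta_hat + h) on h and return the slope,
    an estimate of the derivative of the conditional log likelihood at theta_hat."""
    h_bar = sum(offsets) / len(offsets)
    sxx = sum((h - h_bar) ** 2 for h in offsets)
    slopes = []
    for y_t in y:
        vals = [cond_loglik(y_t, theta_hat + h) for h in offsets]
        v_bar = sum(vals) / len(vals)
        sxy = sum((h - h_bar) * (v - v_bar) for h, v in zip(offsets, vals))
        slopes.append(sxy / sxx)
    return slopes


def se_from_slopes(slopes):
    """SE from an outer-product estimate of the Fisher information (scalar case)."""
    info = sum(g * g for g in slopes)
    return 1.0 / math.sqrt(info)


y = [2.1, 3.4, 2.8, 3.9, 2.5, 3.1, 3.6, 2.7, 3.3, 2.9]   # made-up observations
theta_hat = sum(y) / len(y)                               # MLE for this toy model
offsets = [-0.2, -0.1, 0.0, 0.1, 0.2]                     # symmetric slice around theta_hat
slopes = slice_slopes(y, theta_hat, offsets)
se = se_from_slopes(slopes)
```

With a symmetric set of offsets the fitted slope of a quadratic equals its derivative at the center, so here each slope should match the analytic score $(y_t - \hat\theta)/s^2$. With Monte Carlo likelihood evaluations the regression would also average out simulation noise.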

Applying the Method to Cholera Mortality Data. We use the data in Fig. 2B and the model in Eqs. 3 and 4 to address two questions: the strength of the environmental reservoir effect, and the influence of ENSO on cholera dynamics. See refs. 19 and 20 for more extended analyses of these data. A full investigation of the likelihood function is challenging, due to multiple local maxima and poorly identified combinations of parameters. Here, these problems are reduced by treating two parameters ($m$ and $r$) as known. A value $k = 3$ was chosen based on preliminary analysis. The remaining 15 parameters (the first eleven parameters in Table 1 and the initial values $S_0$, $I_0$, $R_0^1$, $R_0^2$, $R_0^3$, constrained to sum to $P_0$) were estimated. There is scope for future work by relaxing these assumptions.

Fig. 4. Profile likelihood for the environmental reservoir parameter. The larger of two MIF replications was plotted at each value of $\omega$ (circles), maximizing over the other parameters. Local quadratic regression (33, 35) with a bandwidth parameter (span) of 0.5 was used to estimate the profile likelihood (solid line). The dotted lines construct an approximate 99% confidence interval (ref. 34 and Supporting Text) of $(75 \times 10^{-6}, 210 \times 10^{-6})$.

For cholera, the difference between human-to-human transmission and transmission via the environment is not clear-cut. In the model, the environmental reservoir contributes a component to the force of infection which is independent of the number of infected individuals. Previous data analysis for cholera using a mechanistic model (20) was unable to include an environmental reservoir because it would have disrupted the log-linearity required by the methodology. Fig. 4 shows the profile likelihood of $\omega$ and resulting confidence interval, calculated using MIF. This translates to between 29 and 83 infections per million inhabitants per month from the environmental reservoir, because the model implies a mean susceptible fraction of 38%. At least in the context of this model, there is clear evidence of an environmental reservoir effect (likelihood ratio test, P < 0.001). Although our assumption that environmental transmission has no seasonality is less than fully reasonable, this mode of transmission is only expected to play a major role when cholera incidence is low, typically during and after the summer monsoon season (see Fig. 5). Human-to-human transmission, by contrast, predominates during cholera epidemics.
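A profile-likelihood confidence interval of the kind shown in Fig. 4 is conventionally read off by finding where the profile log likelihood drops below its maximum by half a chi-square quantile. The sketch below assumes that standard construction with one degree of freedom (cutoff $\tfrac{1}{2}\chi^2_{1,0.99} \approx 3.3174$). The profile values are a made-up quadratic, not the paper's data, and the interval is read directly off a grid rather than off a smoothed curve as in Fig. 4.

```python
def profile_interval(grid, profile_loglik, cutoff=3.3174):
    """Return (lo, hi): the range of grid values whose profile log likelihood
    lies within `cutoff` of the maximum. The default cutoff is half the 0.99
    quantile of a chi-square distribution with one degree of freedom."""
    best = max(profile_loglik)
    inside = [g for g, ll in zip(grid, profile_loglik) if ll >= best - cutoff]
    return min(inside), max(inside)


# Hypothetical profile: quadratic in omega * 10^4, peaked at 1.4.
grid = [i / 10 for i in range(41)]
profile = [-3687.5 - (g - 1.4) ** 2 / (2 * 0.26 ** 2) for g in grid]
lo, hi = profile_interval(grid, profile)
```

For a 95% interval the corresponding cutoff would be about 1.92. A finer grid or interpolation would tighten the endpoints.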

Links between cholera incidence and ENSO have been identified (18, 19, 46). Such large-scale climatic phenomena may be the best hope for forecasting disease burden (36). We looked for a relationship between ENSO and the prediction residuals (defined below). Prediction residuals are robust to the exact form of the model: they depend only on the data and the predicted values, and all reasonable models should usually make similar predictions. The low-frequency component of the southern oscillation index (SOI), graphed in Fig. 2C, is a measure of ENSO available during the period 1891-1940 (19); low values of SOI correspond to El Niño events. Rodó et al. (19) showed that low SOI correlates with increased cholera cases during the period 1980-2001 but found only weak evidence of a link with cholera deaths during the 1893-1940 period. Simple correlation analysis of standardized residuals or mortality with SOI reveals no clear relationship. Breaking down by month, we find that SOI is strongly correlated with the standardized residuals for August and September (in each case, r = -0.36, P = 0.005), at which time cholera mortality historically began its seasonal increase following the monsoon (see Fig. 5). This result suggests a narrow window of opportunity within which ENSO can act. This is consistent with the mechanism conjectured by Rodó et al. (19) whereby the warmer surface temperatures associated with an El Niño event lead to increased human contact with the environmental reservoir and greater pathogen growth rates in the reservoir. Mortality itself did not correlate with SOI in August (r = -0.035, P = 0.41). Some weak evidence of negative correlation between SOI and mortality appeared in September (r = -0.22, P = 0.063). Earlier work (20), based on a discrete-time model and with no allowance for an environmental reservoir, failed to resolve this connection between ENSO and cholera mortality in the historical period: to find clear evidence of the external climatic forcing of the system, it is essential to use a model capable of capturing the intrinsic dynamics of disease transmission.

Fig. 5. Superimposed annual cycles of cholera mortality in Dhaka, 1891-1940, plotted by month.

Discussion

Procedure 1 depends on the viability of solving the filtering problem, i.e., calculating $\hat\theta_t$ and $V_t$ in Eq. 2. This is a strength of the methodology, in that the filtering problem has been extensively studied. Filtering does not require stationarity of the stochastic dynamical system, enabling covariates (such as $P_t$) to be included in a mechanistically plausible way. Missing observations and data collected at irregular time intervals also pose no obstacle for filtering methods. Filtering can be challenging, particularly in nonlinear systems with a high-dimensional state space ($d_x$ large). One example is data assimilation for atmospheric and oceanographic science, where observations (satellites, weather stations, etc.) are used to inform large spatio-temporal simulation models: approximate filtering methods developed for such situations (4) could be used to apply the methods of this paper.

The goal of maximum likelihood estimation for partially observed data is reminiscent of the expectation-maximization (EM) algorithm (37), and indeed Monte Carlo EM methods have been applied to nonlinear state space models (24). The Monte Carlo EM algorithm, and other standard Monte Carlo Markov Chain methods, cannot be used for inference on the environmental noise parameter $\varepsilon$ for the model given above, because these methods rely upon different sample paths of the unobserved process $x_t$ having densities with respect to a common measure (38). Diffusion processes, such as the solution to the system of stochastic differential equations above, are mutually singular for different values of the infinitesimal variance. Modeling using diffusion processes (as above) is by no means necessary for the application of Procedure 1, but continuous-time models for large discrete populations are well approximated by diffusion processes, so a method that can handle diffusion processes may be expected to be more reliable for large discrete populations.

Procedure 1 is well suited for maximizing numerically estimated likelihoods for complex models largely because it requires neither analytic derivatives, which may not be available, nor numerical derivatives, which may be unstable. The iterated filtering effectively produces estimates of the derivatives smoothed at each iteration over the scale at which the likelihood is currently being investigated. Although general stochastic optimization techniques do exist for maximizing functions measured with error (39), these methods are inefficient in terms of the number of function evaluations required (40). General stochastic optimization techniques have not to our knowledge been successfully applied to examples comparable to that presented here.


Each iteration of MIF requires similar computational effort to one evaluation of the likelihood function. The results in Fig. 3 demonstrate the ability of Procedure 1 to optimize a function of 13 variables using 50 function evaluations, with Monte Carlo measurement error and without knowledge of derivatives. This feat is only possible because Procedure 1 takes advantage of the state-space structure of the model; however, this structure is general enough to cover relevant dynamical models across a broad range of disciplines. The EM algorithm is similarly "only" an optimization trick, but in practice it has led to the consideration of models that would be otherwise intractable. The computational efficiency of Procedure 1 is essential for the model given above, where Monte Carlo function evaluations each take ~15 min on a desktop computer.

Implementation of Procedure 1 using particle filtering conveniently requires little more than being able to simulate paths from the unobserved dynamical system. The new methodology is therefore readily adaptable to modifications of the model, allowing relatively rapid cycles of model development, model fitting, diagnostic analysis, and model improvement.

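Theoretical Basis for MIF

Recall the notation above, and specifically the definitions in Eqs. 1 and 2.

Theorem 1. Assuming conditions (R1-R3) below,

$$\lim_{\sigma \to 0} \sum_{t=1}^{T} V_t^{-1} (\hat\theta_t - \hat\theta_{t-1}) = \nabla \log f(y_{1:T} \mid \theta, \sigma = 0),$$ [5]

where $\nabla g$ is defined by $[\nabla g]_i = \partial g / \partial \theta_i$ and $\hat\theta_0 = \theta$. Furthermore, for a sequence $\sigma_n \to 0$, define $\hat\theta^{(n)}$ recursively by

$$\hat\theta^{(n+1)} = \hat\theta^{(n)} + V_{1,n} \sum_{t=1}^{T} V_{t,n}^{-1} (\hat\theta_t^{(n)} - \hat\theta_{t-1}^{(n)}),$$ [6]

where $\hat\theta_t^{(n)} = \hat\theta_t(\hat\theta^{(n)}, \sigma_n)$ and $V_{t,n} = V_t(\hat\theta^{(n)}, \sigma_n)$. If there is a $\hat\theta$ with $|\hat\theta^{(n)} - \hat\theta| / \sigma_n^2 \to 0$ then $\nabla \log f(y_{1:T} \mid \theta = \hat\theta, \sigma = 0) = 0$.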

Theorem 1 asserts that (for sufficiently small $\sigma_n$), Procedure 1 iteratively updates the parameter estimate in the direction of increasing likelihood, with a fixed point at a local maximum of the likelihood. Step 2(ii) of Procedure 1 can be rewritten as

$$\hat\theta^{(n+1)} = V_{1,n} \Big\{ \sum_{t=1}^{T-1} \big( V_{t,n}^{-1} - V_{t+1,n}^{-1} \big) \hat\theta_t^{(n)} + V_{T,n}^{-1} \hat\theta_T^{(n)} \Big\}.$$

This makes $\hat\theta^{(n+1)}$ a weighted average, in the sense that $V_{1,n} \{ \sum_{t=1}^{T-1} (V_{t,n}^{-1} - V_{t+1,n}^{-1}) + V_{T,n}^{-1} \} = I_{d_\theta}$, where $I_{d_\theta}$ is the $d_\theta \times d_\theta$ identity matrix. The weights are necessarily positive for sufficiently small $\sigma_n$ (Supporting Text).

The exponentially decaying $\sigma_n$ in step 2(i) of Procedure 1 is justified by empirical demonstration, provided by convergence plots (Fig. 3). Slower decay,

    0-=

    n-with 0


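which leads to Eq. 5. To see the second part of the theorem, note that Eq. 6 and the requirement that $|\hat\theta^{(n)} - \hat\theta| / \sigma_n^2 \to 0$ imply that

$$\sum_{t=1}^{T} V_t^{-1}(\hat\theta^{(n)}, \sigma_n) \big( \hat\theta_t(\hat\theta^{(n)}, \sigma_n) - \hat\theta_{t-1}(\hat\theta^{(n)}, \sigma_n) \big) = o(1).$$

Continuity then gives

$$\lim_{n \to \infty} \sum_{t=1}^{T} V_t^{-1}(\hat\theta, \sigma_n) \big( \hat\theta_t(\hat\theta, \sigma_n) - \hat\theta_{t-1}(\hat\theta, \sigma_n) \big) = 0,$$

which, together with Eq. 5, yields the required result.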
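As a sanity check on Eq. 5, it may help to work through a case where everything is available in closed form. The following is an illustration added under simplifying assumptions, not part of the original proof: take a scalar parameter, no separate state process, and a hypothetical Gaussian measurement model $y_t \mid \theta_t \sim N(\theta_t, s^2)$ with known $s^2$. The Kalman filter then gives $\hat\theta_t$ and $V_t$ exactly.

```latex
% Kalman recursions for the random walk of Eq. 1 observed with noise variance s^2
% (K_t is the Kalman gain; \hat\theta_0 = \theta):
K_t = \frac{V_t}{V_t + s^2}, \qquad
\hat\theta_t - \hat\theta_{t-1} = K_t \,(y_t - \hat\theta_{t-1}), \qquad
V_1 = \sigma^2 (c^2 + 1)\Sigma, \qquad
V_{t+1} = (1 - K_t) V_t + \sigma^2 \Sigma .

% Left-hand side of Eq. 5:
\sum_{t=1}^{T} V_t^{-1} (\hat\theta_t - \hat\theta_{t-1})
  = \sum_{t=1}^{T} \frac{y_t - \hat\theta_{t-1}}{V_t + s^2} .

% As \sigma \to 0 we have V_t \le \sigma^2 (c^2 + t)\Sigma \to 0, hence K_t \to 0
% and \hat\theta_{t-1} \to \theta, so
\lim_{\sigma \to 0} \sum_{t=1}^{T} V_t^{-1} (\hat\theta_t - \hat\theta_{t-1})
  = \sum_{t=1}^{T} \frac{y_t - \theta}{s^2}
  = \frac{\partial}{\partial \theta} \log f(y_{1:T} \mid \theta, \sigma = 0) .
```

The limit vanishes exactly at the sample mean of the observations, which is the MLE for this toy model, in line with the fixed-point statement in the second part of the theorem.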

Heuristics, Diagnostics, and Confidence Intervals

Our main MIF diagnostic is to plot parameter estimates as a function of MIF iteration; we call this a convergence plot. Convergence is indicated when the estimates reach a single stable limit from various starting points. Convergence plots were also used for simulations with a known true parameter, to validate the methodology. The investigation of quantitative convergence measures might lead to more refined implementations of Procedure 1.

Heuristically, $\alpha$ can be thought of as a cooling parameter, analogous to that used in simulated annealing (39). If $\alpha$ is too small, the convergence will be "quenched" and fail to locate a maximum. If $\alpha$ is too large, the algorithm will fail to converge in a reasonable time interval. A value of $\alpha = 0.95$ was used above.

Supposing that $\theta_i$ has a plausible range $[\theta_i^{lo}, \theta_i^{hi}]$ based on prior knowledge, then each particle is capable of exploring this range in early iterations of MIF (unconditional on the data) provided $\sqrt{\Sigma_{ii} T}$ is on the same scale as $\theta_i^{hi} - \theta_i^{lo}$. We use $\Sigma_{ii}^{1/2} = (\theta_i^{hi} - \theta_i^{lo}) / (2\sqrt{T})$ with $\Sigma_{ij} = 0$ for $i \neq j$.

Although the asymptotic arguments do not depend on the particular value of the dimensionless constant $c$, looking at convergence plots led us to take $c^2 = 20$ above. Large values $c^2 \approx 40$ resulted in increased algorithmic instability, as occasional large decreases in the prediction variance $V_t$ resulted in large weights in Procedure 1 step 2(ii). Small values $c^2 \approx 10$ were diagnosed to result in appreciably slower convergence. We found it useful, in choosing $c$, to check that $[V_t]_{ii}$ plotted against $t$ was fairly stable. In principle, a different value of $c$ could be used for each dimension of $\theta$; for our example, a single choice of $c$ was found to be adequate.

    If the dimension of $\theta$ is even moderately large (say, $d_\theta \approx 10$), it can be challenging to investigate the likelihood surface, to check that a good local maximum has been found, and to get an idea of the standard deviations and covariance of the estimators. A useful diagnostic, the sliced likelihood (Fig. 3B), plots $\ell(\hat\theta + h\delta_i)$ against $\hat\theta_i + h$, where $\delta_i$ is a vector of zeros with a one in the $i$th position. If $\hat\theta$ is located at a local maximum of each sliced likelihood, then $\hat\theta$ is a local maximum of $\ell(\theta)$, supposing $\ell(\theta)$ is continuously differentiable. Computing sliced likelihoods requires moderate computational effort, linear in the dimension of $\theta$. A local quadratic fit is made to the sliced log likelihood (as suggested by ref. 35), because $\ell(\hat\theta + h\delta_i)$ is calculated with a Monte Carlo error. Calculating the sliced likelihood involves evaluating $\log f(y_t \mid y_{1:t-1}, \hat\theta + h\delta_i)$, which can then be regressed against $h$ to estimate $(\partial/\partial\theta_i) \log f(y_t \mid y_{1:t-1}, \hat\theta)$. These partial derivatives may then be used to estimate the Fisher information (ref. 34 and Supporting Text) and corresponding SEs. Profile likelihoods (34) can be calculated by using MIF, but at considerably more computational expense than sliced likelihoods. SEs and profile likelihood confidence intervals, based on asymptotic properties of MLEs, are particularly useful when alternate ways to find standard errors, such as bootstrap simulation from the fitted model, are prohibitively expensive to compute. Our experience, consistent with previous advice (45), is that SEs based on estimating Fisher information provide a computationally frugal method to get a reasonable idea of the scale of uncertainty, but profile likelihoods and associated likelihood based confidence intervals are more appropriate for drawing careful inferences.

    As in regression, residual analysis is a key diagnostic tool for state space models. The standardized prediction residuals are $\{u_t(\hat\theta)\}$, where $\hat\theta$ is the MLE and $u_t(\theta) = [\mathrm{Var}(y_t \mid y_{1:t-1}, \theta)]^{-1/2}\,(y_t - E[y_t \mid y_{1:t-1}, \theta])$. Other residuals may be defined for state space models (8), such as $E[\int_{t-1}^{t} dW_s \mid y_{1:T}, \hat\theta]$ for the model in Eqs. 3 and 4. Prediction residuals have the property that, if the model is correctly specified with true parameter vector $\theta^*$, $\{u_t(\theta^*)\}$ is an uncorrelated sequence. This has two useful consequences: it gives a direct diagnostic check of the model, i.e., $\{u_t(\hat\theta)\}$ should be approximately uncorrelated; it means that prediction residuals are an (approximately) prewhitened version of the observation process, which makes them particularly suitable for using correlation techniques to look for relationships with other variables (7), as demonstrated above. In addition, the prediction residuals are relatively easy to calculate using particle-filter techniques (Supporting Text).
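The two diagnostics described above, a local quadratic fit to a sliced log likelihood and standardized prediction residuals, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the log likelihood is a toy function with artificial Monte Carlo noise, the observations are scalar (so the inverse square root of the variance is an ordinary division), and all function names are hypothetical. In practice the log likelihood, predictive means, and predictive variances would come from a particle filter.

```python
import math
import random


def quadratic_fit(h, y):
    """Least-squares fit y ~ a + b*h + c*h**2; returns (a, b, c)."""
    # Normal equations (X^T X) beta = X^T y, with design rows (1, h_k, h_k**2).
    s = [sum(hk ** p for hk in h) for p in range(5)]
    r = [sum(yk * hk ** p for hk, yk in zip(h, y)) for p in range(3)]
    A = [[s[i + j] for j in range(3)] + [r[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        piv = max(range(col, 3), key=lambda i: abs(A[i][col]))
        A[col], A[piv] = A[piv], A[col]
        for row in range(3):
            if row != col:
                f = A[row][col] / A[col][col]
                A[row] = [x - f * z for x, z in zip(A[row], A[col])]
    return tuple(A[i][3] / A[i][i] for i in range(3))


def sliced_loglik(loglik, theta_hat, i, offsets):
    """Evaluate a (possibly noisy) log likelihood along coordinate i of theta_hat."""
    values = []
    for h in offsets:
        theta = list(theta_hat)
        theta[i] += h  # theta_hat + h * delta_i
        values.append(loglik(theta))
    return values


def standardized_residuals(y, pred_mean, pred_var):
    """u_t = (y_t - predictive mean) / sqrt(predictive variance), scalar y_t."""
    return [(yt - m) / math.sqrt(v) for yt, m, v in zip(y, pred_mean, pred_var)]


def autocorrelation(u, lag):
    """Sample autocorrelation of the sequence u at a positive lag."""
    n = len(u)
    mean = sum(u) / n
    denom = sum((x - mean) ** 2 for x in u)
    num = sum((u[k] - mean) * (u[k + lag] - mean) for k in range(n - lag))
    return num / denom


# Toy log likelihood with artificial Monte Carlo noise (purely illustrative).
rng = random.Random(1)


def noisy_loglik(theta):
    return -(theta[0] - 1.0) ** 2 - 2.0 * (theta[1] + 0.5) ** 2 + rng.gauss(0.0, 0.01)


offsets = [0.05 * k for k in range(-10, 11)]
a, b, c = quadratic_fit(offsets, sliced_loglik(noisy_loglik, [1.0, -0.5], 0, offsets))
print("slope:", b, "curvature:", 2 * c)
```

A fitted slope near zero with negative curvature along every coordinate is what the sliced likelihood diagnostic looks for, and sample autocorrelations of the residuals near zero at small lags are consistent with a correctly specified model.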

    We thank the editor and two anonymous referees for constructive suggestions, Mercedes Pascual and Menno Bouma for helpful discussions, and Menno Bouma for the cholera data shown in Fig. 2B. This work was supported by National Science Foundation Grant 430120.

    1. Anderson BD, Moore JB (1979) Optimal Filtering (Prentice-Hall, Englewood Cliffs, NJ).
    2. Shephard N, Pitt MK (1997) Biometrika 84:653-667.
    3. Ionides EL, Fang KS, Isseroff RR, Oster GF (2004) J Math Biol 48:23-37.
    4. Houtekamer PL, Mitchell HL (2001) Mon Weather Rev 129:123-137.
    5. Thomas L, Buckland ST, Newman KB, Harwood J (2005) Aust NZ J Stat 47:19-34.
    6. Brown EN, Frank LM, Tang D, Quirk MC, Wilson MA (1998) J Neurosci 18:7411-7425.
    7. Shumway RH, Stoffer DS (2000) Time Series Analysis and Its Applications (Springer, New York).
    8. Durbin J, Koopman SJ (2001) Time Series Analysis by State Space Methods (Oxford Univ Press, Oxford).
    9. Doucet A, de Freitas N, Gordon NJ, eds (2001) Sequential Monte Carlo Methods in Practice (Springer, New York).
    10. Kitagawa G (1998) J Am Stat Assoc 93:1203-1215.
    11. Gordon N, Salmond DJ, Smith AFM (1993) IEE Proc F 140:107-113.
    12. Liu JS (2001) Monte Carlo Strategies in Scientific Computing (Springer, New York).
    13. Arulampalam MS, Maskell S, Gordon N, Clapp T (2002) IEEE Trans Sig Proc 50:174-188.
    14. Sack DA, Sack RB, Nair GB, Siddique AK (2004) Lancet 363:223-233.
    15. Zo YG, Rivera ING, Russek-Cohen E, Islam MS, Siddique AK, Yunus M, Sack RB, Huq A, Colwell RR (2002) Proc Natl Acad Sci USA 99:12409-12414.
    16. Huq A, West PA, Small EB, Huq MI, Colwell RR (1984) Appl Environ Microbiol 48:420-424.
    17. Pascual M, Bouma MJ, Dobson AP (2002) Microbes Infect 4:237-245.
    18. Pascual M, Rodó X, Ellner SP, Colwell R, Bouma MJ (2000) Science 289:1766-1769.
    19. Rodó X, Pascual M, Fuchs G, Faruque ASG (2002) Proc Natl Acad Sci USA 99:12901-12906.
    20. Koelle K, Pascual M (2004) Am Nat 163:901-913.
    21. Finkenstädt BF, Grenfell BT (2000) Appl Stat 49:187-205.
    22. Liu J, West M (2001) in Sequential Monte Carlo Methods in Practice, eds Doucet A, de Freitas N, Gordon NJ (Springer, New York), pp 197-224.
    23. Hürzeler M, Künsch HR (2001) in Sequential Monte Carlo Methods in Practice, eds Doucet A, de Freitas N, Gordon NJ (Springer, New York), pp 159-175.
    24. Cappé O, Moulines E, Rydén T (2005) Inference in Hidden Markov Models (Springer, New York).
    25. Clark JS, Bjørnstad ON (2004) Ecology 85:3140-3150.
    26. Turchin P (2003) Complex Population Dynamics: A Theoretical/Empirical Synthesis (Princeton Univ Press, Princeton).
    27. Ellner SP, Seifu Y, Smith RH (2002) Ecology 83:2256-2270.
    28. Bjørnstad ON, Grenfell BT (2001) Science 293:638-643.
    29. Kermack WO, McKendrick AG (1927) Proc R Soc London A 115:700-721.
    30. Bartlett MS (1960) Stochastic Population Models in Ecology and Epidemiology (Wiley, New York).
    31. Powell MJD (1981) Approximation Theory and Methods (Cambridge Univ Press, Cambridge, UK).
    32. Kloeden PE, Platen E (1999) Numerical Solution of Stochastic Differential Equations (Springer, New York), 3rd Ed.
    33. Cleveland WS, Grosse E, Shyu WM (1993) in Statistical Models in S, eds Chambers JM, Hastie TJ (Chapman & Hall, London), pp 309-376.
    34. Barndorff-Nielsen OE, Cox DR (1994) Inference and Asymptotics (Chapman & Hall, London).
    35. Ionides EL (2005) Stat Sin 15:1003-1014.
    36. Thomson MC, Doblas-Reyes FJ, Mason SJ, Hagedorn SJ, Phindela T, Moore AP, Palmer TN (2006) Nature 439:576-579.
    37. Dempster AP, Laird NM, Rubin DB (1977) J R Stat Soc B 39:1-22.
    38. Roberts GO, Stramer O (2001) Biometrika 88:603-621.
    39. Spall JC (2003) Introduction to Stochastic Search and Optimization (Wiley, Hoboken, NJ).
    40. Wu CFJ (1985) J Am Stat Assoc 80:974-984.
    41. Press W, Flannery B, Teukolsky S, Vetterling W (2002) Numerical Recipes in C++ (Cambridge Univ Press, Cambridge, UK), 2nd Ed.
    42. Cramér H (1946) Mathematical Methods of Statistics (Princeton Univ Press, Princeton).
    43. Jensen JL, Petersen NV (1999) Ann Stat 27:514-535.
    44. McLachlan G, Peel D (2000) Finite Mixture Models (Wiley, New York).
    45. McCullagh P, Nelder JA (1989) Generalized Linear Models (Chapman & Hall, London), 2nd Ed.
    46. Bouma MJ, Pascual M (2001) Hydrobiologia 460:147-156.


