APPROACHES TO EVALUATE WATER QUALITY MODEL PARAMETER UNCERTAINTY FOR ADAPTIVE TMDL IMPLEMENTATION¹

Craig A. Stow, Kenneth H. Reckhow, Song S. Qian, Estel Conrad Lamon III, George B. Arhonditsis, Mark E. Borsuk, and Dongil Seo²

ABSTRACT: The National Research Council recommended Adaptive Total Maximum Daily Load implementation with the recognition that the predictive uncertainty of water quality models can be high. Quantifying predictive uncertainty provides important information for model selection and decision-making. We review five methods that have been used with water quality models to evaluate model parameter and predictive uncertainty. These methods (1) Regionalized Sensitivity Analysis, (2) Generalized Likelihood Uncertainty Estimation, (3) Bayesian Monte Carlo, (4) Importance Sampling, and (5) Markov Chain Monte Carlo (MCMC) are based on similar concepts; their development over time was facilitated by the increasing availability of fast, cheap computers. Using a Streeter-Phelps model as an example we show that, applied consistently, these methods give compatible results. Thus, all of these methods can, in principle, provide useful sets of parameter values that can be used to evaluate model predictive uncertainty, though, in practice, some are quickly limited by the “curse of dimensionality” or may have difficulty evaluating irregularly shaped parameter spaces. Adaptive implementation invites model updating, as new data become available reflecting water-body responses to pollutant load reductions, and a Bayesian approach using MCMC is particularly handy for that task.

(KEY TERMS: total maximum daily load; water quality model; ecological forecasting; uncertainty analysis; parameter estimation; adaptive management; Bayesian; Streeter-Phelps; equifinality; computational methods; optimization.)

Stow, Craig A., Kenneth H. Reckhow, Song S. Qian, Estel Conrad Lamon III, George B. Arhonditsis, Mark E. Borsuk, and Dongil Seo, 2007. Approaches to Evaluate Water Quality Model Parameter Uncertainty for Adaptive TMDL Implementation. Journal of the American Water Resources Association (JAWRA) 43(6):1499-1507. DOI: 10.1111/j.1752-1688.2007.00123.x

INTRODUCTION

Water quality models provide an essential framework for scientific assessment in support of water quality management and decisions such as total maximum daily load (TMDL) determinations (NRC 2001). Models allow decision makers to evaluate the logical outcomes of alternative management actions based on informed speculation about system behavior captured in a set of equations.

¹Paper No. J06104 of the Journal of the American Water Resources Association (JAWRA). Received August 1, 2006; accepted March 30, 2007. © 2007 American Water Resources Association. Discussions are open until June 1, 2008.

²Respectively (Stow) Senior Scientist, NOAA Great Lakes Environmental Research Laboratory, 2205 Commonwealth Blvd, Ann Arbor, Michigan 48105; (Reckhow, Qian, Lamon) Professor, Associate Research Professor, and Research Scientist, Nicholas School of the Environment and Earth Sciences, Duke University, Durham, North Carolina; (Arhonditsis) Assistant Professor, Department of Physical & Environmental Sciences, University of Toronto, Ontario, Canada; (Borsuk) Assistant Professor, Thayer School of Engineering, Dartmouth College, Hanover, New Hampshire; (Seo) Professor, Department of Environmental Engineering, Chungnam National University, 220 Gung-dong, Yuseong-gu, Daejeon, Korea 305-764 (E-Mail/Stow: [email protected]).

JOURNAL OF THE AMERICAN WATER RESOURCES ASSOCIATION 1499 JAWRA

Vol. 43, No. 6 AMERICAN WATER RESOURCES ASSOCIATION December 2007

Given a choice of models, a decision maker is likely to choose the model that predicts most accurately. If a model were available that was 100% accurate (i.e., the model predicts correctly 100% of the time), this model would be a clear choice over one that was, say, 80% accurate. With 100% accuracy, management actions could be chosen based only on the societal value of the consequences of those actions. Even models of relatively low predictive accuracy can be useful, if the predictive accuracy is appropriately quantified. A model with only 80% accuracy is still informative, but applying such a model requires hedging decisions by the relative probabilities of a range of possible outcomes and the societal value of those outcomes. Thus, model uncertainty quantification provides information useful in both model selection and application.

However, decision makers are often provided with models, or model results, and given no information regarding forecast uncertainty. How, then, can these models be appropriately used for decision purposes?

Model uncertainty is typically quantified by inclusion of an error-term on the model, and estimating the model’s structural and error-term parameter values. Often, however, modelers have little data to support rigorous parameter estimation or assess parameter uncertainty; thus, modelers employ “judicious diddling” (Hornberger and Spear, 1981) to select values of key model parameters, aided by the user’s manual or other established precedent. Among experienced water quality modelers, it is well-recognized that many “sets” of parameter values will fit the model about equally well; similar predictions can be obtained by simultaneously manipulating several parameter values in concert. This is plausible in part because all models are approximations of actual ecosystem processes, and because all parameters represent aggregate processes (spatially and temporally averaged at some implicit scale) and are unlikely to be represented by a fixed constant across scales. Additionally, many mathematical structures impart extreme correlation among model parameters, even when the model is overdetermined. This condition, called “equifinality,” is well-documented in the hydrologic sciences (Franks et al., 1997), but the concept has rarely been discussed in the water quality modeling research literature. We believe that the recognition of equifinality should change the perspective of water quality modelers from seeking a single “optimal” value for each model parameter, to seeking a distribution of parameter sets that all meet a pre-defined fitting criterion (Spear, 1997). These acceptable parameter sets may then provide the basis for estimating model prediction error associated with the model parameters.

Herein, we discuss several techniques that might be used for evaluating plausible parameter sets, and compare their utility. We then illustrate the approaches using a simple Streeter-Phelps dissolved oxygen model. Though the rationale for uncertainty analysis in water quality modeling has been recognized for many years (Reckhow and Chapra, 1983; Beck, 1987), in practice rigorous uncertainty analysis is rare. Pappenberger and Beven (2006) suggested that one of the reasons modelers often fail to do uncertainty analysis is that there are many “competing methods” making it difficult to choose a method and interpret the results. A primary goal of this paper is to show that, though these techniques have origins in distinct disciplines, they will provide similar inference if they are consistently applied. Accordingly, we encourage water quality modelers to consider a refocus from single optimal parameter selection to estimation of complete parameter sets, leading to the multi-parameter distribution. Using the multi-parameter distribution to make predictions then provides a quantified estimate of predictive uncertainty.

Regionalized (Generalized) Sensitivity Analysis

The development of methods for identifying plausible parameter sets for large multi-parameter environmental models with limited observational data began with the work of Hornberger and Spear (1981). Their method, called regionalized (or generalized) sensitivity analysis (RSA), is a Monte Carlo sampling approach to assess model parameter sensitivity. Hornberger and Spear advocated the application of this method as a means to prioritize future sampling and experimentation for model and parameter improvements.

Regionalized sensitivity analysis is simple in concept, and is a useful way to use limited information to bound model parameter distributions. Given a particular model and a system (e.g., water body) being modeled, the modeler first defines the plausible range of certain key model response variables (e.g., chlorophyll a, total nitrogen) as the “behavior.” Outside the range is “not the behavior.” The modeler then samples from (often uniform) distributions of each of the model parameters and computes the values for the key response variables. Each complete sampling of all model parameters, leading to prediction, results in a “parameter set.” All parameter sets that result in predictions of the key model response variables in the “behavior” range are termed “behavior generating” and thus become part of the model parameter distribution. The parameter sets that do not meet this behavior criterion are termed “nonbehavior generating.”
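The sampling and classification steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the toy model `toy_response`, the a priori parameter ranges, and the behavior range are all hypothetical choices made for demonstration.

```python
import numpy as np

def toy_response(theta1, theta2):
    """Hypothetical model: one key response variable from two parameters."""
    return theta1 * np.exp(-theta2)

def rsa_classify(n_samples, ranges, behavior_range, seed=0):
    """Sample uniform parameter sets and split them by the behavior criterion."""
    rng = np.random.default_rng(seed)
    lows = np.array([low for low, _ in ranges])
    highs = np.array([high for _, high in ranges])
    # Each row of `theta` is one complete "parameter set".
    theta = rng.uniform(lows, highs, size=(n_samples, len(ranges)))
    response = toy_response(theta[:, 0], theta[:, 1])
    lo, hi = behavior_range
    is_behavior = (response >= lo) & (response <= hi)
    return theta[is_behavior], theta[~is_behavior]

behavior, nonbehavior = rsa_classify(
    n_samples=5000,
    ranges=[(0.0, 10.0), (0.0, 2.0)],  # assumed a priori plausible ranges
    behavior_range=(1.0, 4.0),         # assumed plausible range of the response
)
print(len(behavior), "behavior generating sets;",
      len(nonbehavior), "nonbehavior generating sets")
```

The rows of `behavior` form an empirical sample from the behavior generating parameter distribution; no likelihood or error model is involved at this stage.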

Hornberger and Spear (1981) proposed that the cumulative distribution function (cdf) of each parameter distribution from these two classes of parameter sets (behavior generating and nonbehavior generating) be compared to evaluate model parameter sensitivity. For a particular parameter, if the behavior generating and nonbehavior generating distributions are substantially different, then prediction of the key response variables is sensitive to that parameter. Hence, resources devoted toward model improvement might be preferentially allocated toward improved estimation of that parameter.
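One way to put a number on “substantially different” is the maximum vertical gap between the two empirical cdfs (the two-sample Kolmogorov-Smirnov distance). The text here does not prescribe a particular statistic, so treat this choice, the helper name `ks_distance`, and the synthetic samples below as assumptions for illustration.

```python
import numpy as np

def ks_distance(sample_a, sample_b):
    """Maximum vertical gap between the empirical cdfs of two samples."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    grid = np.concatenate([a, b])
    # Empirical cdf of each sample evaluated at every pooled data point.
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(0)
# Hypothetical parameter to which the response is sensitive: behavior
# generating values crowd into part of the a priori range.
d_sensitive = ks_distance(rng.uniform(0.0, 0.4, 500), rng.uniform(0.0, 1.0, 500))
# Hypothetical insensitive parameter: both classes look alike.
d_insensitive = ks_distance(rng.uniform(0.0, 1.0, 500), rng.uniform(0.0, 1.0, 500))
print(f"sensitive: {d_sensitive:.2f}, insensitive: {d_insensitive:.2f}")
```

A larger distance for a parameter suggests that improved estimation of that parameter would do more to constrain predictions.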

In addition, we can consider the distribution of the behavior generating parameter sets as reflecting equifinality. Thus, the empirical distribution characterizes the error (variance and covariance) structure in the model parameters, conditional on the model and on the fitting criterion (the defined plausible range of key response variables).

Generalized Likelihood Uncertainty Estimation

The Generalized Likelihood Uncertainty Estimation (GLUE) approach is an extension of the original RSA; the binary system of acceptance/rejection of behavioral/nonbehavioral simulations is replaced by a “likelihood” measure that assigns different levels of confidence (weighting) to different parameter sets (Beven and Binley, 1992; Zak and Beven, 1999; Page et al., 2004). Unlike Bayesian Monte Carlo (BMC), Importance Sampling (IS), and Markov Chain Monte Carlo (MCMC), the term likelihood has a very broad meaning in the GLUE methodology and it is specified as any measure of goodness-of-fit that can be used to compare observed responses and model predictions (Zak et al., 1997). Herein, we will use “likelihood measure” to distinguish this concept from “likelihood function”, a term that is well-defined and universally applied in the statistical literature. A wide variety of likelihood measures can be found in the GLUE literature [e.g., likelihood measures based on the sum of squared errors (Beven and Binley, 1992; Sorooshian and Gupta, 1995; Freer et al., 1997), fuzzy measures (Franks et al., 1998; Page et al., 2004) or even qualitative measures for model evaluation (Beven, 2001)].

The GLUE procedure requires a large number of Monte Carlo model runs sampled from (usually) uniform distributions across plausible parameter ranges. Prior knowledge regarding the expected joint parameter distributions can be incorporated by assigning appropriate prior likelihood weights to each of the parameter sets (Schulz et al., 1999). The behavioral runs are selected on the basis of a subjectively chosen threshold of the likelihood measure and are rescaled so that their cumulative total is 1.0. The weighting assigned to the retained behavioral runs is propagated to the model output and forms a likelihood-weighted cumulative distribution of the predicted variable(s), which are then used for estimating the prediction uncertainty ranges (Beven and Binley, 1992).
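The GLUE steps above (sample, score, threshold, rescale, propagate) can be sketched as follows. Everything specific here is an assumption for illustration: the two-parameter `model`, the synthetic observations, the inverse error sum of squares as the likelihood measure, and the "top 10%" behavioral threshold.

```python
import numpy as np

def model(x, theta):
    """Hypothetical two-parameter model used only for illustration."""
    return theta[0] * np.exp(-theta[1] * x)

def weighted_quantile(values, weights, q):
    """Quantile of `values` under normalized `weights` (weights sum to 1)."""
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    idx = min(int(np.searchsorted(cum, q)), len(values) - 1)
    return values[order][idx]

rng = np.random.default_rng(1)
x_obs = np.linspace(0.0, 5.0, 12)
y_obs = model(x_obs, (4.0, 0.5)) + rng.normal(0.0, 0.2, x_obs.size)  # synthetic data

# 1. Monte Carlo sample from uniform distributions over assumed plausible ranges.
thetas = rng.uniform([0.0, 0.0], [10.0, 2.0], size=(20000, 2))

# 2. Likelihood measure: inverse error sum of squares (one of many possible choices).
ess = np.array([np.sum((model(x_obs, t) - y_obs) ** 2) for t in thetas])
measure = 1.0 / ess

# 3. Keep behavioral runs above a subjectively chosen threshold (top 10% here).
keep = measure >= np.quantile(measure, 0.90)
weights = measure[keep] / measure[keep].sum()  # rescaled so the total is 1.0

# 4. Propagate the weights to a prediction at a new location, x = 3.0.
pred = np.array([model(3.0, t) for t in thetas[keep]])
lower = weighted_quantile(pred, weights, 0.05)
upper = weighted_quantile(pred, weights, 0.95)
print(f"likelihood-weighted 5%-95% prediction range at x = 3.0: "
      f"[{lower:.2f}, {upper:.2f}]")
```

Changing the likelihood measure or the threshold changes the width of the resulting range, which is the sense in which the GLUE threshold is subjective.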

Bayesian Approaches – General

Bayesian approaches begin with the realization that model predictions will contain error; thus, a term representing this error is explicitly incorporated in the model. This prediction error is often written as an additive term (though other error structures are possible),

$$ Y = g[x, \theta] + \varepsilon, \qquad (1) $$

where Y is the response variable (such as dissolved oxygen or chlorophyll a), g is a general model form (such as a Streeter-Phelps dissolved oxygen model), x represents one or more state variables (such as temperature or nutrient concentration), θ represents one or more model parameters (such as rate coefficients), and ε is the model error. Often ε is assumed to be normally distributed with mean (denoted μ) = 0, and variance = σ². Under the assumption that ε is distributed normally, the likelihood function for this model is

$$ f(y \mid \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ \frac{(Y - g[x, \theta])^2}{-2\sigma^2} \right], \qquad (2) $$

where n is the number of observations. In this function, θ is regarded as an unknown quantity that can be predicted from the observed data.

Bayes theorem combines Equation (2) with any prior information a modeler has about the value of θ resulting in

$$ p(\theta \mid y) = \frac{p(\theta)\, f(y \mid \theta)}{\int_{\theta} p(\theta)\, f(y \mid \theta)\, d\theta}, \qquad (3) $$

where p(θ|y) is the posterior probability of θ (the probability of the parameter vector, θ, after observing the data, y), p(θ) is the prior probability of θ (the probability of θ before observing y), and f(y|θ) is the likelihood function. In water quality modeling p(θ) is often represented by a single fixed value based on the prior knowledge of the modeler, or chosen from the literature or a compendium of such values (Bowie et al., 1985). In contrast, noninformative priors are typically used if little prior knowledge about the parameter values is available, or if the modeler prefers that the parameter values be estimated using only information conveyed by the data. When noninformative priors are used, Bayesian approaches provide results consistent with maximum likelihood results or, if the model error term is additive and normally distributed, least-squares estimation. However, Bayesian approaches emphasize inference using the entire posterior parameter distribution, whereas maximum likelihood and least-squares methods emphasize the choice of a single optimal value for each parameter.
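The stated link between a flat prior, maximum likelihood, and least squares can be made explicit by taking the logarithm of Equation (2). As a short worked step, writing Y_i and x_i for the ith of the n observations (indices added here for clarity):

$$ \log f(y \mid \theta) = -\frac{n}{2}\log\left(2\pi\sigma^2\right) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(Y_i - g[x_i, \theta]\right)^2 $$

For a fixed σ², the first term does not depend on θ, so the θ that maximizes the likelihood (and, with a flat prior, the posterior) is the θ that minimizes the sum of squared residuals.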

Bayesian Monte Carlo

The BMC approach (Dilks et al., 1992) is similar to the Hornberger-Spear algorithm, but carries the additional assumptions of an additive, normally distributed error term, with mean = 0 and variance = σ² (Equation 1). Acceptable model behavior can be implicitly constrained a priori by setting the value of σ². Then, the modeler samples from uniform distributions chosen to represent plausible ranges of values for each parameter. However, rather than grouping parameter sets into two categories, “behavior generating” and “nonbehavior generating”, parameter sets are weighted using the likelihood function. Parameter sets that result in more likely model predictions (closer to the maximum of the likelihood function) are weighted more heavily than those resulting in unlikely predictions. The result is analogous to a multivariate probability density function for the model parameters.
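A minimal sketch of this likelihood weighting follows. It is not the implementation of Dilks et al.; the exponential toy model, the synthetic data, the uniform ranges, and the fixed error variance `sigma2` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
x_obs = np.linspace(0.0, 5.0, 12)
true_theta = np.array([4.0, 0.5])   # hypothetical "true" parameters
sigma2 = 0.04                       # error variance fixed a priori, as BMC assumes
y_obs = (true_theta[0] * np.exp(-true_theta[1] * x_obs)
         + rng.normal(0.0, np.sqrt(sigma2), x_obs.size))

# Sample parameter sets from independent uniform distributions.
thetas = rng.uniform([0.0, 0.0], [10.0, 2.0], size=(50000, 2))
pred = thetas[:, [0]] * np.exp(-thetas[:, [1]] * x_obs)   # shape (50000, 12)

# Normal log-likelihood of each parameter set; the constant term cancels
# once the weights are normalized.
log_like = -0.5 * np.sum((y_obs - pred) ** 2, axis=1) / sigma2
weights = np.exp(log_like - log_like.max())   # subtract the max for stability
weights /= weights.sum()

post_mean = weights @ thetas                  # likelihood-weighted parameter mean
effective_n = 1.0 / np.sum(weights ** 2)      # effective number of samples
print("weighted mean:", np.round(post_mean, 3))
print(f"effective sample size: {effective_n:.0f} of {len(thetas)}")
```

The effective sample size is typically a small fraction of the number of draws, because most uniform samples fall where the likelihood is negligible.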

Importance Sampling

The Hornberger-Spear algorithm, GLUE, and the BMC all run the risk of becoming limited by the “curse of dimensionality”; in high-dimensional models (models with many unknown parameters) the plausible parameter space can become an extremely small proportion of the space defined a priori by a set of independent uniform distributions. When this occurs sampling may be, at best, inefficient and, at worst, ineffective. Additionally, some combinations of parameter values may provide plausible model results, though these combinations may include values for the individual parameters that would not be deemed plausible when the parameters are considered one at a time. This latter situation is particularly problematic when the parameters are highly correlated (Figure 1). In this case, the joint parameter space defined a priori by uniform distributions (solid box) for each individual parameter may exclude important regions in the tails of the parameter space (ellipse). Enlarging the space by increasing the width of each of the uniform distributions may incorporate these regions (dashed box), but this approach exacerbates the curse of dimensionality. Using this tactic, the volume of the parameter space to be sampled is likely to increase more rapidly than the important parameter space, making it even less likely that the plausible region will be sampled effectively.

Thus, IS and its variations (Sampling/Importance Resampling, SIR) are premised on the idea that sampling effectiveness can be increased by choosing a sampling distribution (for which pseudorandom number generators exist) that more closely approximates the important region of the parameter space. In a Bayesian context, this means choosing a surrogate, such as a multivariate normal or t-density, that closely approximates the posterior parameter distribution. Often this can be done by first finding the maximum of the posterior distribution, and then using Fisher information (the negative expectation of the Hessian of the log of the posterior) to estimate the parameter covariance structure (Geweke, 1989).

Like BMC, the SIR algorithm often includes a normally distributed additive error term, but the parameters of this error term are included in the set of model parameters to be estimated. SIR is most useful when a good surrogate for the posterior distribution exists, when this surrogate is easy to sample, and when a limited number of samples is desired (Rubin, 1988). The SIR algorithm takes more samples than needed (say M) from the surrogate distribution, then resamples from this finite sample of size M, based on the ratio of the true posterior to the surrogate, to obtain m final draws (where m < M).

FIGURE 1. Illustration of How a Priori Independent Uniform Distributions Can Miss Important Regions of the Parameter Space. Ellipse depicts important parameter space of two positively correlated parameters, θ1 and θ2, while the solid box shows the area encompassed by a priori plausible ranges of 30-75 for θ1 and 20-80 for θ2. The ellipse encompasses only a small proportion of the box, indicating that random sampling within the box will be inefficient. Concurrently, random sampling within the box will miss the upper and lower tails of the ellipse. If the box is enlarged (dashed box) to capture the tails, then the efficiency of random sampling will be further reduced because the area of the box increases more rapidly than the additional area included in the tails. [Axes: θ1 and θ2, each running from 0 to 100.]
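The draw, weight, and resample steps of SIR can be sketched as below. The target `log_target` is a hypothetical stand-in for an unnormalized log posterior (a correlated bivariate normal), and the surrogate is an assumed wider, uncorrelated normal; neither comes from the paper.

```python
import numpy as np

TARGET_MEAN = np.array([0.25, 0.8])
TARGET_COV = np.array([[0.0020, 0.0045],
                       [0.0045, 0.0150]])   # strongly correlated parameters

def log_normal_kernel(theta, mean, cov):
    """Log of a multivariate normal density, up to an additive constant."""
    diff = theta - mean
    return -0.5 * np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

def log_target(theta):
    """Hypothetical unnormalized log posterior used only for illustration."""
    return log_normal_kernel(theta, TARGET_MEAN, TARGET_COV)

rng = np.random.default_rng(3)
M, m = 10000, 500                                # m final draws, with m < M

# Surrogate: independent normals, wider than the target in every direction.
sur_mean = np.array([0.25, 0.8])
sur_cov = 2.0 * np.diag(np.diag(TARGET_COV))
candidates = rng.multivariate_normal(sur_mean, sur_cov, size=M)

# Importance ratio of target to surrogate (constants cancel after normalizing).
log_w = log_target(candidates) - log_normal_kernel(candidates, sur_mean, sur_cov)
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Resample m of the M candidates with probability proportional to the ratio.
draws = candidates[rng.choice(M, size=m, replace=True, p=w)]
print("resampled mean:", np.round(draws.mean(axis=0), 3))
print("resampled correlation:", round(float(np.corrcoef(draws.T)[0, 1]), 2))
```

Although the surrogate ignores the correlation, the resampled draws recover it; a surrogate that already encoded the correlation would waste fewer candidates.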

Markov Chain Monte Carlo

An historical limitation in the application of Bayesian approaches was that, for many model forms, using the posterior parameter distribution required solving analytically intractable integrals. Importance sampling addresses this limitation by using a surrogate to provide a sample from the posterior distribution; MCMC estimation (prediction) uses cleverly written algorithms to draw samples directly from the posterior distribution (more accurately, these samples will converge in distribution to the posterior), allowing precise numerical approximation of any function of the posterior distribution (Gelfand and Smith, 1990; Smith and Roberts, 1993). There are several algorithms available; the Metropolis-Hastings algorithm (Chib and Greenberg, 1995) is general but less numerically efficient, while the Gibbs Sampler (Casella and George, 1992), a special case of Metropolis-Hastings, can take advantage of structural regularities present in some models to converge more efficiently. Selecting the most appropriate algorithm is dependent on the model form and the distributional structure chosen to represent the stochastic terms. Fortunately, there is freely available software for this task; WinBUGS incorporates MCMC algorithms into a straightforward programming environment (Gilks et al., 1994).
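To make the idea concrete, here is a minimal random-walk Metropolis sampler, the simplest member of the Metropolis-Hastings family. It is a sketch under stated assumptions rather than what WinBUGS does internally: the exponential toy model, the flat prior on an assumed box, the fixed error variance, the proposal step sizes, and the burn-in length are all hypothetical choices.

```python
import numpy as np

def log_posterior(theta, x, y, sigma2):
    """Log posterior up to a constant: flat prior on an assumed box, normal errors."""
    a, b = theta
    if not (0.0 < a < 10.0 and 0.0 < b < 2.0):
        return -np.inf                       # outside the prior support
    resid = y - a * np.exp(-b * x)
    return -0.5 * np.sum(resid ** 2) / sigma2

def metropolis(log_post, start, step, n_iter, rng):
    """Random-walk Metropolis with a symmetric normal proposal."""
    chain = np.empty((n_iter, len(start)))
    current = np.asarray(start, dtype=float)
    current_lp = log_post(current)
    accepted = 0
    for i in range(n_iter):
        proposal = current + rng.normal(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain[i] = current
    return chain, accepted / n_iter

rng = np.random.default_rng(4)
x_obs = np.linspace(0.0, 5.0, 12)
y_obs = 4.0 * np.exp(-0.5 * x_obs) + rng.normal(0.0, 0.2, x_obs.size)  # synthetic data

chain, rate = metropolis(
    lambda t: log_posterior(t, x_obs, y_obs, sigma2=0.04),
    start=[3.0, 0.4], step=np.array([0.15, 0.03]), n_iter=20000, rng=rng,
)
posterior_sample = chain[2000:]              # discard an assumed burn-in period
print("posterior mean:", np.round(posterior_sample.mean(axis=0), 3))
print(f"acceptance rate: {rate:.2f}")
```

Because each draw depends on the previous one, successive samples are correlated, and convergence should be checked (for example by running several chains from different starting points) before the sample is used for inference.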

Summary

These five approaches can be thought of as approximately evolutionary, facilitated by the availability of fast, inexpensive computers (Figure 2). The RSA approach is completely general, assumes no structure associated with model error and serves as a screening approach to identify plausible regions of model parameter values. RSA requires a priori determination of the behavior-generating region for the response variables. This determination is very important and can be based either on expert judgment or derived more empirically, as it is in the other four procedures. The BMC builds on the RSA approach by adding assumptions regarding model error structure and uses that added structure to delimit plausible parameter regions. GLUE is similar to BMC (and in some cases can be the same) but permits a broader range of functions that define the model error structure. GLUE can also be “updated”, much like a Bayesian procedure (Beven and Binley, 1992). IS recognizes the problems associated with the “curse of dimensionality” that can limit the effectiveness of sampling using RSA, BMC, and GLUE and employs a well-chosen surrogate distribution instead of sampling parameter values from independent uniform distributions. MCMC employs a full Bayesian framework and uses clever algorithms to choose a sample that approaches the posterior density function in distribution.

EXAMPLE USING THE STREETER-PHELPS DISSOLVED OXYGEN MODEL

To illustrate and compare these five approaches, we simulated a dataset using the following form of the Streeter-Phelps stream dissolved oxygen model (Streeter and Phelps, 1925)

$$ \mathrm{DO} = \mathrm{DO}_s - \frac{k_1 \mathrm{BOD}_u}{k_2 - k_1}\left(e^{-k_1 \frac{x}{v}} - e^{-k_2 \frac{x}{v}}\right) - D_i e^{-k_2 \frac{x}{v}}, \qquad (4) $$

where DO is the dissolved oxygen concentration (mg/l), DOs is the saturation oxygen concentration, k1 is the BOD decay coefficient (1/day), k2 is the reaeration coefficient (1/day), BODu is the ultimate BOD (mg/l), x is the downstream distance (km), v is stream velocity (km/day), and Di is the initial DO deficit (mg/l). Using simulated data allows comparison of estimated (predicted) parameters with the true parameter values that generated the data. For this example we set DOs = 8.0, k1 = 0.25, k2 = 0.8, BODu = 35, v = 10, and Di = 1.0. A random normal error with μ = 0 and σ² = 0.6 was added to each observation. Thirteen x values between 0 and 100 km were randomly generated from a uniform distribution resulting in a set (Figure 3) with observed DO ranging from 1.9 to 7.8 mg/l and a minimum at ~20 km. For straightforward depiction on bivariate plots, we treated k1 and k2 as the unknown model parameters, though various combinations of the other model inputs could also be predicted (estimated) from the data.

FIGURE 2. Conceptual Timeline Depicting the Availability of Fast, Cheap Computing and Parameter Evaluation Methods. [Schematic: RSA, BMC, GLUE, IS, and MCMC arranged along an axis of computing speed and availability running from slow, expensive to fast, cheap.]
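The model of Equation (4) and the simulation settings described above can be sketched in Python as follows. The function and variable names are hypothetical, and the random seed is arbitrary, so the simulated observations will not reproduce the dataset used in the paper.

```python
import numpy as np

def streeter_phelps(x, k1, k2, do_sat=8.0, bod_u=35.0, v=10.0, d_init=1.0):
    """Dissolved oxygen (mg/l) at downstream distance x (km), per Equation (4)."""
    t = np.asarray(x, dtype=float) / v           # travel time (days)
    sag = k1 * bod_u / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    return do_sat - sag - d_init * np.exp(-k2 * t)

rng = np.random.default_rng(42)                       # arbitrary seed
x_obs = np.sort(rng.uniform(0.0, 100.0, size=13))     # thirteen random locations (km)
do_true = streeter_phelps(x_obs, k1=0.25, k2=0.8)
do_obs = do_true + rng.normal(0.0, np.sqrt(0.6), size=13)   # error variance 0.6

# Locate the noise-free oxygen sag on a fine grid.
x_grid = np.linspace(0.0, 100.0, 1001)
do_grid = streeter_phelps(x_grid, k1=0.25, k2=0.8)
x_min = x_grid[np.argmin(do_grid)]
print(f"noise-free minimum DO of {do_grid.min():.2f} mg/l near x = {x_min:.1f} km")
```

Note that the expression is undefined when k1 equals k2 exactly; with continuously sampled parameter values this case essentially never occurs, but a production implementation would handle it explicitly.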

To illustrate the application of RSA, we defined the plausible DO range as ≥0 mg/l. No upper bound on DO was necessary because this form of the Streeter-Phelps model has no oxygen source term that would push DO above saturation, thus no combination of values for k1 and k2 will cause model predictions to exceed the 8 mg/l saturation value.

Choosing candidate ranges for k1 and k2 was somewhat trickier; Bowie et al. (1985) listed k1 values ranging from 0.004 to ~5 and k2 values from ~0.01 to ~100, while in our experience, values between 0 and 1.0 are most common for each. Selecting different parameter spaces can strongly affect the inference made; the parameter ranges suggested by Bowie et al. (1985) result in an acceptable parameter region of about 95% of the total parameter space (Figure 4a), whereas ranges from 0 to 1 for k1 and k2 result in an acceptable region of about 22% of the total space (Figure 4b). Considering the larger parameter space, we would conclude that the model was more sensitive to k2, as indicated by the large difference (relative to k1) between the cdfs for the behavior and nonbehavior generating sets (Figure 5, panels a and b). Conversely, when the parameter space for both parameters is constrained to range from 0 to 1, the behavior and nonbehavior generating cdfs are more similar to each other for k2 than for k1, indicating that the model will be sensitive to choices of k1 (Figure 5, panels c and d).
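The dependence of the behavior generating fraction on the a priori ranges can be explored with a short Monte Carlo sketch. Two things here are assumptions rather than details from the paper: the behavior criterion is checked on an evenly spaced 1-km grid from 0 to 100 km, and the function names are hypothetical. The printed fractions therefore depend on those choices and on the random draws, and need not match the published figure exactly.

```python
import numpy as np

def streeter_phelps(x, k1, k2, do_sat=8.0, bod_u=35.0, v=10.0, d_init=1.0):
    """Dissolved oxygen (mg/l) from the Streeter-Phelps form in Equation (4)."""
    t = np.asarray(x, dtype=float) / v
    sag = k1 * bod_u / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    return do_sat - sag - d_init * np.exp(-k2 * t)

def behavior_fraction(k1_range, k2_range, n_samples=20000, seed=0):
    """Fraction of uniform (k1, k2) draws keeping DO >= 0 along the reach."""
    rng = np.random.default_rng(seed)
    k1 = rng.uniform(*k1_range, size=(n_samples, 1))
    k2 = rng.uniform(*k2_range, size=(n_samples, 1))
    x = np.linspace(0.0, 100.0, 101)       # assumed evaluation grid (km)
    do = streeter_phelps(x, k1, k2)        # shape (n_samples, 101)
    return float(np.mean(do.min(axis=1) >= 0.0))

wide = behavior_fraction((0.004, 5.0), (0.01, 100.0))   # ranges after Bowie et al. (1985)
narrow = behavior_fraction((0.0, 1.0), (0.0, 1.0))      # ranges from 0 to 1
print(f"behavior generating fraction, wide ranges:   {wide:.2f}")
print(f"behavior generating fraction, narrow ranges: {narrow:.2f}")
```

Under these assumptions the wide ranges accept most draws because large reaeration coefficients dominate the sampled space, which illustrates why the apparent sensitivity to k1 and k2 shifts with the choice of a priori ranges.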

We chose to illustrate the GLUE procedure using an error sum of squares likelihood measure defined as

$$ \mathrm{ess} = \sum_{i=1}^{n} \left(P_i - O_i\right)^2, \qquad (5) $$

where ess is error sum of squares, n is the number of observations, Pi is the ith of n predicted values, and Oi is the ith of n observed values. Using the error sum of squares provides a result that is closely analogous to Bayesian estimation with a normal, additive model error, and a noninformative prior distribution. The result (Figure 6) is consistent with the RSA result (Figure 5), but provides more information about the location of the most likely parameter values. The plot contours depict parameter sets that are equally likely, given the chosen likelihood measure. While the most likely values are near the center of the contour ellipse, many other sets that are almost as likely are also identified.

Qian et al. (2003) indicated that a priori specifica-tion of a precise value for r2, using BMC, canstrongly influence the variance of the posteriorparameter distribution and thus prediction variance.However, if r2 is treated as a parameter to be esti-mated from the data, then the main differencebetween BMC and IS is that IS uses a well-chosensurrogate for the posterior distribution, to concen-trate sampling effort near the most probable para-meter values. To compare these two approaches, we

Dis

solv

ed O

xyge

n (m

g/L)

1

2

3

4

5

6

7

8

Distance Downstream (km)

0 10 20 30 40 50 60 70 80 90

FIGURE 3. Depiction of the Example Streeter-PhelpsModel (blue line) and Simulated Observations (red dots).

FIGURE 4. Behavior-Generating (green) and Nonbehavior-Generating Regions Using RSA Using the Streeter-Phelps Example. Top panel depicts a priori ranges for reaeration and decay coefficients that are very wide, and bottom panel depicts narrower ranges. [Axes: Reaeration Coefficient versus Decay Coefficient; top panel 0 to 100 versus 0 to 5, bottom panel 0 to 1.0 versus 0 to 1.0.]


chose 2000 samples from two uniform [0,1] distributions for the BMC, and two normal distributions, estimated from the example data using nonlinear least squares, for the IS distribution. The results (Figure 7) indicate the relative inefficiency of the BMC, with only about 4% of the BMC samples falling within the area IS sampled. This inefficiency is exacerbated when the parameters are highly correlated, particularly in higher dimensional models (Qian et al., 2003). Similarly, a poor choice for the IS surrogate can cause inefficient or nonrepresentative sampling. In this example, we used independent, normal distributions; though incorporating parameter correlation into the IS sampling distribution can increase

FIGURE 5. Cumulative Density Functions of the Behavior Generating (green) and Nonbehavior Generating (red) Parameter Values for the Two RSA Sets of Results. Panels a and b depict a priori ranges for reaeration and decay coefficients that are very wide, and the bottom panels (c and d) depict narrower ranges. [Panels: (a) Reaeration Coefficient, 0 to 100; (b) Decay Coefficient, 0 to 5; (c) Reaeration Coefficient, 0 to 1; (d) Decay Coefficient, 0 to 1. Vertical axis: Percentile, 0 to 100.]

FIGURE 6. GLUE Results for the Streeter-Phelps Example Using Error Sum of Squares Likelihood Measure. Each successive contour from the inner circle represents an interval of 1.5 times the previous contour. [Axes: Reaeration Coefficient, 0 to 1.0, versus Decay Coefficient, 0 to 1.0.]

FIGURE 7. IS Sample (black dots) and BMC Sample (green and red dots). BMC sample is depicted in two colors to illustrate correspondence with the RSA behavior-generating (green) and nonbehavior-generating (red) regions. [Axes: Reaeration Coefficient, 0 to 1.0, versus Decay Coefficient, 0 to 1.0.]


efficiency and accuracy. However, choosing a good surrogate can be difficult for high-dimensional models or highly nonlinear models, where the tails of the posterior distribution are often irregularly shaped.
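The efficiency contrast between BMC and IS can be sketched by comparing effective sample sizes, (sum of weights)^2 / (sum of squared weights), for the two weighting schemes. This is a minimal illustration under stated assumptions: sigma is held fixed rather than estimated, the surrogate is a pair of independent normals whose means and standard deviations are hypothetical stand-ins for a nonlinear least squares fit (no fit is performed here), and the model inputs are invented for the example.

```python
import math
import random

STATIONS = range(0, 91, 10)  # hypothetical sampling stations (km)
N = 2000                     # samples per method


def dissolved_oxygen(x_km, k1, k2, l0=10.0, d0=1.0, do_sat=8.0, u=10.0):
    """Streeter-Phelps DO (mg/L); l0, d0, do_sat, u are hypothetical."""
    t = x_km / u
    if abs(k2 - k1) < 1e-9:
        sag = k1 * l0 * t * math.exp(-k1 * t)
    else:
        sag = k1 * l0 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    return do_sat - (sag + d0 * math.exp(-k2 * t))


def log_likelihood(k1, k2, observed, sigma=0.2):
    """Normal, additive error with fixed sigma (constants dropped)."""
    ess = sum((dissolved_oxygen(x, k1, k2) - o) ** 2 for x, o in zip(STATIONS, observed))
    return -ess / (2.0 * sigma ** 2)


def normal_logpdf(v, mean, sd):
    return -0.5 * ((v - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def effective_sample_size(log_weights):
    """(sum w)^2 / sum w^2, computed stably from log weights."""
    finite = [lw for lw in log_weights if lw != -math.inf]
    top = max(finite)
    w = [math.exp(lw - top) for lw in finite]
    return sum(w) ** 2 / sum(v * v for v in w)


def bmc_log_weights(observed, n, rng):
    """Sample from the uniform [0,1] priors; weight by the likelihood."""
    return [log_likelihood(rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0), observed)
            for _ in range(n)]


def is_log_weights(observed, n, rng, mean=(0.3, 0.6), sd=(0.05, 0.05)):
    """Sample from the normal surrogate; weight by prior * likelihood / surrogate."""
    out = []
    for _ in range(n):
        k1, k2 = rng.gauss(mean[0], sd[0]), rng.gauss(mean[1], sd[1])
        if not (0.0 <= k1 <= 1.0 and 0.0 <= k2 <= 1.0):
            out.append(-math.inf)  # zero prior density outside [0,1]
            continue
        log_q = normal_logpdf(k1, mean[0], sd[0]) + normal_logpdf(k2, mean[1], sd[1])
        out.append(log_likelihood(k1, k2, observed) - log_q)
    return out


rng = random.Random(3)
observed = [dissolved_oxygen(x, 0.3, 0.6) + rng.gauss(0.0, 0.2) for x in STATIONS]
ess_bmc = effective_sample_size(bmc_log_weights(observed, N, rng))
ess_is = effective_sample_size(is_log_weights(observed, N, rng))
print("effective sample size, BMC:", round(ess_bmc, 1))
print("effective sample size, IS: ", round(ess_is, 1))
```

A surrogate that is too narrow or badly centered would be expected to reverse the advantage, since a few samples in the tails would then carry most of the weight; this is the nonrepresentative sampling risk noted above.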

Comparison of Figures 6 and 7, however, reveals that these methods provide consistent results, with the most likely values for k1 and k2 near the true values that were used to generate the dataset. An MCMC sample (Figure 8) using a noninformative prior distribution, generated using WinBUGS, is also similar to the IS sample (Figure 7) and the most likely region of the GLUE (Figure 6). The advantage of using MCMC is that a well-written algorithm quickly converges to provide a sample from the posterior parameter distribution and does not require independent information regarding a surrogate distribution to sample. This is particularly advantageous when extreme parameter correlation and nonlinear model structure make choosing a good surrogate distribution difficult.
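To make the MCMC idea concrete without relying on WinBUGS, the sketch below uses a hand-written random-walk Metropolis sampler. It is a swapped-in illustration, not the algorithm WinBUGS applies, and it rests on several assumptions: uniform [0,1] priors, a fixed error standard deviation, a hypothetical proposal step size, and invented model inputs and "true" values (0.3, 0.6). Note that no surrogate distribution is needed; only the ability to evaluate the unnormalized log posterior.

```python
import math
import random

STATIONS = range(0, 91, 10)  # hypothetical sampling stations (km)


def dissolved_oxygen(x_km, k1, k2, l0=10.0, d0=1.0, do_sat=8.0, u=10.0):
    """Streeter-Phelps DO (mg/L); l0, d0, do_sat, u are hypothetical."""
    t = x_km / u
    if abs(k2 - k1) < 1e-9:
        sag = k1 * l0 * t * math.exp(-k1 * t)
    else:
        sag = k1 * l0 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    return do_sat - (sag + d0 * math.exp(-k2 * t))


def simulate_observations(k1=0.3, k2=0.6, sigma=0.2, seed=0):
    rng = random.Random(seed)
    return [dissolved_oxygen(x, k1, k2) + rng.gauss(0.0, sigma) for x in STATIONS]


def error_sum_of_squares(k1, k2, observed):
    return sum((dissolved_oxygen(x, k1, k2) - o) ** 2 for x, o in zip(STATIONS, observed))


def log_posterior(k1, k2, observed, sigma=0.2):
    """Unnormalized log posterior: uniform [0,1] priors, normal error."""
    if not (0.0 <= k1 <= 1.0 and 0.0 <= k2 <= 1.0):
        return -math.inf
    return -error_sum_of_squares(k1, k2, observed) / (2.0 * sigma ** 2)


def metropolis(observed, n_iter=5000, step=0.03, start=(0.5, 0.5), seed=4):
    """Random-walk Metropolis; returns the chain and the acceptance rate."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current[0], current[1], observed)
    chain, accepted = [], 0
    for _ in range(n_iter):
        proposal = (current[0] + rng.gauss(0.0, step), current[1] + rng.gauss(0.0, step))
        proposal_lp = log_posterior(proposal[0], proposal[1], observed)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        chain.append(current)
    return chain, accepted / n_iter


observed = simulate_observations()
chain, acceptance = metropolis(observed)
kept = chain[1000:]  # discard a hypothetical burn-in period
print("acceptance rate:", acceptance)
print("posterior mean k1:", sum(k1 for k1, _ in kept) / len(kept))
print("posterior mean k2:", sum(k2 for _, k2 in kept) / len(kept))
```

In practice, convergence would be checked with multiple chains and diagnostics rather than assumed from a fixed burn-in length.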

CONCLUSIONS

Our simple two-dimensional Streeter-Phelps example illustrates the capabilities and limitations of these methodologies. RSA is completely general, but only separates parameter sets into two groups: in or out. Adding structural assumptions about the model error term, either implicitly, applying GLUE, or explicitly, using Bayesian approaches, yields considerably more information; the resultant parameter sets are expressed probabilistically. MCMC methods make it feasible to generate large samples from these probabilistic parameter sets, which can be used in model predictions, thus resulting in a straightforward calculation of model prediction uncertainty. We deliberately chose an example using a simple low dimensional model, where all properties of the model are known, for easy depiction. In real applications and higher dimensional models, the concepts are analogous but problems resulting from the "curse of dimensionality" become more difficult. Thus, using an approach capable of effectively and efficiently sampling the appropriate parameter space becomes increasingly important.
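The step from a parameter sample to prediction uncertainty can be sketched as follows: each sampled (k1, k2) pair is run through the model, a model error draw is added, and percentiles of the resulting predictions give an interval. The parameter sample used here is a hypothetical stand-in drawn from independent normals, not an actual posterior sample, and the model inputs, sigma, and the 90% level are assumptions for illustration.

```python
import math
import random


def dissolved_oxygen(x_km, k1, k2, l0=10.0, d0=1.0, do_sat=8.0, u=10.0):
    """Streeter-Phelps DO (mg/L); l0, d0, do_sat, u are hypothetical."""
    t = x_km / u
    if abs(k2 - k1) < 1e-9:
        sag = k1 * l0 * t * math.exp(-k1 * t)
    else:
        sag = k1 * l0 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    return do_sat - (sag + d0 * math.exp(-k2 * t))


def percentile(sorted_values, q):
    """Percentile by linear interpolation, q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def predictive_interval(samples, x_km, sigma=0.2, level=90.0, seed=5):
    """Return (lower, median, upper) predicted DO at x_km.

    samples is a list of (k1, k2) pairs, for example an MCMC sample;
    sigma is the model error standard deviation (assumed known here).
    """
    rng = random.Random(seed)
    draws = sorted(
        dissolved_oxygen(x_km, k1, k2) + rng.gauss(0.0, sigma) for k1, k2 in samples
    )
    tail = (100.0 - level) / 2.0
    return (percentile(draws, tail), percentile(draws, 50.0),
            percentile(draws, 100.0 - tail))


# Hypothetical stand-in for a posterior sample of (k1, k2).
rng = random.Random(6)
samples = [(abs(rng.gauss(0.3, 0.03)), abs(rng.gauss(0.6, 0.03))) for _ in range(2000)]

for x in (10, 30, 60):
    lower, median, upper = predictive_interval(samples, x)
    print(f"x = {x:2d} km: {lower:.2f} to {upper:.2f} mg/L (median {median:.2f})")
```

The fraction of such draws that meet a water quality standard would give the probability of attainment for a given management scenario.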

The National Research Council (NRC, 2001) TMDL report recommended "Adaptive Implementation" of TMDLs, an approach based on the "Adaptive Management" concept (Holling, 1978). Using adaptive implementation, water quality models are an integral component of the TMDL assessment phase, in which alternative management actions are evaluated based on the probability of attaining water quality standards. To fully implement this NRC recommendation, it will be imperative to routinely incorporate uncertainty analysis approaches, such as those we have reviewed, into model development. Within the Adaptive Management framework, TMDL implementation is regarded as a "learning by doing" opportunity, an ecosystem-scale experiment (Carpenter et al., 1995) that can provide data and information about system behavior not available by other means. Bayesian methods are particularly useful for model development under adaptive management because they provide a straightforward, rigorous basis for data assimilation and model updating using Bayes' theorem.
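The updating step can be written out explicitly. In the notation assumed here (not taken from the original text), θ denotes the model parameters, y_1 the data available before implementation, and y_2 the monitoring data collected afterward, with y_2 taken to be conditionally independent of y_1 given θ:

```latex
\begin{align}
p(\theta \mid y_1) &\propto p(y_1 \mid \theta)\, p(\theta), \\
p(\theta \mid y_1, y_2) &\propto p(y_2 \mid \theta)\, p(\theta \mid y_1).
\end{align}
```

The posterior from the first assessment serves as the prior for the next, so each round of post-implementation monitoring refines the parameter distribution and the predictions derived from it.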

ACKNOWLEDGMENTS

This work was partially supported by EPA STAR Grant No. R830883. This is GLERL contribution number 1448.

LITERATURE CITED

Beck, M.B., 1987. Water Quality Modeling: A Review of the Analysis of Uncertainty. Water Resources Research 23:1393-1442.

Beven, K.J., 2001. Rainfall-Runoff Modeling: The Primer. John Wiley & Sons Ltd, West Sussex, England, pp. 360.

Beven, K. and A. Binley, 1992. The Future of Distributed Models: Model Calibration and Uncertainty Prediction. Hydrological Processes 6:279-298.

Bowie, G.L., W.B. Mills, D.B. Porcella, C.L. Campbell, J.R. Pagenkopf, G.L. Rupp, K.M. Johnson, P.W.H. Chan, S.A. Gherini, and C.E. Chamberlin, 1985. Rates, Constants, and Kinetic Formulations in Surface Water Quality Modeling. EPA/600/3-85/040, U.S. Environmental Protection Agency, Washington, D.C.

Carpenter, S.R., S.W. Chisholm, C.J. Krebs, D.W. Schindler, and R.F. Wright, 1995. Ecosystem Experiments. Science 269:324-327.

FIGURE 8. MCMC Sample (purple dots) and BMC Sample (green and red dots). BMC sample is depicted in two colors to illustrate correspondence with the RSA behavior-generating (green) and nonbehavior-generating (red) regions. [Axes: Reaeration Coefficient, 0 to 1.0, versus Decay Coefficient, 0 to 1.0.]


Casella, G. and E.I. George, 1992. Explaining the Gibbs Sampler. American Statistician 46:167-174.

Chib, S. and E. Greenberg, 1995. Understanding the Metropolis-Hastings Algorithm. American Statistician 49:327-335.

Dilks, D.W., R.P. Canale, and P.G. Meijer, 1992. Development of Bayesian Monte Carlo Techniques for Water Quality Model Uncertainty. Ecological Modelling 62:149-162.

Franks, S.W., K.J. Beven, P.F. Quinn, and I.R. Wright, 1997. On the Sensitivity of Soil-Vegetation-Atmosphere Transfer (SVAT) Schemes: Equifinality and the Problem of Robust Calibration. Agricultural and Forest Meteorology 86:63-75.

Franks, S.W., P. Gineste, K.J. Beven, and P. Merot, 1998. On Constraining the Predictions of a Distributed Model: The Incorporation of Fuzzy Estimates of Saturated Areas Into the Calibration Process. Water Resources Research 34:787-797.

Freer, J., J. McDonnell, K.J. Beven, D. Brammer, D. Burns, R.P. Hooper, and C. Kendal, 1997. Topographic Controls on Subsurface Storm Flow at the Hillslope Scale for Two Hydrologically Distinct Small Catchments. Hydrological Processes 11:1347-1352.

Gelfand, A.E. and A.F.M. Smith, 1990. Sampling Based Approaches to Calculating Marginal Densities. Journal of the American Statistical Association 85:398-409.

Geweke, J., 1989. Bayesian Inference in Econometric Models Using Monte Carlo Integration. Econometrica 57:1317-1339.

Gilks, W.R., A. Thomas, and D.J. Spiegelhalter, 1994. A Language and Program for Complex Bayesian Modelling. The Statistician 43:169-177.

Holling, C.S., 1978. Adaptive Environmental Assessment and Management. International Institute for Applied Systems Analysis. Blackburn Press, Caldwell, New Jersey.

Hornberger, G.M. and R.C. Spear, 1981. An Approach to the Preliminary Analysis of Environmental Systems. Journal of Environmental Management 12:7-18.

NRC (National Research Council), 2001. Assessing the TMDL Approach to Water Quality Management. National Research Council, National Academy Press, Washington, DC.

Page, T., K.J. Beven, and J.D. Whyatt, 2004. Predictive Capability in Estimating Changes in Water Quality: Long-Term Responses to Atmospheric Deposition. Water Air and Soil Pollution 151:215-244.

Pappenberger, F. and K.J. Beven, 2006. Ignorance is Bliss: Or Seven Reasons Not to Use Uncertainty Analysis. Water Resources Research 42:W05302, doi: 10.1029/2005WR004820.

Qian, S.S., C.A. Stow, and M.E. Borsuk, 2003. On Monte Carlo Methods for Bayesian Inference. Ecological Modelling 159:269-277.

Reckhow, K.H. and S.C. Chapra, 1983. Engineering Approaches for Lake Management, Vols. 1 and 2. Butterworth Publishers, Boston.

Rubin, D.B., 1988. Using the SIR Algorithm to Simulate Posterior Distributions. In: Bayesian Statistics 3, J.M. Bernardo, M.H. DeGroot, D.V. Lindley, and A.F.M. Smith (Editors). Oxford University Press, Oxford, pp. 395-402.

Schulz, K., K. Beven, and B. Huwe, 1999. Equifinality and the Problem of Robust Calibration in Nitrogen Budget Simulations. Soil Science Society of America Journal 63:1934-1941.

Smith, A.F.M. and G.O. Roberts, 1993. Bayesian Computation Via the Gibbs Sampler and Related Markov-Chain Monte-Carlo Methods. Journal of the Royal Statistical Society Series B-Methodological 55:3-23.

Sorooshian, S. and V.K. Gupta, 1995. Model Calibration. In: Computer Models of Watershed Hydrology, V.P. Singh (Editor). Water Resources Publications, Highlands Ranch, Colorado, pp. 23-68.

Spear, R.C., 1997. Large Simulation Models: Calibration, Uniqueness and Goodness of Fit. Environmental Modelling and Software 12:219-228.

Streeter, H.W. and E.B. Phelps, 1925. A Study in the Pollution and Natural Purification of the Ohio River. U.S. Public Health Service, Public Health Bulletin No. 146, Washington, DC.

Zak, S.K. and K.J. Beven, 1999. Equifinality, Sensitivity and Predictive Uncertainty in the Estimation of Critical Loads. Science of the Total Environment 236:191-214.

Zak, S.K., K. Beven, and B. Reynolds, 1997. Uncertainty in the Estimation of Critical Loads: A Practical Methodology. Water Air and Soil Pollution 98:297-316.
