Ecological Modelling 188 (2005) 374–385

Using the generalized F distribution to model limnetic temperature profile and estimate thermocline depth

Victor Chan a,∗, Robin A. Matthews b

a Department of Mathematics, Western Washington University, Bellingham, WA 98225, USA
b Department of Environmental Sciences, Western Washington University, Bellingham, WA 98225, USA

∗ Corresponding author. E-mail address: [email protected] (V. Chan).

Received 7 June 2004; received in revised form 5 April 2005; accepted 13 April 2005
Available online 17 June 2005

0303-2647/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.ecolmodel.2005.04.018

Abstract

A reasonably precise estimate of the thermocline depth (or other well-defined boundaries) in the temperature profile of a lake is of interest to the limnological community. In this article, we propose an empirical model based on the generalized F distribution to describe the temperature profile. Using the model, the thermocline depth can be easily estimated. We also discuss the Gauss–Newton method to fit the model to the profile data. The model fitting procedure is illustrated using data from Lake Whatcom, WA, USA. © 2005 Elsevier B.V. All rights reserved.

Keywords: Thermocline; Temperature profile; Curve-fitting; Nonlinear regression

1. Introduction

A well-known phenomenon that occurs within most temperate lakes during the summer is the development of temperature stratification. The stratification divides the temperature profile into three distinct regions: an upper warm region called the epilimnion, a bottom colder region called the hypolimnion, and an intermediate zone between the two called the metalimnion. Both epilimnion and hypolimnion have approximately constant temperatures, whereas in the metalimnion the temperature decreases rapidly with depth. Taken as a whole, the temperature profile can be viewed as a smooth continuous curve with two bends of varying degrees of curvature; the two bends approximately demarcate the three distinct regions of the profile.

According to widely accepted limnological convention, the thermocline is defined as an imaginary plane located at the depth where the rate of temperature decrease (temperature gradient) in the temperature profile is maximum. The thermocline depth is useful to limnologists because it can be regarded as the dividing plane that separates warm waters at the top layer from colder waters lying below, and because it partitions the lake into two strata with distinctly different biological and chemical features.

Because of the importance of the thermocline, a model that mathematically describes (and parameterizes) the shape of temperature profile and allows fairly
precise estimation of the thermocline depth would be of interest to limnologists. Although there are a number of models that provide such a description, we have not come across a model that has sufficient flexibility to fit stratified temperature profiles adequately. Nor have we seen one that allows straightforward estimation of the thermocline depth (according to the definition given above), given field measurements of temperature and depth.

In this paper, we propose an empirical model that represents the stratified temperature profile with reasonable accuracy. This model is superior to other empirical models that we have seen in terms of better fit and description of the actual profile data. Furthermore, with a good fit, it estimates the thermocline depth easily and precisely. The model is based on the statistical generalized F distribution and was developed from and tested on the monthly data from Lake Whatcom, WA, USA, collected from 2000 to 2003. We also provide a detailed procedure to fit the model to the temperature profile data and to estimate the thermocline depth.

Section 2 presents two temperature profiles from Lake Whatcom, defines the thermocline depth, and provides a brief discussion on finding an appropriate empirical model to fit the temperature profile data. Section 3 introduces the generalized F distribution. We present the fitting procedure and results of fitting in Section 4. In this section, we also discuss some potential convergence problems with the fitting procedure and suggest some possible solutions. Finally, in Section 5, a discussion on the advantages and disadvantages of our proposed model is given. In Appendix A, we provide a method to quantify the uncertainty in the thermocline depth estimate by constructing approximate confidence intervals.

2. Empirical modeling of temperature profile

Figs. 1 and 2 show two examples of typical lake summer temperature profiles, with approximate delineations of the epilimnion, metalimnion, and hypolimnion. The data for the first profile were collected at a site named Site 4 on Lake Whatcom on September 2, 2003, with observations at every meter from the surface to 10 m, and at every 5 m after the 10-m depth. The data for the second profile were collected at Site 1 (which is shallower than Site 4) on the same lake on August 10, 2000, with observations at every meter until 20 m. The lines connecting the observations are point-to-point interpolations. The boundaries separating the three regions for each profile were obtained by subjective “eyeballing”; a widely accepted convention for defining the boundary lines has not yet been established.

Fig. 1. Temperature profile of Site 4 at Lake Whatcom on September 2, 2003. The circles correspond to actual observations; the lines are interpolations.

Fig. 2. Temperature profile of Site 1 at Lake Whatcom on August 10, 2000. The circles correspond to actual observations; the lines are interpolations.

Let $T$ and $z$ denote the temperature and the depth, respectively. The thermocline depth $z_0$ is defined as the depth where

$$\left.\frac{d^2 T(z)}{dz^2}\right|_{z=z_0} = 0.$$

In mathematical terms, the thermocline depth is the inflection point of the temperature curve, i.e., the depth where the temperature curve changes concavity and the temperature gradient is steepest. In this formulation, the temperature curve or profile is assumed to be twice differentiable everywhere and to possess only one inflection point. In Figs. 1 and 2, the thermocline appears to be somewhere near 15 m and 8 m, respectively.
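
To see why a fitted curve is preferable to working with the raw observations, it helps to look at the crude alternative. The sketch below simply picks the interval between successive observations with the steepest temperature drop. The function name and the profile values are hypothetical (they are not Lake Whatcom data), and the result can be no finer than the sampling interval.

```python
def steepest_gradient_depth(depths, temps):
    """Return (midpoint depth, rate) of the interval with the steepest temperature drop."""
    best_depth, best_rate = None, 0.0
    pairs = list(zip(depths, temps))
    for (z_a, t_a), (z_b, t_b) in zip(pairs, pairs[1:]):
        rate = (t_a - t_b) / (z_b - z_a)  # degrees lost per meter of depth
        if rate > best_rate:
            best_depth, best_rate = 0.5 * (z_a + z_b), rate
    return best_depth, best_rate


# Hypothetical profile: depth in m, temperature in degrees C.
depths = [0, 2, 4, 6, 8, 10, 12, 14]
temps = [20.0, 19.8, 19.5, 18.0, 13.0, 9.5, 8.5, 8.2]

z0, rate = steepest_gradient_depth(depths, temps)
print(z0, rate)
```

With observations every 2 m, this estimate can only fall on a midpoint between two sampled depths, whereas a smooth model with an explicit inflection-point parameter is not restricted in this way.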

An empirical model (i.e., a mathematical curve) that adequately fits a typical temperature profile must be able to accommodate the varying degrees of curvature of the two “knees” (i.e., the two bends) of the profile, as illustrated in Figs. 1 and 2. This requirement rules out a large number of simple curves, such as polynomials of small degrees. Furthermore, the number of parameters for the model should be small, following the principle of parsimony (fewer parameters imply easier interpretation, easier fitting, and more reasonable interpolation).

The search for an appropriate empirical model can be facilitated by noticing that there is a close resemblance between the plot of a typical temperature profile (when rotated 90° and flipped over) and the plot of a nonlinear or sigmoidal growth model. These growth models are a family of curves frequently used in biology and other sciences to describe the growth of some quantity over time. For more information, see Chapter 7 of Seber and Wild (2003) or Section 24.7 of Draper and Smith (1998).

We note that the cumulative distribution function (c.d.f.) of a continuous distribution, because of its shape, can be considered one general class of growth models. If the temperature profile is rotated 90° counter-clockwise, reflected about the horizontal line passing through the maximum temperature, and the range of temperatures rescaled to a (0, 1) interval, it appears that the c.d.f. of a well-chosen continuous distribution could be used to approximate the temperature profile. If such a c.d.f., denoted by $F(z)$, is available, the temperature as a function of depth can then be expressed as

$$T(z) = T_s - (T_s - T_a)\,F(z),$$

where $T_s$ is the temperature at the surface and $T_a$ is the temperature at the bottom.

While there may well be more than one good model that meets the criteria, in this paper we propose the c.d.f. of the four-parameter generalized F distribution. In addition to possessing the required flexibility to handle the varying curvature at the two knees for different profiles, this model has a desirable feature in that it has easily interpretable parameters, one of which is exactly the inflection point, i.e., the thermocline depth.

3. Generalized F distribution

The generalized F distribution is frequently encountered in reliability and survival analysis applications. It includes many commonly used distributions as special cases, such as Snedecor’s F distribution, loglogistic, Weibull, and Burr Type XII distributions. For more information on the distribution, see Meeker and Escobar (1998, p. 102) or Johnson et al. (1995, p. 348).

Letting $T$ be the random variable of the generalized F distribution, its probability density function (p.d.f.) is given by

$$f_T(t) = \frac{1}{\sigma t}\,\phi\!\left(\frac{\log(t) - \mu}{\sigma}\right), \quad t > 0,$$

where

$$\phi(u) = \frac{\Gamma(\kappa + r)}{\Gamma(\kappa)\,\Gamma(r)}\,\frac{(\kappa/r)^{\kappa}\exp(\kappa u)}{\{1 + (\kappa/r)\exp(u)\}^{\kappa + r}},$$

$\mu$ is a location parameter, $\sigma > 0$ the scale parameter, $\kappa > 0$ and $r > 0$ the shape parameters, and $\Gamma(x)$ is the gamma function.

To model the temperature profile, we will use a different parameterization of the random variable $T$ of the generalized F distribution. Let $Z = \log(T)$. The p.d.f. of $Z$ is

$$f(z) = \frac{\Gamma(\kappa + r)}{\Gamma(\kappa)\,\Gamma(r)}\,\frac{(1/\sigma)(\kappa/r)^{\kappa}\exp[\kappa(z - \mu)/\sigma]}{\{1 + (\kappa/r)\exp[(z - \mu)/\sigma]\}^{\kappa + r}}. \tag{1}$$

The c.d.f. of $Z$,

$$F(z) = \int_{-\infty}^{z} f(w)\,dw, \tag{2}$$

does not have a closed-form expression, and so must be evaluated numerically. With the variable $z$ representing the depth, this function $F(z)$ has the flexibility to fit the typical shape of the stratified temperature profile. This can be seen in Fig. 3, where the graphs of the c.d.f. for four sets of values of $\kappa$ and $r$, with $\sigma = 1$ and $\mu = 0$, are shown. The plot at the top left-hand corner shows the graph of the c.d.f. for $\kappa = 1$ and $r = 1$. For these values, the graph is symmetric about the center point of the graph. The top right plot shows the graph when $\kappa = 1$ and $r = 0.4$; it is apparent that a reduction in the value of $r$ results in a more rounded curvature on the upper knee of the profile. Similarly, lowering the value of $\kappa$ to 0.4 while maintaining $r = 1$ softens the curvature of the lower knee, as can be seen in the bottom left plot. Finally, when both $\kappa$ and $r$ are increased, both knees have sharper curvatures.

Fig. 3. The c.d.f. of the generalized F distribution for different parameter values of $\kappa$ and $r$, with $\sigma = 1$ and $\mu = 0$.
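
The numerical evaluation of Equations (1) and (2) can be sketched in a few lines. The code below is only an illustration under stated assumptions: the function names `pdf`, `cdf`, and `temperature`, the use of composite Simpson's rule, and the truncation of the lower limit at $\mu - 40\sigma/\kappa$ (where the left tail is negligible) are our choices for this sketch, not a prescribed implementation.

```python
import math


def pdf(z, kappa, r, sigma, mu):
    """Density f(z) of Equation (1), assembled on the log scale to avoid overflow."""
    a = math.log(kappa / r) + (z - mu) / sigma  # log of (kappa/r) * exp(u)
    log1p_exp_a = max(a, 0.0) + math.log1p(math.exp(-abs(a)))  # log(1 + exp(a))
    log_f = (math.lgamma(kappa + r) - math.lgamma(kappa) - math.lgamma(r)
             - math.log(sigma) + kappa * a - (kappa + r) * log1p_exp_a)
    return math.exp(log_f)


def cdf(z, kappa, r, sigma, mu, n=2000):
    """F(z) of Equation (2) by composite Simpson's rule (n must be even)."""
    lower = mu - 40.0 * sigma / kappa  # assumed cutoff: left tail is negligible below this
    if z <= lower:
        return 0.0
    h = (z - lower) / n
    total = pdf(lower, kappa, r, sigma, mu) + pdf(z, kappa, r, sigma, mu)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * pdf(lower + i * h, kappa, r, sigma, mu)
    return total * h / 3.0


def temperature(z, t_surface, t_bottom, kappa, r, sigma, mu):
    """Profile model T(z) = Ts - (Ts - Ta) F(z)."""
    return t_surface - (t_surface - t_bottom) * cdf(z, kappa, r, sigma, mu)


# Shape of the c.d.f. for three of the parameter settings discussed for Fig. 3.
for kappa, r in [(1.0, 1.0), (1.0, 0.4), (0.4, 1.0)]:
    values = [round(cdf(z, kappa, r, 1.0, 0.0), 3) for z in (-4, -2, 0, 2, 4)]
    print(kappa, r, values)
```

For $\kappa = r = 1$ the density reduces to the logistic density, so the numerical c.d.f. can be checked against the closed form $1/(1 + e^{-(z-\mu)/\sigma})$; this is a convenient sanity test for the integration settings.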

This model allows for easy interpretation of the parameters. As just discussed, the parameter $\kappa$ is a tuning parameter for the bottom knee of the curve, while $r$ is the tuning parameter for the other knee. The scale parameter $\sigma$ represents the stretching and shrinking of the graph along the $z$-axis about $\mu$. The location parameter $\mu$ in Equation (1) turns out to be the inflection point of the c.d.f. curve and hence represents the thermocline level.
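
As a quick check of the inflection-point property (a short sketch that uses only Equation (1)), write $u = (z - \mu)/\sigma$ and differentiate $\log f(z)$:

$$\frac{d}{dz}\log f(z) = \frac{1}{\sigma}\left[\kappa - (\kappa + r)\,\frac{(\kappa/r)\,e^{u}}{1 + (\kappa/r)\,e^{u}}\right] = \frac{\kappa}{\sigma}\,\frac{1 - e^{u}}{1 + (\kappa/r)\,e^{u}}.$$

Since $f(z) > 0$, the sign of $F''(z) = f'(z)$ is the sign of $1 - e^{u}$: positive for $z < \mu$, zero at $z = \mu$, and negative for $z > \mu$. The c.d.f. therefore changes concavity exactly once, at $z = \mu$, for any $\kappa > 0$, $r > 0$, and $\sigma > 0$. Because $T(z) = T_s - (T_s - T_a)F(z)$ gives $T''(z) = -(T_s - T_a)\,f'(z)$, the same depth satisfies the definition of the thermocline depth in Section 2.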

4. Fitting algorithm

Because of the nonlinearity of the model in Equation (1), we employ nonlinear regression to find the ordinary least squares (OLS) estimates of the parameters $\kappa$, $r$, $\mu$, and $\sigma$. A nonlinear regression model has the form

$$y_i = F(z_i; \beta) + \varepsilon_i, \quad i = 1, 2, \ldots, n, \tag{3}$$

where $z$ is the covariate, $\beta$ the parameter vector, $F$ the nonlinear function, $n$ the total number of observations, $\varepsilon_i$ the error component, and the subscript $i$ refers to the $i$th observation. In our case, the covariate $z$ is the depth, and the parameter vector is $\beta = (\kappa, r, \sigma, \mu)'$.

To find the OLS estimate of $\beta$, i.e., one that minimizes the square of Euclidean distance (or the error sum of squares) $\|\mathbf{y} - \mathbf{F}(\beta)\|^2$, where $\mathbf{y} = (y_1, y_2, \ldots, y_n)'$ and $\mathbf{F}(\beta) = (F(z_1; \beta), F(z_2; \beta), \ldots, F(z_n; \beta))'$, we use the Gauss–Newton method. This method basically involves linear approximation of the function $F(z_i; \beta)$ at an initial guess $\beta^{(0)}$ and obtaining the OLS estimate for $\beta$, repeating this procedure iteratively until the estimate converges to a stable value. More detailed discussion and examples on this method can be found in Seber and Wild (2003) and Bates and Watts (1988).

For a fixed $z_i$, the first-order Taylor approximation of the function about the $k$th-iteration estimated parameter vector $\beta^{(k)}$ is

$$F(z_i; \beta) \approx F(z_i; \beta^{(k)}) + \sum_{j=1}^{p} \left(\beta_j - \beta_j^{(k)}\right) \left.\frac{\partial F_i}{\partial \beta_j}\right|_{\beta=\beta^{(k)}},$$

where $\beta_j$, $j = 1, 2, \ldots, p$, is the $j$th element of the parameter vector $\beta$ and $F_i$ is short for $F(z_i; \beta)$.

Letting

$$g_i^{(k)} = F(z_i; \beta^{(k)}) - \sum_{j=1}^{p} \beta_j^{(k)} \left.\frac{\partial F_i}{\partial \beta_j}\right|_{\beta=\beta^{(k)}}$$

be the elements of the vector $\mathbf{g}^{(k)}$ and

$$U_{ij}^{(k)} = \left.\frac{\partial F_i}{\partial \beta_j}\right|_{\beta=\beta^{(k)}}$$

be the elements of the matrix $\mathbf{U}^{(k)}$, the approximation can be rewritten in vector-matrix form as

$$\mathbf{F}(\beta) \approx \mathbf{g}^{(k)} + \mathbf{U}^{(k)}\beta. \tag{4}$$

The equation defines a tangent plane to the surface of $\mathbf{F}(\beta)$ at point $\beta^{(k)}$ in the response space.

Suppose the $k$th-iteration parameter vector $\beta^{(k)}$ is given ($k$ could be zero, in which case $\beta^{(k)}$ is the initial guess or starting value). To obtain the next parameter estimates, i.e., the $(k+1)$th-iteration parameter vector $\beta^{(k+1)}$, we look for the value of $\beta$ that minimizes the error sum of squares

$$S(\beta) = \|\mathbf{y} - \mathbf{F}(\beta)\|^2 \approx \|\mathbf{y} - \mathbf{g}^{(k)} - \mathbf{U}^{(k)}\beta\|^2.$$

This is done, as in linear regression, by projecting the observation vector $\mathbf{y}$ onto the tangent plane defined in Equation (4):

$$\beta^{(k+1)} = \left(\mathbf{U}^{(k)\prime}\mathbf{U}^{(k)}\right)^{-1}\mathbf{U}^{(k)\prime}\left(\mathbf{y} - \mathbf{g}^{(k)}\right). \tag{5}$$

This procedure is iterated, updating the vector $\mathbf{g}^{(k)}$ and the matrix $\mathbf{U}^{(k)}$ after each iteration, until convergence is achieved.

Note that the partial derivatives with respect to the parameters $\kappa$, $r$, $\sigma$, and $\mu$ of the nonlinear function $F(z; \beta)$ defined in Equation (2) will have to be evaluated by numerical means.

As in many fitting procedures using nonlinear regression, the most common potential obstacle to a successful fitting of the temperature profile involves the inversion of the matrix $\mathbf{U}^{(k)\prime}\mathbf{U}^{(k)}$ in Equation (5). The difficulty with the inversion is due to ill-conditioning or near-singularity of the matrix, caused by collinearity of its columns. The usual consequence is that the parameter estimates are cast into undesirable regions of the parameter space after one or two iterations, resulting in overinflated values or values of the wrong sign. Multiple iterations will result in increasingly aberrant parameter estimates and eventual failure of the procedure. There are some prescribed methods or modifications to the Gauss–Newton method, such as Levenberg–Marquardt, to deal with this problem. In our case, the problem can be considerably reduced by

• logarithmic transformation of $\kappa$, $r$, and $\sigma$;
• use of step factor $\lambda$.

Using the logarithmic transformation of the three parameters, $\kappa$, $r$, and $\sigma$, the parameter vector becomes

$$\beta = (\log(\kappa), \log(r), \log(\sigma), \mu)'. \tag{6}$$

Since the values of $\kappa$, $r$, and $\sigma$ are obtained by exponentiating the first three elements of $\beta$, the parameter transformations have the effect of restricting the estimates of $\kappa$, $r$, and $\sigma$ to positive values only, which are the proper ranges of the three parameters. This restriction of parameter space will help the iterative procedure zoom into and locate the true minimum, thus promoting convergence.

The objective of using the step factor $\lambda$ is to prevent overshoot of the parameter estimates from one iteration to the next caused by either the ill-conditioning of the matrix $\mathbf{U}^{(k)\prime}\mathbf{U}^{(k)}$ or the possible inaccuracy of the linear approximation to the actual surface of $\mathbf{F}(\beta)$ in Equation (4). An overshoot occurs when there is an increase in the error sum of squares $S(\beta)$. The idea is to take a smaller step toward the next iterated parameter estimates $\beta^{(k+1)}$ if there is an overshoot, $S(\beta^{(k+1)}) > S(\beta^{(k)})$, using

$$\beta = \beta^{(k)} + \lambda\left(\beta^{(k+1)} - \beta^{(k)}\right)$$

as the next parameter estimate, rather than $\beta^{(k+1)}$ obtained in Equation (5). The step factor $\lambda \le 1$ is chosen so that $S(\beta) < S(\beta^{(k)})$. The typical choice of $\lambda$ is the largest value in the sequence of halves $1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \ldots$ that does not produce an overshoot.
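
The interplay of the Gauss–Newton update, the logarithmic transformation, and the step factor can be illustrated on a deliberately reduced problem. The sketch below is not the four-parameter fit: it assumes $\kappa = r = 1$, for which Equation (2) has the closed logistic form, so only $\beta = (\log(\sigma), \mu)'$ is estimated. The update is written in the equivalent increment form $\beta^{(k)} + (\mathbf{U}'\mathbf{U})^{-1}\mathbf{U}'(\mathbf{y} - \mathbf{F}(\beta^{(k)}))$. All function names, the synthetic noise-free data, and the tolerances are assumptions made for illustration.

```python
import math


def model(z, beta):
    """Logistic c.d.f. (Equation (2) with kappa = r = 1); beta = [log(sigma), mu]."""
    x = (z - beta[1]) / math.exp(beta[0])
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sse(beta, zs, ys):
    """Error sum of squares S(beta)."""
    return sum((y - model(z, beta)) ** 2 for z, y in zip(zs, ys))


def jacobian_row(z, beta, eps=1e-6):
    """Central-difference estimates of dF/dbeta_j at depth z (one row of U)."""
    row = []
    for j in range(len(beta)):
        up, down = list(beta), list(beta)
        up[j] += eps
        down[j] -= eps
        row.append((model(z, up) - model(z, down)) / (2.0 * eps))
    return row


def gauss_newton(zs, ys, beta, max_iter=50, rel_tol=1e-8, min_lambda=1.0 / 16):
    """Gauss-Newton with step halving; the text suggests a looser rel_tol of about 1%."""
    s_old = sse(beta, zs, ys)
    for _ in range(max_iter):
        u = [jacobian_row(z, beta) for z in zs]
        res = [y - model(z, beta) for z, y in zip(zs, ys)]
        # Solve the 2x2 normal equations (U'U) d = U'(y - F) for the full step d.
        a11 = sum(row[0] * row[0] for row in u)
        a12 = sum(row[0] * row[1] for row in u)
        a22 = sum(row[1] * row[1] for row in u)
        b1 = sum(row[0] * e for row, e in zip(u, res))
        b2 = sum(row[1] * e for row, e in zip(u, res))
        det = a11 * a22 - a12 * a12
        step = [(a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det]
        lam = 1.0
        while True:  # halve the step factor until S decreases or lambda is small
            trial = [b + lam * d for b, d in zip(beta, step)]
            s_new = sse(trial, zs, ys)
            if s_new < s_old or lam <= min_lambda:
                break
            lam *= 0.5
        beta = trial
        done = s_new < 1e-18 or abs(s_new - s_old) / s_old < rel_tol
        s_old = s_new
        if done:
            break
    return beta


# Synthetic, noise-free rescaled profile with sigma = 2 and mu = 10 (hypothetical values).
zs = [float(z) for z in range(21)]
ys = [model(z, [math.log(2.0), 10.0]) for z in zs]
beta_hat = gauss_newton(zs, ys, [math.log(3.0), 8.0])
print(math.exp(beta_hat[0]), beta_hat[1])
```

Working with $\log(\sigma)$ means that no step, however large, can produce a negative scale parameter, which is the practical benefit of the transformation in Equation (6).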

4.1. Outline of algorithm

1. Rescale the temperature values so that $T_{\max}$ and $T_{\min}$ correspond to 0 and 1, respectively, i.e.,

$$y_i = -\frac{T_i - T_{\max}}{T_{\max} - T_{\min}}.$$

2. Select starting values for each element in the parameter vector $\beta = (\log(\kappa), \log(r), \log(\sigma), \mu)'$. At this point, the iteration index $k$ is zero. To facilitate convergence of the procedure, these values should be within the ballpark of the final values of the parameters. In particular, the starting value for the thermocline depth $\mu$ should be within the metalimnion. For Lake Whatcom data, we found that starting values of 2.0, 0.5, and 2.5 for $\kappa$, $r$, and $\sigma$, respectively, work very well in virtually all cases. As for the starting value for the thermocline depth $\mu$, we used a value close to the “middle” of the metalimnion.

3. Compute the elements of vector $\mathbf{g}^{(k)}$ and $\mathbf{U}^{(k)}$, which are defined just before Equation (4). The c.d.f. $F(z; \beta)$ is computed by numerical integration of Equation (2), where

$$f(z) = \frac{\Gamma(e^{\beta_1} + e^{\beta_2})}{\Gamma(e^{\beta_1})\,\Gamma(e^{\beta_2})} \times \frac{e^{-\beta_3}\,(e^{\beta_1}/e^{\beta_2})^{e^{\beta_1}} \exp[e^{\beta_1} e^{-\beta_3}(z - \mu)]}{\{1 + (e^{\beta_1}/e^{\beta_2}) \exp[e^{-\beta_3}(z - \mu)]\}^{e^{\beta_1} + e^{\beta_2}}}, \tag{7}$$

and $e^{\beta_1} = \kappa$, $e^{\beta_2} = r$, and $e^{-\beta_3} = 1/\sigma$. The value of the lower limit of integration can be taken to be the point $z$ where $f(z)$ corresponding to the temperature profile is virtually zero, say $z = -3$. The partial derivative at $z = z_0$ is calculated by

$$\left.\frac{\partial F_i}{\partial \beta_j}\right|_{z=z_0} = \int_{-\infty}^{z_0} \frac{\partial f(w)}{\partial \beta_j}\,dw$$

using numerical integration as well.

4. Compute $\beta^{(k+1)}$ using Equation (5).

5. Compute $\beta = \beta^{(k)} + \lambda(\beta^{(k+1)} - \beta^{(k)})$, starting with $\lambda = 1$, and the corresponding error sum of squares $S(\beta)$. If there is not an overshoot, move to the next step. Otherwise, multiply the value of the step factor $\lambda$ by one half and recompute $\beta$, repeating until $S(\beta) < S(\beta^{(k)})$ or until $\lambda$ is equal to some small number (e.g., 1/16), in which case the increment in the estimate is deemed sufficiently small. Note that the $\beta$ computed in this step will be used as the next iterated parameter vector $\beta^{(k+1)}$, instead of the one computed in step 4.

6. Compute the relative change in the error sum of squares

$$\frac{|S(\beta^{(k+1)}) - S(\beta^{(k)})|}{S(\beta^{(k)})}.$$

If this relative change is less than, say, 1%, stop. Otherwise, repeat steps 3–6 for the next iteration. Note that other criteria of stopping the procedure could also be used, such as the relative change in $\beta$. When the procedure stops, the plot of the fitted curve should always be checked to ensure that the procedure has yielded a minimum error sum of squares.

4.2. Examples

As examples of model-fitting using the algorithm described previously, we will show the fitted results for the two temperature profiles given in Section 2. Fig. 4 shows the fitted curve to the temperature profile given in Fig. 1. Using the starting values given in the outline of algorithm, it took five iterations to converge (the exact number of iterations depends on a number of factors; apart from the starting values, these include the size of the step factor and the accuracy of the numerical integration calculation of the c.d.f. and the partial derivatives). The fitted curve appears to have captured the overall shape of the profile fairly well, especially in the metalimnion where the thermocline resides. There appears to be a slight systematic deviation between the observations and the fit over the depth interval of 30–80 m. Based on the fitted curve, the estimated value of the thermocline level $\mu$, shown as the horizontal line, is 14.5 m.

Fig. 5 shows the fitting result for the temperature profile given in Fig. 2. The number of iterations needed for convergence was five. As before, the fit is generally good. The thermocline depth is 7.3 m, represented by the horizontal line.
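
The bookkeeping around the fit in Section 4.1, namely the rescaling of step 1, its inverse (needed to plot a fitted curve on the original temperature scale), and the stopping rule of step 6, can be sketched as follows. The function names and the three temperature readings are hypothetical.

```python
def rescale(temps):
    """Step 1: map T_max to 0 and T_min to 1."""
    t_max, t_min = max(temps), min(temps)
    return [(t_max - t) / (t_max - t_min) for t in temps]


def to_temperature(y, t_max, t_min):
    """Invert the rescaling: recover a temperature from a fitted value y in [0, 1]."""
    return t_max - (t_max - t_min) * y


def relative_change(s_new, s_old):
    """Step 6: relative change in the error sum of squares."""
    return abs(s_new - s_old) / s_old


temps = [20.0, 14.0, 8.0]  # hypothetical readings in degrees C
y = rescale(temps)
print(y)
print(to_temperature(0.5, max(temps), min(temps)))
print(relative_change(0.995, 1.0) < 0.01)  # would stop under a 1% criterion
```

The rescaling is written as $(T_{\max} - T_i)/(T_{\max} - T_{\min})$, which is algebraically identical to the expression in step 1.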

Fig. 4. Temperature profile and fitted curve of Site 4 at Lake Whatcom on September 2, 2003. The estimated thermocline level (horizontal line) is 14.5 m.

Fig. 5. Temperature profile and fitted curve of Site 1 at Lake Whatcom on August 10, 2000. The estimated thermocline level (horizontal line) is 7.3 m.

4.3. Some potential convergence problems and possible remedies

4.3.1. Choice of starting values

Our experience in fitting the model to Lake Whatcom data indicates that the choice of starting value for the thermocline depth $\mu$ is crucial in ensuring a high likelihood of convergence for the algorithm. The starting value for $\mu$ should be within the metalimnion region; a good choice would be the halfway point between the depths of the two knees (in whichever way the depth of each knee is defined). Convergence failure is very likely to occur if the starting value is outside the metalimnion. As pointed out in the description of the algorithm, the starting values of 2.0, 0.5, and 2.5 for $\kappa$, $r$, and $\sigma$, respectively, work well for Lake Whatcom data, regardless of the actual depth of the thermocline.

4.3.2. Anomalous temperature profiles and partial fit

Another type of convergence problem occurs when the temperature profile data show substantial deviation from the typical shape of a stratified profile with the three distinct regions of epilimnion, metalimnion, and hypolimnion. For instance, there could exist a secondary thermocline or multiple “steps” in the temperature curve. One such anomalous behavior can be seen in the data from Site 2 of Lake Whatcom taken on July 10, 2002 and shown in Fig. 6. The behavior is due to intense heating from solar radiation on the lake surface during a period of relatively calm wind.

Fitting the proposed model to such nontypical temperature profiles is inappropriate because the model lacks the flexibility to handle anomalous behaviors. Furthermore, attempts at fitting will usually result in convergence failure. However, if part of the profile exhibits the typical profile shape, especially the part where the actual thermocline resides, then a partial fit of the model may be possible. Although such a partial fit does not capture the entire shape of the profile, it is still useful in that it allows for the estimation of the thermocline depth.

Partial fit is achieved by restricting the range of depth in the profile that can reasonably be fitted by the proposed model. An example of partial fitting is shown in Fig. 6. The fitting of the model is restricted to depths between 5 m and 20 m. The estimated thermocline depth from the model is 10.9 m.

Fig. 6. Temperature profile and partially fitted curve of Site 2 at Lake Whatcom on July 10, 2002. The estimated thermocline level (horizontal line) is 10.9 m.

4.3.3. Nonconforming points and imputed values

Problems in convergence may also occur when the data points in the metalimnion region are very sparse (as in some of the profile data for Lake Whatcom) and one or more key data points, which are highly influential in determining the profile curvature, happen to deviate sufficiently far from the expected alignment with the fitted curve in relation to the other data points. We shall call such points nonconforming points.

An example is the temperature profile from Site 4 taken on September 3, 2002, shown in Fig. 7. Observations are especially lacking in the metalimnion, and the data point at depth $z = 20$ m is nonconforming. The fact that it does not quite conform to the fitted curve relative to the other points is illustrated in Fig. 8. The plots correspond to the first four iterations of the fitting, with the $y$-axis representing the standardized or rescaled temperature and the $x$-axis representing the depth. In the first and second iterations, the data point at depth $z = 20$ m deviates considerably in terms of vertical distance from the fitted curve in comparison with the other points.

Fig. 7. Temperature profile of Site 4 at Lake Whatcom on September 3, 2002.

Fig. 8. The first four iterations of the fitting of temperature profile of Site 4 at Lake Whatcom on September 3, 2002. The $y$-axis represents the rescaled temperature values, while the $x$-axis denotes the depth $z$.

To get a better fit, the algorithm adjusts the parameter values, as seen in the plots for the third and fourth iterations, so that the fitted curve gets closer to the point at $z = 20$ m. The net effect on the parameter estimates is that $r$ becomes smaller (the upper knee having a more gentle curvature) and $1/\sigma$ increases (the graph shrinking horizontally toward the inflection point $z = \mu$). In the fifth iteration (not shown), the value of $r$ is reduced even further ($r = 0.00466$) and $1/\sigma$ becomes very large ($1/\sigma = 39.7$), resulting in very large values of the p.d.f. (see Equation (7)), thus causing numerical overflow and convergence failure.

This example illustrates the typical problem of this nature when a nonconforming observation lies considerably far from the expected alignment with the fitted curve. In its attempts to improve the fit (through minimizing the error sum of squares), the algorithm drives the parameter estimates in the direction of increasingly large (or decreasingly small) values for some parameters in the parameter space. Eventually when one or more of the parameter values become very large (or small), the algorithm fails.

This nonconvergence problem affects about 16% of Lake Whatcom data. While it is possible that this problem could be handled by modifying the entire algorithm (say, by using a constraint similar to Marquardt’s or Levenberg’s method), a simple ad hoc remedy that works in many cases, at least for Lake Whatcom data, is to introduce an “imputed” point into the data set for the purpose of stabilizing the iterations.
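
One simple way to build such an imputed point, assuming it is taken as the linear midpoint between the nonconforming observation and one of its neighbors, is sketched below. The function name and the temperature values are hypothetical; only the 5-m depth spacing mirrors the sampling scheme described for Site 4.

```python
def add_imputed_point(depths, temps, index, neighbor):
    """Insert the linear midpoint between observations `index` and `neighbor`."""
    z_mid = 0.5 * (depths[index] + depths[neighbor])
    t_mid = 0.5 * (temps[index] + temps[neighbor])
    points = sorted(zip(depths + [z_mid], temps + [t_mid]))  # keep depth order
    return [z for z, _ in points], [t for _, t in points]


# Hypothetical sparse metalimnion readings: depth in m, temperature in degrees C.
depths = [10.0, 15.0, 20.0, 25.0]
temps = [19.0, 17.0, 9.0, 8.0]

# Treat the observation at 20 m (index 2) as nonconforming; impute toward 15 m.
new_depths, new_temps = add_imputed_point(depths, temps, index=2, neighbor=1)
print(new_depths)
print(new_temps)
```

The augmented lists can then be passed to the fitting routine in place of the original data; the imputed value should be marked clearly in any plot so that it is not mistaken for an observation.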

A natural choice for the “imputed” point would be the interpolated value that lies halfway between two data points, one of which is the nonconforming point. When included in the data, this interpolated point has the effect of constraining the fit and is thus likely to prevent the parameter estimates from running away, thereby promoting convergence.

As an example, an imputed value at the depth $z = 17.5$ m is added to the temperature profile data for Site 4, September 2002, and is marked by an asterisk on the plot in Fig. 9. The algorithm successfully converged, yielding an estimate of the thermocline depth of 15.6 m.

While the imputed-value approach does not guarantee convergence in all cases, it appears to work fairly well for Lake Whatcom data. Of all the profiles with nonconvergence problems that can be attributed to the presence of nonconforming point(s), about 70% of them can be cured by inserting an imputed value. More studies are needed to better understand and improve the efficacy of such an ad hoc remedy.

Also, while it is clear that adding an imputed value to the profile data will affect the fitted curve slightly, for practical purposes the influence of an imputed value on the OLS estimate of the thermocline depth is minimal.

Fig. 9. Fitted curve of Site 4 at Lake Whatcom on September 3, 2002. The asterisk represents the imputed value. The estimated thermocline level (horizontal line) is 15.6 m.

5. Discussion

Our proposed model based on the generalized F distribution has the following new and beneficial features in comparison with other empirical models of the temperature profile:

• it provides a fairly accurate description, through least-squares fitting, of stratified temperature profile for a large class of profile data;

• for an adequately fitted profile, it provides a good estimate of the thermocline depth;

• its four model parameters can be easily interpreted.

As can be seen from the examples in Section 4.2, our model represents the temperature profile reasonably well. One of the main reasons for this is that the flexibility in its parameterization allows adequate description of the curvature at each of the two knees of the temperature profile, something which all other empirical models of the thermocline that we have come across are lacking. As an example for comparison, consider the “self-similarity” model proposed
Page 10: Using the generalized F distribution to model limnetic temperature profile and estimate thermocline depth

V. Chan, R.A. Matthews / Ecological Modelling 188 (2005) 374–385 383

by Kitaigorodskii and Miropolskii (1970). While it isknown that this model gives a good approximate de-scription of the temperature profile, the model equationhas a rigid form; its equation for “normalized” tem-peratureθ (ratio of the difference between temperatureand surface temperature to the difference between sur-face temperature and lake-bottom temperature) withinthe metalimnion as a function of depthz is given byθ = 8η/3 − 2η2 + η4/3 for θ, whereη = (z − h)/�h,h is the depth of the boundary between epilimnionand metalimnion, and�h is the thickness of the met-alimnion. Because of the rigidity, the model equationdoes not describe very well the curvatures at the twoknees of the profile, which tend to vary from site to siteand from one time point to another. In fact, even if theparameterization of this model is allowed to vary (i.e.,the coefficients of the powers ofη are allowed to takethe best “fit”), the “self-similarity” model would stillprovide a poor description because its model equation,based on a polynomial of fourth degree, does not pos-sess the necessary flexibility to account for the varyingcurvature of temperature profile measurements.
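The rigidity of the “self-similarity” equation can be made concrete by differentiating the polynomial quoted above. The following is only a worked check of the formula as printed here, not a derivation taken from Kitaigorodskii and Miropolskii (1970):

$$\theta(\eta) = \tfrac{8}{3}\eta - 2\eta^{2} + \tfrac{1}{3}\eta^{4}, \qquad \theta'(\eta) = \tfrac{8}{3} - 4\eta + \tfrac{4}{3}\eta^{3}, \qquad \theta''(\eta) = -4 + 4\eta^{2}.$$

$$\theta(0) = 0,\quad \theta'(0) = \tfrac{8}{3},\quad \theta''(0) = -4; \qquad \theta(1) = 1,\quad \theta'(1) = 0,\quad \theta''(1) = 0.$$

In normalized coordinates, the slope and curvature at both ends of the metalimnion are therefore fixed numbers. The data enter only through the scaling quantities ($h$, $\Delta h$, and the two reference temperatures), so the shape of the curve near the two knees cannot adapt to a particular profile.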

Another advantage of our model is that fitting the model to the profile data also provides an estimate of the thermocline depth (as defined in Section 2), given through one of its parameter values. Models such as the “self-similarity” model with $\theta = 8\eta/3 - 2\eta^2 + \eta^4/3$ do not include an inflection point, and hence do not allow for the estimation of the thermocline depth.
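The statement that the “self-similarity” polynomial has no inflection point can be checked numerically. The sketch below is a minimal illustration, assuming that $\eta$ is restricted to the open interval (0, 1), i.e. the interior of the metalimnion. All function names are hypothetical, and the "smoothstep" curve is an illustrative contrast only, not a model from the paper.

```python
def theta(eta):
    """Normalized temperature of the quoted self-similarity polynomial."""
    return 8.0 * eta / 3.0 - 2.0 * eta**2 + eta**4 / 3.0


def theta_second_derivative(eta):
    """Second derivative of theta with respect to eta: -4 + 4*eta^2."""
    return -4.0 + 4.0 * eta**2


def smoothstep_second_derivative(eta):
    """Second derivative of 3*eta^2 - 2*eta^3 (hypothetical S-shaped contrast curve)."""
    return 6.0 - 12.0 * eta


def has_interior_inflection(second_derivative, steps=1000):
    """Return True if the second derivative changes sign strictly inside (0, 1)."""
    signs = []
    for k in range(1, steps):
        value = second_derivative(k / steps)
        if value != 0.0:  # skip exact zeros so they cannot hide a sign change
            signs.append(value > 0.0)
    return any(a != b for a, b in zip(signs, signs[1:]))


if __name__ == "__main__":
    print("self-similarity:", has_interior_inflection(theta_second_derivative))
    print("smoothstep     :", has_interior_inflection(smoothstep_second_derivative))
```

The second derivative of the quartic stays negative throughout (0, 1) and only reaches zero at the boundary $\eta = 1$, which is why no interior depth can be singled out as a thermocline from that curve.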

It is difficult to make a fair and direct comparison between our proposed model, which is an empirical model that depends only on temperature-depth measurements, and other models of the thermocline that are based on thermal and dynamical considerations. Models of the latter type, such as the ones given in Munk and Anderson (1948), de Caprariis (1981), and Schernewski et al. (1994), rely heavily on dynamical and thermal variables such as wind speed and heat flux, which are difficult to measure accurately. Also, these models are developed using a synthesis of hydrodynamical theory and heat equations, which, when applied to a complicated system such as the lake, represents at best an approximation to the true state of nature. For these reasons, in terms of description of temperature profile or thermocline depth estimation, these models produce results that may deviate substantially from true values, and therefore are not as accurate as our model.

Despite our proposed model’s capability to describe the temperature profile, there are certain limitations. From the point of view of a limnologist, the main drawback is that, since it was formulated from empirical data, the model lacks a thermo-physical basis and hence does not provide any insight into the thermal and dynamical aspects of the temperature stratification within a lake. Moreover, being a purely descriptive model, it lacks predictive power; given variables such as wind speed or surface temperature, the model is not able to predict, for example, the thermocline depth. Nor can it be used to estimate future seasonal variability of the temperature profile. However, our model can conceivably be incorporated into sophisticated thermodynamical models, where its fairly accurate representation of the temperature profile and the thermocline depth may be put to good use, and may possibly help extend the theory of the thermal or dynamical processes of the temperature stratification.

Another disadvantage of our model stems from the complicated expression and the lack of a simple closed-form formula for the model (cf. Equations (1) and (2)). This results in an elaborate fitting procedure, requiring intensive computation. However, in view of the speed and processing power of modern computers, the computational aspects of fitting the model to data and the model formula calculations are no longer a serious problem. The first author has written a routine in S-PLUS (and R) that fits the model to temperature profile data, and it is available upon request.

Because the nonlinear regression method employed in this paper does not guarantee a successful fit for every set of profile data, it would be useful to consider alternative fitting procedures such as the Doesn’t Use Derivatives (DUD) method and the method of steepest descent (see, for example, Seber and Wild, 2003, or Draper and Smith, 1998). While these methods avoid the nonconvergence problems encountered in our fitting procedure, they may not be easy to implement to fit our model compared to the nonlinear regression method.

Acknowledgments

We thank Mike Hilles and Joan Vandersypen, Institute for Watershed Studies, Western Washington University, for providing the temperature profile data from Lake Whatcom. We also wish to thank two anonymous referees and the Editor for reviewing our manuscript and for providing valuable comments and suggestions.

Appendix A. Approximate confidence interval for thermocline depth

The standard inference techniques for nonlinear regression may not be used to construct a confidence interval for the thermocline depth estimate because the error terms $\varepsilon_i$ based on the model in Equation (3) for temperature profile are non-normal and are correlated between adjacent levels of depth. The correlation can be seen in the fitted temperature profiles in Figs. 4 and 5 in Section 4.2, where the residuals around or below the lower knee tend to bunch up on one side. The residuals from fitting the model to profile data also indicate that the errors are heteroscedastic, i.e., the variance of the error depends on the depth.

The presence of both serial correlation and heteroscedasticity in the error term complicates the inferential procedures on the estimates. Fortunately, an inference method that deals with such errors has been developed. It is based on the normal approximation to the OLS estimators after accounting for the unequal standard errors of the estimates. This method, discussed briefly in Seber and Wild (2003, p. 274) and in more detail by Gallant (1987, p. 137), estimates a covariance matrix for the OLS estimator $\hat{\beta}$ that is asymptotically valid when the errors are both correlated and heteroscedastic. Gallant (1987, p. 559) shows that, under some mild conditions,

$$\hat{\beta} - \beta \;\overset{\cdot}{\sim}\; N(0, V),$$

when $n$ is large, where

$$V = (U'U)^{-1}\,\hat{B}\,(U'U)^{-1}$$

and $U$ is a matrix such that

$$U_{ij} = \left.\frac{\partial F_i}{\partial \beta_j}\right|_{\beta=\hat{\beta}}.$$

To define the matrix $\hat{B}$, let $\langle n \rangle$ be the integer nearest $n^{1/5}$. Then

$$\hat{B} = \sum_{r=-\langle n \rangle}^{\langle n \rangle} w\!\left(\frac{r}{\langle n \rangle}\right) B_{nr},$$

where

$$w(t) = \begin{cases} 1 - 6|t|^{2} + 6|t|^{3}, & 0 \le |t| \le \tfrac{1}{2}, \\ 2(1 - |t|)^{3}, & \tfrac{1}{2} < |t| \le 1, \end{cases}$$

and

$$B_{nr} = \begin{cases} \displaystyle\sum_{i=1+r}^{n} \varepsilon_i \varepsilon_{i-r} \left[\frac{\partial F_i}{\partial \beta}\,\frac{\partial F_{i-r}}{\partial \beta'}\right]_{\beta=\hat{\beta}}, & r \ge 0, \\ B'_{n,-r}, & r < 0. \end{cases}$$

Note that $\partial F_i/\partial \beta$ is a vector corresponding to the $i$th depth level such that the $j$th element of the vector is $\partial F_i/\partial \beta_j$, i.e., $\partial F_i/\partial \beta$ evaluated at $\beta = \hat{\beta}$ is equal to the $i$th row of the matrix $U$.

An approximate $100(1-\alpha)\%$ confidence interval of $\beta_i$, the $i$th parameter of $\beta$, is given by

$$\hat{\beta}_i \pm z_{\alpha/2}\sqrt{V_{ii}},$$

where $z_{\alpha/2}$ is the upper $100 \times \alpha/2$ percentile of the standard normal distribution and $V_{ii}$ is the $i$th diagonal element of $V$. In particular, if the thermocline depth $\mu$ is the fourth parameter in the parameter vector $\beta$, then an approximate 95% confidence interval for $\mu$ is

$$\hat{\mu} \pm 1.96\sqrt{V_{44}}.$$

Example

Using the profile data from Site 4 taken on September 2, 2003, an approximate 95% confidence interval for the thermocline depth is (13.47 m, 15.58 m). As for the data from Site 1 taken on August 10, 2000, an approximate 95% confidence interval for the thermocline depth is (6.97 m, 7.69 m). The difference in the width of the two confidence intervals can be explained by the fact that the number of observations in the metalimnion region of the second profile is much larger than that of the first profile, and hence the greater precision in the thermocline estimate of the second profile.
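The covariance estimate and interval described in this appendix can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' S-PLUS/R routine: the function names (`parzen_weight`, `hac_covariance`, `confidence_interval`) are hypothetical, the fitted residuals stand in for the $\varepsilon_i$, the matrix `U` of partial derivatives is assumed to be supplied by whatever fitting routine is used, and the demonstration data are synthetic rather than Lake Whatcom measurements.

```python
import numpy as np


def parzen_weight(t):
    """Weight function w(t) from the appendix; taken as zero outside |t| <= 1."""
    a = abs(t)
    if a <= 0.5:
        return 1.0 - 6.0 * a**2 + 6.0 * a**3
    if a <= 1.0:
        return 2.0 * (1.0 - a) ** 3
    return 0.0


def hac_covariance(U, resid):
    """Return V = (U'U)^{-1} B (U'U)^{-1} with the weighted sum B of the B_nr terms.

    U     : (n, p) array, row i holds dF_i/dbeta at the fitted parameters.
    resid : length-n array of fitted residuals (assumed stand-in for eps_i).
    """
    U = np.asarray(U, dtype=float)
    e = np.asarray(resid, dtype=float)
    n, p = U.shape
    lag = max(1, int(round(n ** 0.2)))  # <n>: integer nearest n^(1/5)
    G = U * e[:, None]                  # row i is eps_i * dF_i/dbeta
    B = np.zeros((p, p))
    for r in range(lag + 1):
        # B_nr for r >= 0: sum over i of (eps_i u_i)(eps_{i-r} u_{i-r})'
        B_r = G[r:].T @ G[: n - r]
        w = parzen_weight(r / lag)
        if r == 0:
            B += w * B_r
        else:
            # the r < 0 term is the transpose of the matching r > 0 term
            B += w * (B_r + B_r.T)
    UtU_inv = np.linalg.inv(U.T @ U)
    return UtU_inv @ B @ UtU_inv


def confidence_interval(estimate, variance, z=1.96):
    """Approximate normal interval: estimate +/- z * sqrt(variance)."""
    half_width = z * np.sqrt(variance)
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    U_demo = rng.normal(size=(40, 4))          # synthetic derivative matrix
    resid_demo = rng.normal(scale=0.3, size=40)  # synthetic residuals
    V_demo = hac_covariance(U_demo, resid_demo)
    # 7.3 is a made-up estimate for the fourth parameter, used only for display
    print(confidence_interval(7.3, V_demo[3, 3]))
```

Because $w(1) = 0$, the terms at lag $\pm\langle n \rangle$ contribute nothing, so for short profiles only the first few lags affect the result.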

References

Bates, D.M., Watts, D.G., 1988. Nonlinear Regression Analysis and Its Applications. John Wiley, NY.

de Caprariis, P., 1981. A note on the development of the thermocline in temperate lakes. Ecol. Model. 12, 213–219.

Draper, N.R., Smith, H., 1998. Applied Regression Analysis, third ed. John Wiley, NY.

Gallant, A.R., 1987. Nonlinear Statistical Models. John Wiley, NY.

Johnson, N.L., Kotz, S., Balakrishnan, N., 1995. Continuous Univariate Distributions, vol. 2. John Wiley, NY.

Kitaigorodskii, S.A., Miropolskii, Y.Z., 1970. On the theory of active layer of open ocean. Izvestiya (Atmos. Oceanic Phys.) 6, 97–102.

Meeker, W.Q., Escobar, L.A., 1998. Statistical Methods for Reliability Data. John Wiley, NY.

Munk, W.H., Anderson, E.R., 1948. Notes on a theory of the thermocline. J. Mar. Res. 7, 276–295.

Schernewski, G., Theesen, L., Kerger, K.E., 1994. Modelling thermal stratification and calcite precipitation of Lake Belau (northern Germany). Ecol. Model. 75–76, 421–433.

Seber, G.A.F., Wild, C.J., 2003. Nonlinear Regression. Wiley-Interscience, NY.

