A Comparison of Two Exact Confidence Regions for Partially Nonlinear Regression Models

The Canadian Journal of Statistics, Vol. 15, No. 2, 1987, Pages 127-135
La Revue Canadienne de Statistique

David HAMILTON

Dalhousie University

Key words and phrases: Efficient scores, exact confidence region, nonlinear regression, partially nonlinear model, second-order approximation.

AMS 1980 subject classifications: Primary 62J02; secondary 62F25.

ABSTRACT

Exact confidence regions for all the parameters in nonlinear regression models can be obtained by comparing the lengths of projections of the error vector into orthogonal subspaces of the sample space. In certain partially nonlinear models an alternative exact region is obtained by replacing the linear parameters by their conditional estimates in the projection matrices. An ellipsoidal approximation to the alternative region is obtained in terms of the tangent-plane coordinates, similar to one previously obtained for the more usual region. This ellipsoid can be converted to an approximate region for the original parameters and can be used to compare the two types of exact confidence regions.


1. INTRODUCTION

The partially nonlinear regression model has the form

$$y = g + X\theta_2 + \epsilon,$$

where $y$ is the $n \times 1$ vector of dependent variables, and $\epsilon$ is the corresponding error vector, assumed to be spherically normal with covariance matrix $\sigma^2 I$. The mean vector $\eta(\theta) = g + X\theta_2$ is a function of $p_1$ nonlinear parameters $\theta_1$, through the $n \times 1$ vector $g$ and $n \times p_2$ matrix $X$, and of $p_2 = p - p_1$ linear parameters $\theta_2$. Both $g$ and $X$ usually also depend on some explanatory variables.

Approximate confidence regions for all the parameters $\theta$ in any nonlinear regression model can be obtained using the likelihood-ratio criterion or the large-sample normality of the maximum-likelihood estimate $\hat\theta$. Regions obtained using the former approach correspond to contours of constant sum of squares, and correction factors to make the nominal level more accurate have been proposed by Beale (1960) and Johansen (1983). The latter approach is equivalent to assuming the model is linear near $\hat\theta$ and produces ellipsoidal regions.

Exact joint confidence regions contain those values of $\theta$ satisfying

$$\{y - \eta(\theta)\}^T V (V^T V)^{-1} V^T \{y - \eta(\theta)\} \le \delta, \tag{1.1}$$

where $V = (V_1, X)$ is the $n \times p$ derivative matrix and $\delta$ is $\sigma^2\chi^2(p; \alpha)$ if $\sigma^2$ is known and $p s^2 F(p, \nu; \alpha)$ if $\sigma^2$ is estimated independently using $\nu$ degrees of freedom, for example using replications. If, as is usually the case, $\sigma^2$ is not known and cannot be independently estimated, the joint region contains values of $\theta$ satisfying

$$e(\theta)^T P(\theta)\, e(\theta) \le f\, e(\theta)^T \{I - P(\theta)\}\, e(\theta), \tag{1.2}$$

where $e(\theta) = y - \eta(\theta)$, $P(\theta) = V(V^TV)^{-1}V^T$, and $f = pF(p, n - p; \alpha)/(n - p)$. Both (1.1) and (1.2) are derived from the efficient scores (Cox and Hinkley 1974, p. 324), which are $V^T e(\theta)/\sigma^2$ for the nonlinear regression model.

Equation (1.2) states that $\theta$ is in the joint confidence region if the usual F-test for the linear regression of $e(\theta)$ on the columns of $V$ is not significant at level $\alpha$. In geometric terms, the left-hand side of (1.1) and (1.2) is the squared length of the projection of the error vector into the subspace spanned by the columns of $V$. This subspace is tangent at $\eta(\theta)$ to the expectation surface, which consists of all points $\eta(\theta)$ obtained by varying $\theta$. In (1.2) the dependence on $\sigma^2$ is eliminated by dividing the left side of (1.1) by the squared length of the component of $e(\theta)$ orthogonal to the tangent plane. The exact confidence region proposed by Halperin (1963) for parameters in partially nonlinear models is the same as (1.2). A similar exact region was proposed by Hartley (1964) using a predetermined subspace of sample space rather than the subspace spanned by the columns of $V$.
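The F-test reading of (1.2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the two-parameter model and data of Section 5 and the tabled value F(2, 4; 0.05) = 6.94 quoted there, and the function names (`eta`, `deriv_matrix`, `in_usual_region`) are hypothetical, not taken from the paper.

```python
import numpy as np

# Data and critical value quoted in Section 5.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.0823, 3.4946, 2.8432, 2.6185, 2.4164, 2.8486])
F_CRIT = 6.94  # F(2, 4; 0.05)

def eta(theta, x):
    """Mean vector for f(theta, x) = theta1*x + theta2*exp(-theta1*x)."""
    t1, t2 = theta
    return t1 * x + t2 * np.exp(-t1 * x)

def deriv_matrix(theta, x):
    """n x p derivative matrix V = (V1, X) for the same model."""
    t1, t2 = theta
    ex = np.exp(-t1 * x)
    return np.column_stack([x - t2 * x * ex, ex])

def in_usual_region(theta, x, y, f_crit=F_CRIT):
    """True if theta satisfies (1.2): e'Pe <= f * e'(I - P)e."""
    n, p = len(y), len(theta)
    e = y - eta(theta, x)
    # Columns of q form an orthonormal basis of the tangent plane at eta(theta).
    q, _ = np.linalg.qr(deriv_matrix(theta, x))
    in_plane = np.sum((q.T @ e) ** 2)   # squared length of the projection
    orthogonal = e @ e - in_plane       # squared length of the remainder
    f = p * f_crit / (n - p)
    return bool(in_plane <= f * orthogonal)

print(in_usual_region((0.47671, 4.9865), x, y))  # at the reported estimate
print(in_usual_region((0.9, 5.0), x, y))         # a point far from it
```

Working with an orthonormal basis from a QR factorization avoids forming $(V^TV)^{-1}$ explicitly; any equivalent least-squares routine would serve.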

Confidence regions based on the sum of squares or on the scores are implicitly defined, and are extremely difficult to calculate and display even for small $p$. To overcome this difficulty, second-order approximations to these regions have been obtained by Bates and Watts (1981), Hamilton, Watts, and Bates (1982), and Hamilton (1986).

An alternative exact confidence region for partially nonlinear regression model parameters has been described by El-Shaarawi and Shah (1980). Its second-order approximation is derived in this paper and used to compare the two exact regions.

2. THE ALTERNATIVE EXACT REGION

El-Shaarawi and Shah (1980) discuss exact confidence regions for a more general class of model of the form $y_{\theta_1} = X\theta_2 + \epsilon$, where both the "observations" $y_{\theta_1}$ and the matrix $X$ can depend nonlinearly on $\theta_1$. For the partially nonlinear regression model with $y_{\theta_1} = y - g$, their joint region is the same as (1.2) except that occurrences of the linear parameters $\theta_2$ in the derivative matrix $V$ are replaced by the conditional least-squares estimates $\tilde\theta_2 = (X^TX)^{-1}X^T(y - g)$. The same change in Equation (1.1) gives the alternative exact joint confidence region when $\sigma^2$ is known or independently estimated. Geometrically, this modification compares lengths of projections of $e(\theta)$ into and orthogonal to the tangent plane at $\eta(\theta_1, \tilde\theta_2)$ rather than at $\eta(\theta)$, where $\eta(\theta_1, \tilde\theta_2)$ is the point on the expectation surface which is closest to $y$ when $\theta_1$ is fixed.

Halperin (1963) noted that the projection matrix $P(\theta)$ of Equation (1.2) is free of $\theta_2$ if each nonlinear parameter occurs in only one column of $g$ and $X$. In this case the regions given by (1.2), by El-Shaarawi and Shah (1980), and by Halperin (1963) are the same. For the model $f(\theta, x) = 0.49e^{-\theta_1(x-8)} + \theta_2\{1 - e^{-\theta_1(x-8)}\}$ discussed by El-Shaarawi and Shah (1980) and by Draper and Smith (1981, p. 479), the usual and alternative regions are the same even though Halperin's (1963) condition is not satisfied, because $\theta_2$ enters $V$ only as a scale factor for one of the columns.

Some partially nonlinear models for which the two exact confidence regions are not the same are the model due to Khuri (1984),

f(8, X) = + e-'jx) - 02xe-03x

a reparametrization of the asymptotic regression model

$$f(\theta, x) = \theta_1 + \frac{\theta_2}{\theta_1}\, e^{-\theta_3 x},$$

and the simple two-parameter model

$$f(\theta, x) = \theta_1 x + \theta_2 e^{-\theta_1 x}.$$

The last of these models is used to illustrate the two exact regions and their second-order approximations in Section 5.

One reason for preferring the alternative region is that, like the sum-of-squares region, it is quadratic in $\theta_2$ for a given value of $\theta_1$, and this makes it easier to calculate and display than the usual exact region. The left side of (1.1) and (1.2) is $(\theta_2 - \tilde\theta_2)^TX^TX(\theta_2 - \tilde\theta_2) + (y - g)^T\{P(\theta) - P_2(\theta_1)\}(y - g)$, where $P_2(\theta_1) = X(X^TX)^{-1}X^T$ is the projection onto the component of the tangent plane which is spanned by the columns of $X$. The right side of (1.2) involves $\theta_2$ through the sum of squares $e(\theta)^Te(\theta)$, which is $(\theta_2 - \tilde\theta_2)^TX^TX(\theta_2 - \tilde\theta_2) + (y - g)^T\{I - P_2(\theta_1)\}(y - g)$. Therefore the regions (1.1) and (1.2) can be written

$$(\theta_2 - \tilde\theta_2)^TX^TX(\theta_2 - \tilde\theta_2) \le \rho(\theta), \tag{2.1}$$

where $\rho(\theta)$ is $\delta - (y - g)^T\{P(\theta) - P_2(\theta_1)\}(y - g)$ for (1.1) and $(y - g)^T[f\{I - P(\theta)\} - \{P(\theta) - P_2(\theta_1)\}](y - g)$ for (1.2). Replacing $\theta_2$ by $\tilde\theta_2$ in $\rho(\theta)$ makes these regions quadratic in $\theta_2$ for a given choice of $\theta_1$.
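Because the alternative region is quadratic in $\theta_2$ once $\theta_1$ is fixed, its boundary can be traced by solving (2.1) directly. The sketch below does this for the unknown-$\sigma^2$ case with a single linear parameter, assuming the two-parameter model and data of Section 5; the helper name `theta2_interval` is hypothetical.

```python
import numpy as np

# Data and critical value quoted in Section 5.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.0823, 3.4946, 2.8432, 2.6185, 2.4164, 2.8486])
F_CRIT = 6.94  # F(2, 4; 0.05)

def theta2_interval(t1, x, y, f_crit=F_CRIT):
    """Solve (2.1) for theta2 at fixed theta1 (alternative region).

    Model assumed: f = theta1*x + theta2*exp(-theta1*x), so g = theta1*x
    and X is the single column exp(-theta1*x).
    Returns (lower, upper), or None if no theta2 is in the region.
    """
    n, p = len(y), 2
    X = np.exp(-t1 * x)
    r = y - t1 * x                       # y - g
    xtx = X @ X
    t2_tilde = (X @ r) / xtx             # conditional least-squares estimate
    # Derivative matrix evaluated at (theta1, t2_tilde), as in Section 2.
    V = np.column_stack([x - t2_tilde * x * X, X])
    q, _ = np.linalg.qr(V)
    r_P = np.sum((q.T @ r) ** 2)         # (y-g)' P (y-g)
    r_P2 = (X @ r) ** 2 / xtx            # (y-g)' P2 (y-g)
    f = p * f_crit / (n - p)
    rho = f * (r @ r - r_P) - (r_P - r_P2)
    if rho < 0:
        return None
    half = np.sqrt(rho / xtx)
    return t2_tilde - half, t2_tilde + half

print(theta2_interval(0.47671, x, y))
print(theta2_interval(0.9, x, y))
```

Sweeping `t1` over a range of values and collecting the interval endpoints gives the boundary of the alternative region without any grid search in $\theta_2$.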

3. THE SECOND ORDER APPROXIMATION

The second-order approximation for exact confidence regions and sum-of-squares regions has two stages. The first stage approximates the region by an ellipsoid on the tangent plane at $\eta(\hat\theta)$, and the second stage approximately determines the corresponding values of $\theta$. This method allows for accurate approximate regions to be calculated at a fraction of the cost of the actual region. In addition, different approaches to confidence-region construction can be compared through their ellipsoidal tangent-plane approximations.

The approximations depend on the model second derivatives, which are contained in the $n \times p \times p$ array $V_{\cdot\cdot}$, and the extent of nonlinearity is measured by the intrinsic and parameter-effects arrays, $A^N$ and $A^T$ (Bates and Watts 1980) and by the $p \times p$ effective residual curvature matrix $B$ (Hamilton, Watts, and Bates 1982). For the partially nonlinear model, all quantities are partitioned to correspond to the partition in $\theta$. The matrix $B = L^T[e(\hat\theta)^T][V_{\cdot\cdot}]L$, where the square-bracket multiplication results in a $p \times p$ matrix with $ij$th entry equal to the product of $e(\hat\theta)^T$ and the $n \times 1$ vector of second derivatives with respect to $\theta_i$ and $\theta_j$. The $p \times p$ matrix $L$ satisfies $U = \hat VL$, and transforms the natural basis for the tangent plane at $\eta(\hat\theta)$, consisting of the columns of $\hat V = V(\hat\theta) = (\hat V_1, \hat X)$, into the orthonormal basis consisting of the columns of $U = (U_1, U_2)$. Choosing $L$ to be lower triangular causes $U_2$ to contain an orthonormal basis for the space spanned by $\hat X$, and also causes $B_{22}$ to be zero because $V_{22} = 0$.
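To make the construction of $L$ and $B$ concrete, the following sketch computes both for the two-parameter model and data of Section 5, using the estimates reported there. The sign convention for the columns of $U$ is an assumption, chosen so that the lower-right entry of $L$ is negative as in the one legible entry of the printed matrix; the variable names are illustrative.

```python
import numpy as np

# Data and estimates quoted in Section 5.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.0823, 3.4946, 2.8432, 2.6185, 2.4164, 2.8486])
t1, t2 = 0.47671, 4.9865

ex = np.exp(-t1 * x)
e_hat = y - (t1 * x + t2 * ex)          # residual vector e(theta_hat)

# First derivatives: V = (V1, X).
V1 = x - t2 * x * ex
X = ex
V = np.column_stack([V1, X])

# Second derivatives of f = theta1*x + theta2*exp(-theta1*x).
V11 = t2 * x**2 * ex                    # d2f / d theta1^2
V12 = -x * ex                           # d2f / d theta1 d theta2
V22 = np.zeros_like(x)                  # linear parameter: no curvature

# Lower-triangular L with U = V L orthonormal: U2 is X normalised,
# U1 is V1 with its component along X removed, then normalised.
# Signs are an assumption (L11 > 0, L22 < 0).
L22 = -1.0 / np.linalg.norm(X)
coef = (X @ V1) / (X @ X)
L11 = 1.0 / np.linalg.norm(V1 - coef * X)
L21 = -L11 * coef
L = np.array([[L11, 0.0], [L21, L22]])
U = V @ L

# [e'][V..]: inner products of the residual with each second-derivative vector.
EV = np.array([[e_hat @ V11, e_hat @ V12],
               [e_hat @ V12, e_hat @ V22]])
B = L.T @ EV @ L

print(np.round(L, 4))
print(np.round(B, 4))
```

With this sign choice the computed $B$ agrees, to the precision of the reported estimates, with the effective residual curvature matrix printed in Section 5.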


The ellipsoidal tangent-plane approximations have the form

$$\tau^T C \tau \le \rho, \tag{3.1}$$

where $C$ and $\rho$ depend on the type of region and $\tau$ are transformed coordinates given by $\tau = U^T\{\eta(\theta) - \eta(\hat\theta)\}$. These approximations were derived using a quadratic approximation to the expectation surface and the resulting approximate expressions for the error vector

$$e(\theta) = e(\hat\theta) - U\tau - N(\tau^T A^N \tau)/2 \tag{3.2}$$

and the projection matrix

$$P(\theta) = (U + NM)(I + M^TM)^{-1}(U + NM)^T. \tag{3.3}$$

In these equations, the columns of $N$ form an orthonormal basis for the subspace orthogonal to the tangent plane at $\eta(\hat\theta)$, and $M = A^N\tau$ is an $(n - p) \times p$ matrix. Substituting (3.2) and (3.3) into (1.1) and (1.2), and eliminating terms in $\tau$ of order greater than two, we obtain the ellipsoidal approximations $\tau^T(I - B)^2\tau \le \delta$ and $\tau^T(I - B)\{I - (1 + f)B\}\tau \le f\,e(\hat\theta)^Te(\hat\theta)$ for the usual exact regions. Values of $\tau$ on the boundaries of these ellipsoids can be readily obtained as the image of a sphere under a linear transformation. An all-purpose approximation to the mapping from $\tau$ to $\theta$ is the quadratic Taylor series approximation $\theta = \hat\theta + L(\tau - \tau^TA^T\tau/2)$ described by Bates and Watts (1981). The linear approximation ignores second-derivative information; it uses $C = I$ in the first stage, and the linear transformation $\theta = \hat\theta + L\tau$ in the second.
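The remark that boundary points are the image of a sphere under a linear transformation can be illustrated as follows. This is a sketch for $p = 2$ only; it uses a Cholesky factor as the linear map (one of several valid choices, and an assumption here), together with the matrix $(I - B)\{I - (1 + f)B\}$, $f = 3.47$, and $s^2 = 0.0174$ on 4 degrees of freedom from Section 5. The function name is hypothetical.

```python
import numpy as np

def ellipse_boundary(C, rho, m=200):
    """Return a 2 x m array of points tau with tau' C tau = rho.

    C must be symmetric positive definite. With C = R R' (R lower
    triangular), the map u -> sqrt(rho) * inv(R') u sends the unit
    circle onto the boundary of the ellipse.
    """
    C = np.asarray(C, dtype=float)
    R = np.linalg.cholesky(C)
    angles = np.linspace(0.0, 2.0 * np.pi, m, endpoint=False)
    circle = np.vstack([np.cos(angles), np.sin(angles)])
    return np.sqrt(rho) * np.linalg.solve(R.T, circle)

# Matrix for the usual region (unknown sigma^2), as printed in Section 5.
C_usual = np.array([[2.2027, 0.2747],
                    [0.2747, 1.0085]])
rho = 3.47 * 4 * 0.0174        # f * e'e, with e'e = (n - p) * s^2

tau = ellipse_boundary(C_usual, rho)
print(tau.shape)
```

Each column of `tau` can then be mapped to the parameter space with the linear or quadratic approximation described above, given $L$ and $A^T$.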

Similar approximations for the alternative exact regions are obtained by incorporating the restriction $\theta_2 = \tilde\theta_2$ into the above derivation. The parameter components $\theta_1$ and $\tilde\theta_2$ together satisfy the last $p_2$ normal equations $X^T\{y - \eta(\theta_1, \tilde\theta_2)\} = 0$, and a first-order Taylor series in $\tau$ produces a similar constraint in the tangent-plane coordinates

$$\tau_2 = B_{21}\tau_1. \tag{3.4}$$

The constant term in the expansion is $\hat X^Te(\hat\theta) = 0$, and using the chain rule and the fact that $d\tau/d\theta\,|_{\hat\theta} = L^{-1}$, the first derivative is

$$\frac{d}{d\tau}X^T\{y - \eta(\theta)\}\Big|_{\hat\theta} = \{-\hat X^T\hat V + ([e(\hat\theta)^T][V_{21}],\ 0)\}L = -\{(0,\ \hat X^TU_2) - ([e(\hat\theta)^T][V_{21}]L_{11},\ 0)\}.$$

The Taylor series expansion is obtained by postmultiplying by $\tau$, and equating to zero and premultiplying by $L_{22}^T$ gives (3.4) after rearranging. An approximation to the modified projection matrix is obtained by substituting (3.4) into (3.3). Combining with (3.2) and eliminating terms in $\tau$ of order greater than two results in the approximation

$$\tau^T(I - BF)^T(I - BF)\tau \le \delta$$

for the case where $\sigma^2$ is known or independently estimated, and

$$\tau^T\{(1 + f)(I - BF)^T(I - BF) - f(I - B)\}\tau \le f\,e(\hat\theta)^Te(\hat\theta)$$

for the case where $\sigma^2$ is eliminated using division, where

$$F = \begin{bmatrix} I & 0 \\ B_{21} & 0 \end{bmatrix}.$$

The same ellipsoidal approximations are obtained directly by expanding (1.1) or (1.2) as quadratic Taylor series in $\tau$, using the multivariate chain rule and inverse-function theorem, and taking into account the constraint $\theta_2 = \tilde\theta_2$.
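The four ellipsoid matrices of this section can be assembled from $B$ alone. The sketch below assumes that $F$ is the $p \times p$ matrix with blocks $(I, 0)$ over $(B_{21}, 0)$, which maps $\tau$ to $(\tau_1, B_{21}\tau_1)$ in line with constraint (3.4); that form is a reconstruction, and it reproduces the matrices printed in Section 5. The function name is hypothetical.

```python
import numpy as np

def ellipsoid_matrices(B, p1, f):
    """Matrices C of the tangent-plane ellipsoids tau' C tau <= rho.

    B  : p x p effective residual curvature matrix (B22 = 0 assumed)
    p1 : number of nonlinear parameters
    f  : p * F(p, n - p; alpha) / (n - p)
    """
    B = np.asarray(B, dtype=float)
    p = B.shape[0]
    I = np.eye(p)
    # F maps tau to (tau1, B21 tau1); assumed form, see lead-in.
    F = np.zeros((p, p))
    F[:p1, :p1] = np.eye(p1)
    F[p1:, :p1] = B[p1:, :p1]
    IBF = I - B @ F
    return {
        "usual_known": (I - B) @ (I - B),
        "usual_unknown": (I - B) @ (I - (1.0 + f) * B),
        "alt_known": IBF.T @ IBF,
        "alt_unknown": (1.0 + f) * (IBF.T @ IBF) - f * (I - B),
    }

# B and f from Section 5.
B = np.array([[-0.1891, -0.0435],
              [-0.0435, 0.0]])
mats = ellipsoid_matrices(B, p1=1, f=3.47)
for name, C in mats.items():
    print(name)
    print(np.round(C, 4))
```

Having all four matrices in one place makes it easy to compare the usual and alternative regions direction by direction.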

4. DISCUSSION

General insight into the effect of replacing the linear parameters by their conditional estimates in the projection matrices is obtained by comparing the approximating tangent-plane ellipsoids for the usual and alternative exact confidence regions. Because of the nonlinearity of the mapping from $\tau$ to $\theta$, regions of equal size on the tangent plane do not map into regions of equal size in the parameter space. Therefore some differences in these ellipsoids will be magnified and others will be diminished by the transformation. Regions of overlap of the two tangent-plane ellipsoids will be preserved by the nonlinear mapping, however.

If $B_{12} = 0$, the two sets of ellipsoids are the same, because $BF = B$. This will be true if the model satisfies Halperin's condition that each nonlinear parameter appears in only one column of $g$ and $X$. As mentioned in Section 2, the usual and alternative exact regions are also identical in this case. For these models a second partial derivative with respect to one linear and one nonlinear parameter is directly proportional to the first partial derivative with respect to the nonlinear parameter. Therefore the second-derivative vector is entirely in the tangent plane at $\eta(\hat\theta)$ and orthogonal to the residual vector, so the corresponding entry in $B$ is zero.

The two sets of ellipsoids are the same in the subspace determined by the constraint $\tau_2 = B_{21}\tau_1$, because $F\tau = \tau$ for these $\tau$. As discussed in Section 3, this subspace corresponds to the condition $\theta_2 = \tilde\theta_2$. An examination of Equation (2.1) shows that the exact confidence regions are also identical under this condition and the values of $\theta_1$ in the region satisfy $(y - g)^T\{P(\theta_1, \tilde\theta_2) - P_2(\theta_1)\}(y - g) \le \delta$ and $(y - g)^T\{P(\theta_1, \tilde\theta_2) - P_2(\theta_1)\}(y - g) \le f\,(y - g)^T\{I - P(\theta_1, \tilde\theta_2)\}(y - g)$ respectively. The latter region is identical to the exact region for the subset of nonlinear parameters, given by El-Shaarawi and Shah (1980), except for the value of $f$, which involves $p$ instead of $p_1$.

When $\tau_1 = 0$, corresponding to $\theta_1 = \hat\theta_1$, the ellipsoids simplify to $\tau_2^T(I + B_{21}B_{12})\tau_2 \le \epsilon$ for the usual region, and to $\tau_2^T\tau_2 \le \epsilon$ for the alternative region, where $\epsilon = \delta$ when $\sigma^2$ is known or independently estimated and $\epsilon = f\,e(\hat\theta)^Te(\hat\theta)$ otherwise. Therefore the alternative region has a larger ellipsoidal approximation in this subspace, unless $B_{12} = 0$, and the ratio of the volumes is $|I + B_{21}B_{12}|^{1/2}$.

Hamilton, Watts, and Bates (1982) examined the eigenvalues of the matrix $B$ to assess differences between the usual exact confidence region, the sum-of-squares region, and the linear-approximation region. The matrix $C$ from (3.1) has the same eigenvectors as $B$ for each of these regions, so the differences in eigenvalues of $C$ reflect differences in the axis lengths of the ellipsoids. In contrast, the matrices in the ellipsoidal approximations for the alternative exact regions do not have the same eigenvectors, and neither has the same eigenvectors as $B$ unless $B_{12}v_{2i} = 0$ for $i = 1, \ldots, p$, where $v_i^T = (v_{1i}^T, v_{2i}^T)$ is the $i$th eigenvector of $B$. In the direction $v_i$, the length of the alternative region ellipsoid is shorter than the length of the ellipsoid for the usual region if

$$\lambda_i > -\frac{v_{2i}^TB_{21}B_{12}v_{2i}}{2\,v_{2i}^Tv_{2i}},$$

where $\lambda_i$ is the eigenvalue associated with $v_i$. In particular, the alternative ellipsoid will be shorter in directions corresponding to positive eigenvalues. If there is only one linear and one nonlinear parameter, $v_{2i}$ vanishes from the right-hand side of this inequality.
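The eigenvalue rule can be checked numerically against a direct comparison of the two quadratic forms. The sketch below does so for the known-$\sigma^2$ ellipsoids, using the matrix $B$ of Section 5; it assumes $F$ has blocks $(I, 0)$ over $(B_{21}, 0)$ and that every eigenvector has a nonzero linear component $v_{2i}$. The function name is hypothetical.

```python
import numpy as np

def compare_along_eigenvectors(B, p1):
    """For each eigenvector v_i of B, report whether the alternative
    ellipsoid is shorter than the usual one in that direction.

    Returns a list of (eigenvalue, by_rule, by_forms), where by_rule
    applies the eigenvalue inequality and by_forms compares
    v' C_alt v with v' C_usual v directly (sigma^2 known case).
    """
    B = np.asarray(B, dtype=float)
    p = B.shape[0]
    I = np.eye(p)
    F = np.zeros((p, p))                 # assumed form of F, see lead-in
    F[:p1, :p1] = np.eye(p1)
    F[p1:, :p1] = B[p1:, :p1]
    C_usual = (I - B) @ (I - B)
    C_alt = (I - B @ F).T @ (I - B @ F)
    lam, vecs = np.linalg.eigh(B)        # eigenvalues in ascending order
    B12 = B[:p1, p1:]
    rows = []
    for i in range(p):
        v = vecs[:, i]
        v2 = v[p1:]
        bound = -np.sum((B12 @ v2) ** 2) / (2.0 * (v2 @ v2))
        by_rule = lam[i] > bound
        # Half-length along v is sqrt(rho / v'Cv): a larger form is shorter.
        by_forms = (v @ C_alt @ v) > (v @ C_usual @ v)
        rows.append((float(lam[i]), bool(by_rule), bool(by_forms)))
    return rows

B = np.array([[-0.1891, -0.0435],
              [-0.0435, 0.0]])
rows = compare_along_eigenvectors(B, p1=1)
for lam_i, by_rule, by_forms in rows:
    print(round(lam_i, 4), by_rule, by_forms)
```

For this example the rule and the direct comparison agree: the alternative ellipsoid is longer along the eigenvector with the negative eigenvalue and shorter along the other.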


FIGURE 1: Usual and alternative exact 95% regions ($\theta_2$, from 4.4 to 5.6, plotted against $\theta_1$, from 0.30 to 0.65; solid line: usual, dashed line: alternative).

For the case where $\sigma^2$ is known or independently estimated, the volume of both approximating ellipsoids is the same and proportional to $|I - B_{11} - B_{12}B_{21}|^{-1}$, as can be verified using the formula for the determinant of partitioned matrices. Since the ellipsoidal approximation for the alternative region is larger than that for the usual region when $\tau_1 = 0$, it must be smaller in some other direction. When $\sigma^2$ is unknown, the volume of the ellipsoid is proportional to $|I - B_{11} - B_{12}B_{21}|^{-1/2}\,|I - (1 + f)B_{11} - (1 + f)B_{12}B_{21}|^{-1/2}$ for the alternative region and $|I - B_{11} - B_{12}B_{21}|^{-1/2}\,|I - (1 + f)B_{11} - (1 + f)^2B_{12}B_{21}|^{-1/2}$ for the usual region. These are different unless $B_{12} = 0$. When there is only one linear and one nonlinear parameter the approximating ellipsoid for the alternative exact confidence region will be smaller than that for the usual exact confidence region.
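The volume formulas for the unknown-$\sigma^2$ case can be evaluated directly from the blocks of $B$. The following sketch does this for the matrix $B$ and the value $f = 3.47$ of Section 5; the function name is hypothetical, and the results are proportionality factors, not volumes in parameter units.

```python
import numpy as np

def volume_factors_unknown_sigma(B, p1, f):
    """Return (usual, alternative) volume factors when sigma^2 is unknown,
    from the partitioned-determinant formulas in the text."""
    B = np.asarray(B, dtype=float)
    B11, B12, B21 = B[:p1, :p1], B[:p1, p1:], B[p1:, :p1]
    I1 = np.eye(p1)
    cross = B12 @ B21
    d0 = np.linalg.det(I1 - B11 - cross)
    d_alt = np.linalg.det(I1 - (1.0 + f) * B11 - (1.0 + f) * cross)
    d_usual = np.linalg.det(I1 - (1.0 + f) * B11 - (1.0 + f) ** 2 * cross)
    return (d0 * d_usual) ** -0.5, (d0 * d_alt) ** -0.5

B = np.array([[-0.1891, -0.0435],
              [-0.0435, 0.0]])
vol_usual, vol_alt = volume_factors_unknown_sigma(B, p1=1, f=3.47)
print(round(vol_usual, 4), round(vol_alt, 4))
```

The two factors differ only through the coefficient of $B_{12}B_{21}$, so they coincide whenever $B_{12} = 0$.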

In summary, the two types of approximate regions differ to the extent that $B_{12}$ is not zero. The matrix $B_{12}$ measures the component of the second derivative vectors from the cross partial second derivatives in the residual direction. These components are zero for models satisfying the condition of Halperin (1963).

5. EXAMPLE

Six observations were simulated for the model

$$f(\theta, x) = \theta_1 x + \theta_2 e^{-\theta_1 x}$$

using $\theta_1 = 0.5$, $\theta_2 = 5.0$, and $\sigma = 0.1$. The data, $x^T = (0, 1, 2, 3, 4, 5)$ and $y^T = (5.0823, 3.4946, 2.8432, 2.6185, 2.4164, 2.8486)$, give parameter estimates $\hat\theta_1 = 0.47671$, $\hat\theta_2 = 4.9865$, and $s^2 = 0.0174$. The parameter transformation matrix

$$L = \begin{bmatrix} \ldots & \ldots \\ \ldots & -0.7852 \end{bmatrix}$$

and effective residual curvature matrix

$$B = \begin{bmatrix} -0.1891 & -0.0435 \\ -0.0435 & 0 \end{bmatrix}$$

were obtained from the model first and second derivatives evaluated at $\hat\theta$. The analysis of the previous section indicates that the two ellipsoidal approximations have the same length in the direction $\tau_2 = -0.0435\tau_1$. The alternative ellipsoid is longer when $\tau_1 = 0$, but the ratio of the lengths is $\{1 + (-0.0435)^2\}^{1/2} = 1.0009$, indicating there is no practical difference. The eigenvalues of $B$ are $-0.1986$ and $0.0095$, so the alternative ellipsoid is longer than the usual ellipsoid in the direction of the first eigenvector $v_1 = (-0.9768, -0.2139)^T$ and shorter in the direction of the second eigenvector $v_2 = (0.2139, -0.9768)^T$. Using $f = (2/4)(6.94) = 3.47$, the matrices governing the ellipsoidal tangent-plane approximations are

$$(I - B)^T\{I - (1 + f)B\} = \begin{bmatrix} 2.2027 & 0.2747 \\ 0.2747 & 1.0085 \end{bmatrix}$$

and

$$(1 + f)(I - BF)^T(I - BF) - f(I - B) = \begin{bmatrix} 2.1826 & 0.0435 \\ 0.0435 & 1.0000 \end{bmatrix}.$$

FIGURE 2: Second-order approximate 95% regions ($\theta_2$, from 4.4 to 5.6, plotted against $\theta_1$, from 0.30 to 0.65; legend entries include the alternative region and the linear approximation).

The volumes of the usual and alternative approximate regions are proportional to 0.6826 and 0.6772 respectively. From these calculations we expect there to be little difference between the two exact regions. The $2 \times 2 \times 2$ parameter-effects array is

$$A^T = \left[\ \begin{bmatrix} 0.501 & -0.012 \\ -0.012 & 0 \end{bmatrix},\ \begin{bmatrix} -0.438 & -0.159 \\ -0.159 & 0 \end{bmatrix}\ \right],$$

and the second-order approximations to the two regions, obtained by reexpressing the tangent-plane ellipsoids using the quadratic Taylor series, are shown in Figure 2. The ellipsoidal linear-approximation region is also shown.

The exact usual and alternative 95% confidence regions, shown in Figure 1, confirm the approximate calculations. Overall there is little difference between the two regions, and the alternative region is slightly smaller. The inexpensive second-order approximations are similar to the exact regions, while the linear-approximation region is not. Calculation of the exact regions was extremely time-consuming. For the usual exact region a separate regression of $e(\theta)$ on $V$ was done over a fine grid of $\theta$-values, and the F-ratio was calculated. A contour-plotting function found and plotted the values of $\theta$ in the grid for which $F = F(2, 4; 0.05) = 6.94$. Points on the alternative exact region were obtained by solving the quadratic equation (2.1) for a range of $\theta_1$-values.
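The grid calculation for the usual exact region might look like the sketch below: one regression of $e(\theta)$ on $V$ per grid point, followed by a comparison of the F-ratio with 6.94. The grid limits are taken from the axis ranges of the figures and the grid spacing is an arbitrary assumption; the function name `f_ratio` is hypothetical, and contour plotting is left out.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([5.0823, 3.4946, 2.8432, 2.6185, 2.4164, 2.8486])
F_CRIT = 6.94  # F(2, 4; 0.05)

def f_ratio(theta, x, y):
    """F-ratio for the linear regression of e(theta) on the columns of V."""
    t1, t2 = theta
    ex = np.exp(-t1 * x)
    e = y - (t1 * x + t2 * ex)
    V = np.column_stack([x - t2 * x * ex, ex])
    coef, *_ = np.linalg.lstsq(V, e, rcond=None)
    fitted = V @ coef
    n, p = V.shape
    ss_reg = fitted @ fitted
    ss_res = (e - fitted) @ (e - fitted)
    return (ss_reg / p) / (ss_res / (n - p))

# Grid over the plotting window; the spacing is an assumption.
t1_grid = np.linspace(0.30, 0.65, 71)
t2_grid = np.linspace(4.4, 5.6, 61)
F = np.array([[f_ratio((a, b), x, y) for a in t1_grid] for b in t2_grid])
inside = F <= F_CRIT       # grid points in the usual exact 95% region

print(inside.sum(), "of", inside.size, "grid points lie in the region")
```

The array `F` can be passed to any contouring routine at level 6.94 to draw the boundary; each grid point costs one small least-squares fit, which is why the exact region is far more expensive than its second-order approximation.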

ACKNOWLEDGEMENT

The author acknowledges the support of the Natural Sciences and Engineering Research Council of Canada.

REFERENCES

Bates, D.M., and Watts, D.G. (1980). Relative curvature measures of nonlinearity (with discussion). J. Roy. Statist. Soc. Ser. B, 42, 1-25.

Bates, D.M., and Watts, D.G. (1981). Parameter transformations for improved approximate confidence regions in nonlinear least squares. Ann. Statist., 9, 1152-1167.

Beale, E.M.L. (1960). Confidence regions in nonlinear estimation (with discussion). J. Roy. Statist. Soc. Ser. B, 22, 41-88.

Cox, D.R., and Hinkley, D.V. (1974). Theoretical Statistics. Chapman and Hall, London.

Draper, N., and Smith, H. (1981). Applied Regression Analysis. Second Edition. Wiley, New York.

El-Shaarawi, A., and Shah, K.R. (1980). Interval estimation in nonlinear models. Sankhyā Ser. B, 42, 229-232.

Halperin, M. (1963). Confidence interval estimation in nonlinear regression. J. Roy. Statist. Soc. Ser. B, 25, 330-333.

Hamilton, D.C. (1986). Confidence regions for parameter subsets in nonlinear regression. Biometrika, 73, 57-64.

Hamilton, D.C., Watts, D.G., and Bates, D.M. (1982). Accounting for intrinsic nonlinearity in nonlinear regression parameter inference regions. Ann. Statist., 10, 386-393.

Hartley, H.O. (1964). Exact confidence regions for the parameters in nonlinear regression laws. Biometrika, 51, 341-353.

Johansen, S. (1983). Some topics in regression. Scand. J. Statist., 10, 161-194.

Khuri, A.I. (1984). A note on D-optimal designs for partially nonlinear regression models. Technometrics, 26, 59-61.

Received 31 August 1985
Revised 6 August 1986
Accepted 4 December 1986

Department of Mathematics, Statistics and Computing Science
Dalhousie University
Halifax, Nova Scotia B3H 4H8
