Constrained Non-Linear Regularization with Application to Some System Identification Problems
Finbarr O'Sullivan¹
Department of Statistics
University of California
Berkeley, CA 94720

Technical Report No. 99
June 1987
¹ Research supported in part by the National Science Foundation under Grant No. MCS-840-3239. Some of the work was done during my stay at the Institute for Mathematics and its Applications, University of Minnesota, in January 1987.
ABSTRACT
The paper studies consistency properties for constrained method of regularization estimators of θ in the abstract non-linear regression model

    z_i = η(θ; x_i) + ε_i,   i = 1, 2, ....

The η(θ; x) are non-linear functionals of θ, an element of a Hilbert space Θ, and the ε_i are independent measurement errors with mean zero and bounded variance. Several inverse or system identification problems associated with the determination of functional parameters in differential operators can be formulated in this manner. Conditions are given under which the asymptotic properties of method of regularization estimators may be approximated by linearized estimators obtained via the Gauss-Newton algorithm. Rates of convergence of the linearized estimators determine the rates of convergence of the non-linear estimators. The degree of non-linearity in the functionals η(·; ·) plays an important role.

AMS 1980 subject classifications. Primary 62G05; secondary 62J05, 41A35, 41A25, 47A53, 45L10, 45M05.
Key words and phrases. Inverse problems, non-linear regression, regularization, constraints, rates of convergence, system identification.
Running Head: Constrained Non-Linear Regularization
1. Introduction
The 1-dimensional heat equation with variable conductivity and insulated boundary has
the form
    ∂u(x,t)/∂t − ∂/∂x [ θ(x) ∂u(x,t)/∂x ] = f(x,t),   x ∈ [0,1] and t ∈ [0,T],

subject to

    u(x,0) = u₀(x),   x ∈ [0,1],
    ∂u/∂x (0,t) = ∂u/∂x (1,t) = 0,   t ∈ [0,T].   (1.1)
If the conductivity θ is known and the initial and forcing terms u₀ and f are specified in appropriate function spaces, the equation can be solved (at least numerically) to obtain a temperature profile u(·,t) at any positive time t. An inverse system identification problem is to determine the conductivity given measured information about u. If u were observed continuously in time and space without error, then θ could be found by integrating the relation

    ∂/∂x [ θ(x) ∂u(x,t)/∂x ] = ∂u(x,t)/∂t − f(x,t)   (1.2)

and using the boundary condition ∂u/∂x (0,t) = 0. This gives

    θ(x) ∂u(x,t)/∂x = ∫₀ˣ [ ∂u(s,t)/∂t − f(s,t) ] ds.   (1.3)
Thus if the spatial temperature gradients are non-zero almost everywhere and θ is continuous, then (1.3) uniquely determines θ. It is easy to appreciate that the best choices for f and u₀ are ones which force large temperature gradients. If f is zero then for large time the temperature becomes essentially constant, so the information at later times is not as useful as information at earlier times. Time series analysts usually recommend the use of white noise input for the identification of linear transfer function models; see chapter 11 of Box and Jenkins[3], for example. In the present context this corresponds to taking a very variable u₀. In practice the information gathered about u is incomplete, so that some form of regularization is needed to estimate θ.
1.1. Constrained Non-Linear Least Squares Regularization
To estimate θ using the data

    z_ij = u(x_i, t_j) + ε_ij,   i = 1, 2, ..., n; j = 1, 2, ..., m,   (1.4)

where the x_i are in [0,1], the t_j are in [0,T] and the ε_ij's are random errors, is a classical ill-posed inverse problem. For a regularized solution we let

    l_nλ(θ) = (1/nm) Σ_{j=1}^m Σ_{i=1}^n [z_ij − u(θ; x_i, t_j)]² + λJ(θ),   λ > 0,   (1.5)

where J is a penalty functional designed to be large for physically unreasonable conductivity profiles. u(θ; ·, ·) solves the heat equation (1.1) with conductivity given by θ; this is a non-linear functional of θ. The estimation criterion is to choose θ in some linear function space Θ so as to minimize l_nλ subject to the constraint that θ be positive:

    θ̂_nλ = argmin_{θ ∈ C} l_nλ(θ),   (1.6)

where C is the subset of Θ consisting of positive functions. We call (1.6) a constrained non-linear least squares regularization method.
An Illustration
Simulated data were generated according to the model in (1.4). The parameters were set as follows: θ(x) = 0.5 + x − 5x⁴ + 6x⁵ − 2x⁶, u₀(x) = 10 + 270x² − 180x³, and the forcing term f is zero. There were n = 20 measurement sites (x_i = i/21, i = 1, 2, ..., 20) with m = 50 observations at each site (t_j = j/101, j = 1, 2, ..., 50). Gaussian noise was added with mean zero and standard deviation 3. Time series plots of the data at different sites and spatial plots of the data at different times are given in Figure 1.1. Since there is no forcing term the later time points are likely to provide little or no information about θ. Notice how the spatial plots flatten out in time.
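A data set of the form (1.4) with the stated θ, u₀ and noise level can be generated as follows. The explicit, conservative finite-difference scheme and its grid sizes are assumptions made here for illustration; the paper does not specify how the forward problem was solved.

```python
import numpy as np

# Generate data of the form (1.4) for the illustration above.
rng = np.random.default_rng(0)
theta = lambda s: 0.5 + s - 5*s**4 + 6*s**5 - 2*s**6     # conductivity
u0 = lambda s: 10 + 270*s**2 - 180*s**3                  # initial profile

J = 101
x = np.linspace(0.0, 1.0, J)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / theta(x).max()          # stable explicit time step
th_half = theta((x[:-1] + x[1:]) / 2)      # conductivity at cell interfaces
u = u0(x).copy()

x_obs = np.arange(1, 21) / 21.0            # x_i = i/21
t_obs = np.arange(1, 51) / 101.0           # t_j = j/101
data = np.zeros((20, 50))
t, j_next = 0.0, 0
while j_next < 50:
    flux = th_half * np.diff(u) / dx       # theta(x) du/dx; zero at the ends
    u[1:-1] += dt * np.diff(flux) / dx
    u[0]    += dt * flux[0] / dx           # insulated boundary cells
    u[-1]   -= dt * flux[-1] / dx
    t += dt
    if t >= t_obs[j_next]:
        data[:, j_next] = np.interp(x_obs, x, u)
        j_next += 1

z = data + rng.normal(0.0, 3.0, size=data.shape)   # noisy observations (1.4)
```

Because the scheme is written in flux form, total heat is conserved exactly and the spatial profiles of `data` flatten toward a constant in time, matching the qualitative behavior described above.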
For numerical computation θ is expanded in a p-dimensional B-spline basis, θ(x) = Σ_{j=1}^p θ_j B_j(x), and the coefficients θ_j are determined to minimize the non-linear regularization criterion l_nλ subject to the coefficients being positive. The penalty functional was taken to be the Laplacian penalty J(θ) = ∫ [θ″(x)]² dx used in cubic smoothing splines. In terms of the B-spline coefficients θ = (θ₁, θ₂, ..., θ_p)′ the penalty is θ′Ωθ, where Ω_jk = ∫ B″_j(x) B″_k(x) dx. A constrained Gauss-Newton algorithm is used to obtain the solution; see O'Sullivan and Wong[26]. The results are presented in Figure 1.2. There are three curves corresponding to:
(i) the true conductivity profile, (ii) an estimate obtained by selecting λ to minimize the predictive mean square error (optimal MSE estimate)

    R(λ) = (1/nm) Σ_{j=1}^m Σ_{i=1}^n [u(θ₀; x_i, t_j) − u(θ̂_nλ; x_i, t_j)]²,   (1.7)

and (iii) an estimate obtained by choosing λ to minimize the generalized cross-validation type criterion (GCV estimate)

    V(λ) = Σ_{j=1}^m Σ_{i=1}^n [z_ij − u(θ̂_nλ; x_i, t_j)]² / ( trace[ I − X[X′X + nmλΩ]⁻¹X′ ] )²,   (1.8)

where X is the linearized design matrix, with entries X_(ij),k = ∂u(θ; x_i, t_j)/∂θ_k evaluated at θ̂_nλ; see O'Sullivan and Wahba[24] for another use of a similar cross-validation function.
The example raises a number of questions: (i) Is the estimation technique consistent? In
what sense? (ii) Does the theory of cross-validation for linear estimators carry over to this
situation? (iii) Is there an optimal design theory for these sorts of experiments?
1.2. General System Identification and an Abstract Non-linear Regression Model
Inverse problems associated with the identification of functional parameters in partial
differential equations arise in several fields including diffraction tomography, reservoir
engineering and seismology. Some recent references are [7, 8, 10, 11, 14, 15, 16, 18, 21, 27, 30]. A generalized framework for a large class of these problems is the following (see §2 of Kravaris and Seinfeld[18]): U, F and Θ are function spaces, and there is an operator A such that for each f in F and θ in C ⊂ Θ there is a locally unique u in U satisfying

    A(u, θ) = f.   (1.9)

Now let X_n be a linear mapping on U, X_n: U → Rⁿ. We assume that f is known and we are given a vector of measurements z_n,

    z_n = X_n u + ε_n,   (1.10)

where ε_n is a vector of i.i.d. mean zero, finite variance random variables. With these data we wish to estimate the functional parameter θ.
The problem can be put in the form of an abstract non-linear regression problem. Equation (1.9) gives an expression for u which in general is non-linear in f and θ. With f fixed, the measurements in (1.10) can thus be regarded as non-linear functionals of θ. We write the data as

    z_i = η(θ; x_i) + ε_i,   i = 1, 2, ..., n,   (1.11)

where the design points x_i lie in some index set I and, for each x in I, η(·; x): Θ → R is a non-linear functional of θ (for the data in (1.4), I ⊂ [0,1]×[0,T]). The ε_i's are measurement errors. Applying a constrained non-linear least squares regularization method to these data gives the estimation criterion

    θ̂_nλ = argmin_{θ ∈ C} { (1/n) Σ_{i=1}^n [z_i − η(θ; x_i)]² + λJ(θ) },   λ > 0,

where C is a subset of Θ and J is an appropriate penalty functional. In some circumstances it would be of interest to replace the least squares fitting by a more general M-type estimation criterion. This is not considered here. We make the following basic assumptions.
Assumption A.1.
(i) Let F_n be the empirical cumulative distribution function (c.d.f.) of the x_i's and let F be the limiting c.d.f. F has a density which is bounded away from zero and infinity. If

    k_n = sup_{x ∈ I} |F_n(x) − F(x)|,

then for some C > 0 and all n sufficiently large, k_n ≤ C n^{−1/2}.
(ii) The ε_i's have mean zero and variance bounded by σ². S_n is the partial sum process of the measurement errors,

    S_n(x) = n⁻¹ Σ_{i : x_i ≤ x} ε_i,

where x ≤ y means all the components of x are less than or equal to the corresponding components of y. Let s_n = sup_{x ∈ I} |S_n(x)|; then s_n = O_p(n^{−1/2}).
(iii) Θ is a Hilbert space with inner product <·,·>. For each x ∈ I, η(·; x) is three times Fréchet differentiable. J is also three times Fréchet differentiable.

If the x_i's are random then A.1(i) is straightforward. Kolmogorov's inequality gives A.1(ii) in the case that the ε_i's are independent. A.1(iii) is needed to analyze the estimators.
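A quick numerical look at A.1(i): for an equispaced design on [0,1] with limiting c.d.f. F(x) = x, the sup-distance k_n is in fact of order n⁻¹, comfortably inside the required C n^{−1/2} bound. The design x_i = i/(n+1) is an assumption made for illustration.

```python
import numpy as np

def k_n(n):
    # equispaced design x_i = i/(n+1); limiting c.d.f. F(x) = x
    x = np.arange(1, n+1) / (n+1)
    i = np.arange(1, n+1)
    # F_n jumps by 1/n at each x_i, so the sup of |F_n - F| is attained
    # at (or just before) a jump point
    return np.max(np.maximum(np.abs(i/n - x), np.abs((i-1)/n - x)))
```

For this design k_n = 1/(n+1), so the n^{−1/2} requirement of A.1(i) holds with plenty to spare.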
1.3. Outline
Regularization makes the inverse problem well-posed in the sense of Tikhonov[33]. To verify this there have been a number of theoretical investigations related to uniqueness (identifiability) and continuous data dependence of the method; see for example [7, 8, 18]. Sharper approximation theoretic and statistical stability results must also be possible. Thus if the true parameter is assumed to lie in a particular function space then we ought to be able to give some description of the degree of approximation associated with the regularization method as a function of n and λ. This was the motivation for our investigation. In practice these kinds of results provide some insight into how much data are required in order to get a desired degree of resolution. Asymptotic linear approximations for the constrained regularization estimator are developed. The linearizations are obtained using Gauss-Newton rather than Newton-Raphson expansions. The justification for the linearizations uses various conditions on the smoothness of the data functionals η(·; x). Our results provide some justification for the use of the Backus-Gilbert[2] averaging kernel calculus in non-linear situations,[25].
The method of analysis owes much to the techniques developed by Cox[13] for the approximation of linear method of regularization estimators. The paper begins by reviewing this theory. General Newton-Raphson and Gauss-Newton approaches to the analysis of non-linear estimators are described in §2; Gauss-Newton linearizations are more useful for our purposes. §3 has results on the rates of convergence for the linearized estimators, and in §4 we verify that these results are true for certain classes of Hammerstein integral operators. In §5 we look at some system identification problems and develop a convergence result for the time invariant version of (1.1). We conjecture that similar results are available for the time dependent parabolic case and provide some numerical evidence to support this. The final section theoretically justifies the asymptotic reliability of the Gauss-Newton expansion. There are a number of interesting directions for future work. We have begun to look at parabolic and hyperbolic problems in a more systematic theoretical fashion. Hyperbolic equations arise in inverse scattering and diffraction tomography, for example.
1.4. Some Notation
Norms and inner products are subscripted except for < , > which denotes the inner pro-
duct on 89 and ( , ) which is used for the L2 inner product. The symbol D is used for
differentiation. The notations < and = are used extensively: If U is a metric space then
g (u) c f (u) means that there is a constant K > 0 such that g(u) . K f (u) for all u whose
norn is less than unity. g(u) = f (u) if f (u) < g(u) and g(u) <f (u). If V(u) is some set
of quantities associated with u then g (u ,v) < f (u ,v) unifonnly for v e V (u) means
g (u,v ) s K f (u,v ) for v e V(u ) and for u whose norm is less than unity. g (u,v ) - f (u,v )
uniformly for v e V(u) if f (u,v) < g(u,v) and g(u,v) < f (u,v) both uniformly for
V E V(u).
- 8 -
The notations < and - are also used for sequences. If u -* uo in U then f (u) < g (u) as
u -o uo if there is a constant K > 0 such that in a neighborthood of u0, f (u ) < K g(u).
f (u ) = g (u ) and the idea of unifonnity are defined as one would expect.
Acknowledgements
Computing was primarily carried out on the Cray X-MP at San Diego. This was supported by the National Science Foundation under Grant No. MCS-840-3239. I thank Professors Victor DiPerna, Grace Wahba and Hans Weinberger for some valuable discussions and advice.
2. Spectral Analysis of Method of Regularization Estimators
2.1. Linear Estimators
Cox[13] has provided an elegant approach to the asymptotic analysis of method of regularization estimators. We briefly review this. The theory is based on an abstract linear regression framework: we have data y_n ∈ Rⁿ where

    y_n = X_n θ + ε_n.   (2.1)

X_n is a continuous linear map from a Hilbert space Θ (with inner product <·,·>) into Rⁿ, and ε_n is an error process for which E ε_n = 0, scaled so that Var(ε_n) = σ_n² I_n (I_n is an n×n identity matrix and σ_n² = σ²/n). The method of regularization estimator is defined to be the minimizer of

    l_nλ(θ) = ‖y_n − X_n θ‖_n² + λ<θ, Wθ>,   λ > 0.   (2.2)

‖·‖_n denotes the usual Euclidean norm on Rⁿ. W is a linear positive semi-definite operator (<θ, Wθ> ≥ 0 for all θ). Assuming a unique minimizer θ̂_nλ exists, we can write

    θ̂_nλ = [X_n*X_n + λW]⁻¹ X_n* y_n,   (2.3)

where X_n* is the adjoint (transpose) of X_n. Letting U_n = X_n*X_n, the bias or systematic error in θ̂_nλ is expressed as

    θ − E θ̂_nλ = [I − [U_n + λW]⁻¹U_n] θ,   (2.4)

and the variability or stability operator (covariance matrix) is

    Var(θ̂_nλ) = σ_n² [U_n + λW]⁻¹ U_n [U_n + λW]⁻¹.   (2.5)

These expressions follow directly from (2.3) and the assumptions in the abstract linear regression model.
Asymptotically U_n converges to a limiting operator U. With regularity, the simultaneous diagonalization of W relative to U yields a set of eigenfunctions {φ_v ; v = 1, 2, 3, ...} and eigenvalues {γ_v ; v = 1, 2, 3, ...} for which

    W φ_v = γ_v U φ_v,   <φ_v, U φ_μ> = δ_vμ,   v, μ = 1, 2, 3, ...,   (2.6)

where δ_vμ is Kronecker's delta function. Convergence characteristics are most conveniently analyzed in terms of this eigensystem. (Throughout the paper eigenvalues are arranged in increasing order.) A collection of convergence norms is defined as follows:

    ‖θ‖_p² = Σ_v [1 + γ_v]^p <θ, U φ_v>²,   p ∈ R.   (2.7)

For each p there is a subset, Θ_p⁰, of elements of Θ for which ‖·‖_p is finite. Let Θ_p be the completion of Θ_p⁰ in the norm ‖·‖_p. This is a Hilbert space with inner product <θ, φ>_p = Σ_v [1 + γ_v]^p <θ, U φ_v><φ, U φ_v>. The convergence norms ‖·‖_p are somewhat abstract; however, for certain statistical smoothing problems function space interpolation theory[4] can be used to relate the ‖·‖_p norms to a collection of more interpretable Sobolev norms. Extensions of these results to deconvolution problems are described in Nychka and Cox[23]. Bias and variance formulas in ‖·‖_p-norms are given by

    ‖θ − E θ̂_nλ‖_p² = ‖[I − [U + λW]⁻¹U] θ‖_p²
                     = Σ_v [λγ_v/(1 + λγ_v)]² [1 + γ_v]^p <θ, U φ_v>²   (2.8)

and

    E{ ‖[X_n*X_n + λW]⁻¹ X_n* ε_n‖_p² } = Σ_v [1 + γ_v]^p E<[X_n*X_n + λW]⁻¹ X_n* ε_n, U φ_v>²
        ≍ (σ²/n) Σ_v [1 + γ_v]^p [1 + λγ_v]⁻² = (σ²/n) C(λ, p).   (2.9)

The ≍ signs in these formulas involve the replacement of the operator U_n = X_n*X_n by its limit U (this is justified in[13]). From (2.8) it is shown that if the true parameter is in Θ_{p+2a} for some a ∈ (0,1), then ‖E θ̂_nλ − θ‖_p ≍ λᵃ as λ → 0. Here the ≍ sign means that the sequences on the left and right are equivalent up to fixed constants. The behavior of the quantity C(λ, p) determines the asymptotic variance. If γ_v ≍ v^r then C(λ, p) is convergent for p < 2 − 1/r and, as λ → 0, C(λ, p) ≍ λ^{−(p+1/r)} for −1/r < p < 2 − 1/r. (See Theorem 2.4 of[13].) Optimal stochastic convergence rates, in the sense of Stone[31], follow from these results.
2.2. Analysis of Non-linear Estimators
General method of regularization estimators may be obtained by minimizing a criterion of the form

    l_nλ(θ) = l_n(θ) + λJ(θ),   λ > 0,   (2.10)

over some subset C of Θ. l_n is a data fit criterion and J is a penalty term. An example was given in §1 above. Non-quadratic l_n may arise because of non-Gaussian measurement characteristics or non-linear functional data. Cox and O'Sullivan[12] describe a Cramér style approach to the analysis of these estimation schemes. We briefly describe this theory and then indicate how the approach is modified to handle the abstract non-linear regression model.
2.2.1. Newton-Raphson Approach
The analysis in[12] uses a second order Taylor series approximation for the regularization functional l_nλ, which amounts to the Newton-Raphson method. Let D denote differentiation with respect to θ. The variational equation for θ̂_nλ is Z_nλ(θ) = 0, where Z_nλ(θ) = D l_nλ(θ). In the limit as n → ∞, Z_nλ(θ) → Z_λ(θ). The derivative of the limiting variational equation is D Z_λ(θ) = U(θ) + λW(θ). Let θ₀ be a, locally unique, solution to Z_λ(θ) = 0 at λ = 0. For all λ less than a sufficiently small λ₀ there is a locally unique θ_λ satisfying Z_λ(θ_λ) = 0. Moreover, with probability approaching unity there is a locally unique θ̂_nλ satisfying Z_nλ(θ̂_nλ) = 0.
The estimation error is decomposed into a systematic plus a random component:

    θ̂_nλ − θ₀ = (θ_λ − θ₀) + (θ̂_nλ − θ_λ).   (2.11)

Two linear approximations are obtained; the first is for analyzing the bias and the second is for variability.

Systematic Error:
    θ_λ − θ₀ ≈ θ̄_λ − θ₀ = −[U(θ₀) + λW(θ₀)]⁻¹ Z_λ(θ₀).
Random Error:
    θ̂_nλ − θ_λ ≈ θ̃_nλ − θ_λ = −[U(θ_λ) + λW(θ_λ)]⁻¹ Z_nλ(θ_λ).   (2.12)

Here ≈ indicates that convergence rates are equivalent to first order. Rates of convergence results are formulated in terms of norms related to the operators U and W. For any 0 ≤ λ ≤ λ₀ the simultaneous diagonalization of W(θ_λ) relative to U(θ_λ) yields a set of eigenfunctions {φ_λv ; v = 1, 2, 3, ...} and eigenvalues {γ_λv ; v = 1, 2, 3, ...} such that

    W(θ_λ) φ_λv = γ_λv U(θ_λ) φ_λv,   <φ_λv, U(θ_λ) φ_λμ> = δ_vμ,   v, μ = 1, 2, 3, ...,   (2.13)

where δ_vμ is Kronecker's delta function. The convergence norms are expressed as

    ‖θ‖_λp² = Σ_v [1 + γ_λv]^p <θ, U(θ_λ) φ_λv>².   (2.14)

A key result which applies to many non-parametric smoothing problems, including hazard estimation, multivariate density estimation, estimation of regression functions in generalized linear models and robust estimation of vector valued functions, is that the limiting (as v → ∞) behavior of γ_λv does not depend on λ, and in fact γ_λv ≍ v^r for some r > 0. Function space interpolation theory[4] may again be used to relate the ‖·‖_λp norms to a collection of more interpretable Sobolev norms. Asymptotic convergence rates in a variety of such norms are detailed in[12].
2.2.2. Gauss-Newton Approach
Direct second order Taylor series expansion of l_nλ does not yield results for regularization in the abstract non-linear regression model. The main technical problem which arises is that the "U-component" of D Z_λ(θ_λ) becomes difficult to analyze when the η(θ; x) are non-linear functionals. The Gauss-Newton approach overcomes the difficulty. In general we let K_θφ[x] = Dη(θ; x)φ for x ∈ I, and U(θ) = K_θ*K_θ, so

    <φ, U(θ)ψ> = <φ, K_θ*K_θψ> = ∫_I K_θφ[x] K_θψ[x] dF(x).   (2.18)

From assumption A.1(i), <φ, U(θ)φ> ≍ (K_θφ, K_θφ) (uniformly in θ). The linear approximations are

Systematic Error:
    θ_λ − θ₀ ≈ θ̄_λ − θ₀ = −[U(θ₀) + λW(θ₀)]⁻¹ Z_λ(θ₀).
Random Error:
    θ̂_nλ − θ_λ ≈ θ̃_nλ − θ_λ = −[U(θ_λ) + λW(θ_λ)]⁻¹ Z_nλ(θ_λ).   (2.19)

In the case that J is quadratic with no linear term, so that DJ(θ) = Wθ where W is a fixed linear operator, Z_λ(θ₀) = λWθ₀. For most of the paper we will assume that this is so. In any event we will assume that Z_λ(θ₀) = λW(θ₀)θ₀. The rates of convergence for the linearized estimators are formulated in terms of norms related to the operators U(θ) and W. Some general results are described in the next section. As in the case of the Newton-Raphson approach, we can show that the convergence characteristics of the non-linear method of regularization estimator can be described to first order by the behavior of the linearized estimators in (2.19). The justification for the linearization uses various smoothness conditions on η(θ; x). Our conditions imply that the eigensystems related to the simultaneous diagonalization of W relative to K_θ*K_θ are uniformly equivalent for all λ sufficiently small. Theoretical justification for the linearizations is given in §6.
3. Linearized Convergence Theory
3.1. Introduction
To study convergence characteristics of the Gauss-Newton linearization we must first consider some properties of the simultaneous diagonalization of W relative to U; more precisely, the simultaneous diagonalization of W(θ*) relative to U(θ*) = K_{θ*}*K_{θ*}.

Assumption A.2. (Properties of K_{θ*} and W(θ*)).
K_{θ*} is compact and has zero null space on Θ. W(θ*) is a positive operator and for θ ∈ Θ

    ‖θ‖*² = <θ, U(θ*)θ> + <θ, W(θ*)θ> ≳ ‖θ‖².

We will define several Hilbert spaces. For θ* ∈ Θ, let Θ*⁰ = {θ ∈ Θ : ‖θ‖*² = <θ, U(θ*)θ> + <θ, W(θ*)θ> < ∞} and let Θ* be the completion of Θ*⁰ with respect to the norm ‖·‖*. Θ* is a Hilbert space with inner product

    <φ, θ>* = <φ, U(θ*)θ> + <φ, W(θ*)θ>.
From Weinberger[36], if K_{θ*} and W(θ*) satisfy (A.2) there is an eigensystem in Θ*, {φ*_v, γ*_v ; v = 1, 2, ...}, satisfying

    <φ*_v, W(θ*) φ*_μ> = γ*_v δ_vμ,
    <φ*_v, U(θ*) φ*_μ> = δ_vμ,   v, μ = 1, 2, ....   (3.1)

The eigenfunctions are complete in Θ*, so any θ ∈ Θ* has a convergent representation of the form θ = Σ_v <θ, U(θ*)φ*_v> φ*_v. For any λ > 0 we can define [U(θ*) + λW(θ*)]⁻¹ by the Spectral Theorem, and we have

    [U(θ*) + λW(θ*)]⁻¹ U(θ*) φ*_v = [1 + λγ*_v]⁻¹ φ*_v.   (3.2)

The convergence norms are defined as in (2.14): ‖θ‖*_p² = Σ_v [1 + γ*_v]^p <θ, U(θ*)φ*_v>²; in particular

    ‖φ*_v‖*_p² = [1 + γ*_v]^p.   (3.3)
Let Θ*_p⁰ be the elements of Θ for which ‖θ‖*_p is finite. The completion of Θ*_p⁰ in the ‖·‖*_p norm leads to another Hilbert space Θ*_p with inner product

    <θ, φ>*_p = Σ_v [1 + γ*_v]^p <θ, U(θ*)φ*_v><φ, U(θ*)φ*_v>,   p ∈ R.   (3.4)

Notice that Θ* = Θ*₁. A stability condition will allow us to compare the Hilbert spaces Θ*_p for a range of θ* values.
Assumption A.3. (Local Stability Condition)
Let N_θ₀ be a convex subset of Θ containing θ₀. K_{θ*} and W(θ*) are locally stable w.r.t. N_θ₀ if there are fixed operators K and W satisfying (A.2) and constants c₁ > 0 and c₂ > 0 such that for all θ* in N_θ₀ and all θ in Θ

    c₁ <θ, Wθ> ≤ <θ, W(θ*)θ> ≤ c₂ <θ, Wθ>,   (3.5)
    c₁ (Kθ, Kθ) ≤ <θ, U(θ*)θ> ≤ c₂ (Kθ, Kθ).   (3.6)

The convexity of N_θ₀ is used in §6. Under assumption (A.3) the eigensystems exist for all θ* ∈ N_θ₀. By the Mapping Principle, Weinberger[36], γ*_v ≍ γ_v uniformly in N_θ₀, where the γ_v are the eigenvalues of W relative to U = K*K. Let φ_v be the eigenfunction corresponding to γ_v and let Θ_p be the completion of {θ ∈ Θ : ‖θ‖_p² = Σ_v [1 + γ_v]^p (Kφ_v, Kθ)² < ∞}. This is a Hilbert space with inner product <·,·>_p defined as in (3.4). From the definitions of ‖·‖*_p and ‖·‖_p it is clear that Θ*_p ≍ Θ_p for p ∈ R (the norms are equivalent uniformly for θ* ∈ N_θ₀). From Theorem 2.3(iii) of Cox[13], we have the following result.
Theorem 3.1. (Linearized Bias).
Let N_θ₀ be any subset of Θ containing θ₀ for which (A.3) holds. Then for a ∈ [0,1], if θ₀ ∈ Θ_{p+2a},

    ‖θ̄_λ − θ₀‖_p ≲ λᵃ ‖θ₀‖_{p+2a}   as λ → 0.

The bound is tight in the sense that sup{ ‖θ̄_λ − θ₀‖_p : ‖θ₀‖_{p+2a} ≤ 1 } is up to a constant (independent of λ) equal to λᵃ(1 + o(1)) as λ → 0.

Proof: From[13],

    ‖λ[U(θ₀) + λW(θ₀)]⁻¹ W(θ₀)θ₀‖_0p = ‖θ̄_λ − θ₀‖_0p ≲ λᵃ ‖θ₀‖_{0,p+2a}   uniformly as λ → 0,

but (A.3) implies ‖·‖_0p ≍ ‖·‖_p, which proves the first statement. The tightness of the upper bound follows in a similar fashion, again using the results in[13]. □
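The bound of Theorem 3.1 can be checked exactly in a diagonal model. With γ_v = v^r and coefficients chosen so that θ₀ ∈ Θ_{p+2a}, the spectral bias coefficients satisfy λγ/(1+λγ) ≤ λᵃ(1+γ)ᵃ for a ∈ [0,1], which gives ‖θ̄_λ − θ₀‖_p ≤ λᵃ‖θ₀‖_{p+2a} term by term. The particular r, p, a and coefficient sequence below are illustrative assumptions.

```python
import numpy as np

r, p, a = 2.0, 0.0, 0.5                    # illustrative choices
v = np.arange(1, 200001, dtype=float)
gam = v**r                                 # gamma_v = v^r, as in (A.6)
# coefficients <theta0, U phi_v> making ||theta0||_{p+2a} finite
c = (1 + gam)**(-(p + 2*a)/2) / v
norm_p2a = np.sqrt(np.sum((1 + gam)**(p + 2*a) * c**2))

def bias_p(lam):
    # ||theta_bar_lam - theta0||_p via the coefficients lam*gam/(1+lam*gam)
    return np.sqrt(np.sum((lam*gam/(1 + lam*gam))**2 * (1 + gam)**p * c**2))
```

The inequality rests on min(1, x) ≤ xᵃ for a ∈ [0,1], applied with x = λγ_v; it holds for every λ, not just asymptotically.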
We could extend this result to obtain further asymptotic bias characteristics as in Theorem 2.3 of[13]. To study the second linearization, or variability, it will be necessary to use integration by parts. To carry out this exercise it is convenient to have an identification of the ‖·‖_p norms in terms of Sobolev norms. We make the following assumption.
Assumption A.4. (Sobolev Norm Identification)
(i) For θ ∈ Θ, Kθ: Ω → R, with Ω ⊂ R^d a bounded set satisfying (7.10) and (7.11) of Lions and Magenes[20].
(ii) For some M > 3d/2 (not necessarily an integer) there are constants c₁ > 0 and c₂ > 0 such that

    c₁ ‖Kθ‖²_{W₂^M} ≤ ‖Kθ‖²_{L₂} + <θ, Wθ> = ‖θ‖₁² ≤ c₂ ‖Kθ‖²_{W₂^M}

for θ ∈ Θ.
By definition of ‖·‖_p and A.4(i), ‖θ‖₀ = ‖Kθ‖_{L₂}, the L₂ norm being over the set Ω. Moreover A.4(i) allows Sobolev spaces W₂^s(Ω) for s ≥ 0 to be defined by the K-method of interpolation; see page 40 of[20]. From the K-method of interpolation, assumption (A.4) implies that ‖θ‖_p ≍ ‖Kθ‖_{W₂^{Mp}} for p ∈ [0,1] and, equivalently, ‖Kθ‖_{W₂^t} ≍ ‖θ‖_{t/M} for t ∈ [0,M]. Also, following an identical argument to that given in Lemma A1.2(a) of[12], ‖Kθ‖_{W₂^M} ≍ ‖θ‖₁. These relations are most useful. Thus (A.4) gives that the ‖·‖_p are identified with Sobolev norms for p ∈ [0,1]. For 2 > p > 1, ‖·‖_p can be identified with norms on Sobolev spaces satisfying additional boundary conditions; see §3 of Cox[13] where similar results are worked out in detail. A further stability condition is needed.
Assumption A.5. (Sobolev Norm Stability)
For some 3d/2 < s < M there are constants c₁ > 0 and c₂ > 0 such that

    c₁ ‖Kθ‖²_{W₂^s} ≤ ‖K_{θ*}θ‖²_{W₂^s} ≤ c₂ ‖Kθ‖²_{W₂^s}

for θ* ∈ N_θ₀ and θ ∈ Θ.

From (A.4) and (A.3), ‖Kθ‖_{W₂^s} ≍ ‖θ‖_{s/M} ≍ ‖θ‖*_{s/M}, so (A.5) implies ‖K_{θ*}θ‖_{W₂^s} ≍ ‖θ‖*_{s/M}. From the K-method of interpolation and (A.4),

    ‖K_{θ*}θ‖_{W₂^{ps}} ≍ ‖θ‖*_{ps/M} ≍ ‖θ‖_{ps/M} ≍ ‖Kθ‖_{W₂^{ps}},   p ∈ [0,1].
To analyze the second linearization we need some information concerning the limiting behavior of the eigenvalues of W relative to U = K*K. We shall see in Theorem 4.3 that (A.4) will imply γ_v ≍ v^{2M/d}. In general we assume that there is an estimate of this form.

Assumption A.6. Let γ_v, v = 1, 2, ..., be the eigenvalues of W relative to U. Then γ_v ≍ v^r for some r > 0.
Let C_r(λ, p) be defined as

    C_r(λ, p) = Σ_{v=1}^∞ [1 + v^r]^p [1 + λv^r]⁻².   (3.14)

From Theorem 2.4 of[13], C_r(λ, p) is convergent for p < 2 − 1/r and

    C_r(λ, p) ≍  λ^{−(p+1/r)}   for −1/r < p < 2 − 1/r,
                 log(1/λ)       for p = −1/r,
                 1              for p < −1/r.   (3.15)
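The first case of (3.15) is easy to check numerically: truncate the sum (3.14) at a large v and compare C_r at two small values of λ. The particular r and p below are illustrative.

```python
import numpy as np

# Numerical check of the rate in (3.15): for -1/r < p < 2 - 1/r the sum
# C_r(lambda, p) grows like lambda^{-(p + 1/r)} as lambda -> 0.  Truncating
# the infinite sum at a large v is the only approximation made here.
def C(lam, p, r, vmax=200000):
    v = np.arange(1, vmax + 1, dtype=float)
    return np.sum((1 + v**r)**p / (1 + lam*v**r)**2)

r, p = 2.0, 0.5                     # expected exponent p + 1/r = 1.0
lam1, lam2 = 1e-3, 1e-4
slope = np.log(C(lam2, p, r)/C(lam1, p, r)) / np.log(lam1/lam2)
```

The empirical log-log slope between the two λ values matches the predicted exponent p + 1/r closely.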
There is the following general result concerning the behavior of the second linearization.

Theorem 3.2. (Linearized Variability).
Suppose assumptions A.1 and A.3 through A.6 hold. If
(i) there is λ₀ > 0 such that θ_λ exists in N_θ₀ for λ ∈ [0, λ₀];
(ii) θ₀ ∈ Θ_p₀ with p₀ > 3d/2M and, for some δ > 0,

    ‖θ_λ − θ₀‖_{δ+3d/2M} = ‖θ̄_λ − θ₀‖_{δ+3d/2M} (1 + o(1))   as λ → 0;

then for −1/r < p < 2 − 1/r − d/M and ε > 0 arbitrarily small,

    E‖θ̃_nλ − θ_λ‖_p² ≲ n⁻¹C_r(λ, p) + n⁻¹k_n C_r(λ, p + d/2M) + k_n² λ^ε C_r(λ, p).

If λ_n is a sequence such that n⁻¹λ_n^{−d/M} → 0, then

    E‖θ̃_nλ − θ_λ‖_p² ≍ n⁻¹ λ^{−(p+1/r)}

uniformly for λ ∈ [λ_n, λ₀] and for −1/r < p < 2 − 1/r − d/M.
Proof. From (A.3), ‖·‖_p ≍ ‖·‖_λp, so using the spectral expansion for the norm (here K_λ = K_{θ_λ}),

    ‖θ̃_nλ − θ_λ‖_λp² = Σ_v [1 + γ_λv]^p [1 + λγ_λv]⁻² <Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>².   (3.16)

But

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² = n⁻²σ² Σ_{i=1}^n (K_λφ_λv(x_i))²
        + | ∫ [η(θ₀; x) − η(θ_λ; x)] K_λφ_λv(x) d(F_n − F)(x) |²,   (3.17)

where σ² is the variance of the ε_i's. Therefore

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² ≲ n⁻¹ ‖K_λφ_λv‖²_{L₂} + n⁻¹ | ∫ [K_λφ_λv(x)]² d(F_n − F)(x) |
        + | ∫ [η(θ₀; x) − η(θ_λ; x)] K_λφ_λv(x) d(F_n − F)(x) |².   (3.18)

Using integration by parts, Hölder's inequality and Sobolev's Imbedding Theorem to analyze the terms involving F_n − F (there are several examples of this in[12]) we obtain

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² ≲ n⁻¹ ‖K_λφ_λv‖²_{L₂} + n⁻¹ k_n ‖K_λφ_λv‖_{L₂} ‖K_λφ_λv‖_{W₂^d}
        + k_n² ‖η(θ₀; ·) − η(θ_λ; ·)‖²_{W₂^d} ‖K_λφ_λv‖²_{W₂^d}.   (3.19)
Clearly

    ‖η(θ₀; ·) − η(θ_λ; ·)‖²_{W₂^d} ≲ sup_{x ∈ Ω} |η(θ₀; x) − η(θ_λ; x)|² + Σ_{i=1}^d sup_{x ∈ Ω} | ∂η(θ₀; x)/∂x_i − ∂η(θ_λ; x)/∂x_i |².

Consider the first term here. By the mean value theorem,

    η(θ₀; x) − η(θ_λ; x) = K_{θ̄(x)}(θ₀ − θ_λ)[x],   (3.20)

where θ̄(x) lies on the segment joining θ₀ and θ_λ. Therefore

    sup_x |η(θ₀; x) − η(θ_λ; x)|² = sup_x |K_{θ̄(x)}(θ_λ − θ₀)[x]|²
        ≤ sup_{θ* ∈ N_θ₀} sup_{x ∈ Ω} |K_{θ*}(θ_λ − θ₀)[x]|²   (3.21)
        ≲ sup_{x ∈ Ω} |K(θ_λ − θ₀)[x]|² ≲ ‖K(θ_λ − θ₀)‖²_{W₂^{d/2+δ}},

where δ > 0 is arbitrarily small. The Sobolev norm stability assumption (A.5) was used to obtain the second to last inequality; the last inequality uses Sobolev's Inequalities, see Theorems 3.9 and 3.10 of[1]. By a similar analysis, the second term is bounded by a constant times ‖K(θ_λ − θ₀)‖²_{W₂^{3d/2+δ}}, with δ such that 3d/2 + δ < s in (A.5). Using hypothesis (ii) and (A.4),

    ‖K(θ_λ − θ₀)‖_{W₂^{3d/2+δ}} ≲ ‖θ_λ − θ₀‖_{3d/2M+ε} ≲ ‖θ̄_λ − θ₀‖_{3d/2M+ε},   (3.22)

where ε > 0 can be arbitrarily small. θ₀ ∈ Θ_p₀ for p₀ > 3d/2M, so Theorem 3.1 gives

    ‖η(θ₀; ·) − η(θ_λ; ·)‖_{W₂^d} ≲ ‖θ̄_λ − θ₀‖_{3d/2M+ε} ≲ λᵃ,   (3.23)

where a > d/2M and ε > 0 is arbitrarily small. Since ‖K_λφ_λv‖_{L₂} ≍ ‖φ_λv‖_λ0 = [1 + γ_λv]⁰ = 1, and using (A.5), (A.4) and (A.3) (in that order),

    ‖K_λφ_λv‖_{W₂^d} ≍ ‖Kφ_λv‖_{W₂^d} ≍ ‖φ_λv‖_{d/M} ≍ ‖φ_λv‖_{λ,d/M} = [1 + γ_λv]^{d/2M}.   (3.24)
Thus

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² ≲ n⁻¹ + n⁻¹k_n [1 + γ_λv]^{d/2M} + k_n² λ^{2a} [1 + γ_λv]^{d/M},   (3.25)

where a > d/2M. But from (A.5) and (A.6), γ_λv ≍ v^r, so for −1/r < p < 2 − 1/r − d/M,

    E‖θ̃_nλ − θ_λ‖_p² ≲ n⁻¹C_r(λ, p) + n⁻¹k_n C_r(λ, p + d/2M) + k_n² λ^{2a} C_r(λ, p + d/M).   (3.26)

Using the constraints on p and expression (3.15) for C_r(λ, p), this reduces to

    E‖θ̃_nλ − θ_λ‖_p² ≲ n⁻¹C_r(λ, p) + n⁻¹k_n λ^{−d/2M} C_r(λ, p) + k_n² λ^{2a−d/M} C_r(λ, p)
        ≲ n⁻¹C_r(λ, p) + n⁻¹k_n λ^{−d/2M} C_r(λ, p) + k_n² λ^ε C_r(λ, p),   (3.27)

where ε > 0 is arbitrarily small. This proves the first statement of the theorem. The last part follows immediately. □
4. Linearization Results for Hammerstein Equations
Here we consider data functionals which are evaluations in the range of a first kind Hammerstein integral operator,

    η(θ; x) = ∫_Ω k(u, x) f(θ(u), u) du,   x ∈ Ω,   (4.1)

or

    η(θ; x) = N(θ)[x] = K F(θ)[x],   (4.2)

where K is an integral operator with kernel k(u, x). K is assumed to be compact and invertible with zero null space. F is not assumed to be compact; indeed, in our examples the derivative of F defines an isomorphism on L₂(Ω). For Hammerstein operators K determines the smoothness or compactness while F determines the degree of non-linearity. In order for η(θ; x) to be defined we assume f(x, y) is continuously differentiable in x and that f(θ(u), u) and (∂f/∂x)(θ(u), u) are in L₂(Ω) for θ in L₂(Ω). The linearized operator is

    K_θφ[x] = ∫_Ω k(u, x) h(u) φ(u) du,   (4.3)

where h(u) = (∂f/∂x)(θ(u), u). Thus K_{θ*} = K_h with h(u) = (∂f/∂x)(θ*(u), u).
In practice, integral operators typically arise as Green's operators for differential or pseudodifferential operators. Boundary value problems give rise to integral equations of the first kind, initial value problems to Volterra integral equations. Following Agmon[1], a differential operator A of order l is defined as

    A(x, D) = Σ_{|α| ≤ l} a_α(x) D^α,   (4.4)

where α = (α₁, α₂, ..., α_d)′ is a d-dimensional integer multi-index with non-negative components, |α| = α₁ + ··· + α_d, and D^α is the differential operator

    D^α = D₁^{α₁} D₂^{α₂} ··· D_d^{α_d},   D_i = ∂/∂x_i.   (4.5)
Assumption A.7. (Properties of K)
(i) K is compact with zero null space.
(ii) K is a Green's operator for a linear differential operator A with prescribed boundary conditions: whenever Kφ = ψ then Aψ = φ and ψ satisfies the boundary conditions; conversely, if Aψ = φ and ψ satisfies the boundary conditions, then Kφ = ψ.
Assumption A.8. (Quadratic Penalty)
The penalty functional has the form J(θ) = ‖Lθ‖²_{L₂}, where L is an m-th order linear differential operator with real coefficients,

    L(x, D) = Σ_{|β| ≤ m} b_β(x) D^β.
Assumptions (A.7) and (A.8) are in force throughout this section. Since J is quadratic, the first half of the stability condition in (A.3) is automatically satisfied with W = L*L, where L* is the adjoint of L; see p. 51 of[1] for the definition of the adjoint. W is a positive operator (<θ, Wθ> = ‖Lθ‖²_{L₂} ≥ 0). If the coefficients of A are sufficiently smooth then the composition of L and A is well defined. To establish (A.4) it is enough that

    c₁ ‖φ‖²_{W₂^{m+l}} ≤ ‖LAφ‖²_{L₂} + ‖φ‖²_{L₂} ≤ c₂ ‖φ‖²_{W₂^{m+l}}   (4.6)

for φ in the range of K. Sobolev's Inequality may be used to obtain the upper bound. The lower bound is much more delicate. The natural tool is Gårding's Inequality (see Theorem 7.6 of[1]). If L and A are uniformly elliptic with sufficiently smooth coefficients, then ‖LAφ‖²_{L₂} is a uniformly elliptic quadratic form of order 2(m+l), and Gårding's Inequality gives that

    ‖LAφ‖²_{L₂} + ‖φ‖²_{L₂} ≳ ‖φ‖²_{W₂^{m+l}},   (4.7)

but only for φ ∈ W_{0,2}^{m+l} (elements of W₂^{m+l} whose derivatives up to order m+l−1 vanish on the boundary of Ω; see Definition 8.1 of[1]). In the case that K corresponds to a Dirichlet problem for A we might have that Kψ ∈ W_{0,2}^l(Ω). This goes some way towards establishing the result, but what we need is that Kψ ∈ W_{0,2}^{m+l}(Ω), which typically would not be true. It seems that we shall have to resign ourselves to establishing the norm equivalence on a case by case basis. More significant progress is possible with regard to the other assumptions. We begin with stability.
4.1. Stability Results
For h a real-valued function defined on Ω let

K_h φ[x] = K(φh)[x] = ∫_Ω k(x,u) φ(u) h(u) du.

We consider the situation where A is elliptic and K corresponds to a Green's function for a boundary value problem. Since J(θ) = ||Lθ||²_{L²}, for the first stability assumption (A.3) we need only show that ||Kφ||_{L²} ≍ ||K_h φ||_{L²} for some range of h. For (A.5) we establish the same relation in Sobolev norms. In one dimension we have the following result.
Theorem 4.1. Suppose Ω = [0,1] and let A be an ordinary differential operator with real coefficients

A(x,D)φ = Σ_{v=0}^{l} a_v(x) φ^{(v)}(x)

where a_v ∈ C^v(Ω) for v = 1,2,…,l and a_l^{-1} is bounded. Suppose K is a Green's operator for A with Dirichlet boundary conditions: U_v(φ) = 0 for v = 1,2,…,l, where the U_v(φ) are distinct elements of {φ^{(k)}(0), φ^{(k)}(1); k = 0,1,…,l−1}.
Let B¹(R) be the ball of radius R in W₂¹[0,1]. For each δ > 0 there are constants c₁ > 0 and c₂ > 0 such that for all φ ∈ L² and h ∈ B¹(R) with |h| ≥ δ,

c₁ ||Kφ||_{L²} ≤ ||K_h φ||_{L²} ≤ c₂ ||Kφ||_{L²}.
Proof. Since the coefficients a_v are smooth, the adjoint differential operator A* is defined:

A*(x,D)φ = Σ_{v=0}^{l} (−1)^v D^v [a_v(x) φ(x)].    (4.8)

The leading coefficient in the adjoint is a_l and by assumption the lower order terms are bounded. Let G(t,x) be the Green's function for the adjoint; we have k(x,t) = G(t,x). From p. 180 of Collatz [9], (∂^v G/∂t^v)(t,x) is continuous in x and t for v ≤ l−2, and (∂^{l−1}G/∂t^{l−1})(t,x) is continuous except for a jump of size 1/a_l(x) along the "critical line" x = t. It follows that ∂^v k/∂t^v is square integrable for v = 1,2,…,l−1. Now, since G is the Green's function for the adjoint equation,

A*(t,D) G(t,x) = 0    (4.9)

for all x ≠ t ∈ [0,1]. This means that a_l(x)(∂^l G/∂t^l)(t,x) is a linear combination of lower order partial derivatives of G for x ≠ t. The coefficients in the linear combination are bounded. Applying the Cauchy-Schwarz inequality we get that the L² norm of ∂^l k/∂t^l is also bounded. Since G(t,x) is a Green's function for the adjoint equation, G(t,x) satisfies the adjoint boundary values. Thus, for any solution ψ of the differential equation, the boundary products ψ^{(v)}(t)(∂^v k/∂t^v)(x,t) arising in the integrations by parts below vanish at t = 0 and t = 1.
We begin with the lower bound. Take ψ(x) = K_h φ[x], so

A(x,D)ψ(x) = Σ_{v=0}^{l} a_v(x) ψ^{(v)}(x) = φ(x) h(x).    (4.10)

Let b_v(x) = a_v(x)/h(x), so φ(x) = Σ_{v=0}^{l} b_v(x) ψ^{(v)}(x). Then

Kφ[x] = ∫₀¹ k(x,t) φ(t) dt = Σ_{v=0}^{l} ∫₀¹ k(x,t) b_v(t) ψ^{(v)}(t) dt    (4.11)

so

||Kφ||²_{L²} ≤ (l+1) Σ_{v=0}^{l} ∫₀¹ { ∫₀¹ k(x,t) b_v(t) ψ^{(v)}(t) dt }² dx.    (4.12)
The remainder is integration by parts. Consider the v = 1 term on the right-hand side. Integrating by parts and using the condition ψ(0)k(x,0) = ψ(1)k(x,1) = 0 gives

∫₀¹ { ∫₀¹ k(x,t) b₁(t) ψ^{(1)}(t) dt }² dx = ∫₀¹ { ∫₀¹ (∂/∂t)[k(x,t) b₁(t)] ψ(t) dt }² dx.    (4.13)

Applying the Cauchy-Schwarz and Sobolev inequalities we get the upper bound

≤ ||ψ||²_{L²} { ∫₀¹ ||(∂k/∂t)(x,·)||²_{L²} dx } ||a₁||²_{W₂¹} ||h⁻¹||²_{W₂¹} ≤ ||ψ||²_{L²} C(δ,R)    (4.14)

where C(δ,R) is a positive constant depending only on R and δ. Similar estimates can be obtained for each v ≤ l−1. The v = l term brings up ∫₀¹ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt, which is complicated because it involves the (l−1)'th partial derivative of k together with the first derivative of ψ:

∫₀¹ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt = ∫₀^x (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt + ∫_x^1 (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt.

Integrating by parts (using the boundary conditions) reduces this to

[ (∂^{l-1}k/∂t^{l-1})(x,x−0) − (∂^{l-1}k/∂t^{l-1})(x,x+0) ] b_l(x) ψ(x) − ∫₀¹ (∂/∂t)[ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ] ψ(t) dt.    (4.15)

The jump in (∂^{l-1}k/∂t^{l-1})(x,t) at t = x is of size 1/a_l(x), and applying Cauchy-Schwarz we have

∫₀¹ { ∫₀¹ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt }² dx ≤ ||ψ||²_{L²} C(δ,R)    (4.16)

where C(δ,R) is a constant. In this manner we can obtain the bound

||Kφ||²_{L²} ≤ ||ψ||²_{L²} C(δ,R) { Σ_{v=1}^{l} ∫₀¹ ||(∂^v k/∂t^v)(x,·)||²_{L²} dx } { Σ_{v=0}^{l} ||a_v||²_{W₂¹} }.    (4.17)

Since ||ψ||²_{L²} = ||K_h φ||²_{L²}, this gives the lower bound.
For the upper bound let ψ(x) = Kφ[x], so A(x,D)ψ(x) = φ(x). Put g_h(x) = K((Aψ)h)[x] = K(φh)[x]. Using the Green's function,

g_h(x) = ∫₀¹ k(x,t) { Σ_{v=0}^{l} a_v(t) ψ^{(v)}(t) } h(t) dt.    (4.18)

By Leibnitz's rule

(d^v/dt^v)(ψ(t)h(t)) = Σ_{k=0}^{v} (v choose k) ψ^{(k)}(t) h^{(v−k)}(t).    (4.19)

So

g_h(x) = ∫₀¹ k(x,t) Σ_{v=0}^{l} a_v(t) (d^v/dt^v)(ψ(t)h(t)) dt − ∫₀¹ k(x,t) { Σ_{v=0}^{l} Σ_{k=0}^{v−1} (v choose k) a_v(t) ψ^{(k)}(t) h^{(v−k)}(t) } dt
     = ψ(x)h(x) − ∫₀¹ k(x,t) { Σ_{v=0}^{l} Σ_{k=0}^{v−1} (v choose k) a_v(t) ψ^{(k)}(t) h^{(v−k)}(t) } dt.    (4.20)

Integrating by parts and applying the Cauchy-Schwarz and Sobolev inequalities gives

||g_h||²_{L²} ≤ ||ψ||²_{L²} C(R) { Σ_v ∫₀¹ ||(∂^v k/∂t^v)(x,·)||²_{L²} dx } { Σ_v ||a_v||²_{W₂¹} } ≤ ||ψ||²_{L²} c₂(R).    (4.21)

Since ||g_h||_{L²} = ||K_h φ||_{L²} and ||ψ||_{L²} = ||Kφ||_{L²}, the upper bound is also proved. ∎
The above result may very well extend to multi-dimensional domains and to pseudodifferential operators, but at the moment we have no results to report. For differential operators satisfying an elliptic-type coercive estimate (see Chapters 10 and 11 of Agmon [1]), assumption (A.3) implies assumption (A.5).
Theorem 4.2. Let B^k(R) be the Sobolev ball of radius R in W₂^k(Ω). Suppose K is a Green's operator for an l'th order differential operator A, and that
(i) the coefficients of A are in W₂^k(Ω) and, for φ ∈ W₂^{l+k}(Ω),

||φ||²_{W₂^{l+k}} ≍ Σ_{|α|=k} ||D^α Aφ||²_{L²} + ||φ||²_{L²};

(ii) for h ∈ H, ||K_h φ||_{L²} ≍ ||Kφ||_{L²}.
Then for h ∈ B^k(R) ∩ H with |h| ≥ δ > 0 for some δ,

||K_h φ||²_{W₂^{l+k}} ≍ ||Kφ||²_{W₂^{l+k}}

uniformly in h.
Proof: We first show that there is c₁ > 0 such that for all φ

||K_h φ||²_{W₂^{l+k}} ≤ c₁ ||Kφ||²_{W₂^{l+k}}.    (4.22)

This does not use the condition that |h| ≥ δ. Applying (i) to K_h φ = K(φh), for which A K_h φ = φh, and then (ii),

||K_h φ||²_{W₂^{l+k}} ≍ Σ_{|α|=k} ||D^α(φh)||²_{L²} + ||K_h φ||²_{L²} ≲ Σ_{|α|=k} ||D^α(φh)||²_{L²} + ||Kφ||²_{L²}.

Applying the Cauchy-Schwarz and Sobolev inequalities,

Σ_{|α|=k} ||D^α(φh)||²_{L²} ≲ { Σ_{|α|≤k} ||D^α h||²_{L²} } { Σ_{|α|=k} ||D^α φ||²_{L²} + ||φ||²_{L²} }.    (4.23)

We are done if we can bound ||φ||²_{W₂^k} in terms of ||Kφ||²_{L²} + Σ_{|α|=k} ||D^α φ||²_{L²}. Since A is an l'th order differential operator with smooth coefficients,

||Aψ||²_{L²} ≲ ||ψ||²_{L²} + Σ_{|α|=l} ||D^α ψ||²_{L²} ≲ ||ψ||²_{W₂^l}.    (4.24)

Letting ψ = Kφ and using (i), this gives

||φ||²_{W₂^k} ≲ ||Kφ||²_{W₂^{l+k}} ≍ ||Kφ||²_{L²} + Σ_{|α|=k} ||D^α φ||²_{L²}    (4.25)

which establishes the upper bound. For the lower bound, let φ̃ = φh, so φ = φ̃h⁻¹ and Kφ = K_{h⁻¹} φ̃. Now use the upper bound to get

||Kφ||²_{W₂^{l+k}} = ||K_{h⁻¹} φ̃||²_{W₂^{l+k}} ≤ c₂ ||Kφ̃||²_{W₂^{l+k}},    (4.26)

in other words ||Kφ||²_{W₂^{l+k}} ≤ c₂ ||K_h φ||²_{W₂^{l+k}}. ∎
4.2. Growth Rate Results
Assumption (A.4) gives us:
Theorem 4.3. Suppose assumption (A.4) holds; then γ_v ≍ v^{2M/d}.
Proof: Let λ_v be the eigenvalues of the Rayleigh quotient

||Kθ||²_{L²} / { ||Kθ||²_{L²} + ⟨θ,Wθ⟩ }    (4.27)

so that λ_v⁻¹ − 1 = γ_v. By (A.4) and the Mapping Principle in Weinberger [36], γ_v ≍ μ_v, where μ_v are the eigenvalues of

||Kθ||²_{W₂^M} / ||Kθ||²_{L²}.    (4.28)

Let L be an integer larger than M. By A.4(i), Sobolev norms may be defined by the K-method of interpolation; see page 40 of [20]. Thus for ρ = M/L

||Kθ||²_{W₂^M} = Σ_v [1 + ξ_v]^ρ ⟨φ_v, Kθ⟩²    (4.29)

where ξ_v and φ_v are the eigenvalues and eigenfunctions associated with the quadratic form Σ_{|α|=L} ||D^α φ||²_{L²}. From standard theory for elliptic differential operators (Agmon [1], for example), ξ_v ≍ v^{2L/d}. Direct calculation of μ_v using the Maximum-Minimum characterization of eigenvalues (pp. 79 of [35]) gives μ_v = [1 + ξ_v]^ρ. (The corresponding eigenfunctions are K⁻¹φ_v.) Thus μ_v ≍ v^{2M/d}, and so

γ_v ≍ v^{2M/d}. ∎
If K is a Green's operator for an l'th order differential operator and L is of order m, then γ_v ≍ v^{2(m+l)/d}. Thus if (A.4) is true we must have M = m + l. We provide some results in this direction next. Assumptions A.7 and A.8 are in force for the remainder of this section.
Theorem 4.4. (A and L elliptic) Suppose
(i) a_α ∈ C^{2(m+l)}(Ω) for |α| ≤ l and l_β ∈ C^{m+l}(Ω) for |β| ≤ m;
(ii) A and L are uniformly elliptic (see pp. 45 of [1]).
Then if γ_v are the eigenvalues of W relative to K*K,

γ_v ≍ v^{2(m+l)/d}.
Proof:

LA(x,D) = Σ_{|β|≤m} l_β(x) D^β { Σ_{|α|≤l} a_α(x) D^α }    (4.30)

is an (l+m)'th order differential operator whose coefficients are real and lie in C^{(l+m)}(Ω). Let P(x,D) = (LA)*(LA) = A*WA. Given the smoothness, P(x,D) is a differential operator of order 2(m+l) with continuous coefficients. ⟨φ, P(x,D)φ⟩ = ||LAφ||²_{L²}, and since the coefficients of LA are real, P(x,D) is a positive operator. Let P'(x,D) be the principal part of P(x,D). For ξ ∈ R^d

P'(x,ξ) = Σ_{|β|,|β'|=m} Σ_{|α|,|α'|=l} l_β(x) l_{β'}(x) a_α(x) a_{α'}(x) ξ^{α+β+α'+β'}    (4.31)
        = { Σ_{|β|=m} Σ_{|α|=l} l_β(x) a_α(x) ξ^{α+β} }²
        ≥ ε₀ |ξ|^{2l} |ξ|^{2m} = ε₀ |ξ|^{2(l+m)}

where ε₀ > 0. The last inequality comes from the ellipticity of A and L. Therefore P(x,D) is a uniformly elliptic differential operator of order 2(m+l). Let {λ_v, ψ_v; v = 1,2,…} be the eigenvalues and eigenfunctions of P:

A*WA ψ_v = λ_v ψ_v,
⟨ψ_v, ψ_μ⟩ = δ_{vμ},  v,μ = 1,2,…    (4.32)

From Agmon [1], λ_v ≍ v^{2(m+l)/d}. Let φ_v(x) = A(x,D)ψ_v(x), so Kφ_v = ψ_v. By substitution

⟨φ_μ, Wφ_v⟩ = λ_v δ_{vμ},
⟨Kφ_μ, Kφ_v⟩ = δ_{vμ},  v,μ = 1,2,…    (4.33)

Therefore γ_v = λ_v, which proves the result. ∎
4.3. Generalizations to Pseudodifferential Operators
There are extensions of the above results which apply to more general pseudodifferential operators. We give a theorem related to this below. To motivate it, we first indicate estimates for the eigenvalues of Abel transforms. Related results are given in Nychka and Cox [23].
4.3.1. Abel Transform
Let f: [0,1] → R. The fractional integral or Abel transform of f is I_α f(x), where

I_α f(x) = (1/Γ(α)) ∫₀^x [x−u]^{α−1} f(u) du for 0 < α ≤ 1, and I₀ f(x) = f(x).    (4.34)

This transform has many practical applications; see Ross [28] for example. Suppose W = L*L where Lf(x) = (d^m/dx^m) f(x). We wish to consider the eigenvalues of the Rayleigh quotient

(Lφ, Lφ) / (I_α φ, I_α φ)    (4.35)

for φ ∈ W₂^m[0,1] and 0 < α ≤ 1.
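As an aside (my own illustration, not part of the paper), the definition (4.34) is easy to check numerically: for f ≡ 1 one has I_α f(x) = x^α/Γ(α+1). A crude midpoint rule on the weakly singular kernel suffices for a sketch:

```python
import numpy as np
from math import gamma

# Check the Abel transform (4.34) against the closed form for f = 1:
# I_a f(x) = x^a / Gamma(a + 1).
a = 0.5
n = 4000
h = 1.0 / n
u = (np.arange(n) + 0.5) * h                 # midpoint quadrature nodes

def abel(f_vals, x):
    mask = u < x
    return (h / gamma(a)) * np.sum((x - u[mask]) ** (a - 1.0) * f_vals[mask])

x0 = 0.7
approx = abel(np.ones(n), x0)
exact = x0 ** a / gamma(a + 1.0)             # = 2 sqrt(x0) / sqrt(pi)
```

The midpoint rule is not a good scheme near the singularity at u = x, but the error it commits there is O(√h), which is adequate for illustration.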
Theorem 4.5. If γ_v, v = 1,2,… are the eigenvalues of the Rayleigh quotient in (4.35) arranged in increasing order, then γ_v ≍ v^{2(m+α)}.
Proof: For convenience replace the interval [0,1] by [0,2π]. Let Θ_c = {φ ∈ W₂^m[0,2π] : ∫₀^{2π} φ(s)ds = 0, φ^{(j)}(0) = φ^{(j)}(2π), 0 ≤ j ≤ m}; Θ_c consists of periodic functions on [0,2π] with mean zero. Let λ_v be the eigenvalues of the Rayleigh quotient in (4.35) over Θ_c. By the Separation Theorem on pp. 107 of Weinberger [35], λ_{v−2m} ≤ γ_v ≤ λ_v for v ≥ 2m+1. Any φ ∈ Θ_c can be represented as a Fourier series (whose m'th derivative is convergent in L²): φ(x) = Σ_{|k|>0} φ̂_k e^{ikx}, and

Lφ(x) = Σ_{|k|>0} φ̂_k e^{ikx} (ik)^m    (4.36)

where

φ̂_k = (1/2π) ∫₀^{2π} φ(x) e^{−ikx} dx.    (4.37)

Since φ̂₀ = 0, from pp. 120 of Butzer and Westphal [5],

I_α φ(x) = Σ_{|k|>0} φ̂_k (ik)^{−α} e^{ikx}    (4.38)

for 0 < α ≤ 1. Thus

(Lφ,Lφ)/(I_αφ,I_αφ) = Σ_{|k|>0} |φ̂_k|² k^{2m} / Σ_{|k|>0} |φ̂_k|² k^{−2α}.    (4.39)

Letting ψ_k = k^{−α} φ̂_k for |k| > 0, the Rayleigh quotient becomes

Σ_{|k|>0} |ψ_k|² k^{2(m+α)} / Σ_{|k|>0} |ψ_k|².    (4.40)

Using the Maximum-Minimum characterization of eigenvalues we can compute the eigenvalues and eigenvectors explicitly: this gives eigenvalues λ_v ≍ v^{2(m+α)}, with corresponding eigenvectors ψ_k^{(v)} = 1/2 for k = −v, v and ψ_k^{(v)} = 0 otherwise. So γ_v ≍ v^{2(m+α)}. ∎
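The diagonalization in (4.39)-(4.40) also gives a direct numerical reading of the growth rate (my own illustration; m = 1 and α = 1/2 are arbitrary choices): in the ψ_k coordinates the quotient is diagonal with entries k^{2(m+α)}, so the eigensequence slope on a log-log scale is exactly 2(m+α) = 3.

```python
import numpy as np

# In the psi_k coordinates of (4.40) the Rayleigh quotient is diagonal, so the
# eigenvalues are k^{2(m+alpha)}; a log-log fit recovers the exponent.
m, alpha = 1, 0.5
k = np.arange(1.0, 51.0)
lam = k ** (2 * m) / k ** (-2 * alpha)       # = k^{2(m+alpha)} = k^3

slope = np.polyfit(np.log(k), np.log(lam), 1)[0]   # 2(m+alpha) = 3
```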
Abel transforms are related to Green's functions for fractional derivatives. We end this section with a result for general operators of this type, the so-called pseudodifferential operators.
4.3.2. Pseudodifferential Operators
Here we follow Taylor [32]. The symbol class S^m_{1,0}(Ω) is defined as the set of p ∈ C^∞(Ω×R^d) with the property that for any compact M ⊂ Ω and any multi-indices α, β there exists C_{M,α,β} such that

|D^β_x D^α_ξ p(x,ξ)| ≤ C_{M,α,β} (1 + |ξ|)^{m−|α|}    (4.41)

for all x ∈ M and ξ ∈ R^d. The operator corresponding to the symbol p is denoted P; one says P ∈ OPS^m_{1,0}(Ω). The action of P on a function is defined by means of the Fourier inversion formula. The Fourier transform of φ: Ω → R is φ̂(ξ) = (2π)^{−d} ∫ φ(x) e^{−ix·ξ} dx for ξ ∈ R^d, and

P(x,D)φ(x) = ∫ p(x,ξ) φ̂(ξ) e^{ix·ξ} dξ.    (4.43)

Adjoints and products of pseudodifferential operators are themselves pseudodifferential operators, and there are asymptotic expansions for their symbols; see Chapter 2 of Taylor [32]. The operator P is elliptic of order m if on each compact M ⊂ Ω there are constants C_M and R such that

|p(x,ξ)| ≥ C_M (1 + |ξ|)^m  if x ∈ M, |ξ| ≥ R.    (4.44)
The following lemma is elementary.
Lemma 4.4. If P₀ ∈ OPS^m_{1,0}(Ω) is elliptic of order m and P₁ ∈ OPS^{m−1}_{1,0}(Ω), then P₀ + P₁ ∈ OPS^m_{1,0}(Ω) and is elliptic of order m.
Proof. P₀ + P₁ is clearly an element of OPS^m_{1,0}(Ω). Consider a compact M ⊂ Ω. Since P₀ is elliptic of order m and P₁ ∈ OPS^{m−1}_{1,0}(Ω), there are positive constants R, C₁ and C₂ such that

|p₀(x,ξ)| ≥ C₁ (1 + |ξ|)^m    (4.45)

and

|p₁(x,ξ)| ≤ C₂ (1 + |ξ|)^{m−1}    (4.46)

for all x ∈ M and |ξ| ≥ R. Therefore

|p(x,ξ)| ≥ C₁ (1 + |ξ|)^m − C₂ (1 + |ξ|)^{m−1}.    (4.47)

Hence we can find R' (e.g. R' ≥ max(2C₂/C₁, R)) such that p(x,ξ) satisfies the ellipticity condition. ∎
From the spectral theory of pseudodifferential operators we have the following result.
Theorem 4.6. Let K be a Green's operator for an operator A, where A ∈ OPS^l_{1,0}(Ω) is a pseudodifferential operator with symbol a. Let W ∈ OPS^{2m}_{1,0}(Ω) be a positive elliptic differential operator of order 2m with symbol w(x,ξ). If A*A is elliptic of order 2l and Ω is compact, then the eigenvalues γ_v of W relative to K*K behave as

γ_v ≍ v^{2(m+l)/d}.

Proof. Let P = A*WA. Since W is positive, so is P. By Exercise 4.2 on p. 47 of [32], P ∈ OPS^{2(m+l)}_{1,0}(Ω). Let p be the symbol of P, p₀ = a*(x,ξ) w(x,ξ) a(x,ξ) and p₁ = p − p₀. Then p₀ ∈ S^{2(m+l)}_{1,0}(Ω) and

|p₀(x,ξ)| = |a*(x,ξ) a(x,ξ)| |w(x,ξ)|,

which gives that the operator corresponding to p₀ is elliptic of order 2(m+l). Using an asymptotic expansion of p and applying Theorem 4.4 (p. 46 of [32]) twice, p₁ = p − p₀ ∈ S^{2(m+l)−1}_{1,0}(Ω). Lemma 4.4 then shows that P is elliptic of order 2(m+l). P has a complete set of orthonormal (in L²) eigenfunctions {ψ_v; v = 1,2,…} and corresponding eigenvalues {λ_v; v = 1,2,…}. From the discussion at the beginning of §1, page 295 of [32], and Theorem 2.1 in the same chapter, λ_v ≍ v^{2(m+l)/d}. Letting φ_v(x) = A(x,D)ψ_v(x), γ_v = λ_v as in Theorem 4.4. ∎
5. Application to System Identification
For the abstract system identification problem in (1.10),

η(θ; x) = l_x(u_θ)    (5.1)

where l_x is a continuous linear functional on U and u_θ satisfies A(θ, u_θ) = f. l_x is a component of the measurement operator X_n, u is a function u: Ω → R^q, and the measured data are equivalent to evaluations on u. Let X be the limiting design operator, X_n → X as n → ∞ (convergence in L² norm).

Assumption A.9. (Equivalence to Evaluation)
The mapping X: L²(Ω) → L²(I) is an isomorphism.

Thus the L² norm in the range of X is equivalent to the L² norm in the domain of X. It is sometimes of interest to consider measurements which fill up the frequency domain; the Plancherel theorem shows that (A.9) will be true for such data. General integral measurements (compact X) are excluded by the assumption.

Formally the linearized operator is defined by K_θ φ[x] = l_x(Du_θ φ). From assumption A.1(i)

||K_θ φ||_{L²} ≍ ||Du_θ φ||_{L²}    (5.2)

uniformly in θ. We therefore assume without loss of generality that K_θ φ[x] = Du_θ φ[x]. The form of K_θ follows from the implicit function theorem.
Theorem 5.1. Let Θ, U, F be Banach spaces and suppose A: Θ×U → F. If
(i) A(θ,u) has uniformly continuous Fréchet derivatives with respect to θ and u, ∂_θA and ∂_uA, in a neighborhood of (θ₀,u₀), where A(θ₀,u₀) − f = 0 for some f ∈ F;
(ii) ∂_uA(θ₀,u₀) is invertible;
then there exists a Fréchet differentiable map u_θ: Θ → U defined in a neighborhood N_{θ₀} of θ₀ such that A(θ,u_θ) − f = 0 for θ ∈ N_{θ₀}. Moreover Du_θ satisfies

∂_uA(θ₀,u₀) {Du_θ φ} = − {∂_θA(θ₀,u₀) φ}.

Proof: See Theorem 2.3.5 of [17]. ∎
Remark: The existence of solutions for a given θ₀ is not resolved by the above theorem. For work on quasi-linear differential equations, see Cannon and Ewing [6] and Chapter 8 of Ladyzhenskaya and Ural'tseva [19]. A general theorem is given in Kravaris and Seinfeld [18].
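The sensitivity formula in Theorem 5.1 can be exercised on a small finite-dimensional toy system (everything below is my own construction, not from the paper): solve A(θ,u) = 0 by Newton's method and compare the implicit-function-theorem derivative with a finite difference.

```python
import numpy as np

# Toy system A(theta, u) = B u + 0.1 u^3 + C theta - f = 0 (all choices arbitrary).
# By Theorem 5.1 the sensitivity solves  dA/du (Du_theta phi) = - dA/dtheta phi.
B = np.array([[2.0, 0.3], [0.1, 1.5]])
C = np.array([[1.0, -0.5], [0.2, 0.8]])
f = np.array([1.0, 0.5])

def A(theta, u):
    return B @ u + 0.1 * u**3 + C @ theta - f

def dAdu(u):                                  # Jacobian in u
    return B + np.diag(0.3 * u**2)

def solve_u(theta):                           # Newton iteration for A(theta, u) = 0
    u = np.zeros(2)
    for _ in range(50):
        u = u - np.linalg.solve(dAdu(u), A(theta, u))
    return u

theta0 = np.array([0.3, -0.2])
phi = np.array([1.0, 2.0])
u0 = solve_u(theta0)

Du_phi = np.linalg.solve(dAdu(u0), -C @ phi)  # implicit-function-theorem sensitivity

t = 1e-6                                      # finite-difference check
fd = (solve_u(theta0 + t * phi) - u0) / t
```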
5.1. Examples
(i) Linear Diffusion Equation
Let u(x,t): Ω×[0,T] → R and θ(x): Ω → R.

A(θ,u) = ( ∂u/∂t − (∂/∂x)(θ(x) ∂u/∂x), ∂u/∂x |_{∂Ω}, u(·,0) )'    (5.3)

For θ strictly positive, A corresponds to a parabolic diffusion equation with Neumann boundary conditions. Using Theorem 5.1, K_θφ satisfies

(∂K_θφ/∂t)[x,t] − (∂/∂x)( θ(x) (∂K_θφ/∂x)[x,t] ) = (∂/∂x)( φ(x) (∂u_θ/∂x)[x,t] )    (5.4)

with zero initial and zero Neumann boundary conditions.
(ii) Non-linear Diffusion Equation
Again u(x,t): Ω×[0,T] → R, but now θ(x,u): Ω×R → R.

A(θ,u) = ( ∂u/∂t − (∂/∂x)(θ(x,u) ∂u/∂x), ∂u/∂x |_{∂Ω}, u(·,0) )'    (5.5)

For θ strictly positive, A corresponds to a quasi-linear parabolic diffusion equation with Neumann boundary conditions. The results in Cannon and Ewing [6] show that u_θ exists. K_θφ satisfies

(∂K_θφ/∂t)[x,t] − (∂/∂x)( θ(x,u_θ)(∂K_θφ/∂x)[x,t] + K_θφ[x,t] (∂θ/∂u)(x,u_θ)(∂u_θ/∂x)[x,t] ) = (∂/∂x)( φ(x,u_θ)(∂u_θ/∂x)[x,t] )    (5.6)

with zero initial and zero Neumann boundary conditions.
(iii) Scalar Wave Equation
Let u(x): Ω → R and θ(x): Ω → R.

A(θ,u) = ( [Δ + k²θ(x)]u , u|_{∂Ω} )'    (5.7)

∂Ω is the boundary of Ω. When the potential θ is strictly positive the equation can be solved. K_θφ satisfies

[Δ + k²θ(x)] K_θφ = − k² φ u_θ    (5.8)

with K_θφ[x] = 0 on the prescribed part of ∂Ω.
Remark
It is clear that if A is a differential operator whose boundary conditions do not depend on θ, then K_θφ will satisfy a differential equation with zero boundary conditions and forcing term determined by φ and u_θ. Symbolically we have

P_θ K_θφ = Q_θ φ    (5.9)

where P_θ and Q_θ are differential operators. Intuitively, if P_θ is elliptic of order l and Q_θ is elliptic of order k with l > k, then K_θφ should gain l−k orders of smoothness over φ. Thus if W corresponds to a positive elliptic differential operator of order 2m, we conjecture that the eigenvalues of W relative to K_θ*K_θ should asymptotically behave like v^{2(m+l−k)/d}. In one dimension with constant coefficient equations this intuition is easy to substantiate using Fourier series expansions. However, in general the justification is probably not so straightforward. We now consider some properties of the linearization arising in the linear diffusion example introduced in §1.
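The constant-coefficient Fourier computation alluded to above can be sketched as follows (my own illustration; the orders l = 2, j = 1, m = 2 are arbitrary choices): with P = D^l and Q = D^j, K_θ acts on the frequency-k Fourier mode by multiplication with (ik)^{j−l}, so the eigenvalues of W relative to K_θ*K_θ grow like k^{2(m+l−j)}, a gain of l−j orders, matching v^{2(m+l−k)/d} with d = 1.

```python
import numpy as np

# Constant-coefficient check of the smoothness-gain heuristic behind (5.9):
# P = D^l and Q = D^j act as (ik)^l and (ik)^j on Fourier modes, so K_theta is
# the multiplier (ik)^{j-l}, and W = (D^m)'(D^m) relative to K'K has eigenvalues
#   k^{2m} / k^{2(j-l)} = k^{2(m+l-j)}.
l_ord, j_ord, m = 2, 1, 2
k = np.arange(1.0, 41.0)

gain = k ** (j_ord - l_ord)                  # |multiplier|, decays like k^{-(l-j)}
gamma = k ** (2 * m) / gain ** 2             # = k^{2(m+l-j)} = k^6

slope = np.polyfit(np.log(k), np.log(gamma), 1)[0]   # 2(m+l-j) = 6
```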
5.2. Linear Diffusion
5.2.1. Time Invariant Case
Our discussion will be restricted to the 1-dimensional problem Ω = [0,1]. In the time invariant case it is of interest to identify θ in the equation

(d/dx)( θ(x) (du/dx)(x) ) = f(x)    (5.10)

with f known and u measured with error. The boundary conditions must specify information about u and its first derivative. For simplicity we assume the boundary data are u(0) = a and θ(0)u^{(1)}(0) = b ≠ 0. It is easy to show that K_θφ: [0,1] → R satisfies

(d/dx)( θ(x) (dK_θφ/dx)[x] ) = − (d/dx)( φ(x) (du_θ/dx)(x) )    (5.11)

with K_θφ[0] = 0 and θ(0)(dK_θφ/dx)[0] = −φ(0)(du_θ/dx)(0). Integrating once we have

θ(x) (dK_θφ/dx)[x] = − φ(x) (du_θ/dx)(x).    (5.12)

For θ strictly positive (or negative) we can integrate again, using the first boundary condition, to get

K_θφ[x] = − ∫₀^x φ(s) h_θ(s) ds    (5.13)

where h_θ(x) = (du_θ/dx)(x)/θ(x). Note that K_θφ corresponds to a Hammerstein type linearization like the ones considered in §4: K_θφ = −K(h_θ φ), where Kφ[x] = ∫₀^x φ(t) dt. Formally K is a Green's operator for the differential operator A(x,D) = D with a zero boundary condition at x = 0. We take L(x,D) = D^m. Then LA = D^{m+1} and

||LAφ||²_{L²} + ||φ||²_{L²} ≍ ||φ||²_{W₂^{m+1}}    (5.14)

so from the discussion at the beginning of §4 the Sobolev norm identification assumption (A.4) is satisfied. It follows from Theorem 4.3 that the eigenvalues of W relative to K*K behave like v^{2(m+1)} asymptotically. Stability results are obtained with some conditions on the forcing term. First a lemma.
Lemma 5.2. Let a ≥ 0 be an integer and suppose
(i) f ∈ W₂^a[0,1] with f(x) ≥ δ > 0 and b > 0 (or f(x) ≤ −δ < 0 and b < 0);
(ii) N(ε,R) = { θ : θ ∈ B^{1+a}(R) and θ(x) ≥ ε > 0 }.
Then h_θ ∈ B^{1+a}(R') and |h_θ| ≥ η > 0, where R' and η depend on R, δ, ε, f and b.
Proof: Integrating (5.10) and using the boundary condition gives

(du_θ/dx)(x) = (1/θ(x)) { ∫₀^x f(t) dt + b },  so  h_θ(x) = (du_θ/dx)(x)/θ(x) = (1/θ(x)²) { ∫₀^x f(t) dt + b }.

It follows from (i) and (ii) that there is some η > 0 such that |h_θ(x)| ≥ η. By direct computation it is easy to find C > 0 such that

||h_θ||²_{W₂^{1+a}} ≤ C ||1/θ²||²_{W₂^{1+a}} { b² + ||f||²_{W₂^a} } ≡ (R')².    (5.15)

∎
From Theorems 4.1 and 4.2 we obtain the next result.
Theorem 5.3. For some a ≥ 0 (an integer), ε > 0 and R > 0, let N_{θ₀} ⊂ N(ε,R) and f ∈ W₂^a; then assumptions (A.3) and (A.5) hold, with s = 2 + a in (A.5).
Proof: For (A.3), let A = D and Kφ[x] = ∫₀^x φ(y) dy. Since a ≥ 0, Lemma 5.2 gives that h_θ ∈ B¹(R') for θ ∈ N_{θ₀}; moreover h_θ is bounded away from zero (uniformly in θ ∈ N_{θ₀}). (A.3) now follows by Theorem 4.1.
From Lemma 5.2 we can take H = B^{a+1}(R') in Theorem 4.2. Since A = D, condition (i) of that theorem is clearly satisfied for k = a + 1, l = 1. We conclude from Theorem 4.2 that (A.5) is satisfied for s = 2 + a. ∎
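To make (5.13) concrete, the closed form K_θφ = −∫₀^x φ h_θ can be checked against a direct perturbation of u_θ (a numerical sketch; the grid, θ, f and b below are arbitrary choices, not from the paper):

```python
import numpy as np

# Sketch of (5.10)-(5.13). Integrating (5.10) with flux data theta(0)u'(0) = b:
#   u_theta'(x) = (int_0^x f + b) / theta(x),   h_theta = u_theta' / theta,
# and the linearization is K_theta phi = -int_0^x phi h_theta ds.
n = 1000
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]

def cumint(y):                               # cumulative trapezoid quadrature
    return np.concatenate([[0.0], np.cumsum(0.5 * (y[1:] + y[:-1])) * dx])

f = 1.0 + x
b = 1.0
theta = 1.0 + 0.5 * np.sin(np.pi * x)
phi = np.cos(np.pi * x)

g = cumint(f) + b                            # theta * u' = int f + b
u_prime = g / theta
h_theta = u_prime / theta                    # = g / theta^2, bounded away from 0

K_theta_phi = -cumint(phi * h_theta)         # closed form (5.13)

# Finite-difference check: perturb theta and differentiate u_theta numerically.
t = 1e-6
u_prime_t = g / (theta + t * phi)
fd = cumint((u_prime_t - u_prime) / t)       # approx derivative of u_theta at theta
```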
Letting r = 2(m+1), we can use Theorems 3.1 and 3.2 to read off the behavior of the bias and variability of the linearized estimators. Combining this with the results to be presented in §6 we get the following result.
Theorem 5.4 (Convergence of Constrained Regularization Estimator)
Suppose θ₀ ∈ W₂^k[0,1] for k ≥ 3.25 and θ₀(x) ≥ ε > 0. Let m ≤ k, M = m+1 and p₀ = (k+1)/M. If λ_n is a sequence such that n⁻¹λ_n^{−5/2M} → 0, then for 0 ≤ p ≤ 2 − 5/2M and any sequence of λ's tending to zero with λ ≥ λ_n,

||θ_λ − θ₀||²_p ≲ λ^{(p₀ − p)}
||θ_{nλ} − θ_λ||²_p = O_p( n⁻¹ λ^{−(p + 1/2M)} ).

Thus

||θ_{nλ} − θ₀||²_p ≲ λ^{(p₀ − p)} + O_p( n⁻¹ λ^{−(p + 1/2M)} ).

An upper bound on the optimal rate of convergence is O_p( n^{−2M(p₀ − p)/(2Mp₀ + 1)} ), which occurs when λ ≍ n^{−2M/(2Mp₀ + 1)}. This rate is achievable if k ≥ 3.5.
Proof: Theorems 6.8, 6.1 and 3.1 give the bound on the bias; Theorems 6.8, 6.2 and 3.2 give the stochastic bound. The Sobolev norm identification gives the bound on the convergence rate in Sobolev norm. Equating the upper bounds on the systematic and random errors gives the upper bound on the optimal rate of convergence, which occurs when λ ≍ n^{−2M/(2Mp₀+1)}; λ ≥ λ_n if k ≥ 3.5. ∎
5.2.2. Parabolic Case
It seems reasonable that the degree of compactness of K_θ should be determined by the elliptic component of the differential operator. Standard energy type estimates are consistent with this notion; see (42.19) of Treves [34] for example. We conjecture that convergence results obtained for the time invariant case carry over to the parabolic situation. By this we mean that the asymptotic behavior of the eigenvalues of W relative to K_θ*K_θ is the same in the time dependent case as in the time invariant case. To provide some numerical evidence for this we computed two sets of eigenvalues, {γ_v} and {γ̃_v}, where γ_v are estimates of the eigenvalues of

||D²φ||²_{L²} / ||K_θφ||²_{L²}    (5.16)

and γ̃_v are estimates of the eigenvalues of

||D²φ||²_{L²} / ||Kφ||²_{L²}    (5.17)

with Kφ[x] = ∫₀^x φ(t) dt. The estimates are obtained by a Rayleigh-Ritz method (see page 79 of Weinberger [35]), the approximating subspace being the span of 30 cubic B-splines with equi-spaced knots in [0,1]. Plots of log(γ_v) and log(γ̃_v) against log(v), called eigensequence plots, were introduced by Nychka, Wahba, Goldfarb and Pugh [22]. The eigensequence plots are given in Figures 5.1 and 5.2. Both plots are remarkably linear, suggesting that the eigenvalues of (5.16) and (5.17) have an algebraic rate of growth. Theoretically, using Theorem 4.4 for example, γ̃_v ≍ v⁶, and if the parabolic problem were like the time invariant one we would also have γ_v ≍ v⁶. This means that the slopes of the eigensequence plots would both be around 6. By regression we obtain slope estimates of 5.6 (±.36) for γ_v and 5.8 (±.25) for γ̃_v (standard errors in brackets). The theoretical value of 6 is quite consistent with both estimates, lending support to our conjecture.
6. Justification for the Linearization
Recall that the linearized estimators are defined as

θ̄_λ = θ₀ − [U(θ₀) + λW(θ₀)]⁻¹ Z_λ(θ₀)

and

θ̄_{nλ} = θ_λ − [U(θ_λ) + λW(θ_λ)]⁻¹ Z_{nλ}(θ_λ)    (6.1)

where θ₀ ∈ C and Z_λ(θ_λ) = 0. We will show that θ_λ and the regularization estimator θ_{nλ} itself both exist and that ||θ_λ − θ₀||_p ≍ ||θ̄_λ − θ₀||_p and ||θ_{nλ} − θ_λ||_p ≍ ||θ̄_{nλ} − θ_λ||_p. Our results will show that the linearized convergence theorems described in §3 accurately predict the first order asymptotic bias and variability of the constrained non-linear regularization estimator. An assumption will be made which says that θ₀ lies in the interior of the constraint set. The linearizations are justified by arguments very similar to those used in Cox and O'Sullivan [12]. Intuitively, the idea amounts to showing that Z_λ and Z_{nλ} are roughly linear and that the linearizations behave like Newton-Raphson linearizations for n sufficiently large and λ sufficiently small.
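The Newton-Raphson picture can be illustrated in miniature (a toy of my own construction: scalar θ, η(θ;x) = e^{θx}, J(θ) = θ², noise-free data): iterating F_λ(φ) = φ − G_λ⁻¹ Z_λ(θ₀ + φ), with the "information" G_λ frozen at θ₀, contracts to the root θ_λ of Z_λ.

```python
import numpy as np

# Scalar toy of the fixed-point argument behind (6.1):
# eta(theta; x) = exp(theta x), J(theta) = theta^2, noise-free data at theta0.
xg = np.linspace(0.0, 1.0, 50)
theta0 = 0.5
z = np.exp(theta0 * xg)                      # noise-free observations
lam = 0.05

def Z(th):                                   # gradient of the penalized criterion
    r = z - np.exp(th * xg)
    return -2.0 * np.mean(r * xg * np.exp(th * xg)) + 2.0 * lam * th

U = 2.0 * np.mean((xg * np.exp(theta0 * xg)) ** 2)   # Gauss-Newton information
G = U + 2.0 * lam                            # G_lam, frozen at theta0

phi = 0.0
for _ in range(60):                          # contraction iteration F_lam
    phi = phi - Z(theta0 + phi) / G

theta_lam = theta0 + phi                     # root of Z_lam near theta0
```

The penalty pulls the root slightly below θ₀ = 0.5, which is exactly the deterministic bias that d(λ,p) measures below.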
6.1. Some General Results
Let S*(R) be the ball of radius R about the origin in Θ_{p*}, and let S*₀(R) = {θ₀} ⊕ S*(R) denote the corresponding ball about θ₀. Similarly, S(R,λ) is the ball of radius R about the origin in Θ_p and S₀(R,λ) = {θ₀} ⊕ S(R,λ).

Assumption A.10. (θ₀ lies in the interior of C and N_{θ₀})
For some p* and R > 0, S*₀(R) ⊂ C and S*₀(R) ⊂ N_{θ₀}.

To describe the theorems we must introduce several quantities: for given p*, p in R, let

κ₂(λ) = sup_{u ∈ S*(1)} ||G_λ⁻¹(θ₀)[G_λ(θ₀) − DZ_λ(θ₀)] u||_p    (6.2)
κ₂*(λ) = sup_{u ∈ S*(1)} ||G_λ⁻¹(θ₀)[G_λ(θ₀) − DZ_λ(θ₀)] u||_{p*}

κ₃(λ,R) = sup_{u,v ∈ S*(R), w ∈ S*(1)} ||G_λ⁻¹(θ₀) D²Z_λ(θ₀ + u) v w||_p    (6.3)
κ₃*(λ,R) = sup_{u,v ∈ S*(R), w ∈ S*(1)} ||G_λ⁻¹(θ₀) D²Z_λ(θ₀ + u) v w||_{p*}

κ₂(n,λ) = sup_{u ∈ S(1,λ)} ||G_λ⁻¹(θ_λ)[G_λ(θ_λ) − DZ_{nλ}(θ_λ)] u||_p    (6.4)
κ₂*(n,λ) = sup_{u ∈ S(1,λ)} ||G_λ⁻¹(θ_λ)[G_λ(θ_λ) − DZ_{nλ}(θ_λ)] u||_{p*}

κ₃(n,λ,R) = sup_{u,v ∈ S(R,λ), w ∈ S(1,λ)} ||G_λ⁻¹(θ_λ) D²Z_{nλ}(θ_λ + u) v w||_p    (6.5)
κ₃*(n,λ,R) = sup_{u,v ∈ S(R,λ), w ∈ S(1,λ)} ||G_λ⁻¹(θ_λ) D²Z_{nλ}(θ_λ + u) v w||_{p*}    (6.6)

d(λ,p) and d_n(λ,p) are such that

d(λ,p) ≍ ||θ̄_λ − θ₀||_p = ||G_λ⁻¹(θ₀) λW(θ₀) θ₀||_p
O_p{ d_n(λ,p) } = ||θ̄_{nλ} − θ_λ||_p = ||G_λ⁻¹(θ_λ) Z_{nλ}(θ_λ)||_p.    (6.7)

Finally let r(λ) and r*(λ) be such that

κ₂(λ) + κ₃(λ, x(λ)) ≍ r(λ)
κ₂*(λ) + κ₃*(λ, x(λ)) ≍ r*(λ)    (6.8)

whenever x(λ) ≍ d(λ,p*), uniformly in λ as λ → 0. Similarly, r_n(λ) and r_n*(λ) are such that

O_p{ κ₂(n,λ) + κ₃(n,λ, x_n(λ)) } = r_n(λ)
O_p{ κ₂*(n,λ) + κ₃*(n,λ, x_n(λ)) } = r_n*(λ)    (6.9)

whenever x_n(λ) ≍ d_n(λ,p*), uniformly in λ and n as n → ∞ and for λ ∈ [λ_n, λ₀], where λ₀ < ∞ and λ_n is some given sequence tending to zero in a manner to be prescribed later on.
Theorem 6.1. (Existence of θ_λ and the Bias Approximation)
Suppose (A.10) holds. If r*(λ) → 0 as λ → 0, then we can find λ₀ > 0 and constants K₀ and K_p for p ∈ R such that
(i) for λ ∈ [0,λ₀] there is a unique θ_λ ∈ S*₀(K₀ d(λ,p*)) satisfying Z_λ(θ_λ) = 0 and θ_λ ∈ C ∩ N_{θ₀};
(ii) ||θ̄_λ − θ_λ||_p ≤ K_p r(λ) d(λ,p) for any p ∈ R.
Proof: The argument is very similar to Theorem 4.1 of [12]; we give an outline. The idea is to show that for some K₀ and λ₀ > 0 the mapping

F_λ(φ) = φ − G_λ⁻¹(θ₀) Z_λ(θ₀ + φ)    (6.10)

is a contraction on the ball S*(K₀ d(λ,p*)) for each λ ∈ [0,λ₀]. Choose λ₀ and K₀ so that ||θ̄_λ − θ₀||_{p*} ≤ K₀ d(λ,p*)/2 and S*₀(K₀ d(λ,p*)) ⊂ C ∩ N_{θ₀} for all λ ∈ [0,λ₀]. Then

||F_λ(φ)||_{p*} = ||F_λ(φ) − (θ̄_λ − θ₀) + (θ̄_λ − θ₀)||_{p*} ≤ ||φ − G_λ⁻¹(θ₀) Z_λ(θ₀ + φ) − (θ̄_λ − θ₀)||_{p*} + ||θ̄_λ − θ₀||_{p*}.    (6.11)

Thus

||F_λ(φ)||_{p*} ≤ ||G_λ⁻¹(θ₀)[Z_λ(θ₀ + φ) − Z_λ(θ₀) − G_λ(θ₀)φ]||_{p*} + ||θ̄_λ − θ₀||_{p*}
≤ ||G_λ⁻¹(θ₀)[DZ_λ(θ₀) − G_λ(θ₀)]φ||_{p*} + sup_{φ* ∈ S*(K₀d(λ,p*))} ||G_λ⁻¹(θ₀) D²Z_λ(θ₀ + φ*)φφ||_{p*} + K₀ d(λ,p*)/2
≤ [ {κ₂*(λ) + κ₃*(λ, K₀ d(λ,p*))} + 1/2 ] K₀ d(λ,p*)
≤ [ r*(λ) + 1/2 ] K₀ d(λ,p*) ≤ (3/4) K₀ d(λ,p*)    (6.12)

for λ sufficiently small. By a similar analysis, for p ∈ R,

||F_λ(φ₁) − F_λ(φ₂)||_p ≤ { κ₂(λ) + κ₃(λ, K₀ d(λ,p*)) } ||φ₁ − φ₂||_p ≤ K_p r(λ) ||φ₁ − φ₂||_p.    (6.13)

Since r*(λ) → 0, λ₀ and K₀ can be found so that F_λ is a contraction on S*(K₀ d(λ,p*)) for λ ∈ [0,λ₀]. Applying a fixed point theorem (Theorem 9.23 of Rudin [29]), for 0 ≤ λ ≤ λ₀ there is a unique φ_λ ∈ S*(K₀ d(λ,p*)) for which F_λ(φ_λ) = φ_λ. Letting θ_λ = θ₀ + φ_λ, θ_λ satisfies the constraints and is the only root of Z_λ in S*₀(K₀ d(λ,p*)). This proves part (i).
For p ∈ R,

||θ̄_λ − θ_λ||_p = ||G_λ⁻¹(θ₀) G_λ(θ₀)[θ̄_λ − θ_λ]||_p
= ||G_λ⁻¹(θ₀)[G_λ(θ₀)(θ̄_λ − θ₀) − G_λ(θ₀)(θ_λ − θ₀)]||_p
= ||G_λ⁻¹(θ₀) G_λ(θ₀)[F_λ(0) − F_λ(φ_λ)]||_p
= ||F_λ(φ_λ) − F_λ(0)||_p ≤ K_p r(λ) d(λ,p)    (6.14)

from (6.13). ∎
Theorem 6.2. (Existence of θ_{nλ} and the Variability Approximation)
Again suppose (A.10) is true. Let N_{θ₀} be such that (A.3) holds and suppose there is λ₀ > 0 so that θ_λ ∈ N_{θ₀} for λ ∈ [0,λ₀]. Let λ_n be such that r_n*(λ) → 0 as n → ∞, uniformly for λ ∈ [λ_n,λ₀]. Consider the event E_n(λ) defined by:
"There is a unique root θ_{nλ} of Z_{nλ} in S₀(K₀ d_n(λ,p*),λ) ∩ C. For p ∈ R, θ_{nλ} satisfies ||θ̄_{nλ} − θ_{nλ}||_p ≤ K_p r_n(λ) d_n(λ,p)."
For each δ > 0 and p ∈ R there are constants n₀, K_p and K₀ such that for all n ≥ n₀ and λ ∈ [λ_n,λ₀],

P( E_n(λ) ) ≥ 1 − δ.

Proof: Replace the linearization used in Theorem 4.2 of [12] by DZ_{nλ}(θ_λ). Because the form of G_λ(θ) plays no role, the same proof works. (A.10) guarantees that eventually θ_{nλ} ∈ C with arbitrarily high probability. ∎
6.2. Application to Constrained Non-Linear Regularization
We need to get a handle on the degree of non-linearity in the data functionals. In (A.1) we made the assumption that the data functionals were three times Fréchet differentiable. With D denoting differentiation with respect to θ, let

K̇_θ uφ[x] = D²η(θ;x) uφ[x]    (6.15)

and

K̈_θ uvφ[x] = D³η(θ;x) uvφ[x].    (6.16)

We will make use of a further assumption.

Assumption A.11. (Non-linearity of the Data Functionals)
For some t ∈ (d,M] there are positive constants α₁, β₁, α₂, β₂, γ₁, γ₂ and δ such that for all ρ ∈ {0,1} and θ ∈ N_{θ₀}:
(i) if Kφ ∈ W₂^{γ₁ρ}(Ω) and Ku ∈ W₂^{α₁+β₁ρ}(Ω), then

||K̇_θ uφ||²_{W₂^{tρ}} ≲ ||Kφ||²_{W₂^{γ₁ρ}} ||Ku||²_{W₂^{α₁+β₁ρ}};

(ii) if Kφ ∈ W₂^{δ+γ₂ρ}(Ω), Ku ∈ W₂^{α₂+β₂ρ}(Ω) and Kv ∈ W₂^{α₂+β₂ρ}(Ω), then

||K̈_θ uvφ||²_{W₂^{tρ}} ≲ ||Kφ||²_{W₂^{δ+γ₂ρ}} ||Ku||²_{W₂^{α₂+β₂ρ}} ||Kv||²_{W₂^{α₂+β₂ρ}}.
From (A.11) we have a technical but useful lemma.
Lemma 6.3. Suppose J is quadratic and that (A.1), (A.3), (A.4), (A.5) and (A.11) hold.
(i) If Kv, Kw ∈ W₂^{α₂}(Ω) ∩ W₂^{α₁}(Ω), Kφ ∈ W₂^{δ}(Ω) and Ku ∈ W₂^{ε+d/2}(Ω) with ε > 0 and θ₀ + u ∈ N_{θ₀}, then

|D²Z_λ(θ₀ + u) v w φ|² ≲ ||Kφ||²_{L²} ||Kv||²_{W₂^{α₁}} ||Kw||²_{W₂^{α₁}}
+ ||Kφ||²_{W₂^δ} ||Ku||²_{W₂^{ε+d/2}} ||Kv||²_{W₂^{α₂}} ||Kw||²_{W₂^{α₂}}.    (6.17)

(ii) If Kv, Kw ∈ W₂^{α₂+β₂}(Ω) ∩ W₂^{α₁+β₁}(Ω), Kφ ∈ W₂^{δ+γ₂}(Ω) ∩ W₂^{γ₁}(Ω) and Ku ∈ W₂^{ε+3d/2}(Ω) with ε > 0 and θ_λ + u ∈ N_{θ₀}, then

|D²Z_{nλ}(θ_λ + u) v w φ|² ≲ ||Kφ||²_{L²} ||Kv||²_{W₂^{α₁}} ||Kw||²_{W₂^{α₁}}
+ ||Kφ||²_{W₂^δ} ||Ku||²_{W₂^{ε+d/2}} ||Kv||²_{W₂^{α₂}} ||Kw||²_{W₂^{α₂}}
+ k_n² ||Kφ||²_{W₂^{γ₁}} ||Kv||²_{W₂^{α₁+β₁}} ||Kw||²_{W₂^{α₁+β₁}}
+ k_n² ||Kφ||²_{W₂^{δ+γ₂}} ||Ku||²_{W₂^{ε+3d/2}} ||Kv||²_{W₂^{α₂+β₂}} ||Kw||²_{W₂^{α₂+β₂}}
+ sup_{x∈Ω} |S_n(x)|² ||Kφ||²_{W₂^{δ+γ₂}} ||Kv||²_{W₂^{α₂+β₂}} ||Kw||²_{W₂^{α₂+β₂}}.    (6.18)

(iii) If θ_λ ∈ N_{θ₀}, then for Ku ∈ W₂^{d/2+ε}(Ω) ∩ W₂^{α₁+β₁}(Ω) and Kφ ∈ W₂^{d/2+ε}(Ω) ∩ W₂^{γ₁}(Ω) with ε > 0,

|[G_λ(θ_λ) − DZ_{nλ}(θ_λ)] u φ|² ≲ k_n² ||Kφ||²_{W₂^{d/2+ε}} ||Ku||²_{W₂^{d/2+ε}}
+ sup_{x∈Ω} |S_n(x)|² ||Kφ||²_{W₂^{γ₁}} ||Ku||²_{W₂^{α₁+β₁}}
+ ||K(θ_λ − θ₀)||²_{W₂^{d/2+ε}} ||Kφ||²_{L²} ||Ku||²_{W₂^{α₁}}
+ k_n² ||K(θ_λ − θ₀)||²_{W₂^{3d/2+ε}} ||Kφ||²_{W₂^{γ₁}} ||Ku||²_{W₂^{α₁+β₁}}.    (6.19)
Proof: The proof is straightforward but somewhat tedious; we use integration by parts together with the Cauchy-Schwarz, Hölder and Sobolev inequalities. For part (i) let θ = θ₀ + u. Then

|D²Z_λ(θ) v w φ|² ≲ | ∫ [η(θ₀; x) − η(θ; x)] K̈_θ vwφ[x] dF(x) |²
+ | ∫ K̇_θ vw[x] K_θ φ[x] dF(x) |²
+ | ∫ K_θ v[x] K̇_θ wφ[x] dF(x) |²
+ | ∫ K_θ w[x] K̇_θ vφ[x] dF(x) |².    (6.20)

Applying the Cauchy-Schwarz inequality, the fact that F has a density which is bounded away from zero and infinity, and A.11(i), the last three terms can each be bounded by a constant times

||Kφ||²_{L²} ||Kv||²_{W₂^{α₁}} ||Kw||²_{W₂^{α₁}}.    (6.21)

For the first term we have

| ∫ [η(θ₀; x) − η(θ; x)] K̈_θ vwφ[x] dF(x) |² ≲ ||η(θ₀; ·) − η(θ; ·)||²_{L²} ||K̈_θ vwφ||²_{L²}.    (6.22)

Expanding η(θ₀; x) − η(θ₀ + u; x) and proceeding as in Theorem 3.2 we get

||η(θ₀; ·) − η(θ; ·)||_{L²} ≲ sup_{θ ∈ N_{θ₀}} ||K_θ u||_{L²} ≲ ||Ku||_{W₂^{d/2+ε}}  by (A.5),

where ε > 0 is arbitrarily small. But from (A.11)(ii),

||K̈_θ vwφ||²_{L²} ≲ ||Kφ||²_{W₂^δ} ||Kv||²_{W₂^{α₂}} ||Kw||²_{W₂^{α₂}}.    (6.23)

This proves (i).
For part (ii)
ID2z4 (Oe + u)vw4012. ID2Z(Ox + u)vw12 (6.24)
+ ID2[ZnX(OX+u)_ZX(OX+u)]vw412
The first quantity on the left had side is analyzed like in part (i) which gives the first two terms
of the desired upper bound in (6.18). Let 0 = Ox + u
D2[Z (8) - Z(e)]vw42 < - £eiK0v[xi] 12 (6.25)ni=1
+ ( I Ji(eo ; x) -r1(0 ; x)]K0vw 4[x ]d (Fn -F)(x)I2
+ I Kvw [X ]K0O[x ]d (Fn-F)(x) 12
+ I KeV4[X]Kew [x ]d (Fn-F)(x) I2
+ IJKow4[x]Kev[x]d(Fn-F)(x)I2)
The case p = 1 in (A.11), integration by parts and an analysis identical to that used in part (i)
lead to the k_n^2 terms in the upper bound in (6.18). Finally,

n^{-1}\sum_{i=1}^n \epsilon_i K_\theta vw\phi[x_i] = \int K_\theta vw\phi[x]\,dS_n(x)   (6.26)
and applying integration by parts once again,

|n^{-1}\sum_{i=1}^n \epsilon_i K_\theta vw\phi[x_i]|^2 \le \sup_x |S_n(x)|^2 \|K_\theta vw\phi\|_{W_2^t}^2 .   (6.27)

Recall t > d, so from (A.11)(ii) this term may be bounded in the manner we wish. This proves
part (ii).
For part (iii),

|[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\phi|^2 \le |\int K_{\theta_\lambda} u[x]\, K_{\theta_\lambda}\phi[x]\,d(F_n - F)(x)|^2 + |\int K_{\theta_\lambda} u\phi[x]\,dS_n(x)|^2   (6.29)
+ |\int [\eta(\theta_0; x) - \eta(\theta_\lambda; x)] K_{\theta_\lambda} u\phi[x]\,dF(x)|^2 + |\int [\eta(\theta_0; x) - \eta(\theta_\lambda; x)] K_{\theta_\lambda} u\phi[x]\,d(F_n - F)(x)|^2 .

None of the terms presents new difficulties. Using integration by parts and (A.11), the desired
bound follows quite easily. ∎
Making use of this lemma we can describe the behavior of the quantities r(\lambda) and r_n(\lambda) in
(6.8) and (6.9).

Theorem 6.4. (Behavior of r^*(\lambda) and r(\lambda))

Suppose \theta_0 \in \Theta_{p_0}, J is quadratic and assumptions (A.3), (A.4), (A.5) and (A.11) hold. For
\epsilon > 0 arbitrarily small, let

    a = \max\{\epsilon + d/2M,\ \alpha_1/M,\ \alpha_2/M\}   and
    b = \min\{(p_0 - d/2M)/2,\ (2p_0 - d/2M - \delta/M)/3,\ p_0,\ 2 - d/2M - \delta/M\}.

If p^* \in [a, b) then r^*(\lambda) \to 0 as \lambda \to 0 and r(\lambda) \le \lambda^{(p^* - p)/2} for
-d/2M \le p \le 2 - d/2M - \delta/M.
Proof: Since J is quadratic, K_2(\lambda) = 0 and

K_3(\lambda, R) = \sup_{u, v \in S^*(R),\ w \in S^*(1)} \|G_\lambda^{-1}(\theta_0) D^2 Z_\lambda(\theta_0 + u)vw\|_p .   (6.30)

Using the spectral expansion for \|\cdot\|_p,

\|G_\lambda^{-1}(\theta_0) D^2 Z_\lambda(\theta_0 + u)vw\|_p^2 = \sum_\nu [1 + \gamma_\nu]^p [1 + \lambda\gamma_\nu]^{-2} [D^2 Z_\lambda(\theta_0 + u)vw\phi_\nu]^2 .   (6.31)

Lemma 6.3(i) gives

[D^2 Z_\lambda(\theta_0 + u)vw\phi_\nu]^2 \le \|K\phi_\nu\|_{L_2}^2 \|Kv\|_{\alpha_1}^2 \|Kw\|_{\alpha_2}^2   (6.32)
+ \|K\phi_\nu\|_{\delta}^2 \|Ku\|_{d/2+\epsilon}^2 \|Kv\|_{\alpha_1}^2 \|Kw\|_{\alpha_2}^2

where \epsilon > 0 is arbitrarily small. From (A.4), \|K\theta\|_{W_2^q} \le \|\theta\|_{q/M}, so
\|Kv\|_{\alpha_1}^2 \le R^2 for v \in S^*(R). We have similar bounds for u \in S^*(R) and w \in S^*(1).
This gives

[D^2 Z_\lambda(\theta_0 + u)vw\phi_\nu]^2 \le \|\phi_\nu\|_0^2 R^2 \cdot 1^2 + \|\phi_\nu\|_{\delta/M}^2 R^2 R^2 \cdot 1^2 .   (6.33)

But \|\phi_\nu\|_0 = 1 and \|\phi_\nu\|_{\delta/M}^2 = [1 + \gamma_\nu]^{\delta/M}, so substituting and using the formula (3.15) for
c_r(\lambda, p) (p + \delta/M < 2 - d/2M) we have

K_3^2(\lambda, R) \le R^2 \lambda^{-(p + d/2M)} + R^4 \lambda^{-(p + d/2M + \delta/M)} .   (6.34)

From Theorem 3.1 and the constraint on p^*, x(\lambda) \le d_\lambda(\lambda, p^*) \le \lambda^{(p_0 - p^*)/2}, so

r^2(\lambda) = K_3^2(\lambda, x(\lambda)) \le \lambda^{(p^* - p)} \{ \lambda^{(p_0 - d/2M - 2p^*)} + \lambda^{(2p_0 - \delta/M - d/2M - 3p^*)} \} .   (6.35)

The upper bounds on p^* imply that \lambda^{(p_0 - d/2M - 2p^*)} and \lambda^{(2p_0 - \delta/M - d/2M - 3p^*)} both tend to
zero, proving the result. ∎
More stringent conditions are used to obtain results for the behavior of r_n^*(\lambda) and r_n(\lambda). We
have the following theorem.
Theorem 6.5. (Behavior of r_n^*(\lambda) and r_n(\lambda))

Again suppose (A.3), (A.4), (A.5) and (A.11) hold and that \theta_0 \in \Theta_{p_0} where p_0 > 2d/M. For
\epsilon > 0 arbitrarily small, let

    a = \max\{\epsilon + 3d/2M,\ \alpha_1/M,\ \alpha_2/M,\ (\alpha_1 + \beta_1)/M,\ (\alpha_2 + \beta_2)/M\}   and
    b = \min\{(p_0 - d/2M)/2,\ (p_0 - \delta/M),\ p_0,\ 2 - 3d/2M,\ 2 - d/2M - \gamma_1/M,\ 2 - d/2M - (\delta + \gamma_2)/M\}.

If \gamma_1, \delta + \gamma_2 \le 2d and, for some p^* \in [a, b) with 2d/M \le p^* \le s/M, \lambda_n is a sequence such
that n^{-1}\lambda_n^{-2(p^* + d/2M)} \to 0, then for any sequence of \lambda's tending to zero with \lambda \ge \lambda_n,
r_n^*(\lambda) \to 0 as \lambda \to 0 and r_n(\lambda) \le \lambda^{(p^* - p)/2} for -d/2M \le p \le 2 - 5d/2M.
Proof: Choose \lambda_0 sufficiently small so that, via Theorem 6.4 and Theorem 6.1, \hat\theta_\lambda exists;
using Theorem 3.2 and Markov's inequality we have \|\hat\theta_\lambda - \theta_\lambda\|_p^2 = O_p(n^{-1}\lambda^{-(p + d/2M)}) for
-d/2M \le p \le 2 - 3d/2M. Now
K_3(n, \lambda, R) = \sup_{u, v \in S_n^*(R, \lambda),\ w \in S_n^*(1, \lambda)} \|G_\lambda^{-1}(\theta_\lambda) D^2 Z_{n\lambda}(\theta_\lambda + u)vw\|_p .   (6.36)

Expanding the norm,

\|G_\lambda^{-1}(\theta_\lambda) D^2 Z_{n\lambda}(\theta_\lambda + u)vw\|_p^2 = \sum_\nu [1 + \gamma_{\lambda\nu}]^p [1 + \lambda\gamma_{\lambda\nu}]^{-2} [D^2 Z_{n\lambda}(\theta_\lambda + u)vw\phi_{\lambda\nu}]^2 .   (6.37)

Now we apply Lemma 6.3(ii). For \lambda \le \lambda_0, \theta_\lambda \in N_{\theta_0}, so (A.3), (A.4) and (A.5) give
\|K\phi_{\lambda\nu}\|_{W_2^t} \le \|\phi_{\lambda\nu}\|_{\lambda, t/M} for 0 \le t \le s. Using (A.4) and the lower
bound on p^*, \|Ku\|_{\alpha_1 + \beta_1} \le \|u\|_{(\alpha_1 + \beta_1)/M} \le \|u\|_{\lambda, p^*} \le R, etc. Thus

[D^2 Z_{n\lambda}(\theta_\lambda + u)vw\phi_{\lambda\nu}]^2 \le \|\phi_{\lambda\nu}\|_0^2 R^2 \cdot 1^2 + \|\phi_{\lambda\nu}\|_{\delta/M}^2 R^4 \cdot 1^2   (6.38)
+ k_n^2 \|\phi_{\lambda\nu}\|_{\gamma_1/M}^2 R^2 \cdot 1^2 + k_n^2 \|\phi_{\lambda\nu}\|_{(\delta + \gamma_2)/M}^2 R^4 \cdot 1^2   (6.39)
+ S_n^2 \|\phi_{\lambda\nu}\|_{(\delta + \gamma_2)/M}^2 R^2 \cdot 1^2 .

But \|\phi_{\lambda\nu}\|_p^2 = [1 + \gamma_{\lambda\nu}]^p and \gamma_{\lambda\nu} \asymp \nu^{2M/d}, so from (3.15), using the upper bound on p^*,

K_3^2(n, \lambda, R) \le R^2 \lambda^{-(p + d/2M)} + R^4 \lambda^{-(p + d/2M + \delta/M)} + k_n^2 \lambda^{-(p + d/2M + \gamma_1/M)} R^2 + k_n^2 \lambda^{-(p + d/2M + (\delta + \gamma_2)/M)} R^4
+ S_n^2 \lambda^{-(p + d/2M + (\delta + \gamma_2)/M)} R^2 .
By a similar analysis,

K_2(n, \lambda) = \sup_{u \in S_n^*(1, \lambda)} \|G_\lambda^{-1}(\theta_\lambda)[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\|_p   (6.40)

where

\|G_\lambda^{-1}(\theta_\lambda)[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\|_p^2 = \sum_\nu [1 + \gamma_{\lambda\nu}]^p [1 + \lambda\gamma_{\lambda\nu}]^{-2} [[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\phi_{\lambda\nu}]^2 .

Using Lemma 6.3(iii), the constraints on p^* and applying (A.3), (A.4) and (A.5) as above, we
have for u \in S_n^*(1, \lambda)

[[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\phi_{\lambda\nu}]^2 \le k_n^2 \|\phi_{\lambda\nu}\|_{(d+\epsilon)/M}^2 \cdot 1^2 + S_n^2 \|\phi_{\lambda\nu}\|_{\gamma_1/M}^2 \cdot 1^2   (6.41)
+ \|\theta_\lambda - \theta_0\|_{d/2M + \epsilon}^2 \|\phi_{\lambda\nu}\|_0^2 \cdot 1^2 + k_n^2 \|\theta_\lambda - \theta_0\|_{3d/2M + \epsilon}^2 \|\phi_{\lambda\nu}\|_{\gamma_1/M}^2 \cdot 1^2 .
Using (3.15) and recalling that r = d/2M,

K_2^2(n, \lambda) \le k_n^2 \lambda^{-(p + 3d/2M + \epsilon)} + S_n^2 \lambda^{-(p + d/2M + \gamma_1/M)}   (6.42)
+ \|\theta_\lambda - \theta_0\|_{d/2M + \epsilon}^2 \lambda^{-(p + d/2M)} + k_n^2 \|\theta_\lambda - \theta_0\|_{3d/2M + \epsilon}^2 \lambda^{-(p + d/2M + \gamma_1/M)}

where \epsilon > 0 is arbitrarily small. Since p_0 > 3d/2M, \|\theta_\lambda - \theta_0\|_{d/2M + \epsilon}^2 \le \lambda^{p_0 - d/2M - \epsilon} and
\|\theta_\lambda - \theta_0\|_{3d/2M + \epsilon}^2 \le \lambda^{p_0 - 3d/2M - \epsilon} (again \epsilon > 0 is arbitrarily small). From this,

K_2^2(n, \lambda) \le k_n^2 \lambda^{-(p + 3d/2M + \epsilon)} + S_n^2 \lambda^{-(p + d/2M + \gamma_1/M)}   (6.43)
+ \lambda^{p_0 - d/2M - \epsilon} \lambda^{-(p + d/2M)} + k_n^2 \lambda^{p_0 - 3d/2M - \epsilon} \lambda^{-(p + d/2M + \gamma_1/M)} .
From (A.1), k_n^2 = O_p(n^{-1}) and S_n^2 = O_p(n^{-1}), and from the remarks at the beginning of the proof,
x_n^2(\lambda) = O_p(n^{-1}\lambda^{-(p^* + d/2M)}). Hence

r_n^2(\lambda) = O_p( K_2^2(n, \lambda) + K_3^2(n, \lambda, x_n(\lambda)) )   (6.44)
\le n^{-1}\lambda^{-(p + 3d/2M + \epsilon)} + n^{-1}\lambda^{-(p + d/2M + \gamma_1/M)}
+ \lambda^{p_0 - d/2M - \epsilon}\lambda^{-(p + d/2M)} + n^{-1}\lambda^{p_0 - 3d/2M - \epsilon}\lambda^{-(p + d/2M + \gamma_1/M)}
+ n^{-1}\lambda^{-(p^* + d/2M)}\lambda^{-(p + d/2M)} + n^{-2}\lambda^{-2(p^* + d/2M)}\lambda^{-(p + d/2M + \delta/M)}
+ n^{-2}\lambda^{-(p^* + d/2M)}\lambda^{-(p + d/2M + \gamma_1/M)} + n^{-3}\lambda^{-2(p^* + d/2M)}\lambda^{-(p + d/2M + (\delta + \gamma_2)/M)}
+ n^{-2}\lambda^{-(p^* + d/2M)}\lambda^{-(p + d/2M + (\delta + \gamma_2)/M)} .
Rewriting this expression we get

r_n^2(\lambda) \le \lambda^{(p^* - p)} \{ q_n(\lambda) \}   (6.45)

where

q_n(\lambda) = n^{-1}\lambda^{-(p^* + 3d/2M + \epsilon)} + n^{-1}\lambda^{-(p^* + d/2M + \gamma_1/M)}   (6.46)
+ \lambda^{p_0 - p^* - d/M - \epsilon} + n^{-1}\lambda^{p_0 - 3d/2M - \epsilon}\lambda^{-(p^* + d/2M + \gamma_1/M)}
+ n^{-1}\lambda^{-2(p^* + d/2M)} + n^{-2}\lambda^{-2(p^* + d/2M)}\lambda^{-(p^* + d/2M + \delta/M)}
+ n^{-2}\lambda^{-\gamma_1/M}\lambda^{-2(p^* + d/2M)} + n^{-3}\lambda^{-(\delta + \gamma_2)/M}\lambda^{-3(p^* + d/2M)}
+ n^{-2}\lambda^{-(\delta + \gamma_2)/M}\lambda^{-2(p^* + d/2M)} .
At this point we use that \gamma_1 and \delta + \gamma_2 are bounded above by 2d and p^* \ge 2d/M. After some
algebra we find that, up to terms which tend to zero,

q_n(\lambda) \le n^{-1}\lambda^{-2(p^* + d/2M)} .   (6.47)

So by definition of \lambda_n, q_n(\lambda) \to 0 for \lambda \in [\lambda_n, \lambda_0]. This proves the result. ∎
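One representative step of the "some algebra" can be made explicit (a reconstruction consistent with the stated hypotheses, not a quotation of the original): since \gamma_1 \le 2d and p^* \ge 2d/M,

```latex
\frac{\gamma_1}{M} \;\le\; \frac{2d}{M} \;\le\; p^{*} \;\le\; p^{*} + \frac{d}{2M}
\quad\Longrightarrow\quad
n^{-1}\lambda^{-\left(p^{*} + \frac{d}{2M} + \frac{\gamma_1}{M}\right)}
\;\le\;
n^{-1}\lambda^{-2\left(p^{*} + \frac{d}{2M}\right)}
\qquad (0 < \lambda \le 1),
```

and the terms involving \delta + \gamma_2 \le 2d are absorbed in the same way.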
6.3. Verification of Assumption A.11
We end by describing a situation under which the linearization may be justified for the
time invariant diffusion considered in §5. Before stating the theorem we give two lemmas.
Lemma 6.6. Let K\phi[x] = \int_0^x \phi(y)\,dy and K_h\phi[x] = \int_0^x h(y)\phi(y)\,dy. If h \in W_2^1[0,1] then there is a
positive constant c such that for K\phi \in W_2^{2p}[0,1], Ku \in W_2^2[0,1], Kv \in W_2^2[0,1] and
p \in \{0, 1\},

\|K_h \phi u\|_{W_2^{2p}} \le c \|h\|_{W_2^1} \|K\phi\|_{W_2^{2p}} \|Ku\|_{W_2^2}

and

\|K_h \phi uv\|_{W_2^{2p}} \le c \|h\|_{W_2^1} \|K\phi\|_{W_2^{2p}} \|Ku\|_{W_2^2} \|Kv\|_{W_2^2} .

Proof: For p = 0, let \tilde\phi(x) = K\phi[x]. Then

\|K_h \phi u\|_{W_2^{2p}}^2 = \|K_h \phi u\|_{L_2}^2 .   (6.48)

Integrating by parts as in Theorem 4.2,

\|K_h \phi u\|_{L_2}^2 \le c \|\tilde\phi\|_{L_2}^2 \|hu\|_{W_2^1}^2   (6.49)

where c is a positive constant. But \|hu\|_{W_2^1} \le \|h\|_{W_2^1} \|u\|_{W_2^1} \le \|h\|_{W_2^1} \|Ku\|_{W_2^2}. Using this,
the result for p = 1 is simple. The result for \|K_h \phi uv\|_{W_2^{2p}} is obtained in a very similar fashion.
∎
For the diffusion problem, recall from (5.13) that we have

K_\theta\phi[x] = \int_x^1 \phi(y) h_\theta(y)\,dy   (6.50)

where h_\theta(x) = \theta(x)^{-2}\{\int_0^x f(t)\,dt + b\}. From this,

K_\theta\phi u[x] = \int_x^1 \phi(y) u(y) h_\theta'(y)\,dy   (6.51)

where h_\theta'(x) = -2\theta(x)^{-3}\{\int_0^x f(t)\,dt + b\}, and

K_\theta\phi uv[x] = \int_x^1 \phi(y) u(y) v(y) h_\theta''(y)\,dy   (6.52)

where h_\theta''(x) = 6\theta(x)^{-4}\{\int_0^x f(t)\,dt + b\}. So following Lemma 5.2 we have:

Lemma 6.7. If f \in L_2[0,1] then for \theta \in B_1(R) with |\theta(x)| \ge \epsilon > 0, we have h_\theta, h_\theta', h_\theta'' \in B_1(R')
where R' > 0 depends on R, \epsilon and \|f\|_{L_2}.

Proof: Very straightforward; the argument is similar to that given in Lemma 5.2. ∎
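The boundedness asserted in Lemma 6.7 can be checked numerically. The sketch below is illustrative only: the grid, f, b, eps and theta are made-up choices, and the kernel formulas h_theta = theta^{-2}(int_0^x f + b), h_theta' = -2 theta^{-3}(.), h_theta'' = 6 theta^{-4}(.) are the reconstructed forms of (6.50)-(6.52).

```python
import numpy as np

def cumtrapz0(f, x):
    """Cumulative trapezoid approximation of F(x) = int_0^x f(t) dt."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))))

x = np.linspace(0.0, 1.0, 201)
f = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)   # f(x) >= 1/2 > 0, as in Lemma 5.2
b, eps = 0.25, 0.2
theta = eps + 0.8 * x**2                   # theta(x) >= eps > 0

F = cumtrapz0(f, x)
h0 = (F + b) / theta**2                    # h_theta
h1 = -2.0 * (F + b) / theta**3             # kernel for K_theta phi u
h2 = 6.0 * (F + b) / theta**4              # kernel for K_theta phi u v

# Since f > 0, sup_x (F(x) + b) = F(1) + b, so each kernel is uniformly
# bounded by a constant depending only on eps, b and the integral of f.
top = F[-1] + b
assert np.max(np.abs(h0)) <= top / eps**2 + 1e-9
assert np.max(np.abs(h1)) <= 2.0 * top / eps**3 + 1e-9
assert np.max(np.abs(h2)) <= 6.0 * top / eps**4 + 1e-9
```

The constant R' of the lemma is visible here as a power of 1/eps times the total mass of f plus b.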
Combining these results we have the main theorem.

Theorem 6.8. Suppose \theta_0 \in B_k(R) for k > 3.25 and \theta_0(x) \ge \epsilon > 0. Let
N_{\theta_0} = \{\theta \in B_2(R) : \theta(x) \ge \epsilon\}, and for m \ge k let C = \{\theta \in W_2^m[0,1] : \theta \ge 0\} and
J(\theta) = \|D^m \theta\|_{L_2}^2. Suppose f \in W_2^1[0,1] and, as in Lemma 5.2, f(x) \ge \delta > 0 and b > 0. Then
(A.3), (A.4), (A.5), (A.10) and (A.11) all hold. Moreover, if p^* = 2/M then

(i) r^*(\lambda) \to 0 and r(\lambda) \le \lambda^{(p^* - p)/2} for -1/2M \le p \le 2 - 3/2M;

(ii) if \lambda_n is a sequence tending to zero such that n^{-1}\lambda_n^{-5/M} \to 0, then for any sequence of \lambda's
tending to zero with \lambda \ge \lambda_n, r_n^*(\lambda) \to 0 and r_n(\lambda) \le \lambda^{(p^* - p)/2} for -1/2M \le p \le 2 - 5/2M.

Proof: Let K\phi[x] = \int_0^x \phi(y)\,dy and \langle\phi, \psi\rangle = \int_0^1 \phi^{(m)}(y)\psi^{(m)}(y)\,dy. (A.3), (A.4) and (A.5) come
from Theorem 5.3 (N_{\theta_0} = N_\theta(1, R)). M = m + 1 for (A.4). From Theorem 5.3, s = 3 in (A.5).
p^* \ge 2/M for (A.10). Using Lemmas 6.6 and 6.7, (A.11) follows with \delta = 1, \alpha_1 = \alpha_2 = 2 and
\beta_1 = \beta_2 = \gamma_1 = \gamma_2 = 0.

Parts (i) and (ii) follow from Theorems 6.4 and 6.5. ∎
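As a quick arithmetic check of the exponent in part (ii) (using d = 1 and the theorem's p^* = 2/M):

```latex
2\Bigl(p^{*} + \frac{d}{2M}\Bigr)
= 2\Bigl(\frac{2}{M} + \frac{1}{2M}\Bigr)
= \frac{5}{M},
```

so the condition n^{-1}\lambda_n^{-2(p^* + d/2M)} \to 0 of Theorem 6.5 becomes n^{-1}\lambda_n^{-5/M} \to 0, as stated.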
References
1. Agmon, S., Elliptic Boundary Value Problems, Van Nostrand, 1965.
2. Backus, G. and Gilbert, F., "Uniqueness in the inversion of inaccurate gross earth data,"
Philos. Trans. Royal Soc. Ser. A., vol. 266, pp. 123-192, 1970.
3. Box, G.E.P. and Jenkins, G.M., Time Series Analysis: Forecasting and Control,
Holden-Day, Inc., San Francisco, 1976.
4. Butzer, P. L. and Berens, H., Semi-Groups of Operators and Approximation, Springer-
Verlag, Heidelberg, 1967.
5. Butzer, P.L. and Westphal, U., "An access to fractional differentiation via fractional
difference quotients," in Fractional Calculus and Its Applications, ed. B. Ross,
Akademie-Verlag, Berlin, 1974a.
6. Cannon, J.R. and Ewing, R.E., "Quasi-linear parabolic systems with non-linear boundary
conditions," in Inverse and Improperly Posed Problems in Differential Equations, ed. G.
Anger, pp. 36-43, Akademie-Verlag, Berlin, 1979.
7. Cannon, J. R. and DuChateau, P., "Approximating the solution to the Cauchy problem
for Laplace's equation," SIAM J. Numer. Anal., vol. 14, pp. 473-483, 1980.
8. Cannon, J. R. and DuChateau, P., "An Inverse Problem for a Nonlinear Diffusion Equa-
tion," SIAM J. Appl. Math., vol. 39, pp. 272-289, 1980.
9. Collatz, L., Differential Equations: An Introduction with Applications, John Wiley &
Sons, Chichester, 1986.
11. Colton, D., "The inverse scattering problem for time-harmonic acoustic waves," SIAM
Review, vol. 26, no. 3, pp. 323-350, 1984.
11. Colton, D. and Monk, P., "The numerical solution of the three-dimensional inverse
scattering problem for time harmonic acoustic waves," SIAM J. Sci. Stat. Comput., vol.
8, pp. 278-291, 1987.
12. Cox, D. D. and O'Sullivan, F., "Analysis of penalized likelihood type estimators with
application to generalized smoothing in Sobolev Spaces," Tech. Rep. No. 51, Statistics
Dept., University of California-Berkeley (submitted to the Annals of Statistics), 1985.
13. Cox, D. D., "Approximation of the method of regularization estimators," Ann. Statist.
(to appear), 1987.
14. Deuflhard, P. and Hairer, E., Numerical Treatment of Inverse Problems in Differential
and Integral Equations, Birkhauser, Boston, 1983.
15. Devaney, A.J., "Reconstructive tomography with diffracting wavefields," Inverse Prob-
lems, vol. 2, pp. 161-183, 1986.
16. Hald, O.H., "Inverse eigenvalue problems for the mantle," Geophys. J. R. Astr. Soc.,
vol. 62, pp. 41-48, 1980.
17. Joshi, M.C. and Bose, R.K., Some Topics in Nonlinear Functional Analysis, J. Wiley &
Sons, New York, 1985.
18. Kravaris, C. and Seinfeld, J. H., "Identification of parameters in distributed parameter
systems by regularization," SIAM J. Control and Optimization, vol. 23, no. 2, pp. 217-
241, 1985.
19. Ladyzhenskaya, O.A. and Ural'tseva, N.N., Linear and Quasi-linear Elliptic Equations,
Academic Press, New York, 1968.
20. Lions, J.L. and Magenes, E., Non-Homogeneous Boundary Value Problems and Applica-
tions, I, Springer-Verlag, Berlin, 1972.
21. Liu, J.Q. and Chen, Y.M., "An iterative algorithm for solving inverse problems of two-
dimensional diffusion equations," SIAM J. Sci. Stat. Comput., vol. 5, pp. 255-269, 1984.
22. Nychka, D., Wahba, G., Goldfarb, S., and Pugh, T., "Cross-validated spline methods for
the estimation of three-dimensional tumor size distributions from observations on two
dimensional cross sections," J. Amer. Statist. Assoc., vol. 79, no. 388, pp. 832-846,
1984.
23. Nychka, D. and Cox, D. D., "Convergence rates for regularized solutions of integral
equations from discrete, noisy data," Ann. Statist., 1987 (to appear).
24. O'Sullivan, F. and Wahba, G., "A cross validated Bayesian retrieval algorithm for non-
linear remote sensing experiments," J. Comp. Physics, vol. 59, no. 3, pp. 441-455, 1985.
25. O'Sullivan, F., "A statistical perspective on ill-posed inverse problems (with discus-
sion)," J. Statist. Science, vol. 1, pp. 502-527, 1986.
26. O'Sullivan, F. and Wong, T., "Determining a functional diffusion coefficient in the heat
equation," Proceedings of the 19'th Symposium on the Interface, American Statistical
Association, 1987.
27. Parker, R.L., "The magnetotelluric inverse problem," Geophys. Surveys, vol. 46, pp.
5763-5783, 1983.
28. Ross, B., Fractional Calculus and Its Applications, Akademie-Verlag, Berlin, 1974.
29. Rudin, W., Principles of Mathematical Analysis, McGraw-Hill, New York, 1976.
30. Santosa, F. and Symes, W.W., "Linear inversion of band-limited reflection seismograms,"
SIAM J. Sci. Stat. Comput., vol. 7, pp. 1307-1330, 1986.
31. Stone, C.J., "Optimal global convergence rates for nonparametric regression," Ann. Sta-
tist., vol. 10, pp. 1040-1053, 1982.
32. Taylor, M.E., Pseudodifferential Operators, Princeton University Press, Princeton, New
Jersey, 1981.
33. Tikhonov, A. and Arsenin, V., Solutions of Ill-Posed Problems, Wiley, New York, 1977.
34. Treves, F., Basic Linear Partial Differential Equations, Academic Press, New York,
1975.
35. Weinberger, H. F., "Variational Methods for Eigenvalue Problems," in Lecture Notes by
G. P. Schwartz, Department of Mathematics, University of Minnesota, Minneapolis,
1962.
36. Weinberger, H. F., "Variational Methods for Eigenvalue Approximation," in CBMS
Regional conference series in applied mathematics, SIAM, Philadelphia, 1974.
Figure Legends
Figure 1.1.a : Time series plots of the data at different measurement sites: (i) x = .05, (ii)
x = .50, (iii) x = .95. The solid lines are the true values of u (x,t). See equation (1.6).
Figure 1.1.b : Spatial plots of the data at different times: (i) t = .01, (ii) t = .10, (iii) t = .50.
As in Figure 1.1.a, the solid lines are the true values of u (x,t).
Figure 1.2 : Regularization estimates corresponding to the data in Figure 1.1. The solid line is
the true diffusion coefficient. The dark dashed line corresponds to the optimal predictive mean
square error estimate. The dotted line corresponds to the cross-validated estimate. See text.
Figure 5.1 : Eigensequence plot for the Rayleigh quotient \|D^2\phi\|^2 / \|K_\theta\phi\|^2 of (5.16). The
plot gives log(\gamma_\nu) versus log(\nu).

Figure 5.2 : Eigensequence plot for the Rayleigh quotient of (5.17). The plot gives log(\gamma_\nu)
versus log(\nu).
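The eigensequence plots can be reproduced in spirit with a small numerical sketch. Everything below is an illustrative assumption rather than the paper's computation: the grid size, the choice of theta, the rectangle-rule discretization of K_theta, and the use of D^2 for the penalty. The decay exponent is read off as the slope of log(gamma_nu) against log(nu).

```python
import numpy as np
from scipy.linalg import eigh

n = 80
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# ||D^2 phi||^2 via central second differences on interior points
D2 = np.zeros((n - 2, n))
for i in range(n - 2):
    D2[i, i:i + 3] = [1.0, -2.0, 1.0]
D2 /= h**2
A = h * D2.T @ D2

# ||K_theta phi||^2 with K_theta phi[x_i] ~ sum_{j >= i} h_theta(x_j) phi(x_j) h
h_theta = 1.0 / (1.0 + x) ** 2           # hypothetical positive kernel
K = h * np.triu(np.ones((n, n))) @ np.diag(h_theta)
B = h * K.T @ K + 1e-12 * np.eye(n)      # small jitter keeps B positive definite

# Generalized eigenvalues gamma_nu of the Rayleigh quotient, in increasing order
gamma = np.sort(eigh(A, B, eigvals_only=True))
gamma = gamma[gamma > 1e-6]
nu = np.arange(1, len(gamma) + 1)

# Decay exponent = slope of log(gamma_nu) against log(nu) over mid-range indices
slope = np.polyfit(np.log(nu[5:40]), np.log(gamma[5:40]), 1)[0]
assert slope > 0.0                        # gamma_nu grows polynomially in nu
```

A straight line in the log-log plot, as in Figures 5.1 and 5.2, corresponds to polynomial growth of gamma_nu; the fitted slope plays the role of the exponent 2M/d in the text.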
[Figure 1.1.a: time series plots of the data at measurement sites x = .05, x = .50, x = .95.]

[Figure 1.1.b: spatial plots of the data at times t = .01, t = .10, t = .50.]

[Figure 1.2: regularized estimates.]

[Figure 5.1: eigensequence plot.]

[Figure 5.2: eigensequence plot.]
TECHNICAL REPORTSStatistcs Department
University of California, Berkeley
1. BREIMAN, L amid FREEDMAN, D. (Nov. 1981, revised Feb. 1982). How many variables should be entered in aregrssion q ? Jour. Amer. Statist Assoc., March 1983, 78, No. 381, 131-136.
2. BRILLINGER, D. R. (Jan 1982). Some contasting examples of the time and frequency domain approaches to time seriesanalysis. Time Series Methods in Hydrsciences, (A. H. El-Shaarawi and S. R. Esterby, eds.) Elsevier ScientificPublishing Co., Amsterdam. 1982 pp. 1-15.
3. DOKSUM, K. A. (Jan. 1982). On the performance of estimates in proportional hazard and log-linear models. SurvivalAnalysis, (John Crowley and Richard A. Johnson, eds.) IMS Lectr Notes - Monograph Series, (Shanti S. Gupt seriesed.) 1982, 74-84.
4. BICKEL, P. J. and BREIMAN, L. (Feb. 1982). Sums of functions of nearest neighbor distances, moment bounds, limittheorems and a goodness of fit test An Prob., Feb. 1982, 11. No. 1, 185-214.
5. BRLLUNGER, D. R. and TUKEY, J. W. (Mach 1982). Specwrum estimation and system identification relying on aFourier transform. The Collected Works of J. W. Tukey, vol. 2, Wadsworth, 1985, 1001-1141.
6. BERAN, R. (May 1982). Jackknife approximation to bootstrap estimates. An. S March 1984, 12 No. 1, 101-118.
7. BICKEL, P. J. and FREEDMAN, D. A. (June 1982). Bootstrapping regression models with many parameters.Lchmanm Festschrift, (P. J. Bickel, K. Doksum and J. L. Hodges, Jr., eds.) Wadsworth Press, Belmont, 1983, 2848.
8. BICKEL P. J. and COLLINS, J. (March 1982). Mziniming Fisher information over mixtures of distributions. Sankhyi,1983, 45, Series A, Pt. 1, 1-19.
9. BREIMAN, L. and FRIEDMAN, J. (July 1982). Estimating optimal transformations for multiple regression and correlation.
10. FREEDMAN, D. A. and PETERS, S. (July 1982, revised Aug. 1983). Bootstrapping a regrssion equation. someempirical results. JASA, 1984, 79, 97-106.
11. EATON, M. L. and FREEDMAN, D. A. (Sept. 1982). A remark on adjusting for covariates in multiple regression.
12. BICKEL, P. J. (April 1982). Minimax estimation of the mean of a mean of a normal distribution subject to doing weUlat a point. Recent Advances in Statistics, Academic Press, 1983.
14. FREEDMAN, D. A., ROTFHENBERG, T. and SUTCH, R. (Oct. 1982). A review of a residential energy end use model.
15. BRILLINGER, D. and PREISLER, H. (Nov. 1982). Maximum likelihood estimation in a latent variable problem. Studiesin Econometrics, Time Series, and Multivariate Statistics, (eds. S. Karlin, T. Amemiya, L. A. Goodman). AcademicTrF-, Wew York, 1983, pp. 31-&37
16. BICKEL, P. J. (Nov. 1982). Robust regression based on infinitesimal neighborhoods. Ann. Statist., Dec. 1984, 12,1349-1368.
17. DRAPER, D. C. (Feb. 1983). Rank-based robust analysis of linear models. I. Exposition and review. Statistcal Science,1988, Vol3 No. 2 239-271.
18. DRAPER, D. C. (Feb 1983). Rank-based robust inference in regression models with several observations per cell.
19. FREEDMAN, D. A. and FENBERG, S. (Feb. 1983, revised April 1983). Statistics and the scientific method, Commentson and reactions to Freedman, A rejoinder to Fienberg's comments. Springer New York 1985 Cohort hAn s in SocialResearch, (W. M. Mason and S. E. Fienberg, eds.).
20. FREEDMAN, D. A. and PETERS, S. C. (March 1983, revised Jan. 1984). Using the bootstrap to evaluate forecastingequations. J. of Forecastin. 1985, Vol. 4, 251-262.
21. FREEDMAN, D. A. and PETERS, S. C. (March 1983, revised Aug. 1983). Bootstrapping an econometric model: someempiical results. JBES, 1985, 2, 150-158.
22. FREEDMAN, D. A. (March 1983). Structural-equation models: a case study.
23. DAGGETr, R. S. and FREEDMAN, D. (April 1983, revised Sept 1983). Econometrics and the law: a case study in theproof of antiaust damages. Proc. of the Berkeley Conference, in honor of Jerzy Neyman and Jack Kiefer. Vol I pp.123-172. (L. Le Cam, R. Olshen eds.) Wadsworth, 1985.
- 2 -
24. DOKSUM, K. and YANDELL, B. (April 1983). Tests for exponentiality. Handbook of Staistis, (P. R. Krishnaiah andP. K. Senw eds.) 4, 1984, 579-611.
25. FREEDMAN, D. A. (May 1983). Comments on a paper by Markus.
26. FREEDMAN, D. (Oct. 1983, revised March 1984). On bootstrapping two-stage least-squares estimates in stationary linearmodels. AMn. Statist*, 1984, 12, 827-842.
27. DOKSUM, K. A. (Dec. 1983). An extension of partial likelihood methods for proportional hazard models to generaltransformation models. Ann. Statist., 1987, 15, 325-345.
28. BICKEL, P. J., GOETZE, F. and VAN ZWET, W. R. (Jan. 1984). A simple analysis of third order efficiency of estimateProc. of the Neyman-Kiefer Conference, (L. Le Cam, ed.) Wadsworth, 1985.
29. BICKEL, P. J. and FREEDMAN, D. A. Asymptotic normality and the bootstrap in stratified sampling. Ann. Statist.12 470-482.
30. FREEDMAN, D. A. (Jan. 1984). The mean vs. the median: a case study in 4-R Act litigation. JBES. 1985 Vol 3pp. 1-13.
31. STONE, C. J. (Feb. 1984). An asymptotically optimal window selection rule for kemel density estimates. Ann. Statist.,Dec. 1984, 12, 1285-1297.
32. BREIMAN, L. (May 1984). Nail finders, edifices, and Oz.
33. STONE, C. J. (Oct. 1984). Additive regression and other nonparametric models. Ann. Statist., 1985, 13, 689-705.
34. STONE, C. J. (June 1984). An asymptotically optimal histogran selection rule. Proc. of the Berkeley Conf. in Honor ofJerzy Neyman and Jack Kiefer (L. Le Cam and R. A. Olshen, eds.), 11, 513-520.
35. FREEDMAN, D. A. and NAVIDI, W. C. (Sept. 1984, revised Jan. 1985). Regression models for adjusting the 1980Census. Statistical Science. Feb 1986, Vol. 1, No. 1, 3-39.
36. FREEDMAN, D. A. (Sept. 1984, revised Nov. 1984). De Finetti's theorem in continuous time.
37. DIACONIS, P. and FREEDMAN, D. (Oct. 1984). An elementary proof of Stirling's formula. Amer. Math Monthly. Feb1986, Vol. 93, No. 2, 123-125.
38. LE CAM, L. (Nov. 1984). Sur l'approximation de familles de mesures par des families Gaussiennes. Ann. Inst.Henri Poincare, 1985, 21, 225-287.
39. DIACONIS, P. and FREEDMAN, D. A. (Nov. 1984). A note on weak star uniformities.
40. BREIMAN, L. and IHAKA, R. (Dec. 1984). Nonlinear discriminant analysis via SCALING and ACE.
41. STONE, C. J. (Jan. 1985). The dimensionality reduction principle for generalized additive models.
42. LE CAM, L. (Jan. 1985). On the normal approximation for sums of independent variables.
43. BICKEL P. J. and YAHAV, J. A. (1985). On estimating the number of unseen species: how many executions werethere?
44. BRILLJNGER, D. R. (1985). The natural variability of vital rates and associated statistics. Biometrics, to appear.
45. BRILLINGER, D. R. (1985). Fourier inference: some methods for the analysis of array and nonGaussian series dataWater Resources Bulletin, 1985, 21, 743-756.
46. BREIMAN, L. and STONE, C. J. (1985). Broad spectrum estimates and confidence intervals for tail quantiles.
47. DABROWSKA, D. M. and DOKSUM, K. A. (1985, revised March 1987). Partial likelihood in transfornation modelswith censored data. Scandinavian J. Statist., 1988, 15, 1-23.
48. HAYCOCK, K. A. and BRILLINGER, D. R. (November 1985). LIBDRB: A subroutine library for elementary timeseries analysis.
49. BRILLINGER, D. R. (October 1985). Fitting cosines: some procedures and some physical examples. Joshi Festschrift,1986. D. Reidel.
50. BRILLINGER, D. R. (November 1985). What do seismology and neurophysiology have in common? - Statistics!Comptes Rendus Math. Acad. Sci. Canada. January, 1986.
51. COX, D. D. and O'SULLIVAN, F. (October 1985). Analysis of penalized likelihood-type estimators with application togeneralized smoothdng in Sobolev Spaces.
- 3 -
52. O'SULLIVAN, F. (November 1985). A practical perspective on ill-posed inverse problems: A review with somenew developnnts. To appear in Journal of Statistical Science.
53. LE CAM, L and YANG, G. L (November 1985, revised March 1987). On the preservation of local asymptotic normalitYunder infonnaton loss.
54. BLACKWELL., D. (Novenber 1985). Approximate normality of large products.
55. FREEDMAN, D. A. (June 1987). As others see us: A case study in path analysis. Joumal of EucationalStatistics. 12, 101-128.
56. LE CAM, L and YANG, G. L. (January 1986). Replaced by No. 68.
57. LE CAM, L. (Febrary 1986). On the Bernstein - von Mises theorem.
58. O'SULLIVAN, F. (January 1986). Estimation of Densities and Hazards by the Method of Penalized likelihood.
59. ALDOUS, D. and DIACONIS, P. (February 1986). Strong Uniform Times and Finite Random Walks.
60. ALDOUS, D. (March 1986). On the Markov Chain simulation Method for Uniform Combinatorial Distributions andSimulated Annealing.
61. CHENG, C-S. (April 1986). An Optmizaton Problem with Applications to Optmal Desipg Thoory.
62. CHENG, C-S., MAJUMDAR, D., STUFKEN, J. & TURE, T. E. (May 1986, revised Jan 1987). Optinal step typdesign for co g test treatments with a controL
63. CHENG, C-S. (May 1986, revised Jan. 1987). An Application of the Kiefer-Wolfowitz Equlivalence Theorem.
64. O'SULLIVAN, F. (May 1986). Nonparametric Estmation in the Cox Propodrional Hazards Model.
65. ALDOUS, D. (JUNE 1986). Finite-Time Implications of Relaxation Times for Sochasdcally Monotone Processes.
66. PITMAN, J. (JULY 1986, revised November 1986). Stationary Excursions.
67. DABROWSKA, D. and DOKSUM, K. (July 1986, revised November 1986). Estimates and confidence intervals formedian and mean life in the proportional hazard model with censored data. Biometrika, 1987, 74, 799-808.
68. LE CAM, L. and YANG, G.L. (July 1986). Distinguished Statistics, Loss of information and a theorem of Robert B.Davies (Fourth edition).
69. STONE, C.J. (July 1986). Asymptotic properties of logspline density estimation.
71. BICKEL, PJ. and YAHAV, J.A. (July 1986). Richardson Extrapolation and the Bootstrap.
72. LEHMANN, E.L. (July 1986). Statistics - an overview.
73. STONE, C.J. (August 1986). A nonparametric framework for statistical modelling.
74. BIANE, PH. and YOR, M. (August 1986). A relation between L6vy's stochastic area formula, Legendre polynomial,and some continued fractions of Gauss.
75. LEHMANN, E.L. (August 1986, revised July 1987). Comparing Location Experiments.
76. O'SULLIVAN, F. (September 1986). Relative risk estimation.
77. O'SULLIVAN, F. (Sptmber 1986). Deconvolution of episodic hormone data.
78. PITMAN, J. & YOR. M. (September 1987). Further asymptotic laws of planar Brownian motion.
79. FREEDMAN, D.A. & ZEISEL. H. (November 1986). From mouse to mam The quantitative assessment of cancer risks.To appear in Statistical Science.
80. BRILLINGER, D.R. (October 1986). Maximum likelihood analysis of spike trains of interacting nerve cells.
81. DABROWSKA, D.M. (November 1986). Nonparametric regression with censored survival time data.
82. DOKSUM, K.J. and LO, A.Y. (Nov 1986, revised Aug 1988). Consistent and robust Bayes Procedures forLocation based on Pial lIformation.
83. DABROWSKA, D.M., DOKSUM, KA. and MIURA, R. (November 1986). Rank esimate in a class of seniparnerictwo-sunple models.
- 4 -
84. BRlLLINGER, D. (Decenber 1986). Some statistical methods for random process data from seismology andneurophysiology.
85. DIACONIS, P. and FREEDMAN, D. (December 1986). A dozen de Finetti-style results in search of a theory.Am. inst. Henr irll 1987, 23, 397-423.
86. DABROWSKA, D.M. (January 1987). Uniforn consistency of nearest neighbour and keemel conditional Kaplan- Meier estdmas.
87. FREEDMAN, DA., NAVIDI, W. and PETERS, S.C. (February 1987). On the impact of variable selection infitting regression equations.
88. ALDOUS, D. (February 1987, revised April 1987). Hashing with linear probing, under non-uniforn probabilities.
89. DABROWSKA, D.M. and DOKSUM, KA. (March 1987, revised January 1988). Estimating and testing in a twosanple generalized odds rate model. J. Amer. Statist. A 1988, 83, 744749.
90. DABROWSKA, D.M. (March 1987). Rank tests for matched pair experiments with censored data.
91. DIACONIS, P and FREEDMAN, D.A. (April 1988). Conditional limit tieorems for exponential families and finiteversions of do Finetti's theorem. To appear in the Joumal of Applied Probability.
92. DABROWSKA, D.M. (April 1987, revised September 1987). Kaplan-Meier estimate on the plan.
92L ALDUS, D. (April 1987). The Harmonic mean formula for probabilities of Unions: Applications to spare randomgraphs.
93. DABROWSKA, D.M. (June 1987, revised Feb 1988). Nonparametric quantile regression with censored data
94. DONOHO, D.L. & STARK, PR. (June 1987). Uncetainty principles and signal recovery.
95. CANCELIED
96. BRILLINGER, D.R. (June 1987). Some examples of the statistical analysis of seismological data To appear inProceedings, Centennial Anniversary Symposium, Seismographic Stations, University of Califormia, Berkeley.
97. FREEDMAN, DA. and NAVIDI, W. (June 1987). On the multi-stage model for carcinogenesis. To appear inEnvironmental Health Perspectives.
98. O'SULUVAN, F. and WONG, T. (June 1987). Detemining a function diffusion coefficient in the heat equation.99. O'SULLIVAN, F. (June 1987). Constrained non-linear regularization with application to some system identification
problems.
100. LE CAM, L. (July 1987, revised Nov 1987). On the standard asymptotic confidence ellipsoids of Wald.
101. DONOHO, D.L. and LIU, R.C. (July 1987). Pathlogies of some minimum distance estimators. Annals ofStatistics, June, 1988.
102. BRILLINGER, D.R., DOWNING, K.H. and GLAESER, R.M. (July 1987). Some statistical aspects of low-doseelectron imaging of crystals.
103. LE CAM, L. (August 1987). Harald Cramer and sums of independent random variables.
104. DONOHO, A.W., DONOHO, D.L. and GASKO, M. (August 1987). Macspin: Dynamic graphics on a desktopcomputer. IEEE Computer Graphics and applications, June, 1988.
105. DONOHO, D.L. and LIU, R.C. (August 1987). On minimax estimation of linear functionals.
106. DABROWSKA, D.M. (August 1987). Kaplan-Meier estimate on the plane: weak convergence, LIL and the bootstrap.
107. CHENG, C-S. (Aug 1987, revised Oct 1988). Some orthogonal main-effect plans for asymmetrical factorials.
108. CHENG, C-S. and JACROUX, M. (August 1987). On the construction of trend-free run orders of two-level factorialdesigns.
109. KLASS, M.J. (August 1987). M g E max Sk/ES': A prophet inequality for sums of I.I.D. mean zero variates.
110. DONOHO, D.L. and LIU, R.C. (August 1987). The "automatic" robustness of minimum distance functionals.Annals of Statistics, June, 1988.
111. BICKEL- PJ. and GHOSH, J.K. (August 1987, revised Jumn 1988). A decomposition for thelikMelihood ratio statisticand the Batlett correction - a Bayesian argument
- 5 -
112. BURDZY, K., PIT?MAN, J.W. and YOR, M. (Septmber 1987). Some asymptotic laws for crossings and excurions.
113. ADHDCARI A. and PilT , . (September 1987). The shortet plana arc of width 1.
114. RITOV, Y. (September 1987). Estimadon in a linear regression model with censored data
115. BICKEL PJ. and RITOV, Y. (Sept 1987, revised Aug 1988). Large sample theory of estimation in biased samplingregression models I.
116. RrTOV, Y. and BICKEL, P.J. (Sept.1987, revised Aug. 1988). Achieving information bounds in non andsemiparaetric models.
117. RlTOV, Y. (October 1987). On the convergence of a maximal correlation algoritun with alternating projections.
118. ALDOUS, D.J. (October 1987). Meeting times for independent Markov chains.
119. HESSE, C.H. (October 1987). An asymptotic expansion for the mean of the passage-time distribution of integratedBrownian Motion.
120. DONOHO, D. and LIU, R. (Oct. 1987, revised Mar. 1988, Oct. 1988). Geometizing rates of convergence, II.
121. BRIllINGER, D.R. (October 1987). Esimating the chances of large earthquakes by radiocarbon dadng and statisticalmodelling. To appear in Statistics a Guide to the Unknown.
122. ALDOUS, D., FLANNERY, B. and PALACIOS, J.L. (November 1987). Two applications of um processes: The fringeanalysis of seach trees and the simulation of quasi-stationary distributions of Markov chains.
123. DONOHO, D.L., MACGIBBON, B. and LIU, R.C. (Nov.1987, revised July 1988). Mbinmax risk for hyperrectangles.
124. ALDOUS, D. (Novernber 1987). Stopping times and tightnss IL
125. HESSE, C.H. (November 1987). The present state of a stochastic model for sedimentation.
126. DALANG, R.C. (December 1987, revised June 1988). Optimal stopping of two-parameter processes onnonstandard probability spaces.
127. Same as No. 133.
128. DONOHO, D. and GASKO, M. (December 1987). Multivariate generalizations of the median and timmed mean H.
129. SMITH, D.L. (Decenber 1987). Exponential bounds in Vapnik-tervonenkis classes of index 1.
130. STONE, CJ. (Nov.1987, revised Sept. 1988). Uniform error bounds involving logspline models.
131. Same as No. 140
132. HESSE, C.H. (December 1987). A Bahadur - Type representation for empirical quantiles of a large class of stationary,possibly ifinite - variance, linear processes
133. DONOHO, D.L. and GASKO, M. (December 1987). Multivariate generations of the median and trimmed mean, L.134. DUBINS, L.E. and SCHWARZ, G. (December 1987). A sharp inequality for martingales and stopping-times.
135. FREEDMAN, DA and NAVIDL W. (Decenber 1987). On the risk of lung cancer for ex-smokers.
136. LE CAM, L. (January 1988). On some stochastic models of the effects of radiation on cell survival.
137. DIACONIS, P. and FREEDMAN, D.A. (April 1988). On the uniform consistency of Bayes estimates for multinomial probabilities.
137a. DONOHO, D.L. and LIU, R.C. (1987). Geometrizing rates of convergence, I.
138. DONOHO, D.L. and LIU, R.C. (January 1988). Geometrizing rates of convergence, III.
139. BERAN, R. (January 1988). Refining simultaneous confidence sets.
140. HESSE, C.H. (December 1987). Numerical and statistical aspects of neural networks.
141. BRILLINGER, D.R. (Mar. 1989). a) A study of second- and third-order spectral procedures and maximum likelihood identification of a bilinear system. b) Some statistical aspects of NMR spectroscopy, Actas del 2° Congreso Latinoamericano de Probabilidad y Estadística Matemática, Caracas, 1985.
142. DONOHO, D.L. (Jan. 1985, revised Jan. 1988). One-sided inference about functionals of a density.
143. DALANG, R.C. (Feb. 1988, revised Nov. 1988). Randomization in the two-armed bandit problem.
144. DABROWSKA, D.M., DOKSUM, K.A. and SONG, J.K. (February 1988). Graphical comparisons of cumulative hazards for two populations.
145. ALDOUS, D.J. (February 1988). Lower bounds for covering times for reversible Markov Chains and random walks on graphs.
146. BICKEL, P.J. and RITOV, Y. (Feb. 1988, revised August 1988). Estimating integrated squared density derivatives.
147. STARK, P.B. (March 1988). Strict bounds and applications.
148. DONOHO, D.L. and STARK, P.B. (March 1988). Rearrangements and smoothing.
149. NOLAN, D. (March 1988). Asymptotics for a multivariate location estimator.
150. SEILLIER, F. (March 1988). Sequential probability forecasts and the probability integral transform.
151. NOLAN, D. (March 1988). Limit theorems for a random convex set.
152. DIACONIS, P. and FREEDMAN, D.A. (April 1988). On a theorem of Kuchler and Lauritzen.
153. DIACONIS, P. and FREEDMAN, D.A. (April 1988). On the problem of types.
154. DOKSUM, K.A. (May 1988). On the correspondence between models in binary regression analysis and survival analysis.
155. LEHMANN, E.L. (May 1988). Jerzy Neyman, 1894-1981.
156. ALDOUS, D.J. (May 1988). Stein's method in a two-dimensional coverage problem.
157. FAN, J. (June 1988). On the optimal rates of convergence for nonparametric deconvolution problem.
158. DABROWSKA, D. (June 1988). Signed-rank tests for censored matched pairs.
159. BERAN, R.J. and MILLAR, P.W. (June 1988). Multivariate symmetry models.
160. BERAN, R.J. and MILLAR, P.W. (June 1988). Tests of fit for logistic models.
161. BREIMAN, L. and PETERS, S. (June 1988). Comparing automatic bivariate smoothers (A public service enterprise).
162. FAN, J. (June 1988). Optimal global rates of convergence for nonparametric deconvolution problem.
163. DIACONIS, P. and FREEDMAN, D.A. (June 1988). A singular measure which is locally uniform. (Revised by Tech Report No. 180).
164. BICKEL, P.J. and KRIEGER, A.M. (July 1988). Confidence bands for a distribution function using the bootstrap.
165. HESSE, C.H. (July 1988). New methods in the analysis of economic time series I.
166. FAN, JIANQING (July 1988). Nonparametric estimation of quadratic functionals in Gaussian white noise.
167. BREIMAN, L., STONE, C.J. and KOOPERBERG, C. (August 1988). Confidence bounds for extreme quantiles.
168. LE CAM, L. (August 1988). Maximum likelihood: an introduction.
169. BREIMAN, L. (Aug. 1988, revised Feb. 1989). Submodel selection and evaluation in regression I. The X-fixed case and little bootstrap.
170. LE CAM, L. (September 1988). On the Prokhorov distance between the empirical process and the associated Gaussian bridge.
171. STONE, C.J. (September 1988). Large-sample inference for logspline models.
172. ADLER, R.J. and EPSTEIN, R. (September 1988). Intersection local times for infinite systems of planar Brownian motions and for the Brownian density process.
173. MILLAR, P.W. (October 1988). Optimal estimation in the non-parametric multiplicative intensity model.
174. YOR, M. (October 1988). Intertwinings of Bessel processes.
175. ROJO, J. (October 1988). On the concept of tail-heaviness.
176. ABRAHAMS, D.M. and RIZZARDI, F. (September 1988). BLSS - The Berkeley interactive statistical system: An overview.
177. MILLAR, P.W. (October 1988). Gamma-funnels in the domain of a probability, with statistical implications.
178. DONOHO, D.L. and LIU, R.C. (October 1988). Hardest one-dimensional subfamilies.
179. DONOHO, D.L. and STARK, P.B. (October 1988). Recovery of sparse signals from data missing low frequencies.
180. FREEDMAN, D.A. and PITMAN, J.W. (Nov. 1988). A measure which is singular and uniformly locally uniform. (Revision of Tech Report No. 163).
181. DOKSUM, K.A. and HOYLAND, A. (Nov. 1988, revised Jan. 1989). A model for step-stress accelerated life testing experiments based on Wiener processes and the inverse Gaussian distribution.
182. DALANG, R.C., MORTON, A. and WILLINGER, W. (November 1988). Equivalent martingale measures andno-arbitrage in stochastic securities market models.
183. BERAN, R. (November 1988). Calibrating prediction regions.
184. BARLOW, M.T., PITMAN, J. and YOR, M. (Feb. 1989). On Walsh's Brownian Motions.
185. DALANG, R.C. and WALSH, J.B. (Dec. 1988). Almost-equivalence of the germ-field Markov property and the sharp Markov property of the Brownian sheet.
186. HESSE, C.H. (Dec. 1988). Level-crossing of integrated Ornstein-Uhlenbeck processes.
187. NEVEU, J. and PITMAN, J.W. (Feb. 1989). Renewal property of the extrema and tree property of the excursion of a one-dimensional Brownian motion.
188. NEVEU, J. and PITMAN, J.W. (Feb. 1989). The branching process in a Brownian excursion.
189. PITMAN, J.W. and YOR, M. (Mar. 1989). Some extensions of the arcsine law.
190. STARK, P.B. (Dec. 1988). Duality and discretization in linear inverse problems.
191. LEHMANN, E.L. and SCHOLZ, F.W. (Jan. 1989). Ancillarity.
192. PEMANTLE, R. (Feb. 1989). A time-dependent version of Pólya's urn.
193. PEMANTLE, R. (Feb. 1989). Nonconvergence to unstable points in urn models and stochastic approximations.
194. PEMANTLE, R. (Feb. 1989). When are touchpoints limits for generalized Pólya urns.
195. PEMANTLE, R. (Feb. 1989). Random walk in a random environment and first-passage percolation on trees.
196. BARLOW, M., PITMAN, J. and YOR, M. (Feb. 1989). Une extension multidimensionnelle de la loi de l'arc sinus.
197. BREIMAN, L. and SPECTOR, P. (Mar. 1989). Submodel selection and evaluation in regression - the X-random case.
198. BREIMAN, L., TSUR, Y. and ZEMEL, A. (Mar. 1989). A simple estimation procedure for censored regression models with known error distribution.
Copies of these Reports plus the most recent additions to the Technical Report series are available from the Statistics Department technical typist in room 379 Evans Hall or may be requested by mail from:
Department of Statistics
University of California
Berkeley, California 94720
Cost: $1 per copy.