Constrained Non-Linear Regularization with Application to Some System Identification Problems
Finbarr O'Sullivan¹
Department of Statistics
University of California
Berkeley, CA 94720

Technical Report No. 99
June 1987
¹ Research supported in part by the National Science Foundation under Grant No. MCS-840-3239. Some of the work was done during my stay at the Institute for Mathematics and its Applications, University of Minnesota, in January 1987.
ABSTRACT
The paper studies consistency properties for constrained method of regularization estimators of θ in the abstract non-linear regression model

    z_i = η(θ; x_i) + ε_i,   i = 1, 2, ....

The η(θ; x) are non-linear functionals of θ, an element of a Hilbert space Θ, and the ε_i are independent measurement errors with mean zero and bounded variance. Several inverse or system identification problems associated with the determination of functional parameters in differential operators can be formulated in this manner. Conditions are given under which the asymptotic properties of method of regularization estimators may be approximated by linearized estimators obtained via the Gauss-Newton algorithm. Rates of convergence of the linearized estimators determine the rates of convergence of the non-linear estimators. The degree of non-linearity in the functionals η(·; ·) plays an important role.

AMS 1980 subject classifications. Primary 62G05; secondary 62J05, 41A35, 41A25, 47A53, 45L10, 45M05.
Key words and phrases. Inverse problems, non-linear regression, regularization, constraints, rates of convergence, system identification.
Running Head: Constrained Non-Linear Regularization
1. Introduction
The 1-dimensional heat equation with variable conductivity and insulated boundary has
the form
    ∂u(x,t)/∂t − ∂/∂x [ θ(x) ∂u(x,t)/∂x ] = f(x,t),   x ∈ [0,1] and t ∈ [0,T],

subject to

    u(x,0) = u₀(x),   x ∈ [0,1],
    ∂u/∂x (0,t) = ∂u/∂x (1,t) = 0,   t ∈ [0,T].   (1.1)
If the conductivity θ is known and the initial and forcing terms u₀ and f are specified in appropriate function spaces, the equation can be solved (at least numerically) to obtain a temperature profile u(·,t) at any positive time t. An inverse system identification problem is to determine the conductivity given measured information about u. If u were observed continuously in time and space without error, then θ could be found by integrating the relation

    ∂/∂x [ θ(x) ∂u(x,t)/∂x ] = ∂u(x,t)/∂t − f(x,t)   (1.2)

and using the boundary condition ∂u/∂x (0,t) = 0. This gives

    θ(x) ∂u(x,t)/∂x = ∫₀ˣ [ ∂u(s,t)/∂t − f(s,t) ] ds.   (1.3)
Thus if the spatial temperature gradients are non-zero almost everywhere and θ is continuous, then (1.3) uniquely determines θ. It is easy to appreciate that the best choices for f and u₀ are ones which force large temperature gradients. If f is zero then for large time the temperature becomes essentially constant, so the information at later times is not as useful as information at earlier times. Time series analysts usually recommend the use of white noise input for the identification of linear transfer function models; see chapter 11 of Box and Jenkins[3], for example. In the present context this corresponds to taking a very variable u₀. In practice the information gathered about u is incomplete, so that some form of regularization is needed to estimate θ.
1.1. Constrained Non-Linear Least Squares Regularization
To estimate θ using the data

    z_ij = u(x_i, t_j) + ε_ij,   i = 1, 2, ..., n; j = 1, 2, ..., m,   (1.4)

where the x_i are in [0,1], the t_j are in [0,T] and the ε_ij's are random errors, is a classical ill-posed inverse problem. For a regularized solution we let

    l_nλ(θ) = (1/nm) Σ_{j=1}^m Σ_{i=1}^n [z_ij − u(θ; x_i, t_j)]² + λJ(θ),   λ > 0,   (1.5)

where J is a penalty functional designed to be large for physically unreasonable conductivity profiles. u(θ; ·, ·) solves the heat equation (1.1) with conductivity given by θ; this is a non-linear functional of θ. The estimation criterion is to choose θ in some linear function space Θ so as to minimize l_nλ subject to the constraint that θ be positive:

    θ̂_nλ = argmin_{θ ∈ C} l_nλ(θ),   (1.6)

where C is the subset of Θ consisting of positive functions. We call (1.6) a constrained non-linear least squares regularization method.
An Illustration
Simulated data were generated according to the model in (1.4). The parameters were set as follows: θ(x) = 0.5 + x − 5x⁴ + 6x⁵ − 2x⁶, u₀(x) = 10 + 270x² − 180x³, and the forcing term f is zero. There were n = 20 measurement sites (x_i = i/21, i = 1, 2, ..., 20) with m = 50 observations at each site (t_j = j/101, j = 1, 2, ..., 50). Gaussian noise was added with mean zero and standard deviation 3. Time series plots of the data at different sites and spatial plots of the data at different times are given in Figure 1.1. Since there is no forcing term the later time points are likely to provide little or no information about θ. Notice how the spatial plots flatten out in time.
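A data set of the form (1.4) with the stated θ, u₀ and noise level can be generated as follows. The explicit, conservative finite-difference scheme and its grid sizes are assumptions made here for illustration; the paper does not specify how the forward problem was solved.

```python
import numpy as np

# Generate data of the form (1.4) for the illustration above.
rng = np.random.default_rng(0)
theta = lambda s: 0.5 + s - 5*s**4 + 6*s**5 - 2*s**6     # conductivity
u0 = lambda s: 10 + 270*s**2 - 180*s**3                  # initial profile

J = 101
x = np.linspace(0.0, 1.0, J)
dx = x[1] - x[0]
dt = 0.2 * dx**2 / theta(x).max()          # stable explicit time step
th_half = theta((x[:-1] + x[1:]) / 2)      # conductivity at cell interfaces
u = u0(x).copy()

x_obs = np.arange(1, 21) / 21.0            # x_i = i/21
t_obs = np.arange(1, 51) / 101.0           # t_j = j/101
data = np.zeros((20, 50))
t, j_next = 0.0, 0
while j_next < 50:
    flux = th_half * np.diff(u) / dx       # theta(x) du/dx; zero at the ends
    u[1:-1] += dt * np.diff(flux) / dx
    u[0]    += dt * flux[0] / dx           # insulated boundary cells
    u[-1]   -= dt * flux[-1] / dx
    t += dt
    if t >= t_obs[j_next]:
        data[:, j_next] = np.interp(x_obs, x, u)
        j_next += 1

z = data + rng.normal(0.0, 3.0, size=data.shape)   # noisy observations (1.4)
```

Because the scheme is written in flux form, total heat is conserved exactly and the spatial profiles of `data` flatten toward a constant in time, matching the qualitative behavior described above.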
For numerical computation θ is expanded in a p-dimensional B-spline basis, θ(x) = Σ_{j=1}^p θ_j B_j(x), and the coefficients θ_j are determined to minimize the non-linear regularization criterion l_nλ subject to the coefficients being positive. The penalty functional was taken to be the Laplacian penalty J(θ) = ∫ [θ″(x)]² dx used in cubic smoothing splines. In terms of the B-spline coefficients θ = (θ₁, θ₂, ..., θ_p)′ the penalty is θ′Ωθ, where Ω_jk = ∫ B″_j(x) B″_k(x) dx. A constrained Gauss-Newton algorithm is used to obtain the solution; see O'Sullivan and Wong[26]. The results are presented in Figure 1.2. There are three curves corresponding to:
(i) the true conductivity profile, (ii) an estimate obtained by selecting λ to minimize the predictive mean square error (optimal MSE estimate)

    R(λ) = (1/nm) Σ_{j=1}^m Σ_{i=1}^n [u(θ₀; x_i, t_j) − u(θ̂_nλ; x_i, t_j)]²,   (1.7)

and (iii) an estimate obtained by choosing λ to minimize the generalized cross-validation type criterion (GCV estimate)

    V(λ) = Σ_{j=1}^m Σ_{i=1}^n [z_ij − u(θ̂_nλ; x_i, t_j)]² / ( trace[ I − X[X′X + nmλΩ]⁻¹X′ ] )²,   (1.8)

where X is the linearized design matrix, with entries X_(ij),k = ∂u(θ; x_i, t_j)/∂θ_k evaluated at θ̂_nλ; see O'Sullivan and Wahba[24] for another use of a similar cross-validation function.
The example raises a number of questions: (i) Is the estimation technique consistent? In
what sense? (ii) Does the theory of cross-validation for linear estimators carry over to this
situation? (iii) Is there an optimal design theory for these sorts of experiments?
1.2. General System Identification and an Abstract Non-linear Regression Model
Inverse problems associated with the identification of functional parameters in partial
differential equations arise in several fields including diffraction tomography, reservoir
engineering and seismology. Some recent references are [7, 8, 10, 11, 14, 15, 16, 18, 21, 27, 30]. A generalized framework for a large class of these problems is the following (see §2 of Kravaris and Seinfeld[18]): U, F and Θ are function spaces, and there is an operator A such that for each f in F and θ in C ⊂ Θ there is a locally unique u in U satisfying

    A(u, θ) = f.   (1.9)

Now let X_n be a linear mapping on U, X_n: U → Rⁿ. We assume that f is known and we are given a vector of measurements z_n,

    z_n = X_n u + ε_n,   (1.10)

where ε_n is a vector of i.i.d. mean zero, finite variance random variables. With these data we wish to estimate the functional parameter θ.
The problem can be put in the form of an abstract non-linear regression problem. Equation (1.9) gives an expression for u which in general is non-linear in f and θ. With f fixed, the measurements in (1.10) can thus be regarded as non-linear functionals of θ. We write the data as

    z_i = η(θ; x_i) + ε_i,   i = 1, 2, ..., n,   (1.11)

where the design points x_i lie in some index set I and, for each x in I, η(·; x): Θ → R is a non-linear functional of θ (for the data in (1.4), I ⊂ [0,1]×[0,T]). The ε_i's are measurement errors. Applying a constrained non-linear least squares regularization method to these data gives the estimation criterion

    θ̂_nλ = argmin_{θ ∈ C} { (1/n) Σ_{i=1}^n [z_i − η(θ; x_i)]² + λJ(θ) },   λ > 0,

where C is a subset of Θ and J is an appropriate penalty functional. In some circumstances it would be of interest to replace the least squares fitting by a more general M-type estimation criterion. This is not considered here. We make the following basic assumptions.
Assumption A.1.
(i) Let F_n be the empirical cumulative distribution function (c.d.f.) of the x_i's and let F be the limiting c.d.f. F has a density which is bounded away from zero and infinity. If

    k_n = sup_{x ∈ I} |F_n(x) − F(x)|,

then for some C > 0 and all n sufficiently large, k_n ≤ C n^{−1/2}.
(ii) The ε_i's have mean zero and variance bounded by σ². S_n is the partial sum process of the measurement errors,

    S_n(x) = n⁻¹ Σ_{i : x_i ≤ x} ε_i,

where x ≤ y means all the components of x are less than or equal to the corresponding components of y. Let s_n = sup_{x ∈ I} |S_n(x)|; then s_n = O_p(n^{−1/2}).
(iii) Θ is a Hilbert space with inner product <·,·>. For each x ∈ I, η(·; x) is three times Fréchet differentiable. J is also three times Fréchet differentiable.

If the x_i's are random then A.1(i) is straightforward. Kolmogorov's inequality gives A.1(ii) in the case that the ε_i's are independent. A.1(iii) is needed to analyze the estimators.
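A quick numerical look at A.1(i): for an equispaced design on [0,1] with limiting c.d.f. F(x) = x, the sup-distance k_n is in fact of order n⁻¹, comfortably inside the required C n^{−1/2} bound. The design x_i = i/(n+1) is an assumption made for illustration.

```python
import numpy as np

def k_n(n):
    # equispaced design x_i = i/(n+1); limiting c.d.f. F(x) = x
    x = np.arange(1, n+1) / (n+1)
    i = np.arange(1, n+1)
    # F_n jumps by 1/n at each x_i, so the sup of |F_n - F| is attained
    # at (or just before) a jump point
    return np.max(np.maximum(np.abs(i/n - x), np.abs((i-1)/n - x)))
```

For this design k_n = 1/(n+1), so the n^{−1/2} requirement of A.1(i) holds with plenty to spare.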
1.3. Outline
Regularization makes the inverse problem well-posed in the sense of Tikhonov[33]. To verify this there have been a number of theoretical investigations related to uniqueness (identifiability) and continuous data dependence of the method; see for example [7, 8, 18]. Sharper approximation theoretic and statistical stability results must also be possible. Thus if the true parameter is assumed to lie in a particular function space then we ought to be able to give some description of the degree of approximation associated with the regularization method as a function of n and λ. This was the motivation for our investigation. In practice these kinds of results provide some insight into how much data are required in order to get a desired degree of resolution. Asymptotic linear approximations for the constrained regularization estimator are developed. The linearizations are obtained using Gauss-Newton rather than Newton-Raphson expansions. The justification for the linearizations uses various conditions on the smoothness of the data functionals η(·; x). Our results provide some justification for the use of the Backus-Gilbert[2] averaging kernel calculus in non-linear situations,[25].
The method of analysis owes much to the techniques developed by Cox[13] for the approximation of linear method of regularization estimators. The paper begins by reviewing this theory. General Newton-Raphson and Gauss-Newton approaches to the analysis of non-linear estimators are described in §2; Gauss-Newton linearizations are more useful for our purposes. §3 has results on the rates of convergence for the linearized estimators, and in §4 we verify that these results are true for certain classes of Hammerstein integral operators. In §5 we look at some system identification problems and develop a convergence result for the time invariant version of (1.1). We conjecture that similar results are available for the time dependent parabolic case and provide some numerical evidence to support this. The final section theoretically justifies the asymptotic reliability of the Gauss-Newton expansion. There are a number of interesting directions for future work. We have begun to look at parabolic and hyperbolic problems in a more systematic theoretical fashion. Hyperbolic equations arise in inverse scattering and diffraction tomography, for example.
1.4. Some Notation
Norms and inner products are subscripted except for < , > which denotes the inner pro-
duct on 89 and ( , ) which is used for the L2 inner product. The symbol D is used for
differentiation. The notations < and = are used extensively: If U is a metric space then
g (u) c f (u) means that there is a constant K > 0 such that g(u) . K f (u) for all u whose
norn is less than unity. g(u) = f (u) if f (u) < g(u) and g(u) <f (u). If V(u) is some set
of quantities associated with u then g (u ,v) < f (u ,v) unifonnly for v e V (u) means
g (u,v ) s K f (u,v ) for v e V(u ) and for u whose norm is less than unity. g (u,v ) - f (u,v )
uniformly for v e V(u) if f (u,v) < g(u,v) and g(u,v) < f (u,v) both uniformly for
V E V(u).
- 8 -
The notations < and - are also used for sequences. If u -* uo in U then f (u) < g (u) as
u -o uo if there is a constant K > 0 such that in a neighborthood of u0, f (u ) < K g(u).
f (u ) = g (u ) and the idea of unifonnity are defined as one would expect.
Acknowledgements
Computing was primarily carried out on the Cray X-MP at San Diego. This was supported by the National Science Foundation under Grant No. MCS-840-3239. I thank Professors Victor DiPerna, Grace Wahba and Hans Weinberger for some valuable discussions and advice.
2. Spectral Analysis of Method of Regularization Estimators
2.1. Linear Estimators
Cox[13] has provided an elegant approach to the asymptotic analysis of method of regularization estimators. We briefly review this. The theory is based on an abstract linear regression framework: we have data y_n ∈ Rⁿ where

    y_n = X_n θ + ε_n.   (2.1)

X_n is a continuous linear map from a Hilbert space Θ (with inner product <·,·>) into Rⁿ, and ε_n is an error process for which E ε_n = 0, scaled so that Var(ε_n) = σ_n² I_n (I_n is an n×n identity matrix and σ_n² = σ²/n). The method of regularization estimator is defined to be the minimizer of

    l_nλ(θ) = ‖y_n − X_n θ‖_n² + λ<θ, Wθ>,   λ > 0.   (2.2)

‖·‖_n denotes the usual Euclidean norm on Rⁿ. W is a linear positive semi-definite operator (<θ, Wθ> ≥ 0 for all θ). Assuming a unique minimizer θ̂_nλ exists, we can write

    θ̂_nλ = [X_n*X_n + λW]⁻¹ X_n* y_n,   (2.3)

where X_n* is the adjoint (transpose) of X_n. Letting U_n = X_n*X_n, the bias or systematic error in θ̂_nλ is expressed as

    θ − E θ̂_nλ = [I − [U_n + λW]⁻¹U_n] θ,   (2.4)

and the variability or stability operator (covariance matrix) is

    Var(θ̂_nλ) = σ_n² [U_n + λW]⁻¹ U_n [U_n + λW]⁻¹.   (2.5)

These expressions follow directly from (2.3) and the assumptions in the abstract linear regression model.
Asymptotically U_n converges to a limiting operator U. With regularity, the simultaneous diagonalization of W relative to U yields a set of eigenfunctions {φ_v ; v = 1, 2, 3, ...} and eigenvalues {γ_v ; v = 1, 2, 3, ...} for which

    W φ_v = γ_v U φ_v,   <φ_v, U φ_μ> = δ_vμ,   v, μ = 1, 2, 3, ...,   (2.6)

where δ_vμ is Kronecker's delta function. Convergence characteristics are most conveniently analyzed in terms of this eigensystem. (Throughout the paper eigenvalues are arranged in increasing order.) A collection of convergence norms is defined as follows:

    ‖θ‖_p² = Σ_v [1 + γ_v]^p <θ, U φ_v>²,   p ∈ R.   (2.7)

For each p there is a subset, Θ_p⁰, of elements of Θ for which ‖·‖_p is finite. Let Θ_p be the completion of Θ_p⁰ in the norm ‖·‖_p. This is a Hilbert space with inner product <θ, φ>_p = Σ_v [1 + γ_v]^p <θ, U φ_v><φ, U φ_v>. The convergence norms ‖·‖_p are somewhat abstract; however, for certain statistical smoothing problems function space interpolation theory[4] can be used to relate the ‖·‖_p norms to a collection of more interpretable Sobolev norms. Extensions of these results to deconvolution problems are described in Nychka and Cox[23]. Bias and variance formulas in ‖·‖_p-norms are given by

    ‖θ − E θ̂_nλ‖_p² = ‖[I − [U + λW]⁻¹U] θ‖_p²
                     = Σ_v [λγ_v/(1 + λγ_v)]² [1 + γ_v]^p <θ, U φ_v>²   (2.8)

and

    E{ ‖[X_n*X_n + λW]⁻¹ X_n* ε_n‖_p² } = Σ_v [1 + γ_v]^p E<[X_n*X_n + λW]⁻¹ X_n* ε_n, U φ_v>²
        ≍ (σ²/n) Σ_v [1 + γ_v]^p [1 + λγ_v]⁻² = (σ²/n) C(λ, p).   (2.9)

The ≍ signs in these formulas involve the replacement of the operator U_n = X_n*X_n by its limit U (this is justified in[13]). From (2.8) it is shown that if the true parameter is in Θ_{p+2a} for some a ∈ (0,1), then ‖E θ̂_nλ − θ‖_p ≍ λᵃ as λ → 0. Here the ≍ sign means that the sequences on the left and right are equivalent up to fixed constants. The behavior of the quantity C(λ, p) determines the asymptotic variance. If γ_v ≍ v^r then C(λ, p) is convergent for p < 2 − 1/r and, as λ → 0, C(λ, p) ≍ λ^{−(p+1/r)} for −1/r < p < 2 − 1/r. (See Theorem 2.4 of[13].) Optimal stochastic convergence rates, in the sense of Stone[31], follow from these results.
2.2. Analysis of Non-linear Estimators
General method of regularization estimators may be obtained by minimizing a criterion of the form

    l_nλ(θ) = l_n(θ) + λJ(θ),   λ > 0,   (2.10)

over some subset C of Θ. l_n is a data fit criterion and J is a penalty term. An example was given in §1 above. Non-quadratic l_n may arise because of non-Gaussian measurement characteristics or non-linear functional data. Cox and O'Sullivan[12] describe a Cramér style approach to the analysis of these estimation schemes. We briefly describe this theory and then indicate how the approach is modified to handle the abstract non-linear regression model.
2.2.1. Newton-Raphson Approach
The analysis in[12] uses a second order Taylor series approximation for the regularization functional l_nλ, which amounts to the Newton-Raphson method. Let D denote differentiation with respect to θ. The variational equation for θ̂_nλ is Z_nλ(θ) = 0, where Z_nλ(θ) = D l_nλ(θ). In the limit as n → ∞, Z_nλ(θ) → Z_λ(θ). The derivative of the limiting variational equation is D Z_λ(θ) = U(θ) + λW(θ). Let θ₀ be a, locally unique, solution to Z_λ(θ) = 0 at λ = 0. For all λ less than a sufficiently small λ₀ there is a locally unique θ_λ satisfying Z_λ(θ_λ) = 0. Moreover, with probability approaching unity there is a locally unique θ̂_nλ satisfying Z_nλ(θ̂_nλ) = 0.
The estimation error is decomposed into a systematic plus a random component:

    θ̂_nλ − θ₀ = (θ_λ − θ₀) + (θ̂_nλ − θ_λ).   (2.11)

Two linear approximations are obtained; the first is for analyzing the bias and the second is for variability.

Systematic Error:
    θ_λ − θ₀ ≈ θ̄_λ − θ₀ = −[U(θ₀) + λW(θ₀)]⁻¹ Z_λ(θ₀).
Random Error:
    θ̂_nλ − θ_λ ≈ θ̃_nλ − θ_λ = −[U(θ_λ) + λW(θ_λ)]⁻¹ Z_nλ(θ_λ).   (2.12)

Here ≈ indicates that convergence rates are equivalent to first order. Rates of convergence results are formulated in terms of norms related to the operators U and W. For any 0 ≤ λ ≤ λ₀ the simultaneous diagonalization of W(θ_λ) relative to U(θ_λ) yields a set of eigenfunctions {φ_λv ; v = 1, 2, 3, ...} and eigenvalues {γ_λv ; v = 1, 2, 3, ...} such that

    W(θ_λ) φ_λv = γ_λv U(θ_λ) φ_λv,   <φ_λv, U(θ_λ) φ_λμ> = δ_vμ,   v, μ = 1, 2, 3, ...,   (2.13)

where δ_vμ is Kronecker's delta function. The convergence norms are expressed as

    ‖θ‖_λp² = Σ_v [1 + γ_λv]^p <θ, U(θ_λ) φ_λv>².   (2.14)

A key result which applies to many non-parametric smoothing problems, including hazard estimation, multivariate density estimation, estimation of regression functions in generalized linear models and robust estimation of vector valued functions, is that the limiting (as v → ∞) behavior of γ_λv does not depend on λ, and in fact γ_λv ≍ v^r for some r > 0. Function space interpolation theory[4] may again be used to relate the ‖·‖_λp norms to a collection of more interpretable Sobolev norms. Asymptotic convergence rates in a variety of such norms are detailed in[12].
2.2.2. Gauss-Newton Approach
Direct second order Taylor series expansion of l_nλ does not yield results for regularization in the abstract non-linear regression model. The main technical problem which arises is that the "U-component" of D Z_λ(θ_λ) becomes difficult to analyze when the η(θ; x) are non-linear functionals. The Gauss-Newton approach overcomes the difficulty. In general we let K_θφ[x] = Dη(θ; x)φ for x ∈ I, and U(θ) = K_θ*K_θ, so

    <φ, U(θ)ψ> = <φ, K_θ*K_θψ> = ∫_I K_θφ[x] K_θψ[x] dF(x).   (2.18)

From assumption A.1(i), <φ, U(θ)φ> ≍ (K_θφ, K_θφ) (uniformly in θ). The linear approximations are

Systematic Error:
    θ_λ − θ₀ ≈ θ̄_λ − θ₀ = −[U(θ₀) + λW(θ₀)]⁻¹ Z_λ(θ₀).
Random Error:
    θ̂_nλ − θ_λ ≈ θ̃_nλ − θ_λ = −[U(θ_λ) + λW(θ_λ)]⁻¹ Z_nλ(θ_λ).   (2.19)

In the case that J is quadratic with no linear term, so that DJ(θ) = Wθ where W is a fixed linear operator, Z_λ(θ₀) = λWθ₀. For most of the paper we will assume that this is so. In any event we will assume that Z_λ(θ₀) = λW(θ₀)θ₀. The rates of convergence for the linearized estimators are formulated in terms of norms related to the operators U(θ) and W. Some general results are described in the next section. As in the case of the Newton-Raphson approach, we can show that the convergence characteristics of the non-linear method of regularization estimator can be described to first order by the behavior of the linearized estimators in (2.19). The justification for the linearization uses various smoothness conditions on η(θ; x). Our conditions imply that the eigensystems related to the simultaneous diagonalization of W relative to K_θ*K_θ are uniformly equivalent for all λ sufficiently small. Theoretical justification for the linearizations is given in §6.
3. Linearized Convergence Theory
3.1. Introduction
To study convergence characteristics of the Gauss-Newton linearization we must first consider some properties of the simultaneous diagonalization of W relative to U; more precisely, the simultaneous diagonalization of W(θ*) relative to U(θ*) = K_{θ*}*K_{θ*}.

Assumption A.2. (Properties of K_{θ*} and W(θ*)).
K_{θ*} is compact and has zero null space on Θ. W(θ*) is a positive operator and for θ ∈ Θ

    ‖θ‖*² = <θ, U(θ*)θ> + <θ, W(θ*)θ> ≳ ‖θ‖².

We will define several Hilbert spaces. For θ* ∈ Θ, let Θ*⁰ = {θ ∈ Θ : ‖θ‖*² = <θ, U(θ*)θ> + <θ, W(θ*)θ> < ∞} and let Θ* be the completion of Θ*⁰ with respect to the norm ‖·‖*. Θ* is a Hilbert space with inner product

    <φ, θ>* = <φ, U(θ*)θ> + <φ, W(θ*)θ>.
From Weinberger[36], if K_{θ*} and W(θ*) satisfy (A.2) there is an eigensystem in Θ*, {φ*_v, γ*_v ; v = 1, 2, ...}, satisfying

    <φ*_v, W(θ*) φ*_μ> = γ*_v δ_vμ,
    <φ*_v, U(θ*) φ*_μ> = δ_vμ,   v, μ = 1, 2, ....   (3.1)

The eigenfunctions are complete in Θ*, so any θ ∈ Θ* has a convergent representation of the form θ = Σ_v <θ, U(θ*)φ*_v> φ*_v. For any λ > 0 we can define [U(θ*) + λW(θ*)]⁻¹ by the Spectral Theorem, and we have

    [U(θ*) + λW(θ*)]⁻¹ U(θ*) φ*_v = [1 + λγ*_v]⁻¹ φ*_v.   (3.2)

The convergence norms are defined as in (2.14): ‖θ‖*_p² = Σ_v [1 + γ*_v]^p <θ, U(θ*)φ*_v>²; in particular

    ‖φ*_v‖*_p² = [1 + γ*_v]^p.   (3.3)
Let Θ*_p⁰ be the elements of Θ for which ‖θ‖*_p is finite. The completion of Θ*_p⁰ in the ‖·‖*_p norm leads to another Hilbert space Θ*_p with inner product

    <θ, φ>*_p = Σ_v [1 + γ*_v]^p <θ, U(θ*)φ*_v><φ, U(θ*)φ*_v>,   p ∈ R.   (3.4)

Notice that Θ* = Θ*₁. A stability condition will allow us to compare the Hilbert spaces Θ*_p for a range of θ* values.
Assumption A.3. (Local Stability Condition)
Let N_θ₀ be a convex subset of Θ containing θ₀. K_{θ*} and W(θ*) are locally stable w.r.t. N_θ₀ if there are fixed operators K and W satisfying (A.2) and constants c₁ > 0 and c₂ > 0 such that for all θ* in N_θ₀ and all θ in Θ

    c₁ <θ, Wθ> ≤ <θ, W(θ*)θ> ≤ c₂ <θ, Wθ>,   (3.5)
    c₁ (Kθ, Kθ) ≤ <θ, U(θ*)θ> ≤ c₂ (Kθ, Kθ).   (3.6)

The convexity of N_θ₀ is used in §6. Under assumption (A.3) the eigensystems exist for all θ* ∈ N_θ₀. By the Mapping Principle, Weinberger[36], γ*_v ≍ γ_v uniformly in N_θ₀, where the γ_v are the eigenvalues of W relative to U = K*K. Let φ_v be the eigenfunction corresponding to γ_v and let Θ_p be the completion of {θ ∈ Θ : ‖θ‖_p² = Σ_v [1 + γ_v]^p (Kφ_v, Kθ)² < ∞}. This is a Hilbert space with inner product <·,·>_p defined as in (3.4). From the definitions of ‖·‖*_p and ‖·‖_p it is clear that Θ*_p ≍ Θ_p for p ∈ R (the norms are equivalent uniformly for θ* ∈ N_θ₀). From Theorem 2.3(iii) of Cox[13], we have the following result.
Theorem 3.1. (Linearized Bias).
Let N_θ₀ be any subset of Θ containing θ₀ for which (A.3) holds. Then for a ∈ [0,1], if θ₀ ∈ Θ_{p+2a},

    ‖θ̄_λ − θ₀‖_p ≲ λᵃ ‖θ₀‖_{p+2a}   as λ → 0.

The bound is tight in the sense that sup{ ‖θ̄_λ − θ₀‖_p : ‖θ₀‖_{p+2a} ≤ 1 } is up to a constant (independent of λ) equal to λᵃ(1 + o(1)) as λ → 0.

Proof: From[13],

    ‖λ[U(θ₀) + λW(θ₀)]⁻¹ W(θ₀)θ₀‖_0p = ‖θ̄_λ − θ₀‖_0p ≲ λᵃ ‖θ₀‖_{0,p+2a}   uniformly as λ → 0,

but (A.3) implies ‖·‖_0p ≍ ‖·‖_p, which proves the first statement. The tightness of the upper bound follows in a similar fashion, again using the results in[13]. □
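The bound of Theorem 3.1 can be checked exactly in a diagonal model. With γ_v = v^r and coefficients chosen so that θ₀ ∈ Θ_{p+2a}, the spectral bias coefficients satisfy λγ/(1+λγ) ≤ λᵃ(1+γ)ᵃ for a ∈ [0,1], which gives ‖θ̄_λ − θ₀‖_p ≤ λᵃ‖θ₀‖_{p+2a} term by term. The particular r, p, a and coefficient sequence below are illustrative assumptions.

```python
import numpy as np

r, p, a = 2.0, 0.0, 0.5                    # illustrative choices
v = np.arange(1, 200001, dtype=float)
gam = v**r                                 # gamma_v = v^r, as in (A.6)
# coefficients <theta0, U phi_v> making ||theta0||_{p+2a} finite
c = (1 + gam)**(-(p + 2*a)/2) / v
norm_p2a = np.sqrt(np.sum((1 + gam)**(p + 2*a) * c**2))

def bias_p(lam):
    # ||theta_bar_lam - theta0||_p via the coefficients lam*gam/(1+lam*gam)
    return np.sqrt(np.sum((lam*gam/(1 + lam*gam))**2 * (1 + gam)**p * c**2))
```

The inequality rests on min(1, x) ≤ xᵃ for a ∈ [0,1], applied with x = λγ_v; it holds for every λ, not just asymptotically.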
We could extend this result to obtain further asymptotic bias characteristics as in Theorem 2.3 of[13]. To study the second linearization, or variability, it will be necessary to use integration by parts. To carry out this exercise it is convenient to have an identification of the ‖·‖_p norms in terms of Sobolev norms. We make the following assumption.
Assumption A.4. (Sobolev Norm Identification)
(i) For θ ∈ Θ, Kθ: Ω → R, with Ω ⊂ R^d a bounded set satisfying (7.10) and (7.11) of Lions and Magenes[20].
(ii) For some M > 3d/2 (not necessarily an integer) there are constants c₁ > 0 and c₂ > 0 such that

    c₁ ‖Kθ‖²_{W₂^M} ≤ ‖Kθ‖²_{L₂} + <θ, Wθ> = ‖θ‖₁² ≤ c₂ ‖Kθ‖²_{W₂^M}

for θ ∈ Θ.
By definition of ‖·‖_p and A.4(i), ‖θ‖₀ = ‖Kθ‖_{L₂}, the L₂ norm being over the set Ω. Moreover A.4(i) allows Sobolev spaces W₂^s(Ω) for s ≥ 0 to be defined by the K-method of interpolation; see page 40 of[20]. From the K-method of interpolation, assumption (A.4) implies that ‖θ‖_p ≍ ‖Kθ‖_{W₂^{Mp}} for p ∈ [0,1] and, equivalently, ‖Kθ‖_{W₂^t} ≍ ‖θ‖_{t/M} for t ∈ [0,M]. Also, following an identical argument to that given in Lemma A1.2(a) of[12], ‖Kθ‖_{W₂^M} ≍ ‖θ‖₁. These relations are most useful. Thus (A.4) gives that the ‖·‖_p are identified with Sobolev norms for p ∈ [0,1]. For 2 > p > 1, ‖·‖_p can be identified with norms on Sobolev spaces satisfying additional boundary conditions; see §3 of Cox[13] where similar results are worked out in detail. A further stability condition is needed.
Assumption A.5. (Sobolev Norm Stability)
For some 3d/2 < s < M there are constants c₁ > 0 and c₂ > 0 such that

    c₁ ‖Kθ‖²_{W₂^s} ≤ ‖K_{θ*}θ‖²_{W₂^s} ≤ c₂ ‖Kθ‖²_{W₂^s}

for θ* ∈ N_θ₀ and θ ∈ Θ.

From (A.4) and (A.3), ‖Kθ‖_{W₂^s} ≍ ‖θ‖_{s/M} ≍ ‖θ‖*_{s/M}, so (A.5) implies ‖K_{θ*}θ‖_{W₂^s} ≍ ‖θ‖*_{s/M}. From the K-method of interpolation and (A.4),

    ‖K_{θ*}θ‖_{W₂^{ps}} ≍ ‖θ‖*_{ps/M} ≍ ‖θ‖_{ps/M} ≍ ‖Kθ‖_{W₂^{ps}},   p ∈ [0,1].
To analyze the second linearization we need some information concerning the limiting behavior of the eigenvalues of W relative to U = K*K. We shall see in Theorem 4.3 that (A.4) will imply γ_v ≍ v^{2M/d}. In general we assume that there is an estimate of this form.

Assumption A.6. Let γ_v, v = 1, 2, ..., be the eigenvalues of W relative to U. Then γ_v ≍ v^r for some r > 0.
Let C_r(λ, p) be defined as

    C_r(λ, p) = Σ_{v=1}^∞ [1 + v^r]^p [1 + λv^r]⁻².   (3.14)

From Theorem 2.4 of[13], C_r(λ, p) is convergent for p < 2 − 1/r and

    C_r(λ, p) ≍  λ^{−(p+1/r)}   for −1/r < p < 2 − 1/r,
                 log(1/λ)       for p = −1/r,
                 1              for p < −1/r.   (3.15)
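The first case of (3.15) is easy to check numerically: truncate the sum (3.14) at a large v and compare C_r at two small values of λ. The particular r and p below are illustrative.

```python
import numpy as np

# Numerical check of the rate in (3.15): for -1/r < p < 2 - 1/r the sum
# C_r(lambda, p) grows like lambda^{-(p + 1/r)} as lambda -> 0.  Truncating
# the infinite sum at a large v is the only approximation made here.
def C(lam, p, r, vmax=200000):
    v = np.arange(1, vmax + 1, dtype=float)
    return np.sum((1 + v**r)**p / (1 + lam*v**r)**2)

r, p = 2.0, 0.5                     # expected exponent p + 1/r = 1.0
lam1, lam2 = 1e-3, 1e-4
slope = np.log(C(lam2, p, r)/C(lam1, p, r)) / np.log(lam1/lam2)
```

The empirical log-log slope between the two λ values matches the predicted exponent p + 1/r closely.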
There is the following general result concerning the behavior of the second linearization.

Theorem 3.2. (Linearized Variability).
Suppose assumptions A.1 and A.3 through A.6 hold. If
(i) there is λ₀ > 0 such that θ_λ exists in N_θ₀ for λ ∈ [0, λ₀];
(ii) θ₀ ∈ Θ_p₀ with p₀ > 3d/2M and, for some δ > 0,

    ‖θ_λ − θ₀‖_{δ+3d/2M} = ‖θ̄_λ − θ₀‖_{δ+3d/2M} (1 + o(1))   as λ → 0;

then for −1/r < p < 2 − 1/r − d/M and ε > 0 arbitrarily small,

    E‖θ̃_nλ − θ_λ‖_p² ≲ n⁻¹C_r(λ, p) + n⁻¹k_n C_r(λ, p + d/2M) + k_n² λ^ε C_r(λ, p).

If λ_n is a sequence such that n⁻¹λ_n^{−d/M} → 0, then

    E‖θ̃_nλ − θ_λ‖_p² ≍ n⁻¹ λ^{−(p+1/r)}

uniformly for λ ∈ [λ_n, λ₀] and for −1/r < p < 2 − 1/r − d/M.
Proof. From (A.3), ‖·‖_p ≍ ‖·‖_λp, so using the spectral expansion for the norm (here K_λ = K_{θ_λ}),

    ‖θ̃_nλ − θ_λ‖_λp² = Σ_v [1 + γ_λv]^p [1 + λγ_λv]⁻² <Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>².   (3.16)

But

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² = n⁻²σ² Σ_{i=1}^n (K_λφ_λv(x_i))²
        + | ∫ [η(θ₀; x) − η(θ_λ; x)] K_λφ_λv(x) d(F_n − F)(x) |²,   (3.17)

where σ² is the variance of the ε_i's. Therefore

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² ≲ n⁻¹ ‖K_λφ_λv‖²_{L₂} + n⁻¹ | ∫ [K_λφ_λv(x)]² d(F_n − F)(x) |
        + | ∫ [η(θ₀; x) − η(θ_λ; x)] K_λφ_λv(x) d(F_n − F)(x) |².   (3.18)

Using integration by parts, Hölder's inequality and Sobolev's Imbedding Theorem to analyze the terms involving F_n − F (there are several examples of this in[12]) we obtain

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² ≲ n⁻¹ ‖K_λφ_λv‖²_{L₂} + n⁻¹ k_n ‖K_λφ_λv‖_{L₂} ‖K_λφ_λv‖_{W₂^d}
        + k_n² ‖η(θ₀; ·) − η(θ_λ; ·)‖²_{W₂^d} ‖K_λφ_λv‖²_{W₂^d}.   (3.19)
Clearly

    ‖η(θ₀; ·) − η(θ_λ; ·)‖²_{W₂^d} ≲ sup_{x ∈ Ω} |η(θ₀; x) − η(θ_λ; x)|² + Σ_{i=1}^d sup_{x ∈ Ω} | ∂η(θ₀; x)/∂x_i − ∂η(θ_λ; x)/∂x_i |².

Consider the first term here. By the mean value theorem,

    η(θ₀; x) − η(θ_λ; x) = K_{θ̄(x)}(θ₀ − θ_λ)[x],   (3.20)

where θ̄(x) lies on the segment joining θ₀ and θ_λ. Therefore

    sup_x |η(θ₀; x) − η(θ_λ; x)|² = sup_x |K_{θ̄(x)}(θ_λ − θ₀)[x]|²
        ≤ sup_{θ* ∈ N_θ₀} sup_{x ∈ Ω} |K_{θ*}(θ_λ − θ₀)[x]|²   (3.21)
        ≲ sup_{x ∈ Ω} |K(θ_λ − θ₀)[x]|² ≲ ‖K(θ_λ − θ₀)‖²_{W₂^{d/2+δ}},

where δ > 0 is arbitrarily small. The Sobolev norm stability assumption (A.5) was used to obtain the second to last inequality; the last inequality uses Sobolev's Inequalities, see Theorems 3.9 and 3.10 of[1]. By a similar analysis, the second term is bounded by a constant times ‖K(θ_λ − θ₀)‖²_{W₂^{3d/2+δ}}, with δ such that 3d/2 + δ < s in (A.5). Using hypothesis (ii) and (A.4),

    ‖K(θ_λ − θ₀)‖_{W₂^{3d/2+δ}} ≲ ‖θ_λ − θ₀‖_{3d/2M+ε} ≲ ‖θ̄_λ − θ₀‖_{3d/2M+ε},   (3.22)

where ε > 0 can be arbitrarily small. θ₀ ∈ Θ_p₀ for p₀ > 3d/2M, so Theorem 3.1 gives

    ‖η(θ₀; ·) − η(θ_λ; ·)‖_{W₂^d} ≲ ‖θ̄_λ − θ₀‖_{3d/2M+ε} ≲ λᵃ,   (3.23)

where a > d/2M and ε > 0 is arbitrarily small. Since ‖K_λφ_λv‖_{L₂} ≍ ‖φ_λv‖_λ0 = [1 + γ_λv]⁰ = 1, and using (A.5), (A.4) and (A.3) (in that order),

    ‖K_λφ_λv‖_{W₂^d} ≍ ‖Kφ_λv‖_{W₂^d} ≍ ‖φ_λv‖_{d/M} ≍ ‖φ_λv‖_{λ,d/M} = [1 + γ_λv]^{d/2M}.   (3.24)
Thus

    E<Z_nλ(θ_λ) − Z_λ(θ_λ), φ_λv>² ≲ n⁻¹ + n⁻¹k_n [1 + γ_λv]^{d/2M} + k_n² λ^{2a} [1 + γ_λv]^{d/M},   (3.25)

where a > d/2M. But from (A.5) and (A.6), γ_λv ≍ v^r, so for −1/r < p < 2 − 1/r − d/M,

    E‖θ̃_nλ − θ_λ‖_p² ≲ n⁻¹C_r(λ, p) + n⁻¹k_n C_r(λ, p + d/2M) + k_n² λ^{2a} C_r(λ, p + d/M).   (3.26)

Using the constraints on p and expression (3.15) for C_r(λ, p), this reduces to

    E‖θ̃_nλ − θ_λ‖_p² ≲ n⁻¹C_r(λ, p) + n⁻¹k_n λ^{−d/2M} C_r(λ, p) + k_n² λ^{2a−d/M} C_r(λ, p)
        ≲ n⁻¹C_r(λ, p) + n⁻¹k_n λ^{−d/2M} C_r(λ, p) + k_n² λ^ε C_r(λ, p),   (3.27)

where ε > 0 is arbitrarily small. This proves the first statement of the theorem. The last part follows immediately. □
4. Linearization Results for Hammerstein Equations
Here we consider data functionals which are evaluations in the range of a first kind Hammerstein integral operator,

    η(θ; x) = ∫_Ω k(u, x) f(θ(u), u) du,   x ∈ Ω,   (4.1)

or

    η(θ; x) = N(θ)[x] = K F(θ)[x],   (4.2)

where K is an integral operator with kernel k(u, x). K is assumed to be compact and invertible with zero null space. F is not assumed to be compact; indeed, in our examples the derivative of F defines an isomorphism on L₂(Ω). For Hammerstein operators K determines the smoothness or compactness while F determines the degree of non-linearity. In order for η(θ; x) to be defined we assume f(x, y) is continuously differentiable in x and that f(θ(u), u) and (∂f/∂x)(θ(u), u) are in L₂(Ω) for θ in L₂(Ω). The linearized operator is

    K_θφ[x] = ∫_Ω k(u, x) h(u) φ(u) du,   (4.3)

where h(u) = (∂f/∂x)(θ(u), u). Thus K_{θ*} = K_h with h(u) = (∂f/∂x)(θ*(u), u).
In practice, integral operators typically arise as Green's operators for differential or pseudodifferential operators. Boundary value problems give rise to integral equations of the first kind, initial value problems to Volterra integral equations. Following Agmon[1], a differential operator A of order l is defined as

    A(x, D) = Σ_{|α| ≤ l} a_α(x) D^α,   (4.4)

where α = (α₁, α₂, ..., α_d)′ is a d-dimensional integer multi-index with non-negative components, |α| = α₁ + ··· + α_d, and D^α is the differential operator

    D^α = D₁^{α₁} D₂^{α₂} ··· D_d^{α_d},   D_i = ∂/∂x_i.   (4.5)
Assumption A.7. (Properties of K)
(i) K is compact with zero null space.
(ii) K is a Green's operator for a linear differential operator A with prescribed boundary conditions: whenever Kφ = ψ then Aψ = φ and ψ satisfies the boundary conditions; conversely, if Aψ = φ and ψ satisfies the boundary conditions, then Kφ = ψ.
Assumption A.8. (Quadratic Penalty)
The penalty functional has the form J(θ) = ‖Lθ‖²_{L₂}, where L is an m-th order linear differential operator with real coefficients,

    L(x, D) = Σ_{|β| ≤ m} b_β(x) D^β.
Assumptions (A.7) and (A.8) are in force throughout this section. Since J is quadratic, the first half of the stability condition in (A.3) is automatically satisfied with W = L*L, where L* is the adjoint of L; see p. 51 of[1] for the definition of the adjoint. W is a positive operator (<θ, Wθ> = ‖Lθ‖²_{L₂} ≥ 0). If the coefficients of A are sufficiently smooth then the composition of L and A is well defined. To establish (A.4) it is enough that

    c₁ ‖φ‖²_{W₂^{m+l}} ≤ ‖LAφ‖²_{L₂} + ‖φ‖²_{L₂} ≤ c₂ ‖φ‖²_{W₂^{m+l}}   (4.6)

for φ in the range of K. Sobolev's Inequality may be used to obtain the upper bound. The lower bound is much more delicate. The natural tool is Gårding's Inequality (see Theorem 7.6 of[1]). If L and A are uniformly elliptic with sufficiently smooth coefficients, then ‖LAφ‖²_{L₂} is a uniformly elliptic quadratic form of order 2(m+l), and Gårding's Inequality gives that

    ‖LAφ‖²_{L₂} + ‖φ‖²_{L₂} ≳ ‖φ‖²_{W₂^{m+l}},   (4.7)

but only for φ ∈ W_{0,2}^{m+l} (elements of W₂^{m+l} whose derivatives up to order m+l−1 vanish on the boundary of Ω; see Definition 8.1 of[1]). In the case that K corresponds to a Dirichlet problem for A we might have that Kψ ∈ W_{0,2}^l(Ω). This goes some way towards establishing the result, but what we need is that Kψ ∈ W_{0,2}^{m+l}(Ω), which typically would not be true. It seems that we shall have to resign ourselves to establishing the norm equivalence on a case by case basis. More significant progress is possible with regard to the other assumptions. We begin with stability.
4.1. Stability Results
For h a real-valued function defined on Ω let

K_h φ[x] = K(φh)[x] = ∫_Ω k(x,u) φ(u) h(u) du.

We consider the situation where A is elliptic and K corresponds to a Green's function for a boundary value problem. Since J(θ) = ||Lθ||²_{L²}, for the first stability assumption (A.3) we need only show that ||Kφ||_{L²} ≍ ||K_h φ||_{L²} for some range of h. For (A.5) we establish the same relation in Sobolev norms. In one dimension we have the following result.
Theorem 4.1. Suppose Ω = [0,1] and let A be an ordinary differential operator with real coefficients

A(x,D)φ = Σ_{v=0}^{l} a_v(x) φ^{(v)}(x)

where a_v ∈ C^v(Ω) for v = 1,2,…,l and a_l^{-1} is bounded. Suppose K is a Green's operator for A with Dirichlet boundary conditions: U_v(φ) = 0 for v = 1,2,…,l, where the U_v(φ) are distinct elements of {φ^{(k)}(0), φ^{(k)}(1); k = 0,1,…,l−1}.
Let B¹(R) be the ball of radius R in W₂¹[0,1]. For each δ > 0 there are constants c₁ > 0 and c₂ > 0 such that for all φ ∈ L² and h ∈ B¹(R) with |h| ≥ δ,

c₁ ||Kφ||_{L²} ≤ ||K_h φ||_{L²} ≤ c₂ ||Kφ||_{L²}.
Proof. Since the coefficients a_v are smooth, the adjoint differential operator A* is defined:

A*(x,D)φ = Σ_{v=0}^{l} (−1)^v D^v [a_v(x) φ(x)].    (4.8)

The leading coefficient in the adjoint is a_l and by assumption the lower order terms are bounded. Let G(t,x) be the Green's function for the adjoint; we have k(x,t) = G(t,x). From p. 180 of Collatz [9], (∂^v G/∂t^v)(t,x) is continuous in x and t for v ≤ l−2, and (∂^{l−1}G/∂t^{l−1})(t,x) is continuous except for a jump of size 1/a_l(x) along the "critical line" x = t. It follows that ∂^v k/∂t^v is square integrable for v = 1,2,…,l−1. Now, since G is the Green's function for the adjoint equation,

A*(t,D) G(t,x) = 0    (4.9)

for all x ≠ t ∈ [0,1]. This means that a_l(x)(∂^l G/∂t^l)(t,x) is a linear combination of lower order partial derivatives of G for x ≠ t. The coefficients in the linear combination are bounded. Applying the Cauchy-Schwarz inequality we get that the L² norm of ∂^l k/∂t^l is also bounded. Since G(t,x) is a Green's function for the adjoint equation, G(t,x) satisfies the adjoint boundary values. Thus, for any solution ψ of the differential equation, the boundary products ψ^{(v)}(t)(∂^v k/∂t^v)(x,t) arising in the integrations by parts below vanish at t = 0 and t = 1.
We begin with the lower bound. Take ψ(x) = K_h φ[x], so

A(x,D)ψ(x) = Σ_{v=0}^{l} a_v(x) ψ^{(v)}(x) = φ(x) h(x).    (4.10)

Let b_v(x) = a_v(x)/h(x), so φ(x) = Σ_{v=0}^{l} b_v(x) ψ^{(v)}(x). Then

Kφ[x] = ∫₀¹ k(x,t) φ(t) dt = Σ_{v=0}^{l} ∫₀¹ k(x,t) b_v(t) ψ^{(v)}(t) dt    (4.11)

so

||Kφ||²_{L²} ≤ (l+1) Σ_{v=0}^{l} ∫₀¹ { ∫₀¹ k(x,t) b_v(t) ψ^{(v)}(t) dt }² dx.    (4.12)
The remainder is integration by parts. Consider the v = 1 term on the right-hand side. Integrating by parts and using the condition ψ(0)k(x,0) = ψ(1)k(x,1) = 0 gives

∫₀¹ { ∫₀¹ k(x,t) b₁(t) ψ^{(1)}(t) dt }² dx = ∫₀¹ { ∫₀¹ (∂/∂t)[k(x,t) b₁(t)] ψ(t) dt }² dx.    (4.13)

Applying the Cauchy-Schwarz and Sobolev inequalities we get the upper bound

≤ ||ψ||²_{L²} { ∫₀¹ ||(∂k/∂t)(x,·)||²_{L²} dx } ||a₁||²_{W₂¹} ||h⁻¹||²_{W₂¹} ≤ ||ψ||²_{L²} C(δ,R)    (4.14)

where C(δ,R) is a positive constant depending only on R and δ. Similar estimates can be obtained for each v ≤ l−1. The v = l term brings up ∫₀¹ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt, which is complicated because it involves the (l−1)'th partial derivative of k together with the first derivative of ψ:

∫₀¹ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt = ∫₀^x (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt + ∫_x^1 (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt.

Integrating by parts (using the boundary conditions) reduces this to

[ (∂^{l-1}k/∂t^{l-1})(x,x−0) − (∂^{l-1}k/∂t^{l-1})(x,x+0) ] b_l(x) ψ(x) − ∫₀¹ (∂/∂t)[ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ] ψ(t) dt.    (4.15)

The jump in (∂^{l-1}k/∂t^{l-1})(x,t) at t = x is of size 1/a_l(x), and applying Cauchy-Schwarz we have

∫₀¹ { ∫₀¹ (∂^{l-1}k/∂t^{l-1})(x,t) b_l(t) ψ^{(1)}(t) dt }² dx ≤ ||ψ||²_{L²} C(δ,R)    (4.16)

where C(δ,R) is a constant. In this manner we can obtain the bound

||Kφ||²_{L²} ≤ ||ψ||²_{L²} C(δ,R) { Σ_{v=1}^{l} ∫₀¹ ||(∂^v k/∂t^v)(x,·)||²_{L²} dx } { Σ_{v=0}^{l} ||a_v||²_{W₂¹} }.    (4.17)

Since ||ψ||²_{L²} = ||K_h φ||²_{L²}, this gives the lower bound.
For the upper bound let ψ(x) = Kφ[x], so A(x,D)ψ(x) = φ(x). Put g_h(x) = K((Aψ)h)[x] = K(φh)[x]. Using the Green's function,

g_h(x) = ∫₀¹ k(x,t) { Σ_{v=0}^{l} a_v(t) ψ^{(v)}(t) } h(t) dt.    (4.18)

By Leibnitz's rule

(d^v/dt^v)(ψ(t)h(t)) = Σ_{k=0}^{v} (v choose k) ψ^{(k)}(t) h^{(v−k)}(t).    (4.19)

So

g_h(x) = ∫₀¹ k(x,t) Σ_{v=0}^{l} a_v(t) (d^v/dt^v)(ψ(t)h(t)) dt − ∫₀¹ k(x,t) { Σ_{v=0}^{l} Σ_{k=0}^{v−1} (v choose k) a_v(t) ψ^{(k)}(t) h^{(v−k)}(t) } dt
     = ψ(x)h(x) − ∫₀¹ k(x,t) { Σ_{v=0}^{l} Σ_{k=0}^{v−1} (v choose k) a_v(t) ψ^{(k)}(t) h^{(v−k)}(t) } dt.    (4.20)

Integrating by parts and applying the Cauchy-Schwarz and Sobolev inequalities gives

||g_h||²_{L²} ≤ ||ψ||²_{L²} C(R) { Σ_v ∫₀¹ ||(∂^v k/∂t^v)(x,·)||²_{L²} dx } { Σ_v ||a_v||²_{W₂¹} } ≤ ||ψ||²_{L²} c₂(R).    (4.21)

Since ||g_h||_{L²} = ||K_h φ||_{L²} and ||ψ||_{L²} = ||Kφ||_{L²}, the upper bound is also proved. ∎
The above result may very well extend to multi-dimensional domains and to pseudodifferential operators, but at the moment we have no results to report. For differential operators satisfying an elliptic-type coercive estimate (see Chapters 10 and 11 of Agmon [1]), assumption (A.3) implies assumption (A.5).
Theorem 4.2. Let B^k(R) be the Sobolev ball of radius R in W₂^k(Ω). Suppose K is a Green's operator for an l'th order differential operator A, and that
(i) the coefficients of A are in W₂^k(Ω) and, for φ ∈ W₂^{l+k}(Ω),

||φ||²_{W₂^{l+k}} ≍ Σ_{|α|=k} ||D^α Aφ||²_{L²} + ||φ||²_{L²};

(ii) for h ∈ H, ||K_h φ||_{L²} ≍ ||Kφ||_{L²}.
Then for h ∈ B^k(R) ∩ H with |h| ≥ δ > 0 for some δ,

||K_h φ||²_{W₂^{l+k}} ≍ ||Kφ||²_{W₂^{l+k}}

uniformly in h.
Proof: We first show that there is c₁ > 0 such that for all φ

||K_h φ||²_{W₂^{l+k}} ≤ c₁ ||Kφ||²_{W₂^{l+k}}.    (4.22)

This does not use the condition that |h| ≥ δ. Applying (i) to K_h φ = K(φh), for which A K_h φ = φh, and then (ii),

||K_h φ||²_{W₂^{l+k}} ≍ Σ_{|α|=k} ||D^α(φh)||²_{L²} + ||K_h φ||²_{L²} ≲ Σ_{|α|=k} ||D^α(φh)||²_{L²} + ||Kφ||²_{L²}.

Applying the Cauchy-Schwarz and Sobolev inequalities,

Σ_{|α|=k} ||D^α(φh)||²_{L²} ≲ { Σ_{|α|≤k} ||D^α h||²_{L²} } { Σ_{|α|=k} ||D^α φ||²_{L²} + ||φ||²_{L²} }.    (4.23)

We are done if we can bound ||φ||²_{W₂^k} in terms of ||Kφ||²_{L²} + Σ_{|α|=k} ||D^α φ||²_{L²}. Since A is an l'th order differential operator with smooth coefficients,

||Aψ||²_{L²} ≲ ||ψ||²_{L²} + Σ_{|α|=l} ||D^α ψ||²_{L²} ≲ ||ψ||²_{W₂^l}.    (4.24)

Letting ψ = Kφ and using (i), this gives

||φ||²_{W₂^k} ≲ ||Kφ||²_{W₂^{l+k}} ≍ ||Kφ||²_{L²} + Σ_{|α|=k} ||D^α φ||²_{L²}    (4.25)

which establishes the upper bound. For the lower bound, let φ̃ = φh, so φ = φ̃h⁻¹ and Kφ = K_{h⁻¹} φ̃. Now use the upper bound to get

||Kφ||²_{W₂^{l+k}} = ||K_{h⁻¹} φ̃||²_{W₂^{l+k}} ≤ c₂ ||Kφ̃||²_{W₂^{l+k}},    (4.26)

in other words ||Kφ||²_{W₂^{l+k}} ≤ c₂ ||K_h φ||²_{W₂^{l+k}}. ∎
4.2. Growth Rate Results
Assumption (A.4) gives us:
Theorem 4.3. Suppose assumption (A.4) holds; then γ_v ≍ v^{2M/d}.
Proof: Let λ_v be the eigenvalues of the Rayleigh quotient

||Kθ||²_{L²} / { ||Kθ||²_{L²} + ⟨θ,Wθ⟩ }    (4.27)

so that λ_v⁻¹ − 1 = γ_v. By (A.4) and the Mapping Principle in Weinberger [36], γ_v ≍ μ_v, where μ_v are the eigenvalues of

||Kθ||²_{W₂^M} / ||Kθ||²_{L²}.    (4.28)

Let L be an integer larger than M. By A.4(i), Sobolev norms may be defined by the K-method of interpolation; see page 40 of [20]. Thus for ρ = M/L

||Kθ||²_{W₂^M} = Σ_v [1 + ξ_v]^ρ ⟨φ_v, Kθ⟩²    (4.29)

where ξ_v and φ_v are the eigenvalues and eigenfunctions associated with the quadratic form Σ_{|α|=L} ||D^α φ||²_{L²}. From standard theory for elliptic differential operators (Agmon [1], for example), ξ_v ≍ v^{2L/d}. Direct calculation of μ_v using the Maximum-Minimum characterization of eigenvalues (pp. 79 of [35]) gives μ_v = [1 + ξ_v]^ρ. (The corresponding eigenfunctions are K⁻¹φ_v.) Thus μ_v ≍ v^{2M/d}, and so

γ_v ≍ v^{2M/d}. ∎
If K is a Green's operator for an l'th order differential operator and L is of order m, then γ_v ≍ v^{2(m+l)/d}. Thus if (A.4) is true we must have M = m + l. We provide some results in this direction next. Assumptions A.7 and A.8 are in force for the remainder of this section.
Theorem 4.4. (A and L elliptic) Suppose
(i) a_α ∈ C^{2(m+l)}(Ω) for |α| ≤ l and l_β ∈ C^{m+l}(Ω) for |β| ≤ m;
(ii) A and L are uniformly elliptic (see pp. 45 of [1]).
Then if γ_v are the eigenvalues of W relative to K*K,

γ_v ≍ v^{2(m+l)/d}.
Proof:

LA(x,D) = Σ_{|β|≤m} l_β(x) D^β { Σ_{|α|≤l} a_α(x) D^α }    (4.30)

is an (l+m)'th order differential operator whose coefficients are real and lie in C^{(l+m)}(Ω). Let P(x,D) = (LA)*(LA) = A*WA. Given the smoothness, P(x,D) is a differential operator of order 2(m+l) with continuous coefficients. ⟨φ, P(x,D)φ⟩ = ||LAφ||²_{L²}, and since the coefficients of LA are real, P(x,D) is a positive operator. Let P'(x,D) be the principal part of P(x,D). For ξ ∈ R^d

P'(x,ξ) = Σ_{|β|,|β'|=m} Σ_{|α|,|α'|=l} l_β(x) l_{β'}(x) a_α(x) a_{α'}(x) ξ^{α+β+α'+β'}    (4.31)
        = { Σ_{|β|=m} Σ_{|α|=l} l_β(x) a_α(x) ξ^{α+β} }²
        ≥ ε₀ |ξ|^{2l} |ξ|^{2m} = ε₀ |ξ|^{2(l+m)}

where ε₀ > 0. The last inequality comes from the ellipticity of A and L. Therefore P(x,D) is a uniformly elliptic differential operator of order 2(m+l). Let {λ_v, ψ_v; v = 1,2,…} be the eigenvalues and eigenfunctions of P:

A*WA ψ_v = λ_v ψ_v,
⟨ψ_v, ψ_μ⟩ = δ_{vμ},  v,μ = 1,2,…    (4.32)

From Agmon [1], λ_v ≍ v^{2(m+l)/d}. Let φ_v(x) = A(x,D)ψ_v(x), so Kφ_v = ψ_v. By substitution

⟨φ_μ, Wφ_v⟩ = λ_v δ_{vμ},
⟨Kφ_μ, Kφ_v⟩ = δ_{vμ},  v,μ = 1,2,…    (4.33)

Therefore γ_v = λ_v, which proves the result. ∎
4.3. Generalizations to Pseudodifferential Operators
There are extensions of the above results which apply to more general pseudodifferential operators. We give a theorem related to this below. To motivate it, we first indicate estimates for the eigenvalues of Abel transforms. Related results are given in Nychka and Cox [23].
4.3.1. Abel Transform
Let f: [0,1] → R. The fractional integral or Abel transform of f is I_α f(x), where

I_α f(x) = (1/Γ(α)) ∫₀^x [x−u]^{α−1} f(u) du for 0 < α ≤ 1, and I₀ f(x) = f(x).    (4.34)

This transform has many practical applications; see Ross [28] for example. Suppose W = L*L where Lf(x) = (d^m/dx^m) f(x). We wish to consider the eigenvalues of the Rayleigh quotient

(Lφ, Lφ) / (I_α φ, I_α φ)    (4.35)

for φ ∈ W₂^m[0,1] and 0 < α ≤ 1.
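As an aside (my own illustration, not part of the paper), the definition (4.34) is easy to check numerically: for f ≡ 1 one has I_α f(x) = x^α/Γ(α+1). A crude midpoint rule on the weakly singular kernel suffices for a sketch:

```python
import numpy as np
from math import gamma

# Check the Abel transform (4.34) against the closed form for f = 1:
# I_a f(x) = x^a / Gamma(a + 1).
a = 0.5
n = 4000
h = 1.0 / n
u = (np.arange(n) + 0.5) * h                 # midpoint quadrature nodes

def abel(f_vals, x):
    mask = u < x
    return (h / gamma(a)) * np.sum((x - u[mask]) ** (a - 1.0) * f_vals[mask])

x0 = 0.7
approx = abel(np.ones(n), x0)
exact = x0 ** a / gamma(a + 1.0)             # = 2 sqrt(x0) / sqrt(pi)
```

The midpoint rule is not a good scheme near the singularity at u = x, but the error it commits there is O(√h), which is adequate for illustration.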
Theorem 4.5. If γ_v, v = 1,2,… are the eigenvalues of the Rayleigh quotient in (4.35) arranged in increasing order, then γ_v ≍ v^{2(m+α)}.
Proof: For convenience replace the interval [0,1] by [0,2π]. Let Θ_c = {φ ∈ W₂^m[0,2π] : ∫₀^{2π} φ(s)ds = 0, φ^{(j)}(0) = φ^{(j)}(2π), 0 ≤ j ≤ m}; Θ_c consists of periodic functions on [0,2π] with mean zero. Let λ_v be the eigenvalues of the Rayleigh quotient in (4.35) over Θ_c. By the Separation Theorem on pp. 107 of Weinberger [35], λ_{v−2m} ≤ γ_v ≤ λ_v for v ≥ 2m+1. Any φ ∈ Θ_c can be represented as a Fourier series (whose m'th derivative is convergent in L²): φ(x) = Σ_{|k|>0} φ̂_k e^{ikx}, and

Lφ(x) = Σ_{|k|>0} φ̂_k e^{ikx} (ik)^m    (4.36)

where

φ̂_k = (1/2π) ∫₀^{2π} φ(x) e^{−ikx} dx.    (4.37)

Since φ̂₀ = 0, from pp. 120 of Butzer and Westphal [5],

I_α φ(x) = Σ_{|k|>0} φ̂_k (ik)^{−α} e^{ikx}    (4.38)

for 0 < α ≤ 1. Thus

(Lφ,Lφ)/(I_αφ,I_αφ) = Σ_{|k|>0} |φ̂_k|² k^{2m} / Σ_{|k|>0} |φ̂_k|² k^{−2α}.    (4.39)

Letting ψ_k = k^{−α} φ̂_k for |k| > 0, the Rayleigh quotient becomes

Σ_{|k|>0} |ψ_k|² k^{2(m+α)} / Σ_{|k|>0} |ψ_k|².    (4.40)

Using the Maximum-Minimum characterization of eigenvalues we can compute the eigenvalues and eigenvectors explicitly: this gives eigenvalues λ_v ≍ v^{2(m+α)}, with corresponding eigenvectors ψ_k^{(v)} = 1/2 for k = −v, v and ψ_k^{(v)} = 0 otherwise. So γ_v ≍ v^{2(m+α)}. ∎
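The diagonalization in (4.39)-(4.40) also gives a direct numerical reading of the growth rate (my own illustration; m = 1 and α = 1/2 are arbitrary choices): in the ψ_k coordinates the quotient is diagonal with entries k^{2(m+α)}, so the eigensequence slope on a log-log scale is exactly 2(m+α) = 3.

```python
import numpy as np

# In the psi_k coordinates of (4.40) the Rayleigh quotient is diagonal, so the
# eigenvalues are k^{2(m+alpha)}; a log-log fit recovers the exponent.
m, alpha = 1, 0.5
k = np.arange(1.0, 51.0)
lam = k ** (2 * m) / k ** (-2 * alpha)       # = k^{2(m+alpha)} = k^3

slope = np.polyfit(np.log(k), np.log(lam), 1)[0]   # 2(m+alpha) = 3
```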
Abel transforms are related to Green's functions for fractional derivatives. We end this section with a result for general operators of this type, the so-called pseudodifferential operators.
4.3.2. Pseudodifferential Operators
Here we follow Taylor [32]. The symbol class S^m_{1,0}(Ω) is defined as the set of p ∈ C^∞(Ω×R^d) with the property that for any compact M ⊂ Ω and any multi-indices α, β there exists C_{M,α,β} such that

|D^β_x D^α_ξ p(x,ξ)| ≤ C_{M,α,β} (1 + |ξ|)^{m−|α|}    (4.41)

for all x ∈ M and ξ ∈ R^d. The operator corresponding to the symbol p is denoted P; one says P ∈ OPS^m_{1,0}(Ω). The action of P on a function is defined by means of the Fourier inversion formula. The Fourier transform of φ: Ω → R is φ̂(ξ) = (2π)^{−d} ∫ φ(x) e^{−ix·ξ} dx for ξ ∈ R^d, and

P(x,D)φ(x) = ∫ p(x,ξ) φ̂(ξ) e^{ix·ξ} dξ.    (4.43)

Adjoints and products of pseudodifferential operators are themselves pseudodifferential operators, and there are asymptotic expansions for their symbols; see Chapter 2 of Taylor [32]. The operator P is elliptic of order m if on each compact M ⊂ Ω there are constants C_M and R such that

|p(x,ξ)| ≥ C_M (1 + |ξ|)^m  if x ∈ M, |ξ| ≥ R.    (4.44)
The following lemma is elementary.
Lemma 4.4. If P₀ ∈ OPS^m_{1,0}(Ω) is elliptic of order m and P₁ ∈ OPS^{m−1}_{1,0}(Ω), then P₀ + P₁ ∈ OPS^m_{1,0}(Ω) and is elliptic of order m.
Proof. P₀ + P₁ is clearly an element of OPS^m_{1,0}(Ω). Consider a compact M ⊂ Ω. Since P₀ is elliptic of order m and P₁ ∈ OPS^{m−1}_{1,0}(Ω), there are positive constants R, C₁ and C₂ such that

|p₀(x,ξ)| ≥ C₁ (1 + |ξ|)^m    (4.45)

and

|p₁(x,ξ)| ≤ C₂ (1 + |ξ|)^{m−1}    (4.46)

for all x ∈ M and |ξ| ≥ R. Therefore

|p(x,ξ)| ≥ C₁ (1 + |ξ|)^m − C₂ (1 + |ξ|)^{m−1}.    (4.47)

Hence we can find R' (e.g. R' ≥ max(2C₂/C₁, R)) such that p(x,ξ) satisfies the ellipticity condition. ∎
From the spectral theory of pseudodifferential operators we have the following result.
Theorem 4.6. Let K be a Green's operator for an operator A, where A ∈ OPS^l_{1,0}(Ω) is a pseudodifferential operator with symbol a. Let W ∈ OPS^{2m}_{1,0}(Ω) be a positive elliptic differential operator of order 2m with symbol w(x,ξ). If A*A is elliptic of order 2l and Ω is compact, then the eigenvalues γ_v of W relative to K*K behave as

γ_v ≍ v^{2(m+l)/d}.

Proof. Let P = A*WA. Since W is positive, so is P. By Exercise 4.2 on p. 47 of [32], P ∈ OPS^{2(m+l)}_{1,0}(Ω). Let p be the symbol of P, p₀ = a*(x,ξ) w(x,ξ) a(x,ξ) and p₁ = p − p₀. Then p₀ ∈ S^{2(m+l)}_{1,0}(Ω) and

|p₀(x,ξ)| = |a*(x,ξ) a(x,ξ)| |w(x,ξ)|,

which gives that the operator corresponding to p₀ is elliptic of order 2(m+l). Using an asymptotic expansion of p and applying Theorem 4.4 (p. 46 of [32]) twice, p₁ = p − p₀ ∈ S^{2(m+l)−1}_{1,0}(Ω). Lemma 4.4 then shows that P is elliptic of order 2(m+l). P has a complete set of orthonormal (in L²) eigenfunctions {ψ_v; v = 1,2,…} and corresponding eigenvalues {λ_v; v = 1,2,…}. From the discussion at the beginning of §1, page 295 of [32], and Theorem 2.1 in the same chapter, λ_v ≍ v^{2(m+l)/d}. Letting φ_v(x) = A(x,D)ψ_v(x), γ_v = λ_v as in Theorem 4.4. ∎
5. Application to System Identification
For the abstract system identification problem in (1.10),

η(θ; x) = l_x(u_θ)    (5.1)

where l_x is a continuous linear functional on U and u_θ satisfies A(θ, u_θ) = f. l_x is a component of the measurement operator X_n, u is a function u: Ω → R^q, and the measured data are equivalent to evaluations on u. Let X be the limiting design operator, X_n → X as n → ∞ (convergence in L² norm).

Assumption A.9. (Equivalence to Evaluation)
The mapping X: L²(Ω) → L²(I) is an isomorphism.

Thus the L² norm in the range of X is equivalent to the L² norm in the domain of X. It is sometimes of interest to consider measurements which fill up the frequency domain; the Plancherel theorem shows that (A.9) will be true for such data. General integral measurements (compact X) are excluded by the assumption.

Formally the linearized operator is defined by K_θ φ[x] = l_x(Du_θ φ). From assumption A.1(i)

||K_θ φ||_{L²} ≍ ||Du_θ φ||_{L²}    (5.2)

uniformly in θ. We therefore assume without loss of generality that K_θ φ[x] = Du_θ φ[x]. The form of K_θ follows from the implicit function theorem.
Theorem 5.1. Let Θ, U, F be Banach spaces and suppose A: Θ×U → F. If
(i) A(θ,u) has uniformly continuous Fréchet derivatives with respect to θ and u, ∂_θA and ∂_uA, in a neighborhood of (θ₀,u₀), where A(θ₀,u₀) − f = 0 for some f ∈ F;
(ii) ∂_uA(θ₀,u₀) is invertible;
then there exists a Fréchet differentiable map u_θ: Θ → U defined in a neighborhood N_{θ₀} of θ₀ such that A(θ,u_θ) − f = 0 for θ ∈ N_{θ₀}. Moreover Du_θ satisfies

∂_uA(θ₀,u₀) {Du_θ φ} = − {∂_θA(θ₀,u₀) φ}.

Proof: See Theorem 2.3.5 of [17]. ∎
Remark: The existence of solutions for a given θ₀ is not resolved by the above theorem. For work on quasi-linear differential equations, see Cannon and Ewing [6] and Chapter 8 of Ladyzhenskaya and Ural'tseva [19]. A general theorem is given in Kravaris and Seinfeld [18].
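The sensitivity formula in Theorem 5.1 can be exercised on a small finite-dimensional toy system (everything below is my own construction, not from the paper): solve A(θ,u) = 0 by Newton's method and compare the implicit-function-theorem derivative with a finite difference.

```python
import numpy as np

# Toy system A(theta, u) = B u + 0.1 u^3 + C theta - f = 0 (all choices arbitrary).
# By Theorem 5.1 the sensitivity solves  dA/du (Du_theta phi) = - dA/dtheta phi.
B = np.array([[2.0, 0.3], [0.1, 1.5]])
C = np.array([[1.0, -0.5], [0.2, 0.8]])
f = np.array([1.0, 0.5])

def A(theta, u):
    return B @ u + 0.1 * u**3 + C @ theta - f

def dAdu(u):                                  # Jacobian in u
    return B + np.diag(0.3 * u**2)

def solve_u(theta):                           # Newton iteration for A(theta, u) = 0
    u = np.zeros(2)
    for _ in range(50):
        u = u - np.linalg.solve(dAdu(u), A(theta, u))
    return u

theta0 = np.array([0.3, -0.2])
phi = np.array([1.0, 2.0])
u0 = solve_u(theta0)

Du_phi = np.linalg.solve(dAdu(u0), -C @ phi)  # implicit-function-theorem sensitivity

t = 1e-6                                      # finite-difference check
fd = (solve_u(theta0 + t * phi) - u0) / t
```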
5.1. Examples
(i) Linear Diffusion Equation
Let u(x,t): Ω×[0,T] → R and θ(x): Ω → R.

A(θ,u) = ( ∂u/∂t − (∂/∂x)(θ(x) ∂u/∂x), ∂u/∂x |_{∂Ω}, u(·,0) )'    (5.3)

For θ strictly positive, A corresponds to a parabolic diffusion equation with Neumann boundary conditions. Using Theorem 5.1, K_θφ satisfies

(∂K_θφ/∂t)[x,t] − (∂/∂x)( θ(x) (∂K_θφ/∂x)[x,t] ) = (∂/∂x)( φ(x) (∂u_θ/∂x)[x,t] )    (5.4)

with zero initial and zero Neumann boundary conditions.
(ii) Non-linear Diffusion Equation
Again u(x,t): Ω×[0,T] → R, but now θ(x,u): Ω×R → R.

A(θ,u) = ( ∂u/∂t − (∂/∂x)(θ(x,u) ∂u/∂x), ∂u/∂x |_{∂Ω}, u(·,0) )'    (5.5)

For θ strictly positive, A corresponds to a quasi-linear parabolic diffusion equation with Neumann boundary conditions. The results in Cannon and Ewing [6] show that u_θ exists. K_θφ satisfies

(∂K_θφ/∂t)[x,t] − (∂/∂x)( θ(x,u_θ)(∂K_θφ/∂x)[x,t] + K_θφ[x,t] (∂θ/∂u)(x,u_θ)(∂u_θ/∂x)[x,t] ) = (∂/∂x)( φ(x,u_θ)(∂u_θ/∂x)[x,t] )    (5.6)

with zero initial and zero Neumann boundary conditions.
(iii) Scalar Wave Equation
Let u(x): Ω → R and θ(x): Ω → R.

A(θ,u) = ( [Δ + k²θ(x)]u , u|_{∂Ω} )'    (5.7)

∂Ω is the boundary of Ω. When the potential θ is strictly positive the equation can be solved. K_θφ satisfies

[Δ + k²θ(x)] K_θφ = − k² φ u_θ    (5.8)

with K_θφ[x] = 0 on the prescribed part of ∂Ω.
Remark
It is clear that if A is a differential operator whose boundary conditions do not depend on θ, then K_θφ will satisfy a differential equation with zero boundary conditions and forcing term determined by φ and u_θ. Symbolically we have

P_θ K_θφ = Q_θ φ    (5.9)

where P_θ and Q_θ are differential operators. Intuitively, if P_θ is elliptic of order l and Q_θ is elliptic of order k with l > k, then K_θφ should gain l−k orders of smoothness over φ. Thus if W corresponds to a positive elliptic differential operator of order 2m, we conjecture that the eigenvalues of W relative to K_θ*K_θ should asymptotically behave like v^{2(m+l−k)/d}. In one dimension with constant coefficient equations this intuition is easy to substantiate using Fourier series expansions. However, in general the justification is probably not so straightforward. We now consider some properties of the linearization arising in the linear diffusion example introduced in §1.
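The constant-coefficient Fourier computation alluded to above can be sketched as follows (my own illustration; the orders l = 2, j = 1, m = 2 are arbitrary choices): with P = D^l and Q = D^j, K_θ acts on the frequency-k Fourier mode by multiplication with (ik)^{j−l}, so the eigenvalues of W relative to K_θ*K_θ grow like k^{2(m+l−j)}, a gain of l−j orders, matching v^{2(m+l−k)/d} with d = 1.

```python
import numpy as np

# Constant-coefficient check of the smoothness-gain heuristic behind (5.9):
# P = D^l and Q = D^j act as (ik)^l and (ik)^j on Fourier modes, so K_theta is
# the multiplier (ik)^{j-l}, and W = (D^m)'(D^m) relative to K'K has eigenvalues
#   k^{2m} / k^{2(j-l)} = k^{2(m+l-j)}.
l_ord, j_ord, m = 2, 1, 2
k = np.arange(1.0, 41.0)

gain = k ** (j_ord - l_ord)                  # |multiplier|, decays like k^{-(l-j)}
gamma = k ** (2 * m) / gain ** 2             # = k^{2(m+l-j)} = k^6

slope = np.polyfit(np.log(k), np.log(gamma), 1)[0]   # 2(m+l-j) = 6
```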
5.2. Linear Diffusion
5.2.1. Time Invariant Case
Our discussion will be restricted to the 1-dimensional problem Ω = [0,1]. In the time invariant case it is of interest to identify θ in the equation

(d/dx)( θ(x) (du/dx)(x) ) = f(x)    (5.10)

with f known and u measured with error. The boundary conditions must specify information about u and its first derivative. For simplicity we assume the boundary data are u(0) = a and θ(0)u^{(1)}(0) = b ≠ 0. It is easy to show that K_θφ: [0,1] → R satisfies

(d/dx)( θ(x) (dK_θφ/dx)[x] ) = − (d/dx)( φ(x) (du_θ/dx)(x) )    (5.11)

with K_θφ[0] = 0 and θ(0)(dK_θφ/dx)[0] = −φ(0)(du_θ/dx)(0). Integrating once we have

θ(x) (dK_θφ/dx)[x] = − φ(x) (du_θ/dx)(x).    (5.12)

For θ strictly positive (or negative) we can integrate again, using the first boundary condition, to get

K_θφ[x] = − ∫₀^x φ(s) h_θ(s) ds    (5.13)

where h_θ(x) = (du_θ/dx)(x)/θ(x). Note that K_θφ corresponds to a Hammerstein type linearization like the ones considered in §4: K_θφ = −K(h_θ φ), where Kφ[x] = ∫₀^x φ(t) dt. Formally K is a Green's operator for the differential operator A(x,D) = D with a zero boundary condition at x = 0. We take L(x,D) = D^m. Then LA = D^{m+1} and

||LAφ||²_{L²} + ||φ||²_{L²} ≍ ||φ||²_{W₂^{m+1}}    (5.14)

so from the discussion at the beginning of §4 the Sobolev norm identification assumption (A.4) is satisfied. It follows from Theorem 4.3 that the eigenvalues of W relative to K*K behave like v^{2(m+1)} asymptotically. Stability results are obtained with some conditions on the forcing term. First a lemma.
Lemma 5.2. Let a ≥ 0 be an integer and suppose
(i) f ∈ W₂^a[0,1] with f(x) ≥ δ > 0 and b > 0 (or f(x) ≤ −δ < 0 and b < 0);
(ii) N(ε,R) = { θ : θ ∈ B^{1+a}(R) and θ(x) ≥ ε > 0 }.
Then h_θ ∈ B^{1+a}(R') and |h_θ| ≥ η > 0, where R' and η depend on R, δ, ε, f and b.
Proof: Integrating (5.10) and using the boundary condition gives

(du_θ/dx)(x) = (1/θ(x)) { ∫₀^x f(t) dt + b },  so  h_θ(x) = (du_θ/dx)(x)/θ(x) = (1/θ(x)²) { ∫₀^x f(t) dt + b }.

It follows from (i) and (ii) that there is some η > 0 such that |h_θ(x)| ≥ η. By direct computation it is easy to find C > 0 such that

||h_θ||²_{W₂^{1+a}} ≤ C ||1/θ²||²_{W₂^{1+a}} { b² + ||f||²_{W₂^a} } ≡ (R')².    (5.15)

∎
From Theorems 4.1 and 4.2 we obtain the next result.
Theorem 5.3. For some a ≥ 0 (an integer), ε > 0 and R > 0, let N_{θ₀} ⊂ N(ε,R) and f ∈ W₂^a; then assumptions (A.3) and (A.5) hold, with s = 2 + a in (A.5).
Proof: For (A.3), let A = D and Kφ[x] = ∫₀^x φ(y) dy. Since a ≥ 0, Lemma 5.2 gives that h_θ ∈ B¹(R') for θ ∈ N_{θ₀}; moreover h_θ is bounded away from zero (uniformly in θ ∈ N_{θ₀}). (A.3) now follows by Theorem 4.1.
From Lemma 5.2 we can take H = B^{a+1}(R') in Theorem 4.2. Since A = D, condition (i) of that theorem is clearly satisfied for k = a + 1, l = 1. We conclude from Theorem 4.2 that (A.5) is satisfied for s = 2 + a. ∎
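To make (5.13) concrete, the closed form K_θφ = −∫₀^x φ h_θ can be checked against a direct perturbation of u_θ (a numerical sketch; the grid, θ, f and b below are arbitrary choices, not from the paper):

```python
import numpy as np

# Sketch of (5.10)-(5.13). Integrating (5.10) with flux data theta(0)u'(0) = b:
#   u_theta'(x) = (int_0^x f + b) / theta(x),   h_theta = u_theta' / theta,
# and the linearization is K_theta phi = -int_0^x phi h_theta ds.
n = 1000
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]

def cumint(y):                               # cumulative trapezoid quadrature
    return np.concatenate([[0.0], np.cumsum(0.5 * (y[1:] + y[:-1])) * dx])

f = 1.0 + x
b = 1.0
theta = 1.0 + 0.5 * np.sin(np.pi * x)
phi = np.cos(np.pi * x)

g = cumint(f) + b                            # theta * u' = int f + b
u_prime = g / theta
h_theta = u_prime / theta                    # = g / theta^2, bounded away from 0

K_theta_phi = -cumint(phi * h_theta)         # closed form (5.13)

# Finite-difference check: perturb theta and differentiate u_theta numerically.
t = 1e-6
u_prime_t = g / (theta + t * phi)
fd = cumint((u_prime_t - u_prime) / t)       # approx derivative of u_theta at theta
```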
Letting r = 2(m+1), we can use Theorems 3.1 and 3.2 to read off the behavior of the bias and variability of the linearized estimators. Combining this with the results to be presented in §6 we get the following result.
Theorem 5.4 (Convergence of Constrained Regularization Estimator)
Suppose θ₀ ∈ W₂^k[0,1] for k ≥ 3.25 and θ₀(x) ≥ ε > 0. Let m ≤ k, M = m+1 and p₀ = (k+1)/M. If λ_n is a sequence such that n⁻¹λ_n^{−5/2M} → 0, then for 0 ≤ p ≤ 2 − 5/2M and any sequence of λ's tending to zero with λ ≥ λ_n,

||θ_λ − θ₀||²_p ≲ λ^{(p₀ − p)}
||θ_{nλ} − θ_λ||²_p = O_p( n⁻¹ λ^{−(p + 1/2M)} ).

Thus

||θ_{nλ} − θ₀||²_p ≲ λ^{(p₀ − p)} + O_p( n⁻¹ λ^{−(p + 1/2M)} ).

An upper bound on the optimal rate of convergence is O_p( n^{−2M(p₀ − p)/(2Mp₀ + 1)} ), which occurs when λ ≍ n^{−2M/(2Mp₀ + 1)}. This rate is achievable if k ≥ 3.5.
Proof: Theorems 6.8, 6.1 and 3.1 give the bound on the bias; Theorems 6.8, 6.2 and 3.2 give the stochastic bound. The Sobolev norm identification gives the bound on the convergence rate in Sobolev norm. Equating the upper bounds on the systematic and random errors gives the upper bound on the optimal rate of convergence, which occurs when λ ≍ n^{−2M/(2Mp₀+1)}; λ ≥ λ_n if k ≥ 3.5. ∎
5.2.2. Parabolic Case
It seems reasonable that the degree of compactness of K_θ should be determined by the elliptic component of the differential operator. Standard energy type estimates are consistent with this notion; see (42.19) of Treves [34] for example. We conjecture that convergence results obtained for the time invariant case carry over to the parabolic situation. By this we mean that the asymptotic behavior of the eigenvalues of W relative to K_θ*K_θ is the same in the time dependent case as in the time invariant case. To provide some numerical evidence for this we computed two sets of eigenvalues, {γ_v} and {γ̃_v}, where γ_v are estimates of the eigenvalues of

||D²φ||²_{L²} / ||K_θφ||²_{L²}    (5.16)

and γ̃_v are estimates of the eigenvalues of

||D²φ||²_{L²} / ||Kφ||²_{L²}    (5.17)

with Kφ[x] = ∫₀^x φ(t) dt. The estimates are obtained by a Rayleigh-Ritz method (see page 79 of Weinberger [35]), the approximating subspace being the span of 30 cubic B-splines with equi-spaced knots in [0,1]. Plots of log(γ_v) and log(γ̃_v) against log(v), called eigensequence plots, were introduced by Nychka, Wahba, Goldfarb and Pugh [22]. The eigensequence plots are given in Figures 5.1 and 5.2. Both plots are remarkably linear, suggesting that the eigenvalues of (5.16) and (5.17) have an algebraic rate of growth. Theoretically, using Theorem 4.4 for example, γ̃_v ≍ v⁶, and if the parabolic problem were like the time invariant one we would also have γ_v ≍ v⁶. This means that the slopes of the eigensequence plots would both be around 6. By regression we obtain slope estimates of 5.6 (±.36) for γ_v and 5.8 (±.25) for γ̃_v (standard errors in brackets). The theoretical value of 6 is quite consistent with both estimates, lending support to our conjecture.
6. Justification for the Linearization
Recall that the linearized estimators are defined as

θ̄_λ = θ₀ − [U(θ₀) + λW(θ₀)]⁻¹ Z_λ(θ₀)

and

θ̄_{nλ} = θ_λ − [U(θ_λ) + λW(θ_λ)]⁻¹ Z_{nλ}(θ_λ)    (6.1)

where θ₀ ∈ C and Z_λ(θ_λ) = 0. We will show that θ_λ and the regularization estimator θ_{nλ} itself both exist and that ||θ_λ − θ₀||_p ≍ ||θ̄_λ − θ₀||_p and ||θ_{nλ} − θ_λ||_p ≍ ||θ̄_{nλ} − θ_λ||_p. Our results will show that the linearized convergence theorems described in §3 accurately predict the first order asymptotic bias and variability of the constrained non-linear regularization estimator. An assumption will be made which says that θ₀ lies in the interior of the constraint set. The linearizations are justified by arguments very similar to those used in Cox and O'Sullivan [12]. Intuitively, the idea amounts to showing that Z_λ and Z_{nλ} are roughly linear and that the linearizations behave like Newton-Raphson linearizations for n sufficiently large and λ sufficiently small.
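The Newton-Raphson picture can be illustrated in miniature (a toy of my own construction: scalar θ, η(θ;x) = e^{θx}, J(θ) = θ², noise-free data): iterating F_λ(φ) = φ − G_λ⁻¹ Z_λ(θ₀ + φ), with the "information" G_λ frozen at θ₀, contracts to the root θ_λ of Z_λ.

```python
import numpy as np

# Scalar toy of the fixed-point argument behind (6.1):
# eta(theta; x) = exp(theta x), J(theta) = theta^2, noise-free data at theta0.
xg = np.linspace(0.0, 1.0, 50)
theta0 = 0.5
z = np.exp(theta0 * xg)                      # noise-free observations
lam = 0.05

def Z(th):                                   # gradient of the penalized criterion
    r = z - np.exp(th * xg)
    return -2.0 * np.mean(r * xg * np.exp(th * xg)) + 2.0 * lam * th

U = 2.0 * np.mean((xg * np.exp(theta0 * xg)) ** 2)   # Gauss-Newton information
G = U + 2.0 * lam                            # G_lam, frozen at theta0

phi = 0.0
for _ in range(60):                          # contraction iteration F_lam
    phi = phi - Z(theta0 + phi) / G

theta_lam = theta0 + phi                     # root of Z_lam near theta0
```

The penalty pulls the root slightly below θ₀ = 0.5, which is exactly the deterministic bias that d(λ,p) measures below.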
6.1. Some General Results
Let S*(R) be the ball of radius R about the origin in Θ_{p*}, and let S*₀(R) = {θ₀} ⊕ S*(R) denote the corresponding ball about θ₀. Similarly, S(R,λ) is the ball of radius R about the origin in Θ_p and S₀(R,λ) = {θ₀} ⊕ S(R,λ).

Assumption A.10. (θ₀ lies in the interior of C and N_{θ₀})
For some p* and R > 0, S*₀(R) ⊂ C and S*₀(R) ⊂ N_{θ₀}.

To describe the theorems we must introduce several quantities: for given p*, p in R, let

κ₂(λ) = sup_{u ∈ S*(1)} ||G_λ⁻¹(θ₀)[G_λ(θ₀) − DZ_λ(θ₀)] u||_p    (6.2)
κ₂*(λ) = sup_{u ∈ S*(1)} ||G_λ⁻¹(θ₀)[G_λ(θ₀) − DZ_λ(θ₀)] u||_{p*}

κ₃(λ,R) = sup_{u,v ∈ S*(R), w ∈ S*(1)} ||G_λ⁻¹(θ₀) D²Z_λ(θ₀ + u) v w||_p    (6.3)
κ₃*(λ,R) = sup_{u,v ∈ S*(R), w ∈ S*(1)} ||G_λ⁻¹(θ₀) D²Z_λ(θ₀ + u) v w||_{p*}

κ₂(n,λ) = sup_{u ∈ S(1,λ)} ||G_λ⁻¹(θ_λ)[G_λ(θ_λ) − DZ_{nλ}(θ_λ)] u||_p    (6.4)
κ₂*(n,λ) = sup_{u ∈ S(1,λ)} ||G_λ⁻¹(θ_λ)[G_λ(θ_λ) − DZ_{nλ}(θ_λ)] u||_{p*}

κ₃(n,λ,R) = sup_{u,v ∈ S(R,λ), w ∈ S(1,λ)} ||G_λ⁻¹(θ_λ) D²Z_{nλ}(θ_λ + u) v w||_p    (6.5)
κ₃*(n,λ,R) = sup_{u,v ∈ S(R,λ), w ∈ S(1,λ)} ||G_λ⁻¹(θ_λ) D²Z_{nλ}(θ_λ + u) v w||_{p*}    (6.6)

d(λ,p) and d_n(λ,p) are such that

d(λ,p) ≍ ||θ̄_λ − θ₀||_p = ||G_λ⁻¹(θ₀) λW(θ₀) θ₀||_p
O_p{ d_n(λ,p) } = ||θ̄_{nλ} − θ_λ||_p = ||G_λ⁻¹(θ_λ) Z_{nλ}(θ_λ)||_p.    (6.7)

Finally let r(λ) and r*(λ) be such that

κ₂(λ) + κ₃(λ, x(λ)) ≍ r(λ)
κ₂*(λ) + κ₃*(λ, x(λ)) ≍ r*(λ)    (6.8)

whenever x(λ) ≍ d(λ,p*), uniformly in λ as λ → 0. Similarly, r_n(λ) and r_n*(λ) are such that

O_p{ κ₂(n,λ) + κ₃(n,λ, x_n(λ)) } = r_n(λ)
O_p{ κ₂*(n,λ) + κ₃*(n,λ, x_n(λ)) } = r_n*(λ)    (6.9)

whenever x_n(λ) ≍ d_n(λ,p*), uniformly in λ and n as n → ∞ and for λ ∈ [λ_n, λ₀], where λ₀ < ∞ and λ_n is some given sequence tending to zero in a manner to be prescribed later on.
Theorem 6.1. (Existence of θ_λ and the Bias Approximation)
Suppose (A.10) holds. If r*(λ) → 0 as λ → 0, then we can find λ₀ > 0 and constants K₀ and K_p for p ∈ R such that
(i) for λ ∈ [0,λ₀] there is a unique θ_λ ∈ S*₀(K₀ d(λ,p*)) satisfying Z_λ(θ_λ) = 0 and θ_λ ∈ C ∩ N_{θ₀};
(ii) ||θ̄_λ − θ_λ||_p ≤ K_p r(λ) d(λ,p) for any p ∈ R.
Proof: The argument is very similar to Theorem 4.1 of [12]; we give an outline. The idea is to show that for some K₀ and λ₀ > 0 the mapping

F_λ(φ) = φ − G_λ⁻¹(θ₀) Z_λ(θ₀ + φ)    (6.10)

is a contraction on the ball S*(K₀ d(λ,p*)) for each λ ∈ [0,λ₀]. Choose λ₀ and K₀ so that ||θ̄_λ − θ₀||_{p*} ≤ K₀ d(λ,p*)/2 and S*₀(K₀ d(λ,p*)) ⊂ C ∩ N_{θ₀} for all λ ∈ [0,λ₀]. Then

||F_λ(φ)||_{p*} = ||F_λ(φ) − (θ̄_λ − θ₀) + (θ̄_λ − θ₀)||_{p*} ≤ ||φ − G_λ⁻¹(θ₀) Z_λ(θ₀ + φ) − (θ̄_λ − θ₀)||_{p*} + ||θ̄_λ − θ₀||_{p*}.    (6.11)

Thus

||F_λ(φ)||_{p*} ≤ ||G_λ⁻¹(θ₀)[Z_λ(θ₀ + φ) − Z_λ(θ₀) − G_λ(θ₀)φ]||_{p*} + ||θ̄_λ − θ₀||_{p*}
≤ ||G_λ⁻¹(θ₀)[DZ_λ(θ₀) − G_λ(θ₀)]φ||_{p*} + sup_{φ* ∈ S*(K₀d(λ,p*))} ||G_λ⁻¹(θ₀) D²Z_λ(θ₀ + φ*)φφ||_{p*} + K₀ d(λ,p*)/2
≤ [ {κ₂*(λ) + κ₃*(λ, K₀ d(λ,p*))} + 1/2 ] K₀ d(λ,p*)
≤ [ r*(λ) + 1/2 ] K₀ d(λ,p*) ≤ (3/4) K₀ d(λ,p*)    (6.12)

for λ sufficiently small. By a similar analysis, for p ∈ R,

||F_λ(φ₁) − F_λ(φ₂)||_p ≤ { κ₂(λ) + κ₃(λ, K₀ d(λ,p*)) } ||φ₁ − φ₂||_p ≤ K_p r(λ) ||φ₁ − φ₂||_p.    (6.13)

Since r*(λ) → 0, λ₀ and K₀ can be found so that F_λ is a contraction on S*(K₀ d(λ,p*)) for λ ∈ [0,λ₀]. Applying a fixed point theorem (Theorem 9.23 of Rudin [29]), for 0 ≤ λ ≤ λ₀ there is a unique φ_λ ∈ S*(K₀ d(λ,p*)) for which F_λ(φ_λ) = φ_λ. Letting θ_λ = θ₀ + φ_λ, θ_λ satisfies the constraints and is the only root of Z_λ in S*₀(K₀ d(λ,p*)). This proves part (i).
For p ∈ R,

||θ̄_λ − θ_λ||_p = ||G_λ⁻¹(θ₀) G_λ(θ₀)[θ̄_λ − θ_λ]||_p
= ||G_λ⁻¹(θ₀)[G_λ(θ₀)(θ̄_λ − θ₀) − G_λ(θ₀)(θ_λ − θ₀)]||_p
= ||G_λ⁻¹(θ₀) G_λ(θ₀)[F_λ(0) − F_λ(φ_λ)]||_p
= ||F_λ(φ_λ) − F_λ(0)||_p ≤ K_p r(λ) d(λ,p)    (6.14)

from (6.13). ∎
Theorem 6.2. (Existence of θ_{nλ} and the Variability Approximation)
Again suppose (A.10) is true. Let N_{θ₀} be such that (A.3) holds and suppose there is λ₀ > 0 so that θ_λ ∈ N_{θ₀} for λ ∈ [0,λ₀]. Let λ_n be such that r_n*(λ) → 0 as n → ∞, uniformly for λ ∈ [λ_n,λ₀]. Consider the event E_n(λ) defined by:
"There is a unique root θ_{nλ} of Z_{nλ} in S₀(K₀ d_n(λ,p*),λ) ∩ C. For p ∈ R, θ_{nλ} satisfies ||θ̄_{nλ} − θ_{nλ}||_p ≤ K_p r_n(λ) d_n(λ,p)."
For each δ > 0 and p ∈ R there are constants n₀, K_p and K₀ such that for all n ≥ n₀ and λ ∈ [λ_n,λ₀],

P( E_n(λ) ) ≥ 1 − δ.

Proof: Replace the linearization used in Theorem 4.2 of [12] by DZ_{nλ}(θ_λ). Because the form of G_λ(θ) plays no role, the same proof works. (A.10) guarantees that eventually θ_{nλ} ∈ C with arbitrarily high probability. ∎
6.2. Application to Constrained Non-Linear Regularization
We need to get a handle on the degree of non-linearity in the data functionals. In (A.1) we made the assumption that the data functionals were three times Fréchet differentiable. With D denoting differentiation with respect to θ, let

K̇_θ uφ[x] = D²η(θ;x) uφ[x]    (6.15)

and

K̈_θ uvφ[x] = D³η(θ;x) uvφ[x].    (6.16)

We will make use of a further assumption.

Assumption A.11. (Non-linearity of the Data Functionals)
For some t ∈ (d,M] there are positive constants α₁, β₁, α₂, β₂, γ₁, γ₂ and δ such that for all ρ ∈ {0,1} and θ ∈ N_{θ₀}:
(i) if Kφ ∈ W₂^{γ₁ρ}(Ω) and Ku ∈ W₂^{α₁+β₁ρ}(Ω), then

||K̇_θ uφ||²_{W₂^{tρ}} ≲ ||Kφ||²_{W₂^{γ₁ρ}} ||Ku||²_{W₂^{α₁+β₁ρ}};

(ii) if Kφ ∈ W₂^{δ+γ₂ρ}(Ω), Ku ∈ W₂^{α₂+β₂ρ}(Ω) and Kv ∈ W₂^{α₂+β₂ρ}(Ω), then

||K̈_θ uvφ||²_{W₂^{tρ}} ≲ ||Kφ||²_{W₂^{δ+γ₂ρ}} ||Ku||²_{W₂^{α₂+β₂ρ}} ||Kv||²_{W₂^{α₂+β₂ρ}}.
From (A.11) we have a technical but useful lemma.
Lemma 6.3. Suppose J is quadratic and that (A.1), (A.3), (A.4), (A.5) and (A.11) hold.
(i) If Kv, Kw ∈ W₂^{α₂}(Ω) ∩ W₂^{α₁}(Ω), Kφ ∈ W₂^{δ}(Ω) and Ku ∈ W₂^{ε+d/2}(Ω) with ε > 0 and θ₀ + u ∈ N_{θ₀}, then

|D²Z_λ(θ₀ + u) v w φ|² ≲ ||Kφ||²_{L²} ||Kv||²_{W₂^{α₁}} ||Kw||²_{W₂^{α₁}}
+ ||Kφ||²_{W₂^δ} ||Ku||²_{W₂^{ε+d/2}} ||Kv||²_{W₂^{α₂}} ||Kw||²_{W₂^{α₂}}.    (6.17)

(ii) If Kv, Kw ∈ W₂^{α₂+β₂}(Ω) ∩ W₂^{α₁+β₁}(Ω), Kφ ∈ W₂^{δ+γ₂}(Ω) ∩ W₂^{γ₁}(Ω) and Ku ∈ W₂^{ε+3d/2}(Ω) with ε > 0 and θ_λ + u ∈ N_{θ₀}, then

|D²Z_{nλ}(θ_λ + u) v w φ|² ≲ ||Kφ||²_{L²} ||Kv||²_{W₂^{α₁}} ||Kw||²_{W₂^{α₁}}
+ ||Kφ||²_{W₂^δ} ||Ku||²_{W₂^{ε+d/2}} ||Kv||²_{W₂^{α₂}} ||Kw||²_{W₂^{α₂}}
+ k_n² ||Kφ||²_{W₂^{γ₁}} ||Kv||²_{W₂^{α₁+β₁}} ||Kw||²_{W₂^{α₁+β₁}}
+ k_n² ||Kφ||²_{W₂^{δ+γ₂}} ||Ku||²_{W₂^{ε+3d/2}} ||Kv||²_{W₂^{α₂+β₂}} ||Kw||²_{W₂^{α₂+β₂}}
+ sup_{x∈Ω} |S_n(x)|² ||Kφ||²_{W₂^{δ+γ₂}} ||Kv||²_{W₂^{α₂+β₂}} ||Kw||²_{W₂^{α₂+β₂}}.    (6.18)

(iii) If θ_λ ∈ N_{θ₀}, then for Ku ∈ W₂^{d/2+ε}(Ω) ∩ W₂^{α₁+β₁}(Ω) and Kφ ∈ W₂^{d/2+ε}(Ω) ∩ W₂^{γ₁}(Ω) with ε > 0,

|[G_λ(θ_λ) − DZ_{nλ}(θ_λ)] u φ|² ≲ k_n² ||Kφ||²_{W₂^{d/2+ε}} ||Ku||²_{W₂^{d/2+ε}}
+ sup_{x∈Ω} |S_n(x)|² ||Kφ||²_{W₂^{γ₁}} ||Ku||²_{W₂^{α₁+β₁}}
+ ||K(θ_λ − θ₀)||²_{W₂^{d/2+ε}} ||Kφ||²_{L²} ||Ku||²_{W₂^{α₁}}
+ k_n² ||K(θ_λ − θ₀)||²_{W₂^{3d/2+ε}} ||Kφ||²_{W₂^{γ₁}} ||Ku||²_{W₂^{α₁+β₁}}.    (6.19)
Proof: The proof is straightforward but somewhat tedious; we use integration by parts together with the Cauchy-Schwarz, Hölder and Sobolev inequalities. For part (i) let θ = θ₀ + u. Then

|D²Z_λ(θ) v w φ|² ≲ | ∫ [η(θ₀; x) − η(θ; x)] K̈_θ vwφ[x] dF(x) |²
+ | ∫ K̇_θ vw[x] K_θ φ[x] dF(x) |²
+ | ∫ K_θ v[x] K̇_θ wφ[x] dF(x) |²
+ | ∫ K_θ w[x] K̇_θ vφ[x] dF(x) |².    (6.20)

Applying the Cauchy-Schwarz inequality, the fact that F has a density which is bounded away from zero and infinity, and A.11(i), the last three terms can each be bounded by a constant times

||Kφ||²_{L²} ||Kv||²_{W₂^{α₁}} ||Kw||²_{W₂^{α₁}}.    (6.21)

For the first term we have

| ∫ [η(θ₀; x) − η(θ; x)] K̈_θ vwφ[x] dF(x) |² ≲ ||η(θ₀; ·) − η(θ; ·)||²_{L²} ||K̈_θ vwφ||²_{L²}.    (6.22)

Expanding η(θ₀; x) − η(θ₀ + u; x) and proceeding as in Theorem 3.2 we get

||η(θ₀; ·) − η(θ; ·)||_{L²} ≲ sup_{θ ∈ N_{θ₀}} ||K_θ u||_{L²} ≲ ||Ku||_{W₂^{d/2+ε}}  by (A.5),

where ε > 0 is arbitrarily small. But from (A.11)(ii),

||K̈_θ vwφ||²_{L²} ≲ ||Kφ||²_{W₂^δ} ||Kv||²_{W₂^{α₂}} ||Kw||²_{W₂^{α₂}}.    (6.23)

This proves (i).
For part (ii)
ID2z4 (Oe + u)vw4012. ID2Z(Ox + u)vw12 (6.24)
+ ID2[ZnX(OX+u)_ZX(OX+u)]vw412
The first quantity on the left had side is analyzed like in part (i) which gives the first two terms
of the desired upper bound in (6.18). Let 0 = Ox + u
D2[Z (8) - Z(e)]vw42 < - £eiK0v[xi] 12 (6.25)ni=1
+ ( I Ji(eo ; x) -r1(0 ; x)]K0vw 4[x ]d (Fn -F)(x)I2
+ I Kvw [X ]K0O[x ]d (Fn-F)(x) 12
+ I KeV4[X]Kew [x ]d (Fn-F)(x) I2
+ IJKow4[x]Kev[x]d(Fn-F)(x)I2)
The case p = 1 in (A.11), integration by parts and an analysis identical to that used in part (i)
lead to the k_n^2 terms in the upper bound in (6.18). Finally,

n^{-1}\sum_{i=1}^n \epsilon_i K_\theta vw\phi[x_i] = \int K_\theta vw\phi[x]\,dS_n(x)   (6.26)
and applying integration by parts once again,

|n^{-1}\sum_{i=1}^n \epsilon_i K_\theta vw\phi[x_i]|^2 \le \sup_x |S_n(x)|^2 \|K_\theta vw\phi\|_{W_2^t}^2 .   (6.27)

Recall t > d, so from (A.11)(ii) this term may be bounded in the manner we wish. This proves
part (ii).
For part (iii),

|[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\phi|^2 \le |\int K_{\theta_\lambda} u[x]\, K_{\theta_\lambda}\phi[x]\,d(F_n - F)(x)|^2 + |\int K_{\theta_\lambda} u\phi[x]\,dS_n(x)|^2   (6.29)
+ |\int [\eta(\theta_0; x) - \eta(\theta_\lambda; x)] K_{\theta_\lambda} u\phi[x]\,dF(x)|^2 + |\int [\eta(\theta_0; x) - \eta(\theta_\lambda; x)] K_{\theta_\lambda} u\phi[x]\,d(F_n - F)(x)|^2 .

None of the terms presents new difficulties. Using integration by parts and (A.11), the desired
bound follows quite easily. ∎
Making use of this lemma we can describe the behavior of the quantities r(\lambda) and r_n(\lambda) in
(6.8) and (6.9).

Theorem 6.4. (Behavior of r^*(\lambda) and r(\lambda))

Suppose \theta_0 \in \Theta_{p_0}, J is quadratic and assumptions (A.3), (A.4), (A.5) and (A.11) hold. For
\epsilon > 0 arbitrarily small, let

    a = \max\{\epsilon + d/2M,\ \alpha_1/M,\ \alpha_2/M\}   and
    b = \min\{(p_0 - d/2M)/2,\ (2p_0 - d/2M - \delta/M)/3,\ p_0,\ 2 - d/2M - \delta/M\}.

If p^* \in [a, b) then r^*(\lambda) \to 0 as \lambda \to 0 and r(\lambda) \le \lambda^{(p^* - p)/2} for
-d/2M \le p \le 2 - d/2M - \delta/M.
Proof: Since J is quadratic, K_2(\lambda) = 0 and

K_3(\lambda, R) = \sup_{u, v \in S^*(R),\ w \in S^*(1)} \|G_\lambda^{-1}(\theta_0) D^2 Z_\lambda(\theta_0 + u)vw\|_p .   (6.30)

Using the spectral expansion for \|\cdot\|_p,

\|G_\lambda^{-1}(\theta_0) D^2 Z_\lambda(\theta_0 + u)vw\|_p^2 = \sum_\nu [1 + \gamma_\nu]^p [1 + \lambda\gamma_\nu]^{-2} [D^2 Z_\lambda(\theta_0 + u)vw\phi_\nu]^2 .   (6.31)

Lemma 6.3(i) gives

[D^2 Z_\lambda(\theta_0 + u)vw\phi_\nu]^2 \le \|K\phi_\nu\|_{L_2}^2 \|Kv\|_{\alpha_1}^2 \|Kw\|_{\alpha_2}^2   (6.32)
+ \|K\phi_\nu\|_{\delta}^2 \|Ku\|_{d/2+\epsilon}^2 \|Kv\|_{\alpha_1}^2 \|Kw\|_{\alpha_2}^2

where \epsilon > 0 is arbitrarily small. From (A.4), \|K\theta\|_{W_2^q} \le \|\theta\|_{q/M}, so
\|Kv\|_{\alpha_1}^2 \le R^2 for v \in S^*(R). We have similar bounds for u \in S^*(R) and w \in S^*(1).
This gives

[D^2 Z_\lambda(\theta_0 + u)vw\phi_\nu]^2 \le \|\phi_\nu\|_0^2 R^2 \cdot 1^2 + \|\phi_\nu\|_{\delta/M}^2 R^2 R^2 \cdot 1^2 .   (6.33)

But \|\phi_\nu\|_0 = 1 and \|\phi_\nu\|_{\delta/M}^2 = [1 + \gamma_\nu]^{\delta/M}, so substituting and using the formula (3.15) for
c_r(\lambda, p) (p + \delta/M < 2 - d/2M) we have

K_3^2(\lambda, R) \le R^2 \lambda^{-(p + d/2M)} + R^4 \lambda^{-(p + d/2M + \delta/M)} .   (6.34)

From Theorem 3.1 and the constraint on p^*, x(\lambda) \le d_\lambda(\lambda, p^*) \le \lambda^{(p_0 - p^*)/2}, so

r^2(\lambda) = K_3^2(\lambda, x(\lambda)) \le \lambda^{(p^* - p)} \{ \lambda^{(p_0 - d/2M - 2p^*)} + \lambda^{(2p_0 - \delta/M - d/2M - 3p^*)} \} .   (6.35)

The upper bounds on p^* imply that \lambda^{(p_0 - d/2M - 2p^*)} and \lambda^{(2p_0 - \delta/M - d/2M - 3p^*)} both tend to
zero, proving the result. ∎
More stringent conditions are used to obtain results for the behavior of r_n^*(\lambda) and r_n(\lambda). We
have the following theorem.
Theorem 6.5. (Behavior of r_n^*(\lambda) and r_n(\lambda))

Again suppose (A.3), (A.4), (A.5) and (A.11) hold and that \theta_0 \in \Theta_{p_0} where p_0 > 2d/M. For
\epsilon > 0 arbitrarily small, let

    a = \max\{\epsilon + 3d/2M,\ \alpha_1/M,\ \alpha_2/M,\ (\alpha_1 + \beta_1)/M,\ (\alpha_2 + \beta_2)/M\}   and
    b = \min\{(p_0 - d/2M)/2,\ (p_0 - \delta/M),\ p_0,\ 2 - 3d/2M,\ 2 - d/2M - \gamma_1/M,\ 2 - d/2M - (\delta + \gamma_2)/M\}.

If \gamma_1, \delta + \gamma_2 \le 2d and, for some p^* \in [a, b) with 2d/M \le p^* \le s/M, \lambda_n is a sequence such
that n^{-1}\lambda_n^{-2(p^* + d/2M)} \to 0, then for any sequence of \lambda's tending to zero with \lambda \ge \lambda_n,
r_n^*(\lambda) \to 0 as \lambda \to 0 and r_n(\lambda) \le \lambda^{(p^* - p)/2} for -d/2M \le p \le 2 - 5d/2M.
Proof: Choose \lambda_0 sufficiently small so that, via Theorem 6.4 and Theorem 6.1, \hat\theta_\lambda exists;
using Theorem 3.2 and Markov's inequality we have \|\hat\theta_\lambda - \theta_\lambda\|_p^2 = O_p(n^{-1}\lambda^{-(p + d/2M)}) for
-d/2M \le p \le 2 - 3d/2M. Now
K_3(n, \lambda, R) = \sup_{u, v \in S_n^*(R, \lambda),\ w \in S_n^*(1, \lambda)} \|G_\lambda^{-1}(\theta_\lambda) D^2 Z_{n\lambda}(\theta_\lambda + u)vw\|_p .   (6.36)

Expanding the norm,

\|G_\lambda^{-1}(\theta_\lambda) D^2 Z_{n\lambda}(\theta_\lambda + u)vw\|_p^2 = \sum_\nu [1 + \gamma_{\lambda\nu}]^p [1 + \lambda\gamma_{\lambda\nu}]^{-2} [D^2 Z_{n\lambda}(\theta_\lambda + u)vw\phi_{\lambda\nu}]^2 .   (6.37)

Now we apply Lemma 6.3(ii). For \lambda \le \lambda_0, \theta_\lambda \in N_{\theta_0}, so (A.3), (A.4) and (A.5) give
\|K\phi_{\lambda\nu}\|_{W_2^t} \le \|\phi_{\lambda\nu}\|_{\lambda, t/M} for 0 \le t \le s. Using (A.4) and the lower
bound on p^*, \|Ku\|_{\alpha_1 + \beta_1} \le \|u\|_{(\alpha_1 + \beta_1)/M} \le \|u\|_{\lambda, p^*} \le R, etc. Thus

[D^2 Z_{n\lambda}(\theta_\lambda + u)vw\phi_{\lambda\nu}]^2 \le \|\phi_{\lambda\nu}\|_0^2 R^2 \cdot 1^2 + \|\phi_{\lambda\nu}\|_{\delta/M}^2 R^4 \cdot 1^2   (6.38)
+ k_n^2 \|\phi_{\lambda\nu}\|_{\gamma_1/M}^2 R^2 \cdot 1^2 + k_n^2 \|\phi_{\lambda\nu}\|_{(\delta + \gamma_2)/M}^2 R^4 \cdot 1^2   (6.39)
+ S_n^2 \|\phi_{\lambda\nu}\|_{(\delta + \gamma_2)/M}^2 R^2 \cdot 1^2 .

But \|\phi_{\lambda\nu}\|_p^2 = [1 + \gamma_{\lambda\nu}]^p and \gamma_{\lambda\nu} \asymp \nu^{2M/d}, so from (3.15), using the upper bound on p^*,

K_3^2(n, \lambda, R) \le R^2 \lambda^{-(p + d/2M)} + R^4 \lambda^{-(p + d/2M + \delta/M)} + k_n^2 \lambda^{-(p + d/2M + \gamma_1/M)} R^2 + k_n^2 \lambda^{-(p + d/2M + (\delta + \gamma_2)/M)} R^4
+ S_n^2 \lambda^{-(p + d/2M + (\delta + \gamma_2)/M)} R^2 .
By a similar analysis,

K_2(n, \lambda) = \sup_{u \in S_n^*(1, \lambda)} \|G_\lambda^{-1}(\theta_\lambda)[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\|_p   (6.40)

where

\|G_\lambda^{-1}(\theta_\lambda)[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\|_p^2 = \sum_\nu [1 + \gamma_{\lambda\nu}]^p [1 + \lambda\gamma_{\lambda\nu}]^{-2} [[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\phi_{\lambda\nu}]^2 .

Using Lemma 6.3(iii), the constraints on p^* and applying (A.3), (A.4) and (A.5) as above, we
have for u \in S_n^*(1, \lambda)

[[G_\lambda(\theta_\lambda) - DZ_{n\lambda}(\theta_\lambda)]u\phi_{\lambda\nu}]^2 \le k_n^2 \|\phi_{\lambda\nu}\|_{(d+\epsilon)/M}^2 \cdot 1^2 + S_n^2 \|\phi_{\lambda\nu}\|_{\gamma_1/M}^2 \cdot 1^2   (6.41)
+ \|\theta_\lambda - \theta_0\|_{d/2M + \epsilon}^2 \|\phi_{\lambda\nu}\|_0^2 \cdot 1^2 + k_n^2 \|\theta_\lambda - \theta_0\|_{3d/2M + \epsilon}^2 \|\phi_{\lambda\nu}\|_{\gamma_1/M}^2 \cdot 1^2 .
Using (3.15) and recalling that r = d/2M,

K_2^2(n, \lambda) \le k_n^2 \lambda^{-(p + 3d/2M + \epsilon)} + S_n^2 \lambda^{-(p + d/2M + \gamma_1/M)}   (6.42)
+ \|\theta_\lambda - \theta_0\|_{d/2M + \epsilon}^2 \lambda^{-(p + d/2M)} + k_n^2 \|\theta_\lambda - \theta_0\|_{3d/2M + \epsilon}^2 \lambda^{-(p + d/2M + \gamma_1/M)}

where \epsilon > 0 is arbitrarily small. Since p_0 > 3d/2M, \|\theta_\lambda - \theta_0\|_{d/2M + \epsilon}^2 \le \lambda^{p_0 - d/2M - \epsilon} and
\|\theta_\lambda - \theta_0\|_{3d/2M + \epsilon}^2 \le \lambda^{p_0 - 3d/2M - \epsilon} (again \epsilon > 0 is arbitrarily small). From this,

K_2^2(n, \lambda) \le k_n^2 \lambda^{-(p + 3d/2M + \epsilon)} + S_n^2 \lambda^{-(p + d/2M + \gamma_1/M)}   (6.43)
+ \lambda^{p_0 - d/2M - \epsilon} \lambda^{-(p + d/2M)} + k_n^2 \lambda^{p_0 - 3d/2M - \epsilon} \lambda^{-(p + d/2M + \gamma_1/M)} .
From (A.1), k_n^2 = O_p(n^{-1}) and S_n^2 = O_p(n^{-1}), and from the remarks at the beginning of the proof,
x_n^2(\lambda) = O_p(n^{-1}\lambda^{-(p^* + d/2M)}). Hence

r_n^2(\lambda) = O_p( K_2^2(n, \lambda) + K_3^2(n, \lambda, x_n(\lambda)) )   (6.44)
\le n^{-1}\lambda^{-(p + 3d/2M + \epsilon)} + n^{-1}\lambda^{-(p + d/2M + \gamma_1/M)}
+ \lambda^{p_0 - d/2M - \epsilon}\lambda^{-(p + d/2M)} + n^{-1}\lambda^{p_0 - 3d/2M - \epsilon}\lambda^{-(p + d/2M + \gamma_1/M)}
+ n^{-1}\lambda^{-(p^* + d/2M)}\lambda^{-(p + d/2M)} + n^{-2}\lambda^{-2(p^* + d/2M)}\lambda^{-(p + d/2M + \delta/M)}
+ n^{-2}\lambda^{-(p^* + d/2M)}\lambda^{-(p + d/2M + \gamma_1/M)} + n^{-3}\lambda^{-2(p^* + d/2M)}\lambda^{-(p + d/2M + (\delta + \gamma_2)/M)}
+ n^{-2}\lambda^{-(p^* + d/2M)}\lambda^{-(p + d/2M + (\delta + \gamma_2)/M)} .
Rewriting this expression we get

r_n^2(\lambda) \le \lambda^{(p^* - p)} \{ q_n(\lambda) \}   (6.45)

where

q_n(\lambda) = n^{-1}\lambda^{-(p^* + 3d/2M + \epsilon)} + n^{-1}\lambda^{-(p^* + d/2M + \gamma_1/M)}   (6.46)
+ \lambda^{p_0 - p^* - d/M - \epsilon} + n^{-1}\lambda^{p_0 - 3d/2M - \epsilon}\lambda^{-(p^* + d/2M + \gamma_1/M)}
+ n^{-1}\lambda^{-2(p^* + d/2M)} + n^{-2}\lambda^{-2(p^* + d/2M)}\lambda^{-(p^* + d/2M + \delta/M)}
+ n^{-2}\lambda^{-\gamma_1/M}\lambda^{-2(p^* + d/2M)} + n^{-3}\lambda^{-(\delta + \gamma_2)/M}\lambda^{-3(p^* + d/2M)}
+ n^{-2}\lambda^{-(\delta + \gamma_2)/M}\lambda^{-2(p^* + d/2M)} .
At this point we use that \gamma_1 and \delta + \gamma_2 are bounded above by 2d and p^* \ge 2d/M. After some
algebra we find that, up to terms which tend to zero,

q_n(\lambda) \le n^{-1}\lambda^{-2(p^* + d/2M)} .   (6.47)

So by definition of \lambda_n, q_n(\lambda) \to 0 for \lambda \in [\lambda_n, \lambda_0]. This proves the result. ∎
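One representative step of the "some algebra" can be made explicit (a reconstruction consistent with the stated hypotheses, not a quotation of the original): since \gamma_1 \le 2d and p^* \ge 2d/M,

```latex
\frac{\gamma_1}{M} \;\le\; \frac{2d}{M} \;\le\; p^{*} \;\le\; p^{*} + \frac{d}{2M}
\quad\Longrightarrow\quad
n^{-1}\lambda^{-\left(p^{*} + \frac{d}{2M} + \frac{\gamma_1}{M}\right)}
\;\le\;
n^{-1}\lambda^{-2\left(p^{*} + \frac{d}{2M}\right)}
\qquad (0 < \lambda \le 1),
```

and the terms involving \delta + \gamma_2 \le 2d are absorbed in the same way.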
6.3. Verification of Assumption A.11
We end by describing a situation under which the linearization may be justified for the
time invariant diffusion considered in §5. Before stating the theorem we give two lemmas.
Lemma 6.6. Let K\phi[x] = \int_0^x \phi(y)\,dy and K_h\phi[x] = \int_0^x h(y)\phi(y)\,dy. If h \in W_2^1[0,1] then there is a
positive constant c such that for K\phi \in W_2^{2p}[0,1], Ku \in W_2^2[0,1], Kv \in W_2^2[0,1] and
p \in \{0, 1\},

\|K_h \phi u\|_{W_2^{2p}} \le c \|h\|_{W_2^1} \|K\phi\|_{W_2^{2p}} \|Ku\|_{W_2^2}

and

\|K_h \phi uv\|_{W_2^{2p}} \le c \|h\|_{W_2^1} \|K\phi\|_{W_2^{2p}} \|Ku\|_{W_2^2} \|Kv\|_{W_2^2} .

Proof: For p = 0, let \tilde\phi(x) = K\phi[x]. Then

\|K_h \phi u\|_{W_2^{2p}}^2 = \|K_h \phi u\|_{L_2}^2 .   (6.48)

Integrating by parts as in Theorem 4.2,

\|K_h \phi u\|_{L_2}^2 \le c \|\tilde\phi\|_{L_2}^2 \|hu\|_{W_2^1}^2   (6.49)

where c is a positive constant. But \|hu\|_{W_2^1} \le \|h\|_{W_2^1} \|u\|_{W_2^1} \le \|h\|_{W_2^1} \|Ku\|_{W_2^2}. Using this,
the result for p = 1 is simple. The result for \|K_h \phi uv\|_{W_2^{2p}} is obtained in a very similar fashion.
∎
For the diffusion problem, recall from (5.13) that we have

K_\theta\phi[x] = \int_x^1 \phi(y) h_\theta(y)\,dy   (6.50)

where h_\theta(x) = \theta(x)^{-2}\{\int_0^x f(t)\,dt + b\}. From this,

K_\theta\phi u[x] = \int_x^1 \phi(y) u(y) h_\theta'(y)\,dy   (6.51)

where h_\theta'(x) = -2\theta(x)^{-3}\{\int_0^x f(t)\,dt + b\}, and

K_\theta\phi uv[x] = \int_x^1 \phi(y) u(y) v(y) h_\theta''(y)\,dy   (6.52)

where h_\theta''(x) = 6\theta(x)^{-4}\{\int_0^x f(t)\,dt + b\}. So following Lemma 5.2 we have:

Lemma 6.7. If f \in L_2[0,1] then for \theta \in B_1(R) with |\theta(x)| \ge \epsilon > 0, we have h_\theta, h_\theta', h_\theta'' \in B_1(R')
where R' > 0 depends on R, \epsilon and \|f\|_{L_2}.

Proof: Very straightforward; the argument is similar to that given in Lemma 5.2. ∎
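The boundedness asserted in Lemma 6.7 can be checked numerically. The sketch below is illustrative only: the grid, f, b, eps and theta are made-up choices, and the kernel formulas h_theta = theta^{-2}(int_0^x f + b), h_theta' = -2 theta^{-3}(.), h_theta'' = 6 theta^{-4}(.) are the reconstructed forms of (6.50)-(6.52).

```python
import numpy as np

def cumtrapz0(f, x):
    """Cumulative trapezoid approximation of F(x) = int_0^x f(t) dt."""
    return np.concatenate(([0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * np.diff(x))))

x = np.linspace(0.0, 1.0, 201)
f = 1.0 + 0.5 * np.sin(2.0 * np.pi * x)   # f(x) >= 1/2 > 0, as in Lemma 5.2
b, eps = 0.25, 0.2
theta = eps + 0.8 * x**2                   # theta(x) >= eps > 0

F = cumtrapz0(f, x)
h0 = (F + b) / theta**2                    # h_theta
h1 = -2.0 * (F + b) / theta**3             # kernel for K_theta phi u
h2 = 6.0 * (F + b) / theta**4              # kernel for K_theta phi u v

# Since f > 0, sup_x (F(x) + b) = F(1) + b, so each kernel is uniformly
# bounded by a constant depending only on eps, b and the integral of f.
top = F[-1] + b
assert np.max(np.abs(h0)) <= top / eps**2 + 1e-9
assert np.max(np.abs(h1)) <= 2.0 * top / eps**3 + 1e-9
assert np.max(np.abs(h2)) <= 6.0 * top / eps**4 + 1e-9
```

The constant R' of the lemma is visible here as a power of 1/eps times the total mass of f plus b.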
Combining these results we have the main theorem.

Theorem 6.8. Suppose \theta_0 \in B_k(R) for k > 3.25 and \theta_0(x) \ge \epsilon > 0. Let
N_{\theta_0} = \{\theta \in B_2(R) : \theta(x) \ge \epsilon\}, and for m \ge k let C = \{\theta \in W_2^m[0,1] : \theta \ge 0\} and
J(\theta) = \|D^m \theta\|_{L_2}^2. Suppose f \in W_2^1[0,1] and, as in Lemma 5.2, f(x) \ge \delta > 0 and b > 0. Then
(A.3), (A.4), (A.5), (A.10) and (A.11) all hold. Moreover, if p^* = 2/M then

(i) r^*(\lambda) \to 0 and r(\lambda) \le \lambda^{(p^* - p)/2} for -1/2M \le p \le 2 - 3/2M;

(ii) if \lambda_n is a sequence tending to zero such that n^{-1}\lambda_n^{-5/M} \to 0, then for any sequence of \lambda's
tending to zero with \lambda \ge \lambda_n, r_n^*(\lambda) \to 0 and r_n(\lambda) \le \lambda^{(p^* - p)/2} for -1/2M \le p \le 2 - 5/2M.

Proof: Let K\phi[x] = \int_0^x \phi(y)\,dy and \langle\phi, \psi\rangle = \int_0^1 \phi^{(m)}(y)\psi^{(m)}(y)\,dy. (A.3), (A.4) and (A.5) come
from Theorem 5.3 (N_{\theta_0} = N_\theta(1, R)). M = m + 1 for (A.4). From Theorem 5.3, s = 3 in (A.5).
p^* \ge 2/M for (A.10). Using Lemmas 6.6 and 6.7, (A.11) follows with \delta = 1, \alpha_1 = \alpha_2 = 2 and
\beta_1 = \beta_2 = \gamma_1 = \gamma_2 = 0.

Parts (i) and (ii) follow from Theorems 6.4 and 6.5. ∎
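As a quick arithmetic check of the exponent in part (ii) (using d = 1 and the theorem's p^* = 2/M):

```latex
2\Bigl(p^{*} + \frac{d}{2M}\Bigr)
= 2\Bigl(\frac{2}{M} + \frac{1}{2M}\Bigr)
= \frac{5}{M},
```

so the condition n^{-1}\lambda_n^{-2(p^* + d/2M)} \to 0 of Theorem 6.5 becomes n^{-1}\lambda_n^{-5/M} \to 0, as stated.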
References
1. Agmon, S., Elliptic Boundary Value Problems, Van Nostrand, 1965.
2. Backus, G. and Gilbert, F., "Uniqueness in the inversion of inaccurate gross earth data,"
Philos. Trans. Royal Soc. Ser. A., vol. 266, pp. 123-192, 1970.
3. Box, G.E.P. and Jenkins, G.M., Time Series Analysis: Forecasting and Control,
Holden-Day, Inc., San Francisco, 1976.
4. Butzer, P. L. and Berens, H., Semi-Groups of Operators and Approximation, Springer-
Verlag, Heidelberg, 1967.
5. Butzer, P.L. and Westphal, U., "An access to fractional differentiation via fractional
difference quotients," in Fractional Calculus and Its Applications, ed. B. Ross,
Akademie-Verlag, Berlin, 1974a.
6. Cannon, J.R. and Ewing, R.E., "Quasi-linear parabolic systems with non-linear boundary
conditions," in Inverse and Improperly Posed Problems in Differential Equations, ed. G.
Anger, pp. 36-43, Akademie-Verlag, Berlin, 1979.
7. Cannon, J. R. and DuChateau, P., "Approximating the solution to the Cauchy problem
for Laplace's equation," SIAM J. Numer. Anal., vol. 14, pp. 473-483, 1980.
8. Cannon, J. R. and DuChateau, P., "An Inverse Problem for a Nonlinear Diffusion Equa-
tion," SIAM J. Appl. Math., vol. 39, pp. 272-289, 1980.
9. Collatz, L., Differential Equations: An Introduction with Applications, John Wiley &
Sons, Chichester, 1986.
11. Colton, D., "The inverse scattering problem for time-harmonic acoustic waves," SIAM
Review, vol. 26, no. 3, pp. 323-350, 1984.
11. Colton, D. and Monk, P., "The numerical solution of the three-dimensional inverse
scattering problem for time harmonic acoustic waves," SIAM J. Sci. Stat. Comput., vol.
8, pp. 278-291, 1987.
12. Cox, D. D. and O'Sullivan, F., "Analysis of penalized likelihood type estimators with
application to generalized smoothing in Sobolev Spaces," Tech. Rep. No. 51, Statistics
Dept., University of California-Berkeley (submitted to the Annals of Statistics), 1985.
13. Cox, D. D., "Approximation of the method of regularization estimators," Ann. Statist.
(to appear), 1987.
14. Deuflhard, P. and Hairer, E., Numerical Treatment of Inverse Problems in Differential
and Integral Equations, Birkhauser, Boston, 1983.
15. Devaney, A.J., "Reconstructive tomography with diffracting wavefields," Inverse Prob-
lems, vol. 2, pp. 161-183, 1986.
16. Hald, O.H., "Inverse eigenvalue problems for the mantle," Geophys. J. R. Astr. Soc.,
vol. 62, pp. 41-48, 1980.
17. Joshi, M.C. and Bose, R.K., Some Topics in Nonlinear Functional Analysis, J. Wiley &
Sons, New York, 1985.
18. Kravaris, C. and Seinfeld, J. H., "Identification of parameters in distributed parameter
systems by regularization," SIAM J. Control and Optimization, vol. 23, no. 2, pp. 217-
241, 1985.
19. Ladyzhenskaya, O.A. and Ural'tseva, N.N., Linear and Quasi-linear Elliptic Equations,
Academic Press, New York, 1968.
20. Lions, J.L. and Magenes, E., Non-Homogeneous Boundary Value Problems and Applica-
tions, I, Springer-Verlag, Berlin, 1972.
21. Liu, J.Q. and Chen, Y.M., "An iterative algorithm for solving inverse problems of two-
dimensional diffusion equations," SIAM J. Sci. Stat. Comput., vol. 5, pp. 255-269, 1984.
22. Nychka, D., Wahba, G., Goldfarb, S., and Pugh, T., "Cross-validated spline methods for
the estimation of three-dimensional tumor size distributions from observations on two
dimensional cross sections," J. Amer. Statist. Assoc., vol. 79, no. 388, pp. 832-846,
1984.
23. Nychka, D. and Cox, D. D., "Convergence rates for regularized solutions of integral
equations from discrete, noisy data," Ann. Statist., 1987 (to appear).
24. O'Sullivan, F. and Wahba, G., "A cross validated Bayesian retrieval algorithm for non-
linear remote sensing experiments," J. Comp. Physics, vol. 59, no. 3, pp. 441-455, 1985.
25. O'Sullivan, F., "A statistical perspective on ill-posed inverse problems (with discus-
sion)," J. Statist. Science, vol. 1, pp. 502-527, 1986.
26. O'Sullivan, F. and Wong, T., "Determining a functional diffusion coefficient in the heat
equation," Proceedings of the 19'th Symposium on the Interface, American Statistical
Association, 1987.
27. Parker, R.L., "The magnetotelluric inverse problem," Geophys. Surveys, vol. 46, pp.
5763-5783, 1983.
28. Ross, B., Fractional Calculus and Its Applications, Akademie-Verlag, Berlin, 1974.
29. Rudin, W., Principles of Mathematical Analysis, McGraw-Hill, New York, 1976.
30. Santosa, F. and Symes, W.W., "Linear inversion of band-limited reflection seismograms,"
SIAM J. Sci. Stat. Comput., vol. 7, pp. 1307-1330, 1986.
31. Stone, C.J., "Optimal global convergence rates for nonparametric regression," Ann. Sta-
tist., vol. 10, pp. 1040-1053, 1982.
32. Taylor, M.E., Pseudodifferential Operators, Princeton University Press, Princeton, New
Jersey, 1981.
33. Tikhonov, A. and Arsenin, V., Solutions of Ill-Posed Problems, Wiley, New York, 1977.
34. Treves, F., Basic Linear Partial Differential Equations, Academic Press, New York,
1975.
35. Weinberger, H. F., "Variational Methods for Eigenvalue Problems," in Lecture Notes by
G. P. Schwartz, Department of Mathematics, University of Minnesota, Minneapolis,
1962.
36. Weinberger, H. F., "Variational Methods for Eigenvalue Approximation," in CBMS
Regional conference series in applied mathematics, SIAM, Philadelphia, 1974.
Figure Legends
Figure 1.1.a : Time series plots of the data at different measurement sites: (i) x = .05, (ii)
x = .50, (iii) x = .95. The solid lines are the true values of u (x,t). See equation (1.6).
Figure 1.1.b : Spatial plots of the data at different times: (i) t = .01, (ii) t = .10, (iii) t = .50.
As in Figure 1.1.a, the solid lines are the true values of u (x,t).
Figure 1.2 : Regularization estimates corresponding to the data in Figure 1.1. The solid line is
the true diffusion coefficient. The dark dashed line corresponds to the optimal predictive mean
square error estimate. The dotted line corresponds to the cross-validated estimate. See text.
Figure 5.1 : Eigensequence plot for the Rayleigh quotient \|D^2\phi\|^2 / \|K_\theta\phi\|^2 of (5.16). The
plot gives log(\gamma_\nu) versus log(\nu).

Figure 5.2 : Eigensequence plot for the Rayleigh quotient of (5.17). The plot gives log(\gamma_\nu)
versus log(\nu).
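The eigensequence plots can be reproduced in spirit with a small numerical sketch. Everything below is an illustrative assumption rather than the paper's computation: the grid size, the choice of theta, the rectangle-rule discretization of K_theta, and the use of D^2 for the penalty. The decay exponent is read off as the slope of log(gamma_nu) against log(nu).

```python
import numpy as np
from scipy.linalg import eigh

n = 80
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# ||D^2 phi||^2 via central second differences on interior points
D2 = np.zeros((n - 2, n))
for i in range(n - 2):
    D2[i, i:i + 3] = [1.0, -2.0, 1.0]
D2 /= h**2
A = h * D2.T @ D2

# ||K_theta phi||^2 with K_theta phi[x_i] ~ sum_{j >= i} h_theta(x_j) phi(x_j) h
h_theta = 1.0 / (1.0 + x) ** 2           # hypothetical positive kernel
K = h * np.triu(np.ones((n, n))) @ np.diag(h_theta)
B = h * K.T @ K + 1e-12 * np.eye(n)      # small jitter keeps B positive definite

# Generalized eigenvalues gamma_nu of the Rayleigh quotient, in increasing order
gamma = np.sort(eigh(A, B, eigvals_only=True))
gamma = gamma[gamma > 1e-6]
nu = np.arange(1, len(gamma) + 1)

# Decay exponent = slope of log(gamma_nu) against log(nu) over mid-range indices
slope = np.polyfit(np.log(nu[5:40]), np.log(gamma[5:40]), 1)[0]
assert slope > 0.0                        # gamma_nu grows polynomially in nu
```

A straight line in the log-log plot, as in Figures 5.1 and 5.2, corresponds to polynomial growth of gamma_nu; the fitted slope plays the role of the exponent 2M/d in the text.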
[Figure 1.1.a: time series plots of the data at measurement sites x = .05, x = .50, x = .95.]

[Figure 1.1.b: spatial plots of the data at times t = .01, t = .10, t = .50.]

[Figure 1.2: regularized estimates.]

[Figure 5.1: eigensequence plot.]

[Figure 5.2: eigensequence plot.]
TECHNICAL REPORTSStatistcs Department
University of California, Berkeley
1. BREIMAN, L amid FREEDMAN, D. (Nov. 1981, revised Feb. 1982). How many variables should be entered in aregrssion q ? Jour. Amer. Statist Assoc., March 1983, 78, No. 381, 131-136.
2. BRILLINGER, D. R. (Jan 1982). Some contasting examples of the time and frequency domain approaches to time seriesanalysis. Time Series Methods in Hydrsciences, (A. H. El-Shaarawi and S. R. Esterby, eds.) Elsevier ScientificPublishing Co., Amsterdam. 1982 pp. 1-15.
3. DOKSUM, K. A. (Jan. 1982). On the performance of estimates in proportional hazard and log-linear models. SurvivalAnalysis, (John Crowley and Richard A. Johnson, eds.) IMS Lectr Notes - Monograph Series, (Shanti S. Gupt seriesed.) 1982, 74-84.
4. BICKEL, P. J. and BREIMAN, L. (Feb. 1982). Sums of functions of nearest neighbor distances, moment bounds, limittheorems and a goodness of fit test An Prob., Feb. 1982, 11. No. 1, 185-214.
5. BRLLUNGER, D. R. and TUKEY, J. W. (Mach 1982). Specwrum estimation and system identification relying on aFourier transform. The Collected Works of J. W. Tukey, vol. 2, Wadsworth, 1985, 1001-1141.
6. BERAN, R. (May 1982). Jackknife approximation to bootstrap estimates. An. S March 1984, 12 No. 1, 101-118.
7. BICKEL, P. J. and FREEDMAN, D. A. (June 1982). Bootstrapping regression models with many parameters.Lchmanm Festschrift, (P. J. Bickel, K. Doksum and J. L. Hodges, Jr., eds.) Wadsworth Press, Belmont, 1983, 2848.
8. BICKEL P. J. and COLLINS, J. (March 1982). Mziniming Fisher information over mixtures of distributions. Sankhyi,1983, 45, Series A, Pt. 1, 1-19.
9. BREIMAN, L. and FRIEDMAN, J. (July 1982). Estimating optimal transformations for multiple regression and correlation.
10. FREEDMAN, D. A. and PETERS, S. (July 1982, revised Aug. 1983). Bootstrapping a regrssion equation. someempirical results. JASA, 1984, 79, 97-106.
11. EATON, M. L. and FREEDMAN, D. A. (Sept. 1982). A remark on adjusting for covariates in multiple regression.
12. BICKEL, P. J. (April 1982). Minimax estimation of the mean of a mean of a normal distribution subject to doing weUlat a point. Recent Advances in Statistics, Academic Press, 1983.
14. FREEDMAN, D. A., ROTFHENBERG, T. and SUTCH, R. (Oct. 1982). A review of a residential energy end use model.
15. BRILLINGER, D. and PREISLER, H. (Nov. 1982). Maximum likelihood estimation in a latent variable problem. Studiesin Econometrics, Time Series, and Multivariate Statistics, (eds. S. Karlin, T. Amemiya, L. A. Goodman). AcademicTrF-, Wew York, 1983, pp. 31-&37
16. BICKEL, P. J. (Nov. 1982). Robust regression based on infinitesimal neighborhoods. Ann. Statist., Dec. 1984, 12,1349-1368.
17. DRAPER, D. C. (Feb. 1983). Rank-based robust analysis of linear models. I. Exposition and review. Statistcal Science,1988, Vol3 No. 2 239-271.
18. DRAPER, D. C. (Feb 1983). Rank-based robust inference in regression models with several observations per cell.
19. FREEDMAN, D. A. and FENBERG, S. (Feb. 1983, revised April 1983). Statistics and the scientific method, Commentson and reactions to Freedman, A rejoinder to Fienberg's comments. Springer New York 1985 Cohort hAn s in SocialResearch, (W. M. Mason and S. E. Fienberg, eds.).
20. FREEDMAN, D. A. and PETERS, S. C. (March 1983, revised Jan. 1984). Using the bootstrap to evaluate forecastingequations. J. of Forecastin. 1985, Vol. 4, 251-262.
21. FREEDMAN, D. A. and PETERS, S. C. (March 1983, revised Aug. 1983). Bootstrapping an econometric model: someempiical results. JBES, 1985, 2, 150-158.
22. FREEDMAN, D. A. (March 1983). Structural-equation models: a case study.
23. DAGGETr, R. S. and FREEDMAN, D. (April 1983, revised Sept 1983). Econometrics and the law: a case study in theproof of antiaust damages. Proc. of the Berkeley Conference, in honor of Jerzy Neyman and Jack Kiefer. Vol I pp.123-172. (L. Le Cam, R. Olshen eds.) Wadsworth, 1985.
- 2 -
24. DOKSUM, K. and YANDELL, B. (April 1983). Tests for exponentiality. Handbook of Staistis, (P. R. Krishnaiah andP. K. Senw eds.) 4, 1984, 579-611.
25. FREEDMAN, D. A. (May 1983). Comments on a paper by Markus.
26. FREEDMAN, D. (Oct. 1983, revised March 1984). On bootstrapping two-stage least-squares estimates in stationary linearmodels. AMn. Statist*, 1984, 12, 827-842.
27. DOKSUM, K. A. (Dec. 1983). An extension of partial likelihood methods for proportional hazard models to generaltransformation models. Ann. Statist., 1987, 15, 325-345.
28. BICKEL, P. J., GOETZE, F. and VAN ZWET, W. R. (Jan. 1984). A simple analysis of third order efficiency of estimateProc. of the Neyman-Kiefer Conference, (L. Le Cam, ed.) Wadsworth, 1985.
29. BICKEL, P. J. and FREEDMAN, D. A. Asymptotic normality and the bootstrap in stratified sampling. Ann. Statist.12 470-482.
30. FREEDMAN, D. A. (Jan. 1984). The mean vs. the median: a case study in 4-R Act litigation. JBES. 1985 Vol 3pp. 1-13.
31. STONE, C. J. (Feb. 1984). An asymptotically optimal window selection rule for kemel density estimates. Ann. Statist.,Dec. 1984, 12, 1285-1297.
32. BREIMAN, L. (May 1984). Nail finders, edifices, and Oz.
33. STONE, C. J. (Oct. 1984). Additive regression and other nonparametric models. Ann. Statist., 1985, 13, 689-705.
34. STONE, C. J. (June 1984). An asymptotically optimal histogran selection rule. Proc. of the Berkeley Conf. in Honor ofJerzy Neyman and Jack Kiefer (L. Le Cam and R. A. Olshen, eds.), 11, 513-520.
35. FREEDMAN, D. A. and NAVIDI, W. C. (Sept. 1984, revised Jan. 1985). Regression models for adjusting the 1980Census. Statistical Science. Feb 1986, Vol. 1, No. 1, 3-39.
36. FREEDMAN, D. A. (Sept. 1984, revised Nov. 1984). De Finetti's theorem in continuous time.
37. DIACONIS, P. and FREEDMAN, D. (Oct. 1984). An elementary proof of Stirling's formula. Amer. Math Monthly. Feb1986, Vol. 93, No. 2, 123-125.
38. LE CAM, L. (Nov. 1984). Sur l'approximation de familles de mesures par des families Gaussiennes. Ann. Inst.Henri Poincare, 1985, 21, 225-287.
39. DIACONIS, P. and FREEDMAN, D. A. (Nov. 1984). A note on weak star uniformities.
40. BREIMAN, L. and IHAKA, R. (Dec. 1984). Nonlinear discriminant analysis via SCALING and ACE.
41. STONE, C. J. (Jan. 1985). The dimensionality reduction principle for generalized additive models.
42. LE CAM, L. (Jan. 1985). On the normal approximation for sums of independent variables.
43. BICKEL P. J. and YAHAV, J. A. (1985). On estimating the number of unseen species: how many executions werethere?
44. BRILLJNGER, D. R. (1985). The natural variability of vital rates and associated statistics. Biometrics, to appear.
45. BRILLINGER, D. R. (1985). Fourier inference: some methods for the analysis of array and nonGaussian series dataWater Resources Bulletin, 1985, 21, 743-756.
46. BREIMAN, L. and STONE, C. J. (1985). Broad spectrum estimates and confidence intervals for tail quantiles.
47. DABROWSKA, D. M. and DOKSUM, K. A. (1985, revised March 1987). Partial likelihood in transfornation modelswith censored data. Scandinavian J. Statist., 1988, 15, 1-23.
48. HAYCOCK, K. A. and BRILLINGER, D. R. (November 1985). LIBDRB: A subroutine library for elementary timeseries analysis.
49. BRILLINGER, D. R. (October 1985). Fitting cosines: some procedures and some physical examples. Joshi Festschrift,1986. D. Reidel.
50. BRILLINGER, D. R. (November 1985). What do seismology and neurophysiology have in common? - Statistics!Comptes Rendus Math. Acad. Sci. Canada. January, 1986.
51. COX, D. D. and O'SULLIVAN, F. (October 1985). Analysis of penalized likelihood-type estimators with application togeneralized smoothdng in Sobolev Spaces.
- 3 -
52. O'SULLIVAN, F. (November 1985). A practical perspective on ill-posed inverse problems: A review with somenew developnnts. To appear in Journal of Statistical Science.
53. LE CAM, L and YANG, G. L (November 1985, revised March 1987). On the preservation of local asymptotic normalitYunder infonnaton loss.
54. BLACKWELL., D. (Novenber 1985). Approximate normality of large products.
55. FREEDMAN, D. A. (June 1987). As others see us: A case study in path analysis. Joumal of EucationalStatistics. 12, 101-128.
56. LE CAM, L and YANG, G. L. (January 1986). Replaced by No. 68.
57. LE CAM, L. (Febrary 1986). On the Bernstein - von Mises theorem.
58. O'SULLIVAN, F. (January 1986). Estimation of Densities and Hazards by the Method of Penalized likelihood.
59. ALDOUS, D. and DIACONIS, P. (February 1986). Strong Uniform Times and Finite Random Walks.
60. ALDOUS, D. (March 1986). On the Markov Chain simulation Method for Uniform Combinatorial Distributions andSimulated Annealing.
61. CHENG, C-S. (April 1986). An Optmizaton Problem with Applications to Optmal Desipg Thoory.
62. CHENG, C-S., MAJUMDAR, D., STUFKEN, J. & TURE, T. E. (May 1986, revised Jan 1987). Optinal step typdesign for co g test treatments with a controL
63. CHENG, C-S. (May 1986, revised Jan. 1987). An Application of the Kiefer-Wolfowitz Equlivalence Theorem.
64. O'SULLIVAN, F. (May 1986). Nonparametric Estmation in the Cox Propodrional Hazards Model.
65. ALDOUS, D. (JUNE 1986). Finite-Time Implications of Relaxation Times for Sochasdcally Monotone Processes.
66. PITMAN, J. (JULY 1986, revised November 1986). Stationary Excursions.
67. DABROWSKA, D. and DOKSUM, K. (July 1986, revised November 1986). Estimates and confidence intervals formedian and mean life in the proportional hazard model with censored data. Biometrika, 1987, 74, 799-808.
68. LE CAM, L. and YANG, G.L. (July 1986). Distinguished Statistics, Loss of information and a theorem of Robert B.Davies (Fourth edition).
69. STONE, C.J. (July 1986). Asymptotic properties of logspline density estimation.
71. BICKEL, PJ. and YAHAV, J.A. (July 1986). Richardson Extrapolation and the Bootstrap.
72. LEHMANN, E.L. (July 1986). Statistics - an overview.
73. STONE, C.J. (August 1986). A nonparametric framework for statistical modelling.
74. BIANE, PH. and YOR, M. (August 1986). A relation between L6vy's stochastic area formula, Legendre polynomial,and some continued fractions of Gauss.
75. LEHMANN, E.L. (August 1986, revised July 1987). Comparing Location Experiments.
76. O'SULLIVAN, F. (September 1986). Relative risk estimation.
77. O'SULLIVAN, F. (Sptmber 1986). Deconvolution of episodic hormone data.
78. PITMAN, J. & YOR. M. (September 1987). Further asymptotic laws of planar Brownian motion.
79. FREEDMAN, D.A. & ZEISEL. H. (November 1986). From mouse to mam The quantitative assessment of cancer risks.To appear in Statistical Science.
80. BRILLINGER, D.R. (October 1986). Maximum likelihood analysis of spike trains of interacting nerve cells.
81. DABROWSKA, D.M. (November 1986). Nonparametric regression with censored survival time data.
82. DOKSUM, K.J. and LO, A.Y. (Nov 1986, revised Aug 1988). Consistent and robust Bayes Procedures forLocation based on Pial lIformation.
83. DABROWSKA, D.M., DOKSUM, KA. and MIURA, R. (November 1986). Rank esimate in a class of seniparnerictwo-sunple models.
- 4 -
84. BRlLLINGER, D. (Decenber 1986). Some statistical methods for random process data from seismology andneurophysiology.
85. DIACONIS, P. and FREEDMAN, D. (December 1986). A dozen de Finetti-style results in search of a theory.Am. inst. Henr irll 1987, 23, 397-423.
86. DABROWSKA, D.M. (January 1987). Uniforn consistency of nearest neighbour and keemel conditional Kaplan- Meier estdmas.
87. FREEDMAN, DA., NAVIDI, W. and PETERS, S.C. (February 1987). On the impact of variable selection infitting regression equations.
88. ALDOUS, D. (February 1987, revised April 1987). Hashing with linear probing, under non-uniforn probabilities.
89. DABROWSKA, D.M. and DOKSUM, KA. (March 1987, revised January 1988). Estimating and testing in a twosanple generalized odds rate model. J. Amer. Statist. A 1988, 83, 744749.
90. DABROWSKA, D.M. (March 1987). Rank tests for matched pair experiments with censored data.
91. DIACONIS, P and FREEDMAN, D.A. (April 1988). Conditional limit tieorems for exponential families and finiteversions of do Finetti's theorem. To appear in the Joumal of Applied Probability.
92. DABROWSKA, D.M. (April 1987, revised September 1987). Kaplan-Meier estimate on the plan.
92L ALDUS, D. (April 1987). The Harmonic mean formula for probabilities of Unions: Applications to spare randomgraphs.
93. DABROWSKA, D.M. (June 1987, revised Feb 1988). Nonparametric quantile regression with censored data
94. DONOHO, D.L. & STARK, PR. (June 1987). Uncetainty principles and signal recovery.
95. CANCELIED
96. BRILLINGER, D.R. (June 1987). Some examples of the statistical analysis of seismological data To appear inProceedings, Centennial Anniversary Symposium, Seismographic Stations, University of Califormia, Berkeley.
97. FREEDMAN, DA. and NAVIDI, W. (June 1987). On the multi-stage model for carcinogenesis. To appear inEnvironmental Health Perspectives.
98. O'SULUVAN, F. and WONG, T. (June 1987). Detemining a function diffusion coefficient in the heat equation.99. O'SULLIVAN, F. (June 1987). Constrained non-linear regularization with application to some system identification
problems.
100. LE CAM, L. (July 1987, revised Nov 1987). On the standard asymptotic confidence ellipsoids of Wald.
101. DONOHO, D.L. and LIU, R.C. (July 1987). Pathlogies of some minimum distance estimators. Annals ofStatistics, June, 1988.
102. BRILLINGER, D.R., DOWNING, K.H. and GLAESER, R.M. (July 1987). Some statistical aspects of low-doseelectron imaging of crystals.
103. LE CAM, L. (August 1987). Harald Cramer and sums of independent random variables.
104. DONOHO, A.W., DONOHO, D.L. and GASKO, M. (August 1987). Macspin: Dynamic graphics on a desktopcomputer. IEEE Computer Graphics and applications, June, 1988.
105. DONOHO, D.L. and LIU, R.C. (August 1987). On minimax estimation of linear functionals.
106. DABROWSKA, D.M. (August 1987). Kaplan-Meier estimate on the plane: weak convergence, LIL and the bootstrap.
107. CHENG, C-S. (Aug 1987, revised Oct 1988). Some orthogonal main-effect plans for asymmetrical factorials.
108. CHENG, C-S. and JACROUX, M. (August 1987). On the construction of trend-free run orders of two-level factorialdesigns.
109. KLASS, M.J. (August 1987). M g E max Sk/ES': A prophet inequality for sums of I.I.D. mean zero variates.
110. DONOHO, D.L. and LIU, R.C. (August 1987). The "automatic" robustness of minimum distance functionals.Annals of Statistics, June, 1988.
111. BICKEL- PJ. and GHOSH, J.K. (August 1987, revised Jumn 1988). A decomposition for thelikMelihood ratio statisticand the Batlett correction - a Bayesian argument
- 5 -
112. BURDZY, K., PIT?MAN, J.W. and YOR, M. (Septmber 1987). Some asymptotic laws for crossings and excurions.
113. ADHDCARI A. and PilT , . (September 1987). The shortet plana arc of width 1.
114. RITOV, Y. (September 1987). Estimadon in a linear regression model with censored data
115. BICKEL PJ. and RITOV, Y. (Sept 1987, revised Aug 1988). Large sample theory of estimation in biased samplingregression models I.
116. RrTOV, Y. and BICKEL, P.J. (Sept.1987, revised Aug. 1988). Achieving information bounds in non andsemiparaetric models.
117. RlTOV, Y. (October 1987). On the convergence of a maximal correlation algoritun with alternating projections.
118. ALDOUS, D.J. (October 1987). Meeting times for independent Markov chains.
119. HESSE, C.H. (October 1987). An asymptotic expansion for the mean of the passage-time distribution of integratedBrownian Motion.
120. DONOHO, D. and LIU, R. (Oct. 1987, revised Mar. 1988, Oct. 1988). Geometizing rates of convergence, II.
121. BRIllINGER, D.R. (October 1987). Esimating the chances of large earthquakes by radiocarbon dadng and statisticalmodelling. To appear in Statistics a Guide to the Unknown.
122. ALDOUS, D., FLANNERY, B. and PALACIOS, J.L. (November 1987). Two applications of um processes: The fringeanalysis of seach trees and the simulation of quasi-stationary distributions of Markov chains.
123. DONOHO, D.L., MACGIBBON, B. and LIU, R.C. (Nov.1987, revised July 1988). Mbinmax risk for hyperrectangles.
124. ALDOUS, D. (Novernber 1987). Stopping times and tightnss IL
125. HESSE, C.H. (November 1987). The present state of a stochastic model for sedimentation.
126. DALANG, R.C. (December 1987, revised June 1988). Optimal stopping of two-parameter processes onnonstandard probability spaces.
127. Same as No. 133.
128. DONOHO, D. and GASKO, M. (December 1987). Multivariate generalizations of the median and timmed mean H.
129. SMITH, D.L. (Decenber 1987). Exponential bounds in Vapnik-tervonenkis classes of index 1.
130. STONE, CJ. (Nov.1987, revised Sept. 1988). Uniform error bounds involving logspline models.
131. Same as No. 140
132. HESSE, C.H. (December 1987). A Bahadur - Type representation for empirical quantiles of a large class of stationary,possibly ifinite - variance, linear processes
133. DONOHO, D.L. and GASKO, M. (December 1987). Multivariate generations of the median and trimmed mean, L.134. DUBINS, L.E. and SCHWARZ, G. (December 1987). A sharp inequality for martingales and stopping-times.
135. FREEDMAN, DA and NAVIDL W. (Decenber 1987). On the risk of lung cancer for ex-smokers.
136. LE CAM, L. (January 1988). On some stochastic models of the effects of radiation on cell survival.
137. DIACONIS, P. and FREEDMAN, D.A. (April 1988). On the uniform consistency of Bayes estimates for multinomial probabilities.
137a. DONOHO, D.L. and LIU, R.C. (1987). Geometrizing rates of convergence, I.
138. DONOHO, D.L. and LIU, R.C. (January 1988). Geometrizing rates of convergence, III.
139. BERAN, R. (January 1988). Refining simultaneous confidence sets.
140. HESSE, C.H. (December 1987). Numerical and statistical aspects of neural networks.
141. BRILLINGER, D.R. (Mar. 1989). a) A study of second- and third-order spectral procedures and maximum likelihood identification of a bilinear system. b) Some statistical aspects of NMR spectroscopy, Actas del 2° Congreso Latinoamericano de Probabilidad y Estadística Matemática, Caracas, 1985.
142. DONOHO, D.L. (Jan. 1985, revised Jan. 1988). One-sided inference about functionals of a density.
143. DALANG, R.C. (Feb. 1988, revised Nov. 1988). Randomization in the two-armed bandit problem.
144. DABROWSKA, D.M., DOKSUM, K.A. and SONG, J.K. (February 1988). Graphical comparisons of cumulative hazards for two populations.
145. ALDOUS, D.J. (February 1988). Lower bounds for covering times for reversible Markov Chains and random walks on graphs.
146. BICKEL, P.J. and RITOV, Y. (Feb. 1988, revised August 1988). Estimating integrated squared density derivatives.
147. STARK, P.B. (March 1988). Strict bounds and applications.
148. DONOHO, D.L. and STARK, P.B. (March 1988). Rearrangements and smoothing.
149. NOLAN, D. (March 1988). Asymptotics for a multivariate location estimator.
150. SEILLIER, F. (March 1988). Sequential probability forecasts and the probability integral transform.
151. NOLAN, D. (March 1988). Limit theorems for a random convex set.
152. DIACONIS, P. and FREEDMAN, D.A. (April 1988). On a theorem of Kuchler and Lauritzen.
153. DIACONIS, P. and FREEDMAN, D.A. (April 1988). On the problem of types.
154. DOKSUM, K.A. (May 1988). On the correspondence between models in binary regression analysis and survival analysis.
155. LEHMANN, E.L. (May 1988). Jerzy Neyman, 1894-1981.
156. ALDOUS, D.J. (May 1988). Stein's method in a two-dimensional coverage problem.
157. FAN, J. (June 1988). On the optimal rates of convergence for nonparametric deconvolution problem.
158. DABROWSKA, D. (June 1988). Signed-rank tests for censored matched pairs.
159. BERAN, R.J. and MILLAR, P.W. (June 1988). Multivariate symmetry models.
160. BERAN, R.J. and MILLAR, P.W. (June 1988). Tests of fit for logistic models.
161. BREIMAN, L. and PETERS, S. (June 1988). Comparing automatic bivariate smoothers (A public service enterprise).
162. FAN, J. (June 1988). Optimal global rates of convergence for nonparametric deconvolution problem.
163. DIACONIS, P. and FREEDMAN, D.A. (June 1988). A singular measure which is locally uniform. (Revised by Tech Report No. 180).
164. BICKEL, P.J. and KRIEGER, A.M. (July 1988). Confidence bands for a distribution function using the bootstrap.
165. HESSE, C.H. (July 1988). New methods in the analysis of economic time series I.
166. FAN, JIANQING (July 1988). Nonparametric estimation of quadratic functionals in Gaussian white noise.
167. BREIMAN, L., STONE, C.J. and KOOPERBERG, C. (August 1988). Confidence bounds for extreme quantiles.
168. LE CAM, L. (August 1988). Maximum likelihood: an introduction.
169. BREIMAN, L. (Aug. 1988, revised Feb. 1989). Submodel selection and evaluation in regression I. The X-fixed case and little bootstrap.
170. LE CAM, L. (September 1988). On the Prokhorov distance between the empirical process and the associated Gaussian bridge.
171. STONE, C.J. (September 1988). Large-sample inference for logspline models.
172. ADLER, R.J. and EPSTEIN, R. (September 1988). Intersection local times for infinite systems of planar Brownian motions and for the Brownian density process.
173. MILLAR, P.W. (October 1988). Optimal estimation in the non-parametric multiplicative intensity model.
174. YOR, M. (October 1988). Intertwinings of Bessel processes.
175. ROJO, J. (October 1988). On the concept of tail-heaviness.
176. ABRAHAMS, D.M. and RIZZARDI, F. (September 1988). BLSS - The Berkeley interactive statistical system: An overview.
177. MILLAR, P.W. (October 1988). Gamma-funnels in the domain of a probability, with statistical implications.
178. DONOHO, D.L. and LIU, R.C. (October 1988). Hardest one-dimensional subfamilies.
179. DONOHO, D.L. and STARK, P.B. (October 1988). Recovery of sparse signals from data missing low frequencies.
180. FREEDMAN, D.A. and PITMAN, J.W. (Nov. 1988). A measure which is singular and uniformly locally uniform. (Revision of Tech Report No. 163).
181. DOKSUM, K.A. and HOYLAND, A. (Nov. 1988, revised Jan. 1989). A model for step-stress accelerated life testing experiments based on Wiener processes and the inverse Gaussian distribution.
182. DALANG, R.C., MORTON, A. and WILLINGER, W. (November 1988). Equivalent martingale measures andno-arbitrage in stochastic securities market models.
183. BERAN, R. (November 1988). Calibrating prediction regions.
184. BARLOW, M.T., PITMAN, J. and YOR, M. (Feb. 1989). On Walsh's Brownian Motions.
185. DALANG, R.C. and WALSH, J.B. (Dec. 1988). Almost-equivalence of the germ-field Markov property and the sharp Markov property of the Brownian sheet.
186. HESSE, C.H. (Dec. 1988). Level-crossing of integrated Ornstein-Uhlenbeck processes.
187. NEVEU, J. and PITMAN, J.W. (Feb. 1989). Renewal property of the extrema and tree property of the excursion of a one-dimensional Brownian motion.
188. NEVEU, J. and PITMAN, J.W. (Feb. 1989). The branching process in a Brownian excursion.
189. PITMAN, J.W. and YOR, M. (Mar. 1989). Some extensions of the arcsine law.
190. STARK, P.B. (Dec. 1988). Duality and discretization in linear inverse problems.
191. LEHMANN, E.L. and SCHOLZ, F.W. (Jan. 1989). Ancillarity.
192. PEMANTLE, R. (Feb. 1989). A time-dependent version of Pólya's urn.
193. PEMANTLE, R. (Feb. 1989). Nonconvergence to unstable points in urn models and stochastic approximations.
194. PEMANTLE, R. (Feb. 1989). When are touchpoints limits for generalized Pólya urns.
195. PEMANTLE, R. (Feb. 1989). Random walk in a random environment and first-passage percolation on trees.
196. BARLOW, M., PITMAN, J. and YOR, M. (Feb. 1989). Une extension multidimensionnelle de la loi de l'arc sinus.
197. BREIMAN, L. and SPECTOR, P. (Mar. 1989). Submodel selection and evaluation in regression - the X-random case.
198. BREIMAN, L., TSUR, Y. and ZEMEL, A. (Mar. 1989). A simple estimation procedure for censored regression models with known error distribution.
Copies of these Reports plus the most recent additions to the Technical Report series are available from the Statistics Department technical typist in room 379 Evans Hall or may be requested by mail from:
Department of Statistics
University of California
Berkeley, California 94720
Cost: $1 per copy.