
Journal of Geodetic Science 2(2), 2012, 113-124. DOI: 10.2478/v10156-011-0036-5

Weighted total least squares formulated by standard least squares theory

Research article

A. Amiri-Simkooei 1,2∗, S. Jazaeri 3

1 Department of Surveying Engineering, Faculty of Engineering, University of Isfahan, 81746-73441 Isfahan, Iran
2 Acoustic Remote Sensing Group, Faculty of Aerospace Engineering, Delft University of Technology, Delft, the Netherlands
3 Department of Surveying and Geomatics Engineering, College of Engineering, University of Tehran, Tehran, Iran

Abstract: This contribution presents a simple, attractive, and flexible formulation for the weighted total least squares (WTLS) problem. It is simple because it is based on the well-known standard least squares theory; it is attractive because it allows one to directly use the existing body of knowledge of least squares theory; and it is flexible because it can be applied to a broad field of applications in errors-in-variables (EIV) models. Two empirical examples using real and simulated data are presented. The first example, a linear regression model, takes the covariance matrix of the coefficient matrix as $Q_A = Q_n \otimes Q_m$, while the second example, a 2-D affine transformation, takes a general structure of the covariance matrix $Q_A$. The estimates for the unknown parameters along with their standard deviations are obtained for the two examples. The results are shown to be identical to those obtained based on the nonlinear Gauss-Helmert model (GHM). We aim to have an impartial evaluation of WTLS and GHM. We further explore the high potential capability of the presented formulation. One can simply obtain the covariance matrix of the WTLS estimates. In addition, one can generalize the orthogonal projectors of the standard least squares, from which estimates for the residuals and observations (along with their covariance matrices) and the variance of the unit weight can directly be derived. Also, the constrained WTLS, variance component estimation for an EIV model, and the theory of reliability and data snooping can easily be established, which are in progress for future publications.

Keywords: standard least squares • errors-in-variables model • weighted total least squares • singular value decomposition

© Versita sp. z o.o.

Received 2012-03-16; accepted 2012-04-30

1. Introduction

A significant part of the literature on estimation theory distinguishes between the standard least squares (SLS) and the total least squares (TLS). The latter originates from the work of Golub and van Loan (1980) in the mathematical literature, in which they introduced the errors-in-variables (EIV) models. An EIV model differs from the standard Gauss-Markov model (GMM) in the sense that the coefficient matrix connecting the parameters to the random observation vector is also affected by random errors. The simplest TLS model includes the case when both coordinate components of a linear regression model are observed.

∗E-mail: [email protected]

In the geodetic literature, Burkhard Schaffrin has made significant contributions to developing different theories on the TLS technique. We may refer to Schaffrin and Wieser (2008), in which they developed weighted TLS for an EIV model using the traditional Lagrange function. Their algorithm is restricted to the class $Q_A = Q_n \otimes Q_m$ for the covariance matrix of the coefficient matrix. Schaffrin and his colleagues developed new algorithms to solve the TLS adjustment on the model of condition equations and the TLS problem with linear and quadratic constraints. The reader is referred to Schaffrin and Felus (2009), Schaffrin and Wieser (2009), and Schaffrin and Wieser (2011). Shen et al. (2011) proposed an algorithm that resembles the standard least-squares formulation. Mahboub (2012) developed an algorithm to use a general covariance matrix and


provided some guidelines for constructing such a matrix. Xu et al. (2012) formulated WTLS as a nonlinear adjustment model without constraints and further extended it to a partial EIV model.

It is widely known that the EIV model can also be considered as a nonlinear Gauss-Helmert model (GHM), because in a TLS adjustment with an EIV model the same objective function is minimized as in the adjustment by a nonlinear GHM. TLS adjustment may not be regarded as a new adjustment method, but rather as an additional possibility to formulate a new algorithm in the frame of the general method of least squares (Neitzel, 2010). We aim to have an impartial evaluation of both methods. The solution of the nonlinear GHM is somewhat critical to handle due to the many pitfalls described by Pope (1972). Special attention should therefore be paid to appropriate linearization and iteration of the model. In contrast, the elegance of the TLS algorithm lies in its simplicity, in the sense that it is formulated in the standard GMM and avoids immediate linearization. Therefore, instead of solving a nonlinear GHM in an iterative manner, we use the WTLS adjustment within the EIV model and solve it efficiently using a linearly structured iterative algorithm. This efficiency might also include the number of iterations involved, as demonstrated in the present contribution. We should however note that the GHM can be used as a reference to cross-check the WTLS results. Particular attention is paid to the standard deviations of the estimates obtained using WTLS and GHM.

The objective of this contribution is three-fold. First, we aim to formulate the WTLS problem using the standard least squares theory. An alternative expression gives a different appearance of the Tong et al. (2011) and Mahboub (2012) algorithms (but is practically identical), while an alternative derivation gives an expression similar to that of Shen et al. (2011). This latter derivation is of interest because it is straightforward and does not treat the problem as nonlinear (a prerequisite for the derivation of Shen et al. (2011)). The algorithm takes into consideration the complete structure of the covariance matrix of the coefficient matrix. Second, we further explore the high potential capability of the formulation. Having this formulation available, one can apply the existing body of knowledge of least squares theory to the WTLS problem. We can at least highlight that (1) one can simply obtain the covariance matrix of the WTLS estimated parameters, (2) one can simply generalize the orthogonal projectors of the standard least squares, from which estimates for the residuals, observations, and variance of the unit weight can directly be derived, (3) one can define measures for reliability, and hence the data snooping procedures can simply be implemented, and (4) one can easily establish variance component estimation for an EIV model. Third, because the WTLS problem can, in principle, be formulated equivalently in terms of a nonlinear GHM, we use the Gauss-Helmert results as a reference to verify the proper formulation of the estimated parameters along with their covariance matrices. It is shown that the results of these two methods are identical.

This paper is organized as follows. In Sect. 2, a general solution is given to the WTLS problem, formulated in the standard least squares framework. Section 3 shows the applicability of the algorithm to two numerical examples: (i) a linear regression model of which both x and y coordinates have been observed, and (ii) a planar linear affine transformation in which the coordinates are observed in both the start and target systems. For both examples, the results (estimates along with their standard deviations) are shown to be identical to those obtained using GHM. We draw some conclusions in Sect. 4.

2. Weighted total least squares (WTLS)

Before we continue with the weighted total least squares (WTLS), we briefly explain the theory of the standard least squares. Consider the linear model of observation equations, called the Gauss-Markov model, for a given set of geodetic observations

y = Ax + e (1)

where y is the m-vector of observations, A is the m × n design matrix, and x is the n-vector of unknown parameters. The least-squares estimate of the unknown parameters is

$$\hat{x} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} y \qquad (2)$$

where $Q_y$ is the m × m covariance matrix of the observations. The estimated vectors of the observations and of the residuals follow respectively from

$$\hat{y} = P_A y, \qquad \hat{e} = P_A^{\perp} y \qquad (3)$$

where $P_A = A(A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1}$ and $P_A^{\perp} = I - A(A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1}$ are two m × m orthogonal projectors. The covariance matrices of the least-squares estimates $\hat{x}$, $\hat{y}$, and $\hat{e}$ are $Q_{\hat{x}} = (A^T Q_y^{-1} A)^{-1}$, $Q_{\hat{y}} = P_A Q_y$, and $Q_{\hat{e}} = P_A^{\perp} Q_y$, respectively (Teunissen, 2000).
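As an illustration (our own sketch, not part of the paper), the quantities of Eqs. (2)-(3) can be computed directly with NumPy; the design matrix, covariance matrix, and observations below are invented for demonstration:

```python
# Sketch of standard weighted least squares per Eqs. (1)-(3); data invented.
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 2
A = np.column_stack([rng.normal(size=m), np.ones(m)])   # m x n design matrix
Qy = np.diag(rng.uniform(0.5, 2.0, size=m))             # covariance of y
y = A @ np.array([2.0, -1.0]) + rng.normal(size=m)      # simulated observations

W = np.linalg.inv(Qy)                                   # weight matrix Qy^{-1}
N = A.T @ W @ A                                         # normal matrix
x_hat = np.linalg.solve(N, A.T @ W @ y)                 # Eq. (2)

P_A = A @ np.linalg.solve(N, A.T @ W)                   # projector P_A
P_Ao = np.eye(m) - P_A                                  # projector P_A^perp
y_hat, e_hat = P_A @ y, P_Ao @ y                        # Eq. (3)
Qx = np.linalg.inv(N)                                   # covariance of x_hat
```

Both projectors are idempotent, and the decomposition $\hat{y} + \hat{e} = y$ holds by construction.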

In many geodetic surveying applications, one usually assumes that only the observations y are corrupted by random noise, in which case the preceding formulation can be used. There are, however, cases in which the model itself is also corrupted by random noise. In this case, the Gauss-Markov model is replaced by an 'errors-in-variables' (EIV) model expressed as

$$y = (A - E_A)x + e_y \qquad (4)$$

with its stochastic properties characterized by

$$\begin{bmatrix} e_y \\ e_A \end{bmatrix} := \begin{bmatrix} e_y \\ \mathrm{vec}(E_A) \end{bmatrix} \sim \left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \; \sigma_0^2 \begin{bmatrix} Q_y & 0 \\ 0 & Q_A \end{bmatrix} \right) \qquad (5)$$


where $e_y$ is the m-vector of observational errors, A is the m × n design matrix, $E_A$ is the corresponding m × n matrix of random errors, x is the n-vector of unknown parameters, and $D(e_y) = \sigma_0^2 Q_y$ and $D(e_A) = \sigma_0^2 Q_A$ are the corresponding symmetric non-negative dispersion matrices of sizes m × m and mn × mn, respectively. In both expressions, $\sigma_0^2$ is the unknown variance factor of the unit weight (for now we assume $\sigma_0^2 = 1$).

In the homoscedastic case, one assumes that $Q_y = I_m$ and $Q_A = I_{mn}$, i.e. identity matrices of sizes m and mn, respectively. This yields the standard TLS, which was originally introduced in the mathematical literature by Golub and van Loan (1980). They solved this TLS problem using the solution of the following eigenvalue problem:

$$\begin{bmatrix} A^T A & A^T y \\ y^T A & y^T y \end{bmatrix} \begin{bmatrix} \hat{x} \\ -1 \end{bmatrix} = \begin{bmatrix} \hat{x} \\ -1 \end{bmatrix} \nu_{\min}, \qquad \nu_{\min} \ge 0 \qquad (6)$$

where $\nu_{\min}$, the smallest eigenvalue, along with $\hat{x}$ are obtained in an iterative manner. An introduction to the TLS methods can be found in van Huffel and Vandewalle (1991), where the solution is based on an SVD approach. Alternative iteration schemes and a comprehensive literature review are given by Schaffrin et al. (2006) as well. Schaffrin and Wieser (2008) argued that the so-called generalized TLS (GTLS), introduced by van Huffel and Vandewalle (1991), does not really solve the weighted TLS problem because the "weights" introduced in GTLS do not refer to the covariance matrix of the observations. They introduced the "weighted TLS", which follows the geodetic tradition based on the inverse of the covariance matrix.
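A small numerical cross-check (ours; the data below are synthetic) of the homoscedastic TLS solution: the right singular vector of the augmented matrix [A y] belonging to the smallest singular value yields $\hat{x}$, and the pair $(\hat{x}, \nu_{\min})$ satisfies the eigenvalue problem of Eq. (6):

```python
# Homoscedastic TLS via the SVD of [A y], checked against Eq. (6); synthetic data.
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 2
A = np.column_stack([np.linspace(0.0, 5.0, m), np.ones(m)])
y = A @ np.array([1.5, -0.5])
A = A + 0.05 * rng.normal(size=A.shape)     # noise in the coefficient matrix
y = y + 0.05 * rng.normal(size=m)           # noise in the observations

# SVD: right singular vector for the smallest singular value of [A y]
U, s, Vt = np.linalg.svd(np.column_stack([A, y]))
v = Vt[-1]                                  # (n+1)-vector
x_tls = -v[:n] / v[n]                       # TLS estimate
nu_min = s[-1] ** 2                         # smallest eigenvalue of [A y]^T [A y]

# Matrix of the eigenvalue problem in Eq. (6)
M = np.block([[A.T @ A, (A.T @ y)[:, None]],
              [(y.T @ A)[None, :], np.array([[y @ y]])]])
z = np.append(x_tls, -1.0)                  # the vector [x; -1]
```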

Similar to Schaffrin and Wieser (2008), we use the target (Lagrange) function

$$\Phi := e_y^T Q_y^{-1} e_y + e_A^T Q_A^{-1} e_A + 2\lambda^T \left( y - Ax - e_y + (x^T \otimes I_m) e_A \right) \qquad (7)$$

with λ an m-vector of unknown Lagrange multipliers and $I_m$ an identity matrix of size m. The first partial derivatives of Eq. (7) with respect to the vectors $e_y$, $e_A$, λ, and x follow as

$$\frac{1}{2}\frac{\partial \Phi}{\partial e_y^T} = Q_y^{-1}\tilde{e}_y - \hat{\lambda} = 0 \qquad (8)$$

$$\frac{1}{2}\frac{\partial \Phi}{\partial e_A^T} = Q_A^{-1}\tilde{e}_A + (\hat{x} \otimes I_m)\hat{\lambda} = 0 \qquad (9)$$

$$\frac{1}{2}\frac{\partial \Phi}{\partial \lambda^T} = y - A\hat{x} - \tilde{e}_y + (\hat{x}^T \otimes I_m)\tilde{e}_A = 0 \qquad (10)$$

$$\frac{1}{2}\frac{\partial \Phi}{\partial x^T} = -(A^T\hat{\lambda} - \tilde{E}_A^T\hat{\lambda}) = 0 \qquad (11)$$

where (˜) and (ˆ) denote the "predicted" and "estimated" quantities, respectively. The predicted residual vectors follow from Eq. (8) and Eq. (9) as

$$\tilde{e}_y = Q_y \hat{\lambda} \qquad (12)$$

and

$$\tilde{e}_A = \mathrm{vec}\,\tilde{E}_A = -Q_A(\hat{x} \otimes I_m)\hat{\lambda} \qquad (13)$$

Substituting these equations into Eq. (10) yields

$$\hat{\lambda} = \left( Q_y + (\hat{x}^T \otimes I_m)\, Q_A\, (\hat{x} \otimes I_m) \right)^{-1} (y - A\hat{x}) \qquad (14)$$

or

$$\hat{\lambda} = \tilde{Q}_y^{-1}(y - A\hat{x}) \qquad (15)$$

where

$$\tilde{Q}_y = Q_y + (\hat{x}^T \otimes I_m)\, Q_A\, (\hat{x} \otimes I_m) \qquad (16)$$

is the covariance matrix of the predicted observations $\tilde{y} = y - \tilde{E}_A \hat{x}$. Substitution of Eq. (15) into Eq. (11) yields

$$A^T \tilde{Q}_y^{-1} A \hat{x} - \tilde{E}_A^T \tilde{Q}_y^{-1} A \hat{x} = A^T \tilde{Q}_y^{-1} y - \tilde{E}_A^T \tilde{Q}_y^{-1} y \qquad (17)$$

After a few simple mathematical operations, the estimated unknown vector follows directly from the preceding equation as

$$\hat{x} = \left( (A - \tilde{E}_A)^T \tilde{Q}_y^{-1} A \right)^{-1} (A - \tilde{E}_A)^T \tilde{Q}_y^{-1}\, y \qquad (18)$$

This formulation is indeed a different appearance of (but equivalent to) the Tong et al. (2011) and Mahboub (2012) formulations. It looks similar to the standard least squares formulation; however, the so-called normal matrix is not in general symmetric and positive-definite, and hence it is not possible to directly derive the covariance matrix $Q_{\hat{x}}$ of the estimated parameters. To obtain a symmetric positive-definite normal matrix, one substitutes $A = \tilde{A} + \tilde{E}_A$ (with $\tilde{A} = A - \tilde{E}_A$) into Eq. (18); after a few simple mathematical operations, the least squares estimate $\hat{x}$ satisfies

$$(A - \tilde{E}_A)^T \tilde{Q}_y^{-1} (A - \tilde{E}_A)\, \hat{x} = (A - \tilde{E}_A)^T \tilde{Q}_y^{-1} (y - \tilde{E}_A \hat{x}) \qquad (19)$$


or

$$\hat{x} = \left( (A - \tilde{E}_A)^T \tilde{Q}_y^{-1} (A - \tilde{E}_A) \right)^{-1} (A - \tilde{E}_A)^T \tilde{Q}_y^{-1} (y - \tilde{E}_A \hat{x}) \qquad (20)$$

This is indeed the weighted total least-squares formulation, which resembles the standard least squares method: $\tilde{A} = A - \tilde{E}_A$ plays the role of the design matrix A, $\tilde{Q}_y = Q_y + (\hat{x}^T \otimes I_m)\, Q_A\, (\hat{x} \otimes I_m)$ plays the role of the covariance matrix $Q_y$, and $\tilde{y} = y - \tilde{E}_A \hat{x}$ plays the role of the observation vector y. Therefore, we deal with the predicted design matrix $\tilde{A}$, the predicted observation vector $\tilde{y}$, and the covariance matrix $\tilde{Q}_y$ of the predicted observations. Equation (20) is then written as

$$\hat{x} = \left( \tilde{A}^T \tilde{Q}_y^{-1} \tilde{A} \right)^{-1} \tilde{A}^T \tilde{Q}_y^{-1}\, \tilde{y} \qquad (21)$$

This formulation is similar to that of Shen et al. (2011). Our derivation uses the Lagrange multiplier structure in a linear form rather than the nonlinear Gauss-Newton formulation. The above formulation in effect rewrites Eq. (4) as $y - \tilde{E}_A\hat{x} = (A - \tilde{E}_A)\hat{x} + (y - A\hat{x})$, which is equivalent to $\tilde{y} = \tilde{A}\hat{x} + \tilde{e}$.

The preceding formulation of the WTLS allows one to directly apply the existing body of knowledge of standard least squares theory. For example, without any derivation one obtains the covariance matrix of the estimated parameters $\hat{x}$ as

$$Q_{\hat{x}} = \left( \tilde{A}^T \tilde{Q}_y^{-1} \tilde{A} \right)^{-1} \qquad (22)$$

from which the variances and covariances of the estimates can be derived. We note that the standard deviations of the TLS estimates using the Schaffrin and Wieser (2008) algorithm can be approximated by linear variance propagation and numerical computation of the required partial derivatives; the present formulation, however, gives a rigorous expression. We also note that, due to the nonlinear nature of the problem, the estimate $\hat{x}$ along with its covariance matrix $Q_{\hat{x}}$ is in general biased (see Teunissen 1990). The results presented in this contribution, however, do not show this bias to be significant (see Section 3).

In a similar manner, we may further explore the potential capability of the preceding formulation. For example, the estimated vectors of the observations and of the residuals in the model $\tilde{y} = \tilde{A}x + \tilde{e}$ read, respectively,

$$\hat{y} = P_{\tilde{A}}\,\tilde{y} = \tilde{A}\hat{x} = A\hat{x} - \tilde{E}_A\hat{x}, \qquad \tilde{e} = P_{\tilde{A}}^{\perp}\,\tilde{y} = \tilde{y} - \hat{y} = y - A\hat{x} = \tilde{Q}_y\hat{\lambda} \qquad (23)$$

where $P_{\tilde{A}} = \tilde{A}(\tilde{A}^T \tilde{Q}_y^{-1} \tilde{A})^{-1} \tilde{A}^T \tilde{Q}_y^{-1}$ and $P_{\tilde{A}}^{\perp} = I - \tilde{A}(\tilde{A}^T \tilde{Q}_y^{-1} \tilde{A})^{-1} \tilde{A}^T \tilde{Q}_y^{-1}$ are two orthogonal projectors. We highlight that $\tilde{e}$ is the so-called 'total residual' vector of the model, and hence different from $\tilde{e}_y$ in Eq. (12). In addition, the covariance matrices of the least-squares estimates $\hat{y}$ and $\tilde{e}$ are $Q_{\hat{y}} = P_{\tilde{A}}\tilde{Q}_y = \tilde{A}Q_{\hat{x}}\tilde{A}^T$ and $Q_{\tilde{e}} = P_{\tilde{A}}^{\perp}\tilde{Q}_y = \tilde{Q}_y - Q_{\hat{y}}$, respectively. Also, note that $P_{\tilde{A}}^{\perp} = R$ is the reliability matrix that contains the redundancy numbers on its main diagonal, i.e. $r_i = 1 - \left( \tilde{A}(\tilde{A}^T \tilde{Q}_y^{-1} \tilde{A})^{-1} \tilde{A}^T \tilde{Q}_y^{-1} \right)_{ii}$. The internal and external reliability measures along with the data snooping procedures can accordingly be established.

In analogy with the standard least squares, the variance component estimator of the unit weight is given as

$$\hat{\sigma}_0^2 = \frac{\tilde{e}^T \tilde{Q}_y^{-1} \tilde{e}}{m - n} \qquad (24)$$

The following expression (25) has already been rigorously proved by Schaffrin and Wieser (2008) and Mahboub (2012) as the estimate of the variance factor of the unit weight:

$$\hat{\sigma}_0^2 = \frac{\hat{\lambda}^T \tilde{Q}_y \hat{\lambda}}{m - n} \qquad (25)$$

Substitution for $\tilde{e}$ from Eq. (23) into Eq. (24) provides the estimate of the variance factor of the unit weight of Eq. (25) in a straightforward manner.

To calculate $\tilde{E}_A$, three strategies are recommended.

1. The first strategy is based on Eq. (13), in which one obtains $\tilde{e}_A = \mathrm{vec}\,\tilde{E}_A = -Q_A(\hat{x} \otimes I_m)\tilde{Q}_y^{-1}\tilde{e}$. Having $\tilde{e}_A$ available as an mn-vector, one can simply reshape it into an m × n matrix, i.e. apply an 'inverse' vec operator. In other words, $\tilde{E}_A = -\mathrm{vec}^{-1}(Q_A(\hat{x} \otimes I_m)\tilde{Q}_y^{-1}\tilde{e})$, where the operator $\mathrm{vec}^{-1}$ reshapes an mn-vector into an m × n matrix.

2. The second strategy is based on the following block structure of the $Q_A$ matrix:

$$Q_A = \begin{bmatrix} Q_{11} & \cdots & Q_{1n} \\ \vdots & \ddots & \vdots \\ Q_{n1} & \cdots & Q_{nn} \end{bmatrix} \qquad (26)$$

where $Q_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, n$, are sub-matrices of size m × m. $Q_A$ is then written as $Q_A = \sum_{i=1}^{n}\sum_{j=1}^{n} c_i c_j^T \otimes Q_{ij}$, where $c_i$ is the canonical unit vector having one at the i-th position and zeros elsewhere. Substitution in $\mathrm{vec}\,\tilde{E}_A = -Q_A(\hat{x} \otimes I_m)\hat{\lambda}$ yields

$$\mathrm{vec}\,\tilde{E}_A = -\sum_{i=1}^{n}\sum_{j=1}^{n} (c_i c_j^T \otimes Q_{ij})(\hat{x} \otimes I_m)\hat{\lambda} = -\sum_{i=1}^{n}\sum_{j=1}^{n} (c_i c_j^T \hat{x} \otimes Q_{ij})\,\mathrm{vec}(\hat{\lambda}) = -\mathrm{vec}\left( \sum_{i=1}^{n}\sum_{j=1}^{n} Q_{ij}\,\hat{\lambda}\,\hat{x}^T c_j c_i^T \right) \qquad (27)$$


which gives

$$\tilde{E}_A = -\sum_{i=1}^{n}\sum_{j=1}^{n} Q_{ij}\,\hat{\lambda}\,\hat{x}^T c_j c_i^T = -\sum_{i=1}^{n}\sum_{j=1}^{n} Q_{ij}\,\tilde{Q}_y^{-1}\tilde{e}\,\hat{x}^T c_j c_i^T \qquad (28)$$

3. The third strategy exploits a possible Kronecker structure $Q_A = Q_n \otimes Q_m$ (as a special case). In this case one can simply show that

$$\tilde{E}_A = -Q_m \tilde{Q}_y^{-1}\tilde{e}\,\hat{x}^T Q_n \qquad (29)$$
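The three strategies agree whenever $Q_A = Q_n \otimes Q_m$. A quick numerical check (our sketch; all matrices below are random and for illustration only) uses the identity $(Q_n \otimes Q_m)(\hat{x} \otimes I_m)\hat{\lambda} = \mathrm{vec}(Q_m \hat{\lambda}\hat{x}^T Q_n)$; note that vec stacks columns, hence the Fortran-order reshape:

```python
# Cross-check of strategies 1-3 for computing E_A when Q_A = Qn ⊗ Qm.
import numpy as np

rng = np.random.default_rng(2)
m, n = 6, 3
x = rng.normal(size=n)                       # stands in for x-hat
lam = rng.normal(size=m)                     # stands in for lambda-hat
Gn, Gm = rng.normal(size=(n, n)), rng.normal(size=(m, m))
Qn, Qm = Gn @ Gn.T, Gm @ Gm.T                # random SPD cofactor factors
QA = np.kron(Qn, Qm)                         # Q_A = Qn ⊗ Qm

# Strategy 1, Eq. (13): vec(E_A) = -Q_A (x ⊗ I_m) λ, then 'inverse vec'
eA = -QA @ np.kron(x[:, None], np.eye(m)) @ lam
EA1 = eA.reshape(m, n, order="F")            # column-stacking vec convention

# Strategy 2, Eq. (28): block-wise sum; here Q_ij = Qn[i, j] * Qm
EA2 = np.zeros((m, n))
for i in range(n):
    for j in range(n):
        ci, cj = np.eye(n)[:, i], np.eye(n)[:, j]
        EA2 += -(Qn[i, j] * Qm) @ np.outer(lam, x) @ np.outer(cj, ci)

# Strategy 3, Eq. (29): E_A = -Qm λ x^T Qn
EA3 = -Qm @ np.outer(lam, x) @ Qn
```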

When the proper structure of the covariance matrix $Q_A$ is introduced in the EIV model, the full potential capability of the WTLS can be exploited. Schaffrin and Wieser (2008) introduce the proper structure of the covariance matrix within the class $Q_A = Q_n \otimes Q_m$ in order to consider fairly general covariance matrices, where $Q_n$ has size n × n and $Q_m$ has size m × m. $Q_n$, and therefore $Q_A$, may be singular, but $Q_m$ must be nonsingular. Examples of these cofactor matrices are given in the next section. Mahboub (2012) explains how a more general description of the dispersion matrix of the coefficient matrix A can be implemented in $Q_A$. For example, if an element of A is repeated (with positive or negative sign), one may use identical variances for both elements and a correlation of +100% or -100% between the two elements; if an element of A is fixed, one may use zero for its corresponding variance. We thus highlight the proper use of the error propagation law in constructing $Q_A$. This is in conjunction with the classical weighted least squares, in which the proper weight matrix based on the correct covariance matrix leads to the best (minimum variance) estimators.

Because all matrices involved, along with the observation vector, are functions of the unknown vector $\hat{x}$, the final estimate must be sought in an iterative procedure. The iterative algorithm for estimating the unknown parameters is given as follows:

Step 1: (Initialize $\hat{x}$)

$$\hat{x}^{(0)} = (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} y \qquad (30)$$

Step 2: (Set $k = 0$ and repeat)

$$\tilde{e}^{(k)} = y - A\hat{x}^{(k)} \qquad (31)$$

$$\tilde{Q}_y^{-1(k)} = \left[ Q_y + (\hat{x}^{(k)T} \otimes I_m)\, Q_A\, (\hat{x}^{(k)} \otimes I_m) \right]^{-1} \qquad (32)$$

$$\tilde{E}_A^{(k)} = -\sum_{i=1}^{n}\sum_{j=1}^{n} Q_{ij}\,\tilde{Q}_y^{-1(k)}\tilde{e}^{(k)}\,\hat{x}^{(k)T} c_j c_i^T \qquad (33)$$

or, equivalently, reshape $\tilde{e}_A^{(k)} = -Q_A(\hat{x}^{(k)} \otimes I_m)\tilde{Q}_y^{-1(k)}\tilde{e}^{(k)}$ to $\tilde{E}_A^{(k)}$; or, as a special case, $\tilde{E}_A^{(k)} = -Q_m\tilde{Q}_y^{-1(k)}\tilde{e}^{(k)}\hat{x}^{(k)T} Q_n$.

$$\tilde{A}^{(k)} = A - \tilde{E}_A^{(k)}, \qquad \tilde{y}^{(k)} = y - \tilde{E}_A^{(k)}\hat{x}^{(k)} \qquad (34)$$

$$Q_{\hat{x}}^{(k+1)} = \left( \tilde{A}^{(k)T}\tilde{Q}_y^{-1(k)}\tilde{A}^{(k)} \right)^{-1} \qquad (35)$$

$$\hat{x}^{(k+1)} = Q_{\hat{x}}^{(k+1)}\,\tilde{A}^{(k)T}\tilde{Q}_y^{-1(k)}\,\tilde{y}^{(k)} \qquad (36)$$

$$k := k + 1$$

Step 3: (Check for convergence) Repeat Step 2 until

$$\left\| \hat{x}^{(k+1)} - \hat{x}^{(k)} \right\| < \epsilon \qquad (37)$$

where $\epsilon$ is a chosen threshold value for the convergence.
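The steps above can be sketched in a few lines of NumPy (our illustration, not the authors' code; the function name `wtls` is our own, and strategy 1, the vec reshape, is used for $\tilde{E}_A$). As a sanity check, with $Q_A = 0$ the EIV model collapses to the standard GMM, so the routine must return the ordinary weighted least-squares estimate:

```python
# Sketch of the iterative WTLS algorithm of Eqs. (30)-(37).
import numpy as np

def wtls(A, y, Qy, QA, eps=1e-12, max_iter=100):
    """WTLS estimate for y = (A - E_A) x + e_y, following Eqs. (30)-(37)."""
    m, n = A.shape
    W = np.linalg.inv(Qy)
    x = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)          # Eq. (30)
    for _ in range(max_iter):
        XI = np.kron(x[:, None], np.eye(m))                # x ⊗ I_m
        Wt = np.linalg.inv(Qy + XI.T @ QA @ XI)            # Eq. (32)
        e = y - A @ x                                      # Eq. (31)
        EA = (-QA @ XI @ Wt @ e).reshape(m, n, order="F")  # Eq. (33), strategy 1
        At, yt = A - EA, y - EA @ x                        # Eq. (34)
        Qx = np.linalg.inv(At.T @ Wt @ At)                 # Eq. (35)
        x_new = Qx @ At.T @ Wt @ yt                        # Eq. (36)
        if np.linalg.norm(x_new - x) < eps:                # Eq. (37)
            return x_new, Qx
        x = x_new
    return x, Qx

# Sanity check: Q_A = 0 reduces the EIV model to the standard GMM.
rng = np.random.default_rng(3)
m, n = 12, 2
A = np.column_stack([rng.normal(size=m), np.ones(m)])
y = A @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=m)
x_wtls, _ = wtls(A, y, np.eye(m), np.zeros((m * n, m * n)))
x_sls = np.linalg.solve(A.T @ A, A.T @ y)
```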

3. Numerical results and discussions

To verify the efficacy of the presented algorithm, two case studies are provided. Both examples have been widely used in many TLS research papers and are of particular interest in engineering surveying and geomatics applications. The first example is a linear regression model in which real and simulated data sets are used. The second example is a 2-D affine transformation for which simulated weighted datasets have been used. In both examples, the results are compared to the existing WTLS methods along with the results obtained using the nonlinear Gauss-Helmert model.

3.1. Linear regression model

Real data. The first example considers the problem of a linear regression model in which the variables u and v have been observed: $v_i - e_{v_i} = a(u_i - e_{u_i}) + b$. Errors in both variables are therefore involved. We use the data presented in Neri et al. (1989) and later used by Schaffrin and Wieser (2008) (see Table 1).

We aim to estimate the slope a and intercept b of the regression line using the presented WTLS algorithm. The precision of the estimates along with the correlation coefficient, i.e. $\rho = \sigma_{ab}/(\sigma_a \sigma_b)$, will also be provided. If we define the parameter vector as $x = [a, b]^T$, only the first column of the coefficient matrix in Eq. (4) has random errors, while the values in the second column are fixed. We compose the cofactor matrices as follows: $Q_y = (\mathrm{diag}([W_{v_1}, \ldots, W_{v_{10}}]))^{-1}$ and $Q_A = Q_2 \otimes Q_{10}$, where

$$Q_2 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad Q_{10} = (\mathrm{diag}([W_{u_1}, \ldots, W_{u_{10}}]))^{-1}$$

with 'diag' an operator that converts a vector into a diagonal matrix whose diagonal entries are the vector's elements.


Table 1. Observed points and corresponding weights according to Neri et al. (1989)

Point no.   v     u     Wv     Wu

1 5.9 0.0 1.0 1,000.0

2 5.4 0.9 1.8 1,000.0

3 4.4 1.8 4.0 500.0

4 4.6 2.6 8.0 800.0

5 3.5 3.3 20.0 200.0

6 3.7 4.4 20.0 80.0

7 2.8 5.2 70.0 60.0

8 2.8 6.1 70.0 20.0

9 2.4 6.5 100.0 1.8

10 1.5 7.4 500.0 1.0
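For concreteness, the computation behind this example can be sketched as follows (our code, not the authors'): the Table 1 data with $Q_A = Q_2 \otimes Q_{10}$, using the Kronecker special case of Eq. (29), for which $(\hat{x}^T \otimes I_m)Q_A(\hat{x} \otimes I_m) = (\hat{x}^T Q_2 \hat{x})\,Q_{10} = \hat{a}^2 Q_{10}$. With a tight threshold it should reproduce the exact solution and standard deviations reported below in Table 2.

```python
# WTLS for v = a*u + b on the Table 1 data, Kronecker case of Eq. (29)/(33).
import numpy as np

v  = np.array([5.9, 5.4, 4.4, 4.6, 3.5, 3.7, 2.8, 2.8, 2.4, 1.5])
u  = np.array([0.0, 0.9, 1.8, 2.6, 3.3, 4.4, 5.2, 6.1, 6.5, 7.4])
Wv = np.array([1.0, 1.8, 4.0, 8.0, 20.0, 20.0, 70.0, 70.0, 100.0, 500.0])
Wu = np.array([1000.0, 1000.0, 500.0, 800.0, 200.0, 80.0, 60.0, 20.0, 1.8, 1.0])

A, y = np.column_stack([u, np.ones(10)]), v            # x = [a, b]^T
Qy, Q10 = np.diag(1.0 / Wv), np.diag(1.0 / Wu)
Q2 = np.array([[1.0, 0.0], [0.0, 0.0]])                # only the u column is random

x = np.linalg.solve(A.T @ np.diag(Wv) @ A, A.T @ np.diag(Wv) @ y)  # Eq. (30)
for _ in range(100):
    Wt = np.linalg.inv(Qy + (x @ Q2 @ x) * Q10)        # Eq. (32): x^T Q2 x = a^2
    e = y - A @ x                                      # Eq. (31)
    EA = -Q10 @ np.outer(Wt @ e, x) @ Q2               # Eq. (29), Kronecker case
    At, yt = A - EA, y - EA @ x                        # Eq. (34)
    x_new = np.linalg.solve(At.T @ Wt @ At, At.T @ Wt @ yt)  # Eq. (36)
    done = np.linalg.norm(x_new - x) < 1e-12           # Eq. (37)
    x = x_new
    if done:
        break

# Rigorous precision: Q_x of Eq. (22) scaled by the variance factor of Eq. (24)
Wt = np.linalg.inv(Qy + (x @ Q2 @ x) * Q10)
e = y - A @ x
At = A - (-Q10 @ np.outer(Wt @ e, x) @ Q2)
Qx = np.linalg.inv(At.T @ Wt @ At)
s02 = (e @ Wt @ e) / (10 - 2)                          # Eq. (24)
sd = np.sqrt(s02 * np.diag(Qx))                        # sigma_a, sigma_b
```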

The threshold $\epsilon = 10^{-12}$ is chosen such that it allows a comparison of our WTLS to the exact solution of Neri et al. (1989) and the nonlinear Gauss-Helmert model. The algorithm used to solve the nonlinear Gauss-Helmert method is described by Vanicek and Krakiwsky (1986), pages 202-207. The method, originally proposed by Pope (1974), approximates the nonlinear model by its linear version and performs iterations. The algorithm presented in this contribution converges after eight iterations; starting from the same initial unknown parameters, 15 iterations are required for the nonlinear Gauss-Helmert model to meet the above threshold.

Estimated line parameters using different algorithms are presented in Table 2. As can be seen, both algorithms provide the 'exact solution' given by Neri et al. (1989). The same results have already been reported by Schaffrin and Wieser (2008) on the same data set. The rigorous standard deviations of the presented TLS algorithm along with the correlation coefficient are presented in Table 2 (use is made of $Q_{\hat{x}} = \hat{\sigma}_0^2 (\tilde{A}^T \tilde{Q}_y^{-1} \tilde{A})^{-1}$). The results of the standard deviations are identical to those obtained by the nonlinear Gauss-Helmert method.

The estimated vectors of the observations and of the (total) residuals, i.e. $\hat{y}$ and $\tilde{e}$ in Eq. (23), the value of the objective function $\tilde{e}^T \tilde{Q}_y^{-1} \tilde{e} = \tilde{e}_y^T Q_y^{-1} \tilde{e}_y + \tilde{e}_A^T Q_A^{-1} \tilde{e}_A$, and the estimated variance factor of the unit weight (Eq. 24) are provided in Table 3. In addition, this table includes the redundancy numbers of the 'equations'. We note that the redundancy numbers add up to the total redundancy of the linear regression model, namely $\sum_{i=1}^{10} r_i = 8$.

Simulated data. 50 points are simulated in the linear regression model. The covariance matrices of the observations and of the coefficient matrix are, respectively, $Q_y = 0.25\, I_{50}$ and $Q_A = Q_2 \otimes Q_{50}$, where $Q_{50} = 0.25\, I_{50}$. The line parameters are set to a = 1 and b = 10 in this example. The $u_i$ components are assumed to be $u_i = i$, $i = 1, \ldots, 50$, and the $v_i$ are calculated from the line parameters given above. Both components are corrupted by white Gaussian noise using the preceding covariance matrices. We present the results of 100,000 independent runs.

The results for the line parameters are given based on the algorithm presented in this contribution. The results are identical to those based on the algorithms presented in Schaffrin and Wieser (2008), Mahboub (2012), and the Gauss-Helmert method for all of the 100,000 runs. The same threshold $\epsilon = 10^{-12}$ was used for all compared methods, and the final results were identical. The algorithm presented in this contribution converges after 5.1 iterations on average, while, starting from the same initial unknown parameters, 9.0 iterations on average are required for the nonlinear Gauss-Helmert model to meet the above threshold, which verifies the faster convergence of our algorithm compared with GHM.

The histograms of the estimated parameters along with the standard deviations of the estimates are presented in Fig. 1. The standard deviations are directly obtained from $Q_{\hat{x}}$ of Eq. (22), and are identical to those provided by the nonlinear Gauss-Helmert method. The mean values of the estimated parameters are $\hat{b} = 9.9979948465$ and $\hat{a} = 1.0000627604$. The standard deviations of the 100,000 estimated b's and a's are $\sigma_b = 0.20344$ and $\sigma_a = 0.006916$. These standard deviations closely follow the average (over 100,000 runs) standard deviations obtained from $Q_{\hat{x}}$, i.e. $\sigma_b = 0.20301$ and $\sigma_a = 0.006928$.

3.2. Two-dimensional affine transformation

When a set of points is observed in two coordinate systems, the transformation parameters can be estimated in an EIV model using WTLS. The data for the planar linear affine transformation (six-parameter transformation) are simulated. The model is expressed as

$$\begin{bmatrix} u_t \\ v_t \end{bmatrix} = \begin{bmatrix} u_s & v_s & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & u_s & v_s & 1 \end{bmatrix} \begin{bmatrix} a_1 \\ b_1 \\ c_1 \\ a_2 \\ b_2 \\ c_2 \end{bmatrix} \qquad (38)$$

where the parameters $c_1$ and $c_2$ are the shifts along the u and v axes, respectively. The other parameters $a_1$, $a_2$, $b_1$, and $b_2$ are related to the four physical parameters of a 2-D linear transformation, which include two scales along the u and v axes, one rotation, and one non-perpendicularity (or affinity) parameter.

The coordinates of a series of points ($i = 1, \ldots, k$) are observed in both the start and the target systems. Equation (38) thus yields in total $m = 2k$ equations and six unknown parameters to be estimated. The observation vector y and the design matrix A are


Table 2. Estimated straight-line parameters along with their standard deviations and correlation coefficient using the data of Table 1: exact solution by Neri et al. (1989), nonlinear Gauss-Helmert model, and the present contribution (WTLS).

Parameter/std. dev.   Exact solution (Neri et al.)   Nonlinear Gauss-Helmert model   WTLS (this paper)
a                     -0.480533407                   -0.4805334074                   -0.4805334074
b                     5.47991022                     5.4799102240                    5.4799102240
σa                    -                              0.0706202698                    0.0706202698
σb                    -                              0.3592465226                    0.3592465226
ρab                   -                              -0.9630881337                   -0.9630881337

Table 3. Estimated least squares vectors ŷ and ẽ along with the value of the objective function and the variance factor of the unit weight. Also indicated are the redundancy numbers ri of the equations.

Point no.   ẽ [m]   ŷ [m]   ri

1 0.4200897760 5.4800072056 0.9130057041

2 0.3525698427 5.0475766394 0.8918851255

3 -0.2149500906 4.6145537457 0.8459521198

4 0.3694766353 4.2313745663 0.8075900023

5 -0.3941499795 3.8852539888 0.7125177500

6 0.3344367687 3.3838159332 0.8624392888

7 -0.1811365053 2.9426948374 0.6254534330

8 0.2513435614 2.6609974005 0.6342229529

9 0.0435569244 2.3968501980 0.8984788418

10 -0.4239630089 1.5036405369 0.8084547817

OF = 11.86635319 σ20 = 1.4832941492 Sum = 8
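In the standard least squares theory, the redundancy numbers ri of Table 3 are the diagonal entries of I − A(AᵀPA)⁻¹AᵀP, and their sum equals m − n (here 10 − 2 = 8). A minimal sketch of that computation follows; the abscissas and the unit weight matrix are illustrative stand-ins, not the Table 1 data:

```python
# Sketch: redundancy numbers r_i = diag(I - A (A^T P A)^{-1} A^T P)
# for a straight-line model y = a*x + b. Data and unit weights (P = I)
# are illustrative, not the Table 1 values.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def inv2(M):
    # inverse of a 2x2 matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

x = [0.0, 1.0, 2.0, 3.0, 4.0]               # illustrative abscissas
A = [[xi, 1.0] for xi in x]                  # design matrix of y = a*x + b
At = transpose(A)
N_inv = inv2(matmul(At, A))                  # (A^T A)^{-1}, since P = I
H = matmul(matmul(A, N_inv), At)             # hat (projection) matrix
r = [1.0 - H[i][i] for i in range(len(x))]   # redundancy numbers

print([round(ri, 4) for ri in r])
print(round(sum(r), 10))                     # equals m - n = 5 - 2 = 3
```

For this toy data set the redundancy numbers sum to 3, mirroring the Sum = 8 check of Table 3.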

\[
y = \begin{bmatrix} u_{t_1} \\ v_{t_1} \\ \vdots \\ u_{t_k} \\ v_{t_k} \end{bmatrix},
\qquad
A = \begin{bmatrix}
u_{s_1} & v_{s_1} & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & u_{s_1} & v_{s_1} & 1 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
u_{s_k} & v_{s_k} & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & u_{s_k} & v_{s_k} & 1
\end{bmatrix}
\tag{39}
\]
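As an illustration, y and A of Eq. (39) can be assembled point by point; the coordinate lists below are hypothetical:

```python
# Assemble the observation vector y and design matrix A of Eq. (39)
# for the 2-D affine transformation. Coordinates are illustrative.

def build_y_A(us, vs, ut, vt):
    """us, vs: start-system coordinates; ut, vt: target-system coordinates."""
    y, A = [], []
    for i in range(len(us)):
        y.append(ut[i])
        y.append(vt[i])
        A.append([us[i], vs[i], 1.0, 0.0, 0.0, 0.0])
        A.append([0.0, 0.0, 0.0, us[i], vs[i], 1.0])
    return y, A

us, vs = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]   # hypothetical start coordinates
ut, vt = [1.5, 2.5, 3.5], [4.5, 5.5, 6.5]   # hypothetical target coordinates
y, A = build_y_A(us, vs, ut, vt)
print(len(y), len(A), len(A[0]))             # m = 2k rows, six columns
```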

The vector e_y of observational noise and the matrix E_A of the coefficient matrix random noise can be defined with a structure identical to the preceding equations. We further assume that the measurement noise in both the start and target systems is independent Gaussian white noise with variances σ_s² and σ_t², respectively. The dispersion matrix of the coefficient matrix A reads

\[
Q_A = \begin{bmatrix}
Q_{11} & 0 & 0 & Q_{14} & 0 & 0 \\
0 & Q_{22} & 0 & 0 & Q_{25} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
Q_{41} & 0 & 0 & Q_{44} & 0 & 0 \\
0 & Q_{52} & 0 & 0 & Q_{55} & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\tag{40}
\]

where
\[
Q_{11} = Q_{22} = \sigma_s^2\, I_k \otimes \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},
\qquad
Q_{44} = Q_{55} = \sigma_s^2\, I_k \otimes \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix},
\]
\[
Q_{14} = Q_{25} = \sigma_s^2\, I_k \otimes \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},
\qquad
Q_{41} = Q_{52} = \sigma_s^2\, I_k \otimes \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}.
\]
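These Kronecker-structured blocks of Eq. (40) can be formed directly in code; a minimal sketch with a hand-rolled Kronecker product (k and σ_s² below are illustrative values, not those of the experiment):

```python
# Build the blocks Q11 = sigma_s^2 * I_k (x) [[1,0],[0,0]], etc., of Eq. (40).

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

k, sigma_s2 = 3, 0.01                      # illustrative values
E11 = [[1.0, 0.0], [0.0, 0.0]]             # elementary 2x2 patterns
E22 = [[0.0, 0.0], [0.0, 1.0]]
E12 = [[0.0, 1.0], [0.0, 0.0]]
E21 = [[0.0, 0.0], [1.0, 0.0]]

Ik = eye(k)
Q11 = [[sigma_s2 * x for x in row] for row in kron(Ik, E11)]   # = Q22
Q44 = [[sigma_s2 * x for x in row] for row in kron(Ik, E22)]   # = Q55
Q14 = [[sigma_s2 * x for x in row] for row in kron(Ik, E12)]   # = Q25
Q41 = [[sigma_s2 * x for x in row] for row in kron(Ik, E21)]   # = Q52

print(len(Q11), Q11[0][0], Q14[0][1])      # 2k x 2k blocks
```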



Figure 1. Histograms of the estimated intercept (top-left) and estimated slope (top-right) of the linear regression model from 100,000 independent simulated data sets. The frames at the bottom show the standard deviations of the intercept (bottom-left) and slope (bottom-right).

We note that in the second application of Tong et al. (2011), in which they applied a WTLS algorithm to an affine transformation, the above-mentioned structure of the covariance matrix is not used.

One method that can be used to solve this problem is the multivariate WTLS approach proposed by Schaffrin and Wieser (2009). This is accomplished by changing Eq. (39) into a multivariate model. The algorithm is restricted to the class of covariance matrices QA = Qn ⊗ Qm. The alternative is the algorithm proposed by Mahboub (2012), in which a complete structure of the dispersion matrix can be used. Our results are identical to the results of these two algorithms and hence we will not repeat them in this contribution. We however make a comparison between our results and those obtained by the Gauss-Helmert method. Particular attention is paid to the standard deviations of the estimates obtained with these two methods.

Suppose that the coordinates of k = 20 points in the start system are transformed by the parameters a1 = 2, b1 = −1, c1 = 0, a2 = −1, b2 = 2 and c2 = 0 into the coordinates in the target system. The errorless coordinates in the start and target systems are shown in Fig. 2. The coordinates of the points in the start and target systems are corrupted by white Gaussian noise with variances σ_s² = 0.01 and σ_t² = 0.02, respectively. This process has been repeated over 100,000 independent runs. For each simulated data set, the transformation parameters are estimated using the WTLS method proposed in this paper and the nonlinear Gauss-Helmert method. The same threshold ε = 10⁻¹² was used for both methods, and the final results were identical. The algorithm presented in this contribution converges after 5.8 iterations on average, while, starting from the same initial unknown parameters, on average 6.6 iterations are required for the nonlinear GHM to meet the above threshold.
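The simulation setup can be sketched as follows. Note that the estimator below is plain unweighted least squares applied to Eq. (39), used only as a stand-in to illustrate the data generation and rough parameter recovery; it ignores the errors in A and is not the WTLS or GHM iteration of the paper:

```python
# Monte Carlo sketch of the affine-transformation experiment:
# simulate noisy start/target coordinates, then estimate the six
# parameters. The estimator is ordinary LS on Eq. (39), NOT the
# paper's WTLS, so errors in A are ignored (illustration only).
import random

def solve(N, b):
    """Gaussian elimination with partial pivoting for N x = b."""
    n = len(N)
    M = [row[:] + [b[i]] for i, row in enumerate(N)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

random.seed(42)
a1, b1, c1, a2, b2, c2 = 2.0, -1.0, 0.0, -1.0, 2.0, 0.0   # true parameters
k, ss, st = 20, 0.01 ** 0.5, 0.02 ** 0.5                   # noise std devs

us0 = [random.uniform(0, 10) for _ in range(k)]            # errorless start coords
vs0 = [random.uniform(0, 10) for _ in range(k)]
us = [u + random.gauss(0, ss) for u in us0]                # observed (noisy) coords
vs = [v + random.gauss(0, ss) for v in vs0]
ut = [a1 * u + b1 * v + c1 + random.gauss(0, st) for u, v in zip(us0, vs0)]
vt = [a2 * u + b2 * v + c2 + random.gauss(0, st) for u, v in zip(us0, vs0)]

A, y = [], []
for i in range(k):
    A.append([us[i], vs[i], 1.0, 0.0, 0.0, 0.0]); y.append(ut[i])
    A.append([0.0, 0.0, 0.0, us[i], vs[i], 1.0]); y.append(vt[i])

AtA = [[sum(A[r][i] * A[r][j] for r in range(2 * k)) for j in range(6)] for i in range(6)]
Aty = [sum(A[r][i] * y[r] for r in range(2 * k)) for i in range(6)]
est = solve(AtA, Aty)
print([round(p, 3) for p in est])   # close to [2, -1, 0, -1, 2, 0]
```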

Figure 2. Coordinates of 20 points in the start and target systems.

The histograms of the estimated parameters are given in Fig. 3. The average (over 100,000 runs) values of the estimated parameters are a1 = 2.0000099697, b1 = −1.0000109115, c1 = 0.0000934864, a2 = −0.9999816046, b2 = 2.0000040795, and c2 = −0.0012505003. The histograms of the standard deviations of the estimates are presented in Fig. 4. The mean values (over 100,000 runs) are σa1 = 0.0041833081, σb1 = 0.0041833307, σc1 = 0.3016628024, σa2 = 0.0041832793, σb2 = 0.0041833108, and σc2 = 0.3016610071. The standard deviations are directly obtained from Qx̂ of Eq. (22), and are identical to those provided by the nonlinear Gauss-Helmert method. These standard deviations closely follow the empirical standard deviations of the 100,000 estimates of a1, b1, c1, a2, b2, and c2, which are σa1 = 0.0041879403, σb1 = 0.0041813018, σc1 = 0.3017729384, σa2 = 0.0041902274, σb2 = 0.0041889952, and σc2 = 0.3021218549, respectively. These results indicate that the bias explained by Shen et al. (2011) is not significant for the two applications considered in the present contribution.

In the sequel, we consider the performance of the presented method in the case of fully populated covariance matrices of an affine transformation. The fully populated covariance matrices of the coordinates of the points in the start and target systems, i.e. Qusvs and Qutvt, are constructed using the MATLAB built-in function randn as (randn(100, 2*k))ᵀ × randn(100, 2*k). The noise is constructed accordingly. In this case, the dispersion matrix of the coefficient matrix A reads

\[
Q_A = \begin{bmatrix} Q_1 \\ Q_2 \\ Q_3 \\ Q_4 \\ Q_5 \\ Q_6 \end{bmatrix}
Q_{u_s v_s}
\begin{bmatrix} Q_1 \\ Q_2 \\ Q_3 \\ Q_4 \\ Q_5 \\ Q_6 \end{bmatrix}^T
\tag{41}
\]

where
\[
Q_1 = I_k \otimes \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},
\quad
Q_2 = I_k \otimes \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},
\quad
Q_3 = I_k \otimes \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix},
\]
\[
Q_4 = I_k \otimes \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},
\quad
Q_5 = I_k \otimes \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix},
\quad
Q_6 = I_k \otimes \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
\]
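Assembling this fully populated Q_A amounts to stacking the selection blocks Q_1, ..., Q_6 of Eq. (41) and sandwiching Q_usvs between the stack and its transpose. A sketch follows; a small random positive semi-definite matrix stands in for the MATLAB randn construction described above, and k is kept small for illustration:

```python
# Assemble Q_A = S * Q_usvs * S^T of Eq. (41), where S stacks Q1..Q6.
import random

def kron(A, B):
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(X, Y):
    return [[sum(X[i][r] * Y[r][j] for r in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

k = 2                                               # illustrative point count
Ik = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
pats = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [0, 0]],
        [[0, 0], [1, 0]], [[0, 0], [0, 1]], [[0, 0], [0, 0]]]
S = [row for p in pats for row in kron(Ik, p)]      # (6*2k) x 2k stack of Q1..Q6

random.seed(1)
G = [[random.gauss(0, 1) for _ in range(2 * k)] for _ in range(10)]
Qusvs = matmul(transpose(G), G)                     # small SPD stand-in matrix

QA = matmul(matmul(S, Qusvs), transpose(S))         # Eq. (41)
print(len(QA), len(QA[0]))                          # (6*2k) x (6*2k)
```

The zero blocks Q_3 and Q_6 produce zero rows and columns in Q_A, reflecting the noise-free constant columns of A.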

The same threshold ε = 10⁻¹² was used for both methods, and the coordinates of 20 points in the start system are transformed by the parameters a1 = 2, b1 = −1, c1 = 0, a2 = −1, b2 = 2 and c2 = 0 into the coordinates in the target system (as explained before). Over 10,000 independent runs, on average, our algorithm converges after 30.4 iterations, while, starting from the same initial unknown parameters, 40.6 iterations are required for the nonlinear GHM to meet the above threshold. Both algorithms provide identical results for all independent runs, which demonstrates the performance of the proposed method in the case of fully populated covariance matrices.

4. Conclusions and outlook

In this contribution we showed that the WTLS problem is an extension of the WLS problem. A new WTLS algorithm was formulated, based on the well-known theory of the standard least squares problem. The WTLS problem uses the complete description of the covariance matrices of the observations, both of the observation vector and of the coefficient matrix, in which a proper propagation law of the errors is applied. The efficacy of the proposed WTLS algorithm was demonstrated by solving two commonly used WTLS problems in the geodetic literature, namely a linear regression model and a 2-D affine transformation, using real and simulated data. The results were shown to be identical to those provided by the nonlinear GHM.

Numerical results on all real and simulated data, including those with fully populated covariance matrices, showed a faster convergence rate of our algorithm than of the GHM. We also note that the solution of the nonlinear GHM is somewhat critical to handle, because special attention needs to be paid to appropriate linearization and iteration of the model. The elegance of the WTLS algorithm lies in its simplicity, in the sense that it is formulated in the standard GMM and avoids immediate linearization, because the estimates are obtained with a linearly-structured iterative algorithm. We should however note that the WTLS algorithms require the covariance matrix QA, which is of (large) size mn × mn, whereas the GHM directly uses the covariance matrix of the functionally independent observations (usually smaller than mn × mn). We also note that constructing the covariance matrix QA is rather tricky in some cases.

Figure 3. Histograms of the estimated parameters of the two-dimensional affine transformation from 100,000 independent simulated data sets.

The proposed algorithm was shown to be simple in concept, easy to implement, and attractive and flexible in comparison with the standard least squares theory. The presented method can be used as an alternative to the existing WTLS methods for computing the exact solution. Using the standard least squares theory, the covariance matrix of the estimates can directly be obtained. This formulation simply led us to generalize the orthogonal projectors of the standard least squares, from which estimates for the total residuals and observations along with their covariance matrices were obtained. The variance of the unit weight was accordingly estimated. This formulation allows one to obtain the internal and external reliability and to apply data snooping procedures for the identification of outlying measurements. Further research is in progress on other kinds of WTLS problems, such as the constrained WTLS and variance component estimation for an EIV model.

Figure 4. Histograms of the standard deviations of the estimated parameters of the two-dimensional affine transformation from 100,000 independent simulated data sets.

Acknowledgements

The editorial board of the Journal of Geodetic Science and three anonymous reviewers are kindly acknowledged for their helpful comments.



References

Golub G. and Van Loan C., 1980, An analysis of the total least squares problem, SIAM J. Num. Anal., 17, 883-893.

Mahboub V., 2012, On structured weighted total least-squares for geodetic transformations, J. Geod., 359-367, DOI: 10.1007/s00190-011-0524-5.

Neitzel F., 2010, Generalization of total least-squares on example of unweighted and weighted 2D similarity transformation, J. Geod., 84, 751-762.

Neri F., Saitta G. and Chiofalo S., 1989, An accurate and straightforward approach to line regression analysis of error-affected experimental data, J. Phys. E: Sci. Instrum., 22, 215-217.

Pope A. J., 1972, Some pitfalls to be avoided in the iterative adjustment of nonlinear problems. In: Proceedings of the 38th Annual Meeting of the American Society of Photogrammetry (Washington, DC), 449-477.

Pope A. J., 1974, Two approaches to nonlinear least squares adjustments, Can. Surv., 28, 5, 663-669.

Schaffrin B., 2006, A note on constrained total least-squares estimation, Linear Alg. Appl., 417, 245-258.

Schaffrin B. and Felus Y., 2009, An algorithmic approach to the total least-squares problem with linear and quadratic constraints, Stud. Geophys. Geod., 53, 1-16.

Schaffrin B., Lee I., Felus Y. and Choi Y., 2006, Total least-squares for geodetic straight-line and plane adjustment, Boll. Geod. Sci. Aff., 65, 141-168.

Schaffrin B. and Wieser A., 2008, On weighted total least-squares adjustment for linear regression, J. Geod., 82, 7, 415-421.

Schaffrin B. and Wieser A., 2009, Empirical affine reference frame transformations by weighted multivariate TLS adjustment. In: Drewes H. (ed) International Association of Geodesy Symposia, vol 134, Geodetic reference frames (Springer, Berlin), 213-218.

Schaffrin B. and Wieser A., 2011, Total least-squares adjustment of condition equations, Stud. Geophys. Geod., 55, 529-536.

Shen Y., Li B. and Chen Y., 2011, An iterative solution of weighted total least-squares adjustment, J. Geod., 85, 229-238.

Teunissen P.J.G., 1990, Nonlinear least-squares, Manuscr. Geod., 15, 3, 137-150.

Teunissen P.J.G., 2000, Adjustment theory: an introduction, Delft University Press, Delft University of Technology, Series on Mathematical Geodesy and Positioning.

Tong X., Jin Y. and Li L., 2011, An improved weighted total least squares method with applications in linear fitting and coordinate transformation, J. Surv. Eng., 137, 4, 120-128.

Van Huffel S. and Vandewalle J., 1991, The total least-squares problem: computational aspects and analysis (SIAM, Philadelphia).

Vanicek P. and Krakiwsky E., 1986, Geodesy: The concepts (North-Holland, New York, NY).

Xu P., Liu J. and Shi C., 2012, Total least squares adjustment in partial errors-in-variables models: algorithm and statistical analysis, J. Geod., DOI: 10.1007/s00190-012-0552-9.

