Arab J Sci Eng (2014) 39:987–995 · DOI 10.1007/s13369-013-0684-0
RESEARCH ARTICLE - ELECTRICAL ENGINEERING
Incorporating Observation Quality Information into the Incremental LMS Adaptive Networks
Amir Rastegarnia · Azam Khalili
Received: 25 February 2012 / Accepted: 8 July 2013 / Published online: 4 September 2013
© King Fahd University of Petroleum and Minerals 2013
Abstract In this paper we investigate the effect of observation quality information (OQI) on the performance of a special class of adaptive networks known as the distributed incremental least-mean-square (DILMS) algorithm. To this aim we consider two different cases: (1) a homogeneous environment, where all the nodes have the same observation noise variance (ONV), and (2) an inhomogeneous environment, where different nodes have different ONVs. In the first case we show that, for the same steady-state error, the DILMS algorithm has a faster convergence rate than a non-cooperative scheme. In the second case, we first show that, regardless of what the ONVs are, the steady-state curves of the mean-square deviation, excess mean-square error and mean-square error (MSE) at each node are monotonically increasing functions of the step-size parameter. Then, to use the OQI, we reformulate the parameter estimation as a constrained optimization problem with the MSE criterion as the cost function and the ONVs as the constraints. Using the Robbins–Monro method to solve the resulting problem, a new algorithm (which we call the noise-constrained incremental LMS algorithm) is obtained that has a faster convergence rate than the existing incremental LMS algorithm. Simulation results are also provided to illustrate the performance of the proposed algorithm.
Keywords Adaptive networks · Distributed estimation ·Least mean-square (LMS) · Noise-constrained
A. Rastegarnia (B) · A. Khalili
Department of Electrical Engineering, University of Malayer, Malayer 65719-95863, Iran
e-mail: [email protected]

A. Khalili
e-mail: [email protected]
1 Introduction
In many applications we face a distributed estimation problem in which a set of nodes is deployed to estimate an unknown parameter of interest. This problem was first studied in the context of distributed control [1] and tracking [2], later in data fusion [3], and most recently in wireless sensor networks [4,5]. The estimation problem can be solved by either a centralized or a decentralized approach (see [6] and the references therein). In a centralized approach, measurements from all nodes are collected and processed by a fusion center. This scheme requires extensive communication between the nodes and the central processor, and these issues, along with geographical difficulties, limit the use of fusion-based solutions. The alternative is a decentralized solution in which the nodes rely solely on their local data and on interactions with their immediate neighbors. The amount of processing and communication is significantly reduced in this scheme [7].

Fig. 1 Different modes of cooperation: incremental (left) and diffusion (right)
In many applications, however, we need to perform the estimation task in a constantly changing environment where statistical information about the underlying processes of interest is not available. This motivates the development of distributed adaptive estimation schemes, also known as adaptive networks [8–17]. An adaptive network is a collection of adaptive nodes that observe space–time data and collaborate, according to some cooperation protocol, to estimate a parameter [8]. Using cooperative processing in conjunction with adaptive filtering per node enables the entire network (and also each individual node) to track not only the variations of the environment but also the topology of the network.
In comparison with other distributed approaches that rely on consensus-based techniques [22,23], adaptive networks avoid the need to iterate over the data and, even more importantly, do not require all nodes to converge to the same equilibrium (or consensus) state. Instead, both the temporal and spatial diversity of the data across the network are exploited to endow networks with learning and tracking abilities, and to permit nodes some level of individuality in assessing and evaluating the quality of their data [13].
Based on the mode of cooperation between the nodes in the network, distributed adaptive estimation algorithms can be categorized into incremental-based and diffusion-based algorithms (see Fig. 1). In the incremental mode, a cyclic path through the network is required, and nodes communicate only with their neighbors within this path. The incremental LMS, incremental RLS, incremental techniques based on the affine projection algorithm, parallel projections, and randomized incremental protocols are examples of incremental adaptive networks. In [13] the performance of fusion-based and network-based versions of spatial LMS and incremental LMS are analyzed and compared with each other. We have analyzed the performance of incremental adaptive networks implemented in finite-precision arithmetic in [14,15]. The performance of incremental adaptive networks in the presence of noisy links is analyzed in [16–18].
In diffusion-based adaptive networks, each node combines the estimates from its closest neighbors using some combination rule and then performs adaptation on this combined estimate. Finally, the new estimate is diffused into the network. Several diffusion-based algorithms have been proposed in the literature [19–22].
In this paper we investigate the effect of OQI on the performance of incremental adaptive networks. We consider two different cases: (1) a homogeneous environment, where the ONV is equal for all nodes of the network, and (2) an inhomogeneous environment, where different nodes have different observation noise variances. In the first case we show that the DILMS algorithm improves only the convergence rate in comparison with a non-cooperative scheme. In the second case we show that the steady-state curves of MSD, EMSE and MSE at each node are monotonically increasing functions of the step-size parameter. To use the OQI to enhance the performance of the DILMS algorithm, we use a Robbins–Monro algorithm [23,24] to minimize a mean-square error criterion subject to an ONV constraint and a penalty term necessary to guarantee uniqueness of the combined weight solution. The derived algorithm is a distributed version of the noise-constrained LMS algorithm [25] and, as the simulation results show, outperforms the existing incremental LMS algorithm [8,9] in convergence rate.
The remainder of this paper is organized as follows: in Sect. 2 we introduce the DILMS algorithm; in Sect. 3 we introduce the concept of observation quality information; in Sect. 4 we present our proposed algorithm; simulation results are given in Sect. 5; and finally, conclusions are drawn in Sect. 6.
Throughout the paper we use boldface notation for random quantities. The symbol (·)* represents conjugation for scalars and Hermitian transpose for matrices, 1_M denotes the M × 1 vector [1, 1, . . . , 1]^T, and ‖x‖²_Σ = x*Σx for a column vector x.
2 Background
2.1 The Estimation Problem and DILMS Algorithm
Consider a network with N nodes, collected in the set N = {1, 2, . . . , N}. At time i > 0, node k obtains a measurement d_k(i) ∈ C and a regression vector u_{k,i} ∈ C^{1×M}, which are time realizations of the zero-mean spatial data {d_k, u_k}. These quantities are related via
dk(i) = uk,iwo + vk(i) (1)
where v_k(i) is the observation noise term. In this paper we adopt the following assumptions on the statistical properties of the data:
Fig. 2 A schematic of DILMS algorithm
(A.1.) The regression data u_{k,i} are temporally and spatially independent and identically distributed (i.i.d.) circular white Gaussian random variables with zero mean and diagonal covariance matrix λI_M.
(A.2.) The noise signals v_k(i) are temporally and spatially i.i.d. circular white Gaussian random variables with zero mean and variances σ²_{v,k}.
(A.3.) The noise signals v_k(i) are independent of d_ℓ(j) and u_{ℓ,j} for all ℓ and j.
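For concreteness, data drawn according to model (1) under assumptions (A.1)–(A.2) can be sketched as follows. This is an illustrative Python/NumPy snippet of our own (real-valued data for simplicity; the function name and interface are not from the paper):

```python
import numpy as np

def generate_node_data(w_o, num_samples, noise_var, lam=1.0, seed=None):
    """Draw {d_k(i), u_{k,i}} pairs per data model (1): d = u w^o + v.

    Regressor rows are zero-mean Gaussian with covariance lam * I_M (A.1);
    the noise is zero-mean Gaussian with variance noise_var (A.2).
    """
    rng = np.random.default_rng(seed)
    M = len(w_o)
    U = rng.normal(scale=np.sqrt(lam), size=(num_samples, M))   # rows are u_{k,i}
    v = rng.normal(scale=np.sqrt(noise_var), size=num_samples)  # v_k(i)
    d = U @ w_o + v                                             # d_k(i)
    return d, U
```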
The network seeks the unknown vector w^o ∈ C^{M×1} that solves

arg min_w J(w),   J(w) ≜ Σ_{k=1}^{N} E{|d_k − u_k w|²}    (2)
where E{·} denotes statistical expectation. The optimal solution of (2) satisfies the normal equations [8–10]

R_du = R_u w^o    (3)

where

R_du = Σ_{k=1}^{N} E{d_k u_k*},   and   R_u = Σ_{k=1}^{N} E{u_k* u_k}    (4)
Since the optimization problem involves the decoupled cost functions J_k(w) = E{|d_k − u_k w|²}, incremental methods can be used to seek the solution in a distributed manner. In [8,9] the DILMS algorithm is proposed to address this problem; the calculated estimates are sequentially circulated from node to node. The update equations of DILMS are

ψ_k^{(i)} = ψ_{k−1}^{(i)} + μ_k u*_{k,i} (d_k(i) − u_{k,i} ψ_{k−1}^{(i)}),
ψ_1^{(i+1)} = ψ_N^{(i)}    (5)

where ψ_k^{(i)} denotes the local estimate of w^o at node k at time i. A schematic of the DILMS algorithm is shown in Fig. 2.
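As an illustration, one cycle of the update (5) can be sketched as below. This is our own minimal Python/NumPy rendering for real-valued data (so the conjugate u* reduces to u itself), with a common step-size μ across nodes:

```python
import numpy as np

def dilms_cycle(psi, node_data, mu):
    """One incremental pass of (5): psi_k = psi_{k-1} + mu * e_k * u_k.

    psi       : estimate circulating through the cycle (length-M vector)
    node_data : list of (d_k, u_k) pairs, one per node, in cycle order
    """
    for d_k, u_k in node_data:
        e_k = d_k - u_k @ psi        # local prediction error
        psi = psi + mu * e_k * u_k   # local LMS update; psi goes to the next node
    return psi                       # psi_N^(i) becomes psi_1^(i+1)

# Usage sketch: the circulated estimate converges toward w^o.
rng = np.random.default_rng(1)
M, N, w_o = 4, 10, np.full(4, 0.5)
psi = np.zeros(M)
for _ in range(300):                 # 300 cycles over the network
    data = [(u @ w_o + 0.1 * rng.normal(), u)
            for u in rng.normal(size=(N, M))]
    psi = dilms_cycle(psi, data, mu=0.05)
```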
3 Reliability of Measurement Issue
To study the effect of observation quality information on theperformance of DILMS algorithm we first note that the SNRat node k is defined as
SNRk = 10 log10
(E
∣∣uk,iwo∣∣2
σ 2v,k
)(6)
which using (A.1.) can be rewritten as
SNRk = 10 log10
(λ ‖wo‖2
σ 2v,k
)(7)
Thus, the SNR at node k (or equivalently, the observationquality) is inversely proportional to observation noise vari-ance σ 2
v,k . To proceed, we consider two different cases: (i) ahomogeneous environment where all of the nodes have sameobservation noise variance and (ii) an inhomogeneous envi-ronment, where different nodes have different observationnoise variances.
3.1 Case I: Homogeneous Environment
In this case all of the nodes have the same observation noise variance. According to assumption (A.1) and definition (7) we have

σ²_{v,k} = σ²_v,   k ∈ N    (8)
For this case we assert the following proposition:
Proposition 1 In a homogeneous environment with data model (1), the DILMS algorithm has a better convergence-rate performance than the non-cooperative scheme for the same steady-state error.
Proof See Appendix A. It must be noted that although in steady state the DILMS algorithm performs the same as a non-cooperative scheme, the DILMS algorithm has a faster convergence rate. To show this, we consider the M modes of convergence of the DILMS algorithm and of the non-cooperative scheme, which are given, respectively, by [9,26]

r_inc(μ) = |1 − μλ_ℓ|^N,   r_nc(μ) = |1 − μλ_ℓ|    (9)

where ℓ = 1, 2, . . . , M. Figure 3 shows both modes of convergence for the case R_u = I for the DILMS algorithm and the non-cooperative scheme. Note that for all step-sizes, the incremental algorithm has a faster convergence rate than the non-cooperative solution. □
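The comparison in (9) is easy to reproduce numerically. The following sketch (ours; R_u = I so that λ_ℓ = 1, and N = 20 nodes is an assumed network size) shows that the per-cycle mode of the incremental network lies below that of a non-cooperative node over the whole step-size range:

```python
import numpy as np

N = 20                               # assumed number of nodes in the cycle
mu = np.linspace(0.001, 0.1, 100)    # step-size range, as in Fig. 3

r_inc = np.abs(1.0 - mu) ** N        # DILMS: N updates per cycle, per (9)
r_nc = np.abs(1.0 - mu)              # non-cooperative: one update, per (9)

# A smaller mode means faster convergence; here r_inc < r_nc for every mu,
# matching the behavior shown in Fig. 3.
```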
3.2 Case II: Inhomogeneous Environment
Fig. 3 Modes of convergence for the DILMS algorithm and the non-cooperative scheme

Now, consider again a distributed network with N nodes running in an inhomogeneous environment. In applications where the sensors either have varying quality/resolution or are at different distances from the unknown target being monitored, the sensor measurement quality cannot be identical (inhomogeneous environment) [27]. For this case we assert the following proposition:
Proposition 2 Consider a distributed network with N nodes in an inhomogeneous environment with data model (1). Then, for μλ ≪ 1 and regardless of what the σ²_{v,k}'s are, the steady-state curves of MSD, EMSE and MSE at each node are monotonically increasing functions of μ.
Proof See Appendix B. □
Remark 1 Since the MSD, EMSE and MSE at each node are monotonically increasing functions of μ, there is no μ (in terms of λ and σ²_{v,k}) that minimizes the MSD, EMSE or MSE. Therefore, if the step-size at a node is large, the convergence rate of the DILMS algorithm will be fast, but the steady-state MSD, EMSE and MSE will increase, and vice versa. Thus, the step-size provides a trade-off between the convergence rate and the steady-state error.
Remark 2 As we have shown in [16,17], when the links between nodes are noisy, the MSD, EMSE and MSE at each node are not monotonically increasing functions of μ. Hence, Proposition 2 is valid only when the links between nodes are ideal.
It is important to note that although the optimal solution w^o in (3) does not depend on the observation noise variance σ²_{v,k}, this does not mean that a (partially) adaptive algorithm for estimating the optimum weight cannot exploit knowledge of σ²_{v,k}. In particular, away from the optimum, knowledge of the observation noise might be useful in selecting search directions in an adaptive network. In the sequel we show that knowledge of the observation noise variances can be used to improve the convergence rate of the DILMS algorithm.
4 Proposed Incremental Adaptive Network
4.1 Algorithm Derivation
According to (2), the optimal solution does not depend on the observation noise variances {σ²_{v,k}}, k ∈ N. However, away from the optimum, knowledge of {σ²_{v,k}}, k ∈ N, might be useful in selecting search directions and/or step-sizes in an adaptive network [25]. Thus, to exploit the observation quality information, we cast the following constrained optimization problem:

arg min_w Σ_{k=1}^{N} E{|d_k − u_k w|²}
subject to J_k(w) = σ²_{v,k}, ∀k ∈ N    (10)
To obtain a distributed solution, we first recast (10) in terms of the decomposed cost functions J_k(w) as

arg min_w E{|d_k − u_k w|²}
subject to J_k(w) = σ²_{v,k}    (11)
Now, using the Lagrangian approach we have

J_{1,k}(w, θ_k) = J_k(w) + θ_k (J_k(w) − σ²_{v,k})    (12)
The critical values of J_{1,k}(w, θ_k) are {(w, θ_k) : w = w^o} for all θ_k ∈ R. Since θ_k is not unique (or even bounded), this may cause problems for an adaptive algorithm [25]. To avoid this, we subtract a term γθ²_k (where γ > 0) from J_{1,k}(w, θ_k) to get the augmented Lagrangian

J_{2,k}(w, θ_k) = J_k(w) + γθ_k (J_k(w) − σ²_{v,k}) − γθ²_k    (13)
Observe that we have also scaled the constraint term by γ in (13). The critical point of J_{2,k}(w, θ_k) is (w, θ_k) = (w^o, 0), which is a saddle point.¹ This implies that we have to perform a stochastic root finding to obtain (w^o, 0), which is the unique solution of ∇J_{2,k}(w, θ_k) = 0. To this aim we use the Robbins–Monro algorithm given in [23,24]. Let the instantaneous cost functions Ĵ(w_k, d_k(i), u_{k,i}) and Ĵ_{2,k}(w_k, θ_k, d_k(i), u_{k,i}) be defined as follows:

Ĵ(w_k, d_k(i), u_{k,i}) = (d_k(i) − u_{k,i} w_k)²    (14)

Ĵ_{2,k}(w_k, θ_k, d_k(i), u_{k,i}) = Ĵ(w_k, d_k(i), u_{k,i}) + γθ_k (Ĵ(w_k, d_k(i), u_{k,i}) − σ²_{v,k})    (15)
Applying the stochastic root-finding Robbins–Monro algorithm to determine (w^o, 0) results in

¹ The Hessian of J_{2,k}(w, θ_k) at (w^o, 0) is H = diag{R, −2γ}, where R = Σ_{k=1}^{N} R_{u,k}. Since the determinant of the Hessian matrix H is negative, (w, θ_k) = (w^o, 0) is a saddle point.
w_k = w_{k−1} − α ∇_w Ĵ_{2,k}(w_k, θ_k, d_k(i), u_{k,i})    (16)

θ_k = θ_{k−1} + β ∇_θ Ĵ_{2,k}(w_k, θ_k, d_k(i), u_{k,i})    (17)

where α and β are positive step-sizes. Using the definition of Ĵ_{2,k}(w_k, θ_k, d_k(i), u_{k,i}) in (16) and (17), we obtain

w_k = w_{k−1} + α_k u*_{k,i} e_k(i)    (18)

α_k = α(1 + γθ_k)    (19)

θ_k = θ_{k−1} + β( (1/2)(e²_k(i) − σ²_{v,k}) − θ_{k−1} )    (20)
where e_k(i) = d_k(i) − u_{k,i} w_{k−1}. Let us define a cycle visiting every node over the network topology only once, such that each node has access only to its immediate neighbor node in this cycle. To obtain a distributed (incremental) version of (18)–(20), we define ψ_k^{(i)} as the local estimate of w^o at node k and time i. Thus we have

ψ_k^{(i)} = ψ_{k−1}^{(i)} + α_k(i) u*_{k,i} e_k(i)    (21)

α_k(i) = α(1 + γθ_k(i))    (22)

θ_k(i) = θ_k(i − 1) + β( (1/2)(e²_k(i) − σ²_{v,k}) − θ_k(i − 1) )    (23)
Note that, to reduce the communication complexity, we allow collaboration only for the estimates ψ_k^{(i)}, while the θ_k evolve locally at each node, independently of the neighbor nodes. The pseudo-code of the proposed algorithm is shown in the sequel.
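A minimal sketch of one cycle of the proposed updates (21)–(23) for real-valued data might look as follows. This is our own Python/NumPy rendering (variable names are ours); the θ_k live in a per-node array that is never exchanged between nodes, as described above:

```python
import numpy as np

def nc_ilms_cycle(psi, theta, node_data, noise_vars, alpha, beta, gamma):
    """One incremental pass of the noise-constrained updates (21)-(23).

    psi        : circulated estimate (length-M vector)
    theta      : per-node multiplier estimates theta_k, updated locally
    node_data  : list of (d_k, u_k) pairs in cycle order
    noise_vars : known observation noise variances sigma^2_{v,k}
    """
    for k, (d_k, u_k) in enumerate(node_data):
        e_k = d_k - u_k @ psi
        alpha_k = alpha * (1.0 + gamma * theta[k])                        # (22)
        psi = psi + alpha_k * e_k * u_k                                   # (21)
        theta[k] += beta * (0.5 * (e_k ** 2 - noise_vars[k]) - theta[k])  # (23)
    return psi, theta
```

Far from the optimum, e²_k(i) exceeds σ²_{v,k}, so θ_k grows and the effective step-size α_k(i) is inflated; near convergence, e²_k(i) ≈ σ²_{v,k}, θ_k decays toward zero, and α_k(i) falls back to α.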
4.2 Comparison of Computational and Communication Costs

In this section, we compare the computational and communication costs of the DILMS and the proposed scheme, in the same style as the complexity formulas presented in [11]. Table 1 shows the estimated number of real multiplications, real additions and real divisions required per node per time instant for DILMS and the proposed algorithm in the case of real-valued data. For the complex-data case, a complex addition involves two real additions, and we assume that a complex multiplication is computed with four real multiplications and two real additions. The resulting estimated number of operations for complex-valued data is also shown in Table 1. With respect to the communication of parameters between nodes, both algorithms (DILMS and proposed) require O(M) transmission complexity, since only the tap vector needs to be transferred between nodes.

Table 1 Comparison of the estimated computational cost per iteration per node for different incremental algorithms for the case of real- and complex-valued data

                 Method     ×       +       ÷
Real data        DILMS      2M+1    2M      −
                 Proposed   2M+7    2M+3    −
Complex data     DILMS      8M+2    8M      −
                 Proposed   8M+8    8M+3    −
5 Simulation Results
5.1 Validation of Propositions
In this section we present simulation results to validate the given propositions, considering different simulation cases. We use data model (1) in all simulations. In all cases we assume a distributed network with N = 20 nodes and choose M = 5, w^o = 1_M/√M, and μ = 0.01.
We examine the network performance through the global average MSD, global average EMSE and global average MSE, which are defined, respectively, as

η_g(i) = (1/N) Σ_{k=1}^{N} E{ ‖w^o − ψ_{k−1}^{(i)}‖² }    (24)

ζ_g(i) = (1/N) Σ_{k=1}^{N} E{ |u_{k,i} (w^o − ψ_{k−1}^{(i)})|² }    (25)

ξ_g(i) = ζ_g(i) + σ²_{v,k}    (26)
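For a single simulation run, the instantaneous network averages behind (24)–(26) can be estimated as in the sketch below (our own illustration; the expectations in (24)–(25) are then approximated by averaging these quantities over independent runs):

```python
import numpy as np

def global_metrics(w_o, psi_prev, U_rows, noise_var):
    """Instantaneous network-average MSD (24), EMSE (25) and MSE (26).

    psi_prev : N x M array of the estimates psi_{k-1}^{(i)} at each node
    U_rows   : N x M array whose k-th row is the regressor u_{k,i}
    """
    err = w_o - psi_prev                                  # w^o - psi_{k-1}^{(i)}
    msd = np.mean(np.sum(err ** 2, axis=1))               # (24)
    emse = np.mean(np.sum(U_rows * err, axis=1) ** 2)     # (25)
    mse = emse + noise_var                                # (26)
    return msd, emse, mse
```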
5.1.1 Case I: Homogeneous Environment
In this case we assume that the regressor data arise from independent Gaussian distributions with covariance matrices R_{u,k} = 3I_M, and σ²_{v,k} = 0.01. The learning curves for the global average MSE and EMSE over different nodes are shown in Fig. 4. As is clear from Fig. 4, in a homogeneous environment, for the same steady-state error performance, the DILMS algorithm has a faster convergence rate than the non-cooperative scheme.

Fig. 4 The learning curves for average MSE and EMSE over different nodes
5.1.2 Case II: Inhomogeneous Environment
In this case we examine the validity of Proposition 2. We assume the same setup as in Case I, with the exception that σ²_{v,k} ∈ (0, 0.1) (see Fig. 5). In Fig. 6 we show the steady-state values of MSD and EMSE for node k = 10 in terms of μ. As is clear from Fig. 6, regardless of what the σ²_{v,k}'s are, the steady-state curves of MSD, EMSE and MSE at each node are monotonically increasing functions of μ.
5.2 Validation of Proposed Algorithm
Consider a network with N = 20 nodes seeking two unknown filters of length M = 4, whose z-domain transfer functions are
Fig. 5 The σ²_{v,k} for every node k
Fig. 6 The steady-state values of MSD and EMSE for node k = 10 in terms of μ
Fig. 7 The global average MSD, global average EMSE and global average MSE
w1(z) = (1/2) Σ_{n=0}^{M−1} z^{−n},   and   w2(z) = (1/4) Σ_{n=0}^{M−1} z^{−n},
where w^o corresponds to w1 for i ≤ 250 and to w2 for 250 < i ≤ 500. We assume that the input at each node (the regressor data) arises from independent Gaussian distributions with eigenvalue spread ρ = 5. We also model the observation noise term at each node as white, zero-mean, uncorrelated additive noise with variance σ²_{v,k} = 0.02. We select μ = 0.002 for DILMS, and α = 0.0018, β = 0.1 and γ = 25 for the proposed algorithm. The parameters were selected so that the DILMS algorithm and the proposed method have essentially the same steady-state error.
The curves are obtained by averaging over 100 independent runs. The global average MSD, EMSE and MSE over 500 iterations (i.e., cycles) are given in Fig. 7. From the figure we can see that the convergence rate of the proposed algorithm is faster than that of the DILMS method, at the cost of a small increase in computational complexity. The same conclusion can be drawn from Fig. 8, where the tracking behavior at node k = 1 is shown. It is evident from Fig. 8 that the proposed algorithm offers better tracking performance than the DILMS algorithm. The proposed algorithm is a type of variable step-size LMS algorithm in which the step-size rule arises naturally from the constraints. As is clear from Fig. 9, the proposed algorithm assigns a larger step-size when the estimate is far from the optimum and a smaller step-size as it approaches the optimum.
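In the tap domain, the two targets and the switching rule of this experiment amount to the following (our own sketch; with M = 4, the transfer functions above correspond to constant tap vectors):

```python
import numpy as np

M = 4
w1 = 0.5 * np.ones(M)    # w1(z) = (1/2) * sum_{n=0}^{M-1} z^{-n}
w2 = 0.25 * np.ones(M)   # w2(z) = (1/4) * sum_{n=0}^{M-1} z^{-n}

def w_o(i):
    """The time-varying target: w1 for i <= 250, w2 for 250 < i <= 500."""
    return w1 if i <= 250 else w2
```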
Fig. 8 The ‖ψ_1^{(i)}‖² at node k = 1, for the DILMS and proposed algorithms
Fig. 9 The step-size parameter α_k(i) for different nodes
6 Conclusions and Future Work
Incremental adaptive networks have been introduced in the literature to solve distributed estimation problems in a cooperative and adaptive manner. Although they offer excellent estimation performance, the network processing has to be faster than the measurement process. Thus, to improve their convergence rate, in this paper we proposed an incremental least-mean squares (LMS) adaptive network that exploits knowledge of the observation quality (in terms of the observation noise variance). Specifically, we used a Robbins–Monro algorithm to minimize a mean-square error criterion subject to an observation noise variance constraint. The results developed in this paper lead to good performance allied with low communication and computational requirements. Simulation results show that the proposed algorithm outperforms the existing incremental LMS algorithm in convergence rate.
Appendix
Appendix A: Proof of Proposition 1
In [9] the mean-square performance of the DILMS algorithm is studied using energy conservation arguments. The analysis relies on data model (1) and the following assumptions:
1. {u_{k,i}} are spatially and temporally independent.
2. The regressors {u_{k,i}} arise from a circular Gaussian distribution with covariance matrix R_{u,k}.
It is shown in [9] that the MSD and EMSE are given by

η_k = a_k (I − Π_{k,1})^{−1} 1_M    (27)

ζ_k = a_k (I − Π_{k,1})^{−1} λ_k    (28)

where λ_k = diag{Λ_k} is a column vector containing the diagonal entries of Λ_k, and

a_k ≜ g_k Π_{k,2} + g_{k+1} Π_{k,3} + · · · + g_{k−2} Π_{k,N} + g_{k−1}    (29)

Π_{k,ℓ} ≜ F_{k+ℓ−1} · · · F_N F_1 · · · F_{k−1},   ℓ = 1, . . . , N    (30)

F_k = I − 2μ_k Λ_k + 2μ²_k Λ²_k + μ²_k λ_k λ_k^T    (31)

g_k ≜ μ²_k σ²_{v,k} λ_k^T    (32)
In (30) the subscripts are all taken mod N. For small step-sizes we can ignore the μ² terms in (31); thus the following approximation can be used:

F_k ≈ I − 2μ_k Λ_k    (33)

Hence F_k is a diagonal matrix. As a result, the matrix Π_{k,ℓ} = Π = F_1 F_2 · · · F_N is diagonal as well. Using a Taylor series expansion and some calculations, we obtain the following expression for the MSD:
η_k ≈ ( Σ_{k=1}^{N} μ²_k σ²_{v,k} λ_k^T ) ( Σ_{k=1}^{N} 2μ_k Λ_k )^{−1} 1_M    (34)
Applying the assumptions of Sect. 3 to (34), we get

η_k = (μσ²_v / 2) ( Σ_{k=1}^{N} λ_k^T ) ( Σ_{k=1}^{N} Λ_k )^{−1} 1_M    (35)
The first parenthesis in (35) is a 1 × M vector b whose j-th element b_j is given by

b_j = Σ_{i=1}^{N} λ_{i,j}    (36)
The second parenthesis in (35) involves the M × M diagonal matrix

B = diag{b_1, b_2, . . . , b_M}    (37)

Now (Λ_1 + · · · + Λ_N)^{−1} 1_M can be calculated as

B^{−1} 1_M = col{b_1^{−1}, b_2^{−1}, . . . , b_M^{−1}}    (38)
Substituting (36) and (38) into (35), since b^T B^{−1} 1_M = M, we obtain

η_k = μσ²_v M / 2    (39)

which is the steady-state MSD of a non-cooperative scheme [12].
Appendix B: Proof of Proposition 2
We prove Proposition 2 for the special case N = 2; it is easy to extend to the N > 2 case. To model an inhomogeneous environment we assume that, for k, ℓ ∈ N with k ≠ ℓ,

R_{u,k} = λI_M,   σ²_{v,k} ≠ σ²_{v,ℓ}    (40)
Applying (40) to the MSD expression (34) (with N = 2) yields

η_k(μ_1, μ_2) = (M/2) · (μ²_1 σ²_{v,1} + μ²_2 σ²_{v,2}) / (μ_1 + μ_2),   k = 1, 2    (41)
Computing ∇η_k(μ_1, μ_2) = [∂η_k/∂μ_1, ∂η_k/∂μ_2]^T = 0, and ignoring the M/2 factor and the common positive denominator (μ_1 + μ_2)² in (41), we have

∂η_k/∂μ_1 = μ²_1 σ²_{v,1} + 2μ_1 μ_2 σ²_{v,1} − μ²_2 σ²_{v,2}    (42)

∂η_k/∂μ_2 = −μ²_1 σ²_{v,1} + 2μ_1 μ_2 σ²_{v,2} + μ²_2 σ²_{v,2}    (43)
Adding (42) and (43) gives

2μ_1 μ_2 (σ²_{v,1} + σ²_{v,2}) = 0    (44)
Thus the critical points of η_k(μ_1, μ_2) are

(μ_1, μ_2) = (0, 0),   (μ_1, μ_2) = (c, 0),   (μ_1, μ_2) = (0, c)    (45)

where c ∈ R. Since we must have μ_1 > 0 and μ_2 > 0, none of the above is admissible. Moreover, the Hessian matrix of η_k is given by
H = [ 2μ_1 σ²_{v,1} + 2μ_2 σ²_{v,1}     2μ_1 σ²_{v,1} − 2μ_2 σ²_{v,2} ;
     −2μ_1 σ²_{v,1} + 2μ_2 σ²_{v,2}     2μ_1 σ²_{v,2} + 2μ_2 σ²_{v,2} ]    (46)
The determinant of H is given by

|H| = 4(μ²_1 σ⁴_{v,1} + μ²_1 σ²_{v,1} σ²_{v,2} + μ²_2 σ²_{v,1} σ²_{v,2} + μ²_2 σ⁴_{v,2})    (47)

Since det(H) > 0, we have H ≻ 0, and η_k(μ_1, μ_2) is a monotonically increasing function of {μ_1, μ_2}. The extension of the proof to N > 2 is straightforward.
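As an illustrative numerical cross-check of Proposition 2 (not part of the proof), the closed form (41) can be evaluated along a common step-size μ_1 = μ_2 = μ. With illustrative noise variances of our choosing, the resulting steady-state MSD is strictly increasing in μ:

```python
import numpy as np

def msd_two_nodes(mu1, mu2, s1=0.01, s2=0.05, M=5):
    """Steady-state MSD (41) for N = 2, with illustrative variances s1, s2."""
    return (M / 2.0) * (mu1 ** 2 * s1 + mu2 ** 2 * s2) / (mu1 + mu2)

mu = np.linspace(1e-3, 0.1, 200)
vals = msd_two_nodes(mu, mu)   # common step-size across both nodes
# Along mu1 = mu2 = mu, (41) reduces to (M/4)(s1 + s2) * mu, which is
# strictly increasing in mu, in line with Proposition 2.
```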
References
1. Castanon, D.A.; Teneketzis, D.: Distributed estimation algorithmsfor nonlinear systems. IEEE Trans. Autom. Control 30(5), 418–425(1985)
2. Willsky, A.S.; Bello, M.; Castanon, D.A.; Levy, B.C.; Verghese, G.: Combining and updating of local estimates and regional maps along sets of one-dimensional tracks. IEEE Trans. Autom. Control 27(4), 799–813 (1982)
3. Chair, Z.; Varshney, P.K.: Distributed Bayesian hypothesis testing with distributed data fusion. IEEE Trans. Syst. Man Cybern. 18(5), 695–699 (1988)
4. Mergen, G.; Tong, L.: Type based estimation over multiaccesschannels. IEEE Trans. Signal Process. 54(2), 613–626 (2006)
5. Ribeiro, A.; Giannakis, G.B.: Bandwidth-constrained distributedestimation for wireless sensor networks, Part I: Gaussian case.IEEE Trans. Signal Process. 54(3), 1131–1143 (2006)
6. Xiao, J.-J.; Ribeiro, A.; Luo, Z.-Q.; Giannakis, G.B.: Distributed compression-estimation using wireless sensor networks. IEEE Signal Process. Mag. 23, 27–41 (2006)
7. Estrin, D.; Pottie, G.; Srivastava, M.: Instrumenting the world with wireless sensor networks. In: Proceeding of IEEE International Conference Acoustics, Speech, Signal Processing (ICASSP), pp. 2033–2036. Salt Lake City, UT (2001)
8. Lopes, C.G.; Sayed, A.H.: Distributed processing over adaptivenetworks. In: Proceeding of Adaptive Sensor Array ProcessingWorkshop, MIT Lincoln Lab., Lexington, MA (2006)
9. Lopes, C.G.; Sayed, A.H.: Incremental adaptive strategies over distributed networks. IEEE Trans. Signal Process. 55(8), 4064–4077 (2007)
10. Sayed, A.H.; Lopes, C.G.: Distributed recursive least-squaresstrategies over adaptive networks. In: Proceeding of Asilomar Con-ference Signals, Systems, Computers, pp. 233–237. Monterey, CA(2006)
11. Li, L.; Chambers, J.A.: A new incremental affine projection basedadaptive learning scheme for distributed networks. Signal Process.88(10), 2599–2603 (2008)
12. Lopes, C.G.; Sayed, A.H.: Randomized incremental protocols overadaptive networks. In: Proceeding of IEEE International Confer-ence Acoustics, Speech, Signal Processing (ICASSP), pp. 3514–3517. Dallas, TX (2010)
13. Cattivelli, F.S.; Sayed, A.H.: Analysis of spatial and incremental LMS processing for distributed estimation. IEEE Trans. Signal Process. 59(4), 1465–1480 (2011)
14. Rastegarnia, A.; Tinati, M.A.; Khalili, A.: Performance analysisof quantized incremental LMS algorithm for distributed adaptiveestimation. Signal Process. 90(8), 2621–2627 (2010)
15. Rastegarnia, A.; Tinati, M.A.; Khalili, A.: Steady-state analy-sis of quantized distributed incremental LMS algorithm withoutGaussian restriction. Signal, Image and Video Process. 7(2), 227–234 (2013)
16. Khalili, A.; Tinati, M.A.; Rastegarnia, A.: Performance analysisof distributed incremental LMS algorithm with noisy links. Int. J.Distrib. Sens. Netw. 2011, 1–10 (2011)
17. Khalili, A.; Tinati M.A.; Rastegarnia, A.: Steady-state analysis ofincremental LMS adaptive networks with noisy links. IEEE Trans.Signal Process. 56(5), 2416–2421 (2011)
18. Khalili, A.; Tinati, M.A.; Rastegarnia, A.: Analysis of incrementalRLS adaptive networks with noisy links. IEICE Electron. Express8(9), 623–628 (2011)
19. Lopes, C.G.; Sayed, A.H.: Diffusion least-mean squares over adaptive networks: formulation and performance analysis. IEEE Trans. Signal Process. 56(7), 3122–3136 (2008)
20. Cattivelli, F.S.; Lopes, C.G.; Sayed, A.H.: Diffusion recursiveleast-squares for distributed estimation over adaptive networks.IEEE Trans. Signal Process. 56(5), 1865–1877 (2008)
21. Cattivelli, F.S.; Sayed, A.H.: Multilevel diffusion adaptive net-works. In: Proceeding of IEEE International Conference Acoustics,Speech, Signal Processing (ICASSP), Taipei, Taiwan (2009)
22. Takahashi, N.; Yamada, I.; Sayed, A.H.: Diffusion least-mean-squares with adaptive combiners. In: Proceeding of IEEE Interna-tional Conference Acoustics, Speech, Signal Processing (ICASSP),pp. 2845–2848. Taipei, Taiwan (2009)
23. Robbins H.; Monro, S.: A stochastic approximation method. Ann.Math. Stat. 22(3), 400–407 (1951)
24. Duflo, M.: Random Iterative Models. Springer-Verlag, Berlin (1997)
25. Wei, Y.; Gelfand, S.B.; Krogmeier, J.V.: Noise-constrained least-mean squares algorithm. IEEE Trans. Signal Process. 49(9), 1961–1970 (2001)
26. Sayed, A.H.: Fundamentals of Adaptive Filtering. Wiley, New York(2003)
27. Xiao, J.J.; Luo, Z.-Q.: Decentralized estimation in an inhomoge-neous sensing environment. IEEE Trans. Inf. Theory 51(10), 3564–3575 (2005)