System Identification
Ali Karimpour
Assistant Professor
Ferdowsi University of Mashhad
Reference: "System Identification: Theory for the User," Lennart Ljung
Lecture 7
Ali Karimpour, Nov 2009
Parameter Estimation Methods
Topics to be covered include:
Guiding Principles Behind Parameter Estimation Methods.
Minimizing Prediction Error.
Linear Regressions and the Least-Squares Method.
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method.
Correlating Prediction Errors with Past Data.
Instrumental Variable Methods.
Guiding Principles Behind Parameter Estimation Methods

Suppose that we have selected a certain model structure M. The set of models is defined as:

M = { M(θ) | θ ∈ D_M }

Suppose the system is:

y(t) = G(q,θ)u(t) + H(q,θ)e(t)

For each θ, the model M(θ) represents a way of predicting future outputs. The predictor is a linear filter:

M(θ): ŷ(t|θ) = W_y(q,θ)y(t) + W_u(q,θ)u(t)

where

W_y(q,θ) = 1 − H⁻¹(q,θ),  W_u(q,θ) = H⁻¹(q,θ)G(q,θ)
Guiding Principles Behind Parameter Estimation Method
Suppose that we collect a set of data from the system:

Z^N = { y(1), u(1), y(2), u(2), …, y(N), u(N) }

Formally, we are going to find a map from the data Z^N to the set D_M:

Z^N → θ̂_N ∈ D_M

Such a mapping is a parameter estimation method.
Guiding Principles Behind Parameter Estimation Method
Evaluating the candidate model

Let us define the prediction error as:

ε(t,θ) = y(t) − ŷ(t|θ)

When the data set Z^N is known, these errors can be computed for t = 1, 2, …, N.

A guiding principle for parameter estimation: based on Z^t we can compute the prediction error ε(t,θ). Select θ̂_N so that the prediction errors ε(t, θ̂_N), t = 1, 2, …, N, become as small as possible.

We describe two approaches:
• Form a scalar-valued criterion function that measures the size of ε (Sections 7.2 to 7.4).
• Make ε(t, θ̂_N) uncorrelated with a given data sequence (Sections 7.5 and 7.6).
Minimizing Prediction Error
Clearly the size of the prediction error

ε(t,θ) = y(t) − ŷ(t|θ)

depends on θ as well as on the data Z^N.

Let the prediction error be filtered by a stable linear filter L(q):

ε_F(t,θ) = L(q)ε(t,θ)

Then use the following norm:

V_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} l(ε_F(t,θ))

where l(·) is a scalar-valued positive function.

The estimate θ̂_N is then defined by:

θ̂_N = θ̂_N(Z^N) = arg min_{θ∈D_M} V_N(θ, Z^N)
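As a concrete illustration, the criterion V_N(θ, Z^N) can be minimized numerically. The sketch below is my own illustrative example, not from the slides: it fits a first-order AR model y(t) = a·y(t−1) + e(t) with L(q) = 1 and l(ε) = ε²/2 by a simple grid search over the single parameter a.

```python
# Illustrative sketch of the prediction-error criterion (assumed AR(1) model,
# noise level, and grid; none of these numbers come from the lecture).
import numpy as np

rng = np.random.default_rng(0)
a0 = 0.7                                   # "true" parameter of the simulated system
N = 500
y = np.zeros(N)
for t in range(1, N):
    y[t] = a0 * y[t - 1] + 0.1 * rng.standard_normal()

def V(a, y):
    """V_N(theta) = (1/N) * sum of (1/2) * eps(t, theta)^2, with L(q) = 1."""
    eps = y[1:] - a * y[:-1]               # eps(t, a) = y(t) - a*y(t-1)
    return 0.5 * np.mean(eps ** 2)

grid = np.linspace(-1, 1, 2001)
a_hat = grid[np.argmin([V(a, y) for a in grid])]
print(a_hat)                               # should land near a0 = 0.7
```

For this quadratic criterion a closed form exists (next section); the grid search is only meant to make the "pick θ minimizing V_N" principle tangible.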
Minimizing Prediction Error
Generally, the term prediction-error identification methods (PEM) is used for this family of approaches. Particular methods, with specific names, are distinguished by:

• Choice of l(·)
• Choice of L(·)
• Choice of model structure
• Method by which the minimization is realized
Minimizing Prediction Error
Choice of L

The effect of L is best understood through a frequency-domain interpretation: L acts as a frequency weighting.

See also >> 14.4 Prefiltering

Exercise: Consider the system

y(t) = G(q,θ)u(t) + H(q,θ)e(t)

Show that the effect of prefiltering by L is identical to changing the noise model from H(q,θ) to

L⁻¹(q)H(q,θ)
Minimizing Prediction Error
Choice of l

A standard choice, convenient both for computation and analysis, is:

l(ε) = (1/2)ε²

See also >> 15.2 Choice of norms: Robustness (against bad data)

One can also parameterize the norm independently of the model parameterization.
Linear Regressions and the Least-Squares Method
We introduced linear regressions before as:

ŷ(t|θ) = φᵀ(t)θ + μ(t)

φ(t) is the regression vector; for the ARX structure it is

φ(t) = [ −y(t−1) … −y(t−n_a)  u(t−1) … u(t−n_b) ]ᵀ

μ(t) is a known, data-dependent vector. For simplicity, let it be zero in the remainder of this section.

Least-squares criterion

The prediction error is:

ε(t,θ) = y(t) − φᵀ(t)θ

Now let L(q) = 1 and l(ε) = ε²/2. Then

V_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} l(ε_F(t,θ)) = (1/N) Σ_{t=1}^{N} (1/2)[ y(t) − φᵀ(t)θ ]²

This is the least-squares criterion for the linear regression.
Linear Regressions and the Least-Squares Method
The least-squares estimate (LSE) is:

θ̂_N^LS = arg min_θ V_N(θ, Z^N) = [ (1/N) Σ_{t=1}^{N} φ(t)φᵀ(t) ]⁻¹ (1/N) Σ_{t=1}^{N} φ(t)y(t)

Denoting

R(N) = (1/N) Σ_{t=1}^{N} φ(t)φᵀ(t),  f(N) = (1/N) Σ_{t=1}^{N} φ(t)y(t)

we can write

θ̂_N^LS = R⁻¹(N) f(N)
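The closed form θ̂ = R⁻¹(N)f(N) can be sketched numerically for a first-order ARX model; the system coefficients, noise level, and sample size below are illustrative assumptions, not taken from the lecture.

```python
# Sketch of the least-squares estimate theta = R(N)^(-1) f(N) for an assumed
# ARX(1,1) model  y(t) + a*y(t-1) = b*u(t-1) + e(t).
import numpy as np

rng = np.random.default_rng(1)
a_true, b_true = -0.5, 1.0
N = 1000
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a_true * y[t - 1] + b_true * u[t - 1] + 0.05 * rng.standard_normal()

# Regression vector phi(t) = [-y(t-1), u(t-1)]^T and parameter theta = [a, b]^T
Phi = np.column_stack([-y[:-1], u[:-1]])   # rows are phi(t)^T for t = 2..N
Y = y[1:]
R = Phi.T @ Phi / len(Y)                   # R(N) = (1/N) sum phi(t) phi(t)^T
f = Phi.T @ Y / len(Y)                     # f(N) = (1/N) sum phi(t) y(t)
theta_ls = np.linalg.solve(R, f)
print(theta_ls)                            # should be close to [a_true, b_true]
```

In practice one solves the normal equations (or uses a QR/lstsq routine) rather than forming R⁻¹ explicitly; `np.linalg.solve` above already does that.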
Linear Regressions and the Least-Squares Method
Properties of LSE

The least-squares method is a special case of the PEM (prediction error method).
Linear Regressions and the Least-Squares Method
Weighted Least Squares

Different measurements could be assigned different weights:

V_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} α_t [ y(t) − φᵀ(t)θ ]²

or, more generally,

V_N(θ, Z^N) = Σ_{t=1}^{N} β(N,t) [ y(t) − φᵀ(t)θ ]²

The resulting estimate has the same form as before:

θ̂_N^LS = [ Σ_{t=1}^{N} β(N,t) φ(t)φᵀ(t) ]⁻¹ Σ_{t=1}^{N} β(N,t) φ(t)y(t)
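The weighted closed form can be sketched the same way as the unweighted one. The scenario below (two noise levels, inverse-variance weights) is my own illustrative assumption.

```python
# Weighted least squares sketch: residuals get weights alpha_t, and the closed
# form mirrors the unweighted case with alpha inserted on both sides.
import numpy as np

rng = np.random.default_rng(2)
N = 400
phi = rng.standard_normal((N, 2))                      # rows are phi(t)^T
theta_true = np.array([2.0, -1.0])
noise_std = np.where(np.arange(N) < N // 2, 0.1, 2.0)  # half clean, half noisy
y = phi @ theta_true + noise_std * rng.standard_normal(N)

alpha = 1.0 / noise_std ** 2                           # weight reliable data more
theta_wls = np.linalg.solve(phi.T @ (alpha[:, None] * phi),
                            phi.T @ (alpha * y))
print(theta_wls)                                       # near theta_true
```

Choosing α_t as the inverse noise variance is the classical choice; it makes the weighted criterion coincide (up to a constant) with the Gaussian log-likelihood of Section 7.4.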
Linear Regressions and the Least-Squares Method
Colored Equation-error Noise

If the disturbance v(t) is not white noise, then the LSE will not converge to the true values a_i and b_i.

We showed this for the difference equation

y(t) + a₁y(t−1) + … + a_{n_a}y(t−n_a) = b₁u(t−1) + … + b_{n_b}u(t−n_b) + v(t)

To deal with this problem, we may incorporate further modeling of the equation error v(t), as discussed in Chapter 4, say

v(t) = k(q)e(t)

Now e(t) is white noise, but the new model takes us out of the LS environment, except in two cases:

• Known noise properties
• High-order models
Linear Regressions and the Least-Squares Method
Colored Equation-error Noise — Known noise properties

y(t) + a₁y(t−1) + … + a_{n_a}y(t−n_a) = b₁u(t−1) + … + b_{n_b}u(t−n_b) + v(t)

Suppose the values of a_i and b_i are unknown, but k is a known filter (not too realistic a situation), so we have

v(t) = k(q)e(t)

Filtering the equation through k⁻¹(q) gives

y_F(t) + a₁y_F(t−1) + … + a_{n_a}y_F(t−n_a) = b₁u_F(t−1) + … + b_{n_b}u_F(t−n_b) + e(t)

where y_F(t) = k⁻¹(q)y(t) and u_F(t) = k⁻¹(q)u(t).

Since e(t) is white, the LS method can be applied without problems. Notice that this is equivalent to applying the prefilter L(q) = k⁻¹(q).
Linear Regressions and the Least-Squares Method
Colored Equation-error Noise — High-order models

y(t) + a₁y(t−1) + … + a_{n_a}y(t−n_a) = b₁u(t−1) + … + b_{n_b}u(t−n_b) + v(t)

Suppose that the noise v can be well described by k(q) = 1/D(q), where D(q) is a polynomial of order r. So we have

v(t) = k(q)e(t)

and

A(q)y(t) = B(q)u(t) + (1/D(q))e(t)

or

A(q)D(q)y(t) = B(q)D(q)u(t) + e(t)

Now we can apply the LS method. Note that n_A = n_a + r and n_B = n_b + r.
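The products A(q)D(q) and B(q)D(q) are plain convolutions of coefficient vectors, which makes the order bookkeeping n_A = n_a + r, n_B = n_b + r easy to check. The first-order polynomials below are my own illustrative choices.

```python
# Sketch: multiplying through by D(q) turns the colored-noise model into a
# higher-order ARX model whose coefficients LS can estimate.
import numpy as np

A = np.array([1.0, -0.5])   # A(q) = 1 - 0.5 q^-1   (n_a = 1)
B = np.array([0.0, 1.0])    # B(q) = q^-1           (n_b = 1)
D = np.array([1.0, 0.8])    # D(q) = 1 + 0.8 q^-1   (r = 1)

AD = np.convolve(A, D)      # coefficients of A(q)D(q), order n_a + r = 2
BD = np.convolve(B, D)      # coefficients of B(q)D(q), order n_b + r = 2
print(AD, BD)
```

LS applied to the expanded model estimates the n_a + r and n_b + r coefficients of A(q)D(q) and B(q)D(q) directly; recovering A, B, D separately afterwards requires a polynomial factorization step.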
Linear Regressions and the Least-Squares Method
Consider a state-space model:

x(t+1) = Ax(t) + Bu(t) + w(t)
y(t)   = Cx(t) + Du(t) + v(t)

To identify the system we can either:

1- Parameterize A, B, C, D as in Section 4.3, or
2- Assume no insight into a particular structure and look for any suitable matrices A, B, C, D.

Note: Since there are an infinite number of such matrices describing the same system (related by similarity transformations), we will have to fix the coordinate basis of the state-space realization.
Linear Regressions and the Least-Squares Method
Consider the state-space model

x(t+1) = Ax(t) + Bu(t) + w(t)
y(t)   = Cx(t) + Du(t) + v(t)

Suppose for a moment that not only y and u are measured, but the states x are measured as well. This would, by the way, fix the coordinate basis of the state-space realization.

With y, u, and x known, the model becomes a linear regression. Define

Y(t) = [ x(t+1) ; y(t) ],  Θ = [ A B ; C D ],  Φ(t) = [ x(t) ; u(t) ],  E(t) = [ w(t) ; v(t) ]

Then

Y(t) = ΘΦ(t) + E(t)

But there is a problem: the states are not available for measurement!
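If the states were measured, the whole matrix Θ = [A B; C D] would follow from one least-squares fit of Y(t) = ΘΦ(t) + E(t). The sketch below is illustrative: the particular A, B, C, D, noise levels, and sample size are assumptions of mine, not from the lecture.

```python
# Sketch: estimating [A B; C D] by linear regression when x(t) is measured.
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[0.8, 0.1], [0.0, 0.5]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
N = 300
x = np.zeros((N + 1, 2))
u = rng.standard_normal((N, 1))
y = np.zeros((N, 1))
for t in range(N):
    x[t + 1] = A @ x[t] + B @ u[t] + 0.01 * rng.standard_normal(2)  # w(t)
    y[t] = C @ x[t] + D @ u[t] + 0.01 * rng.standard_normal(1)      # v(t)

Yt  = np.hstack([x[1:], y])     # rows: [x(t+1)^T, y(t)^T]
Phi = np.hstack([x[:-1], u])    # rows: [x(t)^T,  u(t)^T]
# Solve Phi @ Theta^T ~ Yt in the least-squares sense; Theta = [A B; C D]
Theta = np.linalg.lstsq(Phi, Yt, rcond=None)[0].T
print(np.round(Theta, 2))
```

Since x is never actually measured, subspace methods (next slide) first reconstruct a state sequence from the data and then run exactly this regression.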
Linear Regressions and the Least-Squares Method
Estimating State-Space Models Using Least-Squares Techniques (Subspace Methods)

In subspace algorithms, the state sequence x(t) is first derived from the observations; A, B, C, D then follow from the linear regression above. See Chapter 10.
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Estimation and the Principle of Maximum Likelihood
The area of statistical inference deals with the problem of extracting information from observations that themselves could be unreliable.

Suppose that the observation y^N = (y(1), y(2), …, y(N)) has the probability density function (PDF)

f_y(θ; x₁, x₂, …, x_N) = f_y(θ; x^N)

That is,

P(y^N ∈ A) = ∫_{x^N ∈ A} f_y(θ; x^N) dx^N

θ is a d-dimensional parameter vector. The purpose of the observation is in fact to estimate the vector θ using y^N:

θ̂ = θ̂(y^N),  θ̂: R^N → R^d

Suppose the observed value of y^N is y^N_*; then

θ̂* = θ̂(y^N_*)
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Estimation and the Principle of Maximum Likelihood
θ̂: R^N → R^d

Many such estimator functions are possible. A particular one is the maximum likelihood estimator (MLE).

The probability that the realization (= observation) should indeed take the value y^N_* is proportional to

f_y(θ; y^N_*)

This is a deterministic function of θ once the numerical value y^N_* is inserted; it is called the likelihood function.

A reasonable estimator of θ is then

θ̂_ML(y^N_*) = arg max_θ f_y(θ; y^N_*)

where the maximization is performed for fixed y^N_*. This function is known as the maximum likelihood estimator (MLE).
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Example: Let y(i), i = 1, …, N, be independent random variables with normal distribution, unknown mean θ₀, and known variances λ_i:

y(i) ∈ N(θ₀, λ_i)

A common estimator is the sample mean:

θ̂_SM(y^N) = (1/N) Σ_{i=1}^{N} y(i)

To calculate the MLE, we start by determining the joint PDF of the observations. The PDF of y(i) is

(1/√(2πλ_i)) exp( −(x_i − θ)² / (2λ_i) )

Since the y(i) are independent, the joint PDF of the observations is

f_y(θ; x^N) = Π_{i=1}^{N} (1/√(2πλ_i)) exp( −(x_i − θ)² / (2λ_i) )
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Example (continued): With the joint PDF above, the likelihood function is f_y(θ; y^N).

Maximizing the likelihood function is the same as maximizing its logarithm, so

θ̂_ML(y^N) = arg max_θ log f_y(θ; y^N)
           = arg max_θ [ −(N/2) log 2π − (1/2) Σ_{i=1}^{N} log λ_i − (1/2) Σ_{i=1}^{N} (y(i) − θ)²/λ_i ]

which gives

θ̂_ML(y^N) = ( 1 / Σ_{i=1}^{N} (1/λ_i) ) Σ_{i=1}^{N} y(i)/λ_i
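The sample mean and the ML estimate can be compared numerically. The sketch below reuses the variance list from the slides' N = 15 experiment; the random seed and the simulation itself are my own illustrative choices.

```python
# Sample-mean vs. maximum-likelihood estimate of a common mean theta0 when the
# variances lambda_i differ: the MLE weights each y(i) by 1/lambda_i.
import numpy as np

rng = np.random.default_rng(4)
theta0 = 10.0
lam = np.array([10, 2, 3, 4, 61, 11, 0.1, 121, 10, 1, 6, 9, 11, 13, 15.0])
y = theta0 + np.sqrt(lam) * rng.standard_normal(lam.size)

theta_sm = y.mean()                                # sample mean
theta_ml = np.sum(y / lam) / np.sum(1.0 / lam)     # inverse-variance weighting
print(theta_sm, theta_ml)
```

The low-variance observation (λ = 0.1) dominates the ML estimate, while observations with λ = 61 and λ = 121 barely affect it; the sample mean treats all 15 equally, which is why it scatters more across repeated experiments.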
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method

Example (continued): Suppose N = 15 and the y(i) are generated randomly (normal distribution) with mean 10 and variances

10, 2, 3, 4, 61, 11, 0.1, 121, 10, 1, 6, 9, 11, 13, 15

[Figure: the estimates θ̂_SM(y^N) and θ̂_ML(y^N) for 10 different experiments]

Exercise: Do the same procedure for other experiments and draw the corresponding figure.

Exercise: Do the same procedure for other experiments and draw the corresponding figure, supposing all variances equal 10.
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Relationship to the Maximum A Posteriori (MAP) Estimate
Maximum likelihood estimator (MLE):

θ̂_ML(y^N_*) = arg max_θ f_y(θ; y^N_*)

The Bayesian approach leads to another parameter estimate. In the Bayesian approach, the parameter itself is thought of as a random variable. Let the prior PDF for θ be

g(z) = P(θ = z)

After some manipulation, the maximum a posteriori (MAP) estimate is

θ̂_MAP(y^N) = arg max_θ f_y(θ; y^N) · g(θ)
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Cramér-Rao Inequality

The quality of an estimator can be assessed by its mean-square error matrix:

P = E[ (θ̂(y^N) − θ₀)(θ̂(y^N) − θ₀)ᵀ ]    (θ₀ is the true value of θ)

We may be interested in selecting estimators that make P small. The Cramér-Rao inequality gives a lower bound for P:

P ≥ M⁻¹

where M is the Fisher information matrix.
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Asymptotic Properties of the MLE
Calculation of P = E[ (θ̂(y^N) − θ₀)(θ̂(y^N) − θ₀)ᵀ ] is not an easy task. Therefore, limiting properties as the sample size tends to infinity are calculated instead.

For the MLE in the case of independent observations, Wald and Cramér obtained the following. Suppose that the random variables {y(i)} are independent and identically distributed, so that

f_y(θ; x₁, x₂, …, x_N) = Π_{i=1}^{N} f_{y(i)}(θ; x_i)

Suppose also that the distribution of y^N is given by f_y(θ₀; x^N) for some value θ₀. Then θ̂_ML(y^N) tends to θ₀ with probability 1 as N tends to infinity, and

√N ( θ̂_ML(y^N) − θ₀ )

converges in distribution to the normal distribution with zero mean and covariance matrix given by the Cramér-Rao lower bound M⁻¹.
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Likelihood function for Probabilistic Models of Dynamical Systems
Suppose

M(θ): ŷ(t|θ) = g(t, Z^{t−1}; θ)
ε(t,θ) = y(t) − ŷ(t|θ)

where ε(t,θ) is independent over t and has the PDF f_e(x, t; θ). Recall that we call this kind of model a complete probabilistic model.

We note that the output is

y(t) = ŷ(t|θ) + ε(t,θ)

Now we must determine the likelihood function

f_y(θ; y^N)
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
What is f_y(θ; y^N)?

Lemma: Suppose u^t is given as a deterministic sequence, and assume that the generation of y^t is described by the model

y(t) = g(t, Z^{t−1}; θ) + ε(t)

where the conditional PDF of ε(t) is f_e(x, t; θ). Then the joint probability density function for y^t, given u^t, is

f_m(t; y^t | u^t) = Π_{k=1}^{t} f_e( y(k) − g(k, Z^{k−1}; θ), k; θ )    (I)

Proof: The conditional PDF of y(t), given Z^{t−1}, is

p(x_t | Z^{t−1}) = f_e( x_t − g(t, Z^{t−1}; θ), t; θ )

Using Bayes's rule, the joint conditional PDF of y(t) and y(t−1), given Z^{t−2}, can be expressed as

p(x_t, x_{t−1} | Z^{t−2}) = p(x_t | y(t−1) = x_{t−1}, Z^{t−2}) · p(x_{t−1} | Z^{t−2})
                          = f_e( x_t − g(t, Z^{t−1}; θ), t; θ ) · f_e( x_{t−1} − g(t−1, Z^{t−2}; θ), t−1; θ )

Continuing similarly, we derive (I).
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Probabilistic Models of Dynamical Systems (continued)

M(θ): ŷ(t|θ) = g(t, Z^{t−1}; θ),  ε(t,θ) = y(t) − ŷ(t|θ)

with ε(t,θ) independent with PDF f_e(x, t; θ). By the previous lemma, the likelihood function is

f_y(θ; y^N) = Π_{t=1}^{N} f_e( y(t) − g(t, Z^{t−1}; θ), t; θ ) = Π_{t=1}^{N} f_e( ε(t,θ), t; θ )

Maximizing this function is the same as maximizing

(1/N) log f_y(θ; y^N) = (1/N) Σ_{t=1}^{N} log f_e( ε(t,θ), t; θ )

If we define

l(ε, θ, t) = −log f_e(ε, t; θ)
A Statistical Framework for Parameter Estimation and the Maximum Likelihood Method
Probabilistic Models of Dynamical Systems (continued)

With l(ε, θ, t) = −log f_e(ε, t; θ), we may write

θ̂_ML(y^N) = arg min_θ (1/N) Σ_{t=1}^{N} l( ε(t,θ), θ, t )

The ML method can thus be seen as a special case of the PEM.

Exercise: Find the Fisher information matrix for this system.

Exercise: Derive a lower bound for Cov θ̂_N.
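For Gaussian prediction errors, l(ε, θ, t) = −log f_e(ε, t; θ) reduces to ε²/(2λ) plus a constant, so the ML estimate coincides with the quadratic PEM/LS estimate. A minimal numeric check on a one-parameter model (the model, data, and grid are my own illustrative assumptions):

```python
# Check: with Gaussian errors, minimizing -log-likelihood = minimizing the
# quadratic PEM criterion. Model: y(t) = theta + e(t), e ~ N(0, 1).
import numpy as np

rng = np.random.default_rng(6)
y = 3.0 + rng.standard_normal(200)

grid = np.linspace(0, 6, 6001)
# Gaussian -log f_e summed over t, dropping theta-independent constants:
neg_loglik = [np.sum(0.5 * (y - th) ** 2) for th in grid]
th_ml = grid[np.argmin(neg_loglik)]
print(abs(th_ml - y.mean()) < 1e-3)   # MLE matches the LS estimate (the mean)
```

The constant terms of −log f_e never affect the argmin, which is why only the ε²/2 part of l matters here; a non-Gaussian f_e (e.g. Laplacian) would instead give an l(ε) proportional to |ε|, i.e. a robust norm in the sense of Section 15.2.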
Correlating Prediction Errors with Past Data

Ideally, the prediction error ε(t,θ) of a good model should be independent of the past data Z^{t−1}. If ε(t,θ) is correlated with Z^{t−1}, then there was more information available in Z^{t−1} about y(t) than was picked up by ŷ(t|θ).

To test whether ε(t,θ) is independent of the data set Z^{t−1}, we would have to check that all transformations of ε(t,θ) are uncorrelated with all possible functions of Z^{t−1}. This is of course not feasible in practice.

Instead, we may select a certain finite-dimensional vector sequence {ζ(t)} derived from Z^{t−1}, and a certain transformation of {ε(t,θ)}, to be uncorrelated with this sequence. This would give

(1/N) Σ_{t=1}^{N} ζ(t) ε(t,θ) = 0

The θ derived in this way would be the best estimate based on the observed data.
Correlating Prediction Errors with Past Data
Choose a linear filter L(q) and let

ε_F(t,θ) = L(q)ε(t,θ)

Choose a sequence of correlation vectors

ζ(t,θ) = ζ(t, Z^{t−1}, θ)

Choose a function α(ε) and define

f_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} ζ(t,θ) α( ε_F(t,θ) )

Then calculate

θ̂_N = sol_{θ∈D_M} { f_N(θ, Z^N) = 0 }

The instrumental-variable method (next section) is the best-known representative of this family.
Correlating Prediction Errors with Past Data
θ̂_N = sol_{θ∈D_M} { f_N(θ, Z^N) = 0 },  f_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} ζ(t,θ) α( ε_F(t,θ) )

Normally, the dimension of ζ would be chosen so that f_N is a d-dimensional vector. Then there are as many equations as unknowns. Sometimes one uses a ζ of higher dimension than d, so that there is an overdetermined set of equations, typically without solution. Then

θ̂_N = arg min_{θ∈D_M} | f_N(θ, Z^N) |

Exercise: Show that the prediction-error estimate obtained from

θ̂_N = θ̂_N(Z^N) = arg min_{θ∈D_M} V_N(θ, Z^N)

can also be seen as a correlation estimate for a particular choice of L, ζ, and α.
Correlating Prediction Errors with Past Data
Pseudolinear Regressions

We saw in Chapter 4 that a number of common prediction models can be written as

ŷ(t|θ) = φᵀ(t,θ)θ

Since the pseudo-regression vector φ(t,θ) contains relevant past data, it is reasonable to require that the resulting prediction errors be uncorrelated with φ(t,θ). Taking ζ(t,θ) = φ(t,θ) gives

θ̂_N^PLR = sol_θ { (1/N) Σ_{t=1}^{N} φ(t,θ) [ y(t) − φᵀ(t,θ)θ ] = 0 }

whence the term PLR estimate.
Instrumental Variable Methods
Consider the linear regression

ŷ(t|θ) = φᵀ(t)θ

The least-squares estimate of θ is given by

θ̂_N^LS = sol_θ { (1/N) Σ_{t=1}^{N} φ(t) [ y(t) − φᵀ(t)θ ] = 0 }

so it is a correlation estimate with L(q) = 1, α(ε) = ε, and ζ(t,θ) = φ(t).

Now suppose that the data are actually described by

y(t) = φᵀ(t)θ₀ + v₀(t)

We found in Section 7.3 that θ̂_N will not tend to θ₀ in typical cases.
Instrumental Variable Methods
Replacing φ(t) in the correlation equation above by a general vector ζ(t) gives

θ̂_N^IV = sol_θ { (1/N) Σ_{t=1}^{N} ζ(t) [ y(t) − φᵀ(t)θ ] = 0 }

Such an application to a linear regression is called the instrumental-variable (IV) method. The elements of ζ are then called instruments or instrumental variables.

The estimate is

θ̂_N^IV = [ (1/N) Σ_{t=1}^{N} ζ(t)φᵀ(t) ]⁻¹ (1/N) Σ_{t=1}^{N} ζ(t)y(t)
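The contrast between the LS and IV estimates can be sketched numerically. Everything below is an illustrative assumption of mine: an ARX(1,1) system with colored equation-error noise, and instruments built from a noise-free, input-driven signal x (here generated with the true parameters, purely for illustration).

```python
# IV sketch: y(t) = -a*y(t-1) + b*u(t-1) + 0.3*v(t) with colored v(t).
# LS correlates the regressor -y(t-1) with the noise; instruments built from
# the input alone do not.
import numpy as np

rng = np.random.default_rng(5)
a, b = -0.7, 1.0                  # theta0 = [a, b]
N = 20000
u = rng.standard_normal(N)
e = rng.standard_normal(N)
y = np.zeros(N); x = np.zeros(N); v = np.zeros(N)
for t in range(1, N):
    v[t] = 0.9 * v[t - 1] + e[t]              # colored (AR(1)) equation error
    y[t] = -a * y[t - 1] + b * u[t - 1] + 0.3 * v[t]
    x[t] = -a * x[t - 1] + b * u[t - 1]       # noise-free signal for instruments

Phi  = np.column_stack([-y[:-1], u[:-1]])     # regressors phi(t)
Zeta = np.column_stack([-x[:-1], u[:-1]])     # instruments zeta(t)
Y = y[1:]
theta_ls = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)
theta_iv = np.linalg.solve(Zeta.T @ Phi, Zeta.T @ Y)
print(theta_ls, theta_iv)
```

Since ζ(t) is a function of past inputs only, E ζ(t)v₀(t) = 0 here, and the IV estimate stays near θ₀ while the LS estimate of the a-coefficient picks up a bias from the correlation between y(t−1) and the colored noise.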
Instrumental Variable Methods
Does θ̂_N → θ₀ in the IV method, as it should?

Exercise: Show that θ̂_N^IV exists and tends to θ₀ if

E[ ζ(t)φᵀ(t) ] is nonsingular
E[ ζ(t)v₀(t) ] = 0
Instrumental Variable Methods
Choices of instruments

Suppose an ARX model:

y(t) + a₁y(t−1) + … + a_{n_a}y(t−n_a) = b₁u(t−1) + … + b_{n_b}u(t−n_b) + e(t)

A natural idea is to generate the instruments similarly to the model above, while at the same time not letting them be influenced by {v₀(t)}. This leads to

ζ(t) = K(q) [ −x(t−1)  −x(t−2) … −x(t−n_a)  u(t−1) … u(t−n_b) ]ᵀ

where K is a linear filter and x(t) is generated from the input through a linear system.
Instrumental Variable Methods
Here

N(q)x(t) = M(q)u(t)

with

N(q) = 1 + n₁q⁻¹ + … + n_{nn}q⁻ⁿⁿ
M(q) = m₀ + m₁q⁻¹ + … + m_{nm}q⁻ⁿᵐ

Most instruments used in practice are generated in this way. Obviously, ζ(t) is obtained from past inputs by linear filtering and can consequently be written as

ζ(t) = ζ(t, u^{t−1})
Instrumental Variable Methods
If the input is generated in open loop, so that it does not depend on the noise v₀(t) in the system, then clearly the following property holds:

E[ ζ(t)v₀(t) ] = 0

Since both the φ-vector and the ζ-vector are generated from the same input sequence, it might be expected that the following property should hold in general:

E[ ζ(t)φᵀ(t) ] is nonsingular
Instrumental Variable Methods
Model-dependent Instruments

It may be desirable to choose the filters N and M equal to those of the true system:

N(q) = A₀(q);  M(q) = B₀(q)

These are clearly not known, but we may let the instruments depend on the parameters in the obvious way:

ζ(t,θ) = K(q) [ −x(t−1,θ) … −x(t−n_a,θ)  u(t−1) … u(t−n_b) ]ᵀ
A(q)x(t,θ) = B(q)u(t)
Instrumental Variable Methods
The IV method can be summarized as follows:

ε_F(t,θ) = L(q) [ y(t) − φᵀ(t)θ ]
ζ(t,θ) = K_u(q,θ) u(t)

θ̂_N^IV = sol_{θ∈D_M} { f_N(θ, Z^N) = 0 },  where  f_N(θ, Z^N) = (1/N) Σ_{t=1}^{N} ζ(t,θ) ε_F(t,θ)

In general, the generation of ζ(t,θ) can thus be written as ζ(t,θ) = K_u(q,θ)u(t), where K_u(q,θ) is a d-dimensional column vector of linear filters.