458.604 Process Dynamics & Control
Lecture 1: Introduction to Model Predictive Control
A Multivariable Control Technique for the Process Industry
Jong Min Lee, Chemical and Biological Engineering
Seoul National University
What is MPC?
[Diagram: a computer-based control loop. Sensors send up-to-date process information from the process to a simulation/optimization package (dynamic process model, objectives & constraints, database, connection to the information system), which sends optimal process adjustments to actuators or to low-level PID loops acting on the plant Gp.]

J.M. Lee 458.604
Main Algorithm
[Figure: the receding-horizon picture. Past measurements and inputs run up to time t; the projected output moves toward the target (bounded by ymax) over the prediction horizon t+1, ..., t+p; the future input (bounded by umax) is computed over t, ..., t+m-1.]
Some Key Features
• Computer based: sampled-data control• Model based: requires a dynamic process model• Predictive: makes explicit prediction of the future time behaviour of CVs
within a chosen window• Optimization based: performs optimization (numerical search) online
for optimal control adjustments. No explicit form of control law – just model, objective, and constraints are specified
• Integrated: constraint handling and economic optimization with regulatory and servo control
• Receding Horizon Control: repeats the prediction and optimization at each sample time step to update the optimal input trajectory after a feedback update
4J.M. Lee 458.604
Exemplary Algorithm

[Diagram: at t = k, an observer produces the state estimate X̂ from plant measurements y; an optimizer uses the prediction model and the reference to compute the input trajectory Ut(·), whose first element u is applied to the plant.]

Prediction model:  Yt(·) = f(X̂t, Ut(·))

Receding Horizon Control: only the adjustment for the current sample time is implemented; the rest are re-optimized at the next sample time step after a new feedback update.

Control objective:  min over Ut(·) of  ∫ from t to t+p of  ℓ1[Error(τ)] + ℓ2[Input(τ)] dτ
Constraints:  U(·) ∈ U,  Yt(·) ∈ Y
Analogy

[Figure: analogy illustration (image only).]
Industrial Use of MPC
• Initiated at Shell Oil and other refineries during the late 70s and early 80s.
• Various commercial software packages:
  – DMCplus (AspenTech)
  – RMPCT (Honeywell)
  – a dozen+ other players (e.g., 3dMPC, ABB)
• > 3000 worldwide installations
• Predominantly in the oil and petrochemical industries but the range of applications is expanding.
• Models used are predominantly empirical models developed through plant testing.
• The technology is used not only for multivariable control, but also for achieving the most economic operation within constraint boundaries.
Survey Result (1)
Applications by 5 major MPC vendors in North America / Europe (Badgwell and Qin, 2003)
Survey Result (2) - Japan (Oshima, 1995)
Reason for Popularity (1)
MPC provides a systematic, consistent, and integrated solution to process control problems with complex features:
• delays, inverse responses, and other complex dynamics
• strong interactions (e.g., large RGA)
• constraints (e.g., actuator limits, output limits)

[Diagram: traditional hierarchy (process optimization, then supervisory control with selectors, switches, delay compensations, anti-windups, decouplers, etc., then low-level PID loops) versus the MPC hierarchy (process optimization, then advanced multivariable control: MPC, then low-level PID loops). More and more optimization is done at the MPC level.]
Example 1: Blending System Control
[Diagram: blending system model. Valve positions: u1 (stock), u2 (Additive A), u3 (Additive B). Controlled ratios: rA = Additive A / stock, rB = Additive B / stock; q = total blend flow.]

• Control rA and rB.
• Control q if possible.
• Flowrates of the additives are limited.
Classical Solution
[Diagram: classical solution. Each additive has a flow transmitter (FT) and flow controller (FC) driven by a ratio setpoint on the measured stock flow; a high selector ('>') and a low selector ('<') coordinate with a valve-position controller (VPC, 95% setpoint) that provides feedback on the stock valve; the product is the blend of A and B.]
MPC: Solve @ each time k

min over u1(j), u2(j), u3(j), j = k, ..., k+p-1 of
  Σ_{i=1..p} (rA(k+i|k) - rA*)² + (rB(k+i|k) - rB*)² + γ (q(k+i|k) - q*)²

subject to
  (ui)min ≤ ui(j) ≤ (ui)max,  i = 1, ..., 3

p: size of the prediction window
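As a hedged illustration of the formulation above, the sketch below solves a single-step (p = 1) static version of the blending problem numerically. The targets rA*, rB*, q*, the weight γ, and the flow limits are invented numbers, not values from the lecture.

```python
# Static (p = 1) illustration of the blending objective: choose the flows
# u = [u1, u2, u3] (stock, Additive A, Additive B) to track the ratio targets
# rA* = u2/u1 and rB* = u3/u1 and, with a small weight gamma, the total-flow
# target q* = u1 + u2 + u3, subject to flow limits. All numbers are assumed.
import numpy as np
from scipy.optimize import minimize

rA_star, rB_star, q_star, gamma = 0.10, 0.20, 130.0, 1e-3

def objective(u):
    u1, u2, u3 = u
    rA, rB, q = u2 / u1, u3 / u1, u1 + u2 + u3
    return (rA - rA_star) ** 2 + (rB - rB_star) ** 2 + gamma * (q - q_star) ** 2

bounds = [(50.0, 150.0), (0.0, 12.0), (0.0, 30.0)]   # (ui)min <= ui <= (ui)max
res = minimize(objective, x0=[100.0, 5.0, 10.0], bounds=bounds)
u1, u2, u3 = res.x
print(res.x)   # ratios end up near the targets; q near 130
```

`minimize` with bounds stands in for the QP solver a real MPC package would use; the dynamic version would sum this cost over the whole prediction window.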
Advantages of MPC over Traditional APC
• Integrated solution
  – automatic constraint handling
  – feedforward/feedback
  – no need for decoupling or delay compensation
• Efficient utilization of degrees of freedom
  – can handle nonsquare systems (e.g., more MVs than CVs)
  – assignable priorities, ideal settling values for MVs
• Consistent, systematic methodology
• Realized benefits
  – higher online times
  – cheaper implementation
  – easier maintenance
Reason for Popularity (2)
• Emerging popularity of online optimization
• Process optimization and control are often conflicting objectives:
  – optimization pushes the process to the boundary of constraints
  – quality of control determines how close one can push the process to the boundary
• Implications for process control:
  – high-performance control is needed to realize online optimization
  – constraint handling is a must
  – the appropriate tradeoff between optimization and control is time-varying and is best handled within a single framework: Model Predictive Control
Bi-Level Optimization Used in MPC

• Steady-State Optimization (LP): based on a steady-state prediction model, with an economics-based objective (maximum profit or throughput, minimum utility) and control-based constraints; produces optimal setting values (setpoints) for the inputs and outputs.
• Dynamic Optimization (QP): minimizes the error (= setpoint - output, and input moves), subject to constraints on actuator limits and safety-sensitive variables; produces adjustments to the setpoints of low-level loops or control valves.
• New measurements provide the feedback update to both levels.
New Operational Hierarchy and Role of MPC

[Diagram: operational hierarchy and time scales. Strategic planning and customer demand feed production planning (month~year), plant scheduling (week~month), the real-time optimizer (min~day), model predictive control, and low-level control (sec); economic objectives ($) propagate down the hierarchy.]

Role of MPC: move the plant to the current optimal condition fast and smoothly without violating constraints (local optimization + control).
An Exemplary Application: Ethylene Plant

[Flowsheet: naphtha feedstock passes through furnaces, the primary fractionator, the quench tower, the charge-gas compressor, and chilling into the separation train (demethanizer, deethanizer, ethylene fractionator, depropanizer, propylene fractionator, debutanizer). Products include hydrogen, methane, ethylene, ethane, propylene, propane, B-B, gasoline, light hydrocarbons, and fuel oil.]
Importance of Modeling/Sys-ID

• The model is the most critical element of MPC and varies the most from application to application.
• Almost all models used in MPC are empirical models identified through plant tests rather than first-principles models:
  – step responses and pulse responses from plant tests
  – transfer function models fitted to plant test data
• Up to 80% of the time and expense involved in designing and installing an MPC is attributed to modeling/system identification.
• Keep in mind that the obtained models are imperfect (in both structure and parameters):
  – hence the importance of the feedback update to correct the model prediction or the model parameters/states
  – penalize excessive input movements
Design Effort for Two Approaches

[Diagram: breakdown of design effort. Traditional control: process analysis, then design and tuning of the controller. MPC: modeling & identification, then control specification.]
Challenges for MPC

• Efficient identification of a control-relevant model.
• Managing the sometimes exorbitant online computational load:
  – nonlinear models ⇒ nonlinear programs (NLP)
  – hybrid system models (continuous dynamics + discrete events or switches, e.g., pressure swing adsorption) ⇒ mixed-integer programs (MIP)
  – these are difficult to solve reliably online for large-scale problems
• How do we design the model, the estimator (of model parameters and state), and the optimization algorithm as an integrated system that is optimized simultaneously, rather than as disparate components?
• Long-term maintenance of the control system.
Control-Relevant Modeling: Coupling between Modeling and Control

[Diagram: System ID (test-signal characteristics, model structure, data filtering, parameter fitting) produces the model and its quality (error or "uncertainty"); MPC design (choice of objective function and constraints, choice of horizon sizes, choice of online estimator) determines the sensitivity of control performance to model errors; the two are coupled through the model.]
Iterative Model/Controller Refinement

[Diagram: closed-loop operation and testing of the controller Gc on the plant Gp produce closed-loop data; identification & controller design on these data yield a new controller Gc, and the cycle repeats.]
Comparison of Computational Load

[Diagram: classical optimal control puts nearly all of the work offline, deriving an explicit control law u = f(x), with little online computation. MPC does offline analysis and computation of the model, objective function, and constraints, and carries the bulk of the work online (estimation, prediction, and optimization: min f(x) s.t. g(x) ≥ 0).]

• Classical optimal control is limited by the ability to derive the explicit control law analytically or with reasonable offline computation.
• MPC is limited by the available online computational power and by numerical methods to solve the online optimization reliably.
Coupling between Online Estimation and Control Calculation

[Diagram: modeling (system ID) provides a model with parameters and states to estimate; online estimation of the state & model parameters feeds the prediction, whose uncertainty ("risk") enters the online optimal control calculation; the resulting real-time adjustments in turn determine the quality of the information available for estimation.]
Integrated MPC, Performance Monitoring, and Closed-Loop Identification

[Diagram: MPC sends adjustments to and receives measurements from the process; process monitoring and online model identification run alongside it.]

Detection and diagnosis of abnormal situations:
• operation shifts: model-parameter changes
• abnormal disturbances (size & pattern)
• instrumentation/equipment faults, poisoning, etc.
Conclusion
• MPC is the established advanced multivariable control technique for the process industry. It is already an indispensable tool and its importance is continuing to grow.
• It can be formulated to perform some economic optimization and can also be interfaced with a larger-scale (e.g., plant-wide) optimization scheme.
• Obtaining an accurate model and having reliable sensors for key parameters are key bottlenecks.
• A number of challenges remain to improve its use and performance.
458.604 Process Dynamics & Control
Lecture 3: Dynamic Matrix Control (DMC)
Jong Min Lee School of Chemical and Biological Engineering
Seoul National University
In this lecture, we will discuss
• Process representation: step response model
• Prediction (perfect model)
• Incorporation of “feedback”
• Optimization: unconstrained and constrained QP
• Implementation
Dynamic Matrix Control
• First appeared in the open literature in 1979 (Cutler and Ramaker; Prett and Gillette)
- with notable success on several Shell processes for many years
• Reformulation as a quadratic program by Garcia and Morshedi in 1986 - “Quadratic Dynamic Matrix Control”
• AspenTech: DMCplus
• Prototype of commercial algorithms presently used in the process industry
Process representation

Unit step-response function (stable, SISO):

[Figure: a unit step in u at sample time Ts; the output y rises through the step-response coefficients S1, S2, ..., settling at Sn.]

S = [S1, S2, S3, ..., Sn]ᵀ,   S0 = 0,   Sn = Sn+1 = Sn+2 = ... = S∞

A complete description of the process requires n step-response coefficients.
Principle of superposition

[Figure: the output y is built by superposing scaled and shifted step responses for the input moves Δu(0), Δu(1), Δu(2).]

Δu(k-i+1) = u(k-i+1) - u(k-i)

y(1) = y(0) + S1 Δu(0)
y(2) = y(0) + S1 Δu(1) + S2 Δu(0)
⋮
y(k+1) = y(0) + Σ_{i=1..n-1} Si Δu(k-i+1) + Sn {Δu(k-n+1) + Δu(k-n) + ... + Δu(0)}
       = y(0) + Σ_{i=1..n-1} Si Δu(k-i+1) + Sn u(k-n+1)
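The superposition formula can be checked numerically. The sketch below is a minimal illustration; the first-order step-response coefficients are an assumed example, not values from the lecture.

```python
# Open-loop prediction by superposition of step responses: given step-response
# coefficients S (S[i] holds S_{i+1}) and a sequence of input moves du,
#   y(k+1) = y(0) + sum_j S_{j+1} * du(k-j)
# for move histories shorter than the settling length n.
import numpy as np

def predict(S, du, y0=0.0):
    """Output trajectory y(1), ..., y(len(du)); assumes len(du) <= len(S)."""
    y = []
    for k in range(len(du)):
        # the move made j steps before k is weighted by coefficient S_{j+1}
        yk = y0 + sum(S[j] * du[k - j] for j in range(k + 1))
        y.append(yk)
    return np.array(y)

# Step-response coefficients of a toy first-order process (assumed example)
S = np.array([1 - 0.5**i for i in range(1, 11)])   # S_1 ... S_10, unit gain
du = np.array([1.0, 0.0, 0.0, 0.0])                # a single unit step at k = 0
print(predict(S, du))   # reproduces S_1 ... S_4 for a unit step
```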
Elements of DMC

[Figure: the DMC horizon picture. Past measurements run up to time k; predicted outputs ỹ(k+1), ỹ(k+2), ... approach the target over the prediction horizon k+1, ..., k+p; input moves Δu(k), Δu(k+1), ... are computed over the control horizon k, ..., k+m-1.]
Predictions
1. Prediction (stable, SISO)

ŷ(k+1): prediction of y(k+1) made at time k. At time k we know y(k) and need to compute Δu(k), which we do not know yet. Assume y(0) = 0. Then

ŷ(k+1) = S1 Δu(k) + Σ_{i=2..n-1} Si Δu(k-i+1) + Sn u(k-n+1)

where the first term is the effect of the current control action and the remaining terms are the effect of past control actions. Substituting k → k+1,

ŷ(k+2) = S1 Δu(k+1) + S2 Δu(k) + Σ_{i=3..n-1} Si Δu(k-i+2) + Sn u(k-n+2)

with the effects of the future, current, and past control actions appearing in turn.
j-step ahead prediction

ŷ(k+j) = Σ_{i=1..j} Si Δu(k+j-i) + Σ_{i=j+1..n-1} Si Δu(k+j-i) + Sn u(k+j-n)

The first sum is the effect of current and future control actions; the remaining terms are the effect of past control actions. Define the "predicted unforced response," computed with past inputs only, i.e., with U = [..., u(k-2), u(k-1), 0, 0, 0, ...]ᵀ:

ŷ0(k+j) ≜ Σ_{i=j+1..n-1} Si Δu(k+j-i) + Sn u(k+j-n)

so that

ŷ(k+j) = Σ_{i=1..j} Si Δu(k+j-i) + ŷ0(k+j)

(for j = 1 this reduces to the one-step prediction above).
Multiple predictions

Ŷ0(k+1) ≜ [ŷ0(k+1), ŷ0(k+2), ..., ŷ0(k+p)]ᵀ
ΔU(k) ≜ [Δu(k), Δu(k+1), ..., Δu(k+m-1)]ᵀ
Ŷ(k+1) ≜ [ŷ(k+1), ŷ(k+2), ..., ŷ(k+p)]ᵀ

p: prediction horizon, m: control horizon (m ≤ p ≤ n+m). In matrix form,

Ŷ(k+1) = S ΔU(k) + Ŷ0(k+1)

where S is the p-by-m dynamic matrix

S ≜ [S1 0 0 ... 0;  S2 S1 0 ... 0;  S3 S2 S1 ... 0;  ...;  Sm Sm-1 Sm-2 ... S1;  Sm+1 Sm Sm-1 ... S2;  ...;  Sp Sp-1 Sp-2 ... Sp-m+1]
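One way to build the dynamic matrix S from step-response coefficients (a sketch; the coefficient values are an assumed first-order example):

```python
# p-by-m dynamic matrix from step-response coefficients:
# S[i, j] = S_{i-j+1} for i >= j, and 0 above the diagonal band.
import numpy as np

def dynamic_matrix(step_coeffs, p, m):
    S = np.zeros((p, m))
    for i in range(p):
        for j in range(min(i + 1, m)):
            S[i, j] = step_coeffs[i - j]   # coefficient S_{i-j+1}
    return S

s = np.array([0.5, 0.75, 0.875, 0.9375, 0.96875])   # assumed S1..S5
print(dynamic_matrix(s, p=4, m=2))
```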
Output feedback and bias correction

So far we have not utilized the latest observation y(k), and the fact is that there is no perfect model. Correct the prediction by adding a constant bias term:

ỹ(k+j) ≜ ŷ(k+j) + b(k+j),   b(k+j) = y(k) - ŷ(k)

where ŷ(k) is the one-step-ahead prediction made at the previous time instance, k-1. Hence

ỹ(k+j) = ŷ(k+j) + [y(k) - ŷ(k)]

and, with Ỹ(k+1) = [ỹ(k+1), ỹ(k+2), ..., ỹ(k+p)]ᵀ and 1 = [1, 1, ..., 1]ᵀ,

Ỹ(k+1) = S ΔU(k) + Ŷ0(k+1) + [y(k) - ŷ(k)] 1
Recursive update of unforced response

For stable models, one can update the predicted unforced response after Δu(k) is computed:

[ŷ0(k+1); ŷ0(k+2); ...; ŷ0(k+p)] = M [ŷ0(k); ŷ0(k+1); ...; ŷ0(k+p-1)] + [S1; S2; ...; Sp] Δu(k)

where M is the shift matrix with ones on the superdiagonal and the last row repeated:

M = [0 1 0 ... 0;  0 0 1 ... 0;  ...;  0 0 ... 0 1;  0 0 ... 0 1]

Compactly,

Ŷ0n(k+1) = M Ŷ0n(k) + S* Δu(k)

where the vectors have length n: u(k) works like a state, hence you need "n", not "p".
Why?

To achieve good control performance, Ỹ(k+1) should be close to the true open-loop output. This requires that n, the number of coefficient matrices in S*, be chosen so that Sn = Sn+1 (i.e., the plant should be stable); otherwise M Ŷ0 will be in error. It also requires that the feedback term stay approximately constant (a step disturbance).
1. Prediction (stable, MIMO)

For a 2-by-2 system:

ŷ1(k+1) = Σ_{i=1..n-1} S11,i Δu1(k-i+1) + S11,n u1(k-n+1) + Σ_{i=1..n-1} S12,i Δu2(k-i+1) + S12,n u2(k-n+1)

ŷ2(k+1) = Σ_{i=1..n-1} S21,i Δu1(k-i+1) + S21,n u1(k-n+1) + Σ_{i=1..n-1} S22,i Δu2(k-i+1) + S22,n u2(k-n+1)
Vector notation

With m outputs and r inputs, y = [y1; y2; ...; ym] and u = [u1; u2; ...; ur], stack

Ỹ(k+1) = [ỹ(k+1); ỹ(k+2); ...; ỹ(k+p)]  (mp-by-1),   ΔU(k) = [Δu(k); Δu(k+1); ...; Δu(k+m-1)]

and define the block dynamic matrix

S = [S1 0 ... 0;  S2 S1 ... 0;  ...;  Sm Sm-1 ... S1;  Sm+1 Sm ... S2;  ...;  Sp Sp-1 ... Sp-m+1]

with coefficient blocks

Si = [S11,i S12,i ... S1r,i;  S21,i ... ... S2r,i;  ...;  Sm1,i ... ... Smr,i]

Then

Ỹ(k+1) = S ΔU(k) + Ŷ0(k+1) + Ip [y(k) - ŷ(k)],   Ip = [I; I; ...; I]  (p·ny-by-ny)

using only the first p of the n step-response coefficients.
Recursive update of unforced response (MIMO)

Ŷ0(k+1) = [ŷ0(k+1); ŷ0(k+2); ...; ŷ0(k+p)]

[ŷ0(k+1); ŷ0(k+2); ...; ŷ0(k+p)] = [0 Im 0 ... 0;  0 0 Im ... 0;  ...;  0 0 ... 0 Im;  0 0 ... 0 Im] [ŷ0(k); ŷ0(k+1); ...; ŷ0(k+p-1)] + [S1; S2; ...; Sp] Δu(k)

or, with vectors of length n,

Ŷ0n(k+1) = M Ŷ0n(k) + S* Δu(k)
Control calculations

Objective function: at time k, minimize the predicted deviation of the output from the setpoint, with a penalty on the size of the input moves, measured in the quadratic norm:

min over ΔU(k) of  Σ_{i=1..p} (yr(k+i) - ỹ(k+i))ᵀ Q (yr(k+i) - ỹ(k+i)) + Σ_{ℓ=0..m-1} Δuᵀ(k+ℓ) R Δu(k+ℓ)

Q, R: (diagonal) weighting matrices.
Constraints

[Figure: the horizon picture again, with the output bound ymax and input bound umax marked.]

Input magnitude:   umin ≤ u(k+ℓ) ≤ umax
Input rate:        |Δu(k+ℓ)| ≤ Δumax
Output magnitude:  ymin ≤ ỹ(k+i) ≤ ymax
Solve: quadratic program

min over ΔU(k) of  (1/2) ΔUᵀ(k) H ΔU(k) + fᵀ ΔU(k)
subject to  A ΔU(k) ≤ b

H: Hessian matrix;  f: gradient vector;  A: constraint matrix;  b: constraint vector;  ΔU(k): decision variables.

We need to convert the MPC objective and constraints to this standard QP form.
Unconstrained problem

min over ΔU(k) of  (1/2) ΔUᵀ(k) H ΔU(k) + fᵀ ΔU(k)

Taking the gradient with respect to the input and setting it to zero:

H ΔU(k) + f = 0   ⟹   ΔU(k) = -H⁻¹ f
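A minimal numerical sketch of this closed-form move computation, using H = SᵀQ̄S + R̄ and fᵀ = -εᵀQ̄S as derived on the following slides; all numbers are invented for illustration.

```python
# Unconstrained DMC: dU = -H^{-1} f with H = S' Qbar S + Rbar, f = -S' Qbar eps.
import numpy as np

def dmc_unconstrained(S, Qbar, Rbar, eps):
    H = S.T @ Qbar @ S + Rbar
    f = -(S.T @ Qbar @ eps)          # column form of f^T = -eps^T Qbar S
    return np.linalg.solve(H, -f)    # dU(k) = -H^{-1} f

# Toy SISO example (assumed numbers): first-order step response, p = 5, m = 2
s = np.array([1 - 0.5**i for i in range(1, 6)])
S = np.array([[s[i - j] if i >= j else 0.0 for j in range(2)] for i in range(5)])
eps = np.ones(5)                     # setpoint error over the horizon
dU = dmc_unconstrained(S, np.eye(5), 0.1 * np.eye(2), eps)
print(dU)
```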
Objective function in quadratic form

min over ΔU(k) of  Σ_{i=1..p} (yr(k+i) - ỹ(k+i))ᵀ Q (yr(k+i) - ỹ(k+i)) + Σ_{ℓ=0..m-1} Δuᵀ(k+ℓ) R Δu(k+ℓ)

Stacking the setpoint errors [yr(k+1) - ỹ(k+1); ...; yr(k+p) - ỹ(k+p)] and the input moves [Δu(k); ...; Δu(k+m-1)], the objective becomes

(Yr(k+1) - Ỹ(k+1))ᵀ Q̄ (Yr(k+1) - Ỹ(k+1)) + ΔUᵀ(k) R̄ ΔU(k)

where Q̄ = diag(Q, ..., Q) (p blocks) and R̄ = diag(R, ..., R) (m blocks).
Not done yet!

(Yr(k+1) - Ỹ(k+1))ᵀ Q̄ (Yr(k+1) - Ỹ(k+1)) + ΔUᵀ(k) R̄ ΔU(k)

Substituting Ỹ(k+1) = S ΔU(k) + Ŷ0(k+1) + Ip [y(k) - ŷ(k)] and defining the known term

ε(k+1) = Yr(k+1) - Ŷ0(k+1) - Ip [y(k) - ŷ(k)]

this yields

εᵀ(k+1) Q̄ ε(k+1) - 2 εᵀ(k+1) Q̄ S ΔU(k) + ΔUᵀ(k) (Sᵀ Q̄ S + R̄) ΔU(k)

Hessian (a constant matrix): H = Sᵀ Q̄ S + R̄
Gradient vector (must be updated at each time): fᵀ = -εᵀ(k+1) Q̄ S
Constraints in linear inequality form

umin ≤ u(k+ℓ) ≤ umax,   ℓ = 0, ..., m-1
|Δu(k+ℓ)| ≤ Δumax,      ℓ = 0, ..., m-1
ymin ≤ ỹ(k+i) ≤ ymax,   i = 1, ..., p

Each of these is rewritten as A ΔU(k) ≤ b.
Input magnitude constraint

umin ≤ u(k+ℓ) ≤ umax,   ℓ = 0, ..., m-1

Since u(k+ℓ) = u(k-1) + Σ_{i=0..ℓ} Δu(k+i), the two sides become

-u(k-1) - Σ_{i=0..ℓ} Δu(k+i) ≤ -umin
 u(k-1) + Σ_{i=0..ℓ} Δu(k+i) ≤ umax

or, with the block lower-triangular matrix of identity blocks IL = [I 0 ... 0;  I I ... 0;  ...;  I I ... I],

[-IL; IL] ΔU(k) ≤ [-(umin - u(k-1)); ...; -(umin - u(k-1));  umax - u(k-1); ...; umax - u(k-1)]
Input rate constraints

|Δu(k+ℓ)| ≤ Δumax,   ℓ = 0, ..., m-1

is equivalent to

 Δu(k+ℓ) ≤ Δumax
-Δu(k+ℓ) ≤ Δumax

or, with identity blocks,

[I; -I] ΔU(k) ≤ [Δumax; ...; Δumax;  Δumax; ...; Δumax]
Output magnitude constraints

ymin ≤ ỹ(k+i) ≤ ymax,   i = 1, ..., p

is equivalent to ỹ(k+i) ≤ ymax and -ỹ(k+i) ≤ -ymin. Substituting the prediction equation, with Ymax = [ymax; ...; ymax] and Ymin = [ymin; ...; ymin]:

 S ΔU(k) + Ŷ0(k+1) + Ip (y(k) - ŷ(k)) ≤ Ymax
-S ΔU(k) - Ŷ0(k+1) - Ip (y(k) - ŷ(k)) ≤ -Ymin

[S; -S] ΔU(k) ≤ [Ymax - Ŷ0(k+1) - Ip (y(k) - ŷ(k));  -Ymin + Ŷ0(k+1) + Ip (y(k) - ŷ(k))]
In summary, stacking all of the constraint blocks gives A ΔU(k) ≤ b:

[-IL; IL; -I; I; S; -S] ΔU(k) ≤ [-(umin - u(k-1)) (m blocks);  (umax - u(k-1)) (m blocks);  Δumax (m blocks);  Δumax (m blocks);  Ymax - Ŷ0(k+1) - Ip (y(k) - ŷ(k));  -Ymin + Ŷ0(k+1) + Ip (y(k) - ŷ(k))]
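Putting the pieces together, here is a hedged sketch that assembles H, f, A, b for a SISO case with input-magnitude and input-rate constraints and solves the QP. SciPy's SLSQP is used only as a stand-in for a dedicated QP solver, and all numbers are assumed.

```python
# Assemble and solve  min 0.5 dU' H dU + f' dU  s.t.  A dU <= b  (SISO DMC).
import numpy as np
from scipy.optimize import minimize

p, m = 5, 2
s = np.array([1 - 0.5**i for i in range(1, p + 1)])   # step-response coefficients
S = np.array([[s[i - j] if i >= j else 0.0 for j in range(m)] for i in range(p)])
eps = np.ones(p)                                      # known error term eps(k+1)
H = S.T @ S + 0.1 * np.eye(m)                         # Qbar = I, Rbar = 0.1 I
f = -(S.T @ eps)                                      # f^T = -eps^T Qbar S

du_max, u_prev, u_min, u_max = 0.3, 0.0, -1.0, 1.0
IL = np.tril(np.ones((m, m)))                         # lower-triangular of ones
A = np.vstack([-IL, IL, -np.eye(m), np.eye(m)])
b = np.concatenate([-(u_min - u_prev) * np.ones(m),
                    (u_max - u_prev) * np.ones(m),
                    du_max * np.ones(m),
                    du_max * np.ones(m)])

res = minimize(lambda x: 0.5 * x @ H @ x + f @ x, np.zeros(m),
               constraints=[{"type": "ineq", "fun": lambda x: b - A @ x}],
               method="SLSQP")
dU = res.x
print(dU)   # both moves sit at the rate limit 0.3 for these numbers
```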
Solving QP
• Quadratic program: minimization of quadratic function subject to linear inequality constraints.
• QPs are convex and therefore fundamentally tractable.
• Off-the-shelf solvers (e.g., QPSOL, QUADPROG) are available but further customization is desirable (to exploit the structure in the Hessian and constraint matrices)
• Complexity of a QP is a complex function of the dimension/structure of Hessian, as well as the number of constraints.
Common solution methods:
• Active set method
• Interior point method (barrier function)
Real-time implementation

1. Initialization: initialize the memory vector Ŷ0(0) and the reference vector. Set k = 0.
2. Memory update: Ŷ0(k+1) = M Ŷ0(k) + S* Δu(k)
3. Reference vector update.
4. Measurement intake: take in the new measurement y(k) (and measured disturbance d(k), if used).
5. Calculate the gradient vector and the constraint vector.
6. Solve the QP.
7. Implement the input: u(k) = u(k-1) + Δu(k)
8. Go back to step 2 after setting k = k+1.
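One possible rendering of this loop in code, for the unconstrained case (so step 6 reduces to the closed-form solve). The plant is given a 20% gain error so the bias correction of step 4 has something to do; the model, horizons, and weights are all assumed numbers.

```python
# Receding-horizon DMC loop with recursive memory update and bias correction.
import numpy as np

n, p, m = 50, 10, 3
s_model = np.array([1 - 0.5**i for i in range(1, n + 1)])   # model step response
s_plant = 1.2 * s_model                                     # "true" plant: gain mismatch
S = np.array([[s_model[i - j] if i >= j else 0.0 for j in range(m)] for i in range(p)])
H = S.T @ S + 0.1 * np.eye(m)                               # Q = I, R = 0.1 I

ysp = 1.0
Y0 = np.zeros(n)            # 1. initialization: memory [yhat0(k), ..., yhat0(k+n-1)]
du_hist, y = [], 0.0
for k in range(40):
    # 4. measurement intake: simulate the plant response to all past moves
    y = sum(s_plant[j] * du_hist[-1 - j] for j in range(len(du_hist)))
    bias = y - Y0[0]                          # b(k) = y(k) - yhat(k)
    # 5.-6. gradient and (unconstrained) solve: dU = H^{-1} S^T eps
    eps = ysp - (Y0[1:p + 1] + bias)
    dU = np.linalg.solve(H, S.T @ eps)
    du = dU[0]                                # 7. implement only the first move
    du_hist.append(du)
    # 2. memory update for the next step: Y0 <- M Y0 + S* du
    Y0 = np.append(Y0[1:], Y0[-1]) + s_model * du
print(y)   # settles near the setpoint despite the plant/model mismatch
```

The bias term acts like integral action: it is what removes the steady-state offset caused by the assumed gain error.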
458.604 Process Dynamics & Control
Lecture 4: Sampling and Representation of Sampled Signals
Jong Min Lee, Chemical & Biomolecular Engineering
Seoul National University
April 1, 2015
Overview
[Diagram: uk → D/A (zero-order hold) → u(t) → system → y(t) → A/D → yk.]

A computer-oriented mathematical model (or discrete-time model):
• relates uk to yk;
• does not give information on intersample behaviour;
• can be described using a difference equation or a pulse transfer function.
Input-Output Model
The input-output model describes a relationship between the input uk and the output yk. Generally, it takes the form of the following difference equation:

yk = -a1 yk-1 - ... - an yk-n + b1 uk-1 + ... + bm uk-m

With some abuse of notation, the above is written as

(1 + a1 q⁻¹ + ... + an q⁻ⁿ) yk = (b1 q⁻¹ + ... + bm q⁻ᵐ) uk

and, applying the z-transform,

Y(z)/U(z) = (b1 z⁻¹ + b2 z⁻² + ... + bm z⁻ᵐ) / (1 + a1 z⁻¹ + ... + an z⁻ⁿ)

The order of the transfer function is determined by max(n, m).
Denominator: autoregressive terms. Numerator: moving-average terms.
Discrete-Time Pole
Consider the first-order system

yk = a yk-1 + uk-1   →   Y(z)/U(z) = z⁻¹ / (1 - a z⁻¹)

One can expand the above as a power series of z⁻¹ around z⁻¹ = 0:

Y(z)/U(z) = 1/z + a/z² + a²/z³ + ... + a^(N-1)/z^N + truncation error

The obvious convergence (stability) condition is |a| < 1. Note that a is the pole of Y(z)/U(z).
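The stability condition is easy to see by simulation; a small sketch (the pole values are arbitrary examples):

```python
# Simulate y_k = a*y_{k-1} + u_{k-1}: the pole a decides stability; for |a| < 1
# the step response settles at the steady-state gain 1/(1 - a).
def simulate(a, u, y0=0.0):
    y = [y0]
    for k in range(1, len(u) + 1):
        y.append(a * y[-1] + u[k - 1])
    return y

a = 0.8
y = simulate(a, [1.0] * 100)          # unit step input
print(y[-1], 1 / (1 - a))             # both close to 5.0
```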
State-Space Model

A model can also be given in terms of the following matrix difference equation:

xk+1 = Φ xk + Γ uk
yk   = C xk + D uk

xk is called a state vector and stores the effect of past inputs (uk-1, uk-2, ...) on the current and future output. The state variables may or may not have physical meanings.

An equivalent input-output representation can easily be derived by applying the z-transform to the above:

z X(z) = Φ X(z) + Γ U(z),   Y(z) = C X(z) + D U(z)
⟹ Y(z)/U(z) = C (zI - Φ)⁻¹ Γ + D
Input-Output Models from Discretization of Continuous TF (Optional)

[Diagram: uk → D/A (zero-order hold) → G(s) → A/D → yk, which together define the pulse transfer function G(z).]

• The z-transform describes a discrete signal as an "impulse train" when viewed in continuous time.
• A zero-order hold converts the sampled signal to a piecewise-constant signal (a train of pulses).
• Hence, we need to derive a pulse transfer function for the zero-order hold in series with G(s). The basic idea is

Y(z)/U(z) = G(z) = Z{ L⁻¹{ G(s) (1 - e⁻ˢʰ)/s } }
Input-Output Models from Identification

Suppose one is interested in fitting an nth-order transfer function model

Y(z)/U(z) = (b1 z⁻¹ + b2 z⁻² + ... + bm z⁻ᵐ) / (1 + a1 z⁻¹ + a2 z⁻² + ... + an z⁻ⁿ)

In the time domain, this corresponds to

y(k) = -a1 y(k-1) - a2 y(k-2) - ... - an y(k-n) + b1 u(k-1) + b2 u(k-2) + ... + bm u(k-m)

Notice that there is at least one time delay between the input and output due to the presence of the ZOH element. The above is a linear regression model

y(k) = φᵀ(k) θ

where φᵀ(k) = [-y(k-1) ... -y(k-n)  u(k-1) ... u(k-m)] and θᵀ = [a1 ... an  b1 ... bm].
Estimating θ from N sample data points

Given N input-output samples, the solution to the least-squares problem is

θ̂N = (ΦNᵀ ΦN)⁻¹ ΦNᵀ YN,   where ΦN = [φᵀ(1); ...; φᵀ(N)]

Example: given the transfer function

G(z) = (b1 z⁻¹ + b3 z⁻³) / (1 + a1 z⁻¹ + a2 z⁻²)

write the difference equation corresponding to G(z) and form the Φ matrix suitable for estimating the parameters using the LS method.
Linear Regression Solution

y(k) + a1 y(k-1) + a2 y(k-2) = b1 u(k-1) + b3 u(k-3)

The ΦN matrix corresponding to any difference equation is formed by first forming the YN vector, whose starting index is set by the term with the largest sample distance between y(k) and any of y(k-n) and u(k-m). In this example that term is u(k-3); starting at k = 4 avoids negative sample indices when writing the Φ matrix. Thus

YN = [y(4); y(5); ...; y(N)]   ⟹   ΦN = [-y(3) -y(2) u(3) u(1);  -y(4) -y(3) u(4) u(2);  ...;  -y(N-1) -y(N-2) u(N-1) u(N-3)]
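A numerical check of this construction: simulate noise-free data from known parameters and recover them with the least-squares formula. The parameter values and the input sequence are assumed for illustration.

```python
# LS estimation for y(k) + a1 y(k-1) + a2 y(k-2) = b1 u(k-1) + b3 u(k-3):
# build Phi and Y exactly as on the slide and solve theta = (Phi'Phi)^{-1} Phi'Y.
import numpy as np

rng = np.random.default_rng(0)
a1, a2, b1, b3 = -0.9, 0.2, 1.0, 0.5      # assumed "true" parameters (stable poles)
N = 200
u = rng.choice([-1.0, 1.0], size=N + 1)   # random binary excitation
y = np.zeros(N + 1)
for k in range(3, N + 1):
    y[k] = -a1 * y[k - 1] - a2 * y[k - 2] + b1 * u[k - 1] + b3 * u[k - 3]

# rows start at k = 4 to avoid negative sample indices (largest lag is u(k-3))
Phi = np.array([[-y[k - 1], -y[k - 2], u[k - 1], u[k - 3]] for k in range(4, N + 1)])
Y = y[4:N + 1]
theta = np.linalg.lstsq(Phi, Y, rcond=None)[0]
print(theta)   # approximately [a1, a2, b1, b3]
```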
State-Space Models from Discretization
Suppose we are given a model described by a system of linear differential equations:

dx/dt = A x + B u
y = C x + D u

where x is an n-dimensional vector. Suppose that (1) a zero-order hold is used and (2) sampling is synchronized for all inputs and outputs. Then, treating t = kh as the initial time and xk ≜ x(kh) as an initial condition, we have

x(t) = e^{A(t-kh)} xk + ∫ from kh to t of e^{A(t-τ)} B u(τ) dτ
Discretization of Continuous SS Model

Evaluating the above at t = kh + h, with the fact that u(t) = uk for kh ≤ t < kh + h (due to the zero-order-hold assumption), we obtain

xk+1 = e^{A(kh+h-kh)} xk + ∫ from kh to kh+h of e^{A(kh+h-τ)} B u(τ) dτ
     = e^{Ah} xk + (∫ from 0 to h of e^{As} ds) B uk    [s = kh + h - τ]

Now we can write the propagation of the variables from one sample time to the next as

xk+1 = Φ xk + Γ uk
yk   = C xk + D uk

where Φ = e^{Ah} and Γ = (∫ from 0 to h of e^{As} ds) B.
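Φ and Γ can be computed together with the standard augmented-matrix trick, since expm([[A, B], [0, 0]] h) = [[Φ, Γ], [0, I]]. A sketch with an assumed first-order example:

```python
# Zero-order-hold discretization: Phi = e^{Ah}, Gamma = (int_0^h e^{As} ds) B,
# obtained from one matrix exponential of the augmented matrix [[A, B], [0, 0]].
import numpy as np
from scipy.linalg import expm

def c2d(A, B, h):
    n, m = A.shape[0], B.shape[1]
    M = np.zeros((n + m, n + m))
    M[:n, :n], M[:n, n:] = A, B
    E = expm(M * h)
    return E[:n, :n], E[:n, n:]          # Phi, Gamma

# first-order example dx/dt = -x + u with h = 0.1 (assumed numbers)
A, B = np.array([[-1.0]]), np.array([[1.0]])
Phi, Gamma = c2d(A, B, 0.1)
print(Phi, Gamma)   # Phi = e^{-0.1} ~ 0.9048, Gamma = 1 - e^{-0.1} ~ 0.0952
```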
Delays Can Be Easily Incorporated into the Discrete Model

dx/dt = A x + B u(t - θ)

Case I: 0 < θ ≤ h. Recall

x(kh + h) = e^{Ah} x(kh) + ∫ from kh to kh+h of e^{A(kh+h-τ)} B u(τ - θ) dτ

Note that

u(τ - θ) = uk    for kh + θ ≤ τ < kh + h
u(τ - θ) = uk-1  for kh ≤ τ < kh + θ

Substituting the above and making the change of variable s = kh + h - τ:

xk+1 = e^{Ah} xk + (e^{A(h-θ)} ∫ from 0 to θ of e^{As} ds B) uk-1 + (∫ from 0 to h-θ of e^{As} ds B) uk
     = Φ xk + Γ1 uk-1 + Γ0 uk
Discrete SS Model with Delays (Cont'd)

We can put the above in the standard form as follows:

[xk+1; uk] = [Φ Γ1;  0 0] [xk; uk-1] + [Γ0; I] uk
yk = [C 0] [xk; uk-1] + D uk

Hence the state vector at the kth time consists of xk and uk-1. This makes sense, since with a delay (≤ h) the effect of uk-1 has not been fully stored in xk.

Case II: θ = (d - 1) h + θ′ where 0 < θ′ ≤ h and d ≥ 1. Note that for d = 1 we recover the previous case. As before,

x(kh + h) = e^{Ah} x(kh) + ∫ from kh to kh+h of e^{A(kh+h-τ)} B u(τ - θ) dτ
But this time

u(τ - θ) = uk-d+1  for kh + θ′ ≤ τ < kh + h
u(τ - θ) = uk-d    for kh ≤ τ < kh + θ′

Hence,

xk+1 = e^{Ah} xk + (e^{A(h-θ′)} ∫ from 0 to θ′ of e^{As} ds B) uk-d + (∫ from 0 to h-θ′ of e^{As} ds B) uk-d+1
     = Φ xk + Γ1 uk-d + Γ0 uk-d+1
We can put the above in the standard form as follows:

[xk+1; uk-d+1; ...; uk-1; uk] = [Φ Γ1 Γ0 0 ... 0;  0 0 I 0 ... 0;  ...;  0 0 ... 0 I;  0 0 ... 0 0] [xk; uk-d; ...; uk-2; uk-1] + [0; 0; ...; 0; I] uk

yk = [C 0 0 ... 0] [xk; uk-d; ...; uk-2; uk-1] + D uk

Note that the state vector at the kth time must include uk-1, ..., uk-d, since the effect of the past d inputs has not been stored in xk.
State-Space Models from Identification
One can also obtain a discrete state-space model from data. This can be done by:

• using subspace ID methods, which directly give a model in the discrete state-space form; or
• identifying a transfer function model and then performing a "realization" on it (i.e., finding an I/O-wise equivalent state-space model representation).
458.604 Process Dynamics & Control
Lecture 5: System Identification: Introduction
Jong Min Lee, Chemical & Biomolecular Engineering
Seoul National University
April 20, 2015
References
Ljung, L., System Identification: Theory for the User, Prentice Hall.
Söderström, T. and P. Stoica, System Identification, Prentice Hall.
Box, G. E. P. and G. M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, 1994.
First-Principles Modeling

• Usually involves fewer measurements; requires experimentation only for the estimation of unknown parameters.
• Provides information about the internal state of the process.
• Promotes fundamental understanding of the internal workings of the process.
• Requires fairly accurate and complete process knowledge.
• Not useful for poorly understood and/or complex processes.
• Naturally produces both linear and nonlinear models.

System Identification

• Requires extensive measurements.
• Provides information only about a portion of the process.
• Treats the process like a "black box".
• Requires no such detailed knowledge.
• Quite often proves to be the only alternative for poorly understood/complex processes.
• Requires special methods for nonlinear models.
Objective of Sys ID

From the I/O data set {y(k), u(k), k = 1, ..., N}, identify

y(k) = G(q) u(k) + H(q) ε(k)

where G(q) is the plant transfer function (deterministic part), H(q) is the disturbance transfer function (stochastic, noise part), and ε is white noise; or

x(k+1) = A x(k) + B u(k) + K ε(k)
y(k) = C x(k) + ε(k)

System identification at a more general level includes other tasks such as data generation, data pretreatment, and model validation.
Plant vs. Noise Model

[Diagram: u(k) → plant model G(q) → y(k).]

• According to the figure above, the output can be exactly calculated once the input is known.
• In most cases, this is unrealistic: there are always signals beyond our control that also affect the system.
• Assume that such effects can be lumped into an additive term w(k) at the output:

y(k) = Σ_{τ=1..∞} g(τ) u(k - τ) + w(k)

Note: g(τ) is the impulse response, which is obtained from a unit pulse input.
Then, we have

[Diagram: u(k) → plant model G(q) → + → y(k), with the disturbance w(k) entering at the summing junction.]

• The value of the disturbance w(k) is not known beforehand, so we employ a probabilistic framework to describe future disturbances.
• We assume that w(k) is driven by a white-noise sequence ε(k) for simplicity.
[Diagram: u(k) → plant model G(q), and ε(k) (white noise) → disturbance model H(q); their outputs are summed to give y(k).]
Parametric vs. Nonparametric Methods

1. Parametric methods: select the best one among a confined set of possible models; finite-dimensional parameters. Ex) transfer function (matrix) of given order, "finite" impulse response identification.
2. Nonparametric methods: time domain (step response, impulse response, correlation analysis); frequency domain (Fourier analysis, spectral analysis).

End objective: obtain a model providing a good (multi-step) prediction with the intended feedback control loop in place.
Model Structure for Parametric Identification

Standard form (SISO):

y(k) = G(q, θ) u(k) + H(q, θ) ε(k)

ε(k): a white-noise sequence; H(q): a stable and stably invertible transfer function.

Differenced form: if the process mean shifts continuously or from time to time, use

Δy(k) = G(q, θ) Δu(k) + H(q, θ) ε(k)
ARX (Auto Regressive eXogenous)
y(k) + a1 y(k-1) + ... + an y(k-n) = b1 u(k-1) + ... + bm u(k-m) + ε(k)

G(q, θ) = B(q)/A(q) ≜ (b1 q⁻¹ + ... + bm q⁻ᵐ) / (1 + a1 q⁻¹ + ... + an q⁻ⁿ)
H(q, θ) = 1/A(q) ≜ 1 / (1 + a1 q⁻¹ + ... + an q⁻ⁿ)

For the ARX structure, use of a very high order model is often necessary.
ARMAX (Auto Regressive Moving Average eXogenous)

y(k) + a1 y(k-1) + ... + an y(k-n) = b1 u(k-1) + ... + bm u(k-m) + ε(k) + c1 ε(k-1) + ... + cℓ ε(k-ℓ)

H(q) = C(q)/A(q) ≜ (1 + c1 q⁻¹ + ... + cℓ q⁻ℓ) / (1 + a1 q⁻¹ + ... + an q⁻ⁿ)
OE (Output Error) Structure

ỹ(k) + a1 ỹ(k-1) + ... + an ỹ(k-n) = b1 u(k-1) + ... + bm u(k-m)
y(k) = ỹ(k) + ε(k)

ỹ: deterministic output

G(q, θ) = B(q)/A(q) and H(q) = 1

The OE structure can also encompass the case where the noise model is set a priori:

y(k) = G(q, θ) u(k) + H(q) ε(k)   ⟺   (1/H(q)) y(k) = G(q, θ) (1/H(q)) u(k) + ε(k)
Orthogonal Expansion Model
A special kind of OE structure where

G(q) = Σ_{i=1..n} bi Bi(q)

and the Bi(q) are orthogonal basis functions. For example:

Bi(q) = q⁻ⁱ  ⇒  Finite Impulse Response (FIR) model
Bi(q) = (√(1-α²)/(q-α)) ((1-αq)/(q-α))^{i-1}  ⇒  Laguerre model
Other Structures
Box-Jenkins structure:

y(k) = (B(q)/A(q)) u(k) + (C(q)/D(q)) ε(k)

ARIMAX structure (Auto Regressive Integrated Moving Average eXogenous):

y(k) = (B(q)/A(q)) u(k) + (1/(1 - q⁻¹)) (C(q)/A(q)) ε(k)
Nonparametric Model: Impulse Response
u = {1, 0, 0, ...}   →   y = {0, H1, H2, ..., Hn, Hn+1, 0, ...}

[Figure: a unit pulse of one sampling period h in u; the output traces the pulse-response coefficients H1, H2, H3, ...]
Nonparametric Model: Step Response
u = {1, 1, 1, ...}   →   y = {0, S1, S2, S3, ...}

[Figure: a unit step in u; the output traces the step-response coefficients S1, S2, S3, ...]
Major Steps
Gathering of data through a plant test
Data conditioning and pretreatment
Transition of data to a model: model structure selection andparameterization plus parameter estimation
Validation
The System Identification Loop
[Flowchart: prior knowledge informs the experiment design, the choice of model set, and the choice of criterion of fit; together with the data, these feed "calculate model"; the model is then validated: if not OK, revise; if OK, use it.]
Step Testing

Procedure:

1. Assume operation at steady state, with the controlled variable (CV) y(t) = y0 and the manipulated variable (MV) u(t) = u0 for t < t0.
2. Make a step change in u of a specified magnitude Δu: u(t) = u0 + Δu for t ≥ t0.
3. Measure y(t) at regular intervals: y(k) = y(t0 + kh) for k = 1, 2, ..., N, where h is the sampling interval and Nh is approximately the time required to reach steady state.
4. Calculate the step-response coefficients from the data:

S(k) = (y(k) - y0) / Δu   for k = 1, 2, ..., N
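The procedure can be sketched in a few lines; here the "measured" response is generated from an assumed first-order process instead of a real plant.

```python
# Step-test processing: step the input by du at t0, sample y every h, and form
# S(k) = (y(k) - y0) / du. The first-order process (K, tau) is assumed.
import numpy as np

tau, K, h, du, y0, N = 5.0, 2.0, 0.5, 0.1, 1.0, 40
t = h * np.arange(1, N + 1)
y = y0 + K * du * (1 - np.exp(-t / tau))   # sampled response to the step
S = (y - y0) / du                          # step-response coefficients
print(S[-1])   # approaches the steady-state gain K = 2.0
```

Note that Nh = 20 here is roughly 4τ, consistent with the N = 30~40 guideline discussed below.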
Discussions

1. Choice of sampling period h: for modelling, the best h is one such that N = 30 ~ 40.
   Ex: if G(s) = K e^{-θs} / (τs + 1), then the settling time ≈ 4τ + θ. Therefore h ≈ (4τ + θ)/N = (4τ + θ)/40 = 0.1τ + 0.025θ.
   It may be adjusted depending on control/operation objectives.

2. Choice of step size Δu:
   Too small: may not produce enough output change; low signal-to-noise ratio.
   Too big: may shift the process to an undesirable condition; nonlinearity may be induced.
   Trial and error is needed to determine the optimum step size.
Discussions on Step Testing (Cont'd)
3. Choice of number of experiments
   Averaging the results of multiple experiments reduces the impact of disturbances on the calculated S(k)'s.
   Multiple experiments can be used to check model accuracy by cross-validation. (Data sets for identification ↔ data set for validation)
4. An appropriate method to detect steady state is required.
5. While the steady-state (low-frequency) characteristics are accurately identified, high-frequency dynamics may be inaccurately characterized.
Procedure for Pulse Testing (Impulse Response)

1. Steady operation at y0 and u0
2. Send a pulse of size δu lasting for 1 sampling period.
3. Calculate the pulse-response coefficients:
   H(k) = (y(k) − y0) / δu for k = 1, · · · , N
Discussions on Pulse Testing

1. Select h and N as for step testing.
2. Usually need δu ≫ ∆u for an adequate S/N ratio.
3. Multiple experiments are recommended, for the same reason as in step testing.
4. An appropriate method to detect steady state is required.
5. Theoretically, a pulse is a perfect (unbiased) excitation for linear systems.
Input Design
Why use test inputs other than a step or pulse?
  Pure step or pulse tests usually take too long and are impossible for some inputs.
  More system excitation produces more information.
  Completely random inputs (e.g., RBS, PRBS) excite all frequencies with equal energy.
Type of Inputs
Random Binary Signal (RBS) or Pseudo-Random Binary Signal (PRBS)

[Figure: an RBS/PRBS input u switching between −1 and +1 over 100 time units, alongside a random-noise input u over the same window.]
PRBS
The size of u(t) is fixed; it switches between two levels.
The choice of whether to switch or stay is random: flip a coin.
Sequence design choices are:
  Levels to switch between
  Base length of time between switches (period)
  Duration of experiment
Trade-off between the size of the PRBS and the duration of the experiment:
  Larger size and longer duration give better estimates.
  The power of this signal is that you can use a small size (unnoticeable) for a long time to get a good result.
Base switching period:
  Reflects the process dynamics
  Set from a "dominant" time constant
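A minimal PRBS generator along these lines, sketched in Python (the level values, switching probability, and seed are illustrative assumptions, not part of the lecture):

```python
import random

def prbs(n, period, levels=(-1.0, 1.0), p_switch=0.5, seed=0):
    """Pseudo-random binary signal: hold each level for `period` samples,
    then flip a coin to decide whether to switch level. The levels,
    switching probability, and seed are illustrative choices."""
    rng = random.Random(seed)
    u, level = [], levels[0]
    for k in range(n):
        if k > 0 and k % period == 0 and rng.random() < p_switch:
            level = levels[1] if level == levels[0] else levels[0]
        u.append(level)
    return u

u = prbs(200, period=5)             # 200 samples, base switching period of 5
assert set(u) <= {-1.0, 1.0}
assert all(u[k] == u[(k // 5) * 5] for k in range(200))  # level held per period
```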
PRBS: Distillation Column Example
Time to steady state is 30–45 mins (τ = 10–15 mins)
Length of experiment: 6 hours
  24–36 switches (not very many)
Levels
  Reflux rate
  Steam rate
Sequence design: may want to start with a step of 3–4τ
  Choose the starting level (+1 or −1)
  At the next switch time, flip a coin.
Frequency Range of Input Excitation
1. Based on the step response, obtain τp.
2. Calculate the corner frequency ωCF = 1/τp [rad/time].
3. Choose a sampling interval h based on the earlier discussion.
4. Nyquist frequency: 1/(2h) [cycles/time] = π/h [rad/time].
5. Choose the lower bound for the input frequencies as zero (in order to obtain a good estimate of the gain).
6. Choose the upper bound for the input frequencies as 2.5 ∼ 3 × ωCF (expressed as a fraction of ωN in MATLAB).
7. In MATLAB:
   u = idinput(2000, 'rbs', [0 0.01], [-1 1]);
Model Types and Transfer Function
Model Types
  Output Error (least general)
  ARX
  ARMAX
  Box-Jenkins (most general)

Process Transfer Function

Gp(q) = (B(q)/F(q)) q^{−(d+1)}

Zeros: roots of B(q)
Poles: roots of F(q)
Time delay: d. Note that an extra one-sample delay is naturally introduced by the zero-order hold and sampling; d is the pure time delay.
Disturbance Modelling: Stochastic Processes

Parametric
  Autoregressive (denominator)
  Moving average (numerator)
  AutoRegressive and Moving Average (ARMA):

  w(k) = (C(q)/D(q)) ε(k)

ARIMA (AutoRegressive Integrated Moving Average) model:

  w(k) = (C(q)/D(q)) · (1/(1 − q⁻¹)ᵈ) ε(k)
Least Squares Identification
Recall (from Lecture 5) that the least-squares estimate of the parameters is given as

θ̂N = (ΦNᵀ ΦN)⁻¹ ΦNᵀ YN,  where ΦN = [ϕᵀ(1); · · · ; ϕᵀ(N)]

for the model

y(k) = ϕᵀ(k) θ

where ϕᵀ(k) = [−y(k − 1) · · · −y(k − n)  u(k − 1) · · · u(k − m)] and θᵀ = [a1 · · · an  b1 · · · bm].
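As an illustrative sketch (Python instead of MATLAB, with a made-up first-order system and noise-free data), the normal equations θ̂ = (ΦᵀΦ)⁻¹ΦᵀY recover the ARX parameters exactly:

```python
# Least-squares fit of a first-order ARX model  y(k) = -a1*y(k-1) + b1*u(k-1),
# with phi(k) = [-y(k-1), u(k-1)] and theta = [a1, b1]. The system and data
# are made up (noise-free); the 2x2 normal equations are solved in closed form.
import random

a1_true, b1_true = -0.5, 1.0       # i.e. y(k) = 0.5 y(k-1) + 1.0 u(k-1)
rng = random.Random(1)
u = [rng.uniform(-1, 1) for _ in range(200)]
y = [0.0]
for k in range(1, 200):
    y.append(-a1_true * y[k - 1] + b1_true * u[k - 1])

# Phi^T Phi entries (s..) and Phi^T Y entries (g..), accumulated term by term
s11 = sum(y[k - 1] ** 2 for k in range(1, 200))
s12 = sum(-y[k - 1] * u[k - 1] for k in range(1, 200))
s22 = sum(u[k - 1] ** 2 for k in range(1, 200))
g1 = sum(-y[k - 1] * y[k] for k in range(1, 200))
g2 = sum(u[k - 1] * y[k] for k in range(1, 200))

det = s11 * s22 - s12 * s12        # invert the 2x2 matrix explicitly
a1_hat = (s22 * g1 - s12 * g2) / det
b1_hat = (-s12 * g1 + s11 * g2) / det
assert abs(a1_hat - a1_true) < 1e-8 and abs(b1_hat - b1_true) < 1e-8
```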
458.604 Process Dynamics & Control
Lecture 6b: Disturbance Modelling
Jong Min LeeChemical & Biomolecular Engineering
Seoul National University
May 9, 2015
Disturbance Modelling
Why?
  Predict the disturbance's effect on the output so that it can be eliminated.
Deterministic vs. stochastic disturbances
  Deterministic: steps, pulses, sinusoids
  Random: white noise, colored noise, integrated white noise, etc.
Stochastic processes are convenient vehicles to describe them.
  Stochastic disturbances and noise are almost always present.
  Most disturbances, even deterministic ones, are unpredictable in terms of size, direction, and time of occurrence.
Linear Stochastic Models

Important: In linear systems, it is not necessary to identify and model actual physical disturbance sources. It is sufficient to model their overall effect on the output.

w: physical disturbance variables, or signals representing the collective effect of disturbances on the output
Driven by white noise ε
Transfer function model:

w(k) = H(q) ε(k)

[Block diagram: white noise ε(k) entering the stochastic model H(q), producing w(k).]
Linear Stochastic Models: Example

Ambient temperature / pressure at Edmonton International Airport
Power / water consumption in Edmonton
Stock market
Any "unknown" or "indescribable" disturbances of a process unit

A stochastic process may look like:

[Figure: a time series y exhibiting random behaviour around trends, superimposed on gross trends.]
General Structure
w(k) = [(1 + θ1 q⁻¹ + θ2 q⁻² + · · · + θm q⁻ᵐ) / (1 + ϕ1 q⁻¹ + ϕ2 q⁻² + · · · + ϕn q⁻ⁿ)] · [1 / (1 − q⁻¹)ᵈ] · ε(k)

(moving-average component over autoregressive component, times the integrating component)

Each part gives a different relationship between the current value of the stochastic output w(k) and its past values (w(k − 1), w(k − 2), · · · ) or the input itself.
Our goal is to identify each part: identify how the output is related to itself and to the input ε.
Time Series

A sequence of observations taken sequentially over time.
If the variable has randomness, the sequence is a stochastic process.

w(k) = w(k − 1) + ε(k)

[Figure: three realizations of the random walk above, plotted against time k.]
Description of Stochastic Processes
Two relevant questions:
1. Does the probability of an outcome (or realization) of w(k + τ) depend on the outcome of w(k)?
   Are w(k) and w(k + τ) independent?
2. Does the distribution of w(k), or the joint distribution of {w(k), w(k + τ)}, depend on k?
   Does the mean change with time?
   Are the covariances cov{w(1), w(5)} and cov{w(11), w(15)} different?

In the last lecture, we learned:
  Autocovariance
  Weakly stationary process
In the Context of Our Applications...
We assume "weakly stationary" processes:
  constant means
  constant variances
  autocovariances depend only on lags

Autocovariance:

Rw(τ) = E{(w(k) − w̄)(w(k + τ) − w̄)ᵀ}
Autocorrelation
Autocovariance has scale.
Normalized quantity: autocorrelation

ρw(τ) = cov{w(k), w(k + τ)} / (√var(w(k + τ)) √var(w(k)))
      = E{(w(k) − w̄)(w(k + τ) − w̄)ᵀ} / σw²
      = Rw(τ) / Rw(0)

Note: because of stationarity, var(w(k + τ)) = var(w(k)) = σw²
Autocorrelation & Autocovariance

1. Variance is simply the autocovariance at lag 0:
   σw² = Rw(0)
2. Autocorrelation and autocovariance are symmetric in the lag τ:
   Rw(τ) = Rw(−τ), ρw(τ) = ρw(−τ)
3. Autocorrelation is bounded and normalized:
   −1 ≤ ρw(τ) ≤ 1
4. Autocorrelation and autocovariance are parameters summarizing the probability behaviour of the stochastic process w(k).

Sample autocorrelation / autocovariance: computed using sample data.
Disturbance Example 1

w(k) = ε(k) + 0.75 ε(k − 1)

where ε ∼ N(0, σε²).

Autocorrelations?
  Lag 0: ρw(0) = 1. The current output is always perfectly correlated with itself.
  Lag 1: ρw(1) = 0.48
  Lag > 1: ρw(τ > 1) = 0

[Figure: stem plot of ρw(τ) versus τ, nonzero only at lags 0 and 1.]
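The lag-1 value quoted above follows from the MA(1) autocovariances: Rw(0) = (1 + θ²)σε² and Rw(1) = θσε², so ρw(1) = θ/(1 + θ²). A one-line check:

```python
# MA(1): w(k) = eps(k) + theta*eps(k-1) with theta = 0.75.
# Rw(0) = (1 + theta^2) s2, Rw(1) = theta * s2, Rw(tau > 1) = 0,
# so rho(1) = theta / (1 + theta^2).
theta = 0.75
rho1 = theta / (1 + theta ** 2)
assert round(rho1, 2) == 0.48       # the value quoted on the slide
```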
Disturbance Example 1 (cont'd)

Nonzero values up to lag 1: a lag-1 moving-average disturbance, MA(1).

[Figure: time response of the MA(1) disturbance; notice the local "trends".]
Disturbance Example 2

Dependence on past output:

w(k + 1) = 0.6 w(k) + ε(k + 1)

Autocorrelations
  lag 0: ρw(0) = 1
  lag 1: ρw(1) = 0.6
  lag 2: ρw(2) = (0.6)² = 0.36
  lag k: ρw(k) = (0.6)ᵏ

[Figure: stem plot of ρw(τ) decaying geometrically with τ.]
Disturbance Example 2: Autoregressive

The disturbance w is weakly stationary:
  It is a sum of stationary stochastic processes; an infinite sum of the white-noise sequence ε(k)'s.
  The AR coefficient is 0.6 → convergent
  The mean is zero and the variance is constant

[Figure: time response of the AR(1) disturbance; notice the local trends.]
Two examples were...
Example 1 is a moving-average disturbance:

w(k) = ε(k) + 0.75 ε(k − 1) = (1 + 0.75 q⁻¹) ε(k)

Example 2 is an autoregressive disturbance:

w(k) = 0.6 w(k − 1) + ε(k) = (1 / (1 − 0.6 q⁻¹)) ε(k)
Detecting Model Structure from Data
Given time series data of a stochastic process
Examine the autocorrelation plot:
  If a sharp cut-off at lag k is detected, the disturbance is a moving-average disturbance of order k.
  If a gradual decline is observed, the disturbance contains an autoregressive component.
  Long tails indicate either a higher-order autoregressive component or a pole near 1.
  If the autocorrelations alternate between positive and negative values, one or more of the roots is negative.
Estimating Autocovariances from Data

Sample autocovariance function:

R̂w(τ) = (1/N) Σ_{k=1}^{N−τ} (w(k) − w̄)(w(k + τ) − w̄)ᵀ

  N is the number of data points.
  R̂w(0) is the sample variance of w(k).
  When R̂w(τ) is computed, confidence limits should be considered.

Sample autocorrelation function:

ρ̂w(τ) = R̂w(τ) / R̂w(0)

Confidence limits for the autocorrelation are derived by examining how variability propagates through the calculations.
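A direct Python sketch of the sample estimators above (the MA(1) test series, its length, and the seed are illustrative assumptions):

```python
import random

def acf(w, max_lag):
    """Sample autocovariance R(tau) = (1/N) sum_{k=1}^{N-tau} (w_k - wbar)(w_{k+tau} - wbar)
    and sample autocorrelation rho(tau) = R(tau)/R(0), for a scalar series."""
    N = len(w)
    wbar = sum(w) / N
    d = [x - wbar for x in w]
    R = [sum(d[k] * d[k + tau] for k in range(N - tau)) / N
         for tau in range(max_lag + 1)]
    return R, [r / R[0] for r in R]

# Check on a simulated MA(1) series w(k) = eps(k) + 0.75 eps(k-1):
# theory gives rho(1) = 0.75 / (1 + 0.75^2) = 0.48 and rho(tau > 1) = 0.
rng = random.Random(0)
eps = [rng.gauss(0, 1) for _ in range(20000)]
w = [eps[k] + 0.75 * eps[k - 1] for k in range(1, 20000)]
R, rho = acf(w, 3)
assert abs(rho[1] - 0.48) < 0.05 and abs(rho[2]) < 0.05
```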
Example 1: Estimated Autocorrelation Plot

Sharp cut-off: moving average

[Figure: estimated autocorrelation of the moving-average disturbance process, cutting off sharply after lag 1.]
Example 2: Estimated Autocorrelation Plot

Gradual decay: autoregressive

[Figure: estimated autocorrelation of the autoregressive disturbance process, decaying gradually with lag.]
Partial Autocorrelation

It is difficult to identify the order of the AR component because of the gradual decay of the autocorrelation for AR structures.

Q. How do we find the order of the AR component?

A. Partial autocorrelation:
  Compute the autocorrelation between w(k) and w(k + τ) after taking into account the dependence on the intermediate values w(k + 1), w(k + 2), · · · , w(k + τ − 1).
  The partial autocorrelation at lag τ is the autocorrelation between w(k) and w(k + τ) that is not accounted for by lags 1 through τ − 1.
  The partial autocorrelation of an AR(τ) process is zero at lag τ + 1 and greater.

If the sample autocorrelation plot indicates that an AR model is appropriate, the sample partial autocorrelation is examined to identify the order of the AR model.
How to Use the Partial Autocorrelation
For an autoregressive process of order p, a sharp cut-off will be observed after lag p, beyond which the partial autocorrelations go to zero.
  There is no more explicit dependence beyond lag p.
The partial autocorrelation plot for moving-average processes will exhibit a decay.
The autocorrelation and partial autocorrelation behaviours are dual for autoregressive and moving-average processes.
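One common way to compute partial autocorrelations from the autocorrelations is the Durbin-Levinson recursion; the lecture does not prescribe an algorithm, so this Python sketch is an illustrative assumption. For an AR(1) with ρ(τ) = 0.6^τ, the PACF should cut off after lag 1:

```python
def pacf_from_rho(rho):
    """Partial autocorrelations phi_{k,k} computed from autocorrelations
    rho[0..m] (rho[0] must be 1) via the Durbin-Levinson recursion."""
    m = len(rho) - 1
    pacf = [1.0]                     # lag 0, by convention
    phi_prev = []                    # phi_{k-1, j} coefficients, 0-based
    for k in range(1, m + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf

# AR(1) with coefficient 0.6: rho(tau) = 0.6**tau, PACF cuts off after lag 1
rho = [0.6 ** t for t in range(4)]
p = pacf_from_rho(rho)
assert abs(p[1] - 0.6) < 1e-12 and abs(p[2]) < 1e-12 and abs(p[3]) < 1e-12
```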
SysID Example
458.604, Jong Min Lee
Autocorrelation Function (ACF)
• Explains the dependency of samples
  – Correlation between x(k) and x(k + τ)
• What can we infer?
  – Impulse-type ACF: white noise
  – Sharp cut-off after n lags: MA(n)
  – Tails off over several lags: AR
  – "Very slow" decrease: integrated process
    • e.g., η(k + 1) = η(k) + e(k + 1)
Partial Autocorrelation Function (PACF)
• Reveals what cannot be explained by the ACF
• Useful for AR(n) processes
• What we can infer:
  – Impulse-type PACF: white-noise signal
  – Sharp cut-off after n lags: AR(n)
  – Tails off over several lags: MA
Identification of Linear I/O Models
[Block diagram: u(k) → G(q) → yp(k); e(k) or η(k) → H(q) → yd(k); the two are summed to give y(k).]

Given u(k) and y(k), determine a plant (deterministic) model G and a disturbance (noise, stochastic) model H.
Box-Jenkins Structure
y(k) = (B(q)/F(q)) u(k − nk) + (C(q)/D(q)) e(k)

B(q) = b1 + b2 q⁻¹ + · · · + b_nb q⁻⁽ⁿᵇ⁻¹⁾
F(q) = 1 + f1 q⁻¹ + · · · + f_nf q⁻ⁿᶠ

doc bj

C = 1, D = F: ARX
C = 1, D = 1: OE
Output Error (OE) Models
y(k) = (B(q)/F(q)) u(k − nk) + e(k)

• Yields the best unbiased estimate of the plant model
• The disturbance model is not considered.
• Useful for identifying plant models in BJ

oe(iddata, [nb nf nk])
OE Models

• Assumes that the disturbance (i.e., the prediction error) is white.
• The best OE model is one that keeps the correlations between the residuals and the input within a confidence bound.
• The ACF of the residuals need not be white, because the OE assumption may not be true.
• If the ACF of the residuals is white, the system possesses an OE structure.
• If the ACF of the residuals is non-white, fit an ARMA model to the residuals to obtain a disturbance model.
General Procedure for Identification
[Diagram: candidate model structures: ARX, ARMAX, BJ (OE + ARMA).]
Identification Example
Example
[Figure: Lab 2 example, input/output data y and u over the first 300 samples.]

• 4000 samples
• Unknown process order, time delay, and structure.
• Detrending: detrend(data, 'constant')
• Divide the data set into training and validation sets
• cra: impulse-response coefficients and time delay (nk = 2)

[Figure: impulse-response estimate over 20 lags; the first significant coefficients are about 0.2987 and 0.4685.]
Step Response
[Figure: step response from u1 to y1, rising with a delay to a steady value of about 1.7.]

>> z = iddata(y, u);
>> step(z)

G(s) ≅ 1.7 e⁻ˢ / (3.75 s + 1)

• Discrete-time pole: exp(−(1/3.75)×1) = 0.766
• No imaginary pole: the discrete-time poles are real and positive
• The step response is used to compare with that of the identified model.
• y starts increasing from t = 2
Fitting an ARX Model
A(q) y(k) = B(q) u(k − nk) + e(k)

• Fits the plant and disturbance models simultaneously
• Use the identified time delay for nk
• nk = pure delay + z.o.h. delay (1)
• Start with low orders
• Relevant functions: arxstruc, selstruc

model = arx(data, [na nb nk])
ARX Models
[Figure: residual analysis. resid(mARX112, ztd): the residual autocorrelation and the cross-correlation between input u1 and the residuals show significant values outside the bounds. resid(mARX442, ztd): both functions stay within the confidence bounds.]
ARX 442
[Figure: measured output vs. simulated output of mARX442 on the validation data, fit: 89.68%; and the step response of the ARX 442 model compared with the process.]
OE for Plant Model (Gp)
y(k) = (b1 / (1 + f1 q⁻¹)) u(k − 2) + e(k)

resid(OE112, ztd)

[Figure: residual analysis for the OE [1 1 2] model. ACF of the residuals: OK; we can model the residuals later with disturbance modelling (AR, MA, ...). Cross-correlation between u and the residuals: not acceptable! u and y are still correlated → further room to improve.]
OE Model: [2 2 2]

• If the correlation is not satisfactory, increase the numerator or denominator order:
  [nb nf nk]: [1 1 2] → [1 2 2] → [2 1 2] → [2 2 2]

[Figure: residual ACF and input-residual cross-correlation for the OE [2 2 2] model, both within the confidence bounds.]

Acceptable! The ACF of the residuals is almost white. (Not a concern here; however, there may not be a need to construct a disturbance model.)
OE222 Model

>> present(mOE222)
Discrete-time IDPOLY model: y(t) = [B(q)/F(q)]u(t) + e(t)
B(q) = 0.2981 (+-0.003374) q^-2 + 0.2034 (+-0.008552) q^-3
F(q) = 1 - 0.8972 (+-0.012) q^-1 + 0.1985 (+-0.008682) q^-2

Gain: G(1) = (0.2981 + 0.2034)/(1 − 0.8972 + 0.1985) ≈ 1.665

Poles: 0.5010 and 0.3962 (>> roots([1 -0.8972 0.1985]))
Comments

• Use the training data for checking residuals (resid)
• Use the validation data for checking prediction performance (compare)
• Compare the model's response on the validation data set:
  – compare(ztd, mOE222)
  – step(ztd, mOE222, 'r*-'); legend('process', 'OEmodel')
  – g = spa(ztd); bode(g, mOE222, 'r-.')
[Figure: Bode plots (amplitude and phase, from u1 to y1) of the spectral estimate of the process vs. the OE222 model; measured output vs. simulated model output on the validation data, mOE222 fit: 89.72%; and the step responses of the process and mOE222.]

OK... we have finished modelling the plant (G) using OE.

Are we really done?
Disturbance Modelling: Gd?

yd = y − yp (residuals from the OE222 model)

1. Plot the ACF and PACF of the residuals (yd).
2. Determine whether the disturbance process (Gd) is MA, AR, or ARMA.
3. Use the MATLAB function armax.

doc armax

If the data has no input channels and just one output channel (that is, it is a time series), then orders = [na nc] and armax calculates an ARMA model for the time series:

A(q) y(k) = C(q) e(k)
Identifying Gd
[Figure: auto-correlation and partial-autocorrelation functions of the OE222 residuals over 20 lags.]

MA(1) or AR(1)? AR is more general; choose AR(1): [na nc] = [1 0]

>> errOE = pe(ztd, mOE222);
>> autocf(errOE, 20, 0); pautocf(errOE, 20);
>> mDist = armax(errOE.y, [1 0]);
>> present(mDist)
Discrete-time IDPOLY model: A(q)y(t) = e(t)
A(q) = 1 - 0.1323 (+-0.02332) q^-1
The residuals of yd from Gd

errmDist = pe(mDist, errOE);
figure; [R3, CI3, NR3] = autocf(errmDist.y, 20, 0);

[Figure: auto-correlation function of the final residuals over 20 lags.]
Putting Together: BJ Model

• You can refine G and H by putting them together in BJ (see the manual).
• In this case, the OE and ARMA modelling steps provide an initial estimate for the BJ structure.
• One can also use BJ directly by specifying the model orders.
458.604 Process Dynamics & Control
Lecture 7: Linear Quadratic ControlDeterministic Case
Jong Min Lee, Chemical & Biomolecular Engineering
Seoul National University
May 19, 2015
Outline
Basic problem setup
Deterministic system
Stochastic system
Basic Problem Setup

Linear deterministic system:

x(k + 1) = A x(k) + B u(k)   (1)
y(k) = C x(k)   (2)

We consider a time-invariant system for simplicity.
For a linear state-feedback controller

u(k) = −L(k) x(k)   (3)

the closed-loop response is

x(k + 1) = (A − B L(k)) x(k).

Stability: the state-feedback controller (3) stabilizes the system if all the eigenvalues of (A − BL) lie within the unit disk.
Objective of LQ

The system visits a sequence of states x(0), x(1), . . . , x(p); the desired sequence of states is x̄(0), x̄(1), . . . , x̄(p).
Without loss of generality, the desired trajectory x̄ can be set to the origin.

Objective function:

min Σ_{k=0}^{p−1} [xᵀ(k) Q x(k) + uᵀ(k) R u(k)] + xᵀ(p) Qt x(p)   (4)

  Q and R are symmetric positive definite; Qt is positive semi-definite.
  Q assigns relative importance to the errors in the various states.
  R accounts for the cost of implementing input moves.
  If p = ∞, it is an infinite-horizon problem.
Open-Loop Control vs. Feedback Control

Optimal open-loop control problem:
  Find the optimal sequence u(0), . . . , u(k) for a given (as a function of the) distribution of x(0).
Optimal feedback control problem:
  Find the optimal feedback law u(k) = f(x(k)) or u(k) = f(y(k), y(k − 1), . . .).
For completely deterministic systems, the two should provide the same performance.

State Feedback vs. Output Feedback

u(k) = f(x(k)) ⇒ state feedback
u(k) = F(y(k)) ⇒ output feedback

F would be a dynamic operator in general.
Least Squares Solution: Open-Loop Optimal Feedback Control

Using (1) recursively gives

x(k + 1) = A x(k) + B u(k) = A (A x(k − 1) + B u(k − 1)) + B u(k)
         = A² x(k − 1) + B u(k) + A B u(k − 1)
         = · · ·
         = A^(k+1) x(0) + (B u(k) + A B u(k − 1) + · · · + Aᵏ B u(0))

Thus, stacking X = [x(0); x(1); x(2); · · · ; x(p)] and U = [u(0); u(1); u(2); · · · ; u(p − 1)], we can write

X = Sx x(0) + Su U

with

Sx = [I; A; A²; · · · ; Aᵖ]

Su = [0          0          · · ·  0
      B          0          · · ·  0
      AB         B          · · ·  0
      ⋮          ⋮          ⋱     ⋮
      Aᵖ⁻¹B      Aᵖ⁻²B      · · ·  B]
System equation:

X = Sx x(0) + Su U

Quadratic cost function:

V0(x(0); U) = Σ_{k=0}^{p−1} [xᵀ(k) Q x(k) + uᵀ(k) R u(k)] + xᵀ(p) Qt x(p)
            = Xᵀ Γx X + Uᵀ Γu U

where Γx = blockdiag{Q, . . . , Q, Qt} and Γu = blockdiag{R, . . . , R}.

Optimal cost:

V0(x(0)) = min_U { xᵀ(0) Sxᵀ Γx Sx x(0) + Uᵀ [Suᵀ Γx Su + Γu] U + 2 xᵀ(0) Sxᵀ Γx Su U }

Optimal solution:

U* = −H⁻¹ g = −[Suᵀ Γx Su + Γu]⁻¹ Suᵀ Γx Sx x(0)
OLOFC

U* = −[Suᵀ Γx Su + Γu]⁻¹ Suᵀ Γx Sx x(0)   (5)

V0*(x(0)) = xᵀ(0) [Sxᵀ Γx Sx − Sxᵀ Γx Su (Suᵀ Γx Su + Γu)⁻¹ Suᵀ Γx Sx] x(0)

Open-loop optimal control finds a sequence u*(0), u*(1), . . . , u*(p − 1) for a given x(0).
  Recursively use (5) as in Receding Horizon Control.
  Not efficient computationally.
  Not generalizable to the stochastic case.
Closed-Loop Optimal Feedback Control
One obtains the optimal control move as a function of the state at each time.

Solved using Dynamic Programming.

More elegant, and gives the closed-loop optimal solution.
Dynamic Programming
At stage p − 1, Bellman's equation is

V_{p−1}(x(p − 1)) = min_{u(p−1)} { xᵀ(p − 1) Q x(p − 1) + uᵀ(p − 1) R u(p − 1) + xᵀ(p) S(p) x(p) }   (6)

where S(p) = Qt.
Noting that x(p) = A x(p − 1) + B u(p − 1), we get

V_{p−1}(x(p − 1)) = min_{u(p−1)} { xᵀ(p − 1) (Aᵀ S(p) A + Q) x(p − 1) + 2 xᵀ(p − 1) Aᵀ S(p) B u(p − 1) + uᵀ(p − 1) (Bᵀ S(p) B + R) u(p − 1) }
As before, the optimal solution can be obtained as

u*(p − 1) = −(Bᵀ S(p) B + R)⁻¹ Bᵀ S(p) A x(p − 1) = −L(p − 1) x(p − 1)

Substitution of u*(p − 1) gives

V_{p−1}(x(p − 1)) = xᵀ(p − 1) S(p − 1) x(p − 1)

where S(p − 1) is given by the following Riccati equation:

S(p − 1) = Aᵀ S(p) A + Q − Aᵀ S(p) B (Bᵀ S(p) B + R)⁻¹ Bᵀ S(p) A
Stage p − 2:

V_{p−2}(x(p − 2)) = min_{u(p−2)} { xᵀ(p − 2) Q x(p − 2) + uᵀ(p − 2) R u(p − 2) + V_{p−1}(x(p − 1)) }
                 = min_{u(p−2)} { xᵀ(p − 2) Q x(p − 2) + uᵀ(p − 2) R u(p − 2) + xᵀ(p − 1) S(p − 1) x(p − 1) }

This equation has the same form as (6). The optimal solution is

u*(p − 2) = −(Bᵀ S(p − 1) B + R)⁻¹ Bᵀ S(p − 1) A x(p − 2) = −L(p − 2) x(p − 2)
Generalization
Successively solving for the cost-to-go Vk(x(k)), we get

u*(k) = −L(k) x(k), for k = p − 1, . . . , 0

where

L(k) = (Bᵀ S(k + 1) B + R)⁻¹ Bᵀ S(k + 1) A
S(k) = Aᵀ S(k + 1) A + Q − Aᵀ S(k + 1) B (Bᵀ S(k + 1) B + R)⁻¹ Bᵀ S(k + 1) A   (7)

Note that (7) is the familiar Riccati Difference Equation that we encounter in Kalman filtering as well.
Comments
For the deterministic case, OLOFC and CLOFC yield the same solution.
The optimal p-stage cost is V0(x(0)) = xᵀ(0) S(0) x(0).
The receding-horizon solution to the optimization is computationally demanding.
Dynamic programming leads to the optimal control solution as an explicit linear function, u(k) = −L(k) x(k).
The recursive solution of the Riccati equation, required in DP, is straightforward.
Note that the results hold only for the unconstrained system.
Extension of DP to Infinite Horizon
Assuming the RDE solution converges to S∞,

u(k) = −(Bᵀ S∞ B + R)⁻¹ Bᵀ S∞ A x(k) = −L∞ x(k)

S∞ = Aᵀ S∞ A + Q − Aᵀ S∞ B (Bᵀ S∞ B + R)⁻¹ Bᵀ S∞ A   (8)

Note that (8) is known as the Algebraic Riccati Equation.
  The RDE (7) converges to S∞ in the infinite-horizon case if (A, B) is a stabilizable pair.
  The converged solution gives a stable controller if (Q^{1/2}, A) is a detectable pair.
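A scalar sketch of iterating the RDE (7) to the ARE solution (8) and forming L∞, with made-up values A = 1.2, B = Q = R = 1 (Python, with scalars standing in for the matrices):

```python
# Scalar sketch: iterate the Riccati difference equation to convergence and
# form the steady-state gain L_inf. A, B, Q, R are made-up scalar values.
A, B, Q, R = 1.2, 1.0, 1.0, 1.0   # open loop unstable: |A| > 1

S = Q                              # terminal weight S(p) = Qt (here Qt = Q)
for _ in range(200):               # backward recursion (7), run to convergence
    S = A * S * A + Q - (A * S * B) ** 2 / (B * S * B + R)

L = B * S * A / (B * S * B + R)    # converged feedback gain L_inf
assert abs(A - B * L) < 1.0        # closed loop A - B*L_inf is stable
# converged S satisfies the algebraic Riccati equation (8)
assert abs(S - (A * S * A + Q - (A * S * B) ** 2 / (B * S * B + R))) < 1e-9
```

Even though the open loop is unstable, the converged gain places the closed-loop pole inside the unit disk, as the stabilizability/detectability conditions above guarantee.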
Extension of OLOCP to Infinite Horizon
Q. Direct extension of the OLOCP to the infinite horizon seems impossible because of the infinite number of inputs to optimize. What can we do?

A. For certain choices of Qt, the finite-horizon problem

min_{u(0), . . . , u(p−1)} V0(x(0); U)

can be made equivalent to the infinite-horizon problem.
OLOCP: Equivalence with the Infinite-Horizon Problem

Option I: We can choose Qt such that

xᵀ(k + p) Qt x(k + p) = min_{u(k+p), . . .} { Σ_{i=p}^{∞} xᵀ(k + i) Q x(k + i) + uᵀ(k + i) R u(k + i) }

It is clear that we can compute such a Qt by solving the ARE

Qt = Aᵀ Qt A + Q − Aᵀ Qt B (Bᵀ Qt B + R)⁻¹ Bᵀ Qt A

With this choice of Qt, the optimal solution of the p-horizon problem is equivalent to that of the ∞-horizon one.
Option II: We may also choose Qt such that

xᵀ(k + p) Qt x(k + p) = Σ_{i=p}^{∞} xᵀ(k + i) Q x(k + i)

The above equation is under the assumption that no control action is taken beyond the horizon k + p.
Then the autonomous system x(k + 1) = A x(k) describes the evolution of the state.
This assumption is meaningful only when the system is stable; otherwise the cost is infinite.
We can show that the above Qt is a solution to the Lyapunov equation

Qt = Q + Aᵀ Qt A
Option II: Lyapunov Function

A generalized energy function:
  Zero at the equilibrium point and positive elsewhere.
  The equilibrium is stable if the Lyapunov function Vl decreases along the trajectories of the system, i.e., ∆Vl(x) (or V̇l(x)) is negative.

Note that the system equation from time k + p is x(k + i + 1) = A x(k + i).

Vl(x(k + p)) = xᵀ(k + p) Qt x(k + p)
∆Vl(x(k + p)) = Vl(x(k + p + 1)) − Vl(x(k + p))
             = Vl(A x(k + p)) − Vl(x(k + p))
             = xᵀ(k + p) (Aᵀ Qt A − Qt) x(k + p) := −xᵀ(k + p) P x(k + p)

Usually, Qt is found by specifying P as a positive definite matrix.
In this case, P = Q from the definition of the terminal cost:

xᵀ(k + p) Qt x(k + p) = xᵀ(k + p) Q x(k + p) + Σ_{i=p+1}^{∞} xᵀ(k + i) Q x(k + i)
                      = xᵀ(k + p) Q x(k + p) + xᵀ(k + p + 1) Qt x(k + p + 1)
                      = xᵀ(k + p) Q x(k + p) + xᵀ(k + p) Aᵀ Qt A x(k + p)

This gives the discrete-time Lyapunov equation

Qt = Aᵀ Qt A + Q
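For a stable scalar system the Lyapunov equation above has the closed form Qt = q/(1 − a²), which equals the infinite tail of the uncontrolled cost; a quick numeric check (a and q are made-up values):

```python
# Stable scalar system x(k+1) = a x(k): the Lyapunov equation Qt = a*Qt*a + q
# has the closed form Qt = q / (1 - a^2), the infinite tail of the state cost.
a, q = 0.8, 1.0                    # assumed values, |a| < 1
Qt = q / (1 - a * a)
assert abs(Qt - (a * Qt * a + q)) < 1e-12

# same number as the (truncated) infinite sum  sum_i a^(2i) * q
tail = sum(a ** (2 * i) * q for i in range(2000))
assert abs(Qt - tail) < 1e-6
```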
OLOCP: Equivalence with the Infinite-Horizon Problem

Option III: Solve the finite-horizon problem with x(k + p) = 0 as a constraint.
  Note that Option II cannot be used if the system is unstable.
  Option III can then be used for unstable systems.
Extension to the Output Feedback Case

So far, we have assumed that full state feedback is available. In the case of output feedback, the control actions are based on state estimates x̂(k).

Observer:

x̂(k) = A x̂(k − 1) + B u(k − 1) + K [y(k) − C (A x̂(k − 1) + B u(k − 1))]

Controller: u(k) = −L x̂(k)

If we define xe(k) := x(k) − x̂(k), we get

[x(k + 1); xe(k + 1)] = [A − BL, BL; 0, A − KCA] [x(k); xe(k)]

Separation Principle: since the above equation is one-way coupled, the system is guaranteed to be stable if the controller and the filter are each guaranteed to be stable independently.
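A scalar illustration of the separation principle (the gains L and K below are arbitrary choices that separately stabilize A − BL and A − KCA; all values are assumptions):

```python
# Scalar illustration: the observer-error dynamics A - K*C*A and the
# controller dynamics A - B*L can be stabilized independently; because the
# combined closed loop is block triangular, both states then converge.
A, B, C = 1.1, 1.0, 1.0
L, K = 0.5, 0.6                    # hypothetical controller / observer gains
assert abs(A - B * L) < 1 and abs(A - K * C * A) < 1

x, xe = 2.0, 1.0                   # state and estimation error xe = x - xhat
for _ in range(100):
    # xe evolves on its own; x is driven by its own dynamics plus the
    # coupling to xe through the feedback u = -L*xhat
    x, xe = (A - B * L) * x + B * L * xe, (A - K * C * A) * xe
assert abs(x) < 1e-3 and abs(xe) < 1e-3
```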