micro December 6, 2010

Chapter One

Noise

xxx [?].

In this chapter we introduce the typical disturbances that are encountered in stochastic systems, including shot noise, Brownian motion, Johnson-Nyquist noise, and amplifier noise. Models for these disturbances can in many cases be obtained from first principles. The development of the theory has been an important part of physics and applied mathematics, advanced by giants like Einstein and Wiener. The mathematical treatment is based on the notion of stochastic processes. A full treatment requires sophisticated mathematics. Calculation of the standard deviation can however be done in a reasonably straightforward way.

1.1 Fluctuations in Physical Systems

In all control problems it is important to model the disturbances. Because of their small dimensions, micro systems are very sensitive to phenomena that can normally be neglected for macroscopic systems. Thermal motion of air molecules creates Brownian motion, which disturbs small masses and inertias. Sensing small distances requires small currents, which are susceptible to thermal resistor noise. Sensitive electric amplifiers also generate noise, so-called Johnson-Nyquist noise. Quantum tunneling is a highly sensitive effect that can be used to measure small distances. Since tunneling is intrinsically a discrete phenomenon, it is also noisy. Many of these phenomena can fortunately be well modeled from first principles, and noise analysis of microsystems can therefore conveniently be done analytically and numerically. The signal levels are often very low when working with micro systems. In a laboratory there may typically be significant disturbances from many electromagnetic fields. To avoid additional disturbances it is important to be careful with grounding and shielding of cables. Even with good precautions there are typically differences between conditions during day and night, depending on what equipment in the lab is used. A value of using theoretical models is that they help to trace the root causes of the disturbances.

Current is a flow of electrons; on a short time scale there are fluctuations because of the discrete nature of the electrons. These fluctuations are called shot noise or Schottky noise after [?], who first discovered the phenomenon. Other noise sources are the motion generated by thermal motion of molecules and thermal motion of electrons in conductors. Yet other sources are generated by conductivity variations caused by material imperfections in resistors and semiconductors.


Statistical mechanics is the fundamental underpinning of noise in micro systems. A fundamental result is due to Boltzmann, who considered a collection of particles in thermal equilibrium. He found that all particles have the average energy (1/2) kB T, where kB = 1.38 × 10^−23 J/K is Boltzmann's constant (Boltzmann's equipartition law). For more general systems the energy is distributed so that each state (degree of freedom) has the energy (1/2) kB T.

Brownian motion is caused by the thermal motion of molecules. A simple way to estimate the effect is to use Boltzmann's equipartition law, which says that the average energy per degree of freedom is (1/2) kB T. A mass m with velocity v has the energy mv²/2 and Boltzmann's law gives

(1/2) m v² = (1/2) kB T,   σv = √(kB T / m),

where v² denotes the mean square velocity and σv is the standard deviation of the velocity. For a mass of 1 µg and T = 300 K we get a standard deviation of the velocity of σv = 2.3 µm/s. Velocities of this magnitude are clearly visible in precision instruments. A mass suspended elastically with spring coefficient k has the energy kx²/2 and Boltzmann's law gives

(1/2) k x² = (1/2) kB T,   σx = √(kB T / k),

where x is the position deviation and σx the standard deviation of the fluctuations in x. For k = 1 N/m and T = 300 K we get σx = 0.64 Å, a deviation that is clearly visible when a tunneling sensor is used.
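The two standard deviations above follow directly from the equipartition formulas; a quick sketch (kB, T, m and k are the values used in the text):

```python
import math

kB = 1.38e-23      # Boltzmann's constant [J/K]
T = 300.0          # absolute temperature [K]

# Velocity fluctuations of a free mass of 1 ug = 1e-9 kg
m = 1e-9
sigma_v = math.sqrt(kB * T / m)

# Position fluctuations of a mass on a spring with k = 1 N/m
k = 1.0
sigma_x = math.sqrt(kB * T / k)

print(sigma_v)     # ~2e-6 m/s, i.e. a couple of um/s
print(sigma_x)     # ~6.4e-11 m = 0.64 Angstrom
```

This gives σx = 0.64 Å exactly as quoted, and a velocity standard deviation of about 2 µm/s, the same order of magnitude as the value quoted above.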

Noise in electronic circuits is generated because of thermal motion of the electrons. The effect was studied by Schottky in 1918, who conjectured two physical mechanisms, shot noise and thermal noise. Johnson at AT&T made very careful measurements of thermal noise. He found that the thermal noise in a resistor was proportional to resistance R and temperature T. His colleague Nyquist gave a nice physical explanation by combining statistical thermodynamics with transmission line theory. In particular, Nyquist found that the mean square voltage fluctuation when current flows through a resistor is

V² = 4 kB T R ∆f,

where kB is Boltzmann's constant, T [K] the absolute temperature and ∆f [Hz] the bandwidth.

Nyquist's result, which is now known as a fluctuation-dissipation theorem because it shows that thermal noise is associated with energy dissipation, is derived in the sidebar.

Example 1.1 Nyquist's Formula
Consider a long transmission line, with no energy dissipation, which has resistors R at each end. Let the inductance and capacitance per unit length be L and C, and assume that R = √(L/C), which implies that there is perfect impedance match. Assume that the resistors have the same temperature and are in thermal equilibrium.


Energy is then transmitted between the resistors. The transmitted energy can be trapped by short circuiting the line, which gives standing waves with perfect reflection. By computing the energy stored in the line we can obtain the energy transmitted by the resistors. To compute the trapped energy we introduce the length of the line ℓ and the transmission velocity v. The standing wave has frequencies f = nv/(2ℓ), where n is an integer. Provided that ℓ is sufficiently large the frequencies can be arbitrarily dense. The number of modes in a frequency interval ∆f is thus n = 2ℓ∆f/v. According to Boltzmann's equipartition law the energy of the standing waves is then kB T n = 2ℓ kB T ∆f / v. Since the time to pass the cable is ℓ/v, and this energy was supplied from both ends, the average power delivered by each resistor is thus kB T ∆f. Let the average voltage generated by thermal noise in a resistor be V; the current is then I = V/(2R) and the power delivered to the matched resistor at the other end is R I² = V²/(4R). Equating this with the transmitted power gives

V² = 4 kB T R ∆f.
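As a numerical illustration of Nyquist's formula (the resistance and bandwidth below are assumed example values, not from the text):

```python
import math

kB = 1.38e-23   # Boltzmann's constant [J/K]

def johnson_rms(R, T, df):
    """rms voltage from Nyquist's formula V^2 = 4 kB T R df."""
    return math.sqrt(4 * kB * T * R * df)

# Assumed example: a 1 Mohm resistor at T = 300 K over a 10 kHz bandwidth.
v_rms = johnson_rms(R=1e6, T=300.0, df=1e4)
print(v_rms)   # on the order of 1e-5 V, i.e. about 13 uV
```

Even a plain resistor thus contributes microvolt-level noise at these impedances and bandwidths, which matters for the low signal levels of micro systems.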

1.2 Models of Random Fluctuations

Stochastic processes are the natural mathematical tool to model random fluctuations. There is a rich theory for stochastic processes. A full treatment does unfortunately require quite sophisticated mathematics. Fortunately, by mastering a few concepts, it is possible to develop a working knowledge that allows us to calculate mean values and standard deviations of the fluctuations. In this section we try to develop this knowledge.

Random Variables

A probability space Ω is a triple (S, P, µ) where S is a set, P is a collection of subsets of S and µ is a measure on the subsets. We use ω to denote an element of S. In our applications S will be the real numbers, and P is the set of intervals and the sets obtained by taking a countable number of intersections and unions of intervals. The measure µ is a measure on P that is additive for disjoint sets. The elements of P are called events. The measure µ can be given by a nondecreasing function F, called the probability distribution function, defined by

F(ξ) = P{x ≤ ξ}.

F(ξ) is thus the probability that the random variable x is less than or equal to ξ, or the measure of the set {ω | x(ω) ≤ ξ}. The function F has the properties F(−∞) = 0 and F(∞) = 1. If the function F is differentiable its derivative f(x) = F′(x) is called the probability density function.

A normal or Gaussian random variable has the probability density function

f(x) = 1/(σ√(2π)) · exp(−(x − m)²/(2σ²)),

Page 4: Micro 01

micro December 6, 2010

4 CHAPTER 1. NOISE

where m is the mean value, σ the standard deviation and σ² the variance. We will also be interested in random variables that are vector valued, x ∈ Rⁿ.

Since we are primarily interested in fluctuations in a physical system we will frequently calculate the mean value m, which is given by

m = Ex = ∫ x(ω) P(dω) = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} ξ f(ξ1, ..., ξn) dξ1 ··· dξn = ∫_{−∞}^{∞} ξ dF,   (1.1)

and the fluctuations, which are given by the covariance

cov(x, y) = E (x(ω) − mx)(y(ω) − my)^T = ∫ (x(ω) − mx)(y(ω) − my)^T dP(ω),   (1.2)

where mx = Ex and my = Ey.

We will also use cov(x) = cov(x, x) as a shorthand notation for the covariance of a vector. When x is a scalar we also use var(x) = cov(x). The mean value is thus the vector m ∈ Rⁿ and the covariance R is a symmetric, nonnegative n×n matrix

m = (m1, m2, ..., mn)^T = Ex,

R = [ r11  r12  ...  r1n
      r12  r22  ...  r2n
      ...
      r1n  r2n  ...  rnn ].

The multivariable normal process is characterized by the probability density

f(x) = 1/((2π)^(n/2) (det R)^(1/2)) · exp(−(1/2)(x − m)^T R⁻¹ (x − m)).
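A minimal sketch of this density in code (plain numpy; the one-dimensional sanity check against the scalar Gaussian is our own):

```python
import numpy as np

def mvn_pdf(x, m, R):
    """Multivariable normal density with mean m and covariance R."""
    n = len(m)
    d = x - m
    norm = (2 * np.pi) ** (n / 2) * np.sqrt(np.linalg.det(R))
    return float(np.exp(-0.5 * d @ np.linalg.solve(R, d)) / norm)

# For n = 1 the formula must reduce to the scalar Gaussian density.
m = np.array([0.0])
R = np.array([[2.0]])                      # variance sigma^2 = 2
x = np.array([1.0])
scalar = np.exp(-1.0 / (2 * 2.0)) / np.sqrt(2 * np.pi * 2.0)
print(mvn_pdf(x, m, R), scalar)            # the two values agree
```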

Stochastic Processes

A stochastic process can be thought of as a function of two variables x(t, ω), which takes values in Rⁿ; the variable t ∈ T represents time and the variable ω ∈ Ω belongs to a probability space. For fixed ω = ω0 we can think of x(t, ω0) as a time function R → Rⁿ, and for fixed t = t0, x(t0, ω) is a random variable. We can also say that a stochastic process is a family of random variables x(t), indexed by t ∈ T.

The stochastic process is completely specified if the probability distribution of x(t1, ω), x(t2, ω), ..., x(tn, ω) is specified. Assuming that x ∈ R, a probability distribution for the multi-dimensional random variable

x(t1), x(t2), ..., x(tk)

can be assigned by the probability distribution function

F(ξ1, ξ2, ..., ξk; t1, t2, ..., tk) = P{x(t1) ≤ ξ1, ..., x(tk) ≤ ξk}.

The function F should be symmetric in all pairs (ξi, ti) and it should also have the property

F(ξ1, ..., ξk−1; t1, t2, ..., tk−1) = lim_{ξk→∞} F(ξ1, ..., ξk; t1, t2, ..., tk).

We can think of a stochastic process as time functions that are generated by some stochastic mechanism. To simplify the notation we will often suppress the stochastic argument ω. Since it is only required to specify the probability distribution at a finite number of time instances, there are some events that we cannot assign probabilities to, for example the event that the random process is less than a given value for all times.

Stationary Processes

A stochastic process x(t), t ∈ T is time-invariant or stationary if the probability distributions are invariant to time shifts. The distribution of

x(t1), x(t2), ..., x(tk)

is then identical to the distribution of

x(t1 + τ), x(t2 + τ), ..., x(tk + τ).

The process is weakly stationary if the first and second moments are invariant to translation of the time.

A stationary stochastic process is ergodic if the ensemble average and the time average are the same, i.e.

Ex = ∫_Ω x(t, ω) P(dω) = lim_{T→∞} (1/(2T)) ∫_{−T}^{T} x(t, ω) dt.

The stochastic process x(t), t ∈ T is singular if there exists a linear operator L such that

P{ω; L x ≠ 0} = 0.

The process x(t, ω) = a(ω), where a is a random variable, is a singular process, because we have dx/dt = 0 for all ω. Notice that this process is not ergodic. Since each realization is constant the sample average is also constant, but it is different for each realization.

Mean Values and Covariances

Consider the stochastic process x(t), t ∈ T. Assume that the probability distributions are such that the first and second moments exist. The function

m(t) = Ex(t) = ∫_Ω x(t, ω) P(dω) = ∫_{−∞}^{∞} ξ dF(ξ, t)

is called the mean value function, and the function

r(s, t) = cov(x(s), x(t)) = E [x(s) − m(s)][x(t) − m(t)]^T
        = ∫_Ω [x(s, ω) − m(s)][x(t, ω) − m(t)]^T P(dω)
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} [ξ1 − m(s)][ξ2 − m(t)]^T dF(ξ1, ξ2; s, t)

is called the covariance function.


If the stochastic process is stationary then the mean value function is constant and the covariance function only depends on the difference t − s.

Gaussian Processes

The Gaussian process is particularly simple because the probability distribution can be given explicitly. A stochastic process x(t), t ∈ T is normal or Gaussian if the joint distribution of

x(t1), x(t2), ..., x(tk)

is normal for all k and all ti ∈ T. Such a process is completely characterized by its mean and covariance functions.

Now consider the Gaussian stochastic process x(t), t ∈ T. The joint distribution of

x(t1), x(t2), ..., x(tk)

is characterized by the density function

f(ξ) = 1/((2π)^(k/2) (det R)^(1/2)) · exp(−(1/2)(ξ − m)^T R⁻¹ (ξ − m)),

where

m = (m(t1), m(t2), ..., m(tk))^T = Ex,

R = [ r(t1, t1)  r(t1, t2)  ...  r(t1, tk)
      r(t2, t1)  r(t2, t2)  ...  r(t2, tk)
      ...
      r(tk, t1)  r(tk, t2)  ...  r(tk, tk) ].

Markov Processes

The Markov process is another process where the probability distributions have a simple characterization. Let ti and t be elements of the index set T such that t1 < t2 < ... < tk < t. The stochastic process x(t), t ∈ T is a Markov process if

P{x(t) ≤ ξ | x(t1), x(t2), ..., x(tk)} = P{x(t) ≤ ξ | x(tk)},

where P{· | x(tk)} is the conditional distribution given x(tk). Since the probability distribution of future states is completely given by x(tk), it follows that the Markov process is a generalization of state models to stochastic systems.

It is convenient to introduce two distributions: the initial distribution

F(ξ1, t1) = P{x(t1) ≤ ξ1},

and the transition probability

F(ξt, t | ξs, s) = P{x(t) ≤ ξt | x(s) = ξs}.

If these distributions are known the joint distribution of

x(t1), x(t2), ..., x(tk)


is given by

F(ξ1, ξ2, ..., ξk; t1, t2, ..., tk) = F(ξk, tk | ξk−1, tk−1) ··· F(ξ2, t2 | ξ1, t1) F(ξ1, t1).

The process is thus completely specified by the initial distribution and the transition probability.

Spectral Density and Covariance Function

A stationary stochastic process which takes real values can be characterized by its spectral density and its covariance function. Assuming that the mean value is zero, the covariance function is defined as

r(τ) = E x(t + τ) x(t).   (1.3)

The value r(0) = Ex² is the variance of the signal. The function r(τ)/r(0) gives the correlation between values of the function that are separated by τ in time.

The covariance function of a stationary process is symmetric, r(−t) = r(t), and it has other interesting properties. Let x(t) be a stochastic process and let f be a continuous function. The quantity

E (∫_{−∞}^{∞} f(t) x(t) dt)² = ∫∫_{−∞}^{∞} f(s) f(t) E(x(t) x(s)) ds dt
                            = ∫∫_{−∞}^{∞} f(s) f(t) r(s − t) ds dt ≥ 0   (1.4)

is non-negative. A function with these properties can be represented as

r(t) = ∫_{−∞}^{∞} e^{iωt} dF(ω),

where F(ω) is a nondecreasing function called the spectral distribution function of the stochastic process. This function can be decomposed into three components

F(ω) = Fa(ω) + Fd(ω) + Fs(ω),

where Fa has a continuous derivative, which corresponds to a continuous spectrum. The term Fd is piece-wise constant and corresponds to a discrete spectrum with isolated frequencies. The term Fs is continuous and constant almost everywhere.

A process with continuous spectrum has a differentiable spectral distribution function whose derivative ϕ(ω) = F′(ω) is called the spectral density function. It is related to the covariance function through the relations

ϕ(ω) = (1/(2π)) ∫_{−∞}^{∞} e^{−iωt} r(t) dt,   r(t) = ∫_{−∞}^{∞} e^{iωt} ϕ(ω) dω.   (1.5)

It follows from (1.5) that the spectral density and the covariance function are Fourier transform pairs. We illustrate by an example.


Figure 1.1: A spectral analyzer.

Example 1.2 Covariance Function and Spectral Density
A process with the covariance function

r(t) = a e^{−a|t|}

has the spectral density

ϕ(ω) = (1/(2π)) ∫_{−∞}^{∞} a e^{−a|t|} e^{−iωt} dt = a²/(π(a² + ω²)).

Notice that the variance of the signal is var(x) = r(0) = a and that ϕ(0) = 1/π. ∇
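The transform pair in Example 1.2 can be checked numerically; the sketch below evaluates the forward integral of (1.5) on a truncated grid (the grid limits and the value a = 2 are arbitrary illustrative choices):

```python
import numpy as np

a = 2.0
t = np.linspace(-40.0, 40.0, 800001)       # wide grid: exp(-a|t|) has decayed
dt = t[1] - t[0]
r = a * np.exp(-a * np.abs(t))             # covariance function r(t)

def phi_numeric(w):
    # phi(w) = (1/2pi) int e^{-iwt} r(t) dt; r is even, so the cosine part suffices
    return (r * np.cos(w * t)).sum() * dt / (2 * np.pi)

for w in (0.0, 1.0, 3.0):
    exact = a**2 / (np.pi * (a**2 + w**2))
    print(w, phi_numeric(w), exact)        # numeric and exact values agree
```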

The spectral density function has useful physical interpretations. The area under the spectral density is the total energy of the signal, and the spectral density itself tells how the energy of the signal is distributed over frequency. To see this we make the thought experiment of sending the process through an ideal band-pass filter which passes signals in the frequency range ω1 < |ω| ≤ ω2. The variance of the filtered signal is

Ey² = ∫_{−ω2}^{−ω1} ϕ(ω) dω + ∫_{ω1}^{ω2} ϕ(ω) dω = 2 ∫_{ω1}^{ω2} ϕ(ω) dω.

Spectra and covariance functions can be determined by recording realizations of the process and using standard software. There are also instruments (spectral analyzers) that determine the spectral density directly from signals, see Figure 1.1.

Problems with Units

In the above analysis we have used the frequency variable ω, which ranges from −∞ to ∞, with the unit rad/s to characterize frequency. One reason for this is that the covariance function and the spectral density are then a Fourier transform pair. However, many instruments use the frequency variable f, which ranges from 0 to ∞, with the unit Hz. There are also other combinations. The result is that there is often confusion about units and factors of 2 and π. In purely theoretical work it is customary to use ϕ(ω) with −∞ < ω < ∞, but in much practical work S(f) with only positive frequencies 0 ≤ f < ∞ is preferred. We will therefore use both notations. With either of the definitions

ϕ(ω) = (1/(2π)) ∫_{−∞}^{∞} e^{−iωt} r(t) dt,   r(t) = ∫_{−∞}^{∞} e^{iωt} ϕ(ω) dω,

S(f) = 4 ∫_{0}^{∞} r(t) cos(2πft) dt,   r(t) = ∫_{0}^{∞} S(f) cos(2πft) df,

the area under the spectral density represents the mean square fluctuations. We will also use the units rad/s and Hz to distinguish ϕ and S. The spectral density has the dimension power/frequency. Some instruments use amplitude/√frequency instead. It is therefore very important to always pay attention to the units and the factors 2 and π.

White Noise

By analogy to optics, a signal with constant spectral density is called white noise. White noise is not physically realizable since it has infinite variance (the area under the spectral density is infinite). A realizable signal called bandlimited white noise is a signal where the spectral density is constant, ϕ(ω) = ϕ0, over a limited frequency range, say |ω| < ω0, and zero outside this interval, hence

ϕ(ω) = ϕ0 for |ω| < ω0,   0 otherwise.   (1.6)

Such a signal has the covariance function

r(t) =∫ ω1

−ω1

ϕ0e−iωtdω = 2ϕ0sinω0t

t.

White noise is a useful abstraction, but it cannot be realized physically because it has infinite variance. Notice that r(0) = 2ϕ0ω0 and that the covariance function vanishes for t = π/ω0.

Figure 1.2 shows covariance functions, spectra and realizations of bandlimited white noise with different parameters. Notice that r(0) = 2ϕ0ω0, which means that the variance of the signal increases linearly with the bandwidth ω0. Also notice that the peak of the covariance function increases with increasing ω0. The correlation between two values of the process with a constant time separation decreases with increasing ω0. In the limit as the bandwidth goes to infinity we have

r(τ) → 2πϕ0 δ(τ) as ω0 → ∞.

It has thus been shown that white noise can be obtained as a limit of bandlimited noise when the bandwidth goes to infinity. There are many other ways to obtain white noise; here is another example.
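The covariance integral for bandlimited white noise is easy to verify numerically (ϕ0 = 0.5 and ω0 = 4 are arbitrary illustrative values):

```python
import numpy as np

phi0, w0 = 0.5, 4.0
w = np.linspace(-w0, w0, 400001)
dw = w[1] - w[0]

def r_numeric(tau):
    # r(tau) = int_{-w0}^{w0} phi0 e^{i w tau} dw; the imaginary part cancels
    return (phi0 * np.cos(w * tau)).sum() * dw

print(r_numeric(1e-12), 2 * phi0 * w0)          # r(0) = 2 phi0 w0
print(r_numeric(np.pi / w0))                    # first zero at t = pi/w0
print(r_numeric(1.3), 2 * phi0 * np.sin(w0 * 1.3) / 1.3)
```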


Figure 1.2: Spectral density ϕ(ω), covariance functions r(t) and realizations x(t, ω∗) of bandlimited white noise.

Example 1.3 White noise as a limit of RC noise
Consider the system in Example 1.2. A slight change of scale gives the following pair of spectral density and covariance function

ϕ(ω) = a²/(π(ω² + a²)),   r(t) = a e^{−a|t|}.

We have ϕ(0) = 1/π, and the spectral density is essentially constant for ω < a. The variance is r(0) = a and the decay of the covariance is governed by a. As a increases the spectrum is approximately white over a wider region and the variance increases. In the limit as a → ∞ we have

ϕ(ω) → 1/π,   r(τ) → 2δ(τ).

Figure 1.3 shows realizations, covariances and spectral densities for different values of the parameter a. ∇

Operational amplifiers have voltage noise which can be modeled as an input voltage to the amplifier. The noise is often modeled as white noise and the numerical values can be found from data sheets. Typical values are in the range S(f) = 1–500 (nV)²/Hz.

Intuitively we may expect bandlimited noise to be a good approximation of white noise if the noise bandwidth is significantly higher than the bandwidth of the system driven by the noise. This is true for linear systems but unfortunately not for nonlinear systems, where some unexpected phenomena may occur. The difficulties can be avoided by using stochastic differential equations and stochastic calculus. Care should however be taken when dealing with nonlinear systems driven by noise.

Figure 1.3: Realizations (top row), covariance functions (middle row) and spectral densities (bottom row) for processes with a = 1 (left column), a = 2 (middle column) and a = 5 (right column).

Simulation tools like MATLAB and SIMULINK have facilities for generating bandlimited noise. The noise generator in SIMULINK has three parameters labeled: noise power (P), sampling period (ts) and seed. The generator runs periodically with period ts. The random number generator is initialized by seed. A normal random variable with variance P/ts is generated at each sampling instant. The equivalent continuous time noise has a triangular covariance function where the correlation is zero for |t| > ts and the area under the covariance function is P. The generator thus mimics white noise with the covariance function Pδ(t). The spectral density is

ϕ(ω) = P/(2π) [s], or S(f) = 2P [Hz⁻¹].

To resemble white noise the sampling period should be significantly shorter than the inverse of the bandwidth fB of the system to be simulated (ts·fB < 0.01).
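A plain Python sketch of such a generator, mimicking the behavior described above (the parameter values are our own):

```python
import numpy as np

def bandlimited_noise(P, ts, duration, seed=0):
    """Piecewise-constant noise held for ts seconds with per-sample
    variance P/ts, mimicking the SIMULINK band-limited noise block."""
    rng = np.random.default_rng(seed)
    n = int(duration / ts)
    return rng.normal(0.0, np.sqrt(P / ts), size=n)

P, ts = 0.1, 0.01
x = bandlimited_noise(P, ts, duration=1000.0)
print(x.var())      # close to P/ts = 10
```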

Processes with Independent Increments

White noise in discrete time is conceptually a very simple stochastic process. It can be thought of as a random process where the random variables at different times are independent random variables. Generation of discrete time white noise only requires a generator of independent random variables. Simulation of discrete time systems driven by white noise is straightforward. We can simply run a deterministic discrete time model and add disturbances in the form of independent random variables at each sampling instant.

Processes with independent increments are a generalization of discrete time white noise. The stochastic process x(t), t ∈ T has independent increments if the random variables

x(tk) − x(tk−1), x(tk−1) − x(tk−2), ..., x(t2) − x(t1), x(t1)   (1.7)

are independent. Notice in particular that we have assumed that x(t2) − x(t1) is independent of x(t1). If the variables are only uncorrelated the process is said to have uncorrelated increments. A process has stationary increments if the distribution of x(t) − x(s) only depends on the time difference t − s.

To find the covariance function of a process with uncorrelated increments we have, for s > t,

r(s, t) = cov(x(s), x(t)) = cov(x(t) + x(s) − x(t), x(t))
        = cov(x(t), x(t)) + cov(x(s) − x(t), x(t)) = cov(x(t), x(t)),

where the last equality follows from the variables (1.7) being uncorrelated. The covariance function is thus given by

r(s, t) = cov(x(t), x(t)) for t ≤ s,   cov(x(s), x(s)) for s ≤ t.

To explore the properties of the process further we consider a process with stationary increments and introduce the function

F(t) = var(x(t + τ) − x(τ)),

which is the increase of the variance over an interval of length t. We have

F(t + s) = var(x(t + s + τ) − x(τ))
         = var(x(t + s + τ) − x(t + τ) + x(t + τ) − x(τ))
         = var(x(t + s + τ) − x(t + τ)) + var(x(t + τ) − x(τ))
         = F(s) + F(t).

The variance function F thus has the property

F(s + t) = F(s) + F(t).   (1.8)

If F is continuous this implies that the function is linear, F(t) = Rt. The variance over an interval is thus proportional to the length of the interval, and the standard deviation is proportional to the square root of the length. The fluctuations of the process over a small interval dt can thus be significant because the standard deviation is proportional to √dt.

We will now give two examples of processes with independent increments.

Example 1.4 The Wiener Process
A Wiener process is a process with independent increments where the increments have a Gaussian distribution. If the increments are stationary with zero mean the process is uniquely given by the variance function F(t) = Rt. The incremental covariance is cov(dx, dx) = E dx dx^T = R dt. Notice that the Wiener process strictly speaking does not have a derivative because the expression δx/δt goes to infinity as δt goes to zero. The derivative of the Wiener process is actually white noise with the covariance function r(t) = Rδ(t).


Figure 1.4: Realizations of a Wiener process with incremental covariance dt.

Consider a Wiener process that starts at the origin at time t = 0. The mean value of the process at time t is zero and the variance is Rt, which increases linearly with time. The Wiener process is therefore a model that can capture drift. A few realizations of the Wiener process are shown in Figure 1.4. ∇

The Wiener process is the continuous time equivalent of a random walk, which is a discrete time process where a step to the right and a step to the left are taken with equal probability. The Wiener process is a good model for drifting phenomena.
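A Wiener process is straightforward to simulate by summing Gaussian increments with variance R dt; the sketch below checks that the sample variance grows like Rt (step size, horizon and number of realizations are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
R, dt = 1.0, 0.01
n_steps, n_real = 500, 5000

# Increments dx ~ N(0, R dt); the cumulative sum is the Wiener process.
dx = rng.normal(0.0, np.sqrt(R * dt), size=(n_real, n_steps))
x = np.cumsum(dx, axis=1)

t = n_steps * dt                            # final time, here t = 5
print(x[:, -1].var(), R * t)                # sample variance is close to R t
```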

Shot noise is another stochastic process that is very different from the Wiener process, but in spite of this it shares some of its properties.

Example 1.5 Shot Noise
Electric current is a stream of electrons. There will be fluctuations, called shot noise, because of the discrete arrival of the electrons. If the interaction between the electrons is neglected it is natural to assume that the electrons arrive independently. If the average rate of arrival is λ [electrons/s], the average current I becomes

I = λq,

where q [C] is the charge of the electron. The total charge Q is a stochastic process that increases with q at the arrival of an electron. Assuming that the arrivals are independent, Q is then a process with independent increments. Let dQ be the increase of the charge during a time interval of length dt. Since the probability of arrival of an electron in the interval (t, t + dt) is λ dt, the random variable dQ has

Figure 1.5: Realizations of the charge function for shot noise with incremental covariance 0.1 dt.

the mean values E dQ = qλ dt and E(dQ)² = q²λ dt.

The incremental covariance becomes

cov(dQ) = E(dQ − E dQ)² = q²λ dt = qI dt.

The charge is thus a stochastic process with independent increments and the incremental covariance qI dt. The current, which is the derivative of the charge, is then white noise with the covariance r(t) = qI δ(t) and the spectral density

ϕ(ω) = qI/(2π) [A²s],   S(f) = 2qI [A²/Hz].

Figure 1.5 shows a few realizations of the charge function for shot noise. ∇
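A numerical illustration of the shot-noise formula S(f) = 2qI (the current and bandwidth are assumed example values, not from the text):

```python
import math

q = 1.602e-19    # electron charge [C]

def shot_noise_density(I):
    """One-sided shot-noise spectral density S(f) = 2 q I [A^2/Hz]."""
    return 2 * q * I

# Assumed example: a 1 uA average current observed over a 10 kHz bandwidth.
I, df = 1e-6, 1e4
i_rms = math.sqrt(shot_noise_density(I) * df)
print(shot_noise_density(I))   # ~3.2e-25 A^2/Hz
print(i_rms)                   # ~5.7e-11 A, i.e. about 57 pA rms
```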

Two very different phenomena, the derivative of the Wiener process and shot noise, are thus both white noise even if their time behaviors are very different.

Operational amplifiers have current noise that can be modeled as shot noise. The magnitude depends on the current. Typical values vary widely in the range S(f) = 10⁻⁸–10 (pA)²/Hz, see [?].

Perhaps a picture

Filtering Stochastic Signals

The spectral density changes when a signal is passed through a linear system. Let the input of the system be u, the output y and the transfer function G(s). There is a very simple formula that tells how the spectral density changes with filtering:

ϕy(ω) = G(iω) G(−iω) ϕu(ω).   (1.9)

This formula can be understood intuitively by viewing the signal as a sum of periodic components and filtering each component separately. By combining this equation with the principle of superposition it is straightforward to compute the spectral density of a linear system with many noise sources. The variance of the fluctuations can then be obtained by integration.
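As a sketch of how (1.9) is used, take white noise with density ϕ0 through an assumed first-order filter G(s) = a/(s + a); the output density is ϕ0 a²/(a² + ω²), and integrating it gives the output variance π a ϕ0:

```python
import numpy as np

phi0, a = 0.5, 2.0                        # white-noise level and filter bandwidth (assumed)

def G2(w):
    """G(iw) G(-iw) = |G(iw)|^2 for the filter G(s) = a/(s + a)."""
    return a**2 / (a**2 + w**2)

w = np.linspace(-5000.0, 5000.0, 2000001)
phi_y = G2(w) * phi0                      # output spectral density from (1.9)
var_y = phi_y.sum() * (w[1] - w[0])       # variance = area under phi_y
print(var_y, np.pi * a * phi0)            # analytic value pi * a * phi0
```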


Spectral Factorization

Consider a stable linear system with transfer function G(s) whose input is white noise. The output then has the spectral density

ϕ(ω) = G(iω) G(−iω).

Any stochastic process whose spectral density can be written in this form can thus be thought of as being generated by sending white noise through the system G(s). If the spectral density ϕ is a rational function we can always find G by first determining the poles and zeros of ϕ. The transfer function G is then obtained by picking the poles and zeros that are in the left half plane.

The spectral factorization is very useful because all stochastic processes with rational spectra can be thought of as generated by sending white noise through a linear system. From a practical point of view it means that white noise is the only stochastic process we have to deal with; all other processes can be generated by sending white noise through a linear system. It also means that if we understand the effect of white noise on a linear system then we can deal with the effect of any stochastic disturbance on a system.
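A small sketch of the factorization recipe for an assumed spectrum ϕ(ω) = 1/(ω⁴ + 1): substituting s = iω gives ϕ = 1/(s⁴ + 1), and keeping the left-half-plane roots of s⁴ + 1 yields G(s) = 1/(s² + √2 s + 1):

```python
import numpy as np

# Roots of s^4 + 1 = 0; two of them lie in the left half plane.
roots = np.roots([1, 0, 0, 0, 1])
lhp = roots[roots.real < 0]
den = np.real(np.poly(lhp))               # (s - p1)(s - p2) = s^2 + sqrt(2) s + 1
print(den)                                 # ~[1, 1.4142, 1]

def G2(w):
    """G(iw) G(-iw) = |G(iw)|^2 for G(s) = 1/(s^2 + sqrt(2) s + 1)."""
    s = 1j * w
    return abs(1.0 / (den[0] * s**2 + den[1] * s + den[2])) ** 2

for w in (0.0, 1.0, 2.0):
    print(w, G2(w), 1.0 / (w**4 + 1))      # matches the given spectrum
```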

All noise is not rational. The random fluctuations that occur in turbulence are one example, where the spectral density is proportional to ω⁻⁵ᐟ², which is not a rational function. Another is the low frequency noise that occurs in resistors and electronic amplifiers, which is proportional to 1/f. This process will be discussed in the next section.

Pink Noise or 1/f Noise

Add remark on modeling of drift, Wiener or 1/f

Resistors, semiconductors and bonds all have many small defects that create variations in conductivity. The conductivity variations generate signal variations when driven by a current. The fluctuations depend on the strength of the current. The noise is not white but varies as 1/f and can therefore be dominant at low frequencies, where it appears as a slow drift. The noise goes by many names: pink noise, flicker noise or 1/f noise. The noise level is often specified by giving the frequency f0 where the 1/f noise matches the level of input white noise or current noise. The level varies significantly with the type and quality of the operational amplifier; typical ranges are f0 = 1–1000 Hz, see [?].

There are many ways to model 1/f noise. A simple model is obtained by considering an ensemble of first order systems. If the systems are in thermal equilibrium they have the same energy. If the bandwidth of a subsystem is a, its spectral density is proportional to

ϕ = a/(ω² + a²).

Assuming that all systems are independent, the spectral density of the total noise is the sum of the spectral densities of the subsystems. Moreover, assuming that the energy is distributed logarithmically, we find that the total spectrum is given by

∫₀^∞ a/(a² + ω²) d log a = ∫₀^∞ da/(a² + ω²) = (1/ω) arctan(a/ω) |₀^∞ = π/(2ω).

The spectral density is thus proportional to 1/ω or 1/f; pink noise is therefore also called 1/f noise.

Since

∫_{f1}^{f2} df/f = log(f2/f1),

it follows that the energy of pink noise is evenly spread on a logarithmic scale; there is the same energy in each octave.
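The ensemble construction above is easy to check numerically. The sketch below (the grid sizes and bandwidth range are arbitrary choices) sums the spectra a/(ω² + a²) of first order systems with logarithmically distributed bandwidths and verifies that the total is close to π/(2ω), i.e. a 1/ω spectrum.

```python
import numpy as np

# Logarithmic grid of subsystem bandwidths a, spanning many decades
log_a = np.linspace(np.log(1e-4), np.log(1e4), 4000)
a = np.exp(log_a)
dloga = log_a[1] - log_a[0]

def total_spectrum(w):
    """Approximate the integral of a/(w^2 + a^2) d(log a) by a Riemann sum."""
    return np.sum(a / (w**2 + a**2)) * dloga

# In the mid-range of the bandwidths the sum behaves as pi/(2 w)
for w in [0.1, 1.0, 10.0]:
    assert abs(total_spectrum(w) - np.pi / (2 * w)) < 0.01 / w
```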

Figure: Spectral density S(f) of pink noise versus frequency f on a log-log scale.

1.3 Physical Systems with Noise

Having developed the basic concepts we will now analyze the effect of noise on typical micro systems. Since the deterministic aspects of the processes are modeled by differential equations we will start with the idea of stochastic differential equations, which is a natural tool for modeling physical systems with noise. We will then show how the idea can be used to calculate fluctuations in small systems. The net result is very simple. Since fluctuations are generated by the energy dissipating components we can simply replace ideal components like resistors and dampers by ideal components combined with signal sources that represent the disturbances. In later chapters we will apply the ideas to more complex systems.

Stochastic Differential Equations

Spectra and covariance functions are input-output characterizations of stochastic processes. It is also useful to have state model representations. Stochastic differential equations (SDE) are a convenient time-domain model for stochastic processes. A linear stochastic differential equation has the form

dx = Ax dt + dw,    x(0) ∼ N(x0, R0)    (1.10)

where x ∈ Rⁿ is a vector and w ∈ Rⁿ is a Wiener process, a random process with independent increments; it is a stochastic process with zero mean and covariance E dw dwᵀ = R dt. It is also assumed that dw is independent of x and of the initial state. The increment dw thus has a magnitude proportional to √dt, which

Page 17: Micro 01

micro December 6, 2010

1.3. PHYSICAL SYSTEMS WITH NOISE 17

implies that we cannot divide by dt to get an ordinary differential equation. If we insist on doing this formally we obtain the expression dw/dt, which is white noise with infinite variance.

The stochastic differential equation (1.10) is thus similar to an ordinary differential equation written in difference form, but the initial condition is a random variable with mean x0 and covariance R0, and the increment dw has magnitude √dt. If we accept these facts it is possible to do calculus, for example to calculate mean values and covariances. To compute the mean value of x we simply take mean values of (1.10) and obtain

dm(t) = E dx = E(Ax dt + dw) = A(Ex) dt + E dw = Am dt,

where the last equality is obtained from the fact that w has zero mean value. Summarizing, we find that the mean value function is governed by the ordinary differential equation

dm/dt = Am,    m(0) = m0.    (1.11)

The mean value thus propagates with the dynamics of the system (1.10) with no disturbances (w = 0).

To compute the variances we subtract the mean value m from x and take mean values. This is equivalent to computing the covariance for (1.10) assuming zero initial condition. Introducing the covariance P(t) = E x(t)xᵀ(t) we find

P(t + dt) = E(x + dx)(x + dx)ᵀ = E xxᵀ + E dx xᵀ + E x dxᵀ + E dx dxᵀ.

Notice that the matrix P = E xxᵀ is symmetric. In normal calculus the last term would be neglected because it is of second order in dt. In stochastic calculus it must be taken into account because dx is of order √dt. Evaluating the three terms in the above equation gives

E dx xᵀ = E(Ax dt + dw)xᵀ = A E xxᵀ dt + E dw xᵀ = AP dt
E x dxᵀ = PAᵀ dt
E dx dxᵀ = E(Ax dt + dw)(Ax dt + dw)ᵀ
         = A E xxᵀ Aᵀ (dt)² + A E x dwᵀ dt + E dw xᵀ Aᵀ dt + E dw dwᵀ
         = APAᵀ(dt)² + R dt,

where we have used the fact that E x dwᵀ = 0 because x and dw are independent. Summarizing, we find that

P(t + dt) = P + AP dt + PAᵀ dt + APAᵀ(dt)² + R dt.

Subtracting P(t) from both sides and dividing by dt gives the following ordinary differential equation

dP/dt = AP + PAᵀ + R,    P(0) = R0.    (1.12)

The covariance matrix P can be thought of as the uncertainty of the state of (1.10).


Equation (1.12) describes how the uncertainty develops with time. The terms AP + PAᵀ tell how uncertainty propagates due to the system dynamics and the term R represents the additional uncertainty generated by the disturbance w.

Equation (1.12) is very convenient for computing the variances of fluctuations in a linear system. The steady state variance is given by

AP + PAᵀ + R = 0.    (1.13)

There are efficient numerical routines for solving this Lyapunov equation, for example lyap in MATLAB.
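The steady state equation (1.13) can also be solved in Python; a sketch, with arbitrary illustrative system matrices. SciPy's solve_continuous_lyapunov(A, Q) solves AX + XAᵀ = Q, so (1.13) is solved with Q = −R.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # any stable matrix (illustrative)
R = np.eye(2)                  # incremental covariance of the noise

# AP + PA^T + R = 0  <=>  AP + PA^T = -R
P = solve_continuous_lyapunov(A, -R)

assert np.allclose(A @ P + P @ A.T + R, np.zeros((2, 2)))
assert np.all(np.linalg.eigvalsh(P) > 0)   # a covariance is positive definite
```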

Physical Systems driven by Noise

Brownian motion generates forces that are white noise. A convenient way to model them is to do the ordinary mechanics and electronics and add forces or voltages that correspond to the thermal noise. Having obtained the equations formally we then convert the model to a stochastic differential equation so that we can use (1.13) to compute the variance of the fluctuations.

We illustrate the procedure with analysis of a mechanical system.

Example 1.6 Spring-mass System with Thermal Noise
Consider the standard spring-mass system with a force f that represents the forces due to thermal motion. The system is governed by the formal equation

m d²x/dt² + c dx/dt + kx = f,

where m is the mass, c the damping coefficient, k the spring constant and f the force. We use the notation f to indicate that the force is white noise. To obtain a stochastic differential equation we first introduce the state variables x and v and the equation then becomes

dx/dt = v
dv/dt = −(k/m) x − (c/m) v + (1/m) f

Since f is white noise the acceleration will also be white noise and the variance of dv/dt is infinite. To obtain a stochastic differential equation we then simply write the equation in difference form and we get

  d [x]   [  0      1  ] [x]      [  0  ]
    [v] = [ −k/m  −c/m ] [v] dt + [ 1/m ] df        (1.14)

This equation is a linear stochastic differential equation in standard form (1.10) with

  A = [  0      1  ],    R = [ 0     0     ]
      [ −k/m  −c/m ]         [ 0  r_f/m²  ]


where the covariance of df is r_f dt. The steady state covariance matrix is then given by (1.13). We have

  AP = [  0      1  ] [ p11  p12 ]   [        p12                  p22         ]
       [ −k/m  −c/m ] [ p12  p22 ] = [ −(k p11 + c p12)/m  −(k p12 + c p22)/m ]

Writing the equation AP + PAᵀ + R = 0 componentwise gives

2p12 = 0
p22 − (k p11 + c p12)/m = 0
−2(k p12 + c p22)/m + r_f/m² = 0,

which has the solution

p11 = r_f/(2ck),    p12 = 0,    p22 = r_f/(2cm).

To determine r_f we use Boltzmann's equipartition law

(1/2) k E x1² = (1/2) k p11 = (1/2) k · r_f/(2ck) = (1/2) kB T,

and we obtain r_f = 2ckB T. Notice that we get the same result if we compute the average energy based on the velocity. We leave this calculation as an exercise. The variances of position and velocity are then given by

E x1² = p11 = r_f/(2ck) = kB T/k,    E x2² = p22 = r_f/(2cm) = kB T/m.

Notice that these expressions can be obtained directly from Boltzmann's equipartition principle, as was done in the beginning of this subsection. ∇

In the example the incremental covariance of the disturbance force f is E df df = 2ckB T dt [N²s], which means that the force f is white noise with the covariance function

r_f(t) = 2ckB T δ(t)

and the spectral density

ϕ(ω) = ckB T/π [N²s/rad] = 2ckB T [N²/Hz],    S(f) = 4ckB T [N²/Hz]    (1.15)

Notice that the intensity of the disturbance depends on the damping coefficient c but not on any other property of the system. This is an example of the fluctuation-dissipation theorem in statistical physics.
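As an illustrative cross-check, the spring-mass SDE (1.14) can be simulated with the Euler-Maruyama method and the sample variances compared with kBT/k and kBT/m. The parameters below are chosen for convenience (units scaled so that kB T = 1), not taken from the text.

```python
import numpy as np

# Units scaled so kB*T = 1; m, c, k are illustrative values
m, c, k, kBT = 1.0, 1.0, 1.0, 1.0
r_f = 2 * c * kBT                  # incremental covariance of df is r_f dt

rng = np.random.default_rng(0)
n_traj, dt, n_steps = 4000, 0.005, 6000    # ensemble of trajectories
x = np.zeros(n_traj)
v = np.zeros(n_traj)

for _ in range(n_steps):           # Euler-Maruyama integration of (1.14)
    df = rng.normal(0.0, np.sqrt(r_f * dt), n_traj)
    x, v = x + v * dt, v + (-(k / m) * x - (c / m) * v) * dt + df / m

# Stationary variances should match the equipartition values kBT/k and kBT/m
assert abs(x.var() - kBT / k) < 0.1
assert abs(v.var() - kBT / m) < 0.1
```

The ensemble is sampled after several relaxation times, so the statistics are close to stationary; the tolerances reflect the Monte Carlo error of the variance estimates.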

We can now formulate a general principle for modeling mechanical systems subject to Brownian motion. For each dissipative element we introduce forces and torques which are white noise processes with spectral densities 2ckB T. The spectral density has dimension N²s/rad for the forces and N²m²s/rad for the torques.


Figure 1.6: Electrical circuit with noise.

The formal equations obtained in this way are then converted to stochastic differential equations as was done in the example.

Next we will consider an electrical system.

Example 1.7 Simple electrical circuit
Consider the RL circuit in Figure 1.6.

Noise is only generated in the dissipative component, in this case the resistor. Assume that the resistor is modeled as an ideal resistor in series with a voltage source that generates white noise V. Let the loop current be I; the voltage drop across the inductor is L dI/dt and we get

L dI/dt + RI = V

Writing this equation as a stochastic differential equation we get

L dI + RI dt = dV,   or   dI = −(R/L) I dt + (1/L) dV

where V is a Wiener process with incremental covariance rV dt. Introduce the variance of the current fluctuations P = E(I²). It follows from (1.12) that

dP/dt = −(2R/L) P + rV/L²

In steady state we have rV = 2RLP. The average energy stored in the inductor is L E(I²)/2 = PL/2 and Boltzmann's equipartition law gives

(1/2) LP = (1/2) kB T,

which gives PL = kB T and rV = 2kB T R. The noise V generated by the thermal motion of the electrons in the resistor is then white with covariance and spectral density given by

rV(τ) = 2kB T R δ(τ),    ϕV(ω) = kB T R/π [V²s/rad] = 2kB T R [V²/Hz]    (1.16)

Since formally V = dw/dt, the voltage variations have the covariance function

rV(t) = 2kB T R δ(t)

The corresponding spectral density is

ϕ(ω) = 2kB T R/(2π) = kB T R/π [V²s/rad] = 2kB T R [V²/Hz]

Thermal noise in a resistor can thus be represented by white noise with the spectral density kB T R/π [V²s/rad] = 2kB T R [V²/Hz]. The corresponding Wiener process has incremental covariance E dw² = 2RkB T dt.

Nyquist's formula is S(f) = 4kB T R V²/Hz. Since he only considers positive frequencies, his bandwidth is half of ours. ∇
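To get a feel for the size of Johnson-Nyquist noise, the sketch below evaluates Nyquist's formula for a 10 kΩ resistor at room temperature over a 10 kHz bandwidth; the resistance, temperature and bandwidth are our illustrative assumptions.

```python
import math

kB = 1.38e-23                          # Boltzmann constant [J/K]
T, R, bandwidth = 300.0, 10e3, 10e3    # assumed temperature [K], resistance [ohm], bandwidth [Hz]

v_rms = math.sqrt(4 * kB * T * R * bandwidth)   # Nyquist: V^2 = 4 kB T R df
print(v_rms)                                    # about 1.3 microvolts
```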

Again we observe the fluctuation-dissipation property: the noise is generated by the energy consuming component, the resistor. A systematic way of analyzing noise in electric circuits is to replace all resistors by ideal resistors in series with voltage sources representing the Johnson-Nyquist noise, which is white with spectral density 2kB T R [V²/Hz]. If the circuit has active components like operational amplifiers we have to include the amplifier noise. Information about this is found in the manufacturers' handbooks.

Example 1.8 Operational Amplifier
Consider the operational amplifier circuit shown in Figure 1.7. Assume that the transfer function of the amplifier is

G(s) = k/(s + a)

where k = 10⁷ [rad/s] is the gain bandwidth product and a = 100 [rad/s] is the low frequency pole. Let the resistors be R1 = 10 MΩ and R2 = 10 kΩ. The open loop dynamics of the operational amplifier is then

vout = −(k/(s + a)) v

or in differential equation form

dvout/dt = −a vout − k v    (1.17)

To analyze the fluctuations in the output voltage caused by the thermal noise in the resistors we first replace the resistors with ideal resistors in series with white noise voltage sources representing the resistor noise. Following our convention we denote the voltage sources v̇n1 and v̇n2 to indicate that they are white noise. Let the


Figure 1.7: Schematic diagram of an operational amplifier with noisy resistors. The resistors are represented as ideal resistors in series with ideal voltage sources that generate Johnson-Nyquist noise.

noise voltages be vn1 and vn2. Assuming that the input impedance is so large that the current I can be neglected, we find

(vn1 − v)/R1 = (v − vn2 − vout)/R2

Solving this equation for v gives

v = (R2 vn1 + R1 vn2 + R1 vout)/(R1 + R2)

Inserting this expression for v in the differential equation (1.17) gives

dvout/dt = −(a + kR1/(R1 + R2)) vout − (k/(R1 + R2)) (R2 vn1 + R1 vn2),

and the corresponding stochastic differential equation is

dvout = −(a + kR1/(R1 + R2)) vout dt − (k/(R1 + R2)) (R2 dvn1 + R1 dvn2)    (1.18)

Since the incremental covariance of vni is 2Ri kB T dt, it follows from (1.13) that the steady state variance P of the fluctuations in the output voltage is given by

−2(a + kR1/(R1 + R2)) P + (k/(R1 + R2))² (2kB T R1 R2² + 2kB T R2 R1²) = 0

The variance of the output signal is thus

var vout = P = kB T R1 R2 k²/(a(R1 + R2) + kR1) ≈ k kB T R2

For large values of the gain-bandwidth product k the noise is thus determined only by the noise in the feedback resistance R2. ∇
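Plugging in the numbers from the example gives a feel for the magnitudes; a sketch, where kB = 1.38×10⁻²³ J/K and T = 300 K (an assumed room temperature) and the variable names P_exact and P_approx are ours:

```python
import math

kB, T = 1.38e-23, 300.0    # Boltzmann constant, assumed room temperature
k, a = 1e7, 100.0          # gain bandwidth product and low frequency pole [rad/s]
R1, R2 = 10e6, 10e3        # resistances [ohm]

P_exact = kB * T * R1 * R2 * k**2 / (a * (R1 + R2) + k * R1)
P_approx = k * kB * T * R2          # large gain-bandwidth approximation

print(math.sqrt(P_exact))           # rms output noise, about 20 microvolts
assert abs(P_exact / P_approx - 1) < 1e-2
```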


1.4 References

Micro systems are good illustrations of a wide range of modeling tasks that cut across a rich set of physical phenomena from vibrations, acoustics, electromagnetics, and molecular and even quantum effects. Modeling also covers a wide range from deterministic to stochastic and from linear to nonlinear.

Classical mechanics. Electrical engineering. GG. Senturia. Beranek. Chua, Desoer and Kuh.

• Robert Brown 1827

• Einstein 1905

• Wiener 1923

• Boltzmann

• Schottky

• Johnson

• Nyquist


• dissipation fluctuations

A classic paper on noise in electric amplifiers was written by Schottky in 1918, who

Johnson: The results were discussed with Dr. H. Nyquist, who in a matter of a month or so came up with the famous formula

V² = 4kB T R ∆f.

for the effect, based essentially on the thermodynamics of a telephone line, and covering almost all one needs to know about thermal noise.

Theoretical physicist from Bell: Nyquist's fusing of concepts from two quite different fields, statistical mechanics and electrical engineering, points out what has been a particular strength of Bell Labs work in theoretical physics: the diversity of expertise among the theoretical staff, and the propensity of many of them to shift their attention from one area to another, transferring useful concepts in the process.

Nyquist's Derivation

J. B. Johnson. Thermal agitation of electricity in conductors. Phys. Rev. 32 (1928) 97-109.
H. Nyquist. Thermal agitation of electric charge in conductors. Phys. Rev. 32 (1928) 110-113.


Chapter Two

Control Design

PID control - the practitioner's delight and the theoretician's blight.

K. J. Astrom.

A particular feature of micro systems is that process design is a key element and that the process and the controller are often designed jointly. There is a wide range of methods for control design. There are many books that give a detailed treatment of each particular design method, but unfortunately there are few sources that give a compact treatment of many design techniques. In this chapter we will try to cover this gap by giving a broad presentation of several methods for control system design. Fortunately many design methods result in controllers having a similar structure, differing only in parameter values and methods for computing the controller gains. A particular feature of micro systems is that the dynamics are often poorly damped. The control veteran Truxal says in his book [?]: control of very lightly damped processes is perhaps the most basic of the difficult control problems. We will therefore pay particular attention to this problem.

It is assumed that the reader has a basic understanding of feedback at the level of the book by Astrom and Murray [?]; references to specific chapters are given at appropriate places. We will start by discussing the broad issues of architecture and fundamental limitations. We will then proceed to discuss PID control design of a spring-mass system in some detail. We will then proceed with a brief discussion of a wide range of control strategies.

2.1 Architecture and Fundamental Limitations

A typical micro system may include the physical process, electronics, interfaces, computers, communication and human interfaces. Architecture is to design and combine these elements to obtain a system that performs the desired function. Typically it is necessary to study several architectures before arriving at a good solution. When exploring system architecture it is essential to be aware of the fundamental limitations of a system. It is a cardinal sin for control engineers to believe that the process to be controlled is given, because a process may inadvertently be designed so that it is difficult to control. In many cases the control system can be simplified or performance can be increased considerably by minor changes of a process. Typical modifications are to move sensors and actuators or to add sensors. The situation is even better in micro systems and in mechatronics because the process itself is designed jointly with the control system. It is then possible to design the system so that it is well suited for control. To exploit the design space it is necessary that the designers master both process and control design, and that they are familiar with properties that limit the achievable performance. Key issues include actuation power, sensor noise, quantization, and dynamics limitations. It is assumed that the reader knows the material in Chapters 11 and 12 of [?].

Large Signal Effects

Many limitations are associated with constraints on how large signals and variables can be. Motors have limited torque, amplifiers have limits on currents, pumps have limited flow, and the temperature of a component may not be too high. Limitations may appear in the form of restrictions on the control signal or its rate of change, but there may also be restrictions on internal process variables.

Response time is a common requirement which is crucial for drive systems, disk drives and optical memories, where it determines the search time. The achievable response time depends critically on actuation power and physical limitations of the process. To determine the response time we can determine the minimum time to make transitions from one state to the other, subject to the physical constraints on the process and the actuator. The limitations are typically associated with the behavior of a system for large signals. The theory of time optimal control is well developed [?]. A particularly attractive feature is that optimal control can deal with control and state constraints. Software for solving optimal control problems is available [?], and software for joint modeling and control is emerging [?]. We will illustrate with a simple example.

Example 2.1 Actuator Limitations
Consider the problem of a controller for a disk drive or an optical memory. The disk is read using an optical system consisting of a laser diode and an optical sensor. To read different tracks the sensor package must be moved rapidly. Assume that the motion of the sensor package can be modeled by

J d²ϕ/dt² = T,   or   m d²x/dt² = F,

where J is the moment of inertia, ϕ the arm angle and T the torque. It is common in the industry to replace the angle ϕ by the distance x = rϕ, where r is the distance from the center to the read head, m = J/r² the equivalent mass and F = T/r the equivalent force. The model requires that the actuator is current driven. The maximum acceleration amax = Fmax/m = kI Imax/m is given by the maximum current Imax. Typically there is also a limitation on the maximum velocity. For a voice coil drive the maximum velocity is vmax = Vmax/kI, where Vmax is the largest supply voltage and kI is the motor constant, compare with Section XXX3.1?. The problem of moving the mass from one position to another in minimum time is solved simply by applying maximum acceleration until the mid position is reached and then applying maximum deceleration. If there is a velocity limitation, the maximum acceleration is only applied until the maximum velocity is reached. The minimum time solution is illustrated in Figure 2.1.


Figure 2.1: Minimum time transition for a disk drive. The left figure shows the case of short movements when there is no constraint on the velocity; the velocity never saturates, and the control is of the bang-bang type where maximum current is applied to accelerate or brake. The right figure illustrates what happens for large motions. Full acceleration is applied until t = 5 ms when maximum velocity is reached and the drive circuit saturates. The current is then zero until time t = 10 ms when full braking current is applied.

When the acceleration is constant, a, the velocity increases as v = at and the position as amax t²/2 = v²/(2amax). A straightforward calculation shows that the minimum time for a transition over a distance ℓ is

t = 2√(ℓ/amax)             if ℓ ≤ v²max/amax
t = ℓ/vmax + vmax/amax     if ℓ > v²max/amax.

From this expression it is straightforward to determine the requirements on the actuator and the drive circuit.

The parameters used to generate Figure 2.1 are J = 5×10⁻⁶ kgm², r = 0.05 m, kt = 0.1 Nm/A, Imax = 0.5 A and Vmax = 5 V, or equivalently kF = kt/r = 2 N/A and m = J/r² = 2×10⁻³ kg. The maximum current is 0.5 A, which gives a maximum acceleration of amax = 500 m/s². The supply voltage is 5 V, which gives a maximum velocity of vmax = 2.5 m/s. In the simulation the maximum velocity is reached after t = vmax/amax = 0.005 s, when the arm tip has traveled about 100 tracks (v²max/(2amax) = 6.25×10⁻³ m). If motions are restricted to 13 mm, full acceleration can be used all the time. ∇

In the simple example all calculations can be made analytically. In other cases it may be necessary to resort to numerical calculations, which is not too difficult because software is available. In the example, the limitations on acceleration and deceleration were symmetric; it frequently happens that the constraints are asymmetric. Optimal control is powerful, but it is important to exercise care when formulating objectives and constraints. It is also important to pay attention to modeling and robustness. The limitations on the control signal are strongly influenced by the electronics of the drive circuits. There are cases where significant improvements have been obtained by redesigning the drive amplifiers and accounting for their dynamics when computing optimal transitions. It may be advantageous to use dual actuation in order to cope with stringent requirements. A nice example is optical drives, where a dual actuator is used to resolve compromises of speed and power. Such systems will be discussed in Chapter ??.

Small Signal Effects

There are also limitations that are associated with low signal levels; typical phenomena are measurement noise, friction and backlash. There are many sources of measurement noise; it can be caused by the physics of the sensor or by the electronics. In computer controlled systems, quantization in the analog to digital converters and round-off in numerical computations also cause limitations. The effects of noise can be estimated using linear methods by calculating the transfer function from the noise sources to the control signal and the process variables. Noise spectra are sometimes available; simpler calculations can be made by using the noise levels. Approximations can often be used since the noise typically has high frequencies. Quantization can be approximated as white noise with a variance of δ²/12, where δ is the quantization level, and the effects can then be estimated using linear methods. The describing function method is also useful for preliminary assessment, see Section 9.5 of [?].
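The δ²/12 approximation can be illustrated with a quick Monte Carlo experiment; the signal distribution and the quantization step below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
delta = 0.01                              # quantization step (illustrative)
x = rng.uniform(-1.0, 1.0, 1_000_000)     # signal spanning many quantization steps

e = delta * np.round(x / delta) - x       # round-off quantization error

# The error is close to uniform on (-delta/2, delta/2): variance delta^2/12
assert abs(e.var() / (delta**2 / 12) - 1) < 0.02
```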

A fast response requires a controller with high gain. When the controller has high gain, measurement noise is also amplified and fed into the system. This will result in variations in the control signal and in the process variable. It is essential that the fluctuations in the control signal are not so large that they cause wear or even saturation of the actuator. Since measurement noise typically has high frequencies, the high frequency gain Mc of the controller is an important quantity. Measurement noise and actuator saturation thus give a bound on the high frequency gain of the controller and therefore also on the response speed. Consider for example a computer controlled system with 12 bit AD and DA converters. Since 12 bits correspond to 4096, it follows that if the high frequency gain of the controller is Mc = 4096, a one bit conversion error will make the control signal change over the full range. To have a reasonable system we may require that the fluctuations in the control signal due to measurement noise are not larger than 5% of the signal span. This means that the high frequency gain of the controller must be restricted to about 200.

Actuators have non-ideal behavior at low signal levels. Friction in valves and motors causes stiction. A certain signal level is then required to move the motor. Quantization of AD- and DA-converters has similar effects, as has quantization


Figure 2.2: Block diagram of a drive system with friction, where the friction force is F, the velocity v and the position x. The original block diagram is shown to the left and a simplified block diagram to the right.

and limited word length in computation. The net result is often an oscillation. Since the effects are nonlinear, a detailed analysis is complicated. An approximative analysis based on describing functions (pages 288-290 in [?]) is useful to estimate the effect. We illustrate with an example.

Example 2.2 Effect of friction in a positioning system
Consider the system shown in Figure 2.2, where friction is modeled as Coulomb friction. The friction force then has constant magnitude and is directed to oppose the motion, see Figure ?? in Section XXX2.6?. Friction can then be represented as a nonlinear feedback from velocity to force in the block diagram. By redrawing the diagram as shown in the figure we find that the system can be considered as a feedback loop with the friction model and a linear system with the transfer function

G(s) = s/(ms² + C(s))

An intuitive understanding of the behavior of the system in Figure 2.2 can be obtained by approximating the relay characteristic with a linearity with saturation. We can expect an oscillation at the frequency ω180, where the loop transfer function has a phase lag of 180°, and an amplitude approximately equal to the saturation level. This reasoning is captured by describing function analysis, which is an approximative way of finding the oscillation by exploring how sine waves propagate in the system, see Section 9.5 of [?]. In this particular case it predicts an oscillation at the frequency ωf where the phase of G(iω) is 180° and the amplitude of the position is approximately af/(m ωf²), where af is the magnitude of the friction force. Examples with more accurate calculations are given in [?]. ∇

Quantization of the AD converter limits the accuracy of a system, because it is clearly not possible to control the output with a precision that is finer than the resolution of the converter. Quantization also gives rise to oscillations. In this particular case the oscillations are approximately at the phase crossover ωpc, which is the frequency where the loop transfer function L(s) has a phase lag of 180°. Quantization of the DA converter also creates low amplitude oscillations at approximately the phase crossover frequency. Describing function analysis indicates that the oscillations have the amplitude δDA|P(iωpc)|, where δDA is the quantization error and P(s) is the process transfer function. The amplitude of the oscillations can thus be quite small if |P(iωpc)| is small. The quantization errors of the AD and DA converters are thus balanced if

δAD = |P(iωpc)| δDA.

A coarse quantization of the DA converter can be permitted if |P(iωpc)| is small. One way to achieve this is to shape the loop transfer function so that ωpc is large. The requirements on actuator resolution are thus influenced by properties of the controller beyond the gain crossover frequency.

Dynamics Limitation

Process dynamics imposes limitations on the performance that can be achieved with robust control. Comprehensive treatments are given in Section 11.5 of [?], [?] and []. We will briefly present some of the essential results. Additional material is presented in the following sections.

High bandwidth gives a fast response and good disturbance rejection, but it unfortunately requires high controller gain. Measurement noise is then amplified and will generate large control signals. The gain crossover frequency ωgc is a parameter that characterizes response speed and disturbance attenuation. We introduce the parameter

Mc = max_{ω > ωgc} |C(iω)|,    (2.1)

to capture the high frequency gain of the system. Notice that many controllers have infinite gain at zero frequency.

Many simple controllers like I, P and PI controllers have gains that decrease monotonically with frequency. Since at the gain crossover frequency

|L(iωgc)| = |P(iωgc)C(iωgc)| = 1,

we find for these controllers that Mc = 1/|P(iωgc)|. More advanced controllers must provide phase lead. The controller gain then increases at the gain crossover frequency because phase lead is associated with a gain increase. The precise relation between gain and phase is given by Bode's phase area formula [?]. The following approximate formula is derived in [?]

K = e^(2γϕℓ),    (2.2)

where ϕℓ [rad] is the largest phase advance, and γ a constant which depends on the details of the compensating network. A reasonable value for estimates is γ = 1. The phase lead required at the crossover frequency is ϕℓ = −arg P(iωgc) − π + ϕm, where ϕm is the desired phase margin. An estimate of the largest high frequency gain of a controller is

Mc = max(1, e^(γϕℓ)) / |P(iωgc)|.    (2.3)

Notice that the gain increases exponentially with the required phase lead.


Figure 2.3: Assessment plot for a system with two masses connected by springs. The system has resonances at ω = 1 rad/s and ω = 3 rad/s. The dotted line represents 1/|P(iω)| and the dashed line is an estimate of the high frequency gain of the controller.

To assess the effect of measurement noise we can use a Bode plot of the loop transfer function where we have added a plot of the high frequency gain, as is illustrated by the following example.

Example 2.3 A spring-mass systemConsider a with two masses connected by a spring with the transfer function

P(s) = ω1²ω2² / ((s² + 2ζ1ω1s + ω1²)(s² + 2ζ2ω2s + ω2²)),

with ω1 = 1, ζ1 = 0.5, ω2 = 3, and ζ2 = 0.2. Figure 2.3 shows a Bode plot of the transfer function where we have added a plot of the gain curve of Mc. The points ω45, ω90 and ω135, where the phase lag of the process is 45°, 90° and 135°, are marked with circles in the phase plot. For monotone systems they indicate the gain crossover frequencies that can be achieved with I, PI and PID control. As will be discussed later, the situation is different for highly resonant systems. For example, for a system with one dominant highly resonant mode the achievable gain crossover frequency for an integrating controller is 2ζω90/gm, where gm is the gain margin. This frequency may be significantly lower than ω45.

The plot, which we refer to as an assessment plot, is very useful to explore achievable performance and controller complexity. The plot indicates that the gain crossover frequencies achievable by I, PI and PID control are 0.7, 1.0 and 1.4. Higher bandwidths can be obtained but more complex controllers and low noise sensors are required. The gains are reasonable up to the second resonance but the gain increases very rapidly after that. To obtain ωgc = 6 rad/s the high frequency gain of the controller is 10^4. The difference between the dashed and the dotted curves shows that a large part of the gain increase is due to the need for phase advance. ∇
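The estimate (2.3) is easy to evaluate numerically. The sketch below is our own illustration, not from the text: it evaluates (2.3) for the process of Example 2.3, assuming γ = 1 and a desired phase margin of 45° (the text does not fix ϕm for its quoted number). The phase is accumulated factor by factor so that it is properly unwrapped below −180°.

```python
import numpy as np

# Process of Example 2.3 (two resonances, at 1 rad/s and 3 rad/s).
w1, z1, w2, z2 = 1.0, 0.5, 3.0, 0.2

def gain_and_phase(w):
    """|P(iw)| and the unwrapped phase of P, accumulated factor by factor."""
    d1 = -w**2 + 2j * z1 * w1 * w + w1**2
    d2 = -w**2 + 2j * z2 * w2 * w + w2**2
    gain = (w1**2 * w2**2) / (abs(d1) * abs(d2))
    phase = -(np.angle(d1) + np.angle(d2))   # lies in (-2*pi, 0] for w > 0
    return gain, phase

def Mc_estimate(wgc, phi_m=np.pi / 4, gamma=1.0):
    """High-frequency controller gain estimate (2.3) at the crossover wgc."""
    g, ph = gain_and_phase(wgc)
    phi_lead = -ph - np.pi + phi_m           # required phase lead at wgc
    return max(1.0, np.exp(gamma * phi_lead)) / g

print(Mc_estimate(1.0), Mc_estimate(6.0))   # modest gain at 1 rad/s, thousands at 6 rad/s
```

At ωgc = 6 rad/s the estimate lands in the thousands, consistent with the order of magnitude quoted in the example; the exact value depends on the assumed γ and ϕm.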

32 CHAPTER 2. CONTROL DESIGN

There are severe limitations for systems with time delays and right half plane poles and zeros which can be captured with a few simple formulas. Time delays τ and right half plane zeros zrhp limit the achievable bandwidth; approximate bounds on the gain crossover frequency are given by

ωgc τ < 0.5,   ωgc < zrhp/2.

Robust control of a system with a right half plane pole requires a high closed loop bandwidth

ωgc > 2prhp.

Systems with a right half plane pole and time delay or a right half plane zero can only be controlled robustly under certain conditions. Approximate conditions are given by

prhpτ < 0.7, zrhp > 5prhp or zrhp < 0.2prhp.

It is therefore necessary that the product of the time delay and the right half plane pole is not too large and that the right half plane zero is sufficiently separated from the pole. When designing a mechatronic system which is inherently unstable it is therefore useful to make a preliminary model to see if these conditions are satisfied. If this is not the case the process must be redesigned.
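These rules of thumb are easy to mechanize when screening a preliminary model. The following sketch is an illustrative helper (the function name and interface are ours, not from the text); it returns an approximate admissible range for ωgc, or None when the rules cannot be met:

```python
def feasible(tau=0.0, z_rhp=float('inf'), p_rhp=0.0):
    """Screen a design against the rule-of-thumb limits quoted above.

    tau: time delay, z_rhp: right half plane zero, p_rhp: right half plane pole.
    Returns an approximate admissible range (wgc_min, wgc_max) for the gain
    crossover frequency, or None if the rules of thumb cannot be met.
    """
    if p_rhp > 0.0:
        if tau > 0.0 and p_rhp * tau >= 0.7:                 # need p_rhp*tau < 0.7
            return None
        if z_rhp != float('inf') and not (z_rhp > 5 * p_rhp or z_rhp < 0.2 * p_rhp):
            return None
    wgc_max = float('inf')
    if tau > 0.0:
        wgc_max = min(wgc_max, 0.5 / tau)                    # wgc*tau < 0.5
    wgc_max = min(wgc_max, z_rhp / 2)                        # wgc < z_rhp/2
    wgc_min = 2 * p_rhp                                      # wgc > 2*p_rhp
    return (wgc_min, wgc_max) if wgc_min < wgc_max else None

print(feasible(tau=0.1), feasible(tau=0.2, p_rhp=4.0))
```

For example, a pure delay of 0.1 s admits crossover frequencies up to about 5 rad/s, while an unstable pole at 4 rad/s combined with a 0.2 s delay violates the condition prhpτ < 0.7 and calls for redesign.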

Remedies

Having understood the factors that cause fundamental limitations it is interesting to know how they can be overcome. Problems with sensor noise are best approached by finding the sources of the noise and trying to eliminate them. Increasing the resolution of a converter is one example; changing the sensor is another. Actuation problems can be dealt with in a similar manner. Limitations caused by rate saturation can be reduced by replacing the actuator or by using dual actuation. Zeros depend on how the states are coupled to inputs and outputs. Problems caused by RHP zeros and time delays can therefore be resolved by moving or adding sensors. Notice that a system where all states are measured has no zeros. Poles are inherent properties of a system; problems due to unstable poles can thus only be resolved by redesign of the system. Redesign of the process is the final remedy. Since static analysis can never reveal the fundamental limitations it is very important to make an assessment of the dynamics of a system at an early stage of the design. This is one of the main reasons why all system designers should have a basic knowledge of control.

2.2 PID Control

The PID controller is the most common control algorithm. The term PID is commonly used even if only proportional and integral actions are used. It is also common to add filters to the basic PID algorithm and still keep the name. In this section


Figure 2.4: Bode (left) and Nyquist (right) plots of the loop transfer function for the spring-mass system with integrating control. The numerical values of the parameters are ζ0 = 0.01 and ki = 0.008ω0. The gain margin is gm = 2.5.

we will show how it can be applied to oscillatory systems. Derivative action is useful because it can provide damping for the oscillatory modes, compare with Example XXX. Lightly damped systems are difficult to control because the resonances amplify signals. The resonances appear as high peaks in Bode plots or as loops or cusps in Nyquist plots. If the gain of the loop transfer function is larger than one at the resonances a small phase shift can bend the Nyquist curve towards the critical point so that the system becomes unstable. A conservative way to design a controller for a system with resonances is therefore to make sure that the gain of the loop transfer function is always less than one at the resonances. Such a design may however be quite conservative.

In this section we will design PID controllers for the spring-mass system with low damping. A short presentation of PID control is given in Chapter 10 of [?]; a more comprehensive treatment is given in [?]. It is also assumed that the reader is familiar with Nyquist and Bode plots and the Gang of Four that is used to assess the properties of a control system. This material is presented in Chapters 8, 9, 11 and 12 of [?].

Integral Control

All stable systems can be controlled with an integrating controller (I control) if the performance requirements are modest. The reason is that all stable systems behave like static systems if the input has sufficiently low frequency. The model required is simply the static gain. This observation also applies to oscillatory systems but the bandwidth obtained is much smaller than the resonant frequency.

Consider the spring-mass system discussed in Section ?? which has the transfer function (??). An integrating controller has the transfer function C(s) = ki/s. The


loop transfer function when controlling a spring-mass system is

L(s) = kiω0² / (ks(s² + 2ζ0ω0s + ω0²)) ≈ ki/(ks).

The approximation physically corresponds to integral control of a spring with spring constant k, and it holds for low frequencies. Nyquist and Bode plots of the loop transfer function are shown in Figure 2.4. The Bode plot shows that the approximation is very good for low frequencies, but not for frequencies approaching the resonance. The resonance appears as a sharp phase drop of 180° in the phase plot. In the Nyquist plot the low frequency approximation corresponds to a line along the negative imaginary axis and the resonance appears as a curve which is close to a circle with diameter ki/(2ζ0ω0). For a low bandwidth design (ωgc ≪ ω0), the gain crossover frequency is approximately ωgc = ki/k. In the following we will, for simplicity, normalize by choosing the spring constant k = 1. The normalized integral gain then has dimension frequency.

The nature of the control problem can be assessed from the Nyquist and Bode plots. To have a high bandwidth it is desirable to have a large integral gain. If the gain is too high the Nyquist curve reaches the critical point and the system becomes unstable. We have L(iω0) = −ki/(2ζ0ω0), which means that the phase crossover frequency is ω0. Requiring a gain margin gm the integral gain is

ki = 2ζ0ω0/gm.    (2.4)

This formula gives a simple rule for finding the gain of an integral controller. Requiring a gain margin gm = 2.5 gives ki = 0.008ω0, and ωgc = 0.008ω0.
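The rule (2.4) can be verified numerically: at the phase crossover ω0 the loop gain should be exactly −1/gm. A minimal check, using the parameters of the text (ζ0 = 0.01, gm = 2.5 and the normalization k = 1):

```python
import numpy as np

w0, z0, gm, k = 1.0, 0.01, 2.5, 1.0   # parameters used in the text
ki = 2 * z0 * w0 / gm                 # integral gain from (2.4), here 0.008*w0

def L(s):
    """Loop transfer function for integral control of the spring-mass system."""
    return ki * w0**2 / (k * s * (s**2 + 2 * z0 * w0 * s + w0**2))

val = L(1j * w0)   # loop gain at the phase crossover frequency w0
print(val)         # real and equal to -1/gm
```

The imaginary part vanishes at ω0 because the resonant factor is purely imaginary there, which is exactly why ω0 is the phase crossover.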

Notice in Figure 2.4 that the phase margin is close to 90°; the gain margin is thus the critical quantity, a situation that is typical for oscillatory systems. The key difficulty with the system is the resonance peak in the loop transfer function which limits the gain. Better performance can be obtained by using a more complex controller. Adding proportional action to the controller gives no improvement because it raises the resonant peak. The peak can, however, be lowered by lowering the gain above the crossover frequency. Adding a second order filter gives the controller transfer function

C(s) = ki / (s(s²Tf²/2 + sTf + 1)),    (2.5)

and the loop transfer function becomes

L(s) = kiω0² / (s(s²Tf²/2 + sTf + 1)(s² + 2ζ0ω0s + ω0²)).

To reduce the peak the parameter ω0Tf must be smaller than one. A Taylor series expansion of the loop transfer function for small s gives

L(s) ≈ (ki/s)(1 − sTf − 2ζ0s/ω0) ≈ ki/s − kiTf.


Figure 2.5: Bode (left) and Nyquist (right) plots of the loop transfer function for the spring-mass system with integrating control having high frequency roll-off. The damping ratio of the process is ζ0 = 0.01. The integral gains are ki = 0.01ω0, 0.02ω0, 0.05ω0 and 0.1ω0. The filter constant is Tf = 0.5/ki.

A simple design rule is to choose controller parameters such that the real part of L(iω) is close to −0.5 for small s (see pages 172-173 of [?]), which gives Tf = 0.5/ki.

Figure 2.5 shows the Bode and Nyquist plots for ki/ω0 = 0.01, 0.02, 0.05 and 0.1. The Bode plot shows that the gain crossover can be increased significantly compared to the integrating controller. The Nyquist plot shows that the circle shaped loop of the curve bends away from the critical point because of the filtering, compare with Figure 2.4. The loop is small for ki < 0.05ω0, it has the same size as the loop in Figure 2.4 for ki = 0.1ω0, and it increases significantly for larger values of ki. Notice that the low frequency parts of the Nyquist curves are practically the same for all ki, but the frequency scales are different.

An assessment of the designs can be made from Figure 2.6 which shows the gain curves of the Gang of Four for the system with the integrating controller and the integrating controller with high frequency roll-off. The maximum sensitivities for ki = 0.1ω0 are Ms = 1.62, Mt = 1.00, Mps = 26.6 and Mcs = 1.00, and the bandwidth is ωb = 0.21ω0.

Adding a second-order filter to the integrating controller makes it possible to increase the integral gain from ki = 0.008ω0 to 0.1ω0, and the bandwidth is increased accordingly from ωb = 0.008ω0 to ωb = 0.21ω0. The sensitivity to load disturbances is reduced from Mps = 83.3 to Mps = 26.6. The system is still very sensitive to load disturbances. Notice that the largest gain for load disturbances for the uncontrolled system is 1/(2ζ0) = Q = 50. The transfer function from measurement noise to control signal has the maximum Mcs = 1.0 so the designs are not very sensitive to measurement noise.
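Peaks of this kind can be estimated by a direct frequency sweep of the Gang of Four. The sketch below is a numpy-only illustration for the filtered integrating controller with ki = 0.1ω0 (normalized model, k = 1); the grid-based peak values are approximate and depend on the density of the frequency grid.

```python
import numpy as np

w0, z0 = 1.0, 0.01
ki = 0.1 * w0
Tf = 0.5 / ki              # design rule from the text

w = np.logspace(-3, 2, 200000)
s = 1j * w
P = w0**2 / (s**2 + 2 * z0 * w0 * s + w0**2)      # spring-mass process, k = 1
C = ki / (s * (s**2 * Tf**2 / 2 + s * Tf + 1))    # I controller with roll-off (2.5)
L = P * C
S = 1 / (1 + L)

Ms = np.max(np.abs(S))          # sensitivity peak
Mt = np.max(np.abs(L * S))      # complementary sensitivity peak
Mps = np.max(np.abs(P * S))     # load disturbance peak
Mcs = np.max(np.abs(C * S))     # noise-to-control peak
print(Ms, Mt, Mps, Mcs)
```

The sweep confirms the qualitative picture: Mt and Mcs stay near one, Ms is modest, while the load disturbance peak Mps remains large because of the lightly damped resonance.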


Figure 2.6: Bode plot of the Gang of Four for the spring-mass system with an integrating controller with gain ki = 0.008ω0 (red dashed) and an integrating controller with second order filter with gains ki/ω0 = 0.01, 0.02, 0.05 and 0.1 (blue). Adding a filter thus gives a significant increase of the bandwidth.

A Notch Filter PID Design

The robustness condition (2.4) limits the bandwidth that can be achieved with an I (integrating) controller. A simple way to reduce the peak in the loop transfer function, caused by the resonance, is to choose a controller whose zeros cancel the resonant mode, hence

C(s) = ki(s² + 2ζ0ω0s + ω0²)/(ω0²s) = (kds² + kps + ki)/s.    (2.6)

Such a control scheme is called notch filter compensation because the frequency response of the controller has a notch at the resonant frequency. If the integral gain ki is chosen as the key parameter, the other controller parameters are given by

kp = 2ζ0ki/ω0,   kd = ki/ω0²,   Ti = 2ζ0/ω0,   Td = 1/(2ζ0ω0),    (2.7)

where Ti is the integration time and Td the derivative time. Notice that Ti/Td = 4ζ0².
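Given ki, the parameters (2.7) follow directly, and the equality of the two forms of (2.6) can be checked at an arbitrary test point. A minimal sketch (the helper name is ours):

```python
def notch_pid(ki, w0, z0):
    """PID parameters (2.7) for the notch design: zeros cancel the resonance."""
    kp = 2 * z0 * ki / w0
    kd = ki / w0**2
    Ti = 2 * z0 / w0          # = kp / ki
    Td = 1 / (2 * z0 * w0)    # = kd / kp
    return kp, kd, Ti, Td

kp, kd, Ti, Td = notch_pid(ki=0.5, w0=1.0, z0=0.01)

# Both forms of (2.6) agree at an arbitrary complex test point.
s = 0.3 + 0.7j
lhs = 0.5 * (s**2 + 0.02 * s + 1) / s        # ki(s^2 + 2*z0*w0*s + w0^2)/(w0^2 s)
rhs = (kd * s**2 + kp * s + 0.5) / s         # (kd s^2 + kp s + ki)/s
print(kp, kd, Ti, Td)
```

Note how small Ti/Td = 4ζ0² becomes for a lightly damped process; for ζ0 = 0.01 the zeros of the controller are almost purely imaginary.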

The loop transfer function is

L = PC = ki/s.

For large s the controller (2.6) has the property C(s) ≈ kds. The controller is thus practically useless because its gain is very high at high frequencies. High frequency measurement noise thus gives very large variations in the control signal; the actuator may even saturate. To avoid this we introduce high-frequency roll-off by adding


a second order filter

C(s) = ki(s² + 2ζ0ω0s + ω0²) / (ω0²s(s²Tf²/2 + sTf + 1)).    (2.8)

This controller is very similar to (2.6) for small s, but it is very different for large s because C(s) ≈ 2ki/(ω0²sTf²). The controller (2.8) gives the loop transfer function

L(s) = ki / (s(s²Tf²/2 + sTf + 1)),    (2.9)

and the closed loop characteristic polynomial becomes

s³Tf²/2 + Tf s² + s + ki.

To have a stable closed-loop system we must require that Tf ki < 2. A simple way to design the system is to choose the integral gain ki equal to the desired gain crossover frequency ωgc. If the filter time constant Tf is small the response is then first order with time constant 1/ki. The phase margin is then approximately given by

ϕm = 90° − arctan(kiTf).

Requiring a phase margin of 60° gives kiTf < 0.58. A suitable value of Tf can also be obtained by observing that we have the following approximation of the loop transfer function (2.9) for small s:

L(s) ≈ (ki/s)(1 − sTf − …) = ki/s − kiTf.

The simple design rule in [?] page XXX implies that the loop transfer function should be close to the line Re L(iω) = −0.5 for small ω, hence kiTf = 0.5. Figure 2.7 shows the gain curves of the Gang of Four for the system with the controller (2.8) whose parameters are given by (2.7).

The figure shows that the sensitivity function 1/(1+PC) and the complementary sensitivity function PC/(1+PC) have nice properties. The maximum sensitivities are Ms = 1.32 and Mt = 1.0. The bandwidth is 0.5ω0 as is expected by the design. Notice that the high frequency roll-off due to the filtering reduces the gain of the transfer function C/(1+PC) drastically. Also notice the dip in the transfer function due to the notch filter. The transfer function P/(1+PC) still has the high resonance peak because the notch filter prevents the controller from having any influence around the frequency ω0.
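The phase margin approximation ϕm = 90° − arctan(kiTf) can be checked against the exact loop (2.9). The sketch below, our own numerical check, normalizes ki = 1 and applies the design rule kiTf = 0.5; the grid-based crossover is approximate.

```python
import numpy as np

ki, Tf = 1.0, 0.5                   # normalized so that ki*Tf = 0.5

w = np.logspace(-2, 1, 200000)
s = 1j * w
L = ki / (s * (s**2 * Tf**2 / 2 + s * Tf + 1))   # loop transfer function (2.9)

i_gc = np.argmin(np.abs(np.abs(L) - 1.0))        # grid point closest to |L| = 1
pm = 180.0 + np.degrees(np.angle(L[i_gc]))       # phase margin in degrees
pm_approx = 90.0 - np.degrees(np.arctan(ki * Tf))
print(pm, pm_approx)
```

The exact margin comes out a few degrees below the approximation because the s² term of the filter, neglected in the formula, contributes extra phase lag at the crossover.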

The notch filter designs give controllers where the process poles are canceled by the corresponding controller zeros. The process poles therefore do not appear in the loop transfer function and the sensitivity functions. A consequence is that robustness measures such as the maximum sensitivities (Ms and Mt) do not capture effects of changes in process dynamics. Small deviations can give significant changes in the closed loop system, particularly for systems with small damping ratios. Figure 2.8 shows Nyquist plots of the loop transfer function for designs with ki = 0.2ω0 and ki = 0.8ω0 for systems where the resonant frequency of the process has been changed by 1%. The effects of the parameter changes are moderate for


Figure 2.7: Gain curves for the Gang of Four for the spring-mass system with PID control based on a notch-filter design. The integral gain is ki = 0.5ω0. The dashed curves correspond to the ideal PID controller given by (2.6), and the full lines correspond to the PID controller (2.8) which has high frequency roll-off with filter time constant Tf = 0.5/ki. The maximum sensitivities are Ms = 1.6, Mt = 1.0, Mps = 69 and Mcs = 1.0.

Figure 2.8: Effects of parametric uncertainty. A PID controller is designed based on the nominal value ω0 of the resonant frequency in all cases. The middle column shows Nyquist plots of the loop transfer function for the nominal case when the resonant frequency is known. The left column shows the corresponding curves when the resonant frequency of the process is 0.99ω0 and the right column shows the curves for 1.01ω0. The controller parameters are ki = 0.2ω0 in the top row and ki = 0.8ω0 in the bottom row. The filter constant is Tf = 0.5/ki in all cases.


the design with the lower bandwidth ki = 0.2ω0 but very significant for the design with the higher bandwidth ki = 0.8ω0. The notch filter design requires a good estimate of the resonant frequency. It is better to underestimate the resonant frequency because the loop transfer function then has phase advance near the resonance, see Figure 2.8.
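The sensitivity to a detuned resonance can be quantified by the minimum distance from the Nyquist curve to the critical point. The sketch below is our own numerical illustration of the effect shown in Figure 2.8: it compares the designs ki = 0.2ω0 and ki = 0.8ω0 when the true resonance lies 1% above the nominal value.

```python
import numpy as np

def stability_margin(ki, w0_true, w0_nom=1.0, z0=0.01):
    """Minimum distance from the Nyquist curve to -1 when the notch controller
    (2.8) is tuned to w0_nom but the true resonance is w0_true."""
    Tf = 0.5 / ki
    w = np.logspace(-2, 2, 200000)
    s = 1j * w
    P = w0_true**2 / (s**2 + 2 * z0 * w0_true * s + w0_true**2)
    C = ki * (s**2 + 2 * z0 * w0_nom * s + w0_nom**2) / (
        w0_nom**2 * s * (s**2 * Tf**2 / 2 + s * Tf + 1))
    return np.min(np.abs(1 + P * C))

m_low = stability_margin(ki=0.2, w0_true=1.01)    # lower bandwidth design
m_high = stability_margin(ki=0.8, w0_true=1.01)   # higher bandwidth design
print(m_low, m_high)
```

The higher bandwidth design loses a much larger part of its stability margin, because the imperfectly canceled resonance creates a loop in the Nyquist curve whose size grows with the loop gain near ω0.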

Active Damping of Resonant Modes

The designs we have discussed so far have the drawback that the resulting closed loop systems are very sensitive to load disturbances with frequencies close to the resonance ω0. Compare the Bode plots in Figures 2.6 and ??. Another drawback is that the bandwidth of the systems is restricted to be less than ω0; recall that the design rule for the notch filter design had this constraint. Yet another drawback with the notch filter design is that the closed loop system is very sensitive to parameter variations. We will now explore a design of a PID controller which avoids all these difficulties (at the cost of being more sensitive to measurement noise).

We will start with an ideal PID controller and add high-frequency roll-off afterwards. The loop transfer function with a PID controller is

L(s) = ω0²(kds² + kps + ki) / (s(s² + 2ζ0ω0s + ω0²)),

and the closed loop characteristic polynomial becomes

s³ + (kdω0² + 2ζ0ω0)s² + (kp + 1)ω0²s + kiω0².

The desired closed loop poles are given by a general third order polynomial which can be parameterized as

(s + αcωc)(s² + 2ζcωcs + ωc²) = s³ + (αc + 2ζc)ωcs² + (1 + 2αcζc)ωc²s + αcωc³.

Identification of coefficients of equal powers of s gives

kp = (1 + 2αcζc)ωc²/ω0² − 1,   ki = αcωc³/ω0²,   kd = ((αc + 2ζc)ωc − 2ζ0ω0)/ω0²,

Ti = ((1 + 2αcζc)ωc² − ω0²)/(αcωc³),   Td = ((αc + 2ζc)ωc − 2ζ0ω0)/((1 + 2αcζc)ωc² − ω0²).

Notice that the controller gain kp is negative if ωc is not sufficiently large. The critical value ωc0, where the proportional gain is zero, occurs for

(1 + 2αcζc)ωc0² = ω0².

The proportional gain is zero for this value and the controller transfer function has a zero for s = ±iωc0. The controller then has a perfect notch at ωc0. In Section 2.1 it was shown that zeros in the right half plane increase sensitivity; the design parameter ωc should thus be chosen larger than ωc0, a simple choice is ωc > ω0. The controller gains increase rapidly with ωc: kp as ωc² and ki as ωc³. Increasing ωc


Figure 2.9: Gain curves of the Gang of Four for the spring-mass system with a PID controller (2.10) that gives active damping. The parameters are ki = 0.267ω0 (dash-dotted), ki = ω0 (dashed) and ki = 8ω0 (full). The approximate estimates of high frequency controller gains and the crossover frequency of CS are denoted by circles.

by feedback means physically that the spring constant is increased, which naturally requires significant action; compare with Example ?? in Chapter ??.

To avoid feeding too much measurement noise into the system we introduce high frequency roll-off as we did for the notch filter design. The controller is then given by

C(s) = (kds² + kps + ki) / (s(s²Tf²/2 + sTf + 1)).    (2.10)

When Tf is zero the closed-loop poles are given by the roots of the desired third order polynomial, and when Tf is small the closed loop poles are close to these roots. In addition there are two large closed loop poles. To have effective filtering it is useful to use as large a value of Tf as possible. A simple way is to choose a value so that the largest sensitivities have reasonable values. A fraction of Td can be used as a starting value.

Figure 2.9 shows the gain curves for the Gang of Four for the system with a controller that damps the resonant modes. The parameters are ζc = 0.707, αc = 1 and ωc/ω0 = 0.6436, 1 and 2. The controller parameters are given in Table 2.1. A comparison with the corresponding curves for a controller based on a notch-filter design in Figure 2.7 shows that the controller with active damping has significant advantages: the |T| plot shows that the bandwidth is larger, and the |PS| plot shows that reduction of load disturbances is improved dramatically. The plots of the sensitivity functions S and T show that the robustness is maintained. The plot of CS shows that the improvements come at a cost because the high frequency gain of the controller is increased significantly. The high frequency gain of the controller increases rapidly with the design parameter ωc. The gain is 72 for ωc = 2ω0 and


Table 2.1: Parameters of the PID controller that provides active damping of the resonant modes. The frequency ω∗ is the highest frequency where |CS(iω)| = 1.

ωc/ω0    kp      ki/ω0   kdω0    ω0Tf    ωgc/ω0   Mcs     ωcs/ω0   ω∗/ω0
0.6436   0       0.267   1.534   0.311   0.65     7.14    3.56     32.2
1        1.414   1.00    2.394   0.20    2.67     17.64   5.28     121
2        8.656   8.00    4.808   0.10    4.98     72.17   10.07    977

the controller gain remains high for frequencies up to 1000ω0. To use this design it is thus necessary that the model is valid for frequencies that are at least three orders of magnitude larger than ω0, which is a very stringent requirement. Even the design with ωc = 0.64ω0 requires a model that is valid for frequencies of at least 30ω0.
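The gains in Table 2.1 follow directly from the coefficient matching above. A minimal sketch with the parameters from the text, ζc = 0.707, αc = 1 and ζ0 = 0.01 (the helper name is ours):

```python
def active_damping_pid(wc, zc=0.707, ac=1.0, w0=1.0, z0=0.01):
    """PID gains from matching the closed loop characteristic polynomial to
    (s + ac*wc)(s^2 + 2*zc*wc*s + wc^2)."""
    kp = (1 + 2 * ac * zc) * wc**2 / w0**2 - 1
    ki = ac * wc**3 / w0**2
    kd = ((ac + 2 * zc) * wc - 2 * z0 * w0) / w0**2
    return kp, ki, kd

# Reproduces the wc = 2*w0 row of Table 2.1: kp = 8.656, ki = 8, kd = 4.808.
print(active_damping_pid(wc=2.0))
```

Evaluating the helper for ωc/ω0 = 0.6436, 1 and 2 reproduces the kp, ki and kd columns of the table, and makes the rapid growth of the gains with ωc explicit.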

It follows from the expressions for Ti and Td that the positions of the controller zeros do not change much with ω0 if ωc is large. We can therefore expect that the system obtained with a controller having active damping is not as sensitive to parameter variations as the controller based on a notch filter. To explore this, Figure 2.10 shows Nyquist plots of the loop transfer functions when the process parameters are perturbed by 20%. A comparison with the corresponding curves for the notch filter design in Figure 2.8 shows that the system obtained with a controller providing active damping of the oscillatory modes is much less sensitive to parameter variations. A comparison of Figures 2.7 and 2.9 shows that the controller with active damping gives better performance than the controller based on a notch-filter design. The bandwidth is larger and the robustness is much better. The improvement comes at a cost: the plots of the transfer function CS in Figures 2.7 and 2.9 show that the controller with active damping has high gain at high frequencies. A consequence is that the process model must be accurate at high frequencies and that measurement noise is amplified and fed into the system. The noise also generates much actuator action. The high-frequency controller gain that can be permitted depends on sensor resolution, measurement noise, actuator range and resolution, and the bandwidth of sensors and actuators.

Summary

We can learn several things from the simple example with PID control of the spring-mass system.

We can first notice that the system can be controlled by a pure integrating controller provided that the requirements are modest. An estimate of the achievable gain crossover frequency is ωgc = 2ζ0ω0/gm, where ω0 is the resonant frequency, ζ0 is the damping of the resonant mode and gm is the desired gain margin. The approximate formula is valid for sufficiently small ζ0 (< 0.5). For nonresonant systems the achievable gain crossover frequency is instead ωgc = (1/Tar) ×


Figure 2.10: Nyquist plots of the loop transfer function for a controller with active damping. The resonance frequency of the process has the nominal value in the middle column, it is reduced by 20% in the left column and increased by 20% in the right column. The top row gives an overview, the lower row shows an expanded view of the region near the critical point. The nominal design parameters are ωc = 1, ζc = 0.5, αc = 1.

arctan(π/2 − ϕm), where Tar = −P′(0)/P(0) is the average residence time of the system and ϕm is the phase margin. The gain margin is the design parameter for oscillatory systems but the phase margin is the design parameter for non-resonant systems.

No improvement is obtained by adding proportional action, but significant improvement can be obtained by adding high frequency roll-off. Integral gain is a good measure of disturbance attenuation. Integral gain and bandwidth were increased by an order of magnitude from ki = 0.008ω0 to ki = 0.1ω0 in the particular example.

Further improvements can be obtained by using a full PID controller with high frequency roll-off. Two design methods were used. The resonance can be eliminated using a notch filter design. The achievable performance is limited by robustness requirements. With reasonable robustness requirements the bandwidth and integral gain can be increased to about ki = 0.5ω0. A drawback with the notch filter design is that disturbances near the resonance frequency are poorly attenuated. The system is also sensitive to variations in the resonance frequency of the system.

Bandwidths larger than ω0 can be obtained with PID controllers with high frequency roll-off, but the controller parameters have to be designed to give active damping. Such designs can be made using robust pole placement. The resonant frequency can also be changed by such controllers, compare the simple case in


Figure 2.11: Block diagram of a controller based on a structure with two degrees of freedom. The controller consists of a command signal generator, state feedback and an observer. Figure from [?].

Example ??. Controllers designed in this way can have low sensitivities and excellent disturbance attenuation. A drawback is that the controller has high gain at high frequency. Amplification of measurement noise is a factor that limits the bandwidth. The closed loop system is not sensitive to variations in the resonance frequency, but it requires models that are valid for frequencies well above the resonance frequency.

2.3 State Feedback

The PID controller is versatile but there are problems where performance can be improved significantly by using a more complex controller. Derivative action is very important to introduce damping in oscillatory systems. In a PID controller derivative action is used to predict future errors, and the prediction is done by linear extrapolation, see Figure 1.17 in [?].

There are more advanced controllers that obtain better performance by using a mathematical model of the process and the disturbances, in combination with measured process inputs and outputs, to predict. A block diagram of a system with such a controller is shown in Figure 2.11. The controller has three blocks: the feedforward generator, the state feedback and the Kalman filter. The feedforward generator generates the feedforward signal uff and the ideal behavior of all states xm. The feedforward signal uff is such that it produces the desired behavior xm of the states. It can be generated in real time from the command signal or from a table. The Kalman filter produces an estimate x̂ of the state based on a model and measurements of the process input u and output y. The state feedback generates the feedback signal K(xm − x̂) which is added to the feedforward signal uff to give the control signal u = K(xm − x̂) + uff.

There are many ways to design the algorithms in the blocks. The feedforward generator typically requires an inversion of the process dynamics. The feedforward signals can also be generated by tables driven by the command signal. Process dynamics


imposes constraints on what can be achieved. To obtain a realizable controller it is necessary to consider time delays and right half plane zeros. The maximum available control signal limits the response speed.

The Kalman filter or the observer is an essential element of the controller. The observer is based on a mathematical model of the process and its disturbances, and it is driven both by the control variable and the measured output. It can be designed in many different ways, by pole placement, LQG or robust control, depending on the specifications and the available information. The same methods can be used for the design of the state feedback because it is the dual of the observer design problem.

The controller in Figure 2.11 is a general version of a controller with two degrees of freedom. It separates the responses to command signals and disturbances. The disturbance response is influenced by the observer and the state feedback. The response to command signals is determined by the feedforward generator.

To get insight into the behavior of the system let us walk through what happens when the command signal is changed. To fix the ideas, assume that the system is in equilibrium with the observer state equal to the process state. When the command signal is changed a feedforward signal uff is generated. This signal is such that the process output gives the desired output. The process state changes in response to the feedforward signal. The observer tracks the state perfectly, because the initial state was correct. The feedback signal K(xm − x̂) is zero because x̂ = xm. If there are disturbances or modeling errors the feedback signal will be different from zero and attempt to correct the situation. After this broad discussion we will now treat some details.

Reference Values and Integral Action

We will first discuss a very simple way of introducing command signals. For simplicity we will assume that the full state is measured. Consider the process

dx/dt = Ax + Bu,   y = Cx,

and the controller

u = −Kx + Kr r.    (2.11)

Compared with the general case the reference value is fed directly to the controller and the model state does not appear in the control law (xm = 0). The closed loop system is then given by

dx/dt = (A − BK)x + BKxm + BKr r,   y = Cx.

A prime requirement is that the matrix K makes A − BK stable. The steady state is given by

x0 = −(A − BK)⁻¹(BKxm + BKr r),   y0 = Cx0 = −C(A − BK)⁻¹(BKxm + BKr r).

To obtain the desired steady state output, y0 = r, the feedforward gain should be

Kr = −(C(A − BK)⁻¹B)⁻¹.    (2.12)


This choice gives a calibrated system because the correct steady state is obtained by carefully matching the feedforward gain Kr to the system parameters. This way of introducing the reference value is similar to what was used in early control where the steady state error was eliminated by using a bias term.
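The calibrated-system computation can be illustrated on a small example. The sketch below uses an assumed double-integrator process (our own illustration, not from the text), places the poles of A − BK at −1 and −2, computes Kr with the sign that follows from solving (A − BK)x0 + BKr r = 0, and checks that the steady state output equals the reference:

```python
import numpy as np

# Assumed illustrative process: a double integrator.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[2.0, 3.0]])     # places the poles of A - B K at -1 and -2

Acl = A - B @ K
# Feedforward gain (2.12), scalar case.
Kr = -1.0 / (C @ np.linalg.inv(Acl) @ B)

r = 1.0
x0 = -np.linalg.inv(Acl) @ (B * Kr * r)   # steady state of the closed loop
y0 = (C @ x0).item()
print(y0)                                  # steady state output equals r
```

The drawback mentioned in the text is visible in the structure of the computation: Kr depends on A, B, C and K, so any modeling error or disturbance shifts the steady state, which is what integral action avoids.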

Integral action is a better way to make sure that the controller gives the correct steady state. State feedback generates a feedback law that drives the states to zero. Normally we consider only the process states, but if there is a special variable we want to drive to zero we can augment the state with this variable. If it is possible to find a stabilizing state feedback the special variable will always be driven to zero. To guarantee that the output follows the reference value r, we then introduce z = ∫(y − r)dt as an extra state variable. The state equation for z is

dz/dt = y − r.

Augmenting the process state x with z we get the following state equations for the augmented state:

d/dt [x; z] = [A 0; C 0][x; z] + [B 0; 0 −1][u; r],   y = [C 0][x; z].

If this system is reachable we can compute a state feedback u = −Kxx − Kiz that stabilizes the system. In particular, if r is constant the state z will be driven to zero and y will approach r. The control law

u=−Kx−Kiz,dzdt

=Cx− r. (2.13)

is the state space analog of PI control. Instead of feeding back the integrator stateand the integrator state and the full process state are fed back. Notice that thereference value only enters in the integrator state. To havegreater flexibility todesign the command response we can modify the control law to also feed back themodel statexm. The controller then becomes

u= K(x−xm)−Kiz+Kr r.

The parameterKr give extra freedome to adjust the command signal response.If all states are not measured, the control law becomes

dxdt

= Ax+Bu+L(y−Cx),dzdt

=Cxm− r, u= K(xm− x)−Kiz, (2.14)

wherex is an estimate of the process state.It is interesting to compare the control laws (2.11) and (2.13). In (2.11) the

reference enters as a feedforward term in the control law, but in (2.13) it enters asa feedback via the statez which has integrator dynamics.
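The behavior of (2.13) can be illustrated with a small simulation sketch. The double-integrator process, the input disturbance, and the gains (chosen so that the augmented characteristic polynomial is $(s+1)^3$) are assumptions made here for illustration, not taken from the text.

```python
# Sketch: state feedback with integral action, (2.13), on a double integrator
#   dx1 = x2, dx2 = u + d, dz = x1 - r, u = -k1 x1 - k2 x2 - ki z.
# The augmented characteristic polynomial is s^3 + k2 s^2 + k1 s + ki;
# the gains below place it at (s + 1)^3.
k1, k2, ki = 3.0, 3.0, 1.0
r, d = 1.0, 0.2                    # constant reference and input disturbance
x1 = x2 = z = 0.0
dt = 0.001
for _ in range(int(30 / dt)):      # forward-Euler simulation over 30 time units
    u = -k1 * x1 - k2 * x2 - ki * z
    x1 += x2 * dt
    x2 += (u + d) * dt
    z += (x1 - r) * dt
print(round(x1, 3))                # output settles at the reference despite d
```

At steady state the integrator forces $x_1 = r$ exactly, even though the constant disturbance $d$ is never measured.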

Since integral action appears as a separate integrator, windup protection and manual control can be done in the same way as for the ordinary PID controller.


46 CHAPTER 2. CONTROL DESIGN

Disturbance Observer

Integral action is typically introduced to reduce the effect of disturbances. The observer used in (2.14) may not work well when there are process disturbances, because the disturbances are not accounted for in the underlying model. Another way is to model both the process and its disturbances. To illustrate this we assume that the disturbance is unknown but constant and enters at the process input.

For simplicity we assume that there is a constant disturbance $v$ at the process input. A constant but unknown disturbance can be modeled by

$$\frac{dv}{dt} = 0; \tag{2.15}$$

the process input is then $u + v$. If the disturbance $v$ were known it would be natural to use the feedback $u = -Kx - v$. Since $v$ is not known we will instead use the feedback

$$u = -Kx - \hat{v}, \qquad \frac{d\hat{v}}{dt} = -L(u + \hat{v}), \tag{2.16}$$

where the differential equation is an observer for $v$. We can also view the differential equation as a feedback loop that adjusts $\hat{v}$ to drive $u + \hat{v}$ to zero. Elimination of $\hat{v}$ in the above equations gives

$$U(s) = -KX(s) - \frac{L}{s}KX(s),$$

which shows that the controller (2.16) has integral action. This scheme of introducing integral action is similar to the one illustrated in Figure 3.3 of Åström and Hägglund, Advanced PID Control.
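A quick way to convince oneself that (2.16) estimates the load is to simulate it on a toy process; the integrator process $dx/dt = u + v$, the gains $K = L = 1$ and the load $v = 0.5$ are assumptions made here for illustration.

```python
# Sketch: the disturbance observer (2.16) on an illustrative integrator process
#   dx/dt = u + v, full state measured, v = 0.5 an unknown constant load.
#   u = -K x - vhat, dvhat/dt = -L (u + vhat).
K, L = 1.0, 1.0
v = 0.5
x, vhat = 0.0, 0.0
dt = 0.001
for _ in range(int(40 / dt)):          # forward-Euler simulation
    u = -K * x - vhat
    x += (u + v) * dt
    vhat += -L * (u + vhat) * dt       # observer update; equals L*K*x*dt here
print(round(vhat, 3), round(abs(x), 4))  # vhat converges to v, x to zero
```

The closed loop dynamics are $\ddot{x} + \dot{x} + x = 0$ driven by the load step, so $\hat{v} \to v$ and the state is driven to zero without ever measuring $v$ directly.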

When the complete state is not measured we must design a complete observer. Let the process dynamics be described by

$$\frac{dx}{dt} = Ax + B(u + v), \qquad y = Cx, \tag{2.17}$$

where $u$ is the control signal, $y$ the measured output, $x$ the state of the system and $v$ a load disturbance. Combining this equation with the model (2.15) for the disturbance $v$ we find the following model, which describes the dynamics of the process and its disturbances:

$$\frac{d}{dt}\begin{pmatrix} x \\ v \end{pmatrix} =
\begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ v \end{pmatrix} +
\begin{pmatrix} B \\ 0 \end{pmatrix}u, \qquad
y = \begin{pmatrix} C & 0 \end{pmatrix}\begin{pmatrix} x \\ v \end{pmatrix}. \tag{2.18}$$

The state of the complete system is the state of the process $x$ and the disturbance state $v$. The 0's denote zero matrices of appropriate dimensions. The disturbance state $v$ is naturally not reachable from the control signal. Computation of the feedback gain $K_x$ does however only require that the original system is reachable, and the feedback gain from $\hat{v}$ is trivial.

If the disturbance $v$ were known, its effect on the system could be eliminated by choosing the control signal $u = -v$. Since the disturbance and the state are not


known we estimate them using the observer

$$\frac{d}{dt}\begin{pmatrix} \hat{x} \\ \hat{v} \end{pmatrix} =
\begin{pmatrix} A & B \\ 0 & 0 \end{pmatrix}\begin{pmatrix} \hat{x} \\ \hat{v} \end{pmatrix} +
\begin{pmatrix} B \\ 0 \end{pmatrix}u +
\begin{pmatrix} L_x \\ L_v \end{pmatrix}(y - C\hat{x}), \tag{2.19}$$

where $\hat{x}$ denotes the estimate of the process state $x$ and $\hat{v}$ denotes the estimate of the disturbance state $v$. The right hand side of (2.19) is a sum of three terms: the first is a linear function of the state, the second is a function of the control variable $u$, and the third is proportional to the error $e$, which is the difference between the measured output $y$ and its estimate $\hat{y} = C\hat{x}$. The matrices $L_x$ and $L_v$ are filter gains which determine how much weight is given to the error. An observer can be found if the augmented state is observable from $y$.

The control signal is given by

$$u = u_{ff} + K_x(x_m - \hat{x}) - \hat{v}, \tag{2.20}$$

where $u_{ff}$ is the feedforward signal and $x_m$ is the desired state. The signals $u_{ff}$ and $x_m$ are generated by the feedforward signal generator. The last term in (2.20) can be interpreted as feedforward from an estimate of the load disturbance. The idea to model disturbances and to estimate them using an observer is sometimes referred to as a disturbance observer.

More General Disturbance Models

The idea to augment the process states with states that capture properties of the disturbance, and to estimate the disturbance state, can be viewed as an extension of integral control. It admits capture of disturbances with special properties.

Sinusoidal disturbances with known frequency $\omega_d$ can be captured by

$$\frac{dv}{dt} = \begin{pmatrix} 0 & \omega_d \\ -\omega_d & 0 \end{pmatrix} v.$$

Periodic disturbances with known period $L$ can be modeled as

$$v(t) = v(t - L).$$

In general we can use the idea of a disturbance observer whenever the disturbance can be modeled by a linear equation

$$\frac{dw}{dt} = A_v w, \qquad v = C_v w.$$

The matrix $A_v$ will typically have eigenvalues on the imaginary axis to capture disturbances that do not go to zero as time increases.
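The sinusoidal model above has eigenvalues $\pm i\omega_d$, so its solution is a pure rotation that neither grows nor decays. The small sketch below (the frequency $\omega_d = 2$ and the initial state are assumptions made here for illustration) steps the exact discretization over one period:

```python
import math

# Sketch: the disturbance generator dv/dt = [[0, wd], [-wd, 0]] v has eigenvalues
# ±i*wd, so its states oscillate without decay. The exact solution over a step dt
# is a rotation by wd*dt; after one full period the state returns to its start.
wd = 2.0
T = 2 * math.pi / wd               # period of the modeled disturbance
n = 1000
dt = T / n
cs, sn = math.cos(wd * dt), math.sin(wd * dt)
v = [1.0, 0.0]
for _ in range(n):                 # exact discretization: rotate by wd*dt each step
    v = [cs * v[0] + sn * v[1], -sn * v[0] + cs * v[1]]
print([round(c, 6) for c in v])    # back at the initial state after one period
```

This is exactly the kind of disturbance that pure integral action cannot reject but a disturbance observer built on this model can.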

The standard theory for designing state feedback typically assumes that the full state is reachable from the control signal and that the state is observable from the output. When using feedback from an observer state we must require that the augmented system is observable from the output. Since the disturbance state $w$ naturally cannot be reachable from the control signal, some additional technical requirements are typically needed.


Figure 2.12: Block diagram of a controller based on feedforward and feedback from estimated states, with a simplified feedforward structure. The feedforward signal generator is driven by the reference $r$ and supplies $u_{ff}$ and $y_m$; the observer and state feedback (SFB) produce the feedback signal $u_{fb}$, which is added to $u_{ff}$ to form the process input $u$.

A Decoupled Feedforward Architecture

The control scheme in Figure 2.11 admits very tight control since feedback is generated from the state error $x_m - \hat{x}$. So far we have not given any formal restrictions on the feedforward signals $x_m$ and $u_{ff}$. It is, however, clear that the desired state $x_m$ must be compatible with the process state: the dimensions of $x$ and $x_m$ must be the same, and the components of the states must correspond to the same physical variables. One way to achieve this is to require that the signals $u_{ff}$ and $x_m$ are compatible with the model, meaning that the state $x_m$ is generated by

$$\frac{dx_m}{dt} = Ax_m + Bu_{ff}, \qquad y_m = Cx_m. \tag{2.21}$$

It then follows from the process model (2.17) that the feedback term $K_x(x_m - \hat{x})$ is zero under the ideal circumstances of no disturbances and correct initial conditions. We will also see that the controller structure can be simplified significantly.

Introduce $\tilde{x} = x_m - \hat{x}$ and use Equations (2.19) and (2.21) to eliminate $\hat{x}$. The controller then becomes

$$\begin{aligned}
\frac{d\tilde{x}}{dt} &= (A - BK_x - L_xC)\tilde{x} + L_x(y_m - y) = A_c\tilde{x} + L_x(y_m - y),\\
\frac{d\hat{v}}{dt} &= L_vC\tilde{x} - L_v(y_m - y),\\
u &= u_{ff} + K_x\tilde{x} - \hat{v}.
\end{aligned} \tag{2.22}$$

A block diagram of the system is given in Figure 2.12. The controller consists of the feedforward signal generator, the model of the system, which is parameterized by the feedback gain $K$, and the observer gains $L_x$ and $L_v$. There are many design methods that can be used to find these gains. Notice that under the assumption that $x_m$ is generated by (2.21) it is sufficient if the feedforward generator produces the signals $u_{ff}$ and $y_m$.


We will now rewrite (2.22) in a way which gives insight into the properties of the controller. Replace the state $\hat{v}$ in (2.22) with $w$ given by

$$w = L_vCA_c^{-1}\tilde{x} - \hat{v}.$$

The controller (2.22) can then be written as

$$\begin{aligned}
\frac{d\tilde{x}}{dt} &= A_c\tilde{x} + L_x(y_m - y),\\
\frac{dw}{dt} &= (L_v + L_vCA_c^{-1}L_x)(y_m - y) = K_i(y_m - y),\\
u &= u_{ff} + (K_x - L_vCA_c^{-1})\tilde{x} + w.
\end{aligned} \tag{2.23}$$

This equation shows that the controller is indeed a generalization of a PID controller. The state $w$ represents the integral term and $K_i$ is the integral gain. The state $\tilde{x}$ is a generalization of proportional and derivative action with high frequency roll-off. Recall that $\tilde{x} = x_m - \hat{x}$ is the difference between the ideal state $x_m$ and the estimate of the process state $\hat{x}$.

The controller transfer function is

$$C(s) = G_{u_{fb}e}(s) = \frac{L_v}{s} + \Bigl(K_x - \frac{1}{s}L_vC\Bigr)\bigl(sI - A + BK_x + L_xC\bigr)^{-1}L_x \tag{2.24}$$

and it has the integral gain

$$K_i = L_v\bigl(1 + CA_c^{-1}L_x\bigr) = L_v\bigl(1 + C(A - BK_x - L_xC)^{-1}L_x\bigr).$$

The controller structure given by (2.23) and in Figure 2.12 has several advantages. The feedforward generator only has to supply $y_m$ and $u_{ff}$, which gives a nice separation of feedback and feedforward. The fact that integral action appears as a separate integrator means that windup protection and manual control can be done in the same way as for simple PID controllers. The integration of advanced control with PID control is then simplified. Another way to deal with integrator windup is to limit the control signal $u$ so that it corresponds to the real saturated control before it is fed into the observer. A block diagram of such a scheme is shown in Figure 2.13.

There are many ways to generate the feedforward signals $x_m$, $y_m$ and $u_{ff}$. Computing $u_{ff}$ is essentially a way of inverting the process dynamics, to find the input that generates the ideal output $y_m$. A particularly simple form is obtained by simply specifying $y_m$ and setting $u_{ff} = 0$. The controller then becomes

$$\begin{aligned}
\frac{d\tilde{x}}{dt} &= A_c\tilde{x} - L_x y,\\
\frac{dw}{dt} &= (L_v + L_vCA_c^{-1}L_x)(y_m - y) = K_i(y_m - y),\\
u &= (K_x - L_vCA_c^{-1})\tilde{x} + w.
\end{aligned} \tag{2.25}$$

This scheme is equivalent to using zero set point weights in an ordinary PID controller.


Figure 2.13: Block diagram of a controller based on feedforward and feedback from estimated states, with windup protection. An actuator model limits the control signal before it is fed back to the observer.

Sensor Calibration

A sensor bias in a single-input single-output system will always give an error. The situation is different for multivariable systems, because it may be possible to eliminate the bias in one sensor by exploiting other sensors. We will use an example to illustrate how this can be done.

Example 2.4 (Pendulum on a cart). Consider a pendulum on a cart. The normalized state equations are

$$\frac{dx}{dt} = \begin{pmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 1 & 0 \end{pmatrix}x + \begin{pmatrix} 0\\1\\0\\1 \end{pmatrix}u, \tag{2.26}$$

where $x_1$ is the position of the cart, $x_2$ its velocity, $x_3$ the pendulum angle and $x_4$ the angular velocity of the pendulum. Assume that the position measurement $y_1$ is bias free and that there is a constant bias $x_5$ in the angle measurement. Hence

$$y_1 = x_1, \qquad y_2 = x_3 + x_5, \qquad \frac{dx_5}{dt} = 0.$$

The model of the system and its environment is

$$\frac{dx}{dt} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}x + \begin{pmatrix} 0\\1\\0\\1\\0 \end{pmatrix}u, \qquad y = \begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 1 \end{pmatrix}x. \tag{2.27}$$


The observability matrix is

$$W_o = \begin{pmatrix} C\\ CA\\ CA^2 \end{pmatrix} = \begin{pmatrix}
1 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 1\\
0 & 1 & 0 & 0 & 0\\
0 & 0 & 0 & 1 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0
\end{pmatrix}.$$

This matrix has full rank because rows 1, 3, 6, 4 and 2 are linearly independent. It is thus possible to design an observer for the state. To find the controller we design a state feedback for the original process model (2.26). This system is of fourth order and the design gives the feedback gains $k_1$, $k_2$, $k_3$ and $k_4$. Then we design an observer for the augmented system (2.27), and the controller becomes

$$\frac{d\hat{x}}{dt} = A_a\hat{x} + B_au + K(y - C_a\hat{x}), \qquad u = k_1(x_{m1} - \hat{x}_1) + \cdots + k_4(x_{m4} - \hat{x}_4) + u_{ff},$$

where $A_a$, $B_a$ and $C_a$ are the matrices of the augmented system. Notice that there is no feedback from the state $\hat{x}_5$. The estimate of the calibration error only appears in the term $y_2 - \hat{x}_3 - \hat{x}_5$ in the observer.
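The full-rank claim is easy to check numerically. The sketch below (pure-Python row reduction, written for this note rather than taken from the book) stacks $C, CA, \ldots, CA^4$ and computes the rank:

```python
# Sketch: checking that the augmented pendulum-on-a-cart model (2.27) is
# observable by computing the rank of the observability matrix.

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
            for i in range(len(M))]

def rank(M, tol=1e-9):
    """Rank via Gauss-Jordan elimination with a simple pivot search."""
    M = [row[:] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if abs(M[i][col]) > tol), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and abs(M[i][col]) > tol:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 1, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 0, 0]]
C = [[1, 0, 0, 0, 0],
     [0, 0, 1, 0, 1]]

Wo, CAk = [], C
for _ in range(5):                 # stack C, CA, ..., CA^4
    Wo.extend(CAk)
    CAk = matmul(CAk, A)
print(rank(Wo))                    # prints 5: the augmented state is observable
```

(The text only needs the rows up to $CA^2$; stacking up to $CA^4$ is the generic observability test for a fifth-order system.)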

If there is no bias correction, a bias in the sensor for the pendulum angle will cause the cart to move; the bias is thus reflected in a cart motion. ∇

The possibility to reduce the effect of sensor bias and load disturbances is a nice property of state feedback with observers.

Summary

State feedback with complete state information can be viewed as a special case of cascade control where all internal signals are available. State feedback does not automatically give integral action. Integral action can be introduced in many ways: by forcing the integral of $y - r$ to zero, or by modeling disturbances and measurement errors. In all cases the process model is augmented with a model that captures the behavior of command signals, disturbances and measurement errors. Modeling the disturbances is a natural extension that makes it possible to tailor the controller to the specific situation. The disturbance states are naturally not reachable, but they must be observable from the measured signals.

Pole Placement, LQG-LTR, H∞ and All That Jazz

There are many design methods that can be used to determine the controller parameters, and many of them are supported by software. It is interesting that most methods can be represented by the block diagram in Figure ?? or Figure ??. All methods start with a linear model of the system. In all methods there are a number of design parameters that have to be chosen by the user; the method then delivers controller parameters that satisfy various criteria which attempt to capture robustness and


performance. Since there are extensive discussions of the design methods in the literature, we will only briefly summarize some of the methods and their properties.

In pole placement, or eigenvalue placement, the closed loop poles are specified. All closed loop poles can be specified, but the eigenvalues of the matrices $A - BK$ and $A - LC$ can also be specified separately. In MATLAB the computations are done by the functions acker or place. Specifying the closed loop poles does not guarantee robustness, so it is necessary to compute sensitivities and other quantities to obtain a robust closed loop system. It is fairly easy to specify the closed loop poles for low order systems but not so easy for high order systems. Some design rules for pole placement are given below.

The secret to obtaining a robust pole placement design is to choose closed loop poles based on the process dynamics. Here is a simple design rule.

• Choose the desired gain crossover frequency $\omega_{gc}$ and classify the process poles and zeros as slow, central and fast.

• Simplify the model by neglecting very fast poles and zeros, those that are 10 to 20 times faster than $\omega_{gc}$.

• Choose closed loop poles to match slow stable zeros. Slow stable process zeros are thus cancelled by identical controller poles. Slow unstable zeros impose limitations on the achievable gain crossover frequency ($\omega_{gc} < 0.5\,z_{RHP}$).

• Choose closed loop zeros to match fast stable poles. Fast stable process poles are thus cancelled by corresponding zeros in the controller. Fast unstable poles impose limitations on the achievable crossover frequency ($\omega_{gc} > 2\,p_{RHP}$).

The cancellations imply that the model can be simplified. Closed loop poles are then chosen in the range of the central poles and a controller is designed by pole placement. The factors corresponding to the cancellations are then added to the controller. A detailed discussion is given in Section 12.4 of [?].

Linear quadratic Gaussian (LQG) control can also be used. In this case the controller is specified by assuming a stochastic noise model and minimizing the expected value of the criterion

$$V = x^T(t_1)Q_0x(t_1) + \int_{t_0}^{t_1}\bigl(x^T(s)Q_xx(s) + 2x^T(s)Q_{xu}u(s) + u^T(s)Q_uu(s)\bigr)\,ds.$$

When optimization is made over a finite time interval, the controller gain $K$ and the observer gain $L$ are functions of time. When the time interval is infinite the gains are constant, and the controller gain is given by a Riccati equation.

The computations are executed using the MATLAB function lqg. In some cases there are good physical models that give the noise model; in other cases the noise model can be determined from experimental data using system identification. Minimizing a quadratic loss function does not guarantee that the closed loop


system is robust. It is therefore necessary to check the robustness of a design afterwards. Choosing the loss function also requires some care. Scaling of the design variables is a good starting point, and having chosen state variables with a good physical interpretation helps. For example, increasing the penalty on variables representing velocity typically improves damping. Optimization is useful for multivariable systems. A good source for LQG is XXX.

Since LQG does not guarantee robustness it is necessary to explore the solution to make sure that the design choices made result in a robust system. There are several attempts to extend LQG to obtain a robust controller: there are methods for robustness recovery (LQG/LTR), and the so-called $H_\infty$ method gives controllers with guaranteed robustness. The design parameters are typically frequency weightings whose choice also requires some care. Good advice is to collaborate with colleagues who have experience in the use of these design methods.

2.4 A Drive System

We have mentioned several times that many control problems can be effectively solved using PID control. In this section we will discuss a more complicated system, which also shows that there are cases where significant improvements can be obtained by using a controller that is more complicated than a PID.

A schematic picture of a positioning system is shown in Figure 2.14. The system has two masses connected by a spring, and one mass is forced by a voice coil. The system is a simple prototype for many practical systems: it could represent the drive mechanism for a DVD drive or an optical memory, a motor driving an inertia with a flexible shaft, or a drive system for a robot with elastic joints.

The main requirements are that the system should have a fast response to command signals, be insensitive to forces acting on the second mass, and be robust to process variations.

The moving mass of the voice coil assembly is $m_1$, the mass we want to position is $m_2$ and the spring coefficient is $k$. It is assumed that $m_1$ is smaller than $m_2$, that the voice coil is current driven, and that the force generated is proportional to the current. For simplicity we assume that the only damping is caused by the relative motion of the masses. The equations of motion of the system are given by the momentum balances

$$\begin{aligned}
m_1\ddot{x}_1 &= k(x_2 - x_1) + c_d(\dot{x}_2 - \dot{x}_1) + k_I I,\\
m_2\ddot{x}_2 &= k(x_1 - x_2) + c_d(\dot{x}_1 - \dot{x}_2) + F_d.
\end{aligned} \tag{2.28}$$

The input is the current $I$, which corresponds to the control signal and gives a force on the first mass; the output is the position $x_2$ of the second mass, and the major disturbance is the force $F_d$ applied to the second mass.

Figure 2.14: Schematic picture of the drive system.

The state variables are chosen as the positions $x_1$ and $x_2$ and their scaled derivatives

$$x_3 = \frac{\dot{x}_1}{\omega_0}, \qquad x_4 = \frac{\dot{x}_2}{\omega_0}, \qquad \omega_0 = \sqrt{\frac{k(m_1 + m_2)}{m_1m_2}};$$

all states have dimension [m], and the scale factor $\omega_0$ is the undamped natural frequency of the system when the control signal is zero. The state equations are

$$\frac{1}{\omega_0}\frac{dx}{dt} =
\begin{pmatrix}
0 & 0 & 1 & 0\\
0 & 0 & 0 & 1\\
-\alpha_1 & \alpha_1 & -\beta_1 & \beta_1\\
\alpha_2 & -\alpha_2 & \beta_2 & -\beta_2
\end{pmatrix}x +
\begin{pmatrix} 0\\0\\\gamma_1\\0 \end{pmatrix}I +
\begin{pmatrix} 0\\0\\0\\\gamma_2 \end{pmatrix}F_d, \qquad
y = \begin{pmatrix} 0 & 1 & 0 & 0 \end{pmatrix}x. \tag{2.29}$$

The parameters are given by

$$\alpha_1 = \frac{m_2}{m_1 + m_2}, \quad \alpha_2 = \frac{m_1}{m_1 + m_2}, \quad \beta_1 = \frac{c_d}{\omega_0 m_1}, \quad \beta_2 = \frac{c_d}{\omega_0 m_2}, \quad \gamma_1 = \frac{k_I}{\omega_0^2 m_1}, \quad \gamma_2 = \frac{1}{\omega_0^2 m_2}.$$

Their values are $m_1 = 10/9$, $m_2 = 10$, $k = 1$, $c_d = 0.1$ and $k_I = 5$. They are normalized to give the resonance frequency $\omega_0 = 1$; the damping ratio is $\zeta_0 = 0.05$.
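These normalizations are quick to verify numerically; the small check below is written for this note and is not part of the original text.

```python
import math

# Sketch (illustrative check): the drive system parameter values m1 = 10/9,
# m2 = 10, k = 1, cd = 0.1, kI = 5 give w0 = 1 and zeta0 = 0.05, and the
# normalized coefficients of (2.29).
m1, m2, k, cd, kI = 10/9, 10.0, 1.0, 0.1, 5.0
mtot = m1 + m2
m = m1 * m2 / mtot                      # equivalent mass (comes out as 1)
w0 = math.sqrt(k * mtot / (m1 * m2))    # resonance frequency (1)
zeta0 = cd / (2 * m * w0)               # damping ratio (0.05)
alpha1, alpha2 = m2 / mtot, m1 / mtot   # 0.9 and 0.1
beta1, beta2 = cd / (w0 * m1), cd / (w0 * m2)          # 0.09 and 0.01
gamma1, gamma2 = kI / (w0**2 * m1), 1 / (w0**2 * m2)   # 4.5 and 0.1
print(round(w0, 6), round(zeta0, 6), round(gamma1, 6))
```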

Straightforward but tedious calculations show that the transfer function from the current $I$ to the position $x_2$ is

$$P(s) = \frac{k_I(c_ds + k)}{(m_1 + m_2)s^2(ms^2 + c_ds + k)} = \frac{k_I}{(m_1 + m_2)s^2}\times\frac{c_ds + k}{ms^2 + c_ds + k}
= \frac{k_I}{m_{tot}s^2}\times\frac{2\zeta_0\omega_0 s + \omega_0^2}{s^2 + 2\zeta_0\omega_0 s + \omega_0^2} = \frac{0.045s + 0.45}{s^2(s^2 + 0.1s + 1)}, \tag{2.30}$$

where $m_{tot} = m_1 + m_2$ is the total mass and $m = m_1m_2/m_{tot}$ is the equivalent mass. The frequency $\omega_0 = \sqrt{k/m}$ is the natural frequency of the system (the natural


Figure 2.15: The left figure shows the Bode plot of the transfer function $P$ (dashed) and the right figure the corresponding Nyquist plot. The circle indicates the gain crossover frequency that can be obtained by using PID control.

frequency of the spring–mass system with mass $m$ and spring coefficient $k$). The system is conveniently normalized using the natural frequency $\omega_0$, the damping ratio $\zeta_0$, the inertia ratio $m_2/m_1$ and the gain $k_I/m_{tot}$.

The transfer function $P(s)$ has complex poles at $\omega = \omega_0 = 1$ due to the resonance and a zero at $\omega = \omega_0/(2\zeta_0) = 10$. Bode and Nyquist plots of the transfer function are shown in Figure 2.15.
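These features are easy to confirm by evaluating (2.30) numerically; the check below is an illustration written for this note.

```python
# Sketch (illustrative check): evaluating (2.30). At the resonance w = w0 = 1
# the gain is about 1/(2*zeta0) = 10 times the rigid-body gain, and at low
# frequencies P(s) is close to the double integrator 0.45/s^2 of (2.31).

def P(s):
    return (0.045 * s + 0.45) / (s**2 * (s**2 + 0.1 * s + 1))

peak = abs(P(1j))                     # gain at the resonance w0 = 1
rigid = abs(0.45 / (1j)**2)           # rigid-body gain 0.45/s^2 at w = 1
print(round(peak, 3), round(peak / rigid, 1))   # 4.522 and about 10
low = abs(P(0.01j)) / abs(0.45 / (0.01j)**2)
print(round(low, 3))                  # close to 1: double-integrator behavior
```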

PID Control

A controller for positioning the mass $m_2$ will now be developed. We will first make an assessment of what can be achieved by PID control, and we will then discuss how the various requirements can be met quantitatively. The dynamics of the system are given by the transfer function $P$; see Figure 2.15. For the simple oscillatory system discussed in Section ?? it was possible to actively damp the oscillatory modes using PID control. Unfortunately this is not possible for the system (2.30), because the phase lag of the system is more than 180° for all frequencies. Derivative action is needed to stabilize the system, and it cannot also provide active damping of the oscillatory mode.

A PID controller will be designed by finding an ideal PID controller for an approximate model, adding high frequency roll-off, and fine tuning the parameters by computing appropriate closed loop properties.

For low frequencies ($|s| \ll \omega_0$) the transfer function $P$ can be approximated by

$$P(s) \approx \hat{P}(s) = \frac{k_I}{m_{tot}s^2} = \frac{0.45}{s^2}, \tag{2.31}$$

which is the dynamics of the mass $m_{tot} = m_1 + m_2$ driven by the voice coil. This model implies that the masses move synchronously, as if they were rigidly connected. The model is a good approximation for excitations with frequencies well below the resonance. For more rapid excitations the masses no longer move synchronously and it is then necessary to use the more complicated model given by (??).


The loop transfer function of (??) with an ideal PID controller is

$$L(s) = \frac{k_I(k_ds^2 + k_ps + k_i)}{m_{tot}s^3},$$

and the closed loop has the characteristic polynomial

$$s^3 + \frac{k_dk_I}{m_{tot}}s^2 + \frac{k_pk_I}{m_{tot}}s + \frac{k_ik_I}{m_{tot}}.$$

Requiring this polynomial to be

$$(s^2 + 2\zeta_c\omega_cs + \omega_c^2)(s + \alpha_c\omega_c)$$

gives the controller parameters

$$k_d = \frac{m_{tot}(\alpha_c + 2\zeta_c)\omega_c}{k_I}, \qquad k_p = \frac{m_{tot}(1 + 2\alpha_c\zeta_c)\omega_c^2}{k_I}, \qquad k_i = \frac{m_{tot}\alpha_c\omega_c^3}{k_I}. \tag{2.32}$$

Reasonable choices are $\zeta_c = 0.5$ and $\alpha_c = 1$; the parameter $\omega_c$ will be used as a design parameter. With these choices we get

$$k_d = \frac{2m_{tot}}{k_I}\omega_c, \qquad k_p = \frac{2m_{tot}}{k_I}\omega_c^2, \qquad k_i = \frac{m_{tot}}{k_I}\omega_c^3. \tag{2.33}$$

It follows from these equations that $T_i/T_d = 2$, and that the controller has complex zeros at $\omega = 1/(T_d\sqrt{2})$ with damping ratio 0.707. Adding high frequency roll-off gives the controller transfer function

$$C(s) = \frac{k_ds^2 + k_ps + k_i}{s(1 + sT_f + s^2T_f^2/2)}, \tag{2.34}$$

where we choose $T_f = 0.1T_d$. The controller will work well for the approximate model (2.31) if $\omega_c$ is sufficiently small.
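Evaluating the tuning rules at one design point makes the numbers used later in the section concrete; the script is an illustration written for this note, using the design choice $\omega_c = 0.05$ discussed below.

```python
# Sketch (illustrative): the tuning formulas (2.33)-(2.34) for the drive system,
# mtot = 100/9 and kI = 5, at the design choice wc = 0.05.
mtot, kI = 100/9, 5.0
wc = 0.05
kd = 2 * mtot / kI * wc         # ≈ 0.222
kp = 2 * mtot / kI * wc**2      # ≈ 1.11e-2
ki = mtot / kI * wc**3          # ≈ 2.78e-4
Td, Ti = kd / kp, kp / ki
Tf = 0.1 * Td                   # roll-off filter time constant of (2.34), ≈ 2
print(round(kp, 5), round(ki, 7), round(kd, 3))
print(round(Ti / Td, 3), round(Tf, 3))   # Ti/Td = 2, as stated in the text
```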

To find reasonable values of the design parameter $\omega_c$ we make an analysis similar to the one made for the spring–mass system in Section XXX. Assuming a controller where the derivative action dominates at the gain crossover, we find $\omega_{gc} = 2\zeta_0\omega_0/g_m$. A gain margin of 2.5 then gives $\omega_{gc} \approx 0.04\omega_0$. Figure 2.16 shows Bode and Nyquist plots for the closed loop system. The low frequency part of the Nyquist curve is practically the same for all values of $\omega_c$. The reason is that the loop transfer function for the approximate model $\hat{P}$,

$$\hat{L}(s) = \hat{P}(s)C(s) = \frac{k_I(k_ds^2 + k_ps + k_i)}{m_{tot}s^3(1 + sT_f + s^2T_f^2/2)}, \tag{2.35}$$

is invariant to frequency scaling because $k_d$, $k_p$ and $k_i$ are proportional to $\omega_c$, $\omega_c^2$ and $\omega_c^3$, respectively. The corresponding invariance in the Bode plot is that the curves are identical for low frequencies apart from a frequency shift. The loops in the Nyquist plot are due to the difference between $P$ and $\hat{P}$.

The Nyquist plot shows that the sensitivity increases with increasing $\omega_c$. The circular shaped loops near the origin are caused by the oscillatory process dynamics. The loops increase with increasing $\omega_c$; for $\omega_c = 0.1$ they are so large that they encircle the critical point $-1$ and the system is unstable. The diameter of the loops is approximately equal to $|L(i\omega_0)|$, which gives the values 0.24 ($\omega_c = 0.04$), 0.7 ($\omega_c = 0.06$), 1.3 ($\omega_c = 0.08$) and 1.8 ($\omega_c = 0.1$). Important properties of the closed loop system are illustrated in Table 2.2 and Figure 2.17, which show how key parameters depend on the design parameter $\omega_c$.

Figure 2.16: Bode (left) and Nyquist plots (right) of PID control of the drive system for $\omega_c$ = 0.04 (dotted), 0.06 (dash-dotted), 0.08 (dashed) and 0.1 (full).

The requirements on robustness can be captured by the largest sensitivities $M_s$ and $M_t$. The sensitivities are practically constant, $M_s = 1.26$ and $M_t = 1.69$, for $\omega_c \le 0.05$. The sensitivity to load disturbances $M_{ps}$ decreases, but the sensitivity to measurement noise $M_{cs}$ increases with increasing $\omega_c$. The integral gain $k_i$ also increases with increasing $\omega_c$.

Since load disturbances act on the second mass they are not captured by $M_{ps}$; instead we will consider the transfer function from the load disturbance $F_d$ to the output $y$. Let $C(s)$ denote the controller transfer function. It follows from (2.28) that the transfer function from a force $F_d$ on the second mass to the position of the second mass is

$$G_{yd}(s) = \frac{m_1s^2 + c_ds + k}{m_1m_2s^4 + c_d(m_1 + m_2)s^3 + k(m_1 + m_2)s^2 + k_I(c_ds + k)C(s)}
= \frac{s^2 + 2(m/m_1)\zeta_0\omega_0s + (m/m_1)\omega_0^2}{m_2s^2(s^2 + 2\zeta_0\omega_0s + \omega_0^2)}\,S(s). \tag{2.36}$$

The largest value of $G_{yd}$ is given in Table 2.2; notice that it is approximately five times smaller than $M_{ps}$. For small $s$ (low frequencies) we have the approximation $G_{yd} \approx s/(k_ik_I)$. Since load disturbances typically have low frequencies, the integral gain is thus a good measure of load disturbance attenuation.
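The low-frequency approximation can be checked numerically; the design point $\omega_c = 0.05$ and the check itself are illustrative, written for this note.

```python
# Sketch (illustrative): verifying Gyd(s) ≈ s/(ki*kI) of (2.36) at low
# frequency, with the PID controller (2.34) tuned by (2.33) for wc = 0.05.
m1, m2, k, cd, kI = 10/9, 10.0, 1.0, 0.1, 5.0
mtot = m1 + m2
wc = 0.05
kd, kp, ki = 2*mtot/kI*wc, 2*mtot/kI*wc**2, mtot/kI*wc**3
Tf = 0.1 * kd / kp

def C(s):
    return (kd*s**2 + kp*s + ki) / (s * (1 + s*Tf + (s*Tf)**2 / 2))

def Gyd(s):
    num = m1*s**2 + cd*s + k
    den = m1*m2*s**4 + cd*mtot*s**3 + k*mtot*s**2 + kI*(cd*s + k)*C(s)
    return num / den

s = 1e-3j
ratio = abs(Gyd(s)) / abs(s / (ki * kI))
print(round(ratio, 2))           # close to 1 at low frequency
```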

The response time can be captured by the bandwidth $\omega_{bw}$ or by the settling time $T_s$. The time responses are shown in Figure 2.18. The overshoot is large for a controller with error feedback, but it can be reduced significantly by using set point weighting. It is generally true for oscillatory systems that the set point response


Table 2.2: Properties of the closed loop system obtained with PID control, given as functions of the design parameter $\omega_c$.

ωc      Ms     Mt    Mps    Mcs    Mxd    ki×10⁴   ωbw    ωgc
0.01    1.26   1.69  3335   0.005  672     0.02    0.035  0.020
0.02    1.26   1.69   840   0.020  168     0.18    0.071  0.040
0.03    1.26   1.69   373   0.046  74.6    0.60    0.108  0.061
0.04    1.27   1.69   209   0.084  41.9    1.42    0.145  0.081
0.05    1.29   1.69   134   0.138  26.8    2.78    0.183  0.102
0.06    1.51   1.69  93.2   0.241  18.6    4.80    0.224  0.123
0.07    2.08   1.69  68.4   0.451  13.6    7.62    0.268  0.144
0.08    3.51   2.80  52.3   0.954  10.4   11.38    0.738  0.166
0.09   10.60   9.91  41.3   3.443   8.2   16.20    0.379  0.188

can be improved significantly by using set-point weighting. The reason for this is that the excitation of the oscillatory mode is significantly reduced by feeding the set-point only to the integral term. The settling times are 200, 150 and 100 for $\omega_c$ = 0.04, 0.06 and 0.08.

The reason why $\omega_c$ cannot be increased too much is that the controller is based on a simplified model. The essential assumption is that the oscillatory mode is neglected, which means that the positions $x_1$ and $x_2$ are the same. The effect of the model simplifications is clearly seen in Figure 2.18, which shows that the positions are practically the same for $\omega_c/\omega_0 = 0.04$. The masses move synchronously and the approximation of the dynamics with a double integrator works well. For the design with $\omega_c = 0.08\omega_0$ there are deviations between $x_1$ and $x_2$, and the system is unstable for $\omega_c = 0.10$.

We can also draw another conclusion from the green curves in Figure ??, which show the set point response for a controller with zero set point weights on the derivative and proportional terms. With set point weighting there is less excitation of the oscillatory modes, and the oscillatory mode is not visible in the curves. The overshoot also decreases significantly without increasing the settling time.


In summary we find that the system can be controlled by a PID controller. It is important to use a roll-off filter and set-point weighting, but the achievable performance is limited. A reasonable choice of the design parameter is $\omega_c = 0.05$, which gives the controller parameters $k_p = 1.11\times10^{-2}$, $k_i = 2.78\times10^{-4}$, $k_d = 0.222$ and $T_f = 2$. The maximum sensitivities are $M_s = 1.3$ and $M_t = 1.7$, and the settling time is about 100 s.


Figure 2.17: Gain plots of the Gang of Four for PID control of the drive system. The design parameter $\omega_c$ has the values 0.04 (dotted), 0.06 (dash-dotted), 0.08 (dashed) and 0.10 (full).

State Feedback

The performance that can be achieved with PID control is limited to settling times of the order of 100 s. The PID controller, which predicts by linear extrapolation, does not have the ability to capture what happens when the masses do not move together. To obtain a faster response we will now design a controller based on a more complicated model that captures the physics of the system better. We will design a controller with the structure shown in Figure 2.11, based on state feedback and an observer. To obtain a crossover frequency higher than the one obtained with PID control it is necessary to design a controller that gives a substantial phase lead. We will use pole placement to find the feedback gains and the observer gains. Integral action will be obtained using a disturbance observer.

Requiring that the matrix $A - BK$ has the eigenvalues $-4 \pm 4i$, $-8 \pm 8i$, and that the matrix $A - LC$ has the eigenvalues $-2$, $-0.05 \pm i$ and $-0.5 \pm 0.5i$, the feedback and observer gains become

$$K_x = \begin{pmatrix} 38.7 & 9063.5 & 5.3 & 2497.8 \end{pmatrix}, \qquad
L_x = \begin{pmatrix} 13.0690 \\ 3.0000 \\ 2.5375 \\ 2.5025 \end{pmatrix}, \qquad
L_v = 2.2278. \tag{2.37}$$

The values can be obtained with standard pole placement routines such as the MATLAB functions place or acker.
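The text refers to MATLAB code at this point; as a substitute illustration (written for this note, not the book's code), the feedback gain $K_x$ of (2.37) can be reproduced with Ackermann's formula $K = (0\;\cdots\;0\;1)\,W_r^{-1}p(A)$ in pure Python:

```python
# Sketch: reproducing the state feedback gain Kx of (2.37) by Ackermann's
# formula for the drive system (2.29) with w0 = 1.

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
            for i in range(len(M))]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    T = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(T[r][c]))
        T[c], T[p] = T[p], T[c]
        for r in range(c + 1, n):
            f = T[r][c] / T[c][c]
            for j in range(c, n + 1):
                T[r][j] -= f * T[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (T[i][n] - sum(T[i][j] * x[j] for j in range(i + 1, n))) / T[i][i]
    return x

# (2.29) with alpha1 = 0.9, alpha2 = 0.1, beta1 = 0.09, beta2 = 0.01, gamma1 = 4.5
A = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-0.9, 0.9, -0.09, 0.09],
     [0.1, -0.1, 0.01, -0.01]]
B = [0.0, 0.0, 4.5, 0.0]

# Desired poles -4±4i, -8±8i: (s^2+8s+32)(s^2+16s+128)
# = s^4 + 24 s^3 + 288 s^2 + 1536 s + 4096

# Reachability matrix Wr = [B, AB, A^2B, A^3B] (as columns)
cols, v = [], B
for _ in range(4):
    cols.append(v)
    v = [sum(A[i][j] * v[j] for j in range(4)) for i in range(4)]
Wr = [[cols[j][i] for j in range(4)] for i in range(4)]

# p(A) = A^4 + 24 A^3 + 288 A^2 + 1536 A + 4096 I
pA = [[4096.0 * (i == j) for j in range(4)] for i in range(4)]
Ak = [row[:] for row in A]
for c in (1536.0, 288.0, 24.0, 1.0):
    for i in range(4):
        for j in range(4):
            pA[i][j] += c * Ak[i][j]
    Ak = matmul(Ak, A)

# K = e4^T Wr^{-1} p(A): first y = Wr^{-T} e4, then K = y^T p(A)
y = solve([[Wr[j][i] for j in range(4)] for i in range(4)], [0.0, 0.0, 0.0, 1.0])
K = [sum(y[i] * pA[i][j] for i in range(4)) for j in range(4)]
print([round(g, 1) for g in K])    # ≈ [38.7, 9063.5, 5.3, 2497.8]
```

The observer gains $L_x$, $L_v$ follow the same way from the dual (transposed) problem for the augmented fifth-order system.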

The controller transfer function is

$$C(s) = 10^3 \times \frac{33.96s^4 + 28.7s^3 + 45.67s^2 + 26.23s + 9.125}{s(s^4 + 27s^3 + 362.5s^2 + 2461s + 7921)}.$$


Figure 2.18: Step responses for PID control of the drive system. The full blue line represents the position $x_2$ of mass $m_2$; the red dashed line shows the position $x_1$ of mass $m_1$. The red and blue curves are obtained with a PID controller having error feedback. The green curve shows the position $x_2$ of mass $m_2$ for a controller with setpoint weighting. The design parameters are (a) $\omega_c/\omega_0 = 0.04$, (b) $\omega_c/\omega_0 = 0.06$, (c) $\omega_c/\omega_0 = 0.08$, and (d) $\omega_c/\omega_0 = 0.10$.

It has poles at the origin and at $-6.41 \pm 8.06i$ and $-7.10 \pm 4.94i$, and zeros at $-0.05 \pm i$ and $-0.37 \pm 0.36i$. Notice that the controller zeros cancel the resonant poles of the process. This is a consequence of our requirement that one closed loop pole pair should be equal to the resonant poles. The design choices thus give a notch filter design.

Bode and Nyquist plots of the loop transfer function are shown in Figure 2.19. The properties of the system are further illustrated in Figure 2.20, which gives the gains of the Gang of Four. The key parameters are $M_s = 1.6$ (1.5), $M_t = 1.6$ (1.7), $M_{ps} = 3.5$ (93.2), $M_{cs} = 2023$ (0.24), $\omega_{gc} = 1.95$ (0.123), $\omega_{bw} = 4.43\omega_0$ (0.224) and $k_i = 1.15$ ($4.8\times10^{-4}$); the corresponding values for PID control are given

Figure 2.19: Bode (left) and Nyquist plots (right) of the loop transfer function of the drive system with state feedback.


Figure 2.20: Gain plots of the Gang of Four for a state feedback controller for the drive system. The corresponding curves for a PID controller with design parameter ωc/ω0 = 0.06 are given as dashed lines.

in brackets. The robustness margins are good for both designs. A comparison with PID control shows that there are drastic differences in performance. The bandwidth is increased by a factor of 20, and there are dramatic improvements in load disturbance attenuation: the integral gain is increased by more than three orders of magnitude, and the low-frequency gain of the transfer function PS is reduced correspondingly. The price is that measurement noise is injected into the control signal much more strongly; the largest value of the transfer function CS is about 2000. To use the controller it is thus necessary that the sensor noise is very small.

The time responses of the system are shown in Figure 2.21, where we also show the time responses of a PID controller with design parameter ωc = 0.06ω0. Large overshoots are avoided by using setpoint weighting. The reference signal is only fed to the integral term; notch filtering of the command signal is also used to avoid excitation of the oscillatory mode. The feedforward action of the controller is thus described by

Cff(s) = 9125 (s^2/ω0^2 + 2ζ0 s/ω0 + 1) / (s(s^4 + 27s^3 + 362.5s^2 + 2461s + 7921)).
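The notch in the feedforward path can be seen by evaluating |Cff(iω)| near the resonance. In the sketch below, ω0 = 1 and ζ0 = 0.05 are assumed values chosen to be consistent with the controller zeros at −0.05 ± i quoted earlier; they are not stated in the text.

```python
import numpy as np

# Assumed resonance parameters (consistent with zeros at -0.05 +/- i, but
# illustrative, not taken from the book).
w0, z0 = 1.0, 0.05

def Cff(s):
    """Feedforward transfer function quoted above, evaluated at s."""
    num = 9125.0 * (s**2 / w0**2 + 2.0 * z0 * s / w0 + 1.0)
    den = s * (s**4 + 27*s**3 + 362.5*s**2 + 2461*s + 7921)
    return num / den

w = np.array([0.5, 1.0, 2.0])
gains = np.abs(Cff(1j * w))
print(gains)   # the gain dips sharply at the resonance w = w0
```

The gain at ω = ω0 is more than an order of magnitude below the neighboring values, so the command signal excites the oscillatory mode only weakly.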

A comparison with the PID controller shows that the response time is improved by an order of magnitude; notice the different scales in the diagrams. The control signal required for the state feedback is about 2000 times larger because a much larger force is needed to provide the required acceleration.

The properties of the controllers can be understood intuitively as follows. PID control is based on the assumption that the masses move synchronously and that the system can be viewed as two rigidly connected masses. Large accelerations are required to obtain a fast response, and the masses will then move asynchronously. When the masses deviate too much, the PID controller results in an unstable system, see


Figure 2.21: The left plots show the response to a step command for a controller based on feedback from observed states; the right plots show the same response for a PID controller. The full lines show the response of the second mass, the dotted lines the motion of the first mass. Both controllers use setpoint weighting. The dashed line shows the response of the second mass when setpoint weighting is not used.

Figure ??. The controller based on a state model captures the more complicated behavior of the system and can provide a fast response while stabilizing the system.

2.5 Auto-tuning, Adaptation and Iterative Tuning

• Canners

• Inkjet printer

2.6 References

Although the architecture of a control system is important, it is not extensively discussed in the literature. Fundamental limitations were a key issue in classical control [?]; they were long forgotten but had a revival with Gunter Stein's Bode lecture [?, ?]. Excellent up-to-date sources are [?, ?]. Optimal control is a useful tool for performance assessment and actuator sizing; a good source is [?]. Applications to disk drives are found in [?] and [?], which also discuss implementation of time-optimal controllers. Time-optimal control often leads to control strategies of the bang-bang type, where the control signal rides on the constraints and switches as in the example. Such control strategies may not be suitable for systems with many resonant modes because the switches may excite the oscillatory modes and it may take a long time before the oscillations decay. An interesting alternative strategy has been developed by Grundelius []. He first designs a time-optimal control, which gives the shortest possible time. Then he designs a minimum-energy controller that transfers the system in a slightly longer time, say 10% more. Such a design gives a smooth control signal at the price of a moderate increase of the transition


Figure 2.22: Response of the system to steps in setpoint (left) and force on the second mass (right). The full lines show the position x2 of the second mass and the dashed lines show the position x1 of the first.

time. When using optimal control it is therefore important to carefully consider the choice of criterion for optimization.
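Grundelius' two-step idea can be illustrated on a double integrator (a toy stand-in, not his system): first compute the time-optimal bang-bang move, then make the same rest-to-rest move with the unconstrained minimum-energy control over a longer horizon. For this toy problem, 10% extra time would still leave the minimum-energy ramp above the actuator bound, so 50% extra time is used below purely for illustration.

```python
import numpy as np

# Toy comparison: time-optimal bang-bang vs. smooth minimum-energy control
# for a double integrator x'' = u moving one unit, rest to rest.
u_max = 1.0
T_opt = 2.0 / np.sqrt(u_max)           # minimum time: +u_max then -u_max

def u_bang(t):
    """Time-optimal control: hard switch at mid-course."""
    return u_max if t < T_opt / 2 else -u_max

T = 1.5 * T_opt                        # allow 50% more time for the smooth move

def u_energy(t):
    """Unconstrained minimum-energy control for the same move in time T."""
    return (6.0 / T**2) * (1.0 - 2.0 * t / T)

def simulate(u, T_end, n=10000):
    """Integrate x'' = u(t) from rest at the origin; return final (x, v)."""
    dt = T_end / n
    x = v = 0.0
    for k in range(n):
        a = u(k * dt)
        x += v * dt + 0.5 * a * dt**2  # exact for piecewise-constant a
        v += a * dt
    return x, v

print(simulate(u_bang, T_opt))         # close to (1.0, 0.0) in minimum time
print(simulate(u_energy, T))           # close to (1.0, 0.0) with a smooth ramp
print(abs(u_energy(0.0)))              # peak |u| = 6/T^2, below u_max here
```

The linear ramp has no switches to excite resonant modes, which is exactly the trade-off described above: a smooth control signal for a moderately longer transition.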

When dealing with oscillatory systems it is therefore advantageous to design the systems so that they have high resonance frequencies.

Notch filters require good information about the resonant frequency, otherwise the peak will not be reduced.

There is a limit to what can be achieved by PID control. When controlling a motor drive like the one discussed in Section XXX, a PID controller can do very well if it is designed for a low bandwidth, because the two inertias then move as one object and their dynamic behavior is very similar to that of a system with one inertia. As the bandwidth increases, however, the inertias will move in opposite directions and the simple PID controller is confused. It is clearly much better to have a controller that is aware of the behavior of two inertias coupled by a spring.

The general controller discussed in Section XXX emerged in the dynamic development of control theory in the early 1960s, when the time-domain view of control reappeared. The notion of state was a key idea which naturally led to the idea of state feedback, see Zadeh and Desoer [?]. State feedback requires that all signals are measured. Another major breakthrough was the Kalman filter, which made it possible to obtain approximations of the state by combining measurements with a mathematical model. A comprehensive discussion is given in [?]. Many theories, such as optimal control [?], stochastic control [?], and robust or H∞ control [?], lead to controllers with the structure shown in Figure ?? or Figure ??, although the importance of the feedforward action is rarely emphasized in the literature. Fortunately all theories give controllers of the same structure; there are just different ways to compute the filter gains and observer gains.

