Compendium of digital signal processing

Lapo Boschi ([email protected])

August 31, 2014

This document is essentially a compilation of slightly edited excerpts from Smith (1997). I have selected the topics that I believe to be most relevant to us. The electronic version of the original book is free at http://www.dspguide.com.

1 Linear systems

A signal is a description of how one parameter varies with another parameter. For instance, the acceleration of the Earth’s surface recorded by a seismometer over time. A system is any process that produces an output signal in response to an input signal. This is illustrated by the block diagram in Fig. 1. Continuous systems input and output continuous signals, such as in analog electronics. Discrete systems input and output discrete signals, such as computer programs that manipulate the values stored in arrays.

A system is called linear if it has two mathematical properties: homogeneity and additivity.

As illustrated in Fig. 2, homogeneity means that a change in the input signal’s amplitude results in a corresponding change in the output signal’s amplitude. In mathematical terms, if an input signal of x[n] results in an output signal of y[n], an input of kx[n] results in an output of ky[n], for any input signal and any value of the constant k.

The property of additivity is illustrated in Fig. 3. Consider a system where an input of x1[n] produces an output of y1[n]. Further suppose that a

Figure 1: Terminology for signals and systems. A system is any process that generates an output signal, e.g. y(t) or y[n], in response to an input signal, e.g. x(t) or x[n]. From Smith (1997).

Figure 2: Definition of homogeneity. A system is said to be homogeneous if an amplitude change in the input results in an identical amplitude change in the output. From Smith (1997).

different input, x2[n], produces another output, y2[n]. The system is said to be additive if an input of x1[n] + x2[n] results in an output of y1[n] + y2[n], for all possible input signals.
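Both properties are easy to check numerically. A minimal sketch in Python with NumPy (the three-point moving average is our example system, not one from the text), verifying homogeneity and additivity, and showing that a squaring operation fails homogeneity:

```python
import numpy as np

def moving_average(x):
    """A simple discrete linear system: 3-point moving average."""
    kernel = np.ones(3) / 3.0
    return np.convolve(x, kernel, mode="same")

rng = np.random.default_rng(0)
x1 = rng.standard_normal(64)
x2 = rng.standard_normal(64)
k = -2.5

# Homogeneity: scaling the input scales the output by the same factor.
assert np.allclose(moving_average(k * x1), k * moving_average(x1))

# Additivity: added signals pass through without interacting.
assert np.allclose(moving_average(x1 + x2),
                   moving_average(x1) + moving_average(x2))

# A counterexample: squaring the samples violates homogeneity.
square = lambda x: x**2
assert not np.allclose(square(k * x1), k * square(x1))
```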

Static linearity defines how a linear system reacts when the signals are not changing, i.e., when they are “DC” or static. The static response of a linear system is very simple: the output is the input multiplied by a constant. All linear systems have the property of static linearity.

An important characteristic of linear systems is how they behave with sinusoids, a property we will call sinusoidal fidelity: if the input to a linear system is a sinusoidal wave, the output will also be a sinusoidal wave, at exactly the same frequency as the input. Sinusoids are the only waveform that has this property. For instance, there is no reason to expect that a square wave entering a linear system will produce a square wave at the output. Although a sinusoid on the input guarantees a sinusoid on the output, the two may differ in amplitude and phase.

Linear systems of interest to us usually do not change over time, at least in the time frames considered. In terms of signal processing, they are said to enjoy the property of shift invariance: a shift in the input signal results in nothing more than an identical shift in the output signal.

Figure 3: Definition of additivity. A system is said to be additive if added signals pass through it without interacting. From Smith (1997).

Figure 4: Any signal, such as x[n], can be decomposed into a group of additive components, shown here by the signals x0[n], x1[n] and x2[n]. Passing these components through a linear system produces the signals y0[n], y1[n] and y2[n]. The synthesis (addition) of these output signals forms y[n], the same signal produced when x[n] is passed through the system. From Smith (1997).

2 Superposition

Consider an input signal, called x[n], passing through a linear system, resulting in an output signal, y[n]. As illustrated in Fig. 4, the input signal can be decomposed into a group of simpler signals: x1[n], x2[n], x3[n], etc. We will call these the input signal components. Next, each input signal component is individually passed through the system, resulting in a set of output signal components: y1[n], y2[n], y3[n], etc. These output signal components are then synthesized into the output signal, y[n].

Importantly, the output signal obtained by this method is identical to the one produced by directly passing the input signal through the system. This is a very powerful idea. Instead of trying to understand how complicated signals are changed by a system, all we need to know is how simple signals are modified. This is the basis of nearly all signal processing techniques.
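The claim is easy to test numerically. A minimal Python/NumPy sketch (the impulse response and the split of x[n] into two components are arbitrary choices of ours):

```python
import numpy as np

def system(x):
    # An arbitrary linear, shift-invariant system: convolution with
    # a short impulse response (our choice, for illustration only).
    h = np.array([1.0, -0.5, 0.25])
    return np.convolve(x, h)

rng = np.random.default_rng(1)
x = rng.standard_normal(32)

# Decompose x into two additive components (any split works).
x1, x2 = np.zeros_like(x), np.zeros_like(x)
x1[:16] = x[:16]
x2[16:] = x[16:]

# Passing the components individually and summing the outputs...
y_components = system(x1) + system(x2)
# ...gives the same result as passing x through the system directly.
y_direct = system(x)
assert np.allclose(y_components, y_direct)
```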

3 Impulse decomposition

Impulse decomposition (implicitly applied in the example of Fig. 4) breaks an N-sample signal into N component signals, each containing N samples. Each of the component signals contains one point from the original signal, with the remainder of the values being zero. A single nonzero point in a string of zeros is called an impulse. Impulse decomposition is important because it allows signals to be examined one sample at a time. Similarly,

Figure 5: The delta function δ[n] is a normalized impulse. All of its samples have a value of zero, except for sample number zero, which has a value of one. The impulse response h[n] is the output of the system when the input is a delta function. From Smith (1997).

systems are characterized by how they respond to impulses. By knowing how a system responds to an impulse, the system’s output can be calculated for any given input. This approach is called convolution, and is the topic of the next section.

4 Convolution

Figure 5 defines two important terms used in signal processing. The first is the delta function, symbolized by the Greek letter delta, δ[n]. The delta function is a normalized impulse, that is, sample number zero has a value of one, while all other samples have a value of zero. The second term defined in Fig. 5 is the impulse response. As the name suggests, the impulse response is the signal that exits a system when a delta function is the input. If two systems are different in any way, they will have different impulse responses.

If the input to a linear system is an impulse, such as −3δ[n − 8], what is the system’s output? We can answer this question based on the properties of homogeneity and shift invariance of linear systems: scaling and shifting the input results in an identical scaling and shifting of the output. If δ[n] results in the output h[n] (equal to the impulse response), it follows that −3δ[n − 8] results in −3h[n − 8]. In words, the output is a version of the impulse response that has been shifted and scaled by the same amount as the delta function on the input. If you know a system’s impulse response, you immediately know how it will react to any impulse.

Let us summarize this way of understanding how a system changes an input signal into an output signal. First, the input signal can be decomposed into a set of impulses, each of which can be viewed as a scaled and shifted delta function. Second, the output resulting from each impulse is a scaled and shifted version of the impulse response. Third, the overall output signal can be found by adding these scaled and shifted impulse responses. In other words, if we know a system’s impulse response, then we can calculate what the output will be for any possible input signal. This means we know everything about the system.
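The three steps above can be sketched in a few lines of Python/NumPy (the signal and the impulse response are made-up examples of ours):

```python
import numpy as np

h = np.array([1.0, 0.5, 0.25, 0.125])   # an illustrative impulse response
x = np.array([2.0, -1.0, 0.0, 3.0])     # a short input signal
N, M = len(x), len(h)

# Each sample x[k] is a scaled, shifted delta function; its contribution
# to the output is a copy of h scaled by x[k] and shifted by k samples.
y = np.zeros(N + M - 1)
for k in range(N):
    y[k:k + M] += x[k] * h

# Adding all the scaled, shifted impulse responses reproduces the
# system's output, i.e. the convolution of x with h.
assert np.allclose(y, np.convolve(x, h))
```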

The mathematical operation that describes this procedure is called convolution, typically denoted by ∗: x[n] ∗ h[n] is the convolution between the input signal x[n] and the impulse response h[n]. The above paragraph can be summarized

Figure 6: Example convolution problem. (a) A nine-point input signal, convolved with a four-point impulse response, results in a twelve-point output signal. Each point in the input signal contributes a scaled and shifted impulse response to the output signal. (b) The nine scaled and shifted impulse responses. In these signals, each point that results from a scaled and shifted impulse response is represented by a square marker. The remaining data points, represented by diamonds, are zeros that have been added as place holders. From Smith (1997).

as follows: the response of a linear system to any input signal is given by the convolution between the input signal and the system’s impulse response.

Let us write a mathematical formula for the convolution. We start from the point of view of the input signal. When fed to the system, each impulse x[n = k] (k = 0, 1, …, N − 1) resulting from the signal’s decomposition into N impulses generates a shifted and scaled output x[k]h[i − k] (with i − k = 0, 1, …, M − 1, where M is the number of samples we have for the impulse response). The system’s response y[n] to the entire

signal x[n] is (additivity) the sum of its responses to individual impulses,

y[i] = ∑_{k=0}^{N−1} x[k] h[i−k],    (1)

where i takes values from 0 to N + M − 2 (the impulse at k = N − 1 generates output starting at i = N − 1 and ending M − 1 samples later).

This exercise can be repeated adopting the point of view of the output signal. The output signal y[n] at n = i is the result of contributions from earlier input impulses: an input impulse x[n] at n = i − k contributes the scaled impulse response x[i − k]h[k]. Summing all contributions,

y[i] = ∑_{k=0}^{M−1} h[k] x[i−k],    (2)

equivalent to eq. 6-1 in Smith (1997). Again, i = 0, 1, …, N + M − 2. We infer that convolution is commutative, i.e. x[n] ∗ h[n] = h[n] ∗ x[n].
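Both viewpoints, eqs. (1) and (2), can be implemented directly and checked against each other, and against a library routine, in Python/NumPy (the nine-point and four-point sizes echo Fig. 6; the random signals are our choice):

```python
import numpy as np

def convolve_input_side(x, h):
    """Eq. (1): sum the scaled, shifted impulse responses (input side)."""
    N, M = len(x), len(h)
    y = np.zeros(N + M - 1)
    for i in range(N + M - 1):
        y[i] = sum(x[k] * h[i - k]
                   for k in range(N) if 0 <= i - k < M)
    return y

def convolve_output_side(x, h):
    """Eq. (2): each output sample gathers earlier inputs (output side)."""
    N, M = len(x), len(h)
    y = np.zeros(N + M - 1)
    for i in range(N + M - 1):
        y[i] = sum(h[k] * x[i - k]
                   for k in range(M) if 0 <= i - k < N)
    return y

rng = np.random.default_rng(2)
x = rng.standard_normal(9)   # nine-point input, as in Fig. 6
h = rng.standard_normal(4)   # four-point impulse response

y1 = convolve_input_side(x, h)
y2 = convolve_output_side(x, h)
assert len(y1) == 12                                      # N + M - 1 points
assert np.allclose(y1, y2)                                # the two views agree
assert np.allclose(y1, np.convolve(x, h))                 # and match NumPy
assert np.allclose(np.convolve(x, h), np.convolve(h, x))  # commutativity
```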

Some of you might be familiar with the traditional, continuous-signal-processing formula

y(t) = x(t) ∗ h(t) = ∫_{−∞}^{+∞} h(τ) x(t − τ) dτ,    (3)

that can be obtained from eq. (2), going to the limit of infinitely close samples and defining h(t) and x(t) to be zero at values of t where no samples are available.

5 Correlation and cross-correlation

The concept of correlation can best be presented with an example. Figure 8 shows a radar antenna transmitting a short burst of radio wave energy in a selected direction. If the propagating wave strikes an object, such as the helicopter in this illustration, a fraction of the energy is reflected back toward a radio receiver located near the transmitter. The transmitted pulse is a specific shape that we have selected, such as the triangle shown in this example. The received signal will consist of two parts: (i) a shifted and scaled version of the transmitted pulse, and (ii) random noise, resulting from interfering radio waves, thermal noise in the electronics, etc. Since radio signals travel at a known rate, the speed of light, the shift between the transmitted and received pulse is a direct measure of the distance to the object being detected: it is very useful, then, to identify such a shift. This is the problem: given a signal of some known shape, what is the best way

Figure 7: Examples of signals being processed using convolution. In this example, the input signal is a few cycles of a sine wave plus a slowly rising ramp. These two components are separated by using properly selected impulse responses: compare the low-pass and high-pass filters in (a) and (b), respectively. Other signal processing tasks implemented via very simple impulse responses are illustrated in (c) and (d). As shown in these examples, dramatic changes can be achieved with only a few nonzero points. From Smith (1997).

to determine where (or if) that same shape occurs, shifted and/or scaled, in another signal?

Let us first define correlation as the mathematical operation consisting of multiplying two signals and summing their product over their whole span: in digital signal processing, the correlation of x[n] and y[n] can be written ∑_i x[i] y[i]. The more two signals are “in phase” with one another, the larger their correlation. The correlation between the transmitted (t[n]) and received (x[n]) traces of Fig. 8 is small, even though they do contain the

Figure 8: Like other echo location systems, radar transmits a short pulse of energy that is reflected by objects being examined. This makes the received waveform a shifted version of the transmitted waveform, plus random noise. From Smith (1997).

same signal. To properly compare t[n] and x[n] we must: (i) shift either of them, say x[n], by s samples; (ii) calculate the correlation of x[n + s] and t[n]; (iii) iterate, after changing the shift s, until t[n] and x[n] have been overlapped in all possible ways. This operation is called cross-correlation and is depicted in Fig. 9. Formally,

y[s] = ∑_i x[i+s] t[i],    (4)

where y[n] is the cross-correlation of x[n] and t[n]. In continuous signal processing,

y(s) = ∫_{−∞}^{+∞} x(s + τ) t(τ) dτ.    (5)

Notice that cross-correlation is a function of the shift s between the two signals. In the example of Fig. 8, τ is “absolute” time and s is a time shift. The value of s corresponding to the maximum correlation between x[n + s] and t[n] might be associated with the distance between transmitter/receiver (radar) and object (helicopter).
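A minimal Python/NumPy sketch of the radar idea (the pulse shape, delay, and noise level are illustrative choices of ours): cross-correlate the received trace with the transmitted pulse and pick the shift of maximum correlation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Transmitted pulse: a short triangle (shape chosen for illustration).
t = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])

# Received trace: the pulse shifted by a delay (known here, so we can
# check the answer), buried in random noise.
true_shift = 40
x = 0.1 * rng.standard_normal(128)
x[true_shift:true_shift + len(t)] += t

# Cross-correlation: slide t along x and sum the products at every shift s.
shifts = np.arange(len(x) - len(t) + 1)
cc = np.array([np.sum(x[s:s + len(t)] * t) for s in shifts])

# The shift of maximum correlation estimates the delay.
estimated_shift = shifts[np.argmax(cc)]
assert abs(estimated_shift - true_shift) <= 1
```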

The similarity between cross-correlation and convolution is evident. However, unlike convolution, cross-correlation is not commutative, and in general

∫_{−∞}^{+∞} x(s + τ) t(τ) dτ ≠ ∫_{−∞}^{+∞} t(s + τ) x(τ) dτ.    (6)

The operation of cross-correlating a signal with itself is called autocorrelation. Clearly, autocorrelation is always maximum if the shift is 0. However,

Figure 9: This flowchart shows how the cross-correlation of two signals is calculated. In this example, y[n] is the cross-correlation of x[n] and t[n]. The dashed box is moved left or right so that its output points at the sample being calculated in y[n]. The indicated samples from x[n] are multiplied by the corresponding samples in t[n], and the products added. In this illustration, the only samples calculated in y[n] are where t[n] is fully immersed in x[n]. From Smith (1997).

other relevant maxima can be found: think, e.g., of a seismic survey, with various crustal discontinuities reflecting seismic waves multiple times.
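Both facts — the maximum at zero shift, and secondary maxima produced by delayed copies of the signal (echoes) — can be checked with a short Python/NumPy sketch (the signal, echo delay, and echo amplitude are our choices):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(256)

# Autocorrelation: cross-correlate the signal with itself at every lag.
ac = np.correlate(x, x, mode="full")          # lags -(N-1) ... N-1
lags = np.arange(-len(x) + 1, len(x))
assert lags[np.argmax(ac)] == 0               # maximum always at zero shift

# A delayed echo (as in the seismic-survey example) produces a
# secondary maximum at the echo delay.
echo_delay = 30
y = x.copy()
y[echo_delay:] += 0.5 * x[:-echo_delay]
ac2 = np.correlate(y, y, mode="full")
positive_lags = ac2[len(y):]                  # lags 1 ... N-1
assert abs(np.argmax(positive_lags) + 1 - echo_delay) <= 1
```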

6 The Fourier transform

Joseph Fourier (1768-1830) proposed in 1807 that any continuous periodic signal could be represented as the sum of properly chosen sinusoidal waves. He was only partially right, as signals with discontinuous slopes can be represented only approximately by sinusoids. However, Fourier decomposition is extremely useful in practice.

Why are sinusoids used instead of, for instance, square or triangular waves? The component sine and cosine waves are simpler than the original signal because they have a property that the original signal does not have: sinusoidal fidelity. As discussed in sec. 1, a sinusoidal input to a system is guaranteed to produce a sinusoidal output. Only the amplitude and phase of the signal can change; the frequency and wave shape must remain the same. Sinusoids are the only waveform that has this useful property. While square

Figure 10: In the time domain, x[n] consists of N points running from 0 to N − 1. In the frequency domain, the DFT produces two signals: the cosine-wave amplitudes c[n] and the sine-wave amplitudes s[n]. Each of these frequency domain signals is N/2 + 1 points long, running from 0 to N/2. The forward DFT transforms from the time domain to the frequency domain, while the inverse DFT transforms from the frequency domain to the time domain. From Smith (1997).

and triangular decompositions are possible, there is no general reason for them to be useful [1].

As shown in Fig. 10, in digital signal processing a discrete Fourier transform (DFT) changes an N-point input signal into two (N/2 + 1)-point output signals [2], one containing the amplitudes of the component sine waves, the other the cosine-wave amplitudes. The input signal is said to be in the time domain [3]. The term frequency domain is used to describe the amplitudes of the sine and cosine waves. The frequency domain contains exactly the same information as the time domain, just in a different form. If you know one domain, you can calculate the other. Given the time-domain signal, the process of calculating the frequency domain is called decomposition, analysis, the forward Fourier transform, or simply, the Fourier transform. If you know the frequency domain, calculation of the time domain is called synthesis, or the inverse Fourier transform.

Mathematically, the discrete Fourier transform is described by

x[i] = ∑_{k=0}^{N/2} c[k] cos(2πki/N) + ∑_{k=0}^{N/2} s[k] sin(2πki/N),    (7)

[1] Think, for example, of a musical piece played through different speaker systems, and in different environments. Wave propagation is a linear system: no matter how crowded the room or how complicated its shape, no matter whether you use large speakers or headphones, you will always hear the same notes, i.e. the same combinations of sinusoidal waves.

[2] Here and in the following we assume that N is a power of two; this choice is usually made because the most efficient algorithm for calculating the Fourier transform numerically, the fast Fourier transform or FFT, operates with N that is a power of two.

[3] This expression is often used even when the independent variable of the function being Fourier-transformed has nothing to do with time.

where x[n] is the time-domain signal to be Fourier-transformed, and c[n], s[n] are the amplitudes of the component cosine and sine waves, respectively, forming the frequency domain, i.e. the Fourier transform.

I consider it intuitive (and thus leave out a rigorous proof) that the DFT of x[n] should contain exactly the same amount of information contained in the time-domain signal x[n] itself [4]. You might notice that this is apparently not the case, since an N-sample time-domain signal results in two (N/2 + 1)-sample signals in the frequency domain, hence N + 2 values total. This is explained by the fact that, for k = 0, sin(2πki/N) = sin(0) = 0 regardless of the value of i, so that the s[0] coefficient is irrelevant. Likewise, for k = N/2, sin(2πki/N) = sin(πi) = 0 for all i, and the s[N/2] coefficient is also irrelevant. Of the N + 2 coefficients forming the frequency domain, two carry no information, so that there are only N relevant values in the frequency domain, just as in the time domain.

Another peculiar frequency-domain coefficient is c[k = 0], associated with the “constant sinusoid” cos(2πki/N) = cos(0) = 1 when k = 0. All other sinusoids have zero average with respect to i, independently of k, so that c[0] coincides with the average value of x[n] over time. Borrowing from electronics jargon, c[0] is sometimes dubbed the DC offset of the signal.

In principle, a Fourier transform can be calculated exploiting the orthogonality of cosine and sine: if one multiplies cos(2πki/N) by cos(2πji/N) and sums over i = 0, 1, …, N − 1 (or, in other words, if one calculates the correlation of cos(2πki/N) and cos(2πji/N)), the result is 0 unless k = j; the correlation of a cosine with a sine is always 0. Multiply, then, both sides of eq. (7) by cos(2πji/N) and sum over i. We are left with

∑_{i=0}^{N−1} x[i] cos(2πji/N) = ∑_{k=0}^{N/2} c[k] (N/2) δ_{kj} = (N/2) c[j],    (8)

valid for 0 < j < N/2 (at j = 0 and j = N/2 the factor N/2 is replaced by N).

Through eq. (8) one can compute c[j] numerically for all values of j via correlation.
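A sketch of this correlation-based analysis in Python/NumPy, together with the synthesis of eq. (7). Note that we include the normalization factors (2/N, and 1/N at k = 0 and k = N/2) that the orthogonality relations require for the round trip to close; Smith (1997) uses the same scaling:

```python
import numpy as np

def dft_by_correlation(x):
    """Analysis: correlate x with each basis sinusoid, as in eq. (8).
    Scaling: 2/N for 0 < k < N/2, and 1/N at k = 0 and k = N/2."""
    N = len(x)
    i = np.arange(N)
    c = np.zeros(N // 2 + 1)
    s = np.zeros(N // 2 + 1)
    for k in range(N // 2 + 1):
        scale = 1.0 / N if k in (0, N // 2) else 2.0 / N
        c[k] = scale * np.sum(x * np.cos(2 * np.pi * k * i / N))
        s[k] = scale * np.sum(x * np.sin(2 * np.pi * k * i / N))
    return c, s

def idft(c, s):
    """Synthesis via eq. (7): sum of the weighted cosine and sine waves."""
    N = 2 * (len(c) - 1)
    i = np.arange(N)
    x = np.zeros(N)
    for k in range(N // 2 + 1):
        x += c[k] * np.cos(2 * np.pi * k * i / N)
        x += s[k] * np.sin(2 * np.pi * k * i / N)
    return x

rng = np.random.default_rng(5)
x = rng.standard_normal(64)          # N a power of two, per footnote [2]
c, s = dft_by_correlation(x)

assert np.isclose(c[0], x.mean())                    # c[0] is the DC offset
assert np.isclose(s[0], 0) and np.isclose(s[-1], 0)  # irrelevant coefficients
assert np.allclose(idft(c, s), x)                    # round trip recovers x
```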

This approach, that you might have encountered in textbooks and that is useful when analytical Fourier transforms are to be determined, is however not at all efficient from the computational (numerical) standpoint. Other ways to calculate a Fourier transform exist, such as solving simultaneous linear equations (e.g., Smith, 1997); but by far the most efficient algorithm is the fast Fourier transform or FFT introduced by Cooley and Tukey (1965). We do not have time to discuss how the FFT works, but be aware that, as of today, you must use an FFT subroutine if you are to compute Fourier transforms numerically in a robust and efficient way.

[4] Think of the problem of determining c[n] and s[n] as a system of simultaneous linear equations, and think of what requirements linear systems need to meet so that they have a unique solution.
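For reference, the output of a library FFT routine relates directly to the correlation sums of eq. (8). A sketch using NumPy (our choice of library), whose rfft follows the convention X[k] = ∑_i x[i] exp(−2πjki/N): the real part is the cosine correlation, and minus the imaginary part the sine correlation; the c[n] and s[n] of the text then follow by applying the normalization factors discussed above.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 64
x = rng.standard_normal(N)
i = np.arange(N)

# rfft returns the N/2 + 1 complex values X[k] = sum_i x[i]*exp(-2j*pi*k*i/N).
X = np.fft.rfft(x)

# Real part = correlation with cos; minus imaginary part = correlation with sin.
for k in (0, 1, 5, N // 2):
    assert np.isclose(X[k].real, np.sum(x * np.cos(2 * np.pi * k * i / N)))
    assert np.isclose(-X[k].imag, np.sum(x * np.sin(2 * np.pi * k * i / N)))
```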

7 Phase and amplitude: the polar notation in thefrequency domain

According to the convention used so far (the so-called rectangular notation), the frequency domain consists of a set of coefficients associated with cosine and sine waves. Alternatively, the frequency domain can be expressed in the so-called polar form: the coefficients c[n] and s[n] are replaced with two other arrays, called the magnitude M[n] and phase θ[n] of the signal.

The magnitude and phase are a pair-for-pair replacement for the cosine- and sine-wave coefficients. For example, M[0] and θ[0] are calculated using only c[0] and s[0]. The transformation between the two notations is based on the trigonometric identity

a cos(φ) + b sin(φ) = √(a² + b²) cos[φ − arctan(b/a)],    (9)

valid for arbitrary values of a, b, φ. We use eq. (9) to rewrite (7) as follows:

x[i] = ∑_{k=0}^{N/2} c[k] cos(2πki/N) + ∑_{k=0}^{N/2} s[k] sin(2πki/N)
     = ∑_{k=0}^{N/2} [c[k] cos(2πki/N) + s[k] sin(2πki/N)]
     = ∑_{k=0}^{N/2} √(c²[k] + s²[k]) cos[2πki/N − arctan(s[k]/c[k])].    (10)
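The rectangular-to-polar conversion and the resynthesis of eq. (10) can be sketched in Python/NumPy (the example coefficients are our choice). We define θ[k] = −arctan(s[k]/c[k]), so that the synthesis reads M[k] cos(2πki/N + θ[k]); np.arctan2 is used instead of arctan so that c[k] = 0 and the quadrant are handled correctly.

```python
import numpy as np

N = 64
i = np.arange(N)

# Rectangular coefficients of an example signal (our choice): two
# frequency components, one with both cosine and sine parts.
c = np.zeros(N // 2 + 1)
s = np.zeros(N // 2 + 1)
c[3], s[3], c[10] = 1.0, -0.5, 2.0
x = sum(c[k] * np.cos(2 * np.pi * k * i / N) +
        s[k] * np.sin(2 * np.pi * k * i / N) for k in range(N // 2 + 1))

# Polar form: magnitude and phase of each component cosine wave.
M = np.sqrt(c**2 + s**2)
theta = np.arctan2(-s, c)   # so that M cos(2*pi*k*i/N + theta) = c cos + s sin

# Resynthesize from magnitude and phase; the result matches x exactly.
x_polar = sum(M[k] * np.cos(2 * np.pi * k * i / N + theta[k])
              for k in range(N // 2 + 1))
assert np.allclose(x, x_polar)
```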

Those who already have some background in seismology (or wave physics in general), but do not yet have much experience with Fourier analysis, might find it useful to compare eq. (10) with the generic expression for a monochromatic plane wave traveling in the direction x,

u(x, ω, t) = M(ω) cos(ωt − γx),    (11)

where γx is what a seismologist would immediately call “phase”. If one discretizes time t and frequency ω, using the indexes i and k to keep track of samples in the time and frequency domains, respectively, (11) becomes

u(x, k, i) = M(k) cos(Cki − γx),    (12)

where the constant C accounts for the linear relationship between continuous time and frequency and the indexes of the corresponding samples. (By comparison with eq. (10), C = 2π/N.) M(k) and γx can then be thought of as the frequency domain, in polar notation, of a packet of monochromatic waves each described by (11).

Figure 11: Information contained in the phase. Panel a shows a pulse-like waveform. The signal in b is created by taking the DFT of a, replacing the phase with random numbers, and taking the inverse DFT. The signal in c is found by taking the DFT of a, replacing the magnitude with random numbers, and taking the inverse DFT. The location of the edges is retained in c, but not in b. This shows that the phase contains information on the “location” of events in the time domain signal. From Smith (1997).

You might also see from eq. (12) that a shift of s samples in the time domain, i.e. replacing i with i + s, leaves the magnitude unchanged, but adds a linear term to the phase, Csk (or 2πsk/N).
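This shift property is easy to verify with an FFT (a NumPy sketch, our choice of routine; np.roll makes the shift circular, so the DFT relation is exact, and sign conventions depend on the FFT convention used):

```python
import numpy as np

rng = np.random.default_rng(8)
N = 64
x = rng.standard_normal(N)
shift = 5

# Circular shift: x_shifted[i] = x[(i + shift) mod N].
x_shifted = np.roll(x, -shift)

X = np.fft.rfft(x)
Xs = np.fft.rfft(x_shifted)
k = np.arange(N // 2 + 1)

# The magnitude is unchanged by the shift...
assert np.allclose(np.abs(Xs), np.abs(X))
# ...while the spectrum picks up a phase term linear in k.
assert np.allclose(Xs, X * np.exp(2j * np.pi * shift * k / N))
```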

Fig. 11 is an interesting demonstration of what information is contained in the phase, and what information is contained in the magnitude. The waveform in Fig. 11a has two very distinct features: a rising edge at sample number 55, and a falling edge at sample number 110. Let us take the DFT of the signal in Fig. 11a, and convert the frequency spectrum into polar notation. To find the signal in Fig. 11b, we first replace the phase with random numbers between 0 and 2π, and inverse-DFT the resulting frequency domain. In other words, Fig. 11b is based only on the information contained in the magnitude. In a similar manner, Fig. 11c is found by replacing the magnitude with small random numbers before using the inverse DFT. This makes the reconstruction in Fig. 11c based solely on the information contained in the phase.
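The experiment of Fig. 11 can be reproduced in a few lines of Python/NumPy (the pulse edges follow the figure; everything else is our choice):

```python
import numpy as np

rng = np.random.default_rng(9)

# A pulse-like waveform with sharp edges, as in Fig. 11a:
# a rising edge at sample 55 and a falling edge at sample 110.
N = 256
x = np.zeros(N)
x[55:110] = 1.0

X = np.fft.rfft(x)
M, theta = np.abs(X), np.angle(X)

# Fig. 11b: keep the magnitude, randomize the phase. (The k = 0 and
# k = N/2 bins must stay real for the inverse real FFT to be exact.)
phi = rng.uniform(0.0, 2.0 * np.pi, len(X))
phi[0] = phi[-1] = 0.0
b = np.fft.irfft(M * np.exp(1j * phi), n=N)

# Fig. 11c: keep the phase, replace the magnitude with small random numbers.
c_sig = np.fft.irfft(rng.uniform(0.0, 0.1, len(X)) * np.exp(1j * theta), n=N)

assert b.shape == x.shape and c_sig.shape == x.shape
assert np.allclose(np.abs(np.fft.rfft(b)), M)   # magnitude of b preserved
assert not np.allclose(b, x)                    # but the sharp edges are gone
```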

8 Rectangular vs. polar notation

Rectangular and polar notation allow you to think of the Fourier transform in two different ways. With rectangular notation, the DFT decomposes an N-point signal into N/2 + 1 cosine waves and N/2 + 1 sine waves, each with a specified amplitude. In polar notation, the DFT decomposes an N-point signal into N/2 + 1 cosine waves, each with a specified amplitude (called the magnitude) and phase shift. Why does polar notation use cosine waves instead of sine waves? As we have seen in sec. 6, sine waves cannot represent the DC component of a signal, since a sine wave of zero frequency is composed of all zeros (and, as said above, its coefficient is irrelevant).

When should you use rectangular notation and when should you use polar? Rectangular notation is usually the best choice for calculations, such as in equations and computer programs. In comparison, graphs are almost always in polar form. As shown by the previous example, it is nearly impossible for humans to understand the characteristics of a frequency domain signal by looking at the real and imaginary parts. In a typical program, the frequency domain signals are kept in rectangular notation until an observer needs to look at them, at which time a rectangular-to-polar conversion is done.

Why is it easier to understand the frequency domain in polar notation? This question goes to the heart of why decomposing a signal into sinusoids is useful. Recall the property of sinusoidal fidelity from sec. 1: if a sinusoid enters a linear system, the output will also be a sinusoid, and at exactly the same frequency as the input. Only the amplitude and phase can change. Polar notation directly represents signals in terms of the amplitude and phase of the component cosine waves. In turn, systems can be represented by how they modify the amplitude and phase of each of these cosine waves.

Now consider what happens if rectangular notation is used in this scenario. A mixture of cosine and sine waves enters the linear system, resulting in a mixture of cosine and sine waves leaving the system. The problem is, a cosine wave on the input may result in both cosine and sine waves on the output. Likewise, a sine wave on the input can result in both cosine and sine waves on the output. While these cross-terms can be straightened out, the overall method does not match why we wanted to use sinusoids in the first place.

References

Cooley, J. W., Tukey, J. W., 1965. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation 19, 297–301.

Smith, S. W., 1997. The Scientist and Engineer’s Guide to Digital Signal Processing. California Technical Publishing, San Diego.
