CS 754 Ajit Rajwade

Post on 29-Dec-2021


Introduction and Motivation

CS Theory: Key Theorems

Practical Compressive Sensing Systems

Reconstruction Algorithms for Compressive Sensing Systems

Proof of Some of the Key Results

Before applying signal/image compression algorithms like MPEG, JPEG, JPEG-2000 etc., the measuring devices typically measure large amounts of data.

The data is then converted to a transform domain where a majority of the transform coefficients turn out to have near-zero magnitude and can be discarded with small reconstruction error but great savings in storage.

Example: a digital camera has a detector array where several thousand pixel values are first stored. This 2D array is fed to the JPEG algorithm which computes DCT coefficients of each block in the image, discarding smaller-valued DCT coefficients.

All discrete signals can be represented perfectly as linear combinations of complex sinusoidal functions of different frequencies.

This representation is called the Discrete Fourier Transform, or DFT (see next slide).

What purpose does it serve? It enables determining the frequency content of the signal (e.g. which frequencies were present in an audio signal, and at what strength?).

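$$F(u) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} f(n)\, e^{-i 2\pi u n / N}$$

$$f(n) = \frac{1}{\sqrt{N}} \sum_{u=0}^{N-1} F(u)\, e^{i 2\pi u n / N}$$

Here F(u) is the u-th Fourier coefficient, f(n) is the value of the signal f at location n (f is a vector of size N), and the complex exponentials $e^{i 2\pi u n / N}$ are the (complex) sinusoidal bases.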

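In matrix form, f = HF, where F is the vector of N Fourier coefficients and H is an N x N orthonormal matrix (the Fourier basis matrix):

$$\begin{pmatrix} f(0) \\ f(1) \\ \vdots \\ f(N-1) \end{pmatrix} = \frac{1}{\sqrt{N}} \begin{pmatrix} e^{i2\pi(0)(0)/N} & e^{i2\pi(1)(0)/N} & \cdots & e^{i2\pi(N-1)(0)/N} \\ e^{i2\pi(0)(1)/N} & e^{i2\pi(1)(1)/N} & \cdots & e^{i2\pi(N-1)(1)/N} \\ \vdots & \vdots & \ddots & \vdots \\ e^{i2\pi(0)(N-1)/N} & e^{i2\pi(1)(N-1)/N} & \cdots & e^{i2\pi(N-1)(N-1)/N} \end{pmatrix} \begin{pmatrix} F(0) \\ F(1) \\ \vdots \\ F(N-1) \end{pmatrix}$$

n varies across rows (constant over all entries in a given row). u varies across columns (constant across all entries in a given column).

$$f = HF, \qquad F = H^{*T} f, \qquad H^{*T} H = H H^{*T} = I$$

Here $H^{*T}$ is the conjugate transpose of H. Since H is complex-valued, orthonormality holds with the conjugate transpose, not with the plain transpose.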
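As a quick numerical check of the matrix form of the DFT, here is a minimal NumPy sketch. It assumes the orthonormal scaling (a factor of 1/sqrt(N) on both the forward and inverse transforms), with n indexing rows and u indexing columns of H. The helper name `dft_basis_matrix` and the random test signal are illustrative choices, not part of the slides.

```python
import numpy as np

def dft_basis_matrix(N):
    """Return the N x N matrix H with H[n, u] = exp(i*2*pi*u*n/N) / sqrt(N)."""
    n = np.arange(N).reshape(-1, 1)   # n varies across rows
    u = np.arange(N).reshape(1, -1)   # u varies across columns
    return np.exp(2j * np.pi * u * n / N) / np.sqrt(N)

N = 8
H = dft_basis_matrix(N)
f = np.random.default_rng(0).standard_normal(N)  # arbitrary test signal

F = H.conj().T @ f     # analysis: Fourier coefficients via the conjugate transpose
f_back = H @ F         # synthesis: f = H F

orthonormal = np.allclose(H.conj().T @ H, np.eye(N))
matches_fft = np.allclose(F, np.fft.fft(f, norm="ortho"))
roundtrip_ok = np.allclose(f_back, f)
print(orthonormal, matches_fft, roundtrip_ok)
```

The comparison against `np.fft.fft(..., norm="ortho")` is there only to confirm that the explicit matrix and the library's orthonormal DFT agree; in practice one would use the FFT rather than forming H.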

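$$F(u,v) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-i 2\pi (ux/M + vy/N)}$$

$$f(x,y) = \frac{1}{\sqrt{MN}} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v)\, e^{i 2\pi (ux/M + vy/N)}$$

In matrix form, f = HF, where F and f are vectors of length MN (size MN x 1) and H is a matrix of size MN x MN.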

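$$F(u) = \sum_{n=0}^{N-1} f(n)\, a_N^{un}, \qquad f(n) = \sum_{u=0}^{N-1} F(u)\, \tilde{a}_N^{un}$$

DCT:

$$a_N^{un} = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}\, \cos\dfrac{\pi (2n+1) u}{2N}, & u = 1, \ldots, N-1 \end{cases} \qquad \tilde{a}_N^{un} = a_N^{un}$$

DFT:

$$a_N^{un} = \frac{1}{\sqrt{N}}\, e^{-i 2\pi u n / N}, \qquad \tilde{a}_N^{un} = \left(a_N^{un}\right)^{*} \ \text{(complex conjugate)}$$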

Expresses a signal as a linear combination of cosine bases (as opposed to the complex exponentials as in the Fourier transform). The coefficients of this linear combination are called DCT coefficients.

It is real-valued, unlike the Fourier transform!

It has better compaction properties for signals and images than the DFT.
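To make the compaction claim concrete, the sketch below keeps only the k largest-magnitude coefficients of a toy signal under the DCT and under the DFT, reconstructs, and compares the relative errors. The signal (a linear ramp), the sizes N = 64 and k = 8, and the helper names `dct_matrix` and `keep_largest` are assumptions made for illustration; the DCT is assumed to use the orthonormal scaling (sqrt(1/N) for u = 0, sqrt(2/N) otherwise).

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT matrix A with A[u, n] = alpha(u) * cos(pi*(2n+1)*u / (2N))."""
    u = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    A = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * u / (2 * N))
    A[0, :] = np.sqrt(1.0 / N)        # the u = 0 row uses the smaller scale factor
    return A

def keep_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients."""
    out = np.zeros_like(coeffs)
    idx = np.argsort(np.abs(coeffs))[-k:]
    out[idx] = coeffs[idx]
    return out

N, k = 64, 8
f = np.linspace(0.0, 1.0, N)          # toy smooth signal: a linear ramp

A = dct_matrix(N)
f_dct = A.T @ keep_largest(A @ f, k)  # inverse DCT is the transpose

F = np.fft.fft(f, norm="ortho")
f_dft = np.real(np.fft.ifft(keep_largest(F, k), norm="ortho"))

err_dct = np.linalg.norm(f - f_dct) / np.linalg.norm(f)
err_dft = np.linalg.norm(f - f_dft) / np.linalg.norm(f)
print(f"relative error keeping {k} of {N} coefficients: DCT {err_dct:.4f}, DFT {err_dft:.4f}")
```

The ramp is a deliberately favourable case: the DFT implicitly treats the signal as periodic, so the jump between the last and first samples spreads energy over many frequencies, while the DCT avoids that jump.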

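In matrix form, F = Af:

$$\begin{pmatrix} F(0) \\ \vdots \\ F(N-1) \end{pmatrix} = \begin{pmatrix} a_N^{0,0} & \cdots & a_N^{0,N-1} \\ \vdots & \ddots & \vdots \\ a_N^{N-1,0} & \cdots & a_N^{N-1,N-1} \end{pmatrix} \begin{pmatrix} f(0) \\ \vdots \\ f(N-1) \end{pmatrix}$$

$$A \in \mathbb{R}^{N \times N}, \qquad A A^T = I$$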

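Conversely, f = ÃF, where Ã is the basis matrix (n indexes its rows, u indexes its columns):

$$\begin{pmatrix} f(0) \\ \vdots \\ f(N-1) \end{pmatrix} = \begin{pmatrix} \tilde{a}_N^{0,0} & \cdots & \tilde{a}_N^{N-1,0} \\ \vdots & \ddots & \vdots \\ \tilde{a}_N^{0,N-1} & \cdots & \tilde{a}_N^{N-1,N-1} \end{pmatrix} \begin{pmatrix} F(0) \\ \vdots \\ F(N-1) \end{pmatrix}$$

$$\tilde{A} \in \mathbb{R}^{N \times N} \ \text{(DCT)}, \qquad \tilde{A} \tilde{A}^T = I$$

DCT: $\tilde{A} = A^T$

DFT: $\tilde{A} = A^{*T}$ (conjugate transpose)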
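The relations between the DCT matrix and its basis matrix can be checked numerically. The following is a minimal sketch, assuming the orthonormal DCT scaling (sqrt(1/N) for u = 0 and sqrt(2/N) otherwise) with u indexing rows and n indexing columns of A; the helper name `dct_matrix` and the test signal are illustrative.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT matrix A with A[u, n] = alpha(u) * cos(pi*(2n+1)*u / (2N))."""
    u = np.arange(N).reshape(-1, 1)   # u varies across rows
    n = np.arange(N).reshape(1, -1)   # n varies across columns
    A = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * u / (2 * N))
    A[0, :] = np.sqrt(1.0 / N)
    return A

N = 8
A = dct_matrix(N)
f = np.arange(N, dtype=float) ** 2    # arbitrary real test signal

F = A @ f              # DCT coefficients, F = A f
f_back = A.T @ F       # the basis matrix is the plain transpose of A

is_orthonormal = np.allclose(A @ A.T, np.eye(N))
is_symmetric = np.allclose(A, A.T)
roundtrip_ok = np.allclose(f_back, f)
print(is_orthonormal, is_symmetric, roundtrip_ok)
```

Because A is real, no complex conjugation is needed to invert it, in contrast with the DFT matrix.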

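$$F(u,v) = \sum_{n=0}^{N-1} \sum_{m=0}^{M-1} f(n,m)\, a_{NM}^{unvm}, \qquad f(n,m) = \sum_{u=0}^{N-1} \sum_{v=0}^{M-1} F(u,v)\, \tilde{a}_{NM}^{unvm}$$

DCT:

$$a_{NM}^{unvm} = \alpha(u)\, \alpha(v)\, \cos\frac{\pi (2n+1) u}{2N}\, \cos\frac{\pi (2m+1) v}{2M}, \qquad u = 0, \ldots, N-1, \quad v = 0, \ldots, M-1$$

$$\alpha(u) = \sqrt{1/N} \ \text{if } u = 0, \ \text{else } \alpha(u) = \sqrt{2/N}$$

$$\alpha(v) = \sqrt{1/M} \ \text{if } v = 0, \ \text{else } \alpha(v) = \sqrt{2/M}$$

$$\tilde{a}_{NM}^{unvm} = a_{NM}^{unvm}$$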

The DCT matrix in this case has size MN x MN, and it is the Kronecker product of two DCT matrices, one of size M x M and the other of size N x N. The DCT matrix for the 2D case is also orthonormal. It is NOT symmetric, and it is NOT the real part of the 2D DFT.
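The Kronecker-product structure of the 2D DCT matrix can be verified on a small example. The sketch below assumes the image is stored as an N x M array and flattened row by row (NumPy's default order); under that convention the MN x MN matrix is `np.kron(A_N, A_M)`. The helper name `dct_matrix` and the sizes are illustrative assumptions.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal 1D DCT matrix with rows indexed by frequency u."""
    u = np.arange(N).reshape(-1, 1)
    n = np.arange(N).reshape(1, -1)
    A = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * u / (2 * N))
    A[0, :] = np.sqrt(1.0 / N)
    return A

N, M = 4, 6
rng = np.random.default_rng(0)
f = rng.standard_normal((N, M))        # toy "image" with N rows and M columns

A_N, A_M = dct_matrix(N), dct_matrix(M)

# Separable form: 1D DCT along one axis, then along the other.
F_sep = A_N @ f @ A_M.T

# Vectorized form: one (NM x NM) matrix acting on the row-major flattened image.
A_2d = np.kron(A_N, A_M)
F_kron = (A_2d @ f.reshape(-1)).reshape(N, M)

same_result = np.allclose(F_sep, F_kron)
orthonormal_2d = np.allclose(A_2d @ A_2d.T, np.eye(N * M))
symmetric_2d = np.allclose(A_2d, A_2d.T)
print(same_result, orthonormal_2d, symmetric_2d)
```

The separable form costs far fewer operations than the explicit MN x MN product, which is why the Kronecker matrix is mostly a conceptual device.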

The DCT transforms an 8×8 block of input values to a linear combination of these 64 patterns. The patterns are referred to as the two-dimensional DCT basis functions, and the output values are referred to as transform coefficients.

Each image here is obtained from the 8 x 8 outer product of a pair of DCT basis vectors. Each image is stretched between 0 and 255 – on a common scale.

http://en.wikipedia.org/wiki/JPEG


[Figure: two panels, titled "Real part of DFT" and "DCT".]

Conventional sensing is a “measure and compress+throw” paradigm, which is wasteful!

Especially for time-consuming acquisitions like MRI, CT, etc.

Compressive sensing is a new technology where the data are acquired/measured in a compressed format!

These compressed measurements are then fed to some optimization algorithm (also called inversion algorithm) to produce the complete signal.

This part is implemented in software.

Under suitable conditions that the original signal and the measurement system must fulfill, the reconstruction is guaranteed to have very little or even zero error!

It has the potential to dramatically improve acquisition speed for MRI, CT, hyper-spectral data and other modalities.

Potential to dramatically improve video-camera frame rates without sacrificing spatial resolution.

A band-limited signal with maximum frequency B can be accurately reconstructed from its uniformly spaced digital samples if the rate of sampling exceeds 2B (called Nyquist rate).

Independently discovered by Shannon, Whittaker, Kotelnikov and Nyquist.

[Figure: an actual analog signal, the digital signal sampled from it (the samples), and the reconstructed signal. The reconstruction is accurate if Nyquist's condition is satisfied.]

A truly band-limited signal is an ideal concept and would take infinite time to transmit or infinite memory to store, because a band-limited signal can never be time-limited (or space-limited).

But many naturally occurring signals when measured by a sensing device are approximately band-limited (example: camera blurring function reduces high frequencies).

Sampling an analog signal with maximum frequency B at a rate less than or equal to 2B causes an artifact called aliasing.

[Figure: an original signal sampled at three rates. There is aliasing when the sampling rate is less than 2B and when it is equal to 2B; there is no aliasing when the sampling rate is more than 2B.]
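Aliasing can be seen directly in sample values. In this minimal sketch, a 7 Hz cosine sampled at 10 Hz (below its Nyquist rate of 14 Hz) produces exactly the same samples as a 3 Hz cosine, while at 20 Hz the two are distinguishable. The specific frequencies and sampling rates are illustrative choices.

```python
import numpy as np

B = 7.0          # frequency of the signal of interest (Hz)
alias = 3.0      # lower frequency it is confused with when undersampled (Hz)
n = np.arange(20)

# Undersampling: fs = 10 Hz < 2B = 14 Hz.
t_slow = n / 10.0
samples_high_slow = np.cos(2 * np.pi * B * t_slow)
samples_low_slow = np.cos(2 * np.pi * alias * t_slow)
aliased = np.allclose(samples_high_slow, samples_low_slow)

# Adequate sampling: fs = 20 Hz > 2B = 14 Hz.
t_fast = n / 20.0
samples_high_fast = np.cos(2 * np.pi * B * t_fast)
samples_low_fast = np.cos(2 * np.pi * alias * t_fast)
distinguishable = not np.allclose(samples_high_fast, samples_low_fast)

print(aliased, distinguishable)
```

At 10 Hz the identity cos(2*pi*7*n/10) = cos(2*pi*n - 2*pi*3*n/10) = cos(2*pi*3*n/10) holds for every integer n, so no reconstruction method can tell the two signals apart from those samples.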

The optimal reconstruction of a band-limited signal from its digital samples uses the sinc interpolant. It yields a time-domain (or space-domain) signal of infinite extent, with values that are very small outside a certain window.

The associated formula is called the Whittaker-Shannon interpolation formula.

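$$f(t) = \sum_{n=-\infty}^{\infty} f(n)\, \frac{\sin\left(\pi (t - nT)/T\right)}{\pi (t - nT)/T} = \sum_{n=-\infty}^{\infty} f(n)\, \mathrm{sinc}\left(\frac{t - nT}{T}\right)$$

Here f(n) is the magnitude of the n-th sample, T is the sampling period (= 1/sampling rate), and sinc is the sinc function (it is the Fourier transform of the rect function).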
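The Whittaker-Shannon formula can be tried out with a truncated sum. The sketch below samples a 1 Hz sine at 8 Hz (above the Nyquist rate) and evaluates the interpolant between samples. The signal, the sampling rate, the truncation to 801 samples, and the helper name `sinc_interpolate` are all assumptions for illustration; the exact formula uses infinitely many samples, so a small truncation error remains.

```python
import numpy as np

def sinc_interpolate(samples, n, T, t):
    """Evaluate sum_n samples[n] * sinc((t - n*T)/T) at the times in t.

    np.sinc(x) is the normalized sinc, sin(pi*x)/(pi*x).
    """
    kernel = np.sinc((t[:, None] - n[None, :] * T) / T)  # shape (len(t), len(n))
    return kernel @ samples

B = 1.0                      # highest frequency in the toy signal (Hz)
fs = 8.0                     # sampling rate, above the Nyquist rate 2B
T = 1.0 / fs                 # sampling period
n = np.arange(-400, 401)     # finite window of sample indices (truncation)
samples = np.sin(2 * np.pi * B * n * T)

t = np.linspace(-1.0, 1.0, 201)          # evaluation times, well inside the window
f_rec = sinc_interpolate(samples, n, T, t)
f_true = np.sin(2 * np.pi * B * t)

max_err = np.max(np.abs(f_rec - f_true))
print(f"maximum reconstruction error on [-1, 1]: {max_err:.2e}")
```

Widening the sample window reduces the error further, which reflects the slow decay of the sinc kernel.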

The samples need to be uniformly spaced (there are extensions for non-uniformly spaced samples, with the equivalent of Nyquist rate being an average sampling rate).

The sampling rate needs to be very high if the original signal contains higher frequencies (to avoid aliasing).

Does not account for several nice properties of naturally occurring signals (talks only about band-limitedness which is not perfectly realizable).


Ref: Candes, Romberg and Tao, “Robust Uncertainty Principles: Exact Signal Reconstruction from Highly Incomplete Frequency Information”, IEEE Transactions on Information Theory, Feb 2006.

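$$\min_{f} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} \sqrt{f_x^2(x,y) + f_y^2(x,y)} \quad \text{such that} \quad \forall (u,v) \in C, \ [\mathcal{F}(f)](u,v) = G(u,v)$$

Here $\mathcal{F}$ is the Fourier operator, C is the set of measured frequencies, and G(u,v) are the measured Fourier coefficients. $f_x$ and $f_y$ denote the derivatives of the image f along x and y.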

The top-left image (previous slide) is a standard phantom used in medical imaging, called the Logan-Shepp phantom.

The top-right image shows 22 radial directions with 512 samples along each, representing those Fourier frequencies (remember we are dealing with 2D frequencies!) which were measured.

Bottom left: reconstruction obtained using inverse Fourier transform, assuming the rest of the Fourier coefficients were zero.

Bottom right: image reconstructed by solving the constrained optimization problem in the yellow box on the previous slide. It gives a zero-error result!

Let a naturally occurring signal be denoted f.

Let us denote the measurement of this signal (using some device) as y.

Typically, the mathematical relationship between f and y can be expressed as a linear equation of the form y = Φf.

Φ is called the measurement matrix (or the “sensing matrix” or the “forward model” of the measuring device).

For a standard digital camera, Φ may be approximated by a Gaussian blur.

Let f be a vector with n elements.

Since the measurements need to be compressive, Φ must have fewer rows (m) than columns (n) to produce a measurement vector y with m elements.

We know y and Φ, and we wish to estimate f.

We know that in general, this is an under-determined linear system, and hence there is no unique solution.

Why?

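$$\Phi f = \Phi (f + v), \quad \text{where } \Phi v = 0, \ \text{i.e. } v \in \text{Nullspace}(\Phi)$$

Since Φ has fewer rows than columns, its nullspace contains non-zero vectors v, so f and f + v produce exactly the same measurements.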
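The non-uniqueness argument is easy to reproduce numerically. This minimal sketch draws a random 3 x 6 matrix Φ, takes a nullspace vector from its singular value decomposition, and confirms that adding it to f leaves the measurements unchanged. The sizes, the random entries, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6                       # fewer measurements than unknowns
Phi = rng.standard_normal((m, n))
f = rng.standard_normal(n)

# Full SVD: Vt is n x n. For a rank-m matrix, its last n - m rows span the nullspace.
_, _, Vt = np.linalg.svd(Phi)
v = Vt[-1]                        # a unit-norm vector with Phi @ v = 0 (up to round-off)

f_other = f + v
same_measurements = np.allclose(Phi @ f, Phi @ f_other)
different_signals = not np.allclose(f, f_other)
print(same_measurements, different_signals)
```

Any multiple of v (or any combination of the last n - m rows of `Vt`) works equally well, so there are infinitely many signals consistent with the same y.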

But CS theory states that in certain cases, this system does have a unique solution.

Conditions to be satisfied:

1. Vector f should be sparse (so not all vectors in R^n are potential solutions).

2. Φ should be “very different from” (“incoherent with”) any row-subset of the identity matrix.

It turns out this is wonderful news for signal and image processing.

Why?

1. Natural signals/images have a sparse representation in some well-known orthonormal basis Ψ such as Fourier, DCT, Haar wavelet etc.

2. The measurement matrix Φ can be designed to be incoherent (poorly correlated) with the signal basis matrix Ψ, i.e. the sensing waveforms (rows of Φ) have a very dense representation in the signal basis.

Many signals have sparse* representations in standard orthonormal bases (denoted here as Ψ).

Example:

$$f = \Psi\theta, \qquad f \in \mathbb{R}^n, \ \theta \in \mathbb{R}^n, \ \Psi \in \mathbb{R}^{n \times n}, \ \Psi^T \Psi = I, \qquad \|\theta\|_0 = k \ll n$$

Here $\|\theta\|_0$ is the L0 norm of a vector = the number of non-zero elements in it.

*Actually, the representation is compressible, i.e. most of the coefficients are close to zero, but not exactly zero. We will clarify this issue very soon.


Image source: Candes and Wakin, “An introduction to compressive sampling”, IEEE Signal Processing Magazine

Consider a sensing matrix Φ of size m by n, where m is much less than n (if m >= n, there is no compression in the measurements!).

We will assume (for convenience) that each row of Φ is unit-normalized.

CS theory states that Φ and Ψ should be “incoherent” with each other.

The coherence between Φ and Ψ is defined as:

$$\mu(\Phi, \Psi) = \sqrt{n} \max_{1 \le j \le m, \ 1 \le i \le n} \left| \langle \Phi^j, \Psi_i \rangle \right|$$

where $\Phi^j$ is the j-th row of Φ and $\Psi_i$ is the i-th column of Ψ.

We want this quantity μ to be as small as possible.

Its value always lies in the range $[1, \sqrt{n}]$.

Let the signal basis Ψ be the Fourier basis. A sampling basis Φ that is incoherent with it is the standard spike (Dirac) basis:

$$\Phi_k(t) = \delta(t - k)$$

This corresponds to the simplest and most conventional sampling basis in space or time (i.e. Φ is the identity matrix).

The associated coherence value is 1 (why?).
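The coherence value for the spike and Fourier pair can be computed directly. The sketch below assumes the coherence is sqrt(n) times the largest magnitude among the products of rows of the sensing matrix with columns of the basis matrix, and that the Fourier basis has orthonormal columns. The function name `coherence` and the size n = 16 are illustrative assumptions.

```python
import numpy as np

def coherence(Phi, Psi):
    """sqrt(n) * max_{j,i} |(row j of Phi) . (column i of Psi)|, i.e. the largest |entry| of Phi @ Psi."""
    n = Psi.shape[0]
    return np.sqrt(n) * np.max(np.abs(Phi @ Psi))

n = 16
k = np.arange(n)
Psi_fourier = np.exp(2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)  # orthonormal columns
Phi_spike = np.eye(n)                                               # spike (identity) sampling

# Every Fourier basis entry has magnitude 1/sqrt(n), so the spike basis is maximally incoherent.
mu_spike_fourier = coherence(Phi_spike, Psi_fourier)

# Opposite extreme: sensing rows matched to the basis itself give the largest possible value.
mu_matched = coherence(Psi_fourier.conj().T, Psi_fourier)

print(mu_spike_fourier, mu_matched, np.sqrt(n))
```

The first value sits at the lower end of the coherence range and the second at the upper end, sqrt(n).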

Sensing matrices whose entries are i.i.d. random draws from Gaussian or Bernoulli (+1/-1) distributions are incoherent with any given orthonormal basis Ψ with very high probability.

Implication: we want our sensing matrices to behave like noise!
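The behaviour of random sensing matrices can be probed empirically. The sketch below draws a Gaussian matrix with unit-normalized rows and measures its coherence with a DCT basis, assuming the coherence is sqrt(n) times the largest magnitude entry of the product of the sensing matrix and the basis matrix. The sizes (m = 64, n = 256), the seed, and the helper name `dct_basis` are illustrative; this is a single random draw, not a proof of the probabilistic statement.

```python
import numpy as np

def dct_basis(n):
    """n x n matrix whose columns are orthonormal DCT basis vectors."""
    u = np.arange(n).reshape(1, -1)   # frequency index along columns
    t = np.arange(n).reshape(-1, 1)   # sample index along rows
    Psi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * u / (2 * n))
    Psi[:, 0] = np.sqrt(1.0 / n)
    return Psi

rng = np.random.default_rng(0)
m, n = 64, 256
Phi = rng.standard_normal((m, n))
Phi /= np.linalg.norm(Phi, axis=1, keepdims=True)   # unit-normalize each row

Psi = dct_basis(n)
mu = np.sqrt(n) * np.max(np.abs(Phi @ Psi))
print(f"coherence of this draw: {mu:.2f} (possible range: 1 to {np.sqrt(n):.0f})")
```

Swapping `dct_basis(n)` for another orthonormal basis is a simple way to see that the result does not depend much on which basis is chosen.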

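Let the measured data be given as:

$$y = \Phi f = \Phi\Psi\theta, \qquad y \in \mathbb{R}^m, \ f \in \mathbb{R}^n, \ \Phi \in \mathbb{R}^{m \times n}, \ m \ll n$$

The coefficients of the signal f in Ψ (denoted as θ), and hence the signal f itself, can be recovered by solving the following constrained minimization problem:

$$\text{Problem P0:} \quad \min_{\theta} \|\theta\|_0 \ \text{ such that } \ y = \Phi\Psi\theta$$

Here $\|\theta\|_0$ is the L0 norm of a vector = the number of non-zero elements in it.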

P0 is a very difficult optimization problem to solve – it is NP-hard.
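To see why P0 is hard in practice, consider the most direct way of solving it: try every possible support, smallest first, and test whether the measurements can be matched exactly. The sketch below does this for a tiny problem. The function name `solve_p0_bruteforce`, the sizes, and the tolerance are illustrative assumptions, and the matrix A stands for the product ΦΨ.

```python
import itertools
import numpy as np

def solve_p0_bruteforce(A, y, tol=1e-9):
    """Return a sparsest theta with A @ theta = y by trying every support, smallest first."""
    m, n = A.shape
    if np.linalg.norm(y) <= tol:
        return np.zeros(n)
    for s in range(1, n + 1):                                  # candidate sparsity level
        for support in itertools.combinations(range(n), s):    # C(n, s) candidate supports
            cols = list(support)
            coef, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
            if np.linalg.norm(A[:, cols] @ coef - y) <= tol:   # exact fit found
                theta = np.zeros(n)
                theta[cols] = coef
                return theta
    return None

rng = np.random.default_rng(1)
m, n = 6, 12
A = rng.standard_normal((m, n))          # stands for Phi @ Psi
theta_true = np.zeros(n)
theta_true[[2, 9]] = [1.5, -2.0]         # a 2-sparse coefficient vector
y = A @ theta_true

theta_hat = solve_p0_bruteforce(A, y)
recovered = np.allclose(theta_hat, theta_true)
print(recovered)
```

For n = 12 this is instant, but the number of supports of size s is C(n, s), so the search becomes infeasible for signals of realistic length.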

Hence, a softer version (known as Basis Pursuit) is solved:

$$\text{Problem BP:} \quad \min_{\theta} \|\theta\|_1 \ \text{ such that } \ y = \Phi\Psi\theta$$

Here $\|\theta\|_1$ is the L1 norm of a vector = the sum total of the absolute values of all its elements.

This is a linear programming problem and can be solved with any LP solver (in Matlab, for example) or with packages like L1-magic: http://users.ece.gatech.edu/~justin/l1magic/
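Basis Pursuit can be written as a linear program by splitting the unknown into its positive and negative parts. The following is a minimal sketch using SciPy's `linprog` rather than Matlab or L1-magic; the function name `basis_pursuit`, the problem sizes, and the Gaussian matrix A (standing for the product ΦΨ) are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||theta||_1 subject to A @ theta = y as a linear program."""
    m, n = A.shape
    # Write theta = p - q with p, q >= 0; at the optimum sum(p) + sum(q) equals ||theta||_1.
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

rng = np.random.default_rng(0)
m, n, k = 32, 64, 4                       # measurements, signal length, sparsity
A = rng.standard_normal((m, n)) / np.sqrt(m)
theta_true = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
theta_true[support] = rng.standard_normal(k)
y = A @ theta_true                        # compressive measurements

theta_hat = basis_pursuit(A, y)
max_abs_error = np.max(np.abs(theta_hat - theta_true))
print(f"largest coefficient error: {max_abs_error:.2e}")
```

With only half as many measurements as unknowns, the L1 solution here coincides with the sparse vector that generated the data; with fewer measurements or a less sparse vector, recovery can fail.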

Is P0 guaranteed to have a unique solution at all? Why? (If answer is no, compressed sensing is not guaranteed to work!)

Consider the case that any 2S columns of an m x n matrix A are linearly independent. Then any S-sparse signal f can be uniquely reconstructed from measurements y = Af. See proof on next slide.
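The condition on the columns of A can be checked by brute force for very small matrices. The sketch below tests whether every set of 2S columns is linearly independent, for a random Gaussian matrix and for a deliberately broken copy with a repeated column. The function name `every_subset_independent`, the sizes, and the choice S = 2 are illustrative assumptions.

```python
import itertools
import numpy as np

def every_subset_independent(A, size):
    """True if every set of `size` columns of A is linearly independent."""
    n = A.shape[1]
    return all(
        np.linalg.matrix_rank(A[:, list(cols)]) == size
        for cols in itertools.combinations(range(n), size)
    )

rng = np.random.default_rng(0)
S = 2
A = rng.standard_normal((4, 8))           # m = 4 rows, n = 8 columns

gaussian_ok = every_subset_independent(A, 2 * S)

# Counter-example: duplicate a column, so two different sparse signals collide.
A_bad = A.copy()
A_bad[:, 1] = A_bad[:, 0]
duplicate_ok = every_subset_independent(A_bad, 2 * S)

e0, e1 = np.eye(8)[0], np.eye(8)[1]       # two distinct 1-sparse signals
collision = np.allclose(A_bad @ e0, A_bad @ e1)
print(gaussian_ok, duplicate_ok, collision)
```

The number of subsets grows combinatorially with n, so this check is only practical as an illustration on toy sizes.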