WAVELETS: Data Analytic Perspective

Brani Vidakovic

Georgia Institute of Technology, Atlanta, GA, USA

Seminar Talk at the Department of Industrial Engineering, Seoul National University

Vidakovic, B. (GaTech) WAVELETS: Data Analytic View July 11, 2013 1 / 44

Overview

In this talk:

• Four Holy Grails of Wavelets or Why Wavelets
• What are Wavelets?
• Wavelet Shrinkage via Statistical Inference
• BAMS Example
• Wavelets and Scaling
• MATLAB Sessions

MATLAB DEMOS: http://zoe.bme.gatech.edu/~bv20/isye6420/

Time/Frequency or Time/Scale Domains

Echolocation Signal in Time/Scale Domain

Figure: Digitized 2.5 microsecond echolocation pulse emitted by the Large Brown Bat, Eptesicus fuscus, and time/scale (Wigner-Ville) representation of the pulse. (Left) Echolocation principle; (Middle) Pulse in the time domain; (Right) Wigner-Ville transform, plotted as Time against Frequency. Data courtesy of Rich Baraniuk, DSP at Rice University.

Wavelets compress the data

Figure: Left: Normalized wind velocity [60 Hz, Duke Forest] time series (time domain values) and its Haar wavelet transform (Haar wavelet domain values). Right: Corresponding Lorenz curves L(p) of “energies” (squared coefficients).
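The compression claim can be illustrated with a small sketch. It assumes an orthonormal Haar transform and a sampled sine wave as stand-in data (the wind velocity series is not reproduced here); the helper names `haar_dwt` and `lorenz_curve` are illustrative and are not part of the talk's MATLAB demos.

```python
import math

def haar_dwt(y):
    """Full orthonormal Haar DWT of a list whose length is a power of 2."""
    smooth, details = list(y), []
    while len(smooth) > 1:
        pairs = list(zip(smooth[0::2], smooth[1::2]))
        # Prepend, so coarser details end up before finer ones.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        smooth = [(a + b) / math.sqrt(2) for a, b in pairs]
    return smooth + details

def lorenz_curve(values):
    """Points (p, L(p)): energy share held by the fraction p of smallest squared values."""
    energies = sorted(v * v for v in values)
    total = sum(energies)
    curve, running = [(0.0, 0.0)], 0.0
    for i, e in enumerate(energies, start=1):
        running += e
        curve.append((i / len(energies), running / total))
    return curve

# Stand-in signal (an assumption): one period of a sine wave, 64 samples.
signal = [math.sin(2 * math.pi * t / 64) for t in range(64)]
coefs = haar_dwt(signal)
```

The total energy is the same in both domains, but the Lorenz curve of `coefs` sits far below that of `signal`: in the wavelet domain a few coefficients carry most of the energy.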

Wavelets whiten the data

Figure: The autocorrelations (ACF, lags 0 to 20) in the wind velocity time series [panel (a), time domain] and its wavelet transform [panel (b), wavelet domain].

Wavelets filter the data

Figure: Two functions with different frequency/scale information separate in the wavelet domain.

Wavelets detect self-similarity in data

Figure: (Left) A path of Brownian motion; (Right) Wavelet-based log-spectrum, log2(average(coefs²)) against dyadic level, annotated with slope −1.9839. Regular decay is a signature of monofractality.

What are Wavelets?

• Orthonormal (o.n.) bases of $\mathbb{L}_2$ of the form

$$\left\{\psi_{jk}(x) = 2^{j/2}\,\psi(2^{j}x - k),\ j, k \in \mathbb{Z}\right\}$$

$$f \in \mathbb{L}_2:\quad f(x) \approx \sum_{(j,k)\,\in\,\text{finite set}} d_{jk}\,\psi_{jk}(x).$$

$j$: index of scale, with resolution/size $2^{-j}$. Frequency is $2^{j}$, the reciprocal of scale.
$k$: location, shift, translate, “time”. The coefficient $d_{jk}$ measures the energy in the neighborhood of $x = k/2^{j}$.

• $\{\phi_{J_0,k}(x),\ \psi_{jk}(x),\ j \ge J_0,\ k \in \mathbb{Z}\}$
• Multiresolution Analysis (MRA) is fully determined by $\phi$

Three Ways To Think About DWT

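$\underset{\sim}{y} = (y_1, \dots, y_N)$ is the data. There is an underlying function $f$ such that

$$f(x) = \sum_{k} y_k\,\phi_{Jk}(x), \qquad J = \log_2(N).$$

1. $f(x) = \sum_{j<J,\,k} d_{jk}\,\psi_{jk}(x)$: $\ \underset{\sim}{y} \longrightarrow \{d_{jk} = \langle f, \psi_{jk}\rangle\}$, or $\underset{\sim}{d}$.

2. $\underset{\sim}{d} = W\underset{\sim}{y}$, or $\underset{\sim}{y} = W'\underset{\sim}{d}$.

3. $\underset{\sim}{d} = \left(\dots,\ G(H(H(H\underset{\sim}{y}))),\ G(H(H\underset{\sim}{y})),\ G(H\underset{\sim}{y}),\ G\underset{\sim}{y}\right)$.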

Second & Third!

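2. $\underset{\sim}{d} = W\underset{\sim}{y}$ (short signals and images)

3. $\underset{\sim}{h} = (h_1, \dots, h_M)$, and $\underset{\sim}{g}$ is obtained via $g_n = (-1)^n\,h_{1-n}$.

Haar: $\underset{\sim}{h} = (1/\sqrt{2},\ 1/\sqrt{2})$, $\ \underset{\sim}{g} = (1/\sqrt{2},\ -1/\sqrt{2})$

$H$: filtering with $\underset{\sim}{h}$ (low pass) + decimation (keep every 2nd)

$G$: filtering with $\underset{\sim}{g}$ (high pass) + decimation

$$\underset{\sim}{d} = \left(H\cdots(H(H(H\underset{\sim}{y}))) \ \middle|\ G\cdots(H(H(H\underset{\sim}{y}))) \ \middle|\ \dots \ \middle|\ G(H(H\underset{\sim}{y})) \ \middle|\ G(H\underset{\sim}{y}) \ \middle|\ G\underset{\sim}{y}\right)$$

$$\underset{\sim}{d} = (\text{smooth part} \mid \text{coarsest details} \mid \dots \mid \text{finest details})$$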
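The cascade in point 3 can be sketched for the Haar filters quoted above. This is a minimal illustration, not the talk's dwtr.m; the function names `H`, `G` and `dwt_cascade` and the sample vector are chosen here for the example.

```python
import math

R2 = math.sqrt(2)

def H(y):
    """Low-pass Haar filtering with h = (1/sqrt2, 1/sqrt2), then keep every 2nd value."""
    return [(y[i] + y[i + 1]) / R2 for i in range(0, len(y), 2)]

def G(y):
    """High-pass Haar filtering with g = (1/sqrt2, -1/sqrt2), then keep every 2nd value."""
    return [(y[i] - y[i + 1]) / R2 for i in range(0, len(y), 2)]

def dwt_cascade(y, levels):
    """Return the blocks (smooth part | coarsest details | ... | finest details)."""
    blocks, smooth = [], list(y)
    for _ in range(levels):
        blocks.insert(0, G(smooth))   # details split off at this level
        smooth = H(smooth)            # the smooth part is filtered again
    return [smooth] + blocks

y = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
d = dwt_cascade(y, 3)   # blocks of length 1, 1, 2, 4
```

Because each H/G pair is orthonormal, the squared values in `d` add up to the same total as those in `y`. Each level halves the length of the smooth part, so a signal of length 1024 decomposed over 5 levels yields blocks of length 32, 32, 64, 128, 256 and 512.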

Organization of Scales in Wavelet Domain

Block              Expression                                    Length
signal             $\underset{\sim}{y}$                          1024
finest details     $G\underset{\sim}{y}$                         512
fine details       $G(H(\underset{\sim}{y}))$                    256
details            $G(H(H(\underset{\sim}{y})))$                 128
coarse details     $G(H(H(H(\underset{\sim}{y}))))$              64
coarsest details   $G(H(H(H(H(\underset{\sim}{y})))))$           32
smooth             $H(H(H(H(H(\underset{\sim}{y})))))$           32

Mallat’s Algorithm

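[Diagram, forward transform: $\underset{\sim}{y} \to \underset{\sim}{c}^{(1)} \to \underset{\sim}{c}^{(2)} \to \dots \to \underset{\sim}{c}^{(k-1)} \to \underset{\sim}{c}^{(k)}$, each arrow applying $H$; at every step $G$ splits off the details $\underset{\sim}{d}^{(1)}, \underset{\sim}{d}^{(2)}, \dots, \underset{\sim}{d}^{(k-1)}, \underset{\sim}{d}^{(k)}$.]

[Diagram, inverse transform: starting from $\underset{\sim}{c}^{(3)}$ and $\underset{\sim}{d}^{(3)}$, each step combines $\underset{\sim}{c}^{(j)}$ (through $H^*$) and $\underset{\sim}{d}^{(j)}$ (through $G^*$) to give $\underset{\sim}{c}^{(2)}$, then $\underset{\sim}{c}^{(1)}$, then $\underset{\sim}{y}$.]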

Mallat’s Algorithm

$H$, $G$: a single step in the forward DWT (filter + ↓)
$H^*$, $G^*$: a single step in the inverse DWT (↑ + filter)

$$H:\quad c_{j-1,l} = \sum_{k} h_{k-2l}\,c_{j,k}$$

$$G:\quad d_{j-1,l} = \sum_{k} g_{k-2l}\,c_{j,k}$$

$$H^*, G^*:\quad c_{j,k} = \sum_{l} c_{j-1,l}\,h_{k-2l} + \sum_{l} d_{j-1,l}\,g_{k-2l}.$$

dwtr.m: wdata = dwtr(data, L, filterh)
idwtr.m: data1 = idwtr(wdata, L, filterh)
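One forward and one inverse step of these recursions can be sketched as follows. This is not a port of dwtr.m or idwtr.m: periodic (wrap-around) boundary handling is assumed, the high-pass filter uses the 0-based convention g[k] = (−1)^k h[M−1−k], and all function names are illustrative. The filter is the Symmlet 4 low-pass filter quoted on the DEMO 1 slide.

```python
import math

def qmf(h):
    """High-pass filter from low-pass h: g[k] = (-1)^k h[M-1-k] (0-based, M even)."""
    M = len(h)
    return [(-1) ** k * h[M - 1 - k] for k in range(M)]

def forward_step(c, h):
    """One H/G step: c_{j-1,l} = sum_k h_{k-2l} c_{j,k}, and the same with g for d_{j-1,l}."""
    n, g = len(c), qmf(h)
    smooth = [sum(h[m] * c[(m + 2 * l) % n] for m in range(len(h))) for l in range(n // 2)]
    detail = [sum(g[m] * c[(m + 2 * l) % n] for m in range(len(h))) for l in range(n // 2)]
    return smooth, detail

def inverse_step(smooth, detail, h):
    """One H*/G* step: c_{j,k} = sum_l c_{j-1,l} h_{k-2l} + sum_l d_{j-1,l} g_{k-2l}."""
    n, g = 2 * len(smooth), qmf(h)
    c = [0.0] * n
    for l in range(len(smooth)):
        for m in range(len(h)):
            c[(m + 2 * l) % n] += smooth[l] * h[m] + detail[l] * g[m]
    return c

# Symmlet 4 low-pass filter as quoted on the DEMO 1 slide.
SYMMLET4 = [-0.07576571479, -0.02963552765, 0.49761866763, 0.80373875181,
            0.29785779561, -0.09921954358, -0.01260396726, 0.03222310060]

data = [math.sin(0.3 * t) + 0.1 * t for t in range(16)]   # arbitrary test signal
smooth, detail = forward_step(data, SYMMLET4)
recovered = inverse_step(smooth, detail, SYMMLET4)         # equals data up to rounding
```

Each forward step touches every input value a fixed number of times (the filter length), and the input halves at each level, which is where the O(n) cost of the full transform comes from.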

DEMO 0

• Dem0a.m (dwtr.m, idwtr.m)
• Dem0b.m (WavMat.m)

Figure: Output of Dem0a.m. Panels: Doppler; Wavelet Transform of Doppler Signal by Symmlet 8; Doppler Recovered.

From h to φ, ψ

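$$h \;\to\; m_0 \;\to\; \Phi(\omega) \;\to\; \phi(x)$$

Transfer function and $\Phi(\omega)$

$$m_0(\omega) = \frac{1}{\sqrt{2}} \sum_{k \in \mathbb{Z}} h_k\,e^{-ik\omega} \quad \left[= \frac{1}{\sqrt{2}}\,H(\omega)\right].$$

$$\Phi(\omega) = \mathcal{F}(\phi(x)) = \int_{-\infty}^{\infty} \phi(x)\,e^{-i\omega x}\,dx$$

$$\phi(x) = \sum_{k} h_k\,\sqrt{2}\,\phi(2x - k)$$

$$\Phi(\omega) = m_0\!\left(\frac{\omega}{2}\right)\Phi\!\left(\frac{\omega}{2}\right) = \cdots = \prod_{n=1}^{\infty} m_0\!\left(\frac{\omega}{2^{n}}\right), \qquad \phi(x) = \mathcal{F}^{-1}(\Phi(\omega))$$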
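The chain from h to Φ(ω) can be checked numerically. The sketch below evaluates the transfer function and a truncated version of the infinite product for the Haar filter, where Φ(ω) is known in closed form because φ is the indicator of [0, 1). The names `m0` and `phi_hat` and the cut-off of 40 factors are assumptions made for this example.

```python
import cmath
import math

def m0(h, omega):
    """Transfer function m0(w) = (1/sqrt 2) * sum_k h_k exp(-i k w), k starting at 0."""
    return sum(hk * cmath.exp(-1j * k * omega) for k, hk in enumerate(h)) / math.sqrt(2)

def phi_hat(h, omega, terms=40):
    """Truncated product prod_{n=1}^{terms} m0(w / 2^n), approximating Phi(w)."""
    value = 1.0 + 0.0j
    for n in range(1, terms + 1):
        value *= m0(h, omega / 2 ** n)
    return value

haar = [1 / math.sqrt(2), 1 / math.sqrt(2)]
approx = phi_hat(haar, 2.0)
exact = (1 - cmath.exp(-2j)) / 2j   # Fourier transform of the indicator of [0, 1) at w = 2
```

For smoother wavelets no closed form is available, and the same product (or a time-domain scheme such as the Daubechies-Lagarias algorithm mentioned in DEMO 1) is what produces the plots of φ and ψ.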

DEMO 1

• Dem1a.m (Symmlet 4, via Daubechies-Lagarias Algorithm, uses Phijk.m, Psijk.m)
• h = [-0.07576571479 -0.02963552765 0.49761866763 0.80373875181 0.29785779561 -0.09921954358 -0.01260396726 0.03222310060];
• Dem1b.m (Pollen family: h = [(1 + cos φ − sin φ)/s, (1 + cos φ + sin φ)/s, (1 − cos φ + sin φ)/s, (1 − cos φ − sin φ)/s] for s = 2√2 and φ = π/4)

Figure: Scaling and wavelet functions for Symmlet 4 and Pollen φ = 45° bases
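The Pollen parametrization can be explored with a short sketch. It builds the 4-tap filter from the formula in Dem1b.m and checks the conditions an orthonormal low-pass filter of this length has to satisfy. The function names and the particular checks are choices made for this illustration.

```python
import math

def pollen_filter(phi):
    """4-tap Pollen low-pass filter for angle phi, with s = 2*sqrt(2)."""
    s = 2 * math.sqrt(2)
    c, si = math.cos(phi), math.sin(phi)
    return [(1 + c - si) / s, (1 + c + si) / s, (1 - c + si) / s, (1 - c - si) / s]

def orthonormality_defects(h):
    """Deviations from sum(h) = sqrt 2, sum(h^2) = 1 and sum_k h_k h_{k+2} = 0.

    Only the shift by 2 is checked, which is all a 4-tap filter needs.
    """
    return (abs(sum(h) - math.sqrt(2)),
            abs(sum(v * v for v in h) - 1.0),
            abs(sum(h[k] * h[k + 2] for k in range(len(h) - 2))))

h45 = pollen_filter(math.pi / 4)       # the filter used for the figure
defects = orthonormality_defects(h45)  # all three are zero up to rounding
```

All three conditions hold for every angle φ, so each φ gives a valid filter. With this parametrization, φ = π/6 gives a first coefficient of (1 + √3)/(4√2), which is the first coefficient of the Daubechies 4-tap filter.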

Why Wavelets in Data Analysis?

• Wavelet transforms (WT) are local (in time; in scale/frequency)
• WT are orthogonal or near-orthogonal
• WT are applicable to discrete data sets
• WT preserve but disbalance the energy in data
• WT whiten data
• WT are FAST! O(n)
• WT are versatile
• WT are Bayes friendly
• WT are sensitive to self-similar phenomena
• much more...

Where Wavelets in Data Analysis?

• Regression Problems
  ♦ Equi- and Nonequi-spaced Regression
  ♦ Dimension Reduction, Approximate PCA, Denoising
  ♦ Pursuits and Data Driven Selection of Analyzing Tools
• Density Estimation
  ♦ Functionals of a Density, Classification and Discrimination
• Deconvolutions
• Time Series
  ♦ Approximate K-L Expansions
  ♦ Nonstationary TS, Wavelet-based Spectral Analysis
• Long-Range Dependence, Self-similarity and Scaling in Data, (Multi-)Fractality

Where Wavelets in Statistics?

• GLM, GAM, Censored Models, Functional Data Analysis, Change Point Analysis
• Theory of Shapes, Wavelet Based Bookmarks
• Biased Sampling
• Medical Image Enhancement, Mammography, fMRI, CXR, CT
• Bayesian Applications. Bayesian Nonparametrics
• Statistical Calculation, Simulation, Wavestrapping

Model Based Wavelet Data Processing

DATA $\xrightarrow{\;W\;}$ Wavelet Coefficients: “Detail” & “Smooth”

Processed DATA $\xleftarrow{\;W^{-1}\;}$ Process (Detail) Coefficients

Process ≡
• Shrink, Threshold, Split
• Transform
• Simulate New, Construct
• Resample, Permute
• Assess “Energy” and “Fluxes”

Wavelet Shrinkage: Shrinkage Policies

• Hard Thresholding
• Soft Thresholding
• Semi-Soft Thresholding
• Smooth Shrinkage
• Various Variable Selection Methods

Figure: Examples of thresholding rules
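The first two policies in the list can be written down directly. The sketch below uses the usual textbook definitions of hard and soft thresholding; the function names, the threshold value 1.0 and the sample coefficients are illustrative.

```python
def hard_threshold(d, lam):
    """Keep the coefficient if |d| > lam, otherwise set it to zero."""
    return d if abs(d) > lam else 0.0

def soft_threshold(d, lam):
    """Zero out |d| <= lam and pull the surviving coefficients toward zero by lam."""
    if abs(d) <= lam:
        return 0.0
    return d - lam if d > 0 else d + lam

coefs = [-3.0, -0.4, 0.0, 0.7, 2.5]
hard = [hard_threshold(d, 1.0) for d in coefs]   # [-3.0, 0.0, 0.0, 0.0, 2.5]
soft = [soft_threshold(d, 1.0) for d in coefs]   # [-2.0, 0.0, 0.0, 0.0, 1.5]
```

Hard thresholding is a keep-or-kill rule and leaves large coefficients untouched; soft thresholding is continuous in d but biases every surviving coefficient by the threshold.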

Universal Threshold

$$\lambda = \sqrt{2\log n}\;\hat{\sigma}$$

$\hat{\sigma}$ is an estimator of the standard deviation of the noise, $\sigma$. There are many proposals, usually involving only the finest level of detail.

Rationale: If $X_1, \dots, X_n$ are i.i.d. $\mathcal{N}(0, \sigma^2)$, then for large $n$, $\ \mathbb{E}X_{(n)} = -\mathbb{E}X_{(1)} \approx \sqrt{2\log n}\;\sigma$.

• Since the estimated expected range of noise is $[-\sqrt{2\log n}\,\hat{\sigma},\ \sqrt{2\log n}\,\hat{\sigma}]$, any coefficient with magnitude outside the range is attributed to the signal and thus retained in the model.
• Oversmooths in practice.
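A minimal sketch of the universal threshold follows. The slide leaves the choice of σ̂ open; the estimator used here, the median absolute deviation of the finest details divided by 0.6745, is one common proposal and is an assumption of this example, as are the function names and the toy coefficients.

```python
import math
import statistics

def mad_sigma(finest_details):
    """Noise std estimate: median absolute deviation from the median, divided by 0.6745."""
    med = statistics.median(finest_details)
    return statistics.median([abs(d - med) for d in finest_details]) / 0.6745

def universal_threshold(n, sigma_hat):
    """lambda = sqrt(2 log n) * sigma_hat, with the natural logarithm."""
    return math.sqrt(2 * math.log(n)) * sigma_hat

def hard_threshold_all(details, lam):
    """Hard thresholding policy applied to a list of detail coefficients."""
    return [d if abs(d) > lam else 0.0 for d in details]

# Toy finest-level details for a signal of length n = 16: small noise plus one large value.
finest = [0.1, -0.2, 0.05, -0.1, 0.15, -0.05, 3.0, 0.2]
sigma_hat = mad_sigma(finest)
lam = universal_threshold(n=16, sigma_hat=sigma_hat)
kept = hard_threshold_all(finest, lam)   # only the coefficient 3.0 survives
```

Because the median is robust, the single large coefficient barely affects σ̂, which is the usual argument for estimating the noise level from the finest level of detail.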

DEMO 2

• Dem2a.m, Dem2b.m (Universal Threshold, Hard Thresholding Policy)

Figure: Left: Doppler + Noise (data in red) and Doppler (green). Right: Doppler estimate by thresholding (black).

Shrinkage induced by statistical modeling in the wavelet domain

$$\underset{\sim}{y} = \underset{\sim}{f} + \underset{\sim}{\epsilon} \quad \xrightarrow{\;W\;} \quad \underset{\sim}{d} = \underset{\sim}{\theta} + \underset{\sim}{\epsilon}$$

Estimate $\theta$ by $\hat{\theta}$ and obtain $\hat{f}$ as $W^{-1}(\hat{\theta})$.

• Location model on $d$: $f(d - \theta \mid \text{parameters})$
• Dimensionality (do not worry: wavelets decorrelate)
• Accounting for dependence (neighbors, parent-children): blocking strategies (classical), many Bayes solutions (MCMC, hidden MC’s)
• Model complexity/efficiency compromise. Simple models/fast shrinkage: realistic? Complex models: useful?

BAMS

Compromise between model reality and simplicity

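BAMS (Bayesian Adaptive Multiscale Shrinkage/Smoothing)

Model (Likelihood)

$$[d \mid \theta, \sigma^2] \sim \mathcal{N}(\theta, \sigma^2); \qquad \sigma^2 \sim \mathcal{E}(\mu),\ \mu > 0 \quad \left[\text{density } \mu e^{-\mu\sigma^2},\ \mu > 0\right]$$

Marginal Likelihood

$$d \mid \theta \sim \mathcal{DE}\!\left(\theta, \frac{1}{\sqrt{2\mu}}\right) \quad \left[\text{density } \tfrac{1}{2}\sqrt{2\mu}\;e^{-\sqrt{2\mu}\,|d-\theta|}\right]$$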

Prior

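$$[\theta \mid \epsilon] \sim \epsilon\,\delta_0 + (1-\epsilon)\,\mathcal{DE}(0, \tau), \qquad \epsilon = \epsilon(\text{multiresolution level})$$

Predictive Distribution (Marginal)

$$d \sim m(d) = \epsilon\,\mathcal{DE}\!\left(0, \frac{1}{\sqrt{2\mu}}\right) + (1-\epsilon)\,\frac{\tau e^{-|d|/\tau} - \frac{1}{\sqrt{2\mu}}\,e^{-\sqrt{2\mu}\,|d|}}{2\tau^2 - 1/\mu}$$

[Plot: horizontal axis from −3 to 3, vertical axis from 0 to 0.7.]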

Bayes Rule: BAMS

$$\delta(d) = \frac{(1-\epsilon)\,m(d)\,\delta^*(d)}{(1-\epsilon)\,m(d) + \epsilon\,\mathcal{DE}\!\left(0, \frac{1}{\sqrt{2\mu}}\right)},$$

$$\delta^*(d) = \frac{\tau\left(\tau^2 - 1/(2\mu)\right) d\,e^{-|d|/\tau} + (\tau^2/\mu)\left(e^{-|d|\sqrt{2\mu}} - e^{-|d|/\tau}\right)}{\left(\tau^2 - 1/(2\mu)\right)\left(\tau e^{-|d|/\tau} - (1/\sqrt{2\mu})\,e^{-|d|\sqrt{2\mu}}\right)}$$

Figure: Bayes’ rule for selected values of τ, µ and ε.
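The rule can be evaluated with a few lines of code. The sketch below is a reading of the formulas on these slides, not the talk's Dem3a.m. Two assumptions are made: the m(d) inside δ(d) is taken to be the marginal under the DE(0, τ) part of the prior (the fraction in the predictive distribution), and the δ* formula is used for d ≥ 0 and extended to negative d by odd symmetry. The function names and parameter values are illustrative.

```python
import math

def de_density(x, scale):
    """Double exponential (Laplace) density with location 0 and the given scale."""
    return math.exp(-abs(x) / scale) / (2 * scale)

def spread_marginal(d, tau, mu):
    """Marginal of d under the DE(0, tau) part of the prior."""
    r = math.sqrt(2 * mu)
    return (tau * math.exp(-abs(d) / tau) - math.exp(-r * abs(d)) / r) / (2 * tau**2 - 1 / mu)

def delta_star(d, tau, mu):
    """Posterior mean under the DE(0, tau) part; requires tau^2 != 1/(2 mu)."""
    a, r = abs(d), math.sqrt(2 * mu)
    k = tau**2 - 1 / (2 * mu)
    num = tau * k * a * math.exp(-a / tau) + (tau**2 / mu) * (math.exp(-a * r) - math.exp(-a / tau))
    den = k * (tau * math.exp(-a / tau) - math.exp(-a * r) / r)
    return math.copysign(num / den, d)   # odd extension (assumption)

def bams_rule(d, tau, mu, eps):
    """delta(d): posterior mean under the mixture prior eps*delta_0 + (1-eps)*DE(0, tau)."""
    m_spread = (1 - eps) * spread_marginal(d, tau, mu)
    m_zero = eps * de_density(d, 1 / math.sqrt(2 * mu))
    return m_spread * delta_star(d, tau, mu) / (m_spread + m_zero)

# Illustrative hyperparameters: tau = 2, mu = 0.5, eps = 0.9.
small = bams_rule(1.0, 2.0, 0.5, 0.9)    # shrunk heavily toward zero
large = bams_rule(40.0, 2.0, 0.5, 0.9)   # close to 40 minus a constant offset
```

Small coefficients are pulled almost to zero while large ones are moved by a nearly constant amount, which is the thresholding-like behavior visible in the figure.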

BAMS

Specification of Hyperparameters:
• Recall: $\sigma^2 \sim \mathcal{E}(\mu)$, $\ \mathbb{E}\sigma^2 = 1/\mu \ \Rightarrow\ \mu = \dfrac{1}{\text{pseudos}}$, where (Tukey) $\text{pseudos} = |Q_1 - Q_3|/C$, $Q_1$ and $Q_3$ are the first and the third quartile of the finest level of details in the decomposition, and $1.3 \le C \le 1.5$.
• $\epsilon(j) = 1 - \dfrac{1}{(j - \text{coarsest} + 1)^{\gamma}}$, $\ \gamma = \dfrac{3}{2}$.
• $\tau = \max\left\{\sqrt{\sigma_d^2 - \dfrac{1}{\mu}},\ 0\right\}$. (Information on self-similarity via τ)

Figure: Doppler signal: n = 1024, SNR=7; Noisy version (left) and

DEMO 3

Dem3a.m (BAMS shrinkage, Enrico Caruso: E lucevan le stelle from Tosca, by G. Puccini, recorded in February 1904).

Figure: Noisy Recording; “Denoised” Sound; Residuals

Scaling: It Started with Hurst and Nile Data

Harold Edwin Hurst was a poor Leicester boy who made good, eventually working his way into Oxford, and later became a British “civil servant” in Cairo in 1906. He got interested in the Nile River.

Hurst spent 62 years in Egypt, mostly working on the design and construction of reservoirs along the Nile.

By inspecting historical data on the Nile flows, Hurst discovered a phenomenon now called the Hurst effect.

Hurst’s Problem

Find the optimal reservoir capacity $R$ to accept the river flow in $N$ units of time, $X_1, X_2, \dots, X_N$, with a constant withdrawal of $\bar{X}$ per unit time. The optimal volume of the reservoir is the adjusted range,

$$R = \max_{1 \le k \le N}\left(X_1 + \cdots + X_k - k\bar{X}\right) - \min_{1 \le k \le N}\left(X_1 + \cdots + X_k - k\bar{X}\right).$$

In order to compare hundreds of years' worth of data, Hurst standardized the adjusted ranges $R$ with the sample standard deviation

$$S = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(X_i - \bar{X}\right)^2}.$$

The dimensionless ratio $R/S$ is the rescaled adjusted range.
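The rescaled adjusted range is simple to compute. The sketch below follows the two formulas above; the function name and the toy flow values are illustrative.

```python
import math

def rescaled_adjusted_range(x):
    """Return (R, S, R/S) for a sequence of flows x."""
    n = len(x)
    mean = sum(x) / n
    partial, deviations = 0.0, []
    for value in x:
        partial += value - mean          # X_1 + ... + X_k - k * mean
        deviations.append(partial)
    r = max(deviations) - min(deviations)
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return r, s, r / s

flows = [1.0, 2.0, 3.0, 4.0, 5.0]
R, S, ratio = rescaled_adjusted_range(flows)   # R = 3, S = sqrt(2.5)
```

For this toy series the cumulative deviations are −2, −3, −3, −2, 0, so R = 0 − (−3) = 3. Repeating the computation for growing N and regressing log(R/S) on log N is what yields an estimate of the exponent H.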

River Nile and Scaling

On the basis of more than 800 records, Hurst found that the quantity $R/S$ scales as $N^{H}$, for $H$ ranging from 0.46 to 0.93, with mean 0.73 and standard deviation 0.09.

[Plot: horizontal axis from 0.4 to 1, vertical axis from 0 to 100.]

Easy: H = 1/2 for iid normal; Feller: any iid with finite variance; Barnard (1956): Markovian dependent variables.

Mandelbrot (1975), Mandelbrot and Van Ness (1968), and Mandelbrot and Wallis (1968) associated the Hurst phenomenon with the presence of long memory ($\sum_{i} \mathrm{Cov}(X_1, X_i) = \infty$).

River Nile and Scaling

Figure: (a) Nile Yearly Minimum Level Data for n = 512 Consecutive Years from 622 A.D.; (b) Wavelet Log-spectrum, log2 scale-averaged energy against dyadic scale, with slope = −0.80 [0.80 = 2H − 1 → H = 0.90]

• But what is Wavelet Log-spectrum?

What is Wavelet (Log)Spectrum?

• $y$: data, $n \times 1$, $n = 2^{J}$.
• $d = Wy$: discrete wavelet transform, $n \times 1$, $n = 2^{J}$.
• $d = \{c_{J-m},\ d_{J-m},\ d_{J-m+1},\ \dots,\ d_{J-2},\ d_{J-1}\}$.
• Wavelet (Log)Spectrum:

$$\left\{\left(j,\ \log_2\!\left[\frac{1}{2^{j}}\sum_{k} d_{j,k}^{2}\right]\right),\ J - m \le j \le J - 1\right\}$$

• SLOPE is either −(2H + 1) (cumulative) or −(2H − 1) (differenced). For example, Brownian Motion and White Noise both share H = 0.5.
• The MATLAB function waveletspectra.m finds and plots wavelet spectra.
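The definition can be turned into a short sketch. This is not a port of waveletspectra.m: it uses the Haar wavelet, fits the slope by ordinary least squares over all levels, and every function name is illustrative. A level whose details are all zero would make the logarithm fail, which the sketch does not guard against.

```python
import math

def haar_details_by_level(y):
    """Map level j -> Haar details d_{j,k}; len(y) must be 2^J, levels run from J-1 down to 0."""
    levels, smooth = {}, list(y)
    j = len(y).bit_length() - 2          # finest detail level J - 1
    while len(smooth) > 1:
        pairs = list(zip(smooth[0::2], smooth[1::2]))
        levels[j] = [(a - b) / math.sqrt(2) for a, b in pairs]
        smooth = [(a + b) / math.sqrt(2) for a, b in pairs]
        j -= 1
    return levels

def log_spectrum(y):
    """Pairs (j, log2 of the average squared detail at level j), coarse to fine."""
    levels = haar_details_by_level(y)
    return [(j, math.log2(sum(d * d for d in levels[j]) / len(levels[j])))
            for j in sorted(levels)]

def fitted_slope(points):
    """Ordinary least-squares slope through the (j, log2 energy) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    return sum((x - mx) * (y - my) for x, y in points) / sum((x - mx) ** 2 for x, _ in points)

def hurst_from_slope(slope, cumulative=True):
    """Invert slope = -(2H + 1) (cumulative) or slope = -(2H - 1) (differenced)."""
    return (-slope - 1) / 2 if cumulative else (-slope + 1) / 2

ramp = [float(t) for t in range(8)]      # toy data: a straight line
spectrum = log_spectrum(ramp)            # approximately [(0, 5), (1, 2), (2, -1)]
slope = fitted_slope(spectrum)           # approximately -3
```

The conversion function reproduces the bracketed arithmetic in the figure captions of this talk: a slope of −0.80 for a differenced series gives H = 0.90, and slopes of −5/3, −2.24 and −2.81 for cumulative series give H = 1/3, 0.62 and 0.905.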

“Ubiquitous” – The Epithet of Scaling

Atmospheric Turbulence

Figure: (a) U Velocity Component against time; (b) Scaling in the Fourier Domain (slope = −5/3); (c) Scaling in the Wavelet Domain, log2 scale-averaged energy against dyadic scales (slope = −5/3). [5/3 = 2H + 1 → H = 1/3]

DNA Scales

A DNA molecule consists of a long complementary double helix of purine nucleotides (denoted as A and G) and pyrimidine nucleotides (denoted as C and T). [A, G → +1; C, T → −1]

Figure: (a) 8196-long DNA Walk (DNA RW against index) for Spider Monkey, from EMBL Nucleotide Sequence Alignment DNA Database; (b) Wavelet Scaling With Slope −2.24. [2.24 = 2H + 1 → H = 0.62]
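The DNA walk is the running sum of the ±1 coding given in brackets above. A minimal sketch, with a made-up eight-letter sequence standing in for the real data:

```python
from itertools import accumulate

STEP = {"A": 1, "G": 1, "C": -1, "T": -1}   # purines step up, pyrimidines step down

def dna_walk(sequence):
    """Cumulative sum of +1 (A, G) and -1 (C, T) steps along the sequence."""
    return list(accumulate(STEP[base] for base in sequence.upper()))

walk = dna_walk("AGGCTTAC")   # [1, 2, 3, 2, 1, 0, 1, 0]
```

Because the walk is a cumulative series, its wavelet log-spectrum slope is read as −(2H + 1), as in the figure.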

Money scales

Daily rates of exchange between Korean Won (₩) and Euro (€) as reported by the European Central Bank between 4 January 1999 and 21 June 2013, http://sdw.ecb.europa.eu

Figure: (a) Daily exchange rates of ₩ to € between 1/4/1999 and 6/21/2013 (source: European Central Bank); (b) scaling behavior in the “red interval” (2048 days starting with 1/4/1999, slope −1.99392) and (c) in the “green” interval (2048 days prior to 6/21/2013, slope −1.99413). Panels (b) and (c) show log spectrum against dyadic level. [2 ≈ 2H + 1 → H = 1/2]

Various measurements scale!

Other Examples

• Various Geophysical High Frequency Measurements
• Biometric Responses
• Economic Indices
• Internet Measurements
• Industrial Measurements
• Astronomy
• Medicine. Brain and Cancer Research

DEMO 4

• Dem4a.m (Fractional Brownian Motion)
• Dem4b.m (DNA RW)
• Dem4c.m (Scaling of Exchange Rates ₩ vs €)
• Dem4d.m (Gait Data)
• Dem4e.m (Coca Cola Company Stocks)

Dem4b.m Output

[Plots: the DNA random walk against index; log spectrum against dyadic level, annotated with slope −2.24122.]

DEMO 4

Dem4d.m Output

Figure: Gait Data: Timing between steps. Slope for the cumulative time is −(2H + 1) = −2.81 → H = 0.905. The two log-spectrum panels (log spectrum against dyadic level) are annotated with slopes −1.02486 and −2.88611.

Case Study: Filtering Ambient Ozone

Katul, Ruggeri, and Vidakovic (2005), JSPI

Figure: Raw data from a Gas-Analyzer (21 Hz), measurements against time (min); Wavelet Spectra, log energy against log scale.

Ozone Case Study

Estimator of O3

Figure: De-whitened Spectra (log energy against log scale); Estimator of O3 Concentration (estimate of the ozone signal against time in min).

Ozone Case Study

[Plots: residuals against time (min); log f(x) against x; acf(ozone) against lag.]

Conclusions

• Wavelets are becoming standard tools (like the Fourier transform).
• Wavelet shrinkage and scaling assessment: useful tools in data analysis.
• The goal of the talk was to build intuition about wavelet data processing and demonstrate fundamental operational concepts.

MATLAB DEMOS: http://zoe.bme.gatech.edu/~bv20/isye6420/supporting.html (under the April 14, 2015 entry).