Image Processing
Page 1: Image Processing - Carnegie Mellon University16720.courses.cs.cmu.edu/lec/convolution_lec1.pdf · Some final class philosophies • Diverse background of class implies folks will

Image Processing

Page 2:

Outline

• Logistics

• Motivation

• Convolution

• Filtering

Page 3:

Waitlist

• We are at 103 enrolled with 158 students on wait list. This room holds 107.

• I’m getting numerous requests of the form “how likely is it that I’ll get registered?” unlikely :(

• If you are considering dropping, please do so quickly

Page 4:

Some final class philosophies

• Diverse background of class implies folks will find some topics redundant and others new (e.g., EE folks might be bored by today’s signal processing)

• I think 1-way lectures are boring (and such content can easily be found elsewhere). Discussions are way more fun! I encourage you to come to class.

• I hate PowerPoint. I’d rather write on the board, but this room is not conducive to it. I still encourage you to take notes.

• If you are going to come and check e-mail / Facebook, I’d rather you drop now to make room for someone else who’d get more out of lecture.

Page 5:

Outline

• Logistics

• Motivation

• Convolution

• Filtering

Page 6:

David Marr, 1970s

David Marr, 1982

Computational perspective

Credited with early computational approach for vision

Slide credit: Fei-Fei Li & Andrej Karpathy, Lecture 1, 5-Jan-15

Page 7:

David Marr

Low-level Mid-level High-level

Page 8:

Low-level vision

Finding edges, blobs, bars, etc….

Page 9:

Consider a family of low-level image processing operations

Photoshop / Instagram filters: blur, sharpen, colorize, etc.

Are certain combinations redundant? Is there a mathematical way to characterize them?

Page 10:

Recall: what is a digital (grayscale) image?

Matrix of integer values

Page 11:

Let’s think of images as zero-padded functions

Images as height fields

F[i,j]

Page 12:

Characterizing image transformations

F[i,j] → T → G[i,j]

F[i] → T → G[i]

$G = T(F)$

$G[i] = T(F[i])$

(Abuse of notation: [i] does not mean the transformation is applied at each pixel separately)

$T(\alpha F_1 + \beta F_2) = \alpha G_1 + \beta G_2$

$G[i - j] = T(F[i - j])$

5 4 2 3 7 4 6 5 3 6

5 4 2 3 7 4 6 5 3 6

Page 13:

How do we characterize image processing operations?

Properties of “nice” functional transformations

Additivity: $T(F_1 + F_2) = T(F_1) + T(F_2)$

Scaling: $T(\alpha F) = \alpha T(F)$

Shift invariance: $G[i - j] = T(F[i - j])$

Direct consequence of additivity and scaling: linearity, $T(\alpha F_1 + \beta F_2) = \alpha G_1 + \beta G_2$

Page 14:

Impulse response

$\delta[i] = 1$ for $i = 0$ (0 otherwise) [also called delta function]

What does this look like for an image?

Any function can be written as a linear combination of shifted and scaled impulses

[Figure 1: Staircase approximation to a continuous-time signal, drawn as a sum of shifted and scaled pulses.]

Representing signals with impulses. Any signal can be expressed as a sum of scaled and shifted unit impulses. We begin with the pulse or “staircase” approximation to a continuous signal, as illustrated in Fig. 1. Conceptually, this is trivial: for each discrete sample of the original signal, we make a pulse signal. Then we add up all these pulse signals to make up the approximate signal. Each of these pulse signals can in turn be represented as a standard pulse scaled by the appropriate value and shifted to the appropriate place.

As we let the pulse width approach zero, the approximation becomes better and better, and in the limit it equals the original signal. Also, as the pulse width approaches zero, the summation approaches an integral, and the pulse approaches the unit impulse (Eq. 1).

In other words, we can represent any signal as an infinite sum of shifted and scaled unit impulses. A digital compact disc, for example, stores whole complex pieces of music as lots of simple numbers representing very short impulses, and then the CD player adds all the impulses back together one after another to recreate the complex musical waveform.

This no doubt seems like a lot of trouble to go to, just to get back the same signal that we originally started with, but in fact, we will very shortly be able to use Eq. 1 to perform a marvelous trick.

Linear Systems

A system or transform maps an input signal into an output signal, where the transform is a function from input signals to output signals.

Systems come in a wide variety of types. One important class is known as linear systems. To see whether a system is linear, we need to test whether it obeys certain rules that all linear systems obey. The two basic tests of linearity are homogeneity and additivity.

$F[i] = ?$

$F[i] = F[0]\,\delta[i] + F[1]\,\delta[i-1] + \dots$

$F[i] = \sum_u F[u]\,\delta[i-u]$

$T(F[i]) = \sum_u F[u]\,T(\delta[i-u])$

$G[i] = \sum_u F[u]\,H[i-u]$ where $H[i] = T(\delta[i])$, $G[i] = T(F[i])$

$G = F * H$
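The derivation above can be sketched in Python. This is a minimal illustration, not the lecture's code: the system `T` below is a hypothetical linear shift-invariant transform (each output is a weighted sum of a sample and its left neighbour, with made-up weights 2 and 3). Feeding it a unit impulse yields `H`, and convolving `F` with `H` should reproduce `T(F)`.

```python
def T(F):
    """Hypothetical LSI system: G[i] = 2*F[i] + 3*F[i-1], with zero padding."""
    padded = [0] + list(F) + [0]          # one zero on each side
    # One extra output sample so the response can spill past the right edge.
    return [2 * padded[i] + 3 * padded[i - 1] for i in range(1, len(padded))]


def convolve(F, H):
    """Full 1-D convolution: G[i] = sum_u F[u] * H[i - u]."""
    G = [0] * (len(F) + len(H) - 1)
    for u, f in enumerate(F):
        for k, h in enumerate(H):
            G[u + k] += f * h
    return G


F = [5, 4, 2, 3, 7, 4, 6, 5, 3, 6]
H = T([1])                 # impulse response: H[i] = T(delta[i])
G_direct = T(F)            # apply the system directly
G_conv = convolve(F, H)    # apply it as G = F * H
print(H, G_direct == G_conv)
```

Only the impulse response is needed to predict the output for any input, which is the point of writing the system as a convolution.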

Page 15:

Convolution

impulse response, filter, kernel

$F[i] = F[0]\,\delta[i] + F[1]\,\delta[i-1] + \dots$

$F[i] = \sum_u F[u]\,\delta[i-u]$

$T(F[i]) = \sum_u F[u]\,T(\delta[i-u])$

$G[i] = \sum_u F[u]\,H[i-u]$ where $H[i] = T(\delta[i])$, $G[i] = T(F[i])$

$G = F * H$

Page 16:

Example

F = 5 4 2 3 7 4 6 5 3 6

H = 1 2 3

G = F * H

Template

Deva Ramanan

January 20, 2015

$G[i] = F[i] * H[i] = \sum_u F[u]\,H[i-u]$

$= H[i] * F[i] = \sum_u H[u]\,F[i-u]$

Indices: F is defined on 0 to 9, H on 0 to 2

G[0] = ? G[1] = ?

Page 17:

Example

F = 5 4 2 3 7 4 6 5 3 6, H = 1 2 3, G = F * H

Flipped H: 3 2 1

G[0] = 5x1 = 5

G[1] = 5x2 + 4x1 = 14

G[2] = 5x3 + 4x2 + 2x1 = 25

…

Index axis: -3 -2 -1 0 1 2 3 4 5 6 7 8 9
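The worked example above can be reproduced with a short Python sketch. The function name `conv_full` is an illustrative choice; it evaluates the sum $G[i] = \sum_u F[u]\,H[i-u]$ literally, treating everything outside the arrays as zero.

```python
def conv_full(F, H):
    """Full 1-D convolution with implicit zero padding."""
    G = [0] * (len(F) + len(H) - 1)
    for i in range(len(G)):
        for u in range(len(F)):
            if 0 <= i - u < len(H):       # H is zero outside its support
                G[i] += F[u] * H[i - u]
    return G


F = [5, 4, 2, 3, 7, 4, 6, 5, 3, 6]
H = [1, 2, 3]
G = conv_full(F, H)
print(G[:3])   # [5, 14, 25]
```

The first three outputs match G[0], G[1] and G[2] on the slide; the remaining entries follow by sliding the flipped filter further to the right.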

Page 18:

Preview of 2D

f

h

Page 19:

Properties of convolution

Commutative: $F * H = H * F$

Associative: $(F * H) * G = F * (H * G)$

Distributive: $(F * G) + (H * G) = (F + H) * G$

Implies that we can efficiently implement complex operations

Powerful way to think about any image transformation that satisfies additivity, scaling, and shift-invariance
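A small Python check of the three properties, as a sketch only. The filters `box` and `edge` are hypothetical examples chosen for illustration. The associativity check shows the efficiency argument: two passes over the signal can be replaced by one pass with a precombined filter.

```python
def conv_full(a, b):
    """Full 1-D convolution of two lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


F = [5, 4, 2, 3, 7, 4, 6, 5, 3, 6]
box = [1, 1, 1]      # unnormalised 3-tap blur (illustrative)
edge = [-1, 0, 1]    # second illustrative filter, same length as box

# Commutative: F * H == H * F
commutes = conv_full(F, box) == conv_full(box, F)

# Associative: (F * box) * box == F * (box * box)
combined = conv_full(box, box)                     # [1, 2, 3, 2, 1]
two_passes = conv_full(conv_full(F, box), box)
one_pass = conv_full(F, combined)

# Distributive: (box * F) + (edge * F) == (box + edge) * F
summed = [b + e for b, e in zip(box, edge)]
lhs = [x + y for x, y in zip(conv_full(box, F), conv_full(edge, F))]
rhs = conv_full(summed, F)
```

Combining small filters first is usually cheaper than filtering a large image several times, since the combined filter is computed once and is small.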

Page 20:

Proof: commutativity

$H * F = \sum_u H[u]\,F[i-u] = \sum_{u'} H[i-u']\,F[u']$ where $u' = i - u$

$= \sum_u F[u]\,H[i-u] = F * H$

Conceptually wacky: allows us to interchange the filter and image

Page 21:

Size

Given F of length N and H of length M, what’s the size of G = F * H?

Page 22:

Size

Given F of length N and H of length M, what’s the size of G = F * H?

>> conv(F,H,'full')    % length N+M-1

>> conv(F,H,'valid')   % length N-M+1

>> conv(F,H,'same')    % length N
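The three output sizes can be illustrated with a Python sketch that computes the full convolution and then crops it. The `shape` argument mirrors the MATLAB option names; the cropping offsets are assumptions made here (in particular, the centring used for `'same'` is only unambiguous for odd-length H and may differ from MATLAB's convention for even lengths).

```python
def conv(F, H, shape="full"):
    """1-D convolution returning the 'full', 'same' or 'valid' part."""
    N, M = len(F), len(H)
    full = [0] * (N + M - 1)
    for u in range(N):
        for k in range(M):
            full[u + k] += F[u] * H[k]
    if shape == "full":
        return full                      # length N + M - 1
    if shape == "same":
        start = (M - 1) // 2             # assumed centring of the crop
        return full[start:start + N]     # length N
    if shape == "valid":
        return full[M - 1:N]             # H fully overlaps F: length N - M + 1
    raise ValueError(f"unknown shape: {shape}")


F = [5, 4, 2, 3, 7, 4, 6, 5, 3, 6]   # N = 10
H = [1, 2, 3]                        # M = 3
sizes = {s: len(conv(F, H, s)) for s in ("full", "same", "valid")}
print(sizes)   # {'full': 12, 'same': 10, 'valid': 8}
```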

Page 23:

A simpler approach

F = 5 4 2 3 7 4 6 5 3 6 (indices 0 to 9)

H = 1 2 3 (indices -1 to 1)

Template

Deva Ramanan

January 14, 2015

$G[i,j] = F \otimes H = \sum_u \sum_v H[u,v]\,F[i+u, j+v]$

$G[i,j] = F * H = \sum_u \sum_v H[u,v]\,F[i-u, j-v]$

Scan original F instead of flipped version. What’s the math?

Page 24:

(Cross) correlation

F = 5 4 2 3 7 4 6 5 3 6 (indices 0 to 9)

H = 1 2 3 (indices -1 to 1)

Scan original F instead of flipped version. What’s the math?

$F[i] \otimes H[i] = \sum_{u=-k}^{k} H[u]\,F[i+u]$
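A Python sketch of the correlation formula above, assuming an odd-length filter indexed from -k to k and zero padding outside F (both are choices made for this illustration). It also checks numerically that correlating with H gives the same result as convolving with the flipped filter.

```python
def correlate_same(F, H):
    """G[i] = sum_{u=-k..k} H[u] * F[i+u], zero-padded; len(H) = 2k+1."""
    k = len(H) // 2
    G = []
    for i in range(len(F)):
        total = 0
        for u in range(-k, k + 1):
            if 0 <= i + u < len(F):
                total += H[u + k] * F[i + u]   # H[u] is stored at list index u+k
        G.append(total)
    return G


def convolve_same(F, H):
    """G[i] = sum_{u=-k..k} H[u] * F[i-u], zero-padded; len(H) = 2k+1."""
    k = len(H) // 2
    G = []
    for i in range(len(F)):
        total = 0
        for u in range(-k, k + 1):
            if 0 <= i - u < len(F):
                total += H[u + k] * F[i - u]
        G.append(total)
    return G


F = [5, 4, 2, 3, 7, 4, 6, 5, 3, 6]
H = [1, 2, 3]                              # H[-1], H[0], H[1]
corr = correlate_same(F, H)
conv_flipped = convolve_same(F, H[::-1])   # convolution with the flipped filter
print(corr[:2])   # [22, 19]
```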

Page 25:

Properties

Associative and commutative properties do not hold

… but correlation is easier to think about

Page 26:

Convolution vs correlation (1-d)

Convolution:

$G[i] = F[i] * H[i] = \sum_u F[u]\,H[i-u]$

$= H[i] * F[i] = \sum_u H[u]\,F[i-u]$ (commutative property)

Cross-correlation:

$G[i] = F[i] \otimes H[i] = \sum_u H[u]\,F[i+u]$

$= F[i] * H[-i]$ (exercise for reader!)

Page 27:

2D correlation

0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0

0 0 0 90 90 90 90 90 0 0

0 0 0 90 90 90 90 90 0 0

0 0 0 90 90 90 90 90 0 0

0 0 0 90 0 90 90 90 0 0

0 0 0 90 90 90 90 90 0 0

0 0 0 0 0 0 0 0 0 0

0 0 90 0 0 0 0 0 0 0

0 0 0 0 0 0 0 0 0 0

0 10 20 30 30 30 20 10

0 20 40 60 60 60 40 20

0 30 60 90 90 90 60 30

0 30 50 80 80 90 60 30

0 30 50 80 80 90 60 30

0 20 30 50 50 60 40 20

10 20 30 30 30 30 20 10

10 10 10 0 0 0 0 0

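Image filtering: input f[.,.] is the 10 x 10 grid above, output g[.,.] is the 8 x 8 grid above, and the filter h[.,.] is a 3 x 3 box of ones with the outputs scaled by 1/9

$g[m,n] = \sum_{k,l} h[k,l]\,f[m+k, n+l]$

Credit: S. Seitz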

Gaussian filtering

A Gaussian kernel gives less weight to pixels further from the center of the window

This kernel is an approximation of a Gaussian function (applied to the same input f as above):

$H = \frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$

Slide by Steve Seitz

$G[i,j] = F \otimes H = \sum_{u=-k}^{k}\sum_{v=-k}^{k} H[u,v]\,F[i+u, j+v]$
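The 2-D correlation formula can be checked against the grids on this slide with a plain-Python sketch. The helper name `correlate2d_valid` is illustrative, and it only evaluates positions where the filter fits entirely inside the image (an assumption matching the 8 x 8 output shown). The box filter is applied as a sum and then divided by 9.

```python
f = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 0, 90, 90, 90, 0, 0],
    [0, 0, 0, 90, 90, 90, 90, 90, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 90, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]


def correlate2d_valid(F, H):
    """G[i][j] = sum_{u,v} H[u][v] * F[i+u][j+v] where H fits inside F."""
    k = len(H) // 2                      # H is (2k+1) x (2k+1)
    rows, cols = len(F), len(F[0])
    G = []
    for i in range(k, rows - k):
        row = []
        for j in range(k, cols - k):
            row.append(sum(H[u + k][v + k] * F[i + u][j + v]
                           for u in range(-k, k + 1)
                           for v in range(-k, k + 1)))
        G.append(row)
    return G


h = [[1] * 3 for _ in range(3)]                       # 3 x 3 box of ones
g = [[s // 9 for s in row] for row in correlate2d_valid(f, h)]
print(g[0])   # [0, 10, 20, 30, 30, 30, 20, 10]
```

Swapping `h` for the 1-2-1 kernel above (divided by 16) gives the Gaussian-weighted version of the same operation.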

Page 28:

Convolution vs correlation (2-d)

Convolution:

>> conv2(H,F)

$G[i,j] = F * H = \sum_u \sum_v H[u,v]\,F[i-u, j-v]$

Correlation:

>> filter2(H,F)

$G[i,j] = F \otimes H = \sum_u \sum_v H[u,v]\,F[i+u, j+v]$

Can we compute correlation with convolution?

[Figure: filter h overlaid on image f, once for convolution and once for correlation]

Page 29:

Border effects

Annoying details

What is the size of the output?

• MATLAB: filter2(g, f, shape)

• shape = 'full': output size is sum of sizes of f and g

• shape = 'same': output size is same as f

• shape = 'valid': output size is difference of sizes of f and g

[Figure: filter g placed at the corners of image f for the full, same, and valid cases]

Page 30:

Border padding

From Szeliski, Computer Vision, 2010

Page 31:

Examples of correlation

Linear filters: examples

Original, filtered with [1 1 1; 1 1 1; 1 1 1] = Blur (with a mean filter)

Source: D. Lowe

Page 32:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 0 1 0; 0 0 0]

Original → ?

Source: D. Lowe

Page 33:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 0 1 0; 0 0 0]

Original → Filtered (no change)

Source: D. Lowe

Page 34:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 0 0 1; 0 0 0]

Original → ?

Kernel: [0 0 0; 1 0 0; 0 0 0]

Original → Shifted left by 1 pixel

Source: D. Lowe

Page 35:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 1 0 0; 0 0 0]

Original → Shifted left by 1 pixel

What would this look like for convolution?

Source: D. Lowe

Page 36:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 0 0 1; 0 0 0]

Original → ?

Kernel: [0 0 0; 1 0 0; 0 0 0]

Original → Shifted left by 1 pixel

Source: D. Lowe

Page 37:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 0 0 1; 0 0 0]

Original → ?

Kernel: [0 0 0; 1 0 0; 0 0 0]

Original → Shifted left by 1 pixel

Source: D. Lowe

Page 38:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 0 1 0; 0 0 0]

Original → Filtered (no change)

Kernel: (1/16) [1 2 1; 2 4 2; 1 2 1]

What would this look like for convolution?

Source: D. Lowe

Page 39:

Examples of correlation

Practice with linear filters

Kernel: [0 0 0; 0 1 0; 0 0 0]

Original → Filtered (no change)

[0 0 0; 0 2 0; 0 0 0] - ?

[0 0 0; 0 1 0; 0 0 0] -

Source: D. Lowe

Page 40:

Examples of correlation

Sharpen filter (unsharp filter)

[0 0 0; 0 1 0; 0 0 0] + ( [0 0 0; 0 1 0; 0 0 0] - (1/16) [1 2 1; 2 4 2; 1 2 1] )

image + ( image - blurred image ), where [0 0 0; 0 1 0; 0 0 0] is the unit impulse (identity)

[Figure labels: Gaussian, scaled impulse, Laplacian of Gaussian]

Source: D. Lowe
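As a sketch of how the sharpen filter on this slide collapses into a single kernel, the Python below adds the unit impulse to (impulse minus Gaussian), using the 1-2-1 Gaussian approximation divided by 16 from earlier slides. Exact fractions are used so the entries can be read off directly; the variable names are illustrative.

```python
from fractions import Fraction

impulse = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
gauss = [[Fraction(w, 16) for w in row]
         for row in [[1, 2, 1], [2, 4, 2], [1, 2, 1]]]

# sharpen = impulse + (impulse - gauss) = 2 * impulse - gauss
sharpen = [[2 * impulse[r][c] - gauss[r][c] for c in range(3)]
           for r in range(3)]
total = sum(sum(row) for row in sharpen)

for row in sharpen:
    print([str(w) for w in row])
```

The weights of the combined kernel sum to 1, so a flat region passes through unchanged while differences from the local average are amplified.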

Page 41:

Examples

Image rotation

f[m,n] ⊗ h[m,n] = g[m,n], with h[m,n] = [? ? ?; ? ? ?; ? ? ?]

Can rotations be represented with a convolution? Are they linear shift-invariant (LSI) operations G[i,j] = T(F[i,j])?

It is linear, but not a spatially invariant operation. There is no convolution.

Page 42:

Derivative filters (correlation)

$\begin{bmatrix} -1 & 1 \end{bmatrix}$

$\begin{bmatrix} -1 \\ 1 \end{bmatrix}$

Page 43:

Question: what happens as we repeatedly convolve an image F with filter H?

F → F*H

Source: D. Lowe

Page 44:

Aside for the probability junkies: the PDF of the sum of two independent random variables is the convolution of their PDFs. Repeated convolutions => repeated sums => CLT
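A quick numerical illustration of this aside, as a Python sketch: a 3-tap box filter (an illustrative starting point, equivalent to a uniform distribution over three values) is convolved with itself repeatedly. Exact fractions keep the weights readable.

```python
from fractions import Fraction


def conv_full(a, b):
    """Full 1-D convolution of two lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


box = [Fraction(1, 3)] * 3
kernel = box
for _ in range(3):
    kernel = conv_full(kernel, box)   # box convolved with itself, 4 copies in total

print([str(w) for w in kernel])
```

After only a few rounds the weights are symmetric, peak in the middle, and fall off smoothly toward the ends, which is the bell shape the central limit theorem predicts.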

Page 45:

Gaussian

$\frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$

Page 46:

Gaussian filters

σ = 1 pixel, σ = 5 pixels, σ = 10 pixels, σ = 30 pixels

Page 47:

Implementation

Gaussian Kernel

• Standard deviation σ: determines extent of smoothing

σ = 2 with 30 x 30 kernel

σ = 5 with 30 x 30 kernel

Matlab: >> G = fspecial('gaussian',HSIZE,SIGMA)

$\frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$

Source: K. Grauman

Page 48:

Finite-support filters

Choosing kernel width

• The Gaussian function has infinite support, but discrete filters use finite kernels

What should HSIZE be?

Source: K. Grauman

Page 49:

Rule-of-thumb

Set radius of filter to be 3 sigma
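The rule of thumb can be sketched in Python as follows. The function name `gaussian_kernel_1d` and the choice to round the radius up to an integer are assumptions made for this illustration; the weights are renormalised so the truncated kernel still sums to 1.

```python
import math


def gaussian_kernel_1d(sigma):
    """Sampled, truncated Gaussian with radius ceil(3 * sigma)."""
    radius = math.ceil(3 * sigma)                  # rule of thumb: 3 sigma
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]            # normalise to sum to 1


k = gaussian_kernel_1d(2.0)
print(len(k))   # 13, i.e. 2 * 6 + 1 taps for sigma = 2
```

Since a 2-D Gaussian is separable, a 2-D kernel can be formed as the outer product of this 1-D kernel with itself.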

Page 50:

Useful representation: Gaussian pyramid

Figure 1: Gaussian Pyramid. Depicted are four levels of the Gaussian pyramid, levels 0 to 3 presented from left to right.

Filter + subsample (to exploit redundancy in output)

Burt & Adelson 83: http://persci.mit.edu/pub_pdfs/pyramid83.pdf

[2] P.J. Burt. Fast filter transforms for image processing. Computer Graphics and Image Processing, 1981.

[3] P.J. Burt. Fast algorithms for estimating local image properties. Computer Graphics and Image Processing, 1983.

[4] P.J. Burt and E.H. Adelson. The Laplacian pyramid as a compact image code. IEEE Transactions on Communications, 31(4):532–540, April 1983.

[5] L.I. Larkin and P.J. Burt. Multi-resolution texture energy measures. In IEEE Conference on Computer Vision and Pattern Recognition, 1983.

Page 51:

Smoothing vs edge filters

Gaussian filters: σ = 1 pixel, σ = 5 pixels, σ = 10 pixels, σ = 30 pixels

How should filters behave on a flat region with value ‘v’?

Page 52:

Smoothing vs edge filters

Gaussian filters: σ = 1 pixel, σ = 5 pixels, σ = 10 pixels, σ = 30 pixels

How should filters behave on a flat region with value ‘v’?

Smoothing filters: output ‘v’, so $\sum_{ij} H[i,j] = 1$

Edge filters: output 0, so $\sum_{ij} H[i,j] = 0$
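A tiny Python sketch of these two conditions. On a constant image every term of the correlation sum sees the same value v, so the response reduces to v times the sum of the filter weights. The kernels used are the 1-2-1 Gaussian approximation and the [-1 1] derivative filter from earlier slides; the helper name is illustrative.

```python
from fractions import Fraction


def response_on_flat(H, v):
    """Correlation output at an interior pixel of a constant image of value v."""
    return sum(w * v for row in H for w in row)


smooth = [[Fraction(w, 16) for w in row]
          for row in [[1, 2, 1], [2, 4, 2], [1, 2, 1]]]   # weights sum to 1
edge = [[-1, 1]]                                          # weights sum to 0

print(response_on_flat(smooth, 90), response_on_flat(edge, 90))   # 90 0
```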

Page 53:

Template matching with filters

Template matching

Goal: find the template (an eye patch) in the image

Main challenge: What is a good similarity or distance measure between two patches?

• Correlation

• Zero-mean correlation

• Sum Square Difference

• Normalized Cross Correlation

Slide by Derek Hoiem

Can we use filtering to build detectors?

H[i,j] (template), F[i,j] (image)

Page 54:

Attempt 1: correlate with eye patch

Matching with filters

Goal: find the eye patch in the image

Method 0: filter the image with eye patch

$h[m,n] = \sum_{k,l} g[k,l]\,f[m+k, n+l]$, where f = image, g = filter

[Figure: Input, Filtered Image]

What went wrong?

Slide by Derek Hoiem

$G[i,j] = \sum_{u=-k}^{k}\sum_{v=-k}^{k} H[u,v]\,F[i+u, j+v]$

$= H^T F_{ij} = \|H\|\,\|F_{ij}\|\cos\theta, \quad H, F_{ij} \in \mathbb{R}^{(2k+1)^2}$

Page 55:

Attempt 1: correlate with eye patch

Matching with filters

Goal: find the eye patch in the image

Method 0: filter the image with eye patch

$h[m,n] = \sum_{k,l} g[k,l]\,f[m+k, n+l]$, where f = image, g = filter

[Figure: Input, Filtered Image]

What went wrong?

Slide by Derek Hoiem

$G[i,j] = \sum_{u=-k}^{k}\sum_{v=-k}^{k} H[u,v]\,F[i+u, j+v]$

$= H^T F_{ij} = \|H\|\,\|F_{ij}\|\cos\theta, \quad H, F_{ij} \in \mathbb{R}^{(2k+1)^2}$

Useful to think about correlation and convolution

[Figure: vectors H and F_ij with angle θ_ij between them]

Page 56:

Attempt 1.5: correlate with transformed eye patch

Let’s transform the filter such that its response on a flat region is 0

Page 57:

Attempt 1.5: correlate with zero-mean eye patch

Matching with filters

Goal: find the eye patch in the image

Method 1: filter the image with zero-mean eye

$h[m,n] = \sum_{k,l} (f[k,l] - \bar{f})\,g[m+k, n+l]$, where $\bar{f}$ is the mean of f (in this formula f is the template and g is the image)

[Figure: Input, Filtered Image (scaled), Thresholded Image, with true detections and false detections marked]

$G[i,j] = \sum_{u=-k}^{k}\sum_{v=-k}^{k} (H[u,v] - \bar{H})\,F[i+u, j+v]$

$= \sum_{u=-k}^{k}\sum_{v=-k}^{k} H[u,v]\,F[i+u, j+v] - \bar{H}\sum_{u=-k}^{k}\sum_{v=-k}^{k} F[i+u, j+v]$

Page 58:

Attempt 2: SSD

Matching with filters

Goal: find the eye patch in the image

Method 2: SSD

$h[m,n] = \sum_{k,l} (g[k,l] - f[m+k, n+l])^2$

[Figure: Input, 1 - sqrt(SSD), Thresholded Image, with true detections marked]

Can this be implemented with filtering?

$SSD[i,j] = \|H - F_{ij}\|^2$

$= (H - F_{ij})^T (H - F_{ij})$
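One way to answer the question, sketched in 1-D Python: expanding the product gives $\|H\|^2 - 2H^T F_{ij} + \|F_{ij}\|^2$, where the first term is a constant, the second is a correlation of the image with H, and the third is a correlation of the squared image with a filter of ones. The function names, the 0-based filter indexing, and the test signal are choices made for this illustration, not taken from the slides.

```python
def correlate_valid(F, H):
    """G[i] = sum_u H[u] * F[i+u] for every i where H fits inside F."""
    M = len(H)
    return [sum(H[u] * F[i + u] for u in range(M))
            for i in range(len(F) - M + 1)]


def ssd_direct(F, H):
    """Sum of squared differences between H and each window of F."""
    M = len(H)
    return [sum((H[u] - F[i + u]) ** 2 for u in range(M))
            for i in range(len(F) - M + 1)]


def ssd_by_filtering(F, H):
    """Same quantity built from two correlations and a constant."""
    h_energy = sum(h * h for h in H)                                # ||H||^2
    cross = correlate_valid(F, H)                                   # H^T F_ij
    f_energy = correlate_valid([f * f for f in F], [1] * len(H))    # ||F_ij||^2
    return [h_energy - 2 * c + e for c, e in zip(cross, f_energy)]


F = [5, 4, 2, 3, 7, 4, 6, 5, 3, 6]
H = [3, 7, 4]                       # template copied from F[3:6]
direct = ssd_direct(F, H)
filtered = ssd_by_filtering(F, H)
best = direct.index(min(direct))    # location of the best match
```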

Page 59:

Matching with filters

Goal: find the eye patch in the image

Method 2: SSD

$h[m,n] = \sum_{k,l} (g[k,l] - f[m+k, n+l])^2$

[Figure: Input, 1 - sqrt(SSD)]

What’s the potential downside of SSD?

What will SSD find here? (where eyes have been darkened by .5 scale factor)

SSD will fire on shirt

Slide by Derek Hoiem

Page 60:

Normalized cross correlation

Matching with filters

Goal: find the eye patch in the image

Method 3: Normalized cross-correlation

[Figure: Input, Normalized X-Correlation, Thresholded Image, with true detections marked]

Slide by Derek Hoiem

$NCC[i,j] = \frac{H^T F_{ij}}{\|H\|\,\|F_{ij}\|} = \frac{H^T F_{ij}}{\sqrt{H^T H}\,\sqrt{F_{ij}^T F_{ij}}} = \cos\theta$, where H, F are mean-centered

[Figure: vectors H and F_ij with angle θ_ij between them]
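A 1-D Python sketch of normalized cross-correlation, under assumptions made for this illustration: the template and each image window are mean-centered before the cosine is taken, flat windows (zero norm) are given a score of 0, and the test signal contains the template darkened by a factor of 0.5, echoing the darkened-eyes example.

```python
import math


def ncc(F, H):
    """Cosine of the angle between mean-centered H and each window of F."""
    M = len(H)
    h_mean = sum(H) / M
    Hc = [h - h_mean for h in H]
    h_norm = math.sqrt(sum(h * h for h in Hc))
    scores = []
    for i in range(len(F) - M + 1):
        patch = F[i:i + M]
        p_mean = sum(patch) / M
        Pc = [p - p_mean for p in patch]
        p_norm = math.sqrt(sum(p * p for p in Pc))
        if h_norm == 0 or p_norm == 0:
            scores.append(0.0)        # flat window: angle undefined, use 0 here
        else:
            scores.append(sum(h * p for h, p in zip(Hc, Pc)) / (h_norm * p_norm))
    return scores


H = [3, 7, 4]
F = [9, 9, 9, 1.5, 3.5, 2.0, 9, 9]   # 0.5 * H sits at index 3
scores = ncc(F, H)
best = max(range(len(scores)), key=scores.__getitem__)
```

Because the score depends only on the angle between the vectors, the darkened copy of the template still scores close to 1, whereas its SSD against the original template would be nonzero.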

Page 61:

Serge Belongie, CS 6670: Computer Vision, p. 8

image that returns the median value. Another example of a nonlinear filter is “Non-Local Means,” which we describe next.

In Non-Local Means, for every pixel p we look for patches elsewhere in the image that look similar to the patch surrounding p. We then average this set of patches to determine the filtered value of p.

One nice feature of NL-Means is that it is “edge preserving,” while other methods of smoothing/de-noising can result in blurry edges.

8.5. Looking ahead: modern applications of filter banks

The above approaches to filtering were largely hand designed. This is partly due to limitations in computing power and lack of access to large datasets in the 80s and 90s. In modern approaches to image recognition the convolution kernels/filtering operations are often learned from huge amounts of training data.

In 1998 Yann LeCun created a Convolutional Network (named “LeNet”) that could recognize hand-written digits using a sequence of filtering operations, subsampling and assorted nonlinearities, the parameters of which were learned via stochastic gradient descent on a large, labeled training set. Rather than hand selecting the filters to use, part of LeNet’s training was to pick for itself the most effective set of filters. Modern ConvNets use basically the same structure as LeNet but because of richer training sets and greater computing power we can recognize far more complex objects than handwritten digits (see, for example, GoogLeNet in 2014 and other submissions to ImageNet Large-Scale Visual Recognition Challenge).

Modern filter banks

Learn filters from training data to look for low, mid, and high-level features

Convolutional Neural Nets (CNNs) Lecun et al 98

Page 62:

A look back

• Any linear shift-invariant operation can be characterized by a convolution

• (Convolution) correlation intuitively corresponds to (flipped) matched-filters

• Derive filters by continuous operations (derivative, Gaussian, …)

• Contemporary application: convolutional neural networks

