Segmentation. (template) matching - UVic.caaalbu/computer vision 2010/L20... · 2010-02-26 · 19...


Segmentation. (template) matching

Announcements

  Midterm March 3 2010, in class

  Duration: 45 min

  Weight: 20% of final course mark

  Closed book, closed notes

  Calculator allowed

  Practice midterm posted. Solution will be posted on Sunday.

  Midterm Review: March 2

  Office hours: March 2, 3-5 pm or by appointment

Reading

  6.4

  Adelson and Bergen, Image Pyramids

Template matching

  Assumes you know what you are looking for (supervised process)

Comparing neighborhoods to templates

  By linear filtering

Correlation can be considered as a dot product between two vectors: the pattern and the considered image region.

The dot product is maximal (maximum correlation) when the pattern is very similar to the corresponding image region, provided both vectors are normalized; otherwise bright regions also give large values.
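The dot-product view of correlation can be sketched as follows. This is a minimal illustration, not the method used in the course: it assumes NumPy, grayscale 2-D arrays, and zero-mean normalization of both vectors so that the score lies in [-1, 1]. The function names `normalized_cross_correlation` and `best_match` are hypothetical.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Slide the template over the image and return a map of scores in [-1, 1]."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()          # zero-mean template vector
    t_norm = np.linalg.norm(t)
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()        # zero-mean image-region vector
            denom = np.linalg.norm(p) * t_norm
            # A flat patch (or flat template) carries no pattern: score 0.
            scores[r, c] = (p * t).sum() / denom if denom > 0 else 0.0
    return scores

def best_match(image, template):
    """Return the top-left position of the highest score and the score itself."""
    scores = normalized_cross_correlation(image, template)
    r, c = np.unravel_index(np.argmax(scores), scores.shape)
    return (int(r), int(c)), float(scores[r, c])
```

The double loop is written for readability; a practical implementation would express the same sums with filtering operations.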

Optimality matching criterion evaluation

Challenge

  We need scaled representations because the details of interest can occur at various scales

A bar in the big images is a hair on the zebra’s nose; in smaller images, a stripe; in the smallest, the animal’s nose

Aliasing

  Can’t shrink an image by taking every second pixel

  If we do, characteristic errors appear
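A one-dimensional toy case shows the kind of error meant here. The sketch assumes NumPy; the alternating stripe signal and the 3-tap smoothing kernel are illustrative choices, not taken from the slides.

```python
import numpy as np

# A signal of one-pixel-wide stripes: 0, 1, 0, 1, ...
stripes = np.tile([0.0, 1.0], 8)

# Naive shrinking: keep every second sample. The stripes vanish entirely
# and the result looks like a uniformly dark signal.
naive = stripes[::2]

# Low-pass filter first (assumed 3-tap kernel), then subsample.
# Away from the border the result is the true local average, 0.5.
smoothed = np.convolve(stripes, [0.25, 0.5, 0.25], mode="same")
filtered = smoothed[::2]
```

The first filtered sample is affected by zero padding at the border; the remaining samples equal 0.5.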

Detecting a target pattern

  The target pattern may appear at any scale

  We want to use only convolutions

Construct copies of the target at several expanded scales, and convolve them with the original image

Detecting a target pattern (cont’d)

  Or maintain a fixed scale of the target and change the scale of the image

Detecting a target pattern

  Both approaches should give equivalent results

  The difference is in the computational complexity

  A convolution with the target pattern expanded in scale by a factor s requires $s^2$ times as many operations as the convolution with the image reduced in scale by s. Here s ranges from 2 to 32.

  A series of images at iteratively reduced scales will form a pyramid.

A Gaussian Pyramid

Levels of the Gaussian pyramid expanded to the size of the original image

How to construct a Gaussian pyramid

At each iteration:

  Filtering with a low-pass filter (e.g., a Gaussian with constant σ, or other)

  Subsampling

The filter weights form the correlation kernel. The same kernel is used to produce all levels in the pyramid. The kernel should be small and separable.

$G_l = \mathrm{Reduce}(G_{l-1})$
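The Reduce step can be sketched as below. This is an illustration under stated assumptions: NumPy, a 5-tap binomial kernel `[1, 4, 6, 4, 1] / 16` as the small separable low-pass filter (the slides do not fix the weights), edge replication at the borders, and hypothetical function names.

```python
import numpy as np

# Assumption: 5-tap binomial approximation to a Gaussian.
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(image):
    """Separable low-pass filter: filter along rows, then along columns."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    padded = np.pad(image, 2, mode="edge")   # replicate border pixels
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def reduce_level(image):
    """One pyramid iteration: low-pass filter, then keep every second pixel."""
    return blur(image)[::2, ::2]

def gaussian_pyramid(image, levels):
    """Return [G_0, G_1, ...] where each level is Reduce of the previous one."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid
```

Because the kernel is separable, each level costs two 1-D passes rather than one 2-D pass, which is why a small separable kernel is preferred.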

The Laplacian Pyramid

  A series of band-pass images

  Obtained by subtracting each Gaussian (low-pass) pyramid level from the next-lower level in the pyramid
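A sketch of this subtraction follows. The coarser level has to be brought back to the size of the finer one before subtracting; here that is done by pixel replication followed by smoothing, which is an assumption made for brevity rather than the interpolation a particular textbook prescribes. The kernel and all function names are likewise hypothetical.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # assumed low-pass kernel

def blur(image):
    """Separable low-pass filter with edge replication."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    padded = np.pad(image, 2, mode="edge")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def reduce_level(image):
    """Low-pass filter, then keep every second pixel."""
    return blur(image)[::2, ::2]

def expand(image, shape):
    """Upsample by pixel replication, crop to `shape`, then smooth."""
    up = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def laplacian_pyramid(image, levels):
    """Band-pass levels, followed by the coarsest low-pass level."""
    current = np.asarray(image, dtype=float)
    bands = []
    for _ in range(levels - 1):
        smaller = reduce_level(current)
        bands.append(current - expand(smaller, current.shape))
        current = smaller
    bands.append(current)
    return bands

def reconstruct(bands):
    """Invert the pyramid by adding each band back to the expanded coarser level."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        current = band + expand(current, band.shape)
    return current
```

Since each band stores exactly what the expand step fails to predict, `reconstruct` recovers the input up to floating-point rounding.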

Flexible templates

  Target might not be exactly the same in every image

  Idea: break the template into pieces and try to match each piece

  Position the entire template over the neighborhood, then search around the position of each subtemplate for the best match

  Overall match is the best combined match for all subtemplates

From B. Morse, http://morse.cs.byu.edu/650/
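The flexible-template idea can be sketched as follows. This is not taken from the cited notes: it assumes NumPy, a regular grid split of the template, a sum of squared differences as the per-piece cost (lower is better), and a square search window. The names `match_flexible`, `grid`, and `radius` are hypothetical.

```python
import numpy as np

def ssd(patch, piece):
    """Sum of squared differences between an image patch and a template piece."""
    return float(((patch - piece) ** 2).sum())

def match_flexible(image, template, top_left, grid=(2, 2), radius=2):
    """Place the template at `top_left`, then let each piece shift by up to
    `radius` pixels. Returns the combined cost and the shift of each piece."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    ph, pw = template.shape[0] // grid[0], template.shape[1] // grid[1]
    r0, c0 = top_left
    total, offsets = 0.0, []
    for gi in range(grid[0]):
        for gj in range(grid[1]):
            piece = template[gi * ph:(gi + 1) * ph, gj * pw:(gj + 1) * pw]
            nr, nc = r0 + gi * ph, c0 + gj * pw      # nominal piece position
            best_cost, best_shift = np.inf, (0, 0)
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    r, c = nr + dr, nc + dc
                    if (r < 0 or c < 0 or r + ph > image.shape[0]
                            or c + pw > image.shape[1]):
                        continue                      # window leaves the image
                    cost = ssd(image[r:r + ph, c:c + pw], piece)
                    if cost < best_cost:
                        best_cost, best_shift = cost, (dr, dc)
            total += best_cost
            offsets.append(best_shift)
    return total, offsets
```

Running `match_flexible` at every candidate `top_left` and keeping the lowest combined cost gives the overall match; a rigid template corresponds to `radius=0`.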

Evaluation issues in segmentation

  Reading 6.5

Evaluating segmentation techniques

  As in other areas of vision, evaluation is a problem

  We need to know what the correct result is

  We need some way to compare the result of each algorithm to the ideal situation

From Tony Pridmore’s Lecture Notes on Image Processing and Interpretation, University of Nottingham

Evaluating segmentation

  Possible approaches

    Ground truth: get a ‘correct’ segmentation and compare the results of the algorithm to it

    Evaluations based on region properties: we want the regions to be uniform, and adjacent regions to be different

    Evaluating robustness

      If we deliberately introduce noise or partially mask the object of interest, how will the segmentation result be affected?

Adapted from Tony Pridmore’s Lecture Notes on Image Processing and Interpretation, University of Nottingham

Ground truth segmentation

  Typically used in medical imaging applications

  Issue: human segmentations can vary significantly

  How do we build a ground truth segmentation from several human segmentations?

Statistical ground truth

Ground truth in other applications

  Experiment: segmenting an image by hand

  Human segmentation of complex scenes is subjective; it depends on visual representation among many other things

  Are human segmentations consistent?

Comparing image segmentations

  Suppose we have an agreed ground truth

  We need to compare two sets of regions

  What does it mean for two sets of regions to be similar?

    Is the number of regions important?

    Does it matter if two regions are merged or if one is split in two?

Ground truth partition: which result is better?

Segmentation of complex scenes

Current measures of similarity: region-based

  Applicable when there is only one region of interest in the image

  Region-based: Mutual overlap

  Limits

    Does not give any information about boundaries

    Conceals quality differences between segmentations

    Assumes a closed contour

    Large errors for small objects
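A region-based overlap measure can be sketched as below. The slides do not give the formula, so this assumes the common definition 2|A ∩ B| / (|A| + |B|) for two binary masks, using NumPy; the function name `mutual_overlap` is hypothetical.

```python
import numpy as np

def mutual_overlap(result, truth):
    """Overlap of two binary masks: 1.0 for identical masks, 0.0 for disjoint ones.
    Assumed formula: 2 * |A and B| / (|A| + |B|)."""
    result = np.asarray(result, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = result.sum() + truth.sum()
    if total == 0:
        return 1.0   # both masks empty: treat as perfect agreement
    return float(2.0 * np.logical_and(result, truth).sum() / total)

def square_mask(top, left, side, shape=(10, 10)):
    """Helper for the example: a filled square region in an empty image."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + side, left:left + side] = True
    return mask

# The same one-pixel diagonal shift, applied to a large and a small object.
large = mutual_overlap(square_mask(3, 3, 4), square_mask(2, 2, 4))
small = mutual_overlap(square_mask(3, 3, 2), square_mask(2, 2, 2))
```

With these masks the 4x4 object scores 0.5625 and the 2x2 object 0.25, which illustrates the last limit listed above: an identical displacement is penalized far more heavily for a small object.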

Current measures of similarity: border-based

Hausdorff distance

  Idea: consider the two contours as two finite sets of points

$$h(A,B) = \max_{a \in A} \left( \min_{b \in B} d(a,b) \right)$$

$$H(A,B) = \max\big( h(A,B),\, h(B,A) \big)$$
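The two formulas translate directly into code. The sketch assumes NumPy, contours given as arrays of 2-D points, and the Euclidean distance for d(a,b); the slides do not specify the point metric. Function names are hypothetical.

```python
import numpy as np

def directed_hausdorff(A, B):
    """h(A,B): the largest distance from a point of A to its nearest point of B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Pairwise Euclidean distances, shape (len(A), len(B)).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    return float(d.min(axis=1).max())

def hausdorff(A, B):
    """H(A,B): the symmetric distance, the larger of the two directed ones."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

contour_a = [(0, 0), (1, 0)]
contour_b = [(0, 0), (1, 0), (4, 0)]
```

Here h(A,B) is 0 because every point of A also lies in B, while h(B,A) is 3 because the point (4, 0) is 3 units from the nearest point of A. The directed distance is therefore not symmetric, which is why H takes the maximum of both.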

Unsupervised evaluation

Haralick and Shapiro:

  Regions should be uniform and homogeneous with respect to some characteristic(s)

  Adjacent regions should have significant differences with respect to the characteristic on which they are uniform

  Region interiors should be simple and without holes

  Boundaries should be simple, not ragged, and be spatially accurate
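The first two criteria can be turned into simple numbers without any ground truth. The sketch below is one possible reading, not a measure defined in the slides: it assumes NumPy, gray level as the characteristic, within-region variance as the uniformity score, and the difference of region means across 4-connected borders as the contrast score. Function names are hypothetical.

```python
import numpy as np

def region_uniformity(image, labels):
    """Size-weighted average of gray-level variance inside each region.
    Lower values mean more uniform regions."""
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for lab in np.unique(labels):
        values = image[labels == lab]
        total += values.size * values.var()
    return float(total / image.size)

def adjacent_contrast(image, labels):
    """Smallest difference in mean gray level between 4-adjacent regions.
    Higher values mean neighbouring regions are more clearly different."""
    image = np.asarray(image, dtype=float)
    labels = np.asarray(labels)
    means = {int(lab): image[labels == lab].mean() for lab in np.unique(labels)}
    pairs = set()
    # Compare each pixel with its right neighbour, then with the one below.
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        differ = a != b
        pairs.update(zip(a[differ].tolist(), b[differ].tolist()))
    if not pairs:
        return None   # a single region has no neighbours
    return float(min(abs(means[p] - means[q]) for p, q in pairs))

# Example: an image whose left half is dark and right half is bright.
image = np.hstack([np.full((4, 2), 10.0), np.full((4, 2), 50.0)])
good = np.hstack([np.zeros((4, 2), dtype=int), np.ones((4, 2), dtype=int)])
bad = np.vstack([np.zeros((2, 4), dtype=int), np.ones((2, 4), dtype=int)])
```

The labeling `good` follows the intensity edge and scores a uniformity of 0 with a contrast of 40; the labeling `bad` cuts across the edge and scores a uniformity of 400 with a contrast of 0.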
