High Level Computer Vision SS2015
Exercise 2: Object Identification (released on 8th May, due on 15th May; send your solution to [email protected], adding [hlcv] to the subject line)
High Level Computer Vision SS2015 | Tutorial for Exercise 2
Question 1: Image Representations, Histogram Distances
• normalized_hist.m : return the normalized histogram of pixel intensities for a gray image
‣ for a color image, remember to convert it first with rgb2gray
‣ hist.m (MATLAB built-in function) can be used here
‣ remember to normalize, so that the histogram sums to 1
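The exercise deliverable is the MATLAB function above; purely as an illustration of the computation, here is a NumPy sketch (the function name and default bin count are assumptions):

```python
import numpy as np

def normalized_hist(gray, num_bins=256):
    """Normalized histogram of pixel intensities for a gray image.

    `gray` is a 2D array with values in [0, 255];
    the returned histogram sums to 1.
    """
    hist, _ = np.histogram(gray, bins=num_bins, range=(0, 255))
    return hist / hist.sum()
```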
Color Histograms: rgb_hist.m, rg_hist.m
• rgb_hist.m :
‣ Compute the 3D histogram: H(R, G, B) = #(pixels with color (R, G, B))
‣ Normalize H(R, G, B), then return it as a vector of size (num_bins)^3
• rg_hist.m :
‣ Instead of the R, G, B values, use the chromatic representation
‣ Use only r and g to build a histogram of size (num_bins)^2
‣ Similarly, normalize and return it as a vector
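The exercise is in MATLAB; as a sketch of the idea in NumPy (the flattening order of the (num_bins)^3 vector is my assumption — any consistent order works, as long as model and query histograms use the same one):

```python
import numpy as np

def rgb_hist(img, num_bins):
    """3D color histogram H(R,G,B), returned as a normalized
    vector of length num_bins**3.

    `img` is an HxWx3 uint8 array with values in [0, 255].
    """
    # map each channel value to a bin index in [0, num_bins)
    bins = (img.astype(np.int64) * num_bins) // 256
    # flatten the (R,G,B) bin triple into a single index
    idx = (bins[..., 0] * num_bins + bins[..., 1]) * num_bins + bins[..., 2]
    hist = np.bincount(idx.ravel(), minlength=num_bins**3).astype(float)
    return hist / hist.sum()
```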
High Level Computer Vision - April 29 2015 82
Color Histograms
• Color statistics
‣ Given: tri-stimulus R, G, B for each pixel
‣ Compute the 3D histogram H(R, G, B) = #(pixels with color (R, G, B))
[Swain & Ballard, 1991]
Color
• One component of the 3D color space is intensity
‣ If a color vector is multiplied by a scalar, the intensity changes, but not the color itself.
‣ This means colors can be normalized by the intensity.
- Intensity is given by: I = R + G + B
‣ "Chromatic representation":

r = R / (R + G + B)
g = G / (R + G + B)
b = B / (R + G + B)
Color
• Observation:
‣ Since r + g + b = 1, only 2 parameters are necessary
‣ E.g. one can use r and g, and obtain b = 1 − r − g

r + g + b = 1 ⇒ b = 1 − r − g

[Figure: the chromaticity plane R + G + B = 1 inside the R, G, B color cube]
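The normalization above can be sketched in NumPy (the small epsilon guarding division by zero at black pixels is my addition):

```python
import numpy as np

def rg_chromaticity(img):
    """Chromatic representation: r = R/(R+G+B), g = G/(R+G+B).

    Returns (r, g); b is redundant since b = 1 - r - g.
    `img` is an HxWx3 array; an epsilon guards black pixels.
    """
    rgb = img.astype(float)
    intensity = rgb.sum(axis=-1) + 1e-10  # I = R + G + B
    r = rgb[..., 0] / intensity
    g = rgb[..., 1] / intensity
    return r, g
```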
Histogram of Gaussian Partial Derivatives: dxdy_hist.m
• First compute the Gaussian partial derivatives in the x and y directions
• How do we determine the numerical range for the histogram bins?
‣ As we learnt in Exercise 1, Dx can be obtained by first Gaussian filtering along the y axis, then Gaussian derivative filtering along the x axis
‣ Assuming σ = 6.0 for the Gaussian here, convolving an image in the extreme case yields a maximum response of ~33.5420
‣ Therefore distributing the histogram bins within [-34, 34] might be a good idea

[Figure: a gray image and its Dx and Dy filter responses]
• Since we have to assign both the Dx and Dy values to bins, the resulting histogram has size (num_bins)^2
• Please still remember to normalize, so that the histogram sums to 1
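Assuming the Dx and Dy responses have already been computed (as in Exercise 1), the joint binning could be sketched in NumPy like this (the MATLAB exercise file is the actual deliverable):

```python
import numpy as np

def dxdy_hist(dx, dy, num_bins, lim=34.0):
    """Joint histogram of Gaussian partial derivatives Dx and Dy.

    Both responses are binned over [-lim, lim]; the result is a
    normalized vector of length num_bins**2.
    """
    hist, _, _ = np.histogram2d(dx.ravel(), dy.ravel(),
                                bins=num_bins,
                                range=[[-lim, lim], [-lim, lim]])
    hist = hist.ravel()
    return hist / hist.sum()
```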
Histogram Distances: dist_intersect.m, dist_l2.m, dist_chi2.m
• dist_intersect.m :
‣ the common part between histograms (intersection)
• dist_l2.m :
‣ Euclidean distance
• dist_chi2.m :
‣ chi-square distance
• Please check pages 88, 89 and 90 of the lecture slides CV-SS15-04-29-filtering-instance for their properties.

Histogram Comparison
• Comparison measures
‣ Intersection: ∩(Q, V) = Σ_i min(q_i, v_i)
• Motivation
‣ Measures the common part of both histograms
‣ Range: [0, 1] for normalized histograms
‣ For unnormalized histograms, use the formula given on slide 88
Histogram Comparison
• Comparison measures
‣ Euclidean distance
• Motivation
‣ Focuses on the differences between the histograms
‣ Range: [0, ∞]
‣ All cells are weighted equally
‣ Not very discriminant
Histogram Comparison
• Comparison measures
‣ Chi-square: χ²(Q, V) = Σ_i (q_i − v_i)² / (q_i + v_i)
• Motivation
‣ Statistical background:
- tests whether two distributions are different
- possible to compute a significance score
‣ Range: [0, ∞]
‣ Cells are not weighted equally!
- therefore more discriminant
- may have problems with outliers (therefore assume that each cell contains at least a minimum number of samples)
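The three MATLAB distance functions could be sketched in NumPy as below. Note one assumption: whether dist_intersect returns the similarity or 1 − similarity is a convention choice; here I use 1 − ∩ so that smaller always means more similar, but check what the exercise expects.

```python
import numpy as np

def dist_intersect(q, v):
    """Histogram intersection turned into a distance in [0, 1]
    (for normalized histograms): 1 - sum_i min(q_i, v_i)."""
    return 1.0 - np.minimum(q, v).sum()

def dist_l2(q, v):
    """Euclidean distance between histograms. (The slides may use
    the squared version sum_i (q_i - v_i)^2; both rank identically.)"""
    return np.sqrt(((q - v) ** 2).sum())

def dist_chi2(q, v):
    """Chi-square distance: sum_i (q_i - v_i)^2 / (q_i + v_i),
    skipping empty cell pairs to avoid 0/0."""
    denom = q + v
    mask = denom > 0
    return (((q - v) ** 2)[mask] / denom[mask]).sum()
```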
Question 2: Object Identification
• find_best_match.m : query-by-example scenario
‣ Note that the model and query folders are arranged such that the ground-truth match of the i-th query image is the i-th model image
• Use the histogram and distance functions from Question 1 to find the matches of the query images.
• Rank the similarities of all model images w.r.t. the query images.
‣ i.e. build a distance matrix between all pairs of model and query images

[Figure: the model × query distance matrix]
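The step above can be sketched in NumPy (the MATLAB file is the deliverable; the return convention here — best model index per query plus the full distance matrix — is my assumption):

```python
import numpy as np

def find_best_match(model_hists, query_hists, dist_fn):
    """Distance matrix between all model/query histogram pairs and,
    per query, the index of the closest model.

    `model_hists`, `query_hists`: sequences of 1D histograms.
    Returns (best, D) with D[i, j] = dist_fn(model i, query j).
    """
    D = np.array([[dist_fn(m, q) for q in query_hists]
                  for m in model_hists])
    best = D.argmin(axis=0)  # best-matching model index per query
    return best, D
```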
Question 3: Precision and Recall
[http://en.wikipedia.org/wiki/Precision_and_recall]
The University of Texas at Austin, CS 395T, Spring 2008, Prof. William H. Press 3
A (binary) classifier classifies data points as + or −.
If we also know the true classification, the performance of the classifier is a 2x2 contingency table, in this application usually called a confusion matrix:

                actual +                   actual −
classified +    TP  (good!)                FP  (bad! Type I error)
classified −    FN  (bad! Type II error)   TN  (good!)

As we saw, this kind of table has many other uses: treatment vs. outcome, clinical test vs. diagnosis, etc.
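The table above can be sketched as a small helper (not part of the exercise files; labels are taken as booleans with True meaning '+'):

```python
import numpy as np

def confusion_matrix(actual, predicted):
    """2x2 confusion matrix [[TP, FP], [FN, TN]] for binary
    labels (True = '+')."""
    actual = np.asarray(actual, bool)
    predicted = np.asarray(predicted, bool)
    tp = np.sum(predicted & actual)    # classified +, actually +
    fp = np.sum(predicted & ~actual)   # Type I error
    fn = np.sum(~predicted & actual)   # Type II error
    tn = np.sum(~predicted & ~actual)
    return np.array([[tp, fp], [fn, tn]])
```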
[Some figures are from Prof. William H. Press, UT Austin]
Most classifiers have a “knob” or threshold that you can adjust: How certain do they have to be before they classify a “+”? To get more TP’s, you have to let in some FP’s!
Notice there is just one free parameter; think of it as TP, since
- FP(TP) = [given by algorithm]
- TP + FN = P (fixed number of actual positives, column marginal)
- FP + TN = N (fixed number of actual negatives, column marginal)
So all scalar measures of performance are functions of one free parameter (i.e., curves).
And the points on any such curve are in 1-to-1 correspondence with those on any other such curve.
If you ranked some classifiers by how good they are, you might get different rankings at different points on the scale.
On the other hand, one classifier might dominate another at all points on the scale.
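The threshold "knob" and the TP/FP trade-off can be sketched as follows (a toy helper, not part of the exercise; scores above the threshold are classified '+'):

```python
import numpy as np

def sweep_threshold(scores, actual, thresholds):
    """For each threshold, count TPs and FPs when classifying
    score >= threshold as '+'. Lowering the threshold (more
    liberal) admits more TPs but also more FPs."""
    actual = np.asarray(actual, bool)
    scores = np.asarray(scores)
    out = []
    for t in thresholds:
        pred = scores >= t
        tp = int(np.sum(pred & actual))
        fp = int(np.sum(pred & ~actual))
        out.append((tp, fp))
    return out
```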
[Figure: cartoon (not literal) of four classifiers, from more conservative to more liberal, each shown with its TP/FP/FN/TN counts; the threshold sets how certain the classifier has to be before it classifies a "+".]
From the confusion matrix:
‣ true positive rate (TPR) ≡ sensitivity ≡ recall = TP / (TP + FN)
‣ positive predictive value (PPV) ≡ precision = TP / (TP + FP)
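The two definitions above as a minimal sketch (the zero-denominator fallbacks are my convention):

```python
def precision_recall(tp, fp, fn):
    """precision = TP/(TP+FP); recall (TPR) = TP/(TP+FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```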
precision-recall curve
Precision-Recall curves overcome this issue by comparing TP with FN and FP
prec = tpr*100 ./ (tpr*100 + fpr*9900);
prec(1) = prec(2);  % fix up 0/0
reca = tpr;
plot(reca, prec)
Continuing our toy example (P = 100, N = 9900): note that P and N now enter.

[Figure: precision-recall curve for the toy example; precision is never better than ~0.13, falling toward 0.01 at full recall.]

By the way, this "cliff" shape is what the ROC convexity constraint looks like in a precision-recall plot. It's not very intuitive.
plot_rpc.m : use 1) the distance matrix between model and query images and 2) different thresholds to plot the precision/recall curve.
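A NumPy sketch of the computation behind such a plot, using the arrangement from Question 2 (the ground-truth match of query j is model j); classifying a pair as a match when its distance falls below the threshold is my assumption about the convention:

```python
import numpy as np

def rpc_points(D, thresholds):
    """Precision/recall pairs from a model-by-query distance
    matrix D; pair (i, j) is predicted a match when D[i, j] < t.
    Ground truth: query j matches model j (diagonal)."""
    gt = np.eye(D.shape[0], D.shape[1], dtype=bool)  # true matches
    points = []
    for t in thresholds:
        pred = D < t
        tp = np.sum(pred & gt)
        fp = np.sum(pred & ~gt)
        fn = np.sum(~pred & gt)
        prec = tp / (tp + fp) if (tp + fp) else 1.0
        reca = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((reca, prec))
    return points
```

The (recall, precision) pairs can then be passed to any plotting routine.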