Robust Statistics: Why do we use the norms we do?
Henrik Aanæs
IMM, DTU
A good general reference is: Robust Statistics: Theory and Methods, by Maronna, Martin and Yohai, Wiley Series in Probability and Statistics.
How Tall Are You?
Idea of Robust Statistics
To fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers. This should be done without a pre-processing segmentation of the data.
We thus now model our data set as consisting of inliers, which follow some distribution, and outliers, which do not.
Inliers Outliers
Outliers can be interesting too!
Line Example
[Figure: a point set with a line fit, illustrating inliers and outliers]
Robust Statistics in Computer Vision: Image Smoothing
Image by Frederico D'Almeida
Robust Statistics in Computer Vision: Image Smoothing
Robust Statistics in Computer Vision: Optical Flow
Play Sequence MIT BCS Perceptual Science Group. Demo by John Y. A. Wang.
Robust Statistics in Computer Vision: Tracking via View Geometry
Image 1 Image 2
Gaussian/Normal Distribution: The Distribution We Usually Use
Nice properties:
• Central Limit Theorem.
• Induces the 2-norm.
• Leads to linear computations.
But:
• Is fiercely influenced by outliers.
• Empirical distributions often have 'fatter' tails.
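The 'fiercely influenced' point is easy to demonstrate. A toy sketch (the height data is made up, echoing the "How tall are you?" slide): one gross outlier drags the mean arbitrarily far, while the median barely moves.

```python
import statistics

data = [1.70, 1.75, 1.80, 1.72, 1.78]   # heights in metres
contaminated = data + [17.5]            # one decimal-point typo

print(statistics.mean(data))            # 1.75
print(statistics.mean(contaminated))    # 4.375 -- ruined by one point
print(statistics.median(contaminated))  # 1.765 -- barely affected
```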
Gaussians Are Just Models Too
Alternative title of this talk
Error or ρ-functions: Converting from Model-Data Deviation to Objective Function
ρ-functions and ML: A Typical Way of Forming ρ-functions
ρ-functions and ML II: A Typical Way of Forming ρ-functions
Typical ρ-functions: Where the Robustness in Practice Comes From
• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-square
General idea: down-weight outliers, i.e. ρ(x) should be 'smaller' for large |x|.
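As a sketch, the five listed ρ-functions can be written down directly. The tuning constants c are common textbook choices, assumed here and not given on the slide; note how much less a large residual costs under the robust choices.

```python
def rho_l2(x):                     # 2-norm: induced by the Gaussian
    return x * x

def rho_l1(x):                     # 1-norm: corresponds to the median
    return abs(x)

def rho_huber(x, c=1.345):         # quadratic near 0, linear in the tails
    return x * x / 2 if abs(x) <= c else c * (abs(x) - c / 2)

def rho_trunc(x, c=2.0):           # truncated quadratic: flat beyond c
    return min(x * x, c * c)

def rho_bisquare(x, c=4.685):      # Tukey bi-square: smooth, flat beyond c
    if abs(x) >= c:
        return c * c / 6
    u = 1 - (x / c) ** 2
    return c * c / 6 * (1 - u ** 3)

# A residual of 10 costs 100 under the 2-norm, far less under the others:
for rho in (rho_l2, rho_l1, rho_huber, rho_trunc, rho_bisquare):
    print(rho.__name__, rho(0.5), rho(10.0))
```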
Typical ρ-functions: Where the Robustness in Practice Comes From
• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-square
2-norm:
• Induced by the Gaussian.
• Very non-robust.
• 'Standard' distribution.
Typical ρ-functions: Where the Robustness in Practice Comes From
• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-square
1-norm:
• Quite robust.
• Convex.
• Corresponds to the median.
The Median and the 1-Norm
The Median and the 1-Norm: Example with 2 Observations
The Median and the 1-Norm: Example with More Observations
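A small sketch (data made up) of why the 1-norm corresponds to the median: the objective Σᵢ |xᵢ − m| is minimized at the median of the observations, so a gross outlier cannot drag the estimate.

```python
# Brute-force scan of the 1-norm objective over candidate locations.
xs = [1.0, 2.0, 3.0, 7.0, 50.0]   # four plausible points, one outlier

def l1_cost(m):
    return sum(abs(x - m) for x in xs)

candidates = [i / 100 for i in range(0, 6001)]   # 0.00 .. 60.00
best = min(candidates, key=l1_cost)
print(best)   # 3.0 -- the median, unmoved by the outlier at 50
```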
Typical ρ-functions: Where the Robustness in Practice Comes From
• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-square
1-norm:
• Quite robust.
• Convex.
• Corresponds to the median.
Typical ρ-functions: Where the Robustness in Practice Comes From
• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-square
Huber norm:
• Mixture of the 1- and 2-norm.
• Convex.
• Has nice theoretical properties.
Typical ρ-functions: Where the Robustness in Practice Comes From
• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-square
Truncated quadratic:
• Discards outliers.
• For inliers works like the Gaussian.
• Has a discontinuous derivative.
Typical ρ-functions: Where the Robustness in Practice Comes From
• 2-norm
• 1-norm
• Huber norm
• Truncated quadratic
• Bi-square
Bi-square:
• Discards outliers.
• Smooth.
Quantifying Robustness: A Peek at Tools for Analysis
Bias vs. Variance
Quantifying Robustness: A Peek at Tools for Analysis

Efficiency of an estimator θ̂:

    e(θ̂) = (1/I(θ)) / var(θ̂),

where I(θ) is the Fisher information of the sample. In general e(θ̂) ≤ 1, so:

    var(θ̂) ≥ 1/I(θ).
Related to variance, on the previous slide
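A concrete instance of this bound (a standard fact, not from the slide): at the Gaussian the sample mean attains the Cramér-Rao bound, while the median has asymptotic efficiency 2/π ≈ 0.64. A Monte-Carlo sketch:

```python
import random
import statistics

# Estimate var(mean) and var(median) over repeated Gaussian samples;
# their ratio approximates the median's efficiency, which tends to 2/pi.
random.seed(0)
n, trials = 101, 2000
means, medians = [], []
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)
print(var_mean / var_median)   # roughly 0.64
```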
Quantifying Robustness: You Want to Be Robust over a Range of Models

For the model F = (1 − ε)N(0,1) + εN(0,10), the asymptotic variance of the Huber-norm location estimate with threshold k is:

    k    ε = 0   ε = 0.05  ε = 0.10
    0    1.571   1.722     1.897
    0.7  1.187   1.332     1.501
    1.0  1.107   1.263     1.443
    1.4  1.047   1.227     1.439
    1.7  1.010   1.233     1.479
    2.0  1.010   1.259     1.550
    ∞    1.000   5.950     10.900

From "Robust Statistics: Theory and Methods" by Maronna et al.
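The k = ∞ case (the plain mean, i.e. the 2-norm) can be checked by simulation, since the variance of the contaminated model is (1 − ε)·1 + ε·100, giving 1.000, 5.950 and 10.900 for the three ε values. A sketch (my own simulation, not from the book):

```python
import random
import statistics

# Sample from F = (1 - eps) N(0,1) + eps N(0,10) and print the
# empirical variance, which is what the mean's asymptotic variance tracks.
random.seed(1)

def draw(eps, n=100_000):
    # With probability eps draw from the wide N(0,10), else from N(0,1).
    return [random.gauss(0, 10 if random.random() < eps else 1)
            for _ in range(n)]

for eps in (0.0, 0.05, 0.10):
    print(eps, statistics.pvariance(draw(eps)))
```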
Quantifying Robustness: A Peek at Tools for Analysis

Other measures (similar):
• Breakdown point: how many outliers an estimator can handle and still give 'reasonable' results.
• Asymptotic bias: what bias an outlier imposes.
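An illustration of the breakdown point (the numbers are my own, not from the slides): the mean breaks with a single bad point, while the median withstands anything under 50% contamination.

```python
import statistics

clean = list(range(1, 101))                 # 100 well-behaved points

def contaminate(data, k, value=1e9):
    # Replace the last k points by a gross outlier value.
    return data[:-k] + [value] * k if k else list(data)

for k in (0, 1, 49, 50):
    bad = contaminate(clean, k)
    print(k, statistics.mean(bad), statistics.median(bad))
# The mean explodes already at k = 1; the median holds until k = 50.
```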
Back to Images: Here We Have Multiple 'Models'
To fit or describe the bulk of a data set well without being perturbed (influenced too much) by a small portion of outliers. This should be done without a pre-processing segmentation of the data.
Optimization Methods
Typical Approach:
1. Find initial estimate.
2. Use Non-linear optimization and/or EM-algorithm.
NB: In this course we have seen, and will see, other methods, e.g. with guaranteed convergence.
Hough Transform: One of the Oldest Robust Methods in 'Vision'
Often used for initial estimate.
Example from MATLAB help
Curse of dimensionality = PROBLEM
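The MATLAB demo is not reproduced here; as a hedged stand-in, a minimal Hough accumulator for lines could look like the sketch below. Each point votes for every (θ, ρ) bin consistent with ρ = x·cos θ + y·sin θ; the line shared by the inliers collects all their votes in one bin.

```python
import math

def hough_lines(points, n_theta=180, rho_res=1.0, max_rho=100.0):
    # Accumulator over theta (degrees) x rho (shifted so indices >= 0).
    n_rho = int(2 * max_rho / rho_res) + 1
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            r = int(round((rho + max_rho) / rho_res))
            if 0 <= r < n_rho:
                acc[t][r] += 1
    return acc

# 20 points on the line y = x (theta = 135 deg, rho = 0) plus an outlier;
# the outlier adds stray votes but cannot outvote the shared bin.
pts = [(i, i) for i in range(20)] + [(3, 15)]
acc = hough_lines(pts)
print(acc[135][100])   # 20 votes: all inliers, none from the outlier
```

Even this 2-parameter model already needs a 180 × 201 accumulator; the grid grows exponentially with the number of model parameters, which is exactly the curse of dimensionality the slide flags.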
RanSaC: Sampling in Hough Space, Better for Higher Dimensions
RANdom SAmple Consensus (RANSAC).

Iterate:
1. Draw a minimal sample.
2. Fit the model.
3. Evaluate the model by consensus.

In a Hough setting:
• Steps 1 and 2 correspond to finding a 'good' bin in Hough space.
• Step 3 corresponds to calculating the bin's value.
Run RanDemo.m
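RanDemo.m itself is not available here; as a hedged Python sketch of the three-step loop (sample, fit, evaluate by consensus) for line fitting, with illustrative tolerance and iteration count:

```python
import random

def ransac_line(points, n_iter=200, tol=0.5, seed=0):
    rng = random.Random(seed)
    best_model, best_consensus = None, -1
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)   # 1. minimal sample
        if x1 == x2:
            continue                                 # vertical; skip
        a = (y2 - y1) / (x2 - x1)                    # 2. fit model y = ax+b
        b = y1 - a * x1
        consensus = sum(abs(y - (a * x + b)) < tol   # 3. count consensus
                        for x, y in points)
        if consensus > best_consensus:
            best_model, best_consensus = (a, b), consensus
    return best_model, best_consensus

pts = [(x, 2.0 * x + 1.0) for x in range(20)]        # inliers on y = 2x+1
pts += [(5.0, 30.0), (12.0, -4.0), (3.0, 17.0)]      # gross outliers
(a, b), consensus = ransac_line(pts)
print(a, b, consensus)   # ~2.0, ~1.0, with all 20 inliers in consensus
```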
RanSaC: How Many Iterations?
Inliers, Outliers

Need to sample only inliers to 'succeed'. Naïve scheme: try all combinations, i.e. all

    N_Obs! / (N_Obs − N_Sample)!

E.g. for 100 points and a sample size of 7, this is 8.0678e+13 trials.

Preferred stopping scheme:
• Stop when there is, e.g., a 99% chance of having drawn an all-inlier sample.
• Chance of getting an inlier: N_In / (N_In + N_Out).
• Use the consensus of the best fit as an estimate of N_In.

See e.g. Hartley and Zisserman: "Multiple View Geometry".
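The stopping rule can be made concrete: if w = N_In / (N_In + N_Out) is the inlier fraction and s the minimal sample size, one draw is all-inlier with probability w^s, so n draws all fail with probability (1 − w^s)^n; requiring this to be at most 1 − p gives the trial count. A small sketch:

```python
import math

def ransac_trials(w, s, p=0.99):
    # Smallest n with (1 - w**s)**n <= 1 - p.
    return math.ceil(math.log(1 - p) / math.log(1 - w ** s))

print(ransac_trials(0.5, 7))   # 588 trials at 50% inliers, sample size 7
print(ransac_trials(0.9, 7))   # 8 trials when 90% are inliers
```

Compare 588 trials with the 8.0678e+13 of the naïve scheme above.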
Iteratively Reweighted Least Squares (IRLS)
EM-type or 'chicken and egg' optimization.
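A minimal sketch of the chicken-and-egg iteration for a location estimate with Huber weights (the threshold 1.345 is a conventional choice assumed here, not from the slides): the weights need the fit, the fit needs the weights, so we alternate.

```python
def irls_location(xs, c=1.345, n_iter=50):
    mu = sum(xs) / len(xs)                      # initial estimate: mean
    for _ in range(n_iter):
        # Huber weight: 1 for small residuals, c/|r| for large ones.
        w = [1.0 if abs(x - mu) <= c else c / abs(x - mu) for x in xs]
        # Weighted least-squares step for a location is a weighted mean.
        mu = sum(wi * x for wi, x in zip(w, xs)) / sum(w)
    return mu

xs = [1.8, 2.1, 1.9, 2.0, 2.2, 40.0]   # one gross outlier
print(irls_location(xs))                # close to 2, not dragged to ~8.3
```

Because the Huber ρ is convex, this iteration converges to the global minimum; for non-convex choices such as the bi-square, IRLS only finds a local minimum, which is why a good initial estimate (e.g. from Hough or RANSAC) matters.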