Corisco: Robust edgel-based orientation estimation for generic camera models
Nicolau L. Werneck, http://nic.hpavc.net
Supervisor: Profa. Dra. Anna Helena Reali Costa, Intelligent Techniques Laboratory (LTI), PCS, Poli
Universidade de São Paulo
Augsburg, Deutschland — 8/10/2015
Meta-introduction
Nicolau Werneck, Sc.D.
E.Eng. graduated from UFMG, Unicamp and USP.
Specialized in signal processing and pattern recognition, especially in computer vision and parameter estimation problems.
Ex-Google and ex-Geekie.
Wants to help robots help us.
Presentation adapted from my doctorate defense. Results were published in Werneck and Costa [2013].
Anthropic Environments
An anthropic environment is composed of straight lines parallel to the directions of a natural reference frame.
The orientation we find is a three-dimensional rotation between the natural and the camera reference frames.
Edgels and straight lines
Edgels are point samples over curves or (straight) lines.
Edgels have many applications.
Proposal
We propose a monocular vision method, named Corisco, that can estimate the orientation of a camera relative to an anthropic environment.
It is an evolution of existing edgel-based methods [Coughlan and Yuille, 2003].
Application examples:
- Guiding a mobile robot.
- Initial estimates for multi-view reconstruction.
- Object orientation estimation.
Contributions
Corisco has the following distinguishing features:
- Supports any possible camera model.
- Allows a trade-off between speed and precision.
- Avoids very costly operations (sin, atan, exp, log). Uses the function $x^{-1/2}$.
There are also no assumptions about the solution.
High-level view
Inputs: image, intrinsic parameters and control parameters.
Output: orientation Ψ (a three-dimensional rotation).
Geometry
The image direction v of a line passing through the point p depends on Ψ.
(Ψ, p) → v
http://youtu.be/PbUyvdnBIzs
Objective
Our question
What could be the best way to estimate the orientation of a camera in real time in an anthropic environment based on a single distorted image?
Edgels versus lines
Orientation can be found from lines (Caprile and Torre [1990], Cipolla et al. [1999], Rother [2002]). Line-based methods:
- Are more intuitive.
- Attain good precision.
On the other hand, they:
- Are restricted to perspective projection.
- Rely on complicated line extraction.
Edgels allow us to handle distortions and are easier to extract and sub-sample.
Edgel extraction
Similar to Canny edge detection.
Edges are local maxima of the gradient magnitude along the gradient direction.
[Figure: four panels showing the image, the x derivative, the y derivative, and the gradient magnitude.]
Edgel extraction
Sweep the image over a set of rows and columns that constitute a grid mask.
Each edge found produces an edgel. Its direction should be approximately orthogonal to the line swept.
[Figure: two panels showing the sampling grid and the extracted edgels.]
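The grid sweep can be sketched as below. This is a minimal pure-Python illustration, not Corisco's implementation: the image is assumed to be a list of rows of grey levels, derivatives are plain central differences, the peak test is done along the swept row only, and the names `extract_edgels`, `grid_spacing` and `threshold` are hypothetical. Only the row sweep is shown; columns would be swept the same way with x and y swapped.

```python
import math

def gradient(img, x, y):
    """Central-difference image derivatives (gx, gy) at pixel (x, y)."""
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    return gx, gy

def extract_edgels(img, grid_spacing, threshold=1.0):
    """Sweep every `grid_spacing`-th row; return edgels as (x, y, vx, vy)."""
    height, width = len(img), len(img[0])
    edgels = []
    for y in range(1, height - 1, grid_spacing):
        # Gradient and squared gradient magnitude along this row.
        grads = {x: gradient(img, x, y) for x in range(1, width - 1)}
        mag2 = {x: gx * gx + gy * gy for x, (gx, gy) in grads.items()}
        for x in range(2, width - 2):
            gx, gy = grads[x]
            is_peak = mag2[x] > mag2[x - 1] and mag2[x] >= mag2[x + 1]
            # Keep only edges roughly orthogonal to the swept row.
            if is_peak and mag2[x] > threshold ** 2 and abs(gx) > abs(gy):
                norm = math.sqrt(mag2[x])
                # The edgel direction v is orthogonal to the gradient.
                edgels.append((x, y, -gy / norm, gx / norm))
    return edgels
```

A larger `grid_spacing` visits fewer rows and so yields fewer edgels, which is the speed versus precision knob that the grid mask provides.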
Camera models
A camera model is a bijective map between the image points p and the directions q around the camera focal point.
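Camera model equations
Perspective:
$$p_x = (q_x / q_z) f + c_x, \qquad p_y = (q_y / q_z) f + c_y$$
Equirectangular (lat-lon):
$$p_x = f \tan^{-1}(q_z, q_x), \qquad p_y = f \sin^{-1}(q_y / |q|)$$
Harris (radial distortion):
$$g(x) = \frac{1}{\sqrt{1 - 2 \kappa x^2}}, \qquad p_x = p'_x \, g(|p'|) + c_x, \qquad p_y = p'_y \, g(|p'|) + c_y$$
Polar equidistant (fisheye):
$$\varphi = \cos^{-1}(q_z / |q|), \qquad p_x = \varphi \frac{q_x}{\sqrt{q_x^2 + q_y^2}} f + c_x, \qquad p_y = \varphi \frac{q_y}{\sqrt{q_x^2 + q_y^2}} f + c_y$$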
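The four models can be written down directly from the equations. The sketch below is illustrative only and rests on two assumptions that the slides do not settle: the two-argument arctangent is read as `atan2(q_x, q_z)` (zero longitude on the optical axis), and p' in the Harris model is taken to be the perspective projection before adding the principal point. Function names are hypothetical.

```python
import math

def perspective(q, f, cx, cy):
    qx, qy, qz = q
    return (qx / qz * f + cx, qy / qz * f + cy)

def equirectangular(q, f):
    qx, qy, qz = q
    norm = math.sqrt(qx * qx + qy * qy + qz * qz)
    # Assumption: longitude measured from the optical (z) axis.
    return (f * math.atan2(qx, qz), f * math.asin(qy / norm))

def harris(q, f, cx, cy, kappa):
    # Assumption: p' is the perspective projection without (cx, cy).
    px, py = perspective(q, f, 0.0, 0.0)
    r2 = px * px + py * py
    # Only valid while 1 - 2*kappa*r2 stays positive.
    g = 1.0 / math.sqrt(1.0 - 2.0 * kappa * r2)
    return (px * g + cx, py * g + cy)

def polar_equidistant(q, f, cx, cy):
    qx, qy, qz = q
    norm = math.sqrt(qx * qx + qy * qy + qz * qz)
    phi = math.acos(qz / norm)          # angle from the optical axis
    rho = math.sqrt(qx * qx + qy * qy)
    if rho == 0.0:                      # direction on the optical axis
        return (cx, cy)
    return (phi * qx / rho * f + cx, phi * qy / rho * f + cy)
```

With `kappa = 0` the Harris model reduces to the perspective model, which is a quick sanity check on any implementation.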
Edgel projection
A point $q_n$ from a line in the direction $r_k$ is projected on $p_n$, producing an edgel with direction
$$v_{nk} \propto J_n r_k \qquad (v \leftarrow (\Psi, p))$$
The projection Jacobian matrix $J_n$ depends on $p_n$. The direction orthogonal to $v_n$ is denoted $u_n$.
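As an illustration of the relation between the Jacobian and the predicted direction, the sketch below uses the perspective model, whose Jacobian follows from differentiating p = (q_x/q_z, q_y/q_z) f + c. Other camera models would need their own Jacobian. The function names are hypothetical, and the normalization uses the inverse square root, in the spirit of the contributions slide.

```python
def perspective_jacobian(q, f):
    """2x3 Jacobian of the perspective projection at direction q."""
    qx, qy, qz = q
    return [[f / qz, 0.0, -f * qx / qz ** 2],
            [0.0, f / qz, -f * qy / qz ** 2]]

def predicted_direction(jac, r):
    """Unit image direction v proportional to J r.

    Raises ZeroDivisionError if r projects to a single point (J r = 0).
    """
    vx = sum(j * c for j, c in zip(jac[0], r))
    vy = sum(j * c for j, c in zip(jac[1], r))
    inv_norm = (vx * vx + vy * vy) ** -0.5
    return (vx * inv_norm, vy * inv_norm)

def orthogonal(v):
    """Direction u orthogonal to v (a 90 degree rotation)."""
    return (-v[1], v[0])
```

For example, a line along the z axis seen at a point to the right of the image centre projects to a horizontal edgel, pointing at the vanishing point in the centre.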
Edgel normal vector
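Normal direction of the plane defined by the focal point and by an edgel, or by the corresponding environment line.
$$n_k = u_k^x J_k^x + u_k^y J_k^y$$
Here $J_k^x$ and $J_k^y$ are the rows of the projection Jacobian, so that $n_k \cdot r = 0$ exactly when $u_k$ is orthogonal to the projected direction $J_k r$.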
Objective function
The objective function is the heart of the method:
- It does not depend on the extraction technique.
- It determines the result.
- It guides the choice of the optimization algorithm.
The expression has the shape of a sum of errors obtained from each observation.
$$y_n = x_n - \hat{x}_n(\Psi)$$
$$F(\Psi) = \sum_n (y_n)^2$$
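Rotation matrix
An edgel direction prediction starts by finding the directions $r_k$ of the natural frame as a function of Ψ.
Corisco works with quaternions.
$$\Psi = (\Psi_a, \Psi_b, \Psi_c, \Psi_d), \qquad |\Psi| = 1$$
For a general orientation Ψ we have
$$R(\Psi) = \begin{bmatrix} r_x & r_y & r_z \end{bmatrix}^T = \begin{bmatrix}
\Psi_a^2 + \Psi_b^2 - \Psi_c^2 - \Psi_d^2 & 2\Psi_b\Psi_c + 2\Psi_a\Psi_d & 2\Psi_b\Psi_d - 2\Psi_a\Psi_c \\
2\Psi_b\Psi_c - 2\Psi_a\Psi_d & \Psi_a^2 - \Psi_b^2 + \Psi_c^2 - \Psi_d^2 & 2\Psi_c\Psi_d + 2\Psi_a\Psi_b \\
2\Psi_b\Psi_d + 2\Psi_a\Psi_c & 2\Psi_c\Psi_d - 2\Psi_a\Psi_b & \Psi_a^2 - \Psi_b^2 - \Psi_c^2 + \Psi_d^2
\end{bmatrix}$$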
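The matrix above translates directly into code. A minimal sketch, with the hypothetical name `rotation_matrix`; the rows of the result are the directions r_x, r_y, r_z.

```python
def rotation_matrix(psi):
    """Rows r_x, r_y, r_z of R(psi) for a unit quaternion psi = (a, b, c, d)."""
    a, b, c, d = psi
    return [
        [a * a + b * b - c * c - d * d, 2 * b * c + 2 * a * d, 2 * b * d - 2 * a * c],
        [2 * b * c - 2 * a * d, a * a - b * b + c * c - d * d, 2 * c * d + 2 * a * b],
        [2 * b * d + 2 * a * c, 2 * c * d - 2 * a * b, a * a - b * b - c * c + d * d],
    ]
```

Only products and sums are needed, so no trigonometric function is evaluated. The rows are orthonormal only when |Ψ| = 1, which is why the constraint has to be enforced during optimization.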
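Edgel residue
Given the $r_k$, we calculate the predicted $v_{nk}$ using $J_n$.
$$\Psi \to r_k \to v_{nk}, \qquad v_{nk} \propto J_n r_k$$
The residue compares:
- $v_n$ (or $u_n$), taken from the image;
- $v_{nk}$, predicted from Ψ and $p_n$.
Previous methods: $\angle v = \arctan(v_x, v_y)$, with residue $\angle v_n - \angle v_{nk}$.
Corisco: residue $u_n \cdot v_{nk}$.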
Error function
Comparison with existing edgel-based techniques:
                MAP            EM             M-estimation
Models          Probabilistic  Probabilistic  ρ(x)
Classes         4              4              3
Classification  ×              Iterative      Direct
[Figure: Tukey and quadratic functions.]
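Objective function expression
MAP estimation (Coughlan and Yuille [2003]):
$$F(\Psi) = -\sum_n \log\left( \sum_k p(c_n = k)\, p(\angle v_n \mid \Psi, p_n, c_n = k) \right)$$
EM estimation (Schindler and Dellaert [2004]):
$$F(\Psi) = -\sum_n \sum_k p(c_n = k) \log\left( p(\angle v_n \mid \Psi, p_n, c_n = k) \right)$$
$$F(\Psi) = \sum_n \sum_k p(c_n = k) (\angle v_n - \angle v_{nk})^2$$
Corisco:
$$F(\Psi) = \sum_n \min_k \rho(u_n \cdot v_{nk})$$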
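The Corisco expression can be sketched as below. Several choices here are assumptions made for illustration: ρ is taken as a Tukey biweight normalized to saturate at 1, the `scale` value is arbitrary, the Jacobian helper covers only the perspective model (up to a positive factor, which the normalization removes), and all names are hypothetical.

```python
def tukey(x, scale):
    """Tukey biweight, normalized so that rho = 1 for |x| >= scale."""
    if abs(x) >= scale:
        return 1.0
    t = 1.0 - (x / scale) ** 2
    return 1.0 - t ** 3

def perspective_jacobian_at(p, f, cx, cy):
    """Perspective Jacobian at image point p, up to a positive factor."""
    return [[f, 0.0, cx - p[0]], [0.0, f, cy - p[1]]]

def objective(edgels, directions, jacobian_at, scale=0.15):
    """F = sum_n min_k rho(u_n . v_nk).

    edgels: list of (p_n, u_n); directions: the three r_k (rows of R);
    jacobian_at: function mapping p_n to the 2x3 matrix J_n.
    """
    total = 0.0
    for p, u in edgels:
        jac = jacobian_at(p)
        best = 1.0  # rho never exceeds 1
        for r in directions:
            vx = sum(j * c for j, c in zip(jac[0], r))
            vy = sum(j * c for j, c in zip(jac[1], r))
            n2 = vx * vx + vy * vy
            if n2 == 0.0:
                continue  # r projects to a point here: no direction
            residue = (u[0] * vx + u[1] * vy) * n2 ** -0.5
            best = min(best, tukey(residue, scale))
        total += best
    return total
```

The `min` over k classifies each edgel directly, and an outlier contributes at most 1 to the sum, so no separate outlier class is needed.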
Optimization
First step: RANSAC, a stochastic search guided by data. Inherently inefficient and imprecise.
→ initial Ψ
Second step: FilterSQP, a continuous optimization, more efficient and precise than RANSAC.
→ final Ψ
RANSAC
Observation triplets are randomly selected. From each one we calculate a hypothetical Ψ, using the edgel normals $n_k$.
The Ψ with the smallest F(Ψ) is the initial estimate.
http://i.imgur.com/O9jP8tz.gifv
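One way to build a hypothesis from a triplet is sketched below. The construction is an assumption, since the slides do not spell it out: the first two normals are supposed to come from lines of the same direction, so that direction is their cross product, and the third normal fixes a second direction orthogonal to the first. The hypothesis is returned as a frame of three rows rather than as a quaternion, and `objective` is any callable that scores a frame. All names are hypothetical.

```python
import math
import random

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    inv = dot(a, a) ** -0.5
    return (a[0] * inv, a[1] * inv, a[2] * inv)

def hypothesis_from_triplet(n1, n2, n3, eps=1e-9):
    """Frame (r1, r2, r3) from three edgel normals, or None if degenerate."""
    r1 = cross(n1, n2)            # orthogonal to both normals
    if dot(r1, r1) < eps:
        return None
    r1 = normalize(r1)
    r2 = cross(n3, r1)            # orthogonal to n3 and to r1
    if dot(r2, r2) < eps:
        return None
    r2 = normalize(r2)
    return [r1, r2, cross(r1, r2)]

def ransac(normals, objective, iterations, rng):
    """Return the best-scoring frame and its objective value."""
    best, best_f = None, math.inf
    for _ in range(iterations):
        n1, n2, n3 = rng.sample(normals, 3)
        frame = hypothesis_from_triplet(n1, n2, n3)
        if frame is None:
            continue
        f = objective(frame)
        if f < best_f:
            best, best_f = frame, f
    return best, best_f
```

Most triplets violate the assumption and give poor frames. That is expected: the objective function discards them, and the number of iterations Cr controls how likely it is that at least one good triplet is drawn.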
FilterSQP
Minimize F(Ψ) over the 4D quaternion space, constrained to |Ψ| = 1.
This is a non-linear program, solved by SQP: Sequential Quadratic Programming.
FilterSQP (Fletcher and Leyffer [2002]) dispenses with penalty functions.
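For reference, the constrained problem and a generic SQP step can be written as follows. This is the textbook form of an equality-constrained SQP iteration, given here as an assumption about the setting and not taken from the slides; the symbols c, d and H are introduced only for this illustration, with H standing for an approximation of the Hessian of the Lagrangian. The filter mechanism that FilterSQP uses to accept or reject steps is not shown.

```latex
\min_{\Psi \in \mathbb{R}^4} F(\Psi)
\quad \text{subject to} \quad
c(\Psi) = \Psi_a^2 + \Psi_b^2 + \Psi_c^2 + \Psi_d^2 - 1 = 0

% Quadratic subproblem solved at the current iterate \Psi for the step d:
\min_{d \in \mathbb{R}^4} \; \nabla F(\Psi)^T d + \tfrac{1}{2} d^T H d
\quad \text{subject to} \quad
c(\Psi) + 2 \Psi^T d = 0
```

The linearized constraint uses the gradient of c, which is simply 2Ψ, so the unit-norm condition adds almost no cost to each step.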
Derivative
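Corisco calculates the derivatives with respect to Ψ in closed form.
$$\frac{\partial F}{\partial \Psi_a}(\Psi) = \sum_{n,k} K_{nk} \, \rho'(u_n \cdot v_{nk}) \left( u_n^x \frac{\partial v_{nk}^x}{\partial \Psi_a} + u_n^y \frac{\partial v_{nk}^y}{\partial \Psi_a} \right)$$
The derivatives of the directions $r_k$ are trivial:
$$\frac{\partial r_x}{\partial \Psi_a} = 2(\Psi_a, \Psi_d, -\Psi_c), \qquad \frac{\partial r_x}{\partial \Psi_b} = 2(\Psi_b, \Psi_c, \Psi_d), \qquad \cdots$$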
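The closed-form derivatives of r_x can be checked against finite differences. In the sketch below the first two rows of `dr_x` are the ones given on the slide; the last two are derived here from the first row of R(Ψ) in the same way and are not quoted from the source. The four quaternion components are treated as independent variables, and the function names are hypothetical.

```python
def r_x(psi):
    """First row of R(psi)."""
    a, b, c, d = psi
    return (a * a + b * b - c * c - d * d, 2 * b * c + 2 * a * d, 2 * b * d - 2 * a * c)

def dr_x(psi):
    """Closed-form partial derivatives of r_x with respect to a, b, c, d."""
    a, b, c, d = psi
    return [(2 * a, 2 * d, -2 * c),    # d r_x / d a (from the slide)
            (2 * b, 2 * c, 2 * d),     # d r_x / d b (from the slide)
            (-2 * c, 2 * b, -2 * a),   # d r_x / d c (derived here)
            (-2 * d, 2 * a, 2 * b)]    # d r_x / d d (derived here)

def numeric_dr_x(psi, i, h=1e-6):
    """Central finite difference of r_x along component i of psi."""
    plus, minus = list(psi), list(psi)
    plus[i] += h
    minus[i] -= h
    return tuple((p - m) / (2 * h) for p, m in zip(r_x(plus), r_x(minus)))
```

Because r_x is quadratic in Ψ, the central difference agrees with the closed form up to rounding error.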
Experiments
We performed 3 experiments to assess the performance of Corisco.
Each experiment used a different image set and a different method to obtain the reference orientations.
The observed error is the angle, in degrees, of the residual rotation between each estimate and its reference.
Corisco was executed with varying settings:
- grid size Cg,
- number of RANSAC iterations Cr.
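The error measure can be sketched as below, assuming that the residual rotation is R_est R_ref^T and that its displacement is the rotation angle recovered from the trace. The sketch ignores the symmetries of the natural frame (axis permutations and sign flips), which a full evaluation would have to resolve first. The function name is hypothetical.

```python
import math

def residual_rotation_angle(r_est, r_ref):
    """Angle in degrees of the residual rotation R_est R_ref^T."""
    # trace(A B^T) equals the sum of the element-wise products.
    trace = sum(r_est[i][j] * r_ref[i][j] for i in range(3) for j in range(3))
    # Clamp against rounding before taking the arc cosine.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))
```

The trigonometric call here belongs to the evaluation only; it is not part of the estimation loop.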
YorkUrbanDB
Images: 101 images from anthropic environments.
Model: Perspective.
Reference: Semi-automatic method based on lines.
Comparison: Methods studied by Denis et al. [2008].
Camera parameters, reference orientations and performance statistics provided by the authors.
YorkUrbanDB
                                         Error
Method                       Time [s]    Mean     σ        1/4      Median   3/4
EM Newton                    27+?        4.00°    1.00°    1.15°    2.61°    4.10°
MAP Quasi-Newton             6+?         4.00°    1.00°    1.32°    2.39°    4.07°
EM Quasi-Newton              1+?         9.00°    1.00°    4.04°    6.21°    10.33°
J-linkage                    1.13        8.23°    13.76°   1.14°    2.36°    4.44°
Corisco Cr = 10^4, Cg = 1    47.20       1.51°    3.26°    0.69°    1.09°    1.51°
Corisco Cr = 10^4, Cg = 4    16.68       1.71°    3.35°    0.72°    1.14°    1.64°
Corisco Cr = 10^4, Cg = 32   7.57        2.43°    4.03°    0.97°    1.54°    2.42°
Corisco Cr = 10^3, Cg = 1    8.12        1.70°    3.22°    0.70°    1.11°    1.77°
Corisco Cr = 10^3, Cg = 4    2.50        2.02°    3.86°    0.81°    1.24°    1.80°
Corisco Cr = 10^3, Cg = 32   0.99        2.44°    3.54°    1.00°    1.68°    2.64°
Corisco Cr = 200, Cg = 1     5.34        2.08°    3.38°    0.72°    1.22°    1.78°
Corisco Cr = 200, Cg = 4     1.89        3.27°    6.38°    0.85°    1.34°    2.35°
Corisco Cr = 200, Cg = 32    0.45        3.29°    4.99°    0.99°    1.72°    3.46°
ApaSt
Images: 24+24 building images.
Model: Perspective with radial distortion (Harris).
Reference: Bundler.
ApaSt
Bundler (Snavely et al. [2006]) was used to obtain the reference orientations and intrinsic parameters.
It is a point-based multi-view method.
ApaSt
[Figure: Corisco performance on ApaSt, Cr = 10000 RANSAC iterations. Panels: error distribution (error in degrees) and process duration (time in seconds, total and RANSAC), for grid spacings Cg from 1 to 128 pixels.]
[Figure: Corisco performance on ApaSt, Cr = 1000 RANSAC iterations. Same panels and grid spacings.]
StreetView
Images: 250 images from an urban environment.
Model: Equirectangular projection.
Reference: Physical sensors (IMU).
StreetView
Error ≈ 1°, time ≈ 17 s.
[Figure: estimated parameters and adjusted reference, four panels over the image sequence. Legend: estimate, reference.]
Conclusion
Corisco is a method with great potential for immediate application:
- Relatively simple implementation.
- Great robustness (distortions, RANSAC).
- Good performance.
We demonstrated the advantages of applying the grid mask and M-estimation to the problem.
We also demonstrated how to use FilterSQP in order to work with quaternions in a convenient way.
Future Work
- Automatic parameter control.
- Replace RANSAC.
- Apply to multi-view reconstruction, monocular SLAM, object tracking...
- Use directional filters to measure the angular error.
- Use the grid mask in other problems.
- Estimate orientation, camera parameters, and extract curves all in a single unified process. (MRF?)
References
B. Caprile and V. Torre. Using vanishing points for camera calibration. International Journal of Computer Vision, 4(2):127–139, March 1990. ISSN 0920-5691. doi: 10.1007/BF00127813. URL http://www.springerlink.com/content/k75077108473tm15/.
R. Cipolla, T. Drummond, and D. Robertson. Camera calibration from vanishing points in images of architectural scenes. In British Machine Vision Conference, volume 2, pages 382–391, Nottingham, England, 1999. BMVA.
James M. Coughlan and A. L. Yuille. Manhattan World: Orientation and outlier detection by Bayesian inference. Neural Computation, 15(5):1063–1088, March 2003. URL http://www.mitpressjournals.org/doi/abs/10.1162/089976603765202668.
Patrick Denis, James H. Elder, and Francisco J. Estrada. Efficient edge-based methods for estimating Manhattan frames in urban imagery. In European Conference on Computer Vision, pages 197–210, Marseille, France, 2008. Springer. doi: 10.1007/978-3-540-88688-4_15.
Roger Fletcher and Sven Leyffer. Nonlinear programming without a penalty function. Mathematical Programming, 91(2):239–269, January 2002. ISSN 0025-5610. doi: 10.1007/s101070100244. URL http://www.springerlink.com/content/qqj37x00y79ygdl8/.
Carsten Rother. A new approach to vanishing point detection in architectural environments. Image and Vision Computing, 20(9-10):647–655, 2002. doi: 10.1016/S0262-8856(02)00054-9. URL http://dx.doi.org/10.1016/S0262-8856(02)00054-9.
G. Schindler and F. Dellaert. Atlanta World: An expectation maximization framework for simultaneous low-level edge grouping and camera calibration in complex man-made environments. In Conference on Computer Vision and Pattern Recognition, pages 203–209, Washington, DC, USA, 2004. IEEE. ISBN 0-7695-2158-4. doi: 10.1109/CVPR.2004.1315033. URL http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1315033.
Noah Snavely, Steven M. Seitz, and Richard Szeliski. Photo Tourism: Exploring photo collections in 3D. In SIGGRAPH, pages 835–846, Boston, MA, USA, 2006. ACM. URL http://dl.acm.org/citation.cfm?id=1141964.
Nicolau Leal Werneck and Anna Helena Reali Costa. Corisco: Robust edgel-based orientation estimation for generic camera models. Image and Vision Computing, 31(12):969–981, 2013. URL http://nic.hpavc.net/almoxarifado/imavis2013-final.pdf.
Calibration test
Estimating focal distance with the same F(Ψ).
[Figure: Corisco-based calibration, objective function varying with the focal distance. Four panels: N8, all images; N8, individual images; a230, all images; a230, individual images. Axes: focal distance (600 to 1200) and objective function (translated). Legend: reference (Bundler), estimated (Corisco).]