BSc Informatica
Camera pose estimation with circular markers
Joris Stork
August 13, 2012
Supervisor: Rein van den Boomgaard
Signed:
Estimating the position of a fiducial marker relative to a robot up to a distance of 10 to 20 metres requires only a single camera in the one to two megapixel range, and a pattern produced on an A4 sheet of paper with a consumer grade inkjet printer. Pose estimation of this type can lower the overall cost of robot navigation systems. A review of related work shows that circular fiducial markers offer superior pose estimation characteristics compared to square fiducial markers, but that no up-to-date implementations based on circular markers are freely available in the public domain. This project implements a technique for estimating the pose of a circular marker. In 2004, Chen et al. described the technique in theoretical terms. Pagani et al. reported in 2011, in general terms, that they had implemented the technique with promising results. That implementation is, however, not freely available. Here, a new implementation of the technique is presented, and its performance is evaluated in detail. This thesis confirms that the technique proposed by Chen et al. does produce estimates that systematically approximate the position of a circular marker within certain parameters. With further development in mind, the system presented here promises to have useful applications in robotic navigation and task execution.
Contents
1 Introduction 5
  1.1 Motivation 5
  1.2 Objective 6
  1.3 Research question 6
2 Background 7
  2.1 Related work 7
  2.2 Choice of technique 10
3 Theory 14
  3.1 Oblique elliptical cone Q 14
  3.2 Oblique circular cone Qc 15
  3.3 Rotation R1 16
  3.4 Rotation R and translation t 16
  3.5 Centre C and normal N of circle 17
4 Implementation 19
  4.1 Framework 19
  4.2 Pose estimation pipeline 24
  4.3 Notes from debugging 29
5 Experiments 34
  5.1 Goal 34
  5.2 Method 34
  5.3 First experiment: real camera 35
  5.4 Second experiment: simulated 39
  5.5 Third experiment: simulated 42
6 Conclusion 56
  6.1 Assessment 56
  6.2 Further work 57
  6.3 Summary 58
  6.4 Acknowledgements 59
CHAPTER 1
Introduction
“The nature of God is a circle of which the center is everywhere andthe circumference is nowhere.” - Empedocles
The scientific literature describes practicable techniques to determine the poses of readily observable instances of ellipses. As used in this document, “pose” is a concept from the field of computer vision, denoting the position and orientation of an object relative to a given coordinate system.
This thesis assesses the feasibility of implementing one technique for the estimation of a circle’s pose from its image on a camera sensor.
The remainder of this section explains the motivation and aims behind this project, and formulates the research question. Chapter 2 places the research question in a broader context and justifies the choice of pose estimation technique. Chapter 3 explains the technique in theoretical terms, and chapter 4 presents an implementation of the technique. Three series of experiments to evaluate the implementation are detailed in chapter 5. By way of a conclusion, chapter 6 reviews and interprets the experimental findings, before suggesting areas for further work.
1.1 Motivation
The motivation for this project lies both in the general advantage of gaining a greater familiarity with the field of computer vision, and in the specific advantage of obtaining a potentially useful marker-based, single camera pose estimation system. As shown in section 2.1, the present landscape of such systems offers a limited range of free - as in both “free beer” and “free speech” - implementations with the characteristics needed for indoor and outdoor robotic navigation.

A robotic navigation system centred on a single, simple digital camera, possibly in combination with rudimentary obstacle avoidance sensors, such as ultrasound range finders, promises the advantages of low cost and low weight. In addition, location-dependent task information may be embedded in the data payload of markers deployed for a fiducial pose estimation system. A well designed combination of the pose and task components of such a system could represent a low cost robotic task execution framework that would be easy to deploy in a wide range of applications and environments. Given the active review of regulations governing the commercial use of robots in various jurisdictions, a very high level, easy to use yet flexible framework to design and deploy robotic task execution algorithms may soon become desirable.
Should the process of implementing a promising but non-proprietary visual pose estimation technique turn out to be successful, it would contribute to the construction of a robotic navigation and task execution system that the author has envisaged (c.f. section 2.2.1). Whether or not this project results in a useful implementation, it will have taught me a useful thing or two, namely about: concepts and techniques in computer vision; conducting research; implementing a mathematically formulated technique; building and conducting experiments; and writing a reasonably scientific report.
Last but not least, this thesis could partly satisfy the curiosity of anyone who wonders whether the technique in [6] is valid and whether it was indeed successfully implemented in [18] (refer to chapter 2 for an explanation of this question).
1.2 Objective
This thesis is constructed to achieve four successive goals. First, it should demonstrate a reasonable understanding of the chosen pose estimation technique: in other words, why should it work? Second, it should offer a reasonably functional and maintainable implementation of the technique. Third, insight into the performance of the implementation should be achieved through relevant experimentation. Finally, the performance data should be analysed to identify prospects for future improvements and applications.
1.3 Research question
Is it possible to implement the technique for visual pose estimation described in Chen et al.’s 2004 paper, [6], as a system with potential applications in robotic navigation?
CHAPTER 2
Background
The field of computer vision has produced dozens of pose estimation systems since the 1980s. The types of marker-based systems that are most useful for robotic navigation rely on so-called fiducial markers. Table 2.1 presents an overview of the better known fiducial marker based systems, collated in preparation for this thesis during a review of the relevant literature. Fiducial markers serve primarily to reveal information about the poses of objects in a scene, in contrast to “barcode style” visual markers, which are designed to work as general purpose data media in a scene. Section 2.1.1 describes some notable examples of the barcode-style marker.
2.1 Related work
2.1.1 Barcode-style markers
A few examples of the barcode type of marker will help the reader to distinguish these from fiducial markers. While some of these markers allow for a degree of pose estimation, their primary design goal is robust marker identification and marker data registration to improve on the one-dimensional barcode. The best known system is the QR Code, a proprietary design intended for direct part marking (DPM), which its creators, Denso Wave, released into the public domain, allowing its free use. The QR Code has found a wide range of applications, notably in non-industrial settings. The ECC200 Data Matrix is another system developed for DPM, specifically along assembly lines.[9] An example of a large, non-DPM deployment is the US Postal Service’s use of Maxicode[5] to track parcels in its distribution network.
2.1.2 Fiducial markers
Fiducial marker based systems have applications in various fields thatrequire reliable identification and relatively precise pose estimation.
These fields include Augmented Reality (AR), photogrammetry, medical imaging, and robotics. Table 2.1 lists notable fiducial marker based systems, along with, where possible, details of their respective geometries, source code availability, licensing, and functional characteristics. The best known pose estimation system that employs fiducial markers is ARToolkit.[12] ARToolkit has been used widely for research and commercial applications since the late 1990s, due to its relative ease of use and the availability of the application source code. However, ARToolkit has been superseded in terms of performance by more recent systems.
One can classify fiducial markers in terms of their geometry, which may consist of: square or generally rectilinear patterns; circular patterns; or dots. Different materials may be specified, too. For example: polychrome inks to increase information density; infrared reflective coatings to make the markers invisible to the naked eye; retro-reflective materials to enhance marker visibility under illumination from the camera’s position; and three-dimensional substrates to enlarge the field within which the camera can register the marker.
Most fiducial markers, including ARToolkit’s, carry little more data than is required to distinguish marker instances. Several marker specifications allow a choice of data size, to provide some flexibility in trading off information density against error correction and effective image size.
System | Geometry | Notes
AprilTag (2010) | square | open source / “fully open”; developed at U. of Michigan; similar to ARTag; claimed: good documentation; employs line detection.[17]
ARTag (2005) | square | GPL only; claimed: enhanced data encoding robustness due to implementation of checksums and “forward error correction (FEC)”; up to 2002 unique IDs; no need to store patterns; resilience against partial occlusion by “estimating border segments”.[9]
ARToolkit (2000) | square | GPLv2 and commercial license.[12]
ARToolkit+ (2007) | square | GPLv3; claimed improvements over ARToolkit: easier C++ API; reduced jitter; compensation for vignetting; auto thresholding.[28]
Aruco (2011) | square | BSD license; developed at U. of Cordoba; C++ library; claimed: only dependency is OpenCV; up to 1024 distinct markers.[8]
Cantag (2006) | n/a | GPLv2; developed at Cambridge U.; modular framework for comparing and developing fiducial system designs, including a simulation module; written in C++; no official release; source code not maintained since 2009.[20]
CyberCode (2000) | square | commercial; designed for consumer AR applications involving mobile phone cameras.[19]
Cho et al. (1998) | circular | unknown licensing / availability; multi-ringed, colour markers; theoretical research, implementation unavailable.[7]
Fourier Tag (2007) | circular | unknown licensing / availability; developed at McGill and Montreal U.s; uses Fourier transform and periodic patterns of circles; claimed: graceful degradation; designed for “human robot interaction”.[23]
HOM (2001) | square | Hoffman Marker; reportedly used by Siemens Corporate Research and Framatome ANP.[1, 29]
Homest (2010) | n/a | GPL; homography estimation library written in C and C++; does not find point correspondences; depends on the levmar library.[15]
Intersense IS-1200 | circular | commercial, patented; MS Windows only; closed source; claimed: 5 mm position error; 1° angular error; 150 Hz update rate; 6 DOF; “unlimited” markers.[11]
IGD marker (1998) | square | commercial; developed by the Fraunhofer IGD institute; performed well in AR competitions, pre-ARToolkit.
isotropic (2003) | circular | theoretical, implementation unavailable, licensing unknown; developed at U. of Taiwan.[25]
MFD-5 (2008) | square | licensing / source code status unknown; developed by Mark Fiala; deployed at ISMAR 2008.[13]
Nakazato et al. (2005) | square | unknown source, licensing status; retro-reflective material to make markers invisible except under an infrared flash; designed to be affixed to ceilings for detection by “wearable” AR systems.[16]
Photomodeler | circular | commercial; Eos Systems’ Photomodeler software package; circular markers named “coded targets”; photogrammetry applications.[10]
Ptrack (2006) | dots | proprietary, commercial; developed by the Fraunhofer IGD institute; infrared reflective markers.[22]
Runetag (2011) | circular | open source; license unknown; in development; claimed: strong occlusion resilience.[2]
SCR marker (2000) | square | unknown source code, license status; developed by Siemens Corporate Research.[30]
Studierstube (2008) | square | unknown source, licensing status; developed by Graz U. of Tech.; designed for low-power platforms, mobile phones; successor to ARToolkitPlus, replacing “dated” techniques; modular with six marker types, two pose estimators, three thresholding algorithms.[24]
SwisTrack (2011) | n/a | APL; modular (marker/markerless) tracking framework; requires OpenCV; written in C++.[14]
Uchiyama et al. (2011) | dots | [27]

Table 2.1: Fiducial marker systems
2.2 Choice of technique
2.2.1 Requirements
The motivation for this project (discussed in section 1.1) includes the prospect of building a system to enhance robot navigation and task execution. A marker based pose estimation mechanism with certain characteristics would be central to the envisaged system. Here follows a review of the characteristics in question (a few of these criteria are inspired by [9]), and, where appropriate, specifications that we can identify at this stage.
Pose
In this document, pose is taken to signify the coordinates of the marker centre and of the marker normal vector, in that order of decreasing priority, with respect to the camera’s pre-determined coordinate system (c.f. chapter 4).
Number of markers
A system that requires a single marker per pose estimate is preferred, as this would be beneficial in terms of: system complexity; visual footprint; ease of deployment; and cost.
Marker image size
This requirement combines factors such as sensor size, field of view, marker size, system range, and marker payload size. Taking the A4 format (210 × 297 mm) for its ubiquity in consumer printers, a square marker of 210 × 210 mm offers the largest marker size - minimising the restrictions on the combination of camera, payload and range - that remains practical to produce and use. Then, to minimise cost, the chosen sensor size is taken to be a relatively modest 1280 × 960, or just under 1.3 megapixels. Note that a larger sensor increases the cost of the image processing hardware as well as of the camera. A horizontal field of view of around 43° (resulting in a vertical field of view of around 33°) reflects a fair compromise between marker depth range and the system’s ability to spot markers at a greater angle from the optical axis, keeping in mind the kind of pre-positioning afforded by GPS navigation and odometry. With these restrictions in place, a detection range of one to 20 m and a registration and pose estimation range of one to 10 m impose minimum bounds on the marker image height of 17 pixels and 34 pixels respectively for those categories.
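The pixel bounds above follow from the standard pinhole projection model. The following back-of-envelope sketch reproduces them; the function names are our own, and a fronto-parallel marker is assumed:

```python
import math

def focal_length_px(width_px, hfov_deg):
    """Pinhole focal length in pixels, from sensor width and horizontal FOV."""
    return (width_px / 2) / math.tan(math.radians(hfov_deg) / 2)

def marker_height_px(marker_m, distance_m, f_px):
    """Projected height in pixels of a fronto-parallel marker at a distance."""
    return f_px * marker_m / distance_m

f_px = focal_length_px(1280, 43.0)                    # ~1625 px
print(round(marker_height_px(0.21, 20.0, f_px)))      # detection bound: 17 px
print(round(marker_height_px(0.21, 10.0, f_px)))      # pose estimation bound: 34 px
```

The 17- and 34-pixel figures quoted in the text correspond to the 20 m and 10 m range limits respectively.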
Occlusion
In a non-laboratory environment, objects are liable to partly or completely occlude the marker in the camera image. In outdoor scenarios, rain and snow could also cause a degree of marker occlusion. The pose estimation system should provide some resilience against partial occlusion.
Lighting and colour
Non-laboratory conditions include variability in environmental light intensity, direction and colour. To minimise the impact of these factors, and to otherwise reduce the cost and complexity of the markers, the implementation will involve bi-tonal, namely black and white, patterns rather than polychromatic patterns. The use of fewer colours implies a reduced data payload. This will have little impact on the intended system, as its data requirements extend to fewer than a hundred unique IDs plus error correction overhead. High-contrast black and white patterns additionally reduce the requirements relating to camera sensitivity and signal linearisation.[9]
False positive/negative rates
False positive and - to a lesser extent - false negative marker identification and registration could significantly impair the performance of a navigation and task execution system.[9] Unique characteristics of the marker patterns, such as ratios of the sizes of key pattern elements, and conscious data payload design should reduce the risk of false marker identification and registration to negligible levels.
Substrate shape
While a three-dimensional marker, such as the conic markers described in [26], offers a greater “working volume” in which the camera is able to register the marker, two dimensional markers of the format already specified are better suited to the intended application’s requirements for cost and ease of use.
Marker viewing angle
A maximum effective marker viewing angle of 60◦ from the markernormal is envisaged.
Data payload
As discussed above, the data payload requirement is for the number of identifiers envisaged - fewer than a hundred - in addition to the error correction overhead.
Speed and jitter
To allow for vehicle vibration and velocity, the sensor should be capable of taking sufficiently exposed images within a shutter time of one millisecond or less. The pose estimation pipeline should additionally provide updates within the kinds of bounds relating to throughput and jitter that enable reasonably responsive and consistent navigation.
2.2.2 Chen et al., 2004
A review of related work (c.f. section 2.1) reveals that, as Rice et al. conclude in [20],
“There are numerous digital marker systems and numerous tag designs but most are basically either a square or a circular shape. [. . . ] Our results demonstrate that square tags are beneficial for large data capacity and that circular tags are beneficial for location recovery due to the behaviour of a circle under perspective distortion.”,
and as Pagani et al. point out in [18]:
“[circular markers] are easier to detect and provide a poseestimate that is more robust to noise.”,
circular markers generally offer better pose estimation characteristics than square ones. However, the literature review has also shown that there are no tried and tested circular marker systems which do not require multiple markers and which are free to use and open source.
These insights dovetailed with a review of [6], which describes a method for camera calibration using the images of circles. The method was originally intended to facilitate eye tracking. Given two or more coplanar circles of unknown size, the method can derive the camera coordinates of the centres and normal vectors of the circles, as well as the camera’s focal length. If the camera’s intrinsic parameters are known, the pose of a single arbitrary circle of known radius can be recovered from its image. This last capability, and the performance of one implementation, as described in [18], indicates that the technique in [6] offers the best prospect for satisfying the requirements outlined in section 2.2.1. The technique additionally does not in principle contradict any of those requirements. As with other circular marker based systems, it is possible with current libraries to accurately fit an ellipse even if only a fraction of its length is visible, meaning that reasonably accurate pose estimates are possible for partially occluded marker images.
Enquiries with the authors of [6] and [18] to obtain a working implementation were unsuccessful. The former no longer possess useful source code related to that work. The latter, who built their implementation at the German Research Centre for Artificial Intelligence (DFKI), did not make their source code available. As a result, this thesis is directed at answering the question that is formulated in section 1.3.
CHAPTER 3
Theory
This chapter attempts to explain why the pose estimation method proposed in [6] should, in theory, work. We examine the geometry of a circle and its image in a camera, and how this can be used to derive equations for the rotation $R$ and translation $t$ relating the marker coordinate system and the camera coordinate system.
3.1 Oblique elliptical cone Q
Let us consider the oblique cone with the camera focal point at its apex and the elliptical image of the marker as its intersection with the image plane. In the camera coordinate system (CCS), we can express the ellipse as a general quadratic curve, with equation 3.1.
\[ Ax_e^2 + 2Bx_ey_e + Cy_e^2 + 2Dx_e + 2Ey_e + F = 0. \tag{3.1} \]
Describing a point on the curve with the vector $(x_e, y_e, 1)^T$, we can encode the parameters of the quadratic curve in matrix form, as:
\[
\begin{bmatrix} x_e & y_e & 1 \end{bmatrix}
\begin{bmatrix} A & B & D \\ B & C & E \\ D & E & F \end{bmatrix}
\begin{bmatrix} x_e \\ y_e \\ 1 \end{bmatrix} = 0. \tag{3.2}
\]
In the CCS, the image plane lies at $z = -f$, where $f$ is the camera’s focal length. Then, by multiplying the vector of a point on the ellipse by a scale factor $k$ directly related to the distance of the point to the cone’s apex, we can encode all points on the cone with equation 3.3.
\[ P = k \begin{bmatrix} x_e \\ y_e \\ -f \end{bmatrix}. \tag{3.3} \]
Now, to substitute the points on the ellipse, $(x_e, y_e, 1)^T$, with the points on the oblique cone, $P$, in equation 3.2, we must alter the matrix of parameters so that the resulting equation holds. The matrix becomes that in equation 3.4. Note that $Q$ is a symmetric matrix: this will enable the eigendecomposition of $Q$ later in this chapter.
\[
Q = \begin{bmatrix} A & B & -\frac{D}{f} \\ B & C & -\frac{E}{f} \\ -\frac{D}{f} & -\frac{E}{f} & \frac{F}{f^2} \end{bmatrix}. \tag{3.4}
\]

Equation 3.5 then holds, and encodes the oblique cone under consideration.
\[ P^TQP = 0. \tag{3.5} \]
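Equations 3.1–3.5 can be checked numerically. The sketch below is our own illustration, with arbitrary (hypothetical) ellipse coefficients: it builds $Q$ per equation 3.4, samples points on the ellipse, scales them along their rays per equation 3.3, and verifies that equation 3.5 holds everywhere on the cone.

```python
import numpy as np

# Hypothetical ellipse coefficients A..F for eq. 3.1, and an assumed focal length f.
A, B, C, D, E, F = 1.0, 0.2, 1.5, -0.3, 0.4, -2.0
f = 1.0

# Oblique elliptical cone matrix of equation 3.4.
Q = np.array([[A,      B,      -D / f],
              [B,      C,      -E / f],
              [-D / f, -E / f,  F / f**2]])

# For a chosen xe, eq. 3.1 is a quadratic in ye; take one real root.
for xe in (0.0, 0.3, -0.4):
    ye = np.roots([C, 2 * (B * xe + E), A * xe**2 + 2 * D * xe + F])[0].real
    for k in (1.0, 2.5, -3.0):               # arbitrary scalings along the ray (eq. 3.3)
        P = k * np.array([xe, ye, -f])
        assert abs(P @ Q @ P) < 1e-9         # eq. 3.5 holds for every cone point
print("all scaled ellipse points satisfy P^T Q P = 0")
```

The chosen coefficients satisfy $AC - B^2 > 0$, so the conic really is an ellipse with real points at the sampled abscissae.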
3.2 Oblique circular cone Qc
Let us now consider the oblique cone with the camera focal point at its apex and the marker circle as its intersection with the marker’s supporting plane. Let us call the orthonormal coordinate system that has as its Z-axis the unit normal vector originating in the centre $(x_0, y_0, z_0)^T$ of the marker circle the marker coordinate system (MCS). Then, taking $r$ as the radius of the circle of points $(x, y, z)^T$, equation 3.6 encodes the marker circle. Note that every point on the circle’s supporting plane satisfies $z = z_0$.
\[ (x - x_0)^2 + (y - y_0)^2 = r^2. \tag{3.6} \]
Note that equation 3.6 encodes the circle under consideration in this section, just as equation 3.1 encodes the ellipse under consideration in section 3.1 in the CCS. We can now encode the points $P_c = (x, y, z)^T$ on the oblique cone under consideration in this section analogously to equation 3.2, by substituting $P_c$ for the point vector and a matrix $Q_c$ for the parameter matrix, so that equation 3.6 holds for the oblique cone under consideration. We then have the matrix $Q_c$ in equation 3.7.
\[
Q_c = \begin{bmatrix} 1 & 0 & -\frac{x_0}{z_0} \\ 0 & 1 & -\frac{y_0}{z_0} \\ -\frac{x_0}{z_0} & -\frac{y_0}{z_0} & \frac{x_0^2 + y_0^2 - r^2}{z_0^2} \end{bmatrix}. \tag{3.7}
\]
Equation 3.8, describing the oblique cone under consideration in this section, then holds.
\[ P_c^TQ_cP_c = 0. \tag{3.8} \]
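As with the elliptical cone, equations 3.6–3.8 admit a quick numerical check. This sketch is our own, with a hypothetical circle: any point on the circle, scaled by any factor along its ray through the apex, satisfies equation 3.8.

```python
import numpy as np

# Hypothetical circle: centre (x0, y0) on the supporting plane z = z0, radius r.
x0, y0, z0, r = 0.4, -0.2, 5.0, 0.105

# Oblique circular cone matrix of equation 3.7.
Qc = np.array([[1.0,       0.0,       -x0 / z0],
               [0.0,       1.0,       -y0 / z0],
               [-x0 / z0,  -y0 / z0,  (x0**2 + y0**2 - r**2) / z0**2]])

# Every scaling k of a circle point lies on the cone (eq. 3.8).
for theta in np.linspace(0.0, 2 * np.pi, 7):
    for k in (1.0, 0.5, -2.0):
        Pc = k * np.array([x0 + r * np.cos(theta), y0 + r * np.sin(theta), z0])
        assert abs(Pc @ Qc @ Pc) < 1e-9
print("all scaled circle points satisfy Pc^T Qc Pc = 0")
```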
3.3 Rotation R1
The cones described in sections 3.1 and 3.2 are in fact the same cone, described with different orthonormal bases. The result is that a rotational transformation of the one cone (or, seen another way, of its basis) places all its points on the other. So a rotation - let us call it $R_1$ - relates the points $P$ and $P_c$. Encoding $R_1$ as a matrix, we obtain equation 3.9.
\[ P = R_1P_c. \tag{3.9} \]
3.4 Rotation R and translation t
Now, determining the unit normal vector and centre of the circle amounts to determining $R_1$ and $Q_c$. Let us isolate those terms, as follows.

From equations 3.5 and 3.8 we obtain equation 3.10.
\[ P^TQP = P_c^TQ_cP_c. \tag{3.10} \]
Then, substituting equation 3.9 in equation 3.10, we obtain (3.11).
\[
\begin{aligned}
(R_1P_c)^TQ(R_1P_c) &= P_c^TQ_cP_c \\
\Leftrightarrow \quad P_c^TR_1^TQR_1P_c &= P_c^TQ_cP_c \\
\Leftrightarrow \quad P_c^T(R_1^TQR_1 - Q_c)P_c &= 0.
\end{aligned} \tag{3.11}
\]

Given that, if $v^TAv = 0$ holds for every $n$-dimensional vector $v$, with $A$ a symmetric $n \times n$ matrix, then $A = 0$, we can derive equation 3.12 from equation 3.11.
\[ R_1^TQR_1 = Q_c. \tag{3.12} \]
Since, for any scalar $k_c$, $k_cQ_c$ encodes the same cone as $Q_c$, equation 3.12 can become equation 3.13.
\[ k_cR_1^TQR_1 = Q_c. \tag{3.13} \]
Now, given that $Q$ is a symmetric matrix, we can factorise it by eigendecomposition into the form in (3.14). $\Lambda$ is the $3 \times 3$ diagonal matrix of eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$. $V$ is the $3 \times 3$ matrix whose column vectors are $Q$’s eigenvectors.
\[
Q = V\Lambda V^T =
\begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix}
\begin{bmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{bmatrix}
\begin{bmatrix} v_1^T \\ v_2^T \\ v_3^T \end{bmatrix}, \tag{3.14}
\]
Substituting equation 3.14 into equation 3.13, we obtain equation 3.15.
\[ k_cR_1^TV\Lambda V^TR_1 = Q_c. \tag{3.15} \]

Let us take $R = V^TR_1 \Leftrightarrow R^T = R_1^TV$ for convenience, and substitute into equation 3.15 to obtain equation 3.16.
\[ k_cR^T\Lambda R = Q_c. \tag{3.16} \]
At this stage, Chen et al. appear to combine the derivations explained so far in this section with the general form of a three dimensional rotation (left handed, clockwise, with Euler angles $\phi$, $\theta$ and $\psi$),
\[
\begin{bmatrix}
\cos\theta\cos\psi & -\cos\phi\sin\psi + \sin\phi\sin\theta\cos\psi & \sin\phi\sin\psi + \cos\phi\sin\theta\cos\psi \\
\cos\theta\sin\psi & \cos\phi\cos\psi + \sin\phi\sin\theta\sin\psi & -\sin\phi\cos\psi + \cos\phi\sin\theta\sin\psi \\
-\sin\theta & \sin\phi\cos\theta & \cos\phi\cos\theta
\end{bmatrix},
\]
to arrive at the definition for the rotation $R$ shown in equation 3.17, and the definition for the translation $t$, between the MCS and CCS origins, shown in equation 3.18. Note that these definitions are ambiguous with respect to the undetermined signs $\{S_1, S_2, S_3\}$ and the as yet unknown rotation angle $\alpha$.
\[
R = V
\begin{bmatrix}
\sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}\cos\alpha & S_1\sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}\sin\alpha & S_2\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\
\sin\alpha & -S_1\cos\alpha & 0 \\
S_1S_2\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}}\cos\alpha & S_2\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}}\sin\alpha & -S_1\sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}
\end{bmatrix}, \tag{3.17}
\]

\[
t =
\begin{bmatrix}
-S_2S_3\,r\cos\alpha\sqrt{\frac{(\lambda_1-\lambda_2)(\lambda_2-\lambda_3)}{-\lambda_1\lambda_3}} \\
-S_1S_2S_3\,r\sin\alpha\sqrt{\frac{(\lambda_1-\lambda_2)(\lambda_2-\lambda_3)}{-\lambda_1\lambda_3}} \\
\frac{S_3\lambda_2 r}{\sqrt{-\lambda_1\lambda_3}}
\end{bmatrix}, \tag{3.18}
\]

where:
\[
\lambda_1\lambda_2 > 0, \qquad \lambda_1\lambda_3 < 0, \qquad |\lambda_1| \ge |\lambda_2|. \tag{3.19}
\]
3.5 Centre C and normal N of circle
Chen et al. derive equation 3.20 for the circle’s centre, which is the key equation for our pose estimation system, and 3.21 for the circle’s normal, in the CCS:
\[
C = z_0V
\begin{bmatrix}
S_2\frac{\lambda_3}{\lambda_2}\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\
0 \\
-S_1\frac{\lambda_1}{\lambda_2}\sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}
\end{bmatrix}, \tag{3.20}
\]

\[
N = V
\begin{bmatrix}
S_2\sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\
0 \\
-S_1\sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}
\end{bmatrix}, \tag{3.21}
\]

where
\[
z_0 = \frac{S_3\lambda_2 r}{\sqrt{-\lambda_1\lambda_3}}, \tag{3.22}
\]
and where λ1, λ2 and λ3 are ordered according to 3.19.
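The recipe of equations 3.4, 3.14 and 3.19–3.22 can be sketched in a few lines of Python with Numpy. This is a minimal illustration, not the thesis’s implementation: the function name is our own, the eigenvalue-ordering logic is one way of satisfying (3.19), and $S_3$ is fixed by the assumption that $z_0 > 0$ (marker in front of the camera), while the ambiguity in $S_1$ and $S_2$ is left to the caller by returning all candidate poses.

```python
import numpy as np

def circle_pose_candidates(conic, f, r):
    """Candidate (centre, normal) pairs for a circle of radius r whose image is
    the ellipse A*x^2 + 2B*x*y + C*y^2 + 2D*x + 2E*y + F = 0 (eq. 3.1)."""
    A, B, C, D, E, F = conic
    # Oblique elliptical cone, equation 3.4.
    Q = np.array([[A,      B,      -D / f],
                  [B,      C,      -E / f],
                  [-D / f, -E / f,  F / f**2]])
    lam, V = np.linalg.eigh(Q)               # Q is symmetric, c.f. eq. 3.14
    # Order eigenvalues per (3.19): lambda3 has the odd sign out,
    # and |lambda1| >= |lambda2| among the remaining pair.
    pos = [i for i in range(3) if lam[i] > 0]
    neg = [i for i in range(3) if lam[i] < 0]
    i3, pair = (pos[0], neg) if len(pos) == 1 else (neg[0], pos)
    i1, i2 = sorted(pair, key=lambda i: -abs(lam[i]))
    l1, l2, l3 = lam[i1], lam[i2], lam[i3]
    V = V[:, [i1, i2, i3]]
    g = np.sqrt((l2 - l3) / (l1 - l3))
    h = np.sqrt((l1 - l2) / (l1 - l3))
    candidates = []
    for S1 in (1.0, -1.0):
        for S2 in (1.0, -1.0):
            S3 = 1.0 if l2 * r > 0 else -1.0      # assumption: keep z0 > 0
            z0 = S3 * l2 * r / np.sqrt(-l1 * l3)  # equation 3.22
            centre = z0 * V @ np.array([S2 * (l3 / l2) * h,
                                        0.0,
                                        -S1 * (l1 / l2) * g])  # equation 3.20
            normal = V @ np.array([S2 * h, 0.0, -S1 * g])      # equation 3.21
            candidates.append((centre, normal))
    return candidates

# Sanity check: a circle of radius r facing the camera head-on at distance d
# images as the circle x^2 + y^2 = (f*r/d)^2, i.e. F = -(f*r/d)^2.
f, r, d = 1000.0, 0.105, 5.0
cands = circle_pose_candidates((1.0, 0.0, 1.0, 0.0, 0.0, -(f * r / d) ** 2), f, r)
best = min(cands, key=lambda cn: np.linalg.norm(cn[0] - [0.0, 0.0, d]))
print(np.round(best[0], 6))  # centre recovered at (0, 0, 5)
```

For this fronto-parallel test case one of the four candidates recovers the centre at distance $d$ on the optical axis, with the normal parallel to it; for a tilted circle the remaining twofold ambiguity must be resolved by other means.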
CHAPTER 4
Implementation
4.1 Framework
4.1.1 Marker
In its current form the application recognises instances of the marker model shown in figure 4.1. The thick black circular band against a white background provides two sharp circular contours that are separated widely enough to be discernible at a distance of up to 20 m, given the marker and camera parameters envisaged (c.f. section 2.2.1), but leaves a large enough enclosed white space to include a data payload pattern.
Figure 4.1: marker model
4.1.2 Language
The pose estimation system presented here is a command-line application written in Python v2.7. Choosing a language is partly a matter of taste. Yet Python does offer some concrete benefits, including mature implementations of essential libraries such as OpenCV for image processing, Numpy for numerical work, and PyOpenGL for generating 3D graphics - although the latter could do with better documentation. Python is interpreter driven, but CPU-intensive libraries such as the ones just mentioned internally consist of highly optimised C and C++ code. Development time is reduced thanks to the sparse syntax and untyped variable names; a large and friendly scientific programming community; and interactive interpreters, such as iPython, which facilitate tests on new snippets of code before these are included in the application. Finally, the Python language, or rather its interpreter, is cross-platform, although unfortunately it is not natively supported on any major mobile operating systems.

As shown in section 5, the use of an interpreted language slows the application down compared to, say, an implementation in C++. Still, the speedier development process and the consideration against premature optimisation combine to compensate for sluggish performance in early iterations. Python makes it easy to profile an application’s performance at a later stage, and thereafter to incorporate bindings for the most CPU intensive functions if these are re-implemented in C or C++.
4.1.3 Concurrency
In simulation mode the application structure invites a degree of concurrency, as it comprises a relatively separate simulator component that generates and then passes images to a pose estimation pipeline component. This structure motivates the separation of the simulation and pipeline components into separate processes handled by Python’s multiprocessing package. Unlike threading, the multiprocessing library spawns separate processes to circumvent Python’s Global Interpreter Lock, thus enabling parallel execution across multiple cores or machines. The result is a two-process application in simulation mode (c.f. figure 4.2). Synchronisation is achieved with the q2pipe and q2sim queues, which use pickle internally to transport objects, such as simulated camera parameters, Numpy arrays of image data, and synchronisation strings.
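The two-process queue structure can be illustrated with a minimal sketch. This is an illustration of the pattern, not the application’s actual code: the queue names q2pipe and q2sim follow the text, but the message format and the stand-in worker functions are invented for the example.

```python
import multiprocessing as mp

def simulator(q2pipe, q2sim):
    """Stand-in for the simulator process: emits frames and waits for an ack."""
    for frame_id in range(3):
        q2pipe.put(("frame", frame_id))      # a real frame would be a Numpy array
        assert q2sim.get() == ("ack", frame_id)
    q2pipe.put(("stop", None))               # synchronisation message

def pipeline(q2pipe, q2sim, results):
    """Stand-in for the pipeline process: consumes frames until told to stop."""
    while True:
        tag, payload = q2pipe.get()          # objects arrive pickled
        if tag == "stop":
            break
        results.put(payload)                 # a real pipeline would estimate pose here
        q2sim.put(("ack", payload))

if __name__ == "__main__":
    q2pipe, q2sim, results = mp.Queue(), mp.Queue(), mp.Queue()
    procs = [mp.Process(target=simulator, args=(q2pipe, q2sim)),
             mp.Process(target=pipeline, args=(q2pipe, q2sim, results))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print([results.get() for _ in range(3)])  # → [0, 1, 2]
```

Because multiprocessing queues pickle whatever is put on them, the same pattern carries camera parameter objects and image arrays just as readily as the synchronisation tuples shown here.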
4.1.4 A brief manual
The application is designed to be easy to install, use, debug, and maintain. This section provides basic instructions to get it up and running.
Installation
Please contact the author if you received this report without the accompanying source code. The code is packaged in a .tar.bz2 archive. The application’s main file, main.py, is located in the root of the unpacked directory structure. In addition to the included modules, the application requires the following Python libraries (in some cases lower or higher version numbers will work):
• Python 2.7 (interpreter and standard library)
[Figure 4.2 consists of two process diagrams. (a) Real camera: a single process in which the ieee 1394 module (pydc1394) passes images to the pipeline (pipeline modules), which passes contours, ellipses and estimates to the printer (outputs.printer) for plots and printouts. (b) Simulation: process 1 of 2 runs the simulator (simulator.*), which passes images to process 2 of 2, containing the pipeline (pipeline modules.*) and the printer.]

Figure 4.2: Application concurrency in simulation mode
• OpenCV 2.3.1 (with Python bindings)
• Matplotlib 1.1.1
• NumPy 1.6.2
• Python Imaging Library (PIL) 1.1.7
• PyOpenGL 3.0.1
Note that the syntax for udev rules, which are required on a range of Linux configurations in order to interface with the camera without root privileges, changed in version .17x, making the udev rules currently provided on many forums obsolete. The correct udev rule for recent systems is included in the application’s root directory.
Interface
On a typical POSIX compliant machine, the programme is executed from the application root directory with the command form python main.py [options]. The bash script in run provides some common use cases, and running main.py with -h or --help lists all the available command line options. These include: image source type; pipeline module selection; toggling module output windows; camera parameters; and execution time. For example, the command
python main.py -n 3 -s -1 -t 30
runs the application with the first three pipeline modules (the contour finder, ellipse fitter, and first pose estimator), in simulation mode with markers generated randomly within view, for 30 seconds. The last execution log is located in log, and error-level log entries are additionally written to /dev/stdout. All functions are documented with a single docstring.
To exit the application, send ctrl+c from the shell or press q in the simulation window (if present). One can safely ignore OpenGL buffer swap warnings at this stage.
4.1.5 Application modules
Overall initialisation is performed in main.py, whereas the pipeline module drives the pipeline process and contains the main programme loop. The CameraVals object is defined in the camera values module. In addition, the application contains the following package directories:
admin modules
The loginit and argparse modules handle logging and command lineoptions, respectively.
calibration
In order to derive pose estimates, the pipeline requires two values in addition to the image: the radius of the outer marker circle, and the camera's focal length. In simulation mode the focal length x and y components are chosen by editing the relevant values in the camera values module source code. When using a real camera, the focal length is obtained by calibrating the camera, a process that recovers the camera's intrinsic matrix. The calibrate module facilitates the calibration step.
calibrate discovers the camera's intrinsic matrix by searching for a best fit solution to the homography that relates the points on a "calibration rig", often a flat black-and-white chessboard pattern, to their correspondences in a set of images of the rig taken with the camera in question. The function used to find the point correspondences is OpenCV's findChessboardCorners. The resulting 20 sets of corresponding points are then passed to another OpenCV function, calibrateCamera, which returns estimates for the camera's intrinsic matrix, the extrinsic matrices, and the camera distortion coefficients. The calibration function performs best when provided with guesses for the camera's principal point coordinates and its focal length components. calibrate stores the intrinsic matrix containing the focal length in a NumPy binary file, intrinsic.npy. Whenever a pose estimation
object is created, it is initialised with the focal length stored in thatfile.
Note that OpenCV's findChessboardCorners function requires the CALIB_CB_FAST_CHECK flag to prevent it from searching for too long in images that do not feature the entire calibration pattern. A test with timeit shows a ratio of 14:1 between the execution times with and without this optimisation. Another, easily overlooked, requirement for the corner finding function is that the chessboard dimensions should be given in the order (height, width), as the alternative order results in incorrect calibration results.
marker
The markers module defines a marker class. Bitmap and vector graphicrepresentations of the marker model are included, e.g. for printing outmarkers. The dimensions text file contains the respective dimensionsof the marker circles.
output
A wrapper class for all data that needs to be transported between the various application components is defined in pipeline output. This facilitates communication between the simulation, pipeline and printing/plotting components. The wrapper includes methods for extracting data, such as arrays of coordinates for plots. Writing these wrapper objects to disk via pickle allows data from previous sessions to be re-used for plotting or for processing by additional pipeline modules.
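A minimal sketch of this wrapper-and-pickle pattern; the class layout and field names are illustrative, not the actual pipeline output definition, and an in-memory buffer stands in for the on-disk file the application writes:

```python
import io
import pickle

class PipelineOutput:
    """Stand-in for the wrapper class: carries per-frame pipeline data
    and offers extraction helpers for plotting."""
    def __init__(self, frame_id, contours, ellipses, estimates):
        self.frame_id = frame_id
        self.contours = contours
        self.ellipses = ellipses
        self.estimates = estimates

    def estimate_coordinates(self):
        """Array-of-coordinates view of the pose estimates, for plots."""
        return [e['C'] for e in self.estimates]

# Serialise a frame's output and replay it later, e.g. to re-plot a
# previous session or feed additional pipeline modules.
out = PipelineOutput(0, [], [], [{'C': (0.0, 0.0, 2.0)}])
buf = io.BytesIO()
pickle.dump(out, buf)
buf.seek(0)
replay = pickle.load(buf)
```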
The printer module combines various methods for printing outoverviews of pipeline data or for producing plots with matplotlib.
pipeline modules
The pipeline modules respectively define the contour finder, ellipse fitter and pose estimator classes. The payload decoding and marker identification modules are included as stub files for later implementation.
pydc1394
This package by Holger Rapp and others (see the readme file in the pydc1394 package directory) is a Python wrapper for Damien Douxchamps' libdc1394 library to interface IEEE 1394 cameras. The package has not been maintained for some time and has been modified for this project to make it compatible with the Point Grey Chameleon camera, used in experiments, and with the application.
simulator
The simulator module uses OpenGL to generate images of the 3D scene containing the marker model, as shown in figure 4.3.
The pose generator module produces the marker centre and vertex camera coordinates for each image frame. Note that the marker texture and drawing functions are defined in the markers module mentioned above.
Figure 4.3: Simulation frame
4.2 Pose estimation pipeline
This section describes the principal algorithms at each stage of the pose estimation pipeline. Chapter 3 places the equations referenced in this section into a theoretical context with regard to the technique described in [6]. Figure 4.6 shows an overview of the equations used throughout the pipeline.
4.2.1 Contours
The first processing stage normalises the brightness and increases the contrast in the image using OpenCV's equalizeHist function. Then, the image is blurred with OpenCV's GaussianBlur function using a 7×7 Gaussian kernel and a standard deviation of 1.4. These two stages combine to reduce noise and improve the likelihood that the images of circles both have a steep gradient and are uninterrupted. Now the module applies a Canny edge filter in the form of OpenCV's Canny function, which finds edges in the image as follows:
• the edge detection operator returns the first derivative of the intensity function in the horizontal and the vertical directions: respectively, Gx and Gy;
[Figure 4.4: Core pipeline functions. The diagram traces the pipeline from the image source (pydc1394/simulator), through contrast stretch, blur, edge detection and contour detection (contour), ellipse fitting (ellipse) and first pose estimation (posea), with camera calibration (calibrate) supplying the fx and fy estimates and the printer module producing prints and plots from the pose estimates and marker images. Final identification (identify, decode, poseb) and a navigation/task application are marked as future work. The pose estimates comprise Ci, Ni, Ri, ti, αi, data(i), i ∈ [0, nr. markers).]
• the edge gradient magnitude is then given by \(G = \sqrt{G_x^2 + G_y^2}\), the norm of the gradient vector, whose components are Gx and Gy;

• the angle of the edge gradient is given by \(\Theta = \arctan(G_y / G_x)\);
• each gradient angle is rounded to one of four values;
• non-maximum suppression (NMS) is carried out: the local maximum gradients are found for each of the four angles/directions. A maximum gradient is one that is greater than those on either side of the potential edge (source: OpenCV 2.4 documentation).
The OpenCV edge finding implementation employs hysteresis thresholding, which selects initial edge segments using a higher threshold value, and then links edges using a lower threshold value. John Canny, the inventor of the technique, recommended in [4] that the threshold values be set at a ratio between 2:1 and 3:1. The pipeline uses a lower threshold of 50 and a higher threshold of 150.
Finally, an array of contours is obtained from the binary image via OpenCV's findContours function. The pipeline receives each contour as an array of all the pixel coordinates that belong to the contour.
4.2.2 Ellipses
The ellipse module passes the set of contours produced by contour to OpenCV's fitEllipse function, which in turn produces the corresponding best-fit ellipses, represented in the form:
((x0, y0), (xmajor, yminor), α), (4.1)
where: (x0, y0) is the intersection of the ellipse's major and minor axes; xmajor and yminor are the lengths of the semimajor and semiminor axes of the ellipse, respectively; and α is the rotation of the semimajor axis from the x-axis. ellipse then filters the ellipses (see description below) before converting the representation from image coordinates to the pipeline's camera coordinate convention: z-axis along the optical axis, positive in the direction that the camera faces; y-axis positive in the upward direction when the camera is upright; and x-axis positive in the right hand direction when the camera is upright.
Ellipse filter
The contour module produces hundreds of contours per frame during typical operation with a real camera (cf. section 5.3 for the series of experiments with a real camera source). As shown in the screenshots in figure 4.5, OpenCV's fitEllipse function fits an ellipse to every contour it is given. This does not cause a significant performance penalty within that function, but it does significantly slow down the pose estimation in posea, since that algorithm contains a number of CPU intensive loops written in Python.
To reduce the data flow to the pose estimation bottleneck, ellipse contains two filters. First, contours containing fewer than ten pixels are discarded before the ellipse fitting stage. After the fitting stage, a second filter retains only those ellipses that meet five criteria, namely: that the ellipse must be the larger of a pair with approximately the same centres and inclinations; that both ellipses have aspect ratios that are not too large; and that the ratio between the sizes of the two ellipses should correspond closely to that in the model.
The filters result in a noticeable pipeline speedup and are likely toreduce the probability of false positives in real-world use.
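The second filter might look roughly as follows; the threshold values and the expected outer/inner size ratio are placeholders rather than the thesis's actual constants, and the inclination comparison is omitted (cf. the near-circular ellipse problem in section 4.3.9):

```python
import numpy as np

def filter_ellipses(ellipses, size_ratio=2.0, ratio_tol=0.25,
                    max_aspect=4.0, centre_tol=5.0):
    """Keep only ellipses that are the larger member of a roughly
    concentric pair whose size ratio matches the two-circle marker
    model. Each ellipse is ((x0, y0), (major, minor), angle), as
    returned by OpenCV's fitEllipse. All parameter values here are
    illustrative."""
    keep = []
    for i, ((xa, ya), (ma, na), _) in enumerate(ellipses):
        for j, ((xb, yb), (mb, nb), _) in enumerate(ellipses):
            if i == j or ma <= mb:
                continue  # only consider (outer, inner) orderings
            concentric = np.hypot(xa - xb, ya - yb) < centre_tol
            aspects_ok = ma / na < max_aspect and mb / nb < max_aspect
            ratio_ok = abs(ma / mb - size_ratio) < ratio_tol * size_ratio
            if concentric and aspects_ok and ratio_ok:
                keep.append(ellipses[i])
                break
    return keep
```

Only the outer ellipse of each matched pair survives, so downstream stages see at most one candidate per marker circle pair.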
(a) Without ellipse filter
(b) With ellipse filter
Figure 4.5: Fitting the right ellipses
4.2.3 Cones
To obtain the estimator for Q in the form shown in equation 3.4, the posea module first converts the ellipse representation that fitEllipse returns, which is in the form shown in equation 4.1, into the general quadratic form shown in equation 3.1. The quadratic form then yields the estimator for the elliptical cone, Q, provided the focal length f is also known. Note that in simulation mode, the focal length is taken from the value in the GLCameraVals class. The values and units chosen for the focal length in simulation mode, given a field of view, determine the scale of the pose estimates. The following is an explanation of the conversion algorithm implemented in ellipse. The algorithm is
inspired by [3].
The return values from fitEllipse directly parametrise the following equation for the ellipse:
\[
X = \begin{bmatrix} x \\ y \end{bmatrix} = X_0 + R_e \begin{bmatrix} a\cos\theta \\ b\sin\theta \end{bmatrix}, \tag{4.2}
\]
where \(R_e\) encodes a counter-clockwise rotation by α:
\[
R_e = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}, \tag{4.3}
\]
and \(X_0\) is the centre of the ellipse:
\[
X_0 = \begin{bmatrix} x_0 \\ y_0 \end{bmatrix}. \tag{4.4}
\]
To obtain the quadratic form, θ needs to be eliminated. We rearrange equation 4.2 into equation 4.5, so that, given that \(u^T u = 1\), θ is eliminated to obtain equation 4.6.
\[
\begin{bmatrix} 1/a & 0 \\ 0 & 1/b \end{bmatrix} R_e^T (X - X_0) = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} = u. \tag{4.5}
\]
\[
(X - X_0)^T R_e \begin{bmatrix} 1/a^2 & 0 \\ 0 & 1/b^2 \end{bmatrix} R_e^T (X - X_0) = 1. \tag{4.6}
\]
Now, taking M (symmetric):
\[
M = \begin{bmatrix} A & B \\ B & C \end{bmatrix} = R_e \begin{bmatrix} 1/a^2 & 0 \\ 0 & 1/b^2 \end{bmatrix} R_e^T, \tag{4.7}
\]
we have:
\[
(X - X_0)^T M (X - X_0) = 1
\;\Leftrightarrow\; X^T M X - 2 X_0^T M X + X_0^T M X_0 - 1 = 0
\;\Leftrightarrow\; A x^2 + 2Bxy + C y^2 + 2Dx + 2Ey + F = 0, \tag{4.8}
\]
where:
\[
D = -(A x_0 + B y_0), \quad
E = -(B x_0 + C y_0), \quad
F = X_0^T \begin{bmatrix} A & B \\ B & C \end{bmatrix} X_0 - 1. \tag{4.9}
\]
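The conversion is a few lines of NumPy. This sketch follows equations 4.2-4.9, with the linear terms taken as D = −(Ax0 + By0) and E = −(Bx0 + Cy0) so that points on the ellipse satisfy the quadratic form exactly; the function name is illustrative:

```python
import numpy as np

def ellipse_to_quadratic(x0, y0, a, b, alpha):
    """Coefficients (A, B, C, D, E, F) of
    A x^2 + 2B xy + C y^2 + 2D x + 2E y + F = 0
    from the centre, semi-axes and rotation of an ellipse."""
    Re = np.array([[np.cos(alpha), -np.sin(alpha)],
                   [np.sin(alpha),  np.cos(alpha)]])
    M = Re @ np.diag([1 / a**2, 1 / b**2]) @ Re.T   # equation 4.7
    A, B, C = M[0, 0], M[0, 1], M[1, 1]
    X0 = np.array([x0, y0])
    D, E = -M @ X0               # linear terms of -2 X0^T M X
    F = X0 @ M @ X0 - 1
    return A, B, C, D, E, F

# Any point X = X0 + Re [a cos t, b sin t]^T lies on the zero set.
A, B, C, D, E, F = ellipse_to_quadratic(3.0, -2.0, 5.0, 2.0, 0.7)
```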
4.2.4 Poses
At this stage, an estimate for the matrix Q, as defined in equation 3.4, is available, and the posea module can proceed to compute the centre of the marker circle, C, using equation 3.20, and the normal of the marker's supporting plane, N, using equation 3.21. The function
get chen eigend orders the eigenvalues and eigenvectors to satisfy the system of equations 3.19, before posea calculates the estimates for C and N.
Since C and N are estimated as a function of three undetermined signs, {S1, S2, S3}, there are 2³ = 8 estimates of each vector per ellipse. Note that the marker's rotation, α, is not needed to establish C and N. Finding α would form part of future work on an identify module, as discussed in sections 4.2.5 and 6.2.
In a final step, posea removes pose estimates that are impossible a priori. According to [18], four out of the six estimates represent markers that are either behind the camera's x-y plane, i.e. where Cz < 0, or which face away from the camera, i.e. where Nz > 0.
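The candidate generation and a-priori filtering can be sketched as follows, a direct transcription of the centre and normal formulae shown in figure 4.6; the function name and the frontal-circle sanity check are illustrative, not the thesis's code:

```python
import itertools
import numpy as np

def circle_pose_candidates(Q, r):
    """Candidate (C, N) pairs for a circle of radius r whose viewing
    cone through the camera centre is x^T Q x = 0 (Chen et al.)."""
    lam, V = np.linalg.eigh(Q)
    if np.sum(lam > 0) == 1:       # fix the overall sign of Q so that
        lam = -lam                 # two eigenvalues are positive
    order = np.argsort(lam)[::-1]  # lam1 >= lam2 > 0 > lam3
    lam, V = lam[order], V[:, order]
    l1, l2, l3 = lam
    g = np.sqrt((l1 - l2) / (l1 - l3))
    h = np.sqrt((l2 - l3) / (l1 - l3))
    scale = l2 * r / np.sqrt(-l1 * l3)
    candidates = []
    for s1, s2, s3 in itertools.product((1, -1), repeat=3):
        C = s3 * scale * V @ np.array([s2 * (l3 / l2) * g, 0.0,
                                       -s1 * (l1 / l2) * h])
        N = V @ np.array([s2 * g, 0.0, -s1 * h])
        if C[2] > 0 and N[2] < 0:  # in front of the camera, facing it
            candidates.append((C, N))
    return candidates

# Sanity check: a frontal circle with centre (0, 0, d), normal
# (0, 0, -1) and radius r has the viewing cone x^2 + y^2 - (r/d)^2 z^2 = 0.
r, d = 0.1, 2.0
cone = np.diag([1.0, 1.0, -(r / d) ** 2])
print(circle_pose_candidates(cone, r))
```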
4.2.5 Future modules
In order to obtain a single pose estimate per marker, and to register themarkers’ data payloads, further work would be necessary to implementthe identify module, described below, as well as a marker payloaddesign.
identify
In a first stage the identify module would un-distort the sub-images corresponding to the candidate poses for each ellipse. It would then cross-correlate these images with images in a marker model database for a range of possible rotations α around the marker normal, possibly using a technique similar to that in [18]. The highest correlation score would then yield a single, complete pose estimate for each ellipse together with an identifier from the marker database. The module would discard candidate ellipses that did not achieve a threshold score value. Finally, the module would query the database for the data payloads corresponding to the identified markers and append these to the relevant marker objects. The resulting list of marker objects could be made available to a robot's task execution module, to provide the instruction codes and relative poses of markers in the robot's environment.
4.3 Notes from debugging
Developing algorithms that closely follow the kinds of mathematical formulations explained in chapter 3, and shown in figure 4.6, presents a software engineering challenge, albeit a modest one. This section describes the nature of the challenge and the methods used to overcome it. The changes described in this section contributed to the improvements seen in the results of the third experiment, as described in section 5.5.
[Figure 4.6: Pose estimation pipeline, key formulae; reconstructed as text.] The pipeline proceeds from the ellipse bounding box, (x0, y0), a, b, α, via
\[
X = X_0 + R_e \begin{bmatrix} a\cos\theta \\ b\sin\theta \end{bmatrix} \;\Rightarrow\; A x^2 + 2Bxy + C y^2 + 2Dx + 2Ey + F = 0,
\]
to the ellipse quadratic form
\[
E = \begin{bmatrix} A & B & D \\ B & C & E \\ D & E & F \end{bmatrix},
\]
then to the oblique elliptical cone
\[
Q = \begin{bmatrix} A & B & -D/f \\ B & C & -E/f \\ -D/f & -E/f & F/f^2 \end{bmatrix},
\]
whose eigendecomposition \(Q = V \Lambda V^{-1}\), with λ1, λ2, λ3 and v1, v2, v3 ordered as described in section ??, yields six centres and normals, (c1, n1), . . . , (c6, n6):
\[
C = \frac{S_3 \lambda_2 r}{\sqrt{-\lambda_1\lambda_3}}\, V
\begin{bmatrix}
S_2 \frac{\lambda_3}{\lambda_2} \sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\ 0 \\ -S_1 \frac{\lambda_1}{\lambda_2} \sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}
\end{bmatrix},
\qquad
N = V
\begin{bmatrix}
S_2 \sqrt{\frac{\lambda_1-\lambda_2}{\lambda_1-\lambda_3}} \\ 0 \\ -S_1 \sqrt{\frac{\lambda_2-\lambda_3}{\lambda_1-\lambda_3}}
\end{bmatrix},
\quad S_i \in \{+, -\},
\]
of which two centres and normals, (C1, N1), (C2, N2), survive the requirement cz > 0 ∧ nz < 0.
4.3.1 Systematic yet opaque
The results of the first experiment (section 5.3) and the second experiment (section 5.4) indicated the existence of systematic errors in the pose estimation code. They also seemed to promise that the application could be debugged to create a working pose estimation pipeline, a positive answer to the research question (section 1.3). The mathematical nature of the application, however, poses a challenge to the developer, since key sections of code can be semantically opaque and therefore difficult to debug.
4.3.2 Names of variables
The first approach to facilitate debugging was to refactor the mathematical functions in posea to more closely reflect the naming conventions used in the mathematical formulae shown in chapter 3. Single-character and upper case variable names contradict a programming style that normally favours clear, descriptive and unambiguous variable naming conventions. In mathematical code, the priority shifts towards naming schemes that make it easier to spot errors by comparing code with the corresponding formulae. After refactoring, it was discovered that the conversion between ellipse representations contained discrepancies with respect to the formulae.
4.3.3 Binary search
A debugging approach used in various engineering disciplines, where the mechanism in question effectively works as a series of "black boxes", is to perform a binary-search style series of tests for inconsistencies in the flow of inputs and outputs. This approach was adopted and helped to find the bugs that were subsequently fixed.
4.3.4 Skew
The source of the skew shown in figure 5.6 was traced to the conversion in posea of the ellipse parameters from ellipse to the quadratic form. This was done by plotting the quadratic forms being produced, which showed that the ellipses were off-centre. The shift represented a missing conversion from a bottom-left x-y origin in image coordinates to a principal (middle) point x-y origin in camera coordinates.
4.3.5 Scale
Refactoring the camera and marker classes ironed out a bug involving differing units of measure, which explained the scaling effect on the estimated z coordinates, as shown in figure 4.7.
4.3.6 Sign
Side-by-side printouts of actual and estimated coordinates showed that some coordinate elements occasionally had the wrong sign (note that posea produces two estimates per ellipse). By drawing up a table of the most frequent discrepancies of signs, as shown in figure 4.8a, it became apparent that the coordinates were being systematically reflected around the x-z plane, and other causes were ruled out. This was confirmed with single line plots such as that in figure 4.8, revealing the pitfall of only plotting a single line of poses along the optical axis, or symmetrical lines of poses as shown in figure 4.7, since both these types are invariant
(a) “Flattened” estimates (units: mm) (b) Actual and estimated depths (mm)
Figure 4.7: z-scale bug
under x-z plane symmetry. The error was traced to the fact that the image arrays obtained from OpenGL's buffers had their origin in the bottom-left corner, whereas the pipeline assumes a top-left corner image origin.
actual x | actual y | estimated x | estimated y
+ | + | + | −
+ | − | + | +
− | + | − | −
− | − | − | +

(a) Pattern of signs
(b) Symmetry (units: mm)
Figure 4.8: y sign bug
4.3.7 Radii
Interpreting r as the length of the circle diameter instead of the circle radius significantly reduces the errors in the estimates for C. This may indicate an inconsistency in the notation used in [6].
4.3.8 Focal length
The camera incorrectly calculated its focal length as 8.211 instead of 6.158, because the image height and width values were reversed at one point in the calculation.
4.3.9 Ellipse filter
The ellipseFitter function encodes near-circular ellipses with highly variable values for the bounding box angle of inclination α. In other words, the rotation of the bounding box becomes almost arbitrary for near-circular ellipses. As a result, the inner and outer circles appear to have non-matching inclinations, and the ellipse filter discarded many ellipses with a small aspect ratio. This partly explains the small number of estimates in the first and second experiments. Restricting the relative inclination test to ellipses with a larger aspect ratio fixed the issue.
4.3.10 Duplicate duplicates
Duplicate ellipses per marker circle gave around two estimates per marker pose, a coincidence - given the pose estimation ambiguity - that concealed a bug in the pipeline, which caused only one of the two estimates per ellipse to be available downstream of the posea module.
CHAPTER 5
Experiments
This chapter describes the goals, methods and results of three series of experiments involving the implementation described in chapter 4. The three experiments are presented in chronological order. The first two experiments informed major improvements to the pose estimation pipeline, through the debugging work explained in section 4.3. The changes are reflected in the results of the third experiment. Other improvements seen in the third experiment include more detailed output from the printer module, most notably the histograms shown in section 5.5.3.
5.1 Goal
The overarching goal of these experiments is to help answer the research question formulated in section 1.3. The experiments, then, need to show whether the implementation can estimate the pose of a marker as designed, and what its potential accuracy might be. With the right experiments, it may be possible to identify the sources of pose estimation errors. These in turn could help future iterations of the pipeline to approach the limits of this pose estimation technique's accuracy.
5.2 Method
This section describes details of experimental method that are common to the three experiments. Each experiment also has its own method section for details specific to that experiment.
5.2.1 Output
The results of all three experiments in this chapter are produced from the application's execution log and from output from the printer module. The execution log records general execution data such as the
pipeline framerate, and the numbers of frames, contours, pre-filter ellipses and post-filter ellipses processed. The printer module writes pose estimation statistics to /dev/stdout and displays matplotlib-based point clouds, 2D plots and histograms, which can be saved as vector graphics files. Refer to the source code documentation for further details.
5.2.2 Materials
All experiments were conducted with the same PC. Table 5.1 detailsits configuration.
Chipset: Intel i7 2620M
Architecture: x86-64
GPU: Intel HD 3000
CPU clock speed (GHz): 2.7-3.4
Cores: 2
Threads ("hyperthreading"): 4
L1 cache (KB): 128
L2 cache (KB): 512
L3 cache (KB): 4096
Main memory (MB): 3849
OS: Linux 3.2.0
Table 5.1: PC configuration
5.2.3 Notation
The notation in this chapter follows that of the application code where possible: C and eC are the marker centre and estimated marker centre, respectively; N and eN are the marker normal and estimated marker normal, respectively; and µ, σ and Σ denote a mean, standard deviation and covariance matrix, respectively. A subscript x, y or z represents the relevant element of the subscripted vector. O is the origin of the camera coordinate system. All coordinates are rounded to the nearest millimetre.
5.3 First experiment: real camera
5.3.1 Goal
The goal of the first experiment is to discover patterns in the pose estimation data, at a stage where large and highly variable errors characterise the pose estimation results.
5.3.2 Method
Figure 5.1: First experiment, setup
Figure 5.1 shows a portion of the experimental setup. The experiment consists of collating pose estimates from a series of marker centre positions (C), approximated by positioning the marker and camera by hand. In the first instance, separate pose estimates were made for marker positions approximately along the camera's optical axis, at distances respectively of 1, 2, . . . , 5m in front of the camera. The pipeline processed several hundred frames per position. Then, the pipeline processed images of the marker while the latter was gradually slid by hand from a distance of 1m to a distance of 5m. This sliding position is denoted in the results as [0 0 zslide]^T.
The marker positions were located using measuring tape and marked at 1m intervals with sections of sticking tape on the floor. The camera was fixed to a miniature tripod, in a raised position on a flat surface at approximately the same height as the fiducial marker's centre. To increase accuracy, the marker's centre was then manually aligned with the image centre by rotating the camera before each experimental test, and the marker was rotated so that its normal appeared to the naked eye to approximate [0 0 −1]^T.
Materials
camera: Point Grey Chameleon CMLN-13S2M
lens: Fujinon DF6HA-1B
lens focal length (mm): 6
calibration focal length (mm): 6.1152
hor. field of view: 42.5828°
vert. field of view: 32.5855°
sensor: Sony 1/3 inch CCD monochrome
sensor size: 1296 × 964
pixel size (µm): 3.75 × 3.75
framerate (fps): 15
bus: USB 2
image size: 1280 × 960

Table 5.2: First experiment, camera configuration
The experimental camera is described in table 5.2. The marker consists of a black-and-white inkjet printout of the marker model shown in figure 4.1, taped to a cardboard file to keep it more or less flat.
5.3.3 Results
The average pipeline throughput for the six tests in the first experiment series was 7.39 fps. The tests involved between 348 and 1129 image frames, with an average of 593 frames across the tests. The average values of the pose estimates for each marker position are listed in table 5.3, alongside the corresponding covariance matrices for the elements of the pose estimate coordinates. "ellipses" corresponds to the average per-image number of ellipses the pipeline considers outer marker circles for a given position. Figures 5.2 and 5.3 show point clouds of pose estimates for the various fixed positions and for the 1-5m slide, respectively. Each point corresponds to a single estimate.
5.3.4 Interpretation
Point cloud patterns
Figures 5.2 and 5.3 clearly show the ambiguity in every pose estimate from the posea module: the points appear in pairs, varying mostly in the z direction. In addition, the pose estimates in figures 5.2c through 5.2h appear in two main clusters per position. The fact, seen in the ellipse fitting window during pipeline execution, that the pipeline frequently mistakes the inner circle for the outer circle probably explains these double clusters. The fact that these pairs of clusters do not vary principally in the z direction, as should be the case between ellipses along the optical axis that vary only with respect to diameter
C (m) | ellipses | µeC (mm) | ΣeC (mm)
[0 0 1]^T | 1.0 | [−368, −280, 4]^T | [0.4 0.2 −0.0; 0.2 0.2 −0.0; −0.0 −0.0 0.5]
[0 0 2]^T | 1.0 | [−743, −603, 8]^T | [627.0 509.2 −6.0; 509.2 413.6 −4.9; −6.0 −4.9 0.1]
[0 0 3]^T | 1.2 | [−1123, −913, 11]^T | [20560 16820 −200; 16820 13760 −164; −200 −164 2.0]
[0 0 4]^T | 1.4 | [−1557, −1073, 14]^T | [19500 20110 −214; 20110 175700 −1075; −214 −1075 7.1]
[0 0 5]^T | 0.9 | [−1923, −1554, 19]^T | [21710 17540 −211; 17540 14170 −170; −211 −170 2.1]
[0 0 zslide]^T | 1.1 | |

Table 5.3: First experiment, pose estimates (zslide ∈ [1000, 5000])
size, is likely due to the same type of pose estimation error that causes the line of estimates shown in figure 5.3 to be skewed away from the actual line of marker poses.
Statistical patterns
The number of ellipses in table 5.3 increases with distance, before dropping again at 5m. This may reflect some combination of a decrease in false negatives and an increase in false positives from 1 to 4m. Since the camera does not move significantly between marker positions, this could reflect an influence of the changing shape of the marker and its supporting structure on the numbers of ellipses detected, as a function of distance. The pipeline stage windows showed that ellipse fits a large ellipse to the supporting structure at 3 and 4m. The size and aspect ratio of the larger, false positive ellipse seems to explain the stripe of estimates away from the two main clusters in figures 5.2d through 5.2g, and may contribute to the increased variances at those distances. The pattern of variances of the y coordinate suggests it may
have successively increased from 1 to 5m had there not been the peak in false positives at 3 and 4m. A corresponding peak in variances is notably evident for the y and z coordinates at 3 and 4m.
After factoring out the effect of false positives, it appears that the variances of all three coordinates increase with distance, much as the errors in the x and y coordinate estimates increase with distance. Interestingly, the z coordinate estimates increase roughly in step with the actual z coordinate values, but at a much smaller scale.
Overall pattern of errors
Overall, the pattern of clusters in figure 5.2 and the well defined, almost straight line in figure 5.3 correspond to the pattern of actual poses, a point and a near-straight line, in each case. This, and the skew in both the line and the pairs of inner/outer circle estimates (discussed above), suggests that the pose estimates may suffer from systematic errors, indicating a bug in the implementation at this stage. The fact that the z coordinates increase in step with the actual depth, as shown in table 5.3, reinforces this hypothesis.
5.4 Second experiment: simulated
5.4.1 Goal
The second experiment is designed to validate and verify the application's simulation module; to confirm the interpretation of the previous experiment; and to gain further insights, if possible, into the pattern of pose estimation errors.
5.4.2 Method
simulation unit size (µm): 3.75
pixel size (µm): 3.75 × 3.75
focal length (mm): 6
vert. field of view: 32.5855°
image size: 1280 × 960
near clipping depth (mm): 1
far clipping depth (m): 100

Table 5.4: Second experiment, simulation parameters
The second experiment replaces the camera and scene of the first experiment with an OpenGL-based 3D simulation. Figure 4.3 shows an example of an image generated in this way. simulator and markers
simulate markers with OpenGL polygons bound to a texture generated from a vector graphic of the marker model. Only one marker is contained in the scene at a time.
Simulations were executed with varying sample sizes (n = 200, n = 1500 or n = 2000) combined with one of two marker position patterns (five lines or single line). The marker normal throughout this experiment was [0 0 −1]^T. No attempt is made in this experiment to measure the estimation error for N, due to the errors evident in the estimation of C, which is a greater priority. The third experiment, described in section 5.5, tackles the accuracy of eN.
5.4.3 Results
n (frames): 2000
throughput (fps): 2.17
pre-filter ellipses p.f.: 1.51
post-filter ellipses p.f.: 0.24
max(||C − eC||) (mm): 10870.6
min(||C − eC||) (mm): 459.8
µ(||C − eC||) (mm): 3945.6
σ(||C − eC||) (mm): 2241.4
µ(C − eC) (mm): [−1320.1, −956.3, 3289.8]^T

Table 5.5: Second experiment, statistics
Table 5.5 shows application and pose estimation statistics for a 2000-frame simulation. Figure 5.4 shows the actual positions C at which an ellipse is detected. The plot in figure 5.5 shows how depths and corresponding estimated depths are related in a 1500-frame simulation. Figure 5.6 shows point clouds of the actual positions (C) and estimated positions (eC) of the marker in various simulations in "straight line" mode (cf. section 5.4.2).
5.4.4 Interpretation
Application speed
The pipeline throughput has slowed by a factor of 3.4 compared to the first experiment. A test run without pipeline modules or windows shows that the application does not speed up significantly with only the simulator running. This may be evidence that a bottleneck exists
in the simulator rather than in the pipeline or multiple process synchronisation, although synchronisation overhead should not be ruled out as a cause. A performance profile as part of further work, as discussed in section 6.2, should help to identify the source of the reduced performance in simulation mode.
Detection
As shown in figure 5.4 and table 5.5, the marker detection rates are low, and fall to nil beyond a depth of around 10m. The cause may be a combination of: the ellipse filter conditions in ellipse, since on average around 84 percent of ellipses are discarded; and artifacts relating to contrast, lighting, smoothness and other factors, arising from the OpenGL configuration for this experiment.
Statistics
The average estimation error for each element of C confirms the equivalent results from the first experiment, in that the averages in each direction roughly correspond to the averages of the results in the first experiment.
Depth and scale
Figures 5.5 and 5.6 reveal that the estimated z coordinates increase in step with the actual z coordinates, but at a smaller scale. This confirms the interpretation of the first experiment, and indicates that a proportional relation does exist between the estimated z and actual z values. Tests place the discrepancy of scale at around a factor of 200, so that figures 5.6c to 5.6f result when the estimated z coordinates are scaled up by that factor.
Point cloud patterns
Figure 5.6 shows that straight lines of actual poses result in straight lines of corresponding estimates that are apparently skewed to one side of the original lines. This result confirms the interpretation of the point cloud patterns in the first experiment, and again points to a systematic error in an implementation that would otherwise produce meaningful estimates for C.
The pattern of ambiguous estimates is also confirmed, as shown in the pairs of estimates along the central line in figure 5.6c and in the pairs of estimates in the detail shown in figure 5.6f. Figure 5.6f also suggests that the pairs of estimates converge as the marker-camera distance increases.
5.5 Third experiment: simulated
5.5.1 Goal
The first two experiments hinted at various kinds of systematic error in the implementation of [6]. These results led to a debugging effort, which is described in section 4.3. The goal of this experiment is, first, to confirm the improvements to the pipeline, and, second, to provide a detailed picture of the system's performance that will inform the conclusions in chapter 6.
5.5.2 Method
This experiment follows the method of the second experiment, as described in section 5.4.2, with a few additions that are described here.
The simulator now has two modes of operation. The first simulation mode, which resembles the method used in the second experiment, is enabled with the flag -s -2 and generates sequences of images corresponding to straight lines of marker poses with the marker normal set to [0 0 −1]^T. The second simulation mode, enabled with -s -1, generates images of random poses, including random rotations.
Several additional parameters are varied in this experiment, including the minimum and maximum camera-marker distance; the execution time in seconds (flag -t); and the maximum value of θ = ∠(−C)ON, amongst other options. Refer to the source code documentation for more details regarding simulator options.
The experiment consists of two simulations in the first mode, involving five straight lines in the camera-marker distance range [360, 22000] (mm), and five simulations in the second, random, mode with varying marker distance and angle parameters. The results are collated and presented as: tables of statistics; two-dimensional plots; point clouds; and histograms. The random simulations dominate the results and analysis, since various combinations of variables and ranges of values may usefully be analysed from the random samples.
5.5.3 Results
Statistical data for five random-mode simulations are shown in table 5.6. Point clouds for two straight-line mode simulations and one random-mode simulation are shown in figure 5.7. Figure 5.8 contains two-dimensional plots relating estimated depth to actual depth. The remaining figures each show five plots or histograms corresponding to five combinations of simulation parameter values: maximum camera distance either 10 m or 18 m; and maximum θ either 30◦, 60◦ or 90◦. Apart from figures 5.7a and 5.7b, all results are taken from random-mode simulations.
Notation
To the notation defined in section 5.2.3 are added the angle between N and eN, denoted ∠NOeN, and the angle at which the marker is viewed, θ, which equals ∠(−C)ON.
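Both metrics reduce to the angle between two camera-frame vectors. A minimal NumPy sketch, using hypothetical coordinate values rather than pipeline output:

```python
import numpy as np

def angle_deg(u, v):
    """Angle between vectors u and v, in degrees."""
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against arccos domain errors from rounding.
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

# Viewing angle theta = angle between -C and the marker normal N.
C = np.array([0.0, 0.0, 2000.0])   # marker centre in camera coords (mm)
N = np.array([0.0, 0.0, -1.0])     # actual marker normal
theta = angle_deg(-C, N)           # 0 deg: marker faces the camera head-on

# Normal estimation error: angle between N and the estimate eN.
eN = np.array([0.0, np.sin(np.radians(20)), -np.cos(np.radians(20))])
err_angle = angle_deg(N, eN)       # 20 deg off-target in this example
```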
n (frames)                  7900          7813          7665         7947         7439
z range (cm)                [36, 1836]    [36, 1036]    [36, 1836]   [36, 1036]   [36, 1036]
θ range (deg)               [0, 90]       [0, 90]       [0, 60]      [0, 60]      [0, 10]
throughput (fps)            2.15          2.17          2.13         2.21         2.07
pre-filter ellipses p.f.    2.96          3.47          3.83         4.00         4.00
post-filter ellipses p.f.   1.35          1.55          1.91         1.96         1.95
min. ||C|| (mm)             727           720           727          750          720
max. ||C|| (mm)             19705         11145         19952        11228        11225
max. θ (deg)                89.8          89.8          60.0         60.0         30.0
min. ||C − eC|| (mm)        1             1             1            1            1
max. ||C − eC|| (mm)        25331         24873         9568         2480         804
µ||C−eC|| (mm)              439           318           287          175          137
σ||C−eC|| (mm)              985           810           509          136          94
µ(C−eC) (mm)                (3, −4, −313) (−1, −3, −304) (3, −2, −113) (1, −2, −163) (2, −2, −121)
µ∠NOeN (deg)                63.5          66.3          49.0         49.3         26.3
σ∠NOeN (deg)                40.5          42.1          30.2         30.6         15.4

Table 5.6: Statistics: third experiment
5.5.4 Interpretation
Detection
The pipeline detects almost all markers for θ < 60◦ and ||C|| < 18 m. Detection rates decrease markedly beyond θ = 75◦ (c.f. figure 5.12e) and beyond ||C|| = 15 m (c.f. figure 5.11d).
Overall the detection rate has increased dramatically compared to the previous experiments. This is a consequence of changes to the ellipse filtering algorithm in ellipse. Note that the actual detection rate will be slightly lower after factoring out false positives, though the effect is likely to be negligible.
Depth and scale
The z scale bug appears to have been corrected, with the caveat that r should be interpreted as the length of the circle diameter and not as that of the circle radius (c.f. also section 4.3 regarding this issue). The estimates for Cz are promising, though they show rapidly increasing mean errors and deviations beyond a camera-marker distance of 10 m or marker angles (θ) greater than 60◦ (c.f. figure 5.8).
Statistics: eC
The estimates of marker positions are promising, and potentially more so when one takes the ambiguity of the pose estimates at this stage into account. There also exists the possibility that the ambiguous estimates give a false impression of greater accuracy, if the means of the ambiguous estimates are generally closer to the actual positions than either position of the estimate pairs.
As expected, the estimates for C are more accurate in the x and y directions than in the z direction (c.f. table 5.6): intuitively, changes in depth lead to smaller changes in the image than do changes in the x and y directions.
Overall the error in the eCx and eCy estimates is low, between one and four millimetres either way on average across all simulation scenarios shown. The statistics show a “sweet spot” for estimates in the ||C|| ∈ [1, 10] metre and θ ∈ [0, 30] degree ranges, where the distance between actual poses and their corresponding estimated poses averages 137 mm despite the pose estimation ambiguity inherent in posea.
Statistics: eN
The estimates for the marker normal show high levels of error and variability (c.f. table 5.6): even within the “sweet spot” for eC already mentioned, the estimates for N are on average 26.3◦ off-target, with a standard deviation of 15.4◦.
Point cloud patterns
The point clouds in figure 5.7 show that the skew effect seen in the previous experiments has been corrected. The two non-random plots in that figure however show errors towards the tips of the lines of poses, which could correspond to the inner circle being mistaken for the outer circle: estimates are placed along the same line, but further away from the camera. The plots for a random simulation provide some verification of the simulation itself, since a volume of approximately the expected size and shape is filled.
Application speed
The application remains slow, at just over two frames per second, which is unsurprising since no optimisations were made with respect to speed. Prospects for further work in this area are discussed in chapter 6.
(a) (0 0 1000)^T   (b) (0 0 1000)^T
(c) (0 0 2000)^T   (d) (0 0 3000)^T
(e) (0 0 3000)^T   (f) (0 0 4000)^T
(g) (0 0 4000)^T   (h) (0 0 5000)^T
Figure 5.2: Experiment 1, estimates for fixed positions
Figure 5.3: Experiment 1, estimates for slide along z-axis (1-5m)
Figure 5.4: Experiment 2, ellipse detection (units: mm)
Figure 5.5: Experiment 2, eCz vs. Cz (1500 samples, units: mm)
(a) five lines, 200 samples (b) five lines, 200 samples
(c) five lines, 200 samples, eCz scaled (d) five lines, 2000 samples, eCz scaled
(e) single line, 200 samples, eCz scaled (f) single line, 200 samples, eCz scaled, detail
Figure 5.6: Experiment 2, point clouds (green: C; blue: eC; units: mm)
(a) five lines, ||C||max = 22m (b) five lines, ||C||max = 18m
(c) random, θmax = 60◦, ||C||max = 18m (d) random, θmax = 60◦, ||C||max = 18m
Figure 5.7: Experiment 3, point clouds (units: mm)
(a) θmax = 30◦, ||C||max = 10m
(b) θmax = 60◦, ||C||max = 10m (c) θmax = 90◦, ||C||max = 10m
(d) θmax = 60◦, ||C||max = 18m (e) θmax = 90◦, ||C||max = 18m
Figure 5.8: Experiment 3, random mode, eCz vs. Cz (units: mm)
(a) θmax = 30◦, ||C||max = 10m
(b) θmax = 60◦, ||C||max = 10m (c) θmax = 90◦, ||C||max = 10m
(d) θmax = 60◦, ||C||max = 18m (e) θmax = 90◦, ||C||max = 18m
Figure 5.9: Experiment 3, random mode, µ||C − eC|| vs. ||C|| (units: mm)
(a) θmax = 30◦, ||C||max = 10m
(b) θmax = 60◦, ||C||max = 10m (c) θmax = 90◦, ||C||max = 10m
(d) θmax = 60◦, ||C||max = 18m (e) θmax = 90◦, ||C||max = 18m
Figure 5.10: Experiment 3, random mode, µ||C − eC|| vs. θ (units: mm)
(a) θmax = 30◦, ||C||max = 10m
(b) θmax = 60◦, ||C||max = 10m (c) θmax = 90◦, ||C||max = 10m
(d) θmax = 60◦, ||C||max = 18m (e) θmax = 90◦, ||C||max = 18m
Figure 5.11: Experiment 3, random mode, detection rate vs. ||C|| (units: mm)
(a) θmax = 30◦, ||C||max = 10m
(b) θmax = 60◦, ||C||max = 10m (c) θmax = 90◦, ||C||max = 10m
(d) θmax = 60◦, ||C||max = 18m (e) θmax = 90◦, ||C||max = 18m
Figure 5.12: Experiment 3, random mode, detection rate vs. θ (units: deg)
CHAPTER 6
Conclusion
6.1 Assessment
6.1.1 Validation of [6]
The experiments partly validate the theoretical technique in [6] with respect to the estimation of the marker centre C, but not with respect to that of the marker normal N. Development on the pipeline so far has prioritised the estimates for C. Bugs in the current iteration may cause the errors in the marker normal estimates. Further work on the system may yet fully validate [6].
6.1.2 Speed
The framerate is subjectively slow when the PC configuration is taken into account (c.f. table 5.1), and needs to be addressed if the application is to run on embedded systems.
With a functional prototype complete, it would now be appropriate to optimise the code. To ensure that the optimisation effort is effective, the code should be profiled for speed bottlenecks with a library such as cProfile. Any significant bottlenecks, most likely located in the posea module, could then be examined for optimisation, for example with fewer nested loops or more effective use of existing libraries such as NumPy. If this approach proves insufficient, the offending sections of code could be re-implemented in C or C++ and re-incorporated into the application with a binding.
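Such a profiling pass could look roughly as follows; estimate_pose here is a hypothetical stand-in for the pipeline's per-frame entry point, not the actual posea interface:

```python
import cProfile
import io
import pstats

def estimate_pose(frame):
    # Hypothetical stand-in for the per-frame pose estimation work.
    return sum(i * i for i in range(10000))

profiler = cProfile.Profile()
profiler.enable()
for frame in range(50):           # profile a batch of simulated frames
    estimate_pose(frame)
profiler.disable()

# Report the ten most expensive functions by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
```

Sorting by cumulative time surfaces the call trees worth optimising first; sorting by "tottime" instead would highlight the individual functions that consume the most time themselves.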
6.1.3 Accuracy
The estimates for the marker centre are currently accurate enough for robot navigation in laboratory conditions, notably within the “sweet spot” described in section 5.5.4. The estimates for N would be of little use, even in laboratory conditions. With further work on the
implementation, the estimates for C could provide sufficient accuracy for the intended application, which is described in chapters 1 and 2.
6.2 Further work
The results of the third experiment are promising enough to motivate further work on several aspects of the implementation. This section describes some areas that offer room for improvement.
6.2.1 Source code
The emphasis on building a rapid working prototype of a pose estimation pipeline came at the cost of unit tests, which are currently not implemented. To streamline any further work and to minimise future bugs, unit tests should be implemented as a matter of priority.
Debugging
Several potential bugs are mentioned in this report, and should be tackled during future work on the pipeline. Details to check include the camera calibration values, and the units for all geometric values.
6.2.2 Image processing
As seen during the transition from the second to the third experiment (c.f. section 4.3), various image processing steps are open to improvement. Currently used functions may yield better results with different parameters, and libraries outside OpenCV may offer better performance for certain tasks. The ellipse module returns two ellipses per circle in simulation mode. Changes to any of the image smoothing, edge detection, contour finding and ellipse fitting functions may correct this, although this could vary with lighting conditions. The ellipse filter would benefit from an ellipse fitting error value of some kind to discern the quality of the fit for each ellipse. Alternative algorithms are available for most of the image processing functions. Further research and experimentation would undoubtedly lead to an improved pose estimation design.
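One possible fit-quality metric is the mean algebraic residual of the contour points against the fitted ellipse. The sketch below assumes the ((cx, cy), (a, b), angle) parametrisation used by OpenCV's fitEllipse, but takes a and b as semi-axis lengths (fitEllipse itself reports full axis lengths, which would need halving before use here):

```python
import numpy as np

def ellipse_fit_error(points, centre, axes, angle_deg):
    """Mean absolute algebraic residual of points against a fitted ellipse.

    A perfect fit yields 0; larger values indicate a poorer fit, so the
    ellipse filter could reject candidates above a threshold.
    """
    t = np.radians(angle_deg)
    # Shift points to the ellipse centre and rotate into its own frame...
    p = np.asarray(points, dtype=float) - np.asarray(centre, dtype=float)
    x =  p[:, 0] * np.cos(t) + p[:, 1] * np.sin(t)
    y = -p[:, 0] * np.sin(t) + p[:, 1] * np.cos(t)
    a, b = axes
    # ...and evaluate the implicit equation (x/a)^2 + (y/b)^2 - 1.
    return np.mean(np.abs((x / a) ** 2 + (y / b) ** 2 - 1.0))

# Points sampled exactly on an axis-aligned ellipse give a near-zero residual.
t = np.linspace(0, 2 * np.pi, 100)
pts = np.stack([30 * np.cos(t), 20 * np.sin(t)], axis=1)
err = ellipse_fit_error(pts, (0.0, 0.0), (30.0, 20.0), 0.0)
```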
6.2.3 Experiments
Further experimentation would reveal currently untested performance characteristics, such as the system’s resilience against occlusion, and its performance in various real world conditions. The results of future experiments, as well as those of the third experiment, should be compared against similar metrics to results for popular fiducial systems in the literature.
6.2.4 Marker design
The literature reviewed in chapter 2 includes many insights into marker design considerations. Further work on improving the marker design should begin with a review of the factors described in [21], regarding, for instance, the optimum ratio of circle sizes.
6.2.5 Identification and decoding
To complete the pipeline for incorporation into a robotic task execution system, further work is required to implement the modules described in section 4.2.5.
6.2.6 Simulator
The simulator has proven to be a cost and time effective means of evaluating the performance of the pose estimation pipeline. Further experiments with a real camera and scene would serve to verify and improve the simulations.
Further work on the simulator should at least include the addition of an interactively controlled model-view matrix, and the ability to include multiple markers in a simulated environment. This would allow the user to dynamically control the camera and perform more extensive experiments that realistically simulate a robot navigation task. The simulator could also provide a useful platform for system demonstrations.
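An interactively controlled model-view matrix could follow the standard gluLookAt construction. A minimal NumPy sketch of that construction, not the simulator's actual code:

```python
import numpy as np

def look_at(eye, target, up):
    """Build a right-handed model-view matrix, as in gluLookAt."""
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    f = target - eye
    f /= np.linalg.norm(f)        # forward axis
    s = np.cross(f, up)
    s /= np.linalg.norm(s)        # side (right) axis
    u = np.cross(s, f)            # recomputed up axis
    m = np.eye(4)
    m[0, :3], m[1, :3], m[2, :3] = s, u, -f
    m[:3, 3] = -m[:3, :3] @ eye   # translate world into the camera frame
    return m

# A camera at z = +5 looking at the origin maps the origin to depth -5.
mv = look_at([0, 0, 5], [0, 0, 0], [0, 1, 0])
origin_cam = mv @ np.array([0.0, 0.0, 0.0, 1.0])
```

Driving the eye and target parameters from keyboard or mouse input each frame would give the interactive camera control described above.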
6.3 Summary
The pose estimation system implemented and described here validates the technique for estimating the position of a circle’s centre described, in theoretical terms, in [6]. The system currently performs best at estimating the camera coordinates of the centre of a marker within a “sweet spot” in the camera’s field of view, located between a near depth of 1 metre and a far depth of 10 metres. Provided the marker viewing angle is below 60◦, the system estimates C with an average error of 1 to 4 millimetres in the x and y directions and of 137 millimetres in the z (depth) direction. The mean error and variance in the estimates gradually increase, notably in the z direction, beyond 10 m. Combined with odometry and basic obstacle avoidance sensors, the system could currently support robot navigation in laboratory conditions.
Further work as described in section 6.2 promises to significantly improve the system’s performance in future iterations, in terms both of speed and of pose estimation accuracy. The simulation framework implemented for this project provides a useful testbed for future development of the pipeline.
6.4 Acknowledgements
Significant input from friends and other generous acquaintances benefited this thesis, including: Robert Belleman’s technical advice on materials and on building a simulation; Rein van den Boomgaard’s extensive guidance, insight and encouragement in all aspects of the project; Leo Dorst’s assistance with the linear algebra in [6]; Coen Stork and Paul Mendel’s support, financial and otherwise; and Jeroen Zuiddam’s enthusiastic interest and insight.
Bibliography
[1] M. Appel and N. Navab. “Registration of technical drawings and calibrated images for industrial augmented reality”. In: Machine Vision and Applications 13.3 (2002), pp. 111–118.
[2] F. Bergamasco et al. “RUNE-Tag: A high accuracy fiducial marker with strong occlusion resilience”. In: Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE. 2011, pp. 113–120.
[3] R. Brown. Ellipse implicit equation coefficients. url: www.mathworks.ch/matlabcentral/answers/37124-ellipse-implicit-equation-coefficients.
[4] J. Canny. “A computational approach to edge detection”. In: Pattern Analysis and Machine Intelligence, IEEE Transactions on 6 (1986), pp. 679–698.
[5] Chandler et al. Hexagonal, information encoding article, process and system. US Patent 4,874,936. 1989.
[6] Q. Chen, H. Wu, and T. Wada. “Camera calibration with two arbitrary coplanar circles”. In: Computer Vision – ECCV 2004 (2004), pp. 521–532.
[7] Y. Cho and U. Neumann. “Multi-ring color fiducial systems for scalable fiducial tracking augmented reality”. In: Proc. of IEEE VRAIS. Citeseer. 1998, p. 212.
[8] University of Cordoba. ArUco. url: http://www.uco.es/investiga/grupos/ava/node/26.
[9] M. Fiala. “ARTag, a fiducial marker system using digital techniques”. In: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on. Vol. 2. IEEE. 2005, pp. 590–596.
[10] Eos Systems Inc. PhotoModeler automation, coded targets and photogrammetry targets. url: http://www.photomodeler.com/products/pm-auto.htm.
[11] Intersense. IS-1200 System. url: http://www.intersense.com/pages/21/13.
[12] H. Kato, M. Billinghurst, and I. Poupyrev. “ARToolKit user manual, version 2.33”. In: Human Interface Technology Lab, University of Washington (2000).
[13] S. Lieberknecht et al. “Evolution of a Tracking System”. In: Handbook of Augmented Reality (2011), pp. 355–377.
[14] T. Lochmatter et al. “SwisTrack: a flexible open source tracking software for multi-agent systems”. In: Intelligent Robots and Systems, 2008. IROS 2008. IEEE/RSJ International Conference on. IEEE. 2008, pp. 4004–4010.
[15] M. Lourakis. homest: A C/C++ Library for Robust, Non-linear Homography Estimation. 2010.
[16] Y. Nakazato, M. Kanbara, and N. Yokoya. “Localization of wearable users using invisible retro-reflective markers and an IR camera”. In: Proc. SPIE Electronic Imaging. Vol. 5664. 2005, pp. 1234–1242.
[17] Edwin Olson. “AprilTag: A robust and flexible visual fiducial system”. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). May 2011.
[18] A. Pagani et al. “Circular markers for camera pose estimation”. In: (2011).
[19] J. Rekimoto and Y. Ayatsuka. “CyberCode: designing augmented reality environments with visual tags”. In: Proceedings of DARE 2000 on Designing augmented reality environments. ACM. 2000, pp. 1–10.
[20] A.C. Rice, A.R. Beresford, and R.K. Harle. “Cantag: an open source software toolkit for designing and deploying marker-based vision systems”. In: Pervasive Computing and Communications, 2006. PerCom 2006. Fourth Annual IEEE International Conference on. IEEE. 2006, 10 pp.
[21] A.C. Rice, R.K. Harle, and A.R. Beresford. “Analysing fundamental properties of marker-based vision system designs”. In: Pervasive and Mobile Computing 2.4 (2006), pp. 453–471.
[22] P. Santos et al. “Ptrack: introducing a novel iterative geometric pose estimation for a marker-based single camera tracking system”. In: Virtual Reality Conference, 2006. IEEE. 2006, pp. 143–150.
[23] J. Sattar et al. “Fourier tags: Smoothly degradable fiducial markers for use in human-robot interaction”. In: Computer and Robot Vision, 2007. CRV’07. Fourth Canadian Conference on. IEEE. 2007, pp. 165–174.
[24] D. Schmalstieg and D. Wagner. “Experiences with handheld augmented reality”. In: Mixed and Augmented Reality, 2007. ISMAR 2007. 6th IEEE and ACM International Symposium on. IEEE. 2007, pp. 3–18.
[25] S.W. Shih and T.Y. Yu. “On designing an isotropic fiducial mark”. In: Image Processing, IEEE Transactions on 12.9 (2003), pp. 1054–1066.
[26] J. Steinbis, W. Hoff, and T.L. Vincent. “3D fiducials for scalable AR visual tracking”. In: Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality. IEEE Computer Society. 2008, pp. 183–184.
[27] H. Uchiyama and E. Marchand. “Deformable random dot markers”. In: Mixed and Augmented Reality (ISMAR), 2011 10th IEEE International Symposium on. IEEE. 2011, pp. 237–238.
[28] D. Wagner and D. Schmalstieg. “ARToolKitPlus for pose tracking on mobile devices”. In: Proceedings of 12th Computer Vision Winter Workshop (CVWW’07). 2007, pp. 139–146.
[29] X. Zhang, S. Fronz, and N. Navab. “Visual marker detection and decoding in AR systems: A comparative study”. In: Proceedings of the 1st International Symposium on Mixed and Augmented Reality. IEEE Computer Society. 2002, p. 97.
[30] X. Zhang, Y. Genc, and N. Navab. “Taking AR into large scale industrial environments: Navigation and information access with mobile computers”. In: Augmented Reality, 2001. Proceedings. IEEE and ACM International Symposium on. IEEE. 2001, pp. 179–180.