
Inertial Sensed Ego-motion for 3D Vision

Jorge Lobo and Jorge Dias
Institute of Systems and Robotics
DEEC, University of Coimbra-Polo II
3030-290 Coimbra, Portugal
e-mail: [email protected], [email protected]

Received 4 November 2003; accepted 4 November 2003

Inertial sensors attached to a camera can provide valuable data about camera pose and movement. In biological vision systems, inertial cues provided by the vestibular system are fused with vision at an early processing stage. In this article we set a framework for the combination of these two sensing modalities. Cameras can be seen as ray direction measuring devices, and in the case of stereo vision, depth along the ray can also be computed. The ego-motion can be sensed by the inertial sensors, but there are limitations determined by the sensor noise level. Keeping track of the vertical direction is required, so that gravity acceleration can be compensated for, and provides a valuable spatial reference. Results are shown of stereo depth map alignment using the vertical reference. The depth map points are mapped to a vertically aligned world frame of reference. In order to detect the ground plane, a histogram is performed for the different heights. Taking the ground plane as a reference plane for the acquired maps, the fusion of multiple maps reduces to a 2D translation and rotation problem. The dynamic inertial cues can be used as a first approximation for this transformation, allowing a fast depth map registration method. They also provide an image independent location of the image focus of expansion and center of rotation useful during visual based navigation tasks. © 2004 Wiley Periodicals, Inc.

Journal of Robotic Systems 21(1), 3–12 (2004). Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/rob.10122


1. INTRODUCTION

Biological vision systems are known to incorporate other sensing modalities. The inner ear vestibular system in humans and in animals provides inertial sensing mainly for orientation, navigation, control of body posture, and equilibrium. This sensorial system also plays a key role in several visual tasks and head stabilization, such as gaze holding and tracking visual movements.1 Neural interactions of human vision and vestibular system occur at a very early processing stage.2

Artificial vision systems can provide better perception of their environment by using inertial sensor measurements of camera pose (rotation and translation). As in human vision, low level image processing should take into account the ego motion of the observer. Nowadays micromachined low cost inertial sensors can be easily incorporated in computer vision systems, providing an artificial vestibular system. Inertial sensing is totally self-contained, except for gravity, which provides an external reference.

This work is part of ongoing research into the fusion of inertial sensor data in computer vision systems. In ref. 3 a framework is set for vision and inertial sensor cooperation. The use of gravity as a vertical reference is explored, enabling camera focal distance calibration with a single vanishing point, vertical line segmentation, and ground plane segmentation. In ref. 4 world vertical feature detection and 3D mapping is presented. In this article we continue to explore the use of inertial data in vision systems, and present a method for fast alignment and segmentation of depth maps obtained from correlation based stereovision.

1.1. Related Work

Navigation in aerospace and naval applications has long relied on high grade inertial sensors.5,6 The development of electronics and silicon micromachining has produced low cost, batch fabricated, silicon sensors. Currently they are not suitable for stand-alone inertial systems, but can be useful in many applications. The level of integration is increasing, and single chip inertial systems for inertial aided GPS navigation systems are being developed.7 This development has enabled many new applications for inertial sensors, not just in robotics and computer vision but also in large consumer markets, such as video camera vibration compensation.

In computer vision applications, Vieville and Faugeras have proposed the use of inertial sensors8 and studied the cooperation of the inertial and visual systems in mobile robot navigation by using the vertical cue, rectifying images, and improving self-motion estimation for 3D structure reconstruction.9–12

Inertial sensors were used to improve optical flow for obstacle detection by Bhanu et al.;13 inertial sensed ego motion compensation improved interest point selection, matching of the interest points, and the subsequent motion detection, tracking, and obstacle detection.

A comparison of the camera rotation estimate given by image optical flow with the output of a low cost gyroscope was done by Panerai and Sandini for gaze stabilization of a rotating camera.14,15 In ref. 16 they also studied the integration of inertial and visual information in binocular vision systems.

Mukai and Ohnishi used a gyroscope sensor to discriminate rotation and translation effects on the image and improve the accuracy of 3D shape recovery.17,18 In ref. 19 Kurazume and Hirose used inertial sensors for image stabilization and attitude estimation of remote legged robots.

Virtual reality modelling and augmented reality are strong applications for inertial aided vision systems. Coorg et al.20 use mosaicing and other techniques to perform automated three-dimensional modeling of urban environments using pose imagery (i.e., images with known orientation and position obtained by inertial sensors and GPS). A hybrid inertial and vision tracking algorithm for augmented reality registration was proposed by Suya You et al.21 Hoff et al. used a head mount system with inertial sensors and cameras, providing 3-D motion and structure estimation for augmented reality.22,23

A vision system for automated vehicles built by Dickmanns et al. has also incorporated inertial sensors.24 The vision feature trackers use feedback from the inertial estimated state, which has negligible time delays and includes perturbations that must be taken into account by the vision system.

2. DATA FROM CAMERA SENSOR

Cameras can be seen as ray direction measuring devices. The pinhole camera model considers one center of projection, where all rays originating from world points converge. The image will be equivalent to a plane cutting that pencil of rays, projecting images of world points onto a plane. If we consider a unit sphere around the optical center, we can model the images as being formed on its surface. The image plane can be seen as a plane tangent to a sphere of radius f, the camera's focal distance, concentric with the unit sphere, as shown in Figure 1. Using the unit sphere gives an interesting model for central perspective and provides an intuitive visualization of projective geometry.25,26 It also has numerical advantages when considering points at infinity, such as vanishing points.

2.1. Image Points

A world point P_i will project on the image plane as p_i and can be represented by the unit vector m_i placed at the sphere's center, the optical center of the camera. With image centered coordinates p_i = (x_i, y_i) we have

P_i \rightarrow m_i = \frac{P_i}{\|P_i\|} = \frac{1}{\sqrt{x_i^2 + y_i^2 + f^2}} \begin{pmatrix} x_i \\ y_i \\ f \end{pmatrix} .   (1)

To avoid ambiguity, m_i is forced to be positive, so that only points on the image side hemisphere are considered.
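
As an illustration of Eq. (1), a minimal numpy sketch (the function name and the example values are ours; f is assumed to be expressed in the same units as the image centered coordinates):

```python
import numpy as np

def project_to_unit_sphere(x_i, y_i, f):
    # Map image centered coordinates (x_i, y_i) to the unit sphere vector m_i of Eq. (1).
    m = np.array([x_i, y_i, f], dtype=float)
    m /= np.linalg.norm(m)            # 1 / sqrt(x_i^2 + y_i^2 + f^2)
    return m if m[2] > 0 else -m      # keep the image side hemisphere

# the principal point maps to the optical axis
print(project_to_unit_sphere(0.0, 0.0, 6.0))    # [0. 0. 1.]
```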

2.2. Image Lines

Image lines can also be represented in a similar way. Any image line defines a plane with the center of projection, as seen in Figure 1. A vector n normal to this plane uniquely defines the image line and can be used to represent the line. For a given image line ax + by + c = 0, the unit vector is given by

n = \frac{1}{\sqrt{a^2 + b^2 + (c/f)^2}} \begin{pmatrix} a \\ b \\ c/f \end{pmatrix} .   (2)

As seen in Figure 1, we can write the unit vector of an image line with points m_1 and m_2 as

n = m_1 \times m_2 .   (3)

2.3. Vanishing Points

Since the perspective projection maps the 3D world onto a planar surface, phenomena that only occur at infinity will project to very finite locations in the image. Parallel lines only meet at infinity, but in the image plane the point where they meet can be quite visible, and is called the vanishing point of that set of parallel lines.

Figure 1. Point projection onto unit sphere.

A space line with the orientation of a unit vector m has, when projected, a vanishing point with unit sphere vector ±m. The vanishing point of a set of 3D parallel lines with image lines n_1 and n_2 is given by

m = n_1 \times n_2 .   (4)
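
A small sketch of Eqs. (2)-(4) under the same assumptions (function names and the example line coefficients are ours):

```python
import numpy as np

def line_unit_vector(a, b, c, f):
    # Unit normal n of the plane defined by the image line ax + by + c = 0, Eq. (2).
    n = np.array([a, b, c / f], dtype=float)
    return n / np.linalg.norm(n)

def vanishing_point(n1, n2):
    # Unit sphere direction of the vanishing point of two image lines that are
    # projections of parallel 3D lines, Eq. (4); the result is defined up to sign.
    m = np.cross(n1, n2)
    return m / np.linalg.norm(m)

# two image lines of (nearly) vertical scene edges meet at the vertical vanishing point
n1 = line_unit_vector(1.0, 0.05, -200.0, 800.0)
n2 = line_unit_vector(1.0, -0.05, 150.0, 800.0)
print(vanishing_point(n1, n2))
```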

2.4. Ego-motion and Spherical Motion Field

When the camera sensor moves relative to the observed scene, image features will have a corresponding motion across the image. Using a spherical model, data from different camera configurations, such as omnidirectional images from catadioptric mirrors or several cameras with a common center of projection, can be incorporated into a unified model, with better spatial observability.

If the camera experiences a pure rotation ω, the fixed world point P_i given in the camera referential will have a motion vector given by

\dot{P}_i = -\omega \times P_i   (5)

as shown in Figure 2. The world point after the rotation, P_i', is given by P_i + \dot{P}_i. The unit sphere point after the rotation, m_i', is given by

m_i' = \frac{P_i + \dot{P}_i}{\|P_i + \dot{P}_i\|} = m_i + \dot{m}_i .   (6)

Since the rotation is centered in the camera projective center, the induced image motion does not depend on the 3D point depth.

Figure 2. Projected unit sphere point motion with camera pure rotation.

If the camera experiences both rotation ω and translation t, the fixed world point P_i given in the camera referential will have a motion vector given by

\dot{P}_i = -t - \omega \times P_i   (7)

as shown in Figure 3. Projecting onto the unit sphere as before, the motion field on the unit sphere \dot{m}_i is given by

\dot{m}_i = \frac{1}{\|P_i\|} \left( (t \cdot m_i)\, m_i - t \right) - \omega \times m_i .   (8)

This equation describes the velocity vector \dot{m}_i for a given unit sphere point m_i as a function of camera ego-motion (t, ω) and depth \|P_i\|.
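
A direct transcription of Eq. (8), useful for checking signs and units (the function name and the numbers in the example are ours):

```python
import numpy as np

def sphere_motion_field(m_i, depth, t, omega):
    # Velocity of the unit sphere point m_i under ego-motion (t, omega), Eq. (8);
    # depth is ||P_i||, the range of the 3D point along the ray m_i.
    translational = (np.dot(t, m_i) * m_i - t) / depth
    rotational = -np.cross(omega, m_i)
    return translational + rotational

m = np.array([0.0, 0.0, 1.0])    # point on the optical axis
print(sphere_motion_field(m, 5.0, np.array([0.1, 0.0, 0.0]), np.array([0.0, 0.02, 0.0])))
```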

3. DATA FROM INERTIAL SENSORS

At the most basic level, an inertial system simply performs a double integration of sensed acceleration over time to estimate position. But if body rotations occur, they must be taken into account. The measured accelerations are given in the body frame of reference, initially aligned with the navigation frame of reference. In strapdown systems the gyros measure the body rotation rate, and the sensed accelerations are computationally converted to the navigation frame of reference. Figure 4 shows a block diagram of a strapdown inertial navigation system. The system has an inertial measurement unit (IMU) with 3D orthogonal sets of accelerometers and gyrometers. Table I summarizes the data that can be obtained from the inertial sensors.

Figure 3. Projected unit sphere point motion with camera translation.

High grade sensors are required for inertial navigation, and low-cost MEMS inertial sensors offer low performance. Some assumptions can be made on the system's dynamics to cope with the accumulated drift. If the norm of the sensed acceleration is about 9.8 m.s⁻², then we can assume that the accelerometers only measure g, and the attitude can be directly determined, resetting the accumulated drift in the attitude computation. A low threshold can also be applied to the system, assuming that the system never accelerates or rotates below a certain value, preventing the error build up.
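
A sketch of this drift reset heuristic, assuming gyro rates and accelerometer readings already expressed in a common body frame; the tolerance value is illustrative, not a figure from the paper:

```python
import numpy as np

G = 9.8            # m/s^2
TOLERANCE = 0.05   # illustrative threshold on | ||a|| - g |

def update_vertical(n_est, omega, accel, dt):
    # Propagate the vertical estimate with the gyro rate (a world-fixed direction
    # rotates with -omega in the body frame) and renormalize.
    n_est = n_est - np.cross(omega, n_est) * dt
    n_est /= np.linalg.norm(n_est)
    # When the sensed acceleration is close to gravity in norm, assume the
    # accelerometers only measure g and reset the accumulated drift.
    if abs(np.linalg.norm(accel) - G) < TOLERANCE:
        n_est = accel / np.linalg.norm(accel)
    return n_est
```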

It is interesting to notice that human inertial sensing has a performance similar to currently available low-cost inertial sensors. Measuring the actual vestibular perceptual thresholds is difficult; they are determined by many factors, such as mental concentration, fatigue, and other stimuli capturing the attention, and vary from person to person.27 Reasonable threshold values for perception of rotation are 0.14, 0.5, and 0.5 deg.s⁻² for yaw, roll, and pitch motions, respectively. Values of 0.01 g for vertical and 0.006 g for horizontal acceleration are appropriate representative thresholds for perceptible intensity of linear acceleration. These are valid for sustained and relatively low frequency stimuli.

The currently available low cost inertial sensors are capable of similar performances.28 The inertial system prototype built for this work, using low cost sensors, has gyros with 0.1 deg.s⁻¹ resolution and accelerometers with 0.005 g resolution. Notice that the gyros measure angular velocity and not angular acceleration.

Figure 4. Simplified strapdown inertial navigation system.

Table I. Data from inertial sensors.

  d/dt     angular acceleration α = dω/dt     rate of linear acceleration (jerk) j = da/dt
  sensed   angular velocity ω                 linear acceleration + gravity a + g = d²x/dt² + g
  ∫ dt     angular position (attitude) θ      linear velocity v = dx/dt
  ∫∫ dt                                       position x

These performances are not suitable for stand-alone inertial navigation, but combined with vision cues they contribute to human spatial orientation and body equilibrium. The inertial cues enhance the performance of the vision system in gaze stabilization, tracking, and visual navigation.

4. COMBINING STATIC INERTIAL CUES WITH VISION

4.1. Vertical Reference from Gravity

The measurements taken by the inertial unit's accelerometers include the sensed gravity vector summed with the body's acceleration. When the system is motionless, or subject to constant speed, gravity provides a vertical reference for the camera system frame of reference, given by

n = \frac{a}{\|a\|} ,   (9)

where a is the sensed acceleration, in this case the reactive (upward) force to gravity. By performing the rotation update using the IMU gyro data, gravity can be separated from the sensed acceleration. In this case n is given by the rotation update, but must be monitored using the low-pass filtered accelerometer signals, for which the above equation still holds, to reset the accumulated drift.

The vertical unit vector is given in the IMU referential, and has to be converted to the camera referential. Only the rotation is relevant, and it can be calibrated as described below.

4.2. Rotation between IMU and Camera

Figure 5 shows the several frames of reference that need to be considered. The inertial measurements have to be mapped to the camera frame of reference. If the alignment between them is unknown, calibration is required.

Both sensors can be used to measure the vertical direction, so that the rigid transformation between the IMU frame of reference {I} and the camera frame of reference {C} can be determined. When the IMU sensed acceleration is equal in magnitude to gravity, the sensed direction is the vertical. The camera vertical direction can be taken from the vanishing point of either a specific calibration target, such as a chessboard placed vertically, or from some known scene vertical edges. However, camera calibration is required to obtain the correct 3D orientation of the vanishing points.

If n observations are made for distinct camera positions, recording the vertical reference provided by the inertial sensors and the vanishing point of scene vertical features, the absolute orientation can be determined using Horn's method.29 Since we are only observing a 3D direction in space, we can only determine the rotation between the two frames of reference.

Figure 5. Camera {C}, IMU {I}, mobile system {N}, and world fixed {W} frames of reference.

Let ^I v_i be a measurement of the vertical by the inertial sensors and ^C v_i the corresponding measurement made by the camera, derived from some scene vanishing point. We want to determine the unit quaternion q that rotates inertial measurements in the inertial sensor frame of reference {I} to the camera frame of reference {C}. In the following equations, when multiplying vectors with quaternions, the corresponding imaginary quaternions are implied. We want to find the unit quaternion q that maximizes

\sum_{i=1}^{n} \left( q \, {}^{I}v_i \, q^{*} \right) \cdot {}^{C}v_i .   (10)

Expressing the quaternion product q v_i as a matrix multiplication V_i q, after some manipulation we get

\sum_{i=1}^{n} q^{T} \, {}^{I}V_i^{T} \, {}^{C}V_i \, q ;   (11)

factoring out q we get

q^{T} \left( \sum_{i=1}^{n} {}^{I}V_i^{T} \, {}^{C}V_i \right) q .   (12)

So we want to find q such that

\max \; q^{T} N q ,   (13)

where N = \sum_{i=1}^{n} {}^{I}V_i^{T} \, {}^{C}V_i. Matrix N can be expressed using the sums, for all i, of all nine pairing products of the components of the two vectors ^I v_i and ^C v_i. The sums contain all the information that is required to find the solution. Since N is a symmetric matrix, the solution to this problem is the four-vector q_max corresponding to the largest eigenvalue λ_max of N (see ref. 29 for details). Results of this calibration method are presented in ref. 30.
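
A sketch of this calibration step using the explicit 4×4 form of Horn's matrix N built from the nine pairing products (ref. 29); the function name and array layout are ours, and the eigenvector of the largest eigenvalue is taken with numpy:

```python
import numpy as np

def imu_to_camera_rotation(v_imu, v_cam):
    # v_imu, v_cam: (n, 3) arrays of paired unit vertical directions observed by the
    # IMU and by the camera (from vanishing points) at n distinct poses.
    # Returns the unit quaternion q = (q0, qx, qy, qz) rotating {I} vectors into {C}.
    S = v_imu.T @ v_cam                      # 3x3 matrix of the nine pairing product sums
    Sxx, Sxy, Sxz = S[0]
    Syx, Syy, Syz = S[1]
    Szx, Szy, Szz = S[2]
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy,        Szx - Sxz,        Sxy - Syx],
        [Syz - Szy,       Sxx - Syy - Szz,  Sxy + Syx,        Szx + Sxz],
        [Szx - Sxz,       Sxy + Syx,       -Sxx + Syy - Szz,  Syz + Szy],
        [Sxy - Syx,       Szx + Sxz,        Syz + Szy,       -Sxx - Syy + Szz]])
    eigvals, eigvecs = np.linalg.eigh(N)     # symmetric matrix: ascending eigenvalues
    return eigvecs[:, -1]                    # eigenvector of the largest eigenvalue
```

With observations from a single pose the largest eigenvalue is degenerate (any rotation about the vertical fits equally well), which reflects the remark above that several distinct camera positions are needed.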

4.3. Using the Inertial Vertical Reference

The vertical reference n corresponds to the north pole of the unit sphere. A set of world vertical features will project to image lines n_i with a common vanishing point m_vp = n.

Given a single image vanishing point vp = (x, y) of a levelled plane, the horizon line is given by

n_x x + n_y y + n_z f = 0 ,   (14)

where f is the focal distance and n = (n_x, n_y, n_z)^T. Since the vanishing line is determined solely by the orientation of the planar surface, the horizon line is the vanishing line of all levelled planes, parallel to the ground plane.

If a ground plane world point P, given in the camera frame of reference {C}, is known, the plane equation can be determined and is given by

n \cdot P + d = 0 ,   (15)

where d is the distance from the origin to the ground plane, i.e., the system height. In some applications it can be known or imposed by the physical mount, or determined using stereo as shown below.
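
Both quantities are immediate to compute once n is known; a small sketch under the notation of Eqs. (14) and (15) (function names are ours):

```python
import numpy as np

def horizon_line(n, f):
    # Coefficients (a, b, c) of the horizon line a x + b y + c = 0 in image
    # centered coordinates, from Eq. (14): n_x x + n_y y + n_z f = 0.
    return n[0], n[1], n[2] * f

def ground_plane_height(n, P):
    # Height d of Eq. (15), n.P + d = 0, from a known ground point P given in {C}.
    return -float(np.dot(n, P))
```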

When detecting world features, a convenient frame of reference has to be established. A moving robot navigation frame of reference {N} can be considered, aligned by the ground plane as shown in Figure 5. The vertical unit vector n and system height d can be used to define {N}; by choosing ^N x to be coplanar with ^C x and ^C n in order to keep the same heading,28 we have

{}^{N}P = {}^{N}T_{C} \, {}^{C}P ,   (16)

where

{}^{N}T_{C} = \begin{bmatrix} \sqrt{1-n_x^2} & \frac{-n_x n_y}{\sqrt{1-n_x^2}} & \frac{-n_x n_z}{\sqrt{1-n_x^2}} & 0 \\ 0 & \frac{n_z}{\sqrt{1-n_x^2}} & \frac{-n_y}{\sqrt{1-n_x^2}} & 0 \\ n_x & n_y & n_z & d \\ 0 & 0 & 0 & 1 \end{bmatrix} .   (17)

If a heading reference is available, then {N} should not be restricted to having ^N x coplanar with ^C x and ^C n, but should use the known heading.28 Using the robot's odometry, the inertial sensors, and landmark matching, conversion to the world fixed frame of reference {W} can be accomplished.
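
A sketch of Eq. (17) as a function of n and d, with a hypothetical example point; it assumes |n_x| < 1 so that ^N x can be chosen coplanar with ^C x and n:

```python
import numpy as np

def navigation_from_camera(n, d):
    # Homogeneous transform N_T_C of Eq. (17) mapping {C} points into the
    # ground-aligned navigation frame {N}, given the vertical n and height d.
    nx, ny, nz = n
    s = np.sqrt(1.0 - nx * nx)
    return np.array([
        [s,   -nx * ny / s, -nx * nz / s, 0.0],
        [0.0,  nz / s,      -ny / s,      0.0],
        [nx,   ny,           nz,          d  ],
        [0.0,  0.0,          0.0,         1.0]])

# a point lying on the ground plane maps to z = 0 in {N}
T = navigation_from_camera(np.array([0.0, -1.0, 0.0]), 1.2)
print(T @ np.array([0.5, 1.2, 3.0, 1.0]))    # -> [0.5 3.  0.  1. ]
```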

4.4. Stereo Depth Map Alignment Using Vertical Reference

Stereo vision systems can use correlation based methods to obtain depth maps. With the current technology, real time systems are commercially available. When the vision system is moving, the maps have to be fused into a single world map. Before fusing the depth maps, they must be registered to a common referential. This can be done using data fitting alone, or aided by known parameters or restrictions on the way the measurements were made.

Figure 6. Frames of reference and stereo vision system with inertial measurement unit.


In order to obtain depth maps with known vision system pose, the stereo vision system was mounted onto an inertial measurement unit (IMU), as shown in Figure 6. To compute range from stereo images we are using the SRI stereo engine31 with the small vision system (SVS) software and the MEGA-D digital stereo head, shown in Figure 6.

The depth maps are given in the right camera frame of reference, with the Z axis pointing forward along the optical axis. The depth map is given by a pencil of rays with known depth from the origin (Figure 7).

Using the vertical reference, the depth maps can be segmented to identify horizontal and vertical features. The aim is to have a simple algorithm suitable for a real-time implementation. Since we are able to map the points to an inertial reference frame, planar levelled patches will have the same depth z, and vertical features the same xy, allowing simple feature segmentation using histogram local peak detection. Figure 8 summarizes the proposed depth map segmentation method.

Figure 7. Observed scene and depth map obtained with SVS.31

The depth map points are mapped to the world frame of reference. In order to detect the ground plane, a histogram is performed for the different heights. The histogram's lower local peak, z_gnd, is used as the reference height for the ground plane. Figure 9 shows some results of ground plane detection and depth map rectification.
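
A possible implementation of the height histogram step; the bin size and peak threshold are illustrative choices, not values from the paper:

```python
import numpy as np

def ground_height(points_N, bin_size=0.05, min_count=50):
    # points_N: (n, 3) depth map points already mapped to the vertically aligned frame.
    # Returns z_gnd, the lowest sufficiently strong local peak of the height histogram.
    z = points_N[:, 2]
    edges = np.arange(z.min(), z.max() + bin_size, bin_size)
    hist, edges = np.histogram(z, bins=edges)
    for i in range(1, len(hist) - 1):
        if hist[i] >= min_count and hist[i] >= hist[i - 1] and hist[i] >= hist[i + 1]:
            return 0.5 * (edges[i] + edges[i + 1])
    return None
```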

5. COMBINING DYNAMIC INERTIAL CUES WITH VISION

Inertial sensors can only provide direct measurements of angular velocity ω and, after subtracting gravity, linear acceleration a. The angular position required to subtract gravity is obtained from integration over time, with unbounded error buildup. Linear velocity and position are again obtained by integration over time, with the associated error accumulation. As previously described, some heuristics can be applied to reset or bound some of the error drift.

From (8) we see how the velocities t and ω are projected onto the image. Inertial angular velocity measurements, being direct, should be used with more confidence than the integrated linear velocity.

5.1. Image Focus of Expansion

When the camera is moving with linear velocity t and not rotating, from (8) we see that the image point

m_{FOE} = \frac{t}{\|t\|}   (18)

will have no motion, i.e., \dot{m}_{FOE} = 0, and all others will be expanding or contracting to this point.

Figure 8. Summary of depth map vertical alignment method.


Figure 9. On the left, the graphical front-end of the implemented system, showing the height histogram for ground plane detection, the detected plane, and the 3D segmented depth map; on the right, the top and front views of the aligned segmented depth maps, which only require a translation (t_x, t_y) and rotation θ to be correctly fused.

This point is known as the image focus of expansion (FOE). When the system is also rotating, the FOE will have the depth independent velocity

\dot{m}_{FOE} = -\omega \times m_{FOE} = -\omega \times \frac{t}{\|t\|} .   (19)

The FOE can be found using the inertial data alone, provided that the system has been calibrated.

5.2. Image Center of Rotation

When the camera is moving with angular velocity ω and no linear translation, from (8) we see that the image point

m_{COR} = \frac{\omega}{\|\omega\|}   (20)

will have no motion, i.e., \dot{m}_{COR} = 0, and all others will be rotating around this point. This point is known as the image center of rotation (COR). When the system is also translating at velocity t, the COR will have the depth dependent velocity

\dot{m}_{COR} = \frac{1}{\|P_{COR}\|} \left( (t \cdot m_{COR})\, m_{COR} - t \right) ,   (21)

where P_{COR} is the 3D point in view along the image ray given by m_{COR}. The COR can be easily defined using the inertial data alone, provided that the system has been calibrated using the procedure described in Section 4. The definition of the FOE and COR can be useful during visual based navigation tasks.
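
A sketch of how the FOE and COR image locations could be obtained from the inertial sensed velocities alone, once the rotation between {I} and {C} has been calibrated; the projection helper and the example values are ours:

```python
import numpy as np

def focus_of_expansion(t):
    # Unit sphere direction of the FOE, Eq. (18).
    return t / np.linalg.norm(t)

def center_of_rotation(omega):
    # Unit sphere direction of the COR, Eq. (20).
    return omega / np.linalg.norm(omega)

def to_image(m, f):
    # Image centered coordinates of a unit sphere direction on the image side (m_z > 0).
    return f * m[0] / m[2], f * m[1] / m[2]

t = np.array([0.05, 0.0, 0.9])       # linear velocity in the camera frame (assumed values)
omega = np.array([0.01, 0.02, 0.2])  # angular velocity in the camera frame (assumed values)
print(to_image(focus_of_expansion(t), f=800.0))
print(to_image(center_of_rotation(omega), f=800.0))
```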

5.3. Registering Depth Maps

With the vision system moving, the acquired depth maps have to be registered to a common frame of reference. After the alignment using the vertical reference and subsequent ground plane detection, the registration is a 2D problem; only a translation (t_x, t_y) and rotation θ are needed (see Figure 9).

An approximation to these 2D parameters can be found by projecting the inertial sensed parameters onto the level plane. These allow registering dynamic depth maps, with moving objects, to a common frame of reference.
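
A sketch of how the inertial approximation might seed the 2D registration; the refinement routine is left abstract (any 2D registration method, e.g. an ICP variant, could be plugged in), and all names are ours:

```python
import numpy as np

def apply_2d(points_xy, tx, ty, theta):
    # Apply the 2D rigid transform (tx, ty, theta) to ground-aligned map points.
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return points_xy @ R.T + np.array([tx, ty])

def register_maps(new_map_xy, reference_map_xy, inertial_guess, refine):
    # inertial_guess = (tx, ty, theta): the inertial sensed ego-motion projected
    # onto the level plane; refine(points, reference) performs the final fit.
    seeded = apply_2d(new_map_xy, *inertial_guess)
    return refine(seeded, reference_map_xy)
```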

6. CONCLUSIONS

This paper has presented a framework for the combination of inertial and visual sensing modalities.

Keeping track of the vertical direction provides a valuable spatial reference. Results were shown of stereo depth map alignment using the vertical reference. The depth map points are mapped to a vertically aligned world frame of reference. In order to detect the ground plane, a histogram is performed for the different heights. Taking the ground plane as a reference plane for the acquired maps, the fusion of multiple maps reduces to a 2D translation and rotation problem. The dynamic inertial cues can be used as a first approximation for this transformation, allowing a fast depth map registration method.

The definition of the FOE and COR can be done from the inertial cues, and used during visual based navigation tasks.

Future work will address the fusion of optical flow computation with the inertial ego-motion estimate. The image optical flow imposes a further restriction to bound the drift in the inertial estimated ego-motion. The computed depth from flow and known ego-motion can be combined with the stereo correlation computed depth, producing a more robust 3D reconstruction technique.

REFERENCES

1. H. Carpenter, Movements of the eyes, 2nd ed., Pion Limited, London, 1988.

2. A. Berthoz, The brain's sense of movement, Harvard UP, Cambridge, MA, 2000.

3. J. Lobo and J. Dias, Vision and inertial sensor cooperation, using gravity as a vertical reference, IEEE Trans Pattern Anal Mach Intell 25:(12) (2003), in press.

4. J. Lobo, C. Queiroz, and J. Dias, World feature detection and mapping using stereovision and inertial sensors, Robot Auton Syst 44:(1) (2003), 69–81.

5. R.P.G. Collinson, Introduction to avionics, Chapman & Hall, New York, 1996.

6. G.R. Pitman, Inertial guidance, Wiley, New York, 1962.

7. J.J. Allen, R.D. Kinney, J. Sarsfield, M.R. Daily, J.R. Ellis, J.H. Smith, S. Montague, R.T. Howe, B.E. Boser, R. Horowitz, A.P. Pisano, M.A. Lemkin, W.A. Clark, and T. Juneau, Integrated micro-electro-mechanical sensor development for inertial applications, in Proc 1998 Position Location and Navigation Symposium, April 1998.

8. T. Vieville and O.D. Faugeras, Computation of inertial information on a robot, in Fifth International Symposium on Robotics Research, edited by H. Miura and S. Arimoto, MIT Press, Cambridge, 1989, pp. 57–65.

9. T. Vieville and O.D. Faugeras, Cooperation of the inertial and visual systems, in Traditional and non-traditional robotic sensors, vol. F 63 of NATO ASI, edited by T.C. Henderson, Springer-Verlag, Berlin, 1990, pp. 339–350.

10. T. Vieville, F. Romann, B. Hotz, H. Mathieu, M. Buffa, L. Robert, P.E.D.S. Facao, O. Faugeras, and J.T. Audren, Autonomous navigation of a mobile robot using inertial and visual cues, in Intelligent robots and systems, edited by M. Kikode, T. Sato, and K. Tatsuno, Yokohama, 1993.

11. T. Vieville, E. Clergue, and P.E.D. Facao, Computation of ego-motion and structure from visual and inertial sensor using the vertical cue, in ICCV93, 1993, pp. 591–598.

12. T. Vieville, A few steps towards 3D active vision, Springer-Verlag, New York, 1997.

13. B. Bhanu, B. Roberts, and J. Ming, Inertial navigation sensor integrated motion analysis for obstacle detection, in Proc 1990 IEEE Int Conf on Robotics and Automation, Cincinnati, OH, 1990, pp. 954–959.

14. F. Panerai and G. Sandini, Visual and inertial integration for gaze stabilization, in Proc SIRS'97, Stockholm, 1997.

15. F. Panerai and G. Sandini, Oculo-motor stabilization reflexes: integration of inertial and visual information, Neural Networks 11:(7-8) (1998), 1191–1204.

16. F. Panerai, G. Metta, and G. Sandini, Visuo-inertial stabilization in space-variant binocular systems, Robot Auton Syst 30:(1-2) (2000), 195–214.

17. T. Mukai and N. Ohnishi, The recovery of object shape and camera motion using a sensing system with a video camera and a gyro sensor, in Proc Seventh Int Conf on Computer Vision (ICCV'99), Kerkyra, Greece, September 1999, pp. 411–417.

18. T. Mukai and N. Ohnishi, Object shape and camera motion recovery using sensor fusion of a video camera and a gyro sensor, Inf Fusion 1:(1) (2000), 15–53.

19. R. Kurazume and S. Hirose, Development of image stabilization system for remote operation of walking robots, in Proc 2000 IEEE Int Conf on Robotics and Automation, San Francisco, CA, April 2000, pp. 1856–1860.

20. S.R. Coorg, Pose imagery and automated three-dimensional modeling of urban environments, PhD thesis, Massachusetts Institute of Technology, September 1998.

21. S. You, U. Neumann, and R. Azuma, Hybrid inertial and vision tracking for augmented reality registration, in Proc IEEE Virtual Reality '99, Houston, Texas, March 1999, pp. 260–267.

22. W.A. Hoff, K. Nguyen, and T. Lyon, Computer vision-based registration techniques for augmented reality, in Proc of Intelligent Robots and Computer Vision, November 1996, pp. 538–548.

23. L. Chai, W.A. Hoff, and T. Vincent, 3-D motion and structure estimation using inertial sensors and computer vision for augmented reality, Presence: Teleop Virt Environ 11:(5) (2002), 474–492.

24. E.D. Dickmanns, Vehicles capable of dynamic vision: a new breed of technical beings? Artif Intell 103 (1998), 49–76.

25. K. Kanatani, Geometric computation for machine vision, Oxford UP, Oxford, 1993.

26. J. Stolfi, Oriented projective geometry, a framework for geometric computations, Academic Press, Boston, 1991.

27. K.K. Gillingham and F.H. Previc, Spatial orientation in flight, 2nd ed., chapter 11, Williams and Wilkins, Baltimore, 1996.

28. J. Lobo, Inertial sensor data integration in computer vision systems, Master's thesis, University of Coimbra, April 2002.

29. B.K.P. Horn, Closed-form solution of absolute orientation using unit quaternions, J Opt Soc Am A 4:(4) (1987), 629–642.

30. J. Alves, J. Lobo, and J. Dias, Camera-inertial sensor modelling and alignment for visual navigation, in Proc 11th Int Conf on Advanced Robotics, Coimbra, Portugal, July 2003, pp. 1693–1698.

31. K. Konolige, Small vision systems: hardware and implementation, in 8th Int Symposium on Robotics Research, Hayama, Japan, October 1997.

