Page 1: Image-based ICP Algorithm for Visual Odometry Using a RGB-D Sensor in a Dynamic ...rit.kaist.ac.kr/home/International_Conference?action=... · 2012-12-21 · environment (consisting

Image-based ICP Algorithm for Visual Odometry Using a RGB-D Sensor in a Dynamic Environment

Deok-Hwa Kim and Jong-Hwan Kim
Department of Robotics Program, KAIST
335 Gwahangno, Yuseong-gu, Daejeon 305-701, Republic of Korea
[email protected], [email protected]

Abstract. This paper proposes a novel approach to calculating visual odometry with a Microsoft Kinect, incorporating depth information into RGB color information to generate 3D feature points based on the speeded-up robust features (SURF) descriptor. In particular, the generated 3D feature points are used in the iterative closest point (ICP) algorithm computed between successive images from the sensor. Unlike previous approaches, the ICP algorithm works on the image information of the features; this paper thus suggests a modified version of a state-of-the-art implementation of the ICP algorithm. This approach enables accurate calculation of the rigid-body transformation matrix for visual odometry in a dynamic environment: in the calculation step, dynamically moving features can be separated out as outliers, which are then filtered with the random sample consensus (RANSAC) algorithm. The experiments demonstrate that visual odometry is successfully obtained using the proposed algorithm in a dynamic environment.

Keywords: Visual Odometry, RGB-D Sensor, Iterative Closest Point Algorithm, Dynamic Environments, SLAM

1 Introduction

Wheeled mobile robots use their encoder values for odometry information. However, aerial vehicles and humanoid robots cannot use their encoder values directly for odometry. Because of this constraint, visual odometry has become more important than other sensor modalities in these fields. Indeed, the robotics and computer vision communities have developed many techniques and algorithms for 3D mapping and visual odometry using range scanners [5, 6], stereo cameras [3], monocular cameras [4, 7], and RGB-D sensors [1, 2, 10].

Most visual odometry systems require the spatial alignment of successive camera frames. To deal with this alignment problem, the ICP algorithm [11] has been used. The algorithm minimizes the difference between two sets of points and is often employed to reconstruct 2D or 3D surfaces from different scans.

If there are unreliable points in a set of features, a system using those points will be unstable. To solve this problem, the RANSAC algorithm, a simple and effective method, has been widely used [12].

There have been many studies on visual odometry using an RGB-D sensor based on the ICP and RANSAC algorithms. However, those studies have not considered dynamic environments. With dynamic-environment applications in mind, this paper proposes a novel approach to calculating visual odometry with a modified ICP algorithm based on image information.

2 Visual Odometry System Using an RGB-D Sensor

Many researchers have proposed visual odometry systems using an RGB-D sensor, and those systems follow a similar algorithm [1, 10]. This chapter introduces the algorithm for obtaining visual odometry information, which proceeds in five steps. First, feature points must be extracted; the visual odometry system works on rotation-invariant feature points. For feature detection, the SURF algorithm [8] is employed. The SURF algorithm involves many parallelizable computations, so it can be implemented on a graphics processing unit (GPU) for higher speed. The SURF algorithm yields the feature information, such as feature position and feature scale. After the features are extracted, a feature matching algorithm is run between the prior frame and the current frame. The next step toward odometry information is 3D reconstruction using the camera intrinsic parameters (focal lengths and principal point) and the depth information; this follows easily from the proportional relation between the focal length and the depth. After that step, inlier detection is performed by re-projection using the homography between the prior image plane and the current image plane; the homography is found with the RANSAC algorithm. This step refines the matched features.

Fig. 1. The image matching result with the GPU SURF algorithm.
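The 3D reconstruction step described above, back-projecting a pixel and its measured depth through the pinhole proportions, can be sketched as follows. The intrinsic values are illustrative (in the ballpark of a 640x480 Kinect color camera), not taken from the paper:

```python
import numpy as np

def back_project(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with measured depth z into a 3D point in the
    camera frame, using the pinhole proportions x/z = (u - cx)/fx and
    y/z = (v - cy)/fy."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Illustrative intrinsics (assumed values, not from the paper).
fx = fy = 525.0
cx, cy = 319.5, 239.5

p = back_project(400.0, 300.0, 2.0, fx, fy, cx, cy)
# Re-projecting the 3D point recovers the original pixel: u = fx * x / z + cx
u_back = fx * p[0] / p[2] + cx
```

The round trip (back-project, then re-project) is a quick sanity check that the proportional relation is applied consistently.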

Using the matched features, the rigid-body transformation matrix [9] is calculated. It is computed with the singular value decomposition (SVD) method by decomposing the cross-dispersion matrix [C]. The cross-dispersion matrix is computed from

    [C] = (1/N) Σ_{i=1}^{N} (p_i - p̄)(q_i - q̄)^T ,

where p_i is the position of the i-th point measured in the prior image; q_i is the position of the i-th point measured in the current image; p̄ is the mean position in the prior image; and q̄ is the mean position in the current image. [C] can be decomposed into [U], [W], [V] by SVD, where [U] and [V] are orthogonal matrices and [W] is the diagonal matrix containing the singular values of [C]. For the rigid-body transformation matrix, the rotation matrix is computed from

    [R] = [V][U]^T ,

and the translation vector is estimated by

    t = q̄ - [R] p̄ .

Through the combination of the [R] matrix and the t vector, the rigid-body transformation matrix can be computed. Based on this matrix, the odometry information is estimated.
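As a sketch of this SVD-based estimation, the following NumPy snippet recovers (R, t) from exact 3D correspondences. The reflection guard diag(1, 1, det(V U^T)) follows the standard Kabsch/Challis procedure [9]; the variable names are mine, not the paper's, and the 1/N factor is dropped since it does not affect the decomposition:

```python
import numpy as np

def rigid_transform(P, Q):
    """Least-squares (R, t) with Q_i ~ R @ P_i + t from matched 3D points,
    via SVD of the cross-dispersion matrix."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    C = (P - p_bar).T @ (Q - q_bar)      # cross-dispersion matrix (sum form)
    U, W, Vt = np.linalg.svd(C)
    # diag(1, 1, det(V U^T)) guards against a reflection solution
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                   # R = V D U^T
    t = q_bar - R @ p_bar
    return R, t

# Recover a known motion from exact correspondences.
rng = np.random.default_rng(1)
a = 0.3                                  # rotation of 0.3 rad about the z-axis
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.1, -0.2, 0.5])
P = rng.random((20, 3))
Q = P @ R_true.T + t_true
R, t = rigid_transform(P, Q)
```

With noise-free matches the recovered (R, t) equals the true motion up to floating-point error.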

For a more accurate estimate of the transformation matrix, the RANSAC algorithm is applied. The result of this procedure is used as the initial rigid-body transformation matrix. However, errors still remain in this transformation matrix, so a further compensation algorithm is needed.

Fig. 2. The odometry information from up, forward, backward, and down movements.
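A minimal sketch of the RANSAC refinement described above: fit (R, t) to random minimal samples of 3 matches, keep the hypothesis with the most inliers, then refit on all inliers. The helper name `fit_rt`, the inlier threshold, and the iteration count are illustrative assumptions, not the paper's values:

```python
import numpy as np

def fit_rt(P, Q):
    """SVD fit of Q_i ~ R @ P_i + t from paired 3D points."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - p_bar).T @ (Q - q_bar))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, q_bar - R @ p_bar

def ransac_rigid(P, Q, iters=300, thresh=0.05, seed=0):
    """RANSAC over the rigid transform: hypothesize from minimal samples,
    score by inlier count, refit on the best consensus set."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(P), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(P), size=3, replace=False)
        R, t = fit_rt(P[idx], Q[idx])
        inliers = np.linalg.norm(P @ R.T + t - Q, axis=1) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    R, t = fit_rt(P[best], Q[best])
    return R, t, best

# Static scene moved by a known motion, plus 5 "dynamic" matches.
rng = np.random.default_rng(2)
a = 0.2
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.3, 0.1, -0.2])
P = rng.random((40, 3))
Q = P @ R_true.T + t_true
Q[:5] += rng.random((5, 3)) + 0.5        # outliers: independently moving features
R, t, inliers = ransac_rigid(P, Q)
```

The five corrupted matches play the role of dynamically moving features: they fall outside the consensus set and no longer bias the estimated motion.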

To further compensate the rigid-body motion, this paper uses the ICP algorithm, which has been widely used for 2D or 3D reconstruction in the vision field. It consists of three procedures. First, the closest points between the feature sets must be found; a KD-tree search algorithm is used for this, and in this paper the procedure is modified to use image information, as discussed in detail in the next section. The next procedure is finding the rigid-body transformation matrix from the closest-point sets; to find the matrix, the SVD method is applied. Finally, an update procedure is computed; this update step produces new closest sets of features. By iteratively searching for the transformation matrix, the odometry information becomes more precise.
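The three procedures above can be sketched as follows. A brute-force nearest-neighbour search stands in for the paper's KD-tree, and the tolerance, iteration cap, and all names are illustrative:

```python
import numpy as np

def fit_rt(P, Q):
    """SVD fit of Q_i ~ R @ P_i + t from paired points (step 2 of ICP)."""
    p_bar, q_bar = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - p_bar).T @ (Q - q_bar))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, q_bar - R @ p_bar

def icp(P, Q, max_iteration=30, tol=1e-9):
    """Conventional ICP sketch: (1) pair each point of P with its closest
    point in Q, (2) solve the rigid transform for the pairs, (3) apply it
    and repeat until the mean pairing error stops improving."""
    R_tot, t_tot = np.eye(3), np.zeros(3)
    prev = np.inf
    for _ in range(max_iteration):
        d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)   # step 1
        nn = d2.argmin(axis=1)
        err = np.sqrt(d2[np.arange(len(P)), nn]).mean()
        if prev - err < tol:                                   # converged
            break
        prev = err
        R, t = fit_rt(P, Q[nn])                                # step 2
        P = P @ R.T + t                                        # step 3 (update)
        R_tot, t_tot = R @ R_tot, R @ t_tot + t
    return R_tot, t_tot

# A small grid moved by a slight rotation; ICP should undo the motion.
g = np.linspace(0.0, 2.0, 3)
Q = np.array([[x, y, z] for x in g for y in g for z in g])
a = 0.05
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([0.02, -0.01, 0.03])
P = Q @ R_true.T + t_true
R, t = icp(P, Q)
aligned = P @ R.T + t
```

The motion here is deliberately small so the initial nearest-neighbour pairing is already correct; like any ICP variant, this sketch only converges from a reasonable initial alignment.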

These overall procedures produce the visual odometry information. Figure 2 shows the result for continuous up, forward, backward, and down movements. In the next section, an accurate visual odometry algorithm for a dynamic environment (consisting of static and dynamic objects) is proposed.

3 The Image-based ICP Algorithm in a Dynamic Environment

In a dynamic environment, visual odometry cannot be computed correctly, because dynamic movements cause mismatches in the closest-set search of the ICP algorithm. In this section, an ICP algorithm based on images is proposed for dynamic environment applications.

3.1 The image-based ICP algorithm

The image-based ICP algorithm differs from the original one in how it detects the closest sets of features. In the closest-set search of the ICP algorithm, image information is combined with the 3D position information of each feature. This image information contributes to more precise detection of matched features. If the features are matched more precisely, they can be divided into inliers and outliers; features on dynamic obstacles are classified as outliers because they move differently from the rest.

Algorithm 1. The image-based ICP

ImageBasedICP(P, Q, RGB_prior, RGB_cur)
for i = 0 to max_iteration do
    S ← FindClosestSetPoint(P, Q, RGB_prior, RGB_cur)
    ([R], t) ← SolveRigidBodyTransform(S)
    update P with [R] and t
    if (converged) break;
end

Fig. 3. Diagram of finding the closest set of features using the proposed algorithm.

Algorithm 1 shows the image-based ICP algorithm. The "FindClosestSetPoint" function takes two additional parameters: the set of RGB values of the prior image features and the set of RGB values of the current image features. This function uses a KD-tree search algorithm, which is faster than a brute-force search, to obtain the closest sets. Solving the rigid-body transformation matrix with the RANSAC algorithm then follows. Figure 3 shows the diagram of finding the closest set of features using the image-based ICP algorithm. Each search considers not only the Euclidean distance but also the normalized RGB distance.
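A minimal sketch of such a combined matching cost: the Euclidean term and a normalized-RGB (chromaticity) term are each scaled to [0, 1] and blended. The weight, the exact normalization, and all names are my assumptions, not the paper's:

```python
import numpy as np

def find_closest_set(P, Q, rgb_p, rgb_q, w=0.5):
    """Image-based closest-set search sketch: the matching cost blends the
    3D Euclidean distance with a normalized RGB distance, so a feature is
    no longer paired with a spatially near but differently colored one."""
    d_xyz = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=-1)
    d_xyz /= d_xyz.max() + 1e-12                    # scale to [0, 1]
    # chromaticity (r, g, b) / (r + g + b) reduces sensitivity to brightness
    c_p = rgb_p / (rgb_p.sum(axis=1, keepdims=True) + 1e-12)
    c_q = rgb_q / (rgb_q.sum(axis=1, keepdims=True) + 1e-12)
    d_rgb = np.linalg.norm(c_p[:, None, :] - c_q[None, :, :], axis=-1)
    d_rgb /= d_rgb.max() + 1e-12
    cost = (1.0 - w) * d_xyz + w * d_rgb
    return cost.argmin(axis=1)                      # index into Q for each P point

# A red feature sits nearer in space to a blue candidate than to a red one:
P = np.array([[0.0, 0.0, 0.0]])
rgb_p = np.array([[200.0, 10.0, 10.0]])             # red
Q = np.array([[0.1, 0.0, 0.0], [0.2, 0.0, 0.0]])
rgb_q = np.array([[10.0, 10.0, 200.0],              # blue (closer in space)
                  [200.0, 10.0, 10.0]])             # red  (farther in space)
geom_only = find_closest_set(P, Q, rgb_p, rgb_q, w=0.0)
with_rgb = find_closest_set(P, Q, rgb_p, rgb_q, w=0.5)
```

With w = 0 the search degenerates to the conventional geometric closest point and picks the blue candidate; with the color term enabled it picks the correctly colored match, which is exactly the behavior that keeps dynamic features out of the closest sets.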

3.2 Benefits of the image-based ICP algorithm

This paper proposes a novel visual odometry algorithm for use in a dynamic environment. It is very robust to dynamic obstacles, so it can be applied in real-life settings for computing visual odometry. For example, it is very hard for aerial vehicles to obtain odometry information from their motors; in that field, this visual odometry algorithm using an RGB-D sensor can be applied.

4 Experiments

The proposed algorithm was tested with an RGB-D sensor, namely the Kinect sensor. The experimental environment was Gentoo OS on an Intel i5 3.3 GHz quad-core processor with an NVIDIA GTX 560 GPU and 6 GB of RAM. The average computation time was 152.0 ms per frame. The experiments were carried out in a static environment and in a dynamic environment.

4.1 The experiment result in a static environment

Fig. 4. The experiment result in a static environment for up, forward, and return movements: (left) using the conventional ICP algorithm; (right) using the image-based ICP algorithm.

Experiments in the static environment were conducted in a room in which nothing moved except the Kinect sensor and the experimenter. The experimenter, holding the Kinect sensor, moved it upward, forward, backward, and downward. Figure 4 shows the result of the experiment in the static environment: the left graph is the visual odometry result using the conventional ICP algorithm, and the right graph is the result using the image-based ICP algorithm. Both algorithms showed similar results in the static environment, which indicates that the image-based ICP algorithm can also be applied to calculate visual odometry in a static environment.

4.2 The experiment result in a dynamic environment

To realize a dynamic environment, the experimenter roamed around a room, as Fig. 5 shows. In this environment, the visual odometry was calculated using a fixed Kinect sensor. Figure 6 shows the result of the experiment with the proposed visual odometry algorithm, compared with the conventional ICP algorithm.

The left graph is the result of the conventional ICP algorithm; it shows an unstable computation of the visual odometry. This instability comes from the absence of inlier detection in the ICP algorithm: the outliers caused by the roaming person were not filtered out before solving the rigid-body transformation matrix. In contrast, the right graph shows a very stable computation of the visual odometry in the dynamic environment with the fixed Kinect. The proposed algorithm adds filtering steps to the ICP algorithm, namely the image-based closest-set detection and the outlier suppression using the RANSAC algorithm. As a result, visual odometry using the RGB-D sensor with the proposed algorithm is more accurate than with the conventional ICP algorithm in a dynamic environment.

Fig. 6. The experiment result in a dynamic environment with a fixed RGB-D camera: (left) using the conventional ICP algorithm; (right) using the image-based ICP algorithm.

Fig. 5. A dynamic environment.

5 Conclusion

This paper proposed the image-based ICP algorithm for calculating visual odometry. It was demonstrated that the proposed algorithm is very robust to dynamic obstacles. In this algorithm, the RGB information of the feature points is used to find the closest-set points in the ICP algorithm. After the closest-set points are found, the RANSAC algorithm is run to solve the rigid-body transformation matrix while eliminating the outlier effect of dynamic object features. These overall procedures contribute to an accurate calculation of the odometry information in a dynamic environment. However, the proposed algorithm still has a computation-time problem caused by the RANSAC algorithm. Therefore, the real-time problem needs to be solved, considering the case of fast movements of the RGB-D sensor. Furthermore, a SLAM system using the proposed visual odometry algorithm will be considered in future work.

Acknowledgements

This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the Human Resources Development Program for Convergence Robot Specialists support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2012-H1502-12-1002).

References

1. Albert S. Huang, Abraham Bachrach, Peter Henry, Michael Krainin, Daniel Maturana, Dieter Fox, and Nicholas Roy (2011) Visual Odometry and Mapping for Autonomous Flight Using an RGB-D Camera. International Symposium on Robotics Research (ISRR), Flagstaff, AZ, USA, Aug 2011
2. Frank Steinbrücker, Jürgen Sturm, and Daniel Cremers (2011) Real-Time Visual Odometry from Dense RGB-D Images. IEEE International Conference on Computer Vision Workshops, Barcelona, Spain, pp 719-722, Nov 2011
3. Navid Nourani-Vatani, Jonathan Roberts, and Mandyam V. Srinivasan (2008) IMU-aided 3D visual odometry for car-like vehicles. 10th Australasian Conference on Robotics & Automation, Canberra, ACT, Australia, pp 1-8, Dec 2008
4. Laurent Kneip, Margarita Chli, and Roland Siegwart (2011) Robust Real-Time Visual Odometry with a Single Camera and an IMU. The 22nd British Machine Vision Conference, University of Dundee, pp 16.1-16.11, Sep 2011
5. Marco Baglietto, Antonio Sgorbissa, Damiano Verda, and Renato Zaccaria (2011) Human navigation and mapping with a 6DOF IMU and a laser scanner. Robotics and Autonomous Systems 59:1060-1069. doi: 10.1016/j.robot.2011.08.005
6. Armin Hornung, Kai M. Wurm, and Maren Bennewitz (2010) Humanoid Robot Localization in Complex Indoor Environments. IEEE/RSJ International Conference on Intelligent Robots and Systems, Taipei, Taiwan, pp 1690-1695, Oct 2010
7. Andrew J. Davison, Ian D. Reid, Nicholas D. Molton, and Olivier Stasse (2007) MonoSLAM: Real-Time Single Camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence 29(6):1052-1067. doi: 10.1109/TPAMI.2007.1049
8. Herbert Bay, Andreas Ess, Tinne Tuytelaars, and Luc Van Gool (2008) Speeded-Up Robust Features (SURF). Computer Vision and Image Understanding 110(3):346-359. doi: 10.1016/j.cviu.2007.09.014
9. John H. Challis (1994) A procedure for determining rigid body transformation parameters. Journal of Biomechanics 28(6):733-737. doi: 10.1016/0021-9290(94)00116-L
10. Peter Henry, Michael Krainin, Evan Herbst, Xiaofeng Ren, and Dieter Fox (2010) RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments. International Symposium on Experimental Robotics, Delhi, India, Dec 2010
11. Aleksandr V. Segal, Dirk Haehnel, and Sebastian Thrun (2009) Generalized-ICP. Robotics: Science and Systems, Seattle, USA, June 2009
12. Martin A. Fischler and Robert C. Bolles (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM 24(6):381-395. doi: 10.1145/358669.358692
