
Dr.Kawewong Ph.D Thesis

Transcript
Page 1: Dr.Kawewong Ph.D Thesis

PIRF-Nav: An Online Incremental Appearance-based Localization and Mapping in

Dynamic Environments

Aram Kawewong

Hasegawa Laboratory Department of Computational Intelligence and Systems Science

Interdisciplinary Graduate School of Science and Engineering

Tokyo Institute of Technology

1

Page 2: Dr.Kawewong Ph.D Thesis

Simultaneous Localization and Mapping (SLAM) is a navigation capability needed by every kind of mobile robot

In an unfamiliar environment, the robot must be able to perform two important tasks simultaneously:

Mapping a new place if it has never been visited before

Localizing itself to a mapped place if it has been visited before

2

Introduction to SLAM

Page 3: Dr.Kawewong Ph.D Thesis

3

Appearance-based Localization and Mapping (FAB-MAP)

Page 4: Dr.Kawewong Ph.D Thesis

Why don't we just use GPS? GPS is not always reliable in a crowded city centre

GPS can only locate the coordinates/position of the agent, not the corresponding scene; how can the robot answer the question "look at this picture and tell me where it is?" or "have you ever visited this place before? Can you describe the nearby places?"

No false positives (false negatives are allowed): if the robot is not confident, it should answer "this is a new place". If the robot is to answer "this place is the same place as ….", it must be 100% correct.

100% precision (all answers must be correct)

4

Why Visual SLAM? What Are the Challenges?

Page 5: Dr.Kawewong Ph.D Thesis

Place Recognition (Computer Vision) vs. Localization and Mapping (Robotics)

Input images: in place recognition, all testing images are known to come from somewhere in the map; in localization and mapping, every input image is a testing image, which might come from somewhere in the map or might be a previously unseen place

Environment: closed environment vs. open environment

Precision: in place recognition, precision-1 is not the main concern if the recall rate is reasonably high; in localization and mapping, precision-1 is the first-priority concern, since one false positive may lead to a serious error in navigation

5

Appearance-based Localization and Mapping vs. Place Recognition

Page 6: Dr.Kawewong Ph.D Thesis

100% precision with very high recall rates

Can run incrementally in an online manner

Life-long operation

Low computation time

Low memory consumption

Suitable for navigation in large-scale environments

Can solve two main problems: dynamical changes and perceptual aliasing (different places that look similar)

Note: coordinate-based localization is not required here

6

Appearance-based SLAM’s Common Objectives

Page 7: Dr.Kawewong Ph.D Thesis

1. FAB-MAP (Cummins & Newman, IJRR'08)
Considering the efficiency at 100% precision, the recall rate obtained by FAB-MAP (a state-of-the-art method) is still not very high.
An offline dictionary generation process is necessary.

2. Fast Incremental Bag-of-Words (Angeli et al., T-RO'08)
The system can run incrementally; an offline dictionary generation process is not needed.
Accuracy is reported to be lower than or equal to that of FAB-MAP.
Consumes much more memory than FAB-MAP.

7

Visual SLAM’s Related Works

Page 8: Dr.Kawewong Ph.D Thesis

FAB-MAP (IJRR'08) vs. Inc. BoW (T-RO'08) vs. PIRF-Nav (proposed)

Ability to run incrementally without an offline dictionary generation process: No / Yes / Yes

Memory consumption: Low / High / Moderate

Ability to run in real time: Yes / Yes / Yes

Robustness against dynamical changes*: Moderate (~40% on City Centre) / Low (~20% on City Centre) / High (~85% on City Centre)

8

What Do We Want? PIRF-Nav's Advantages

* The recall rate is considered at 100% precision

Page 9: Dr.Kawewong Ph.D Thesis

Making use of PIRF, we can detect good landmarks for each individual place

The extracted PIRFs should be sufficiently informative to represent the place, so the system does not need a pre-generated visual vocabulary

The number of PIRFs is sufficiently small for real-time applications

Because PIRF is robust against dynamical changes of scenes, the PIRF-based visual SLAM (called PIRF-Nav) becomes an efficient online incremental visual SLAM

9

Basic Idea & Concept of PIRF-Nav

Page 10: Dr.Kawewong Ph.D Thesis

Outdoor scenes generally include distant objects whose appearances are robust against changes in camera position

Averaging the "slow-moving" local features that capture such objects gives us fewer and more robust features

10

Basic Idea of PIRFs (proposed)


Page 11: Dr.Kawewong Ph.D Thesis

11

PIRF Extraction Algorithm

[Figure: PIRF extraction. An image sequence is converted into a sequence of matching index vectors, and a sliding window of size w = 3 is moved over the sequence; features that match consistently across the window are combined into PIRFs.]
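The extraction sketched in the figure can be illustrated with a short Python sketch. It assumes per-image SIFT descriptors are already available (e.g. from OpenCV); the helper names are illustrative and not the thesis implementation. Features that can be matched through every image of the window are averaged into a single PIRF, mirroring the "averaging slow-moving features" idea above:

```python
import numpy as np

def match_indices(desc_a, desc_b, ratio=0.8):
    """For each descriptor in desc_a, return the index of its nearest
    neighbour in desc_b, or -1 if it fails Lowe's ratio test."""
    matches = np.full(len(desc_a), -1, dtype=int)
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        order = np.argsort(dists)
        if len(order) > 1 and dists[order[0]] < ratio * dists[order[1]]:
            matches[i] = order[0]
    return matches

def extract_pirfs(descriptors, w=3):
    """Average descriptors that match consistently across a window of w images."""
    pirfs = []
    for start in range(len(descriptors) - w + 1):
        window = descriptors[start:start + w]
        # Match-index vectors between consecutive images in the window.
        chains = [match_indices(window[k], window[k + 1]) for k in range(w - 1)]
        for i in range(len(window[0])):
            idx, track = i, [window[0][i]]
            for k, chain in enumerate(chains):
                idx = chain[idx]
                if idx < 0:
                    break
                track.append(window[k + 1][idx])
            if len(track) == w:                       # survived the whole window
                pirfs.append(np.mean(track, axis=0))  # averaged descriptor = one PIRF
    return np.array(pirfs)
```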

Page 12: Dr.Kawewong Ph.D Thesis

Exp. 1: Scenes from Suzukakedai (580 training and 489 testing images, 640 x 428)

Exp. 1: Scenes from O-okayama (450 training and 493 testing images, 640 x 428)

12

Briefly on PIRF's Performance

Page 13: Dr.Kawewong Ph.D Thesis

13

PIRF’s Performance

[Figure: Recognition rate on Suzukakedai and O-okayama (bar chart, y-axis 0-100%). Recovered bar values: 24.54%, 31.08%, 22.29%, 18.23%, 93.46%, 27.59%, 45.75%, 36.71%, 30.22%, 77.48%.]

Page 14: Dr.Kawewong Ph.D Thesis

14

Even With These Strong Changes, PIRF Still Works Well !!!

Highly Dynamic Changes in Scenes

Illumination Changes in Scenes

Page 15: Dr.Kawewong Ph.D Thesis

15

PIRF (City Centre Dataset)

Original Descriptors (SIFT)

Position-invariant Robust Feature (PIRF) (proposed)

Page 16: Dr.Kawewong Ph.D Thesis

Overall Processing Diagram

Step 1: Perform simple feature matching. The score is calculated based on the popular term frequency-inverse document frequency (tf-idf) weighting

Steps 2-3: Adapt the score by considering the neighbors, then perform normalization

Step 4: Perform a second integration over the score space for re-localization

16

PIRF-Nav Processing Diagram (prop.)

Page 17: Dr.Kawewong Ph.D Thesis

At time $t$, a map of the environment is a collection of $n_t$ discrete and disjoint locations $\mathbf{L} = \{L_1, \ldots, L_{n_t}\}$

Each of these locations $L_i$, which has been created from a past image $I_i$, has an associated model $M_i$

The model $M_i$ is a set of PIRFs

17

Notation Definition

Page 18: Dr.Kawewong Ph.D Thesis

The current model $M_t$ is compared to each of the mapped models $\mathbf{M}_t = \{M_0, \ldots, M_{n_t}\}$ using standard feature matching with a distance threshold $\theta_2$

Each matching outputs a similarity score $s$

$M_0$ is the model of location $L_0$, a virtual location for the event "no loop closure occurred at time $t$"

Based on the obtained scores $s$, the system proceeds to the next step if $\operatorname{argmax}(s) \neq 0$

18

STEP 1: Simple Feature Matching

Page 19: Dr.Kawewong Ph.D Thesis

The similarity score $s$ is calculated using the term frequency-inverse document frequency (tf-idf) weighting (Sivic & Zisserman, ICCV'03):

$$\text{tf-idf} = \frac{n_{wi}}{n_i} \log \frac{N}{n_w}$$

where $n_{wi}$ is the number of occurrences of visual word $w$ in $M_i$, $n_i$ is the total number of visual words in $M_i$, $n_w$ is the number of models containing word $w$, and $N$ is the total number of existing models

19

STEP 1: Simple Feature Matching (Continued)
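For illustration only, the weight for one visual word can be computed as below; the function name and example numbers are not from the thesis:

```python
import math

def tfidf_weight(n_wi, n_i, n_w, N):
    """tf-idf weight of visual word w in model M_i.

    n_wi: occurrences of word w in M_i
    n_i : total number of visual words in M_i
    n_w : number of models containing word w
    N   : total number of existing models
    """
    if n_i == 0 or n_w == 0:
        return 0.0
    return (n_wi / n_i) * math.log(N / n_w)

# Example: a word seen 3 times among the 200 words of M_i,
# and present in 5 of 1000 models.
print(tfidf_weight(3, 200, 5, 1000))  # ~0.0795
```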

Page 20: Dr.Kawewong Ph.D Thesis

To be used with PIRF, the function is converted to

$$s_i = \sum_{k=1}^{m_i} \log \frac{n_t}{n_{w_k}}$$

where $n_{w_k}$ is the number of models $M_j$, $0 \le j \le n_t$, $j \neq i$, containing PIRFs that match the $k$th PIRF of the input model $M_t$

$m_i$ is the number of all matched PIRFs between the input and the query model

The system proceeds to STEP 2 if and only if the maximum score does not belong to $M_0$ and is greater than $\tau_1$

20

STEP 1: Simple Feature Matching (Continued)
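Given the counts defined above, the score itself is short to compute. This is a minimal sketch under the assumption that the counts $n_{w_k}$ have already been obtained from PIRF matching; names and example numbers are illustrative:

```python
import math

def step1_score(n_t, n_w_counts):
    """s_i = sum over the m_i matched PIRFs of log(n_t / n_w_k).

    n_t        : number of locations currently in the map
    n_w_counts : for each matched PIRF k, the number of other models
                 that also contain a matching PIRF (each >= 1)
    """
    return sum(math.log(n_t / n_wk) for n_wk in n_w_counts)

# Example: 3 matched PIRFs in a map of 500 locations; rarer PIRFs weigh more.
print(step1_score(500, [2, 10, 250]))  # log(250) + log(50) + log(2) ~ 10.1
```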

Page 21: Dr.Kawewong Ph.D Thesis

Accepting or rejecting a loop-closure detection based on the score from only a single image is sensitive to noise

This can be handled by considering the similarity scores of neighboring image models:

$$\beta_i = \sum_{k=i-\omega}^{i+\omega} s_k \cdot p_T(i, k)$$

The term $p_T(i, k)$ is a transition probability generated from a Gaussian on the distance in time between $i$ and $k$

$\omega$ stands for the number of neighbors examined

21

STEP 2: Considering Neighbors
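A sketch of this smoothing step, assuming a simple Gaussian weight on the index distance; the variance and the handling of window borders are illustrative choices, not taken from the thesis:

```python
import math

def gaussian_transition(i, k, sigma=2.0):
    """Hypothetical Gaussian transition weight on the time distance |i - k|."""
    return math.exp(-((i - k) ** 2) / (2.0 * sigma ** 2))

def neighbor_score(s, i, omega=2, sigma=2.0):
    """beta_i = sum_{k=i-omega}^{i+omega} s_k * p_T(i, k), clamped to valid indices."""
    lo, hi = max(0, i - omega), min(len(s) - 1, i + omega)
    return sum(s[k] * gaussian_transition(i, k, sigma) for k in range(lo, hi + 1))

# Example: smoothing the raw Step 1 scores of five mapped locations.
scores = [0.1, 0.3, 2.5, 0.4, 0.2]
print([round(neighbor_score(scores, i), 2) for i in range(len(scores))])
```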

Page 22: Dr.Kawewong Ph.D Thesis

Done by considering the standard deviation $\sigma$ and mean value $\mu$ over all scores

$\omega$ indicates the number of neighbours taken into consideration

The beta-scores are converted into normalized scores according to the equation

$$C_i = \begin{cases} \dfrac{\beta_i - \sigma}{\mu}, & \text{if } \beta_i \ge T \\ 1, & \text{otherwise} \end{cases}$$

where $T = \sigma + \mu$

22

STEP 3: Normalizing the Score
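A sketch of this normalization as reconstructed above; whether the statistics are taken over all scores so far or only the current neighbourhood, and whether the population or sample deviation is used, are assumptions here:

```python
import statistics

def normalize_scores(betas):
    """Normalize beta-scores against their mean (mu) and standard deviation (sigma).

    Scores at or above T = sigma + mu are mapped to (beta - sigma) / mu;
    all other scores are set to 1 (treated as uninformative).
    """
    mu = statistics.mean(betas)
    sigma = statistics.pstdev(betas)
    T = sigma + mu
    return [(b - sigma) / mu if b >= T and mu > 0 else 1.0 for b in betas]

# Example: only the clearly above-average score stands out after normalization.
print(normalize_scores([0.4, 0.5, 3.0, 0.6, 0.5]))  # [1.0, 1.0, ~2.0, 1.0, 1.0]
```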

Page 23: Dr.Kawewong Ph.D Thesis

The obtained location $L_j$ would be accepted as a loop closure if $\beta_j - T > \tau_2$

Ideally, the neighboring model scores of location $L_j$ should decrease symmetrically around its own score. However, scenes in dynamic environments usually contain moving objects that frequently cause occlusions, so the score profile of the assigned location may not be symmetrical.

23

STEP 4: Re-localization

Page 24: Dr.Kawewong Ph.D Thesis

24

Step 4: Relocalization (Sample Problems)

The location assigned in Step 3 does not have a symmetrical score profile

Performing one more summation can shift the assigned location to the correct one

Page 25: Dr.Kawewong Ph.D Thesis

Therefore, we perform a second summation over the neighbouring score models to achieve more accurate localization:

$$C_i' = \sum_{k=i-\omega}^{i+\omega} C_k \cdot p_T(i, k)$$

The normalized scores obtained for all possible models determine the most likely loop-closure location $L_j$, where

$$j = \operatorname*{argmax}_i C_i'$$

25

STEP 4: Re-Localization
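Putting Steps 2-4 together, a minimal sketch of the final re-localization; `gaussian_transition` is the same hypothetical weight used in the Step 2 sketch:

```python
import math

def gaussian_transition(i, k, sigma=2.0):
    """Same hypothetical Gaussian weight as in the Step 2 sketch."""
    return math.exp(-((i - k) ** 2) / (2.0 * sigma ** 2))

def relocalize(C, omega=2, sigma=2.0):
    """Second summation over the normalized scores C_i; returns (j, C'_j)."""
    C_prime = []
    for i in range(len(C)):
        lo, hi = max(0, i - omega), min(len(C) - 1, i + omega)
        C_prime.append(sum(C[k] * gaussian_transition(i, k, sigma)
                           for k in range(lo, hi + 1)))
    j = max(range(len(C_prime)), key=lambda i: C_prime[i])
    return j, C_prime[j]

# Example: the raw peak is at index 2; the second summation also reflects
# how strong the surrounding scores are before the final argmax.
print(relocalize([1.0, 1.0, 2.0, 1.9, 1.8]))
```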

Page 26: Dr.Kawewong Ph.D Thesis

Three datasets have been used:

City Centre (2474 images, 640 x 480): taken to address the problem of dynamical changes of scenes in a city centre.

New College (2146 images, 640 x 480): taken to address the problem of perceptual aliasing. In this dataset the robot visited the same place many times, and many different places look very similar.

Suzukakedai (1079 images, 1920 x 1080): taken by a video camera fitted with an omnidirectional lens, to address the problem of highly dynamical changes while a special event (an open-campus event) was being held.

26

Results & Experiments: DATASETS

Page 27: Dr.Kawewong Ph.D Thesis

City Centre

27

Results & Experiments: DATASETS

Page 28: Dr.Kawewong Ph.D Thesis

New College

28

Results & Experiments: DATASETS

Page 29: Dr.Kawewong Ph.D Thesis

Suzukakedai

29

Results & Experiments: DATASETS

Page 30: Dr.Kawewong Ph.D Thesis

Among many visual SLAM methods, FAB-MAP (Cummins & Newman, IJRR'08) and the fast incremental BoW method of Angeli et al. (T-RO'08) are considered state-of-the-art.

Both are based on the bag-of-words scheme, and each offers different advantages:

FAB-MAP: high accuracy, with offline dictionary generation
Angeli et al.: accuracy lower than or equal to that of FAB-MAP, but with online incremental dictionary generation

PIRF-Nav must offer higher accuracy than FAB-MAP while being an online incremental method like that of Angeli et al.

30

Results & Experiments: BASELINE

Page 31: Dr.Kawewong Ph.D Thesis

31

Evaluation on Appearance-based Loop-closure Detection Problem

[Diagram: Input image → "Loop closing?" → either add a new place to the map, or find the loop-closure place and output the loop-closure location. The first decision is a binary classification (new place / old place); the second is an image retrieval problem (retrieve the most likely place for loop closure).]

$$\text{Precision}_A = \frac{\text{correct loop closures}}{\text{all detected loop closures}} \qquad \text{Recall}_A = \frac{\text{correct loop closures}}{\text{all labeled loop closures}}$$

$$\text{Precision}_B = \frac{\text{correctly retrieved images}}{\text{all retrieved images}} \qquad \text{Recall}_B = \frac{\text{correctly retrieved images}}{\text{all labeled images}}$$
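For reference, a minimal computation of both metric pairs; the function name and example counts are illustrative only:

```python
def precision_recall(detected, correct, labeled):
    """Generic precision/recall: `detected` answers, of which `correct`
    are right, measured against `labeled` ground-truth positives."""
    precision = correct / detected if detected else 1.0
    recall = correct / labeled if labeled else 0.0
    return precision, recall

# Metric pair A: binary loop-closure decisions (yes/no).
print(precision_recall(detected=120, correct=120, labeled=150))  # (1.0, 0.8)

# Metric pair B: retrieved loop-closure images.
print(precision_recall(detected=118, correct=110, labeled=150))
```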

Page 32: Dr.Kawewong Ph.D Thesis

Ideally, performance should be evaluated with two curves:

Precision A – Recall A curve

Precision B – Recall B curve

However, for a compact representation, most works in visual SLAM use the Precision B – Recall B curve to show performance, because:

The binary classification is currently not particularly problematic

The important challenge lies in the performance of image retrieval

32

Evaluation on Appearance-based Loop-closure Detection Problem

Page 33: Dr.Kawewong Ph.D Thesis

33

Evaluation on Appearance-based Loop-closure Detection Problem

(City Centre)

Precision A – Recall A: focusing only on the problem of saying "yes/no" to loop-closure detection is currently trivial

Precision B – Recall B: instead, given that the precision of the "yes/no" loop-closure decision is 100%, it is much more interesting to see how accurately the system can retrieve the corresponding image

Page 34: Dr.Kawewong Ph.D Thesis

34

Result 1: City Centre

Vehicle Trajectory and Loop-Closure Detection

PIRF-Nav (100% Precision) (proposed) FAB-MAP (100% Precision)

Page 35: Dr.Kawewong Ph.D Thesis

35

Result 1 : City Centre (Precision-Recall Curve)

Page 36: Dr.Kawewong Ph.D Thesis

36

Result 1: City Centre (Computation Time)

*It is noteworthy that all programs of PIRF-Nav were written in MATLAB while FAB-MAP was written in C.

Page 37: Dr.Kawewong Ph.D Thesis

37

Result 2: New College

Vehicle Trajectory and Loop-Closure Detection

PIRF-Nav (100% Precision) (proposed) FAB-MAP (100% Precision)

Page 38: Dr.Kawewong Ph.D Thesis

38

Result 2: New College (Precision-Recall Curve)

Page 39: Dr.Kawewong Ph.D Thesis

39

Result 3: Suzukakedai

Vehicle Trajectory and Loop-Closure Detection

PIRF-Nav (100% Precision)

Page 40: Dr.Kawewong Ph.D Thesis

40

Result 3: Suzukakedai (Precision-Recall Curve)

Page 41: Dr.Kawewong Ph.D Thesis

41

Result 4: Combined Datasets (Precision-Recall Curve)

Note: We did not test FAB-MAP in this experiment because FAB-MAP completely failed on the Suzukakedai dataset. The results on City Centre and New College also clearly imply that FAB-MAP would not achieve better accuracy in this experiment.

Page 42: Dr.Kawewong Ph.D Thesis

42

Sample Matched Images (Dynamical Changes in Major Part of Scene)

Page 43: Dr.Kawewong Ph.D Thesis

43

Sample Matched Images (Different View-Points)

Page 44: Dr.Kawewong Ph.D Thesis

PIRF-Nav outperforms FAB-MAP in terms of accuracy, with more than 80% recall at 100% precision on all datasets provided by the authors

PIRF-Nav offers the online and incremental ability to run in very different environments

Although PIRF-Nav's computation time at the same image scale is slower than FAB-MAP's, PIRF-Nav compensates for this drawback by processing images at a smaller scale, since its accuracy remains considerably higher than that of FAB-MAP

44

Conclusions

Page 45: Dr.Kawewong Ph.D Thesis

45

Thank you for Your Kind Attention

“Doubt is the father of invention”

Attributed to Galileo

Page 46: Dr.Kawewong Ph.D Thesis

Journal

1. A. Kawewong and O. Hasegawa, "Classifying 3D Real-World Texture Images by Combining Maximum Response 8, 4th Order of Auto Correlation and Colortons," Jour. of Advanced Comp. Intelligence and Intelligent Informatics, vol. 11, no. 5, 2007.

2. A. Kawewong, Y. Honda, M. Tsuboyama, and O. Hasegawa, "Reasoning on the Self-Organizing Incremental Associative Memory for Online Robot Path Planning," IEICE Trans. Inf. & Sys., vol. E93-D, no. 3, 2009. (impact factor 0.369)

3. Y. Honda, A. Kawewong, M. Tsuboyama, and O. Hasegawa, "Acquisition of Place Cells and Autonomous Movement Control of a Robot by a Semi-Supervised Neural Network," IEICE Trans. D (in Japanese), 2009. (accepted for publication)

4. A. Kawewong, N. Tongprasit, S. Tangruamsub and O. Hasegawa, "Online and Incremental Appearance-based SLAM in Highly Dynamic Environments," Int'l Jour. Robotics Research (IJRR). (to appear in 2010; impact factor 2.882, rank #1 in robotics)

5. A. Kawewong, S. Tangruamsub and O. Hasegawa, "Position-Invariant Robust Features for Long-term Recognition of Dynamic Outdoor Scenes," IEICE Trans. Inf. & Sys. (conditionally accepted)

46

Publication

Page 47: Dr.Kawewong Ph.D Thesis

Conferences

1. A. Kawewong and O. Hasegawa, "3D Texture Classification by Using Pre-testing Stage and Reliability Table," IEEE Proc. International Conference on Image Processing (ICIP), 2005.

2. A. Kawewong and O. Hasegawa, "Combining Rotationally Variant and Invariant Features Based on Between-Class Error for 3D Texture Classification," IEEE Int'l Conf. on Computer Vision (ICCV) Workshop, 2005.

3. A. Kawewong, Y. Honda, M. Tsuboyama, O. Hasegawa, "A Common-Neural-Pattern Based Reasoning for Mobile Robot Cognitive Mapping," in Proc. Int'l Conf. Neural Information Processing (ICONIP), 2008.

4. A. Kawewong, Y. Honda, M. Tsuboyama, O. Hasegawa, "Common-Patterns Based Mapping for Robot Navigation," in Proc. IEEE Int'l Conf. Robotics and Biomimetics (ROBIO), 2008.

5. S. Tangruamsub, M. Tsuboyama, A. Kawewong and O. Hasegawa, "Mobile Robot Vision-Based Navigation Using Self-Organizing and Incremental Neural Networks," in Proc. Int'l Joint Conf. Neural Networks (IJCNN), 2009.

47

Publication

Page 48: Dr.Kawewong Ph.D Thesis

Conferences

6. A. Kawewong, S. Tangruamsub, and O. Hasegawa, "Wide-baseline Visible Features for Highly Dynamic Scene Recognition," in Proc. Int'l Conf. Computer Analysis of Images and Patterns (CAIP), 2009.

7. N. Tongprasit, A. Kawewong and O. Hasegawa, "Data Partitioning Technique for Online and Incremental Visual SLAM," in Proc. Int'l Conf. on Neural Information Processing (ICONIP), 2009. (oral & student travel award)

48

Publication

