  • 3D Computer Vision

    F. Tombari, S. Salti

  • F.Tombari, S. Salti

    Introduction

    3D sensors

    Data representations

    Differential entities and operators

    Hands-on session – Point Cloud Library (PCL)

    PCL installation

    Load and visualize a point cloud

    Compute normals

    Summary – Day 1

  • Introduction

    F. Tombari, S. Salti

    3D Computer Vision

  • F.Tombari, S. Salti

    3D Computer Vision

    Reconstruction

    Recognition

    Acquisition

  • F.Tombari, S. Salti

    Autonomous mobile robots – AMR (navigation)

    Object recognition, grasping and manipulation (social robotics)

    Applications - robotics

  • F.Tombari, S. Salti

    Applications - video surveillance

    Tracking and motion detection

    People counting

    Retail intelligence

    Crowd monitoring

    Behavior analysis

  • F.Tombari, S. Salti

    Applications – semantic segmentation

    Urban data classification

    Trimble Code Sprint

  • F.Tombari, S. Salti

    Shape retrieval (www)

    High-def 3D model acquisition (computer graphics)

    Biometric systems (e.g. face recognition)

    Medical imaging (MRI, CT, PET, x-ray, ultrasound, ..)

    Other applications

    Google warehouse

    3D medical imaging

    3D face recognition

    Michelangelo project

  • F.Tombari, S. Salti

    Autonomous vehicle navigation (AVN)

    Augmented reality

    Human-computer interaction (HCI)

    Videogaming, entertainment

    ..

    And yet..

    Augmented reality by Lego and Intel

    Microsoft Xbox

    Autonomous Vehicle Navigation

    vislab VIAC.mp4

  • F.Tombari, S. Salti

    Reconstruction

    3D registration

    SLAM

    Meshing

    Recognition

    Object recognition under clutter and occlusions

    Shape retrieval/categorization

    People/face/obstacle/.. detection

    Tracking

    Body pose estimation

    People tracking/counting

    Semantic segmentation

    Typical 3D tasks

  • F.Tombari, S. Salti

    Alignment of partially-overlapping 2.5D views

    Useful for obtaining a high-definition, fully 3D reconstruction of an object from views acquired from different viewpoints

    3D registration (1)

  • F.Tombari, S. Salti

    Coarse registration provides an initial guess for the set of views that need to be registered

    Fine registration (Iterative Closest Point - ICP [Besl 92])

    3D registration (2)

    [Pipeline figure: Unordered Input Views → Coarse Registration → Fine Registration]

  • F.Tombari, S. Salti

    Iterative Closest Point

    Iterative method to align two free-form shapes

    Input: two sets of 3D points M,S

    Output: the 6DOF transformation (R, t) that best aligns the two point sets

    Given an initial transform, iterate until convergence (a PCL-based sketch follows this slide):

    ∀𝑝 ∈ 𝑀 find its Nearest Neighbor 𝑁𝑁(𝑝) ∈ 𝑆

    Absolute orientation [Horn 87][Arun 87]

    find R,t that minimize the mean square error between the set of point pairs (p, NN(p) )

    Σ_{p∈M} ‖NN(p) − (R·p + t)‖²

    • t: difference between the centroids of the two point sets

    • R: least-squares estimation on the over-determined system defined by the set of pairs

    Convergence criteria:

    Threshold on minimum error

    Maximum number of iterations

    If the initial guess is poor, different initial transforms should be tested to avoid local minima

    Efficient versions available (GPU-ICP, [Rusinkiewicz 01])

    Generalized ICP [Segal 09]
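
    A minimal sketch of the ICP loop above using PCL (the library of the hands-on session); the .pcd file names, the thresholds and the identity initial guess are illustrative assumptions, and PCL performs the nearest-neighbor association and the absolute-orientation step internally at each iteration:

```cpp
#include <iostream>

#include <pcl/io/pcd_io.h>
#include <pcl/point_types.h>
#include <pcl/registration/icp.h>

int main() {
  // M (source) and S (target); the .pcd file names are placeholders.
  pcl::PointCloud<pcl::PointXYZ>::Ptr M(new pcl::PointCloud<pcl::PointXYZ>);
  pcl::PointCloud<pcl::PointXYZ>::Ptr S(new pcl::PointCloud<pcl::PointXYZ>);
  pcl::io::loadPCDFile("view_M.pcd", *M);
  pcl::io::loadPCDFile("view_S.pcd", *S);

  pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
  icp.setInputSource(M);                 // points p in M
  icp.setInputTarget(S);                 // nearest neighbors NN(p) are searched in S
  icp.setMaximumIterations(50);          // convergence: maximum number of iterations
  icp.setEuclideanFitnessEpsilon(1e-8);  // convergence: threshold on the error

  pcl::PointCloud<pcl::PointXYZ> aligned;
  Eigen::Matrix4f guess = Eigen::Matrix4f::Identity();  // e.g. from coarse registration
  icp.align(aligned, guess);
  std::cout << "Estimated (R, t):\n" << icp.getFinalTransformation() << std::endl;
  return 0;
}
```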

  • F.Tombari, S. Salti

    MultiView Reconstruction


  • F.Tombari, S. Salti

    Results

    Spacetime Stereo [Zhang 03, Davis 05]

    Kinect sensor

  • F.Tombari, S. Salti

    Simultaneous Localization and Mapping: incrementally build a map of the agent’s surroundings (mapping)

    Localize the agent within that map (localization)

    Odometry, inertial sensing: measurements drift over time

    Visual odometry [Nistér 04] [Konolige 06]

    3D / photometric sensors

    Laser scanner

    Sonar

    Stereo [Sim 06]

    Visual sensors (vSLAM) [Karlsson 05][Folkesson 05]

    • Landmark initialization?

    6DOF SLAM

    monoSLAM [Davison 03] [Eade 06][Clemente 07]

    Visual odometry + single camera

    SLAM (1)

    Credits: J.B.Hayet

  • F.Tombari, S. Salti

    SLAM (2)

    Extended Kalman Filter

    Landmark extraction

    Geometric/photometric data

    Odometry

    Data association

    Landmarks: • Re-observable • Distinctive • Stationary

    EKF: • Update via odometry • Update via landmark re-observation

    Mapping update: local vs. global consistency (loop closure, bundle adjustment)
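
    A minimal sketch of the EKF predict/update cycle outlined above, written with Eigen; the additive motion model and the matrices F, Q, H, R are simplifying assumptions that would come from the actual motion and observation models:

```cpp
#include <Eigen/Dense>

struct EkfSlam {
  Eigen::VectorXd x;  // state: robot pose (x, y, theta) followed by landmark positions
  Eigen::MatrixXd P;  // state covariance

  // Update via odometry (prediction): x <- f(x, u), P <- F P F^T + Q
  void predict(const Eigen::Vector3d& u,    // odometry increment (dx, dy, dtheta)
               const Eigen::MatrixXd& F,    // Jacobian of the motion model
               const Eigen::MatrixXd& Q) {  // motion noise
    x.head<3>() += u;                       // simplistic additive motion model (assumption)
    P = F * P * F.transpose() + Q;
  }

  // Update via landmark re-observation (correction)
  void update(const Eigen::VectorXd& z,       // measurement of a re-observed landmark
              const Eigen::VectorXd& z_pred,  // predicted measurement h(x)
              const Eigen::MatrixXd& H,       // Jacobian of the observation model
              const Eigen::MatrixXd& R) {     // measurement noise
    const Eigen::MatrixXd S = H * P * H.transpose() + R;        // innovation covariance
    const Eigen::MatrixXd K = P * H.transpose() * S.inverse();  // Kalman gain
    x += K * (z - z_pred);                                      // correct the state
    P = (Eigen::MatrixXd::Identity(P.rows(), P.cols()) - K * H) * P;
  }
};
```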

  • F.Tombari, S. Salti

    MonoSLAM converging to Structure-from-Motion [Strasdat 10] • PTAM [Klein 07], DTAM [Newcombe 11]

    6DOF SLAM with RGB-D sensors

    • Kinect Fusion [Newcombe 11b]

    • RGB-D dense point cloud mapping [Henry 11]

    SLAM (3)

  • F.Tombari, S. Salti

    Determine the presence of a model in a scene

    Estimate its 6DOF pose

    Challenges: Clutter

    Occlusions

    Point density variations

    Model library size (efficiency)

    Multi-instance

    Usually only rigid transformations are assumed (rotation, translation), possibly with a uniform scale factor

    Object recognition

    Synthetic data

    Spacetime Stereo

    Real-time stereo

  • F.Tombari, S. Salti

    Challenges: Intra-class variations

    Invariance w.r.t. a large number of transformations, including non-rigid deformations (e.g. isometries)

    Clutter and occlusions are not present

    General approach: compute a compact representation for the query

    Compare it with all objects in the library (a minimal retrieval sketch follows this slide)

    Retrieve the most similar ones - retrieval

    (Assign a label to the query - categorization)

    Shape retrieval and categorization

    Vehicle

    Animal

    Household

    Building

    Furniture

    Princeton Shape Benchmark (PSB) dataset («coarse2» categories)

    SHREC 10 dataset
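
    A minimal retrieval sketch for the general approach above; the query and the library entries are assumed to be fixed-length global descriptors, and the Euclidean distance is just one possible similarity measure:

```cpp
#include <Eigen/Dense>

#include <limits>
#include <vector>

// Returns the index of the library model most similar to the query descriptor
// (for categorization, one would return that model's label instead).
int retrieveMostSimilar(const Eigen::VectorXd& query,
                        const std::vector<Eigen::VectorXd>& library) {
  int best = -1;
  double bestDist = std::numeric_limits<double>::max();
  for (int i = 0; i < static_cast<int>(library.size()); ++i) {
    const double d = (query - library[i]).norm();  // Euclidean descriptor distance
    if (d < bestDist) { bestDist = d; best = i; }
  }
  return best;
}
```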

  • F.Tombari, S. Salti

    Determine 3D connected components with specific properties or belonging to a particular category

    Feature extraction

    Description of 3D keypoints

    Description of clusters [Lloyd 82], such as size, density, eigenvalues of the scatter matrix, ..

    Feature classification

    SVM, Random Trees, kNN, ..

    Semantic segmentation (1)
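
    One possible sketch of a per-cluster descriptor based on the eigenvalues of the scatter matrix mentioned above; the resulting 3-vector could be fed to an SVM / Random Trees / kNN classifier (the specific choice of features is an assumption, not the only option):

```cpp
#include <Eigen/Dense>

#include <vector>

// Shape descriptor of a (non-empty) cluster of 3D points: the sorted eigenvalues
// of its scatter (covariance) matrix capture whether the cluster is linear,
// planar or roughly spherical.
Eigen::Vector3d clusterShapeDescriptor(const std::vector<Eigen::Vector3d>& points) {
  Eigen::Vector3d mean = Eigen::Vector3d::Zero();
  for (const auto& p : points) mean += p;
  mean /= static_cast<double>(points.size());

  Eigen::Matrix3d scatter = Eigen::Matrix3d::Zero();
  for (const auto& p : points) scatter += (p - mean) * (p - mean).transpose();
  scatter /= static_cast<double>(points.size());

  Eigen::SelfAdjointEigenSolver<Eigen::Matrix3d> solver(scatter);
  return solver.eigenvalues();  // ascending order
}
```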

  • F.Tombari, S. Salti

    Inference on a loopy graph [Tombari 11]

    An undirected graph is built over classified 3D features [Unnikrishnan 08]

    The following function is maximized over the graph:

    P(X) = (1/Z) Π_{i∈S} e^(−φ_i(x_i)) Π_{i∈S} Π_{j∈N(i)} e^(−φ_{i,j}(x_i, x_j))

    φ_i(x_i) = λ (1 − ν(x_i))   (evidence, unary term)

    φ_{i,j}(x_i, x_j) = 0 if x_i = x_j, e^(−‖p_i − p_j‖² / σ_c) otherwise   (compatibility, pairwise term)

    where λ is a regularizer and ν the classification probability

    Other approaches based on graph inference rely on Associative Markov Networks (AMN) [Anguelov 05][Triebel 07][Munoz 09]

    Semantic segmentation (2)
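
    A small sketch of the unary and pairwise potentials defined above; the default values of λ and σ_c are illustrative assumptions:

```cpp
#include <cmath>

#include <Eigen/Dense>

// Evidence (unary) term: phi_i(x_i) = lambda * (1 - nu(x_i)),
// low when the classifier is confident about label x_i.
double unaryPotential(double nu_xi, double lambda = 1.0) {
  return lambda * (1.0 - nu_xi);
}

// Compatibility (pairwise) term: zero when the two labels agree, otherwise a
// penalty that grows as the two 3D features get closer to each other.
double pairwisePotential(int x_i, int x_j,
                         const Eigen::Vector3d& p_i, const Eigen::Vector3d& p_j,
                         double sigma_c = 0.05) {
  if (x_i == x_j) return 0.0;
  return std::exp(-(p_i - p_j).squaredNorm() / sigma_c);
}
```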

  • 3D Sensors

    F. Tombari, S. Salti

    3D Computer Vision

  • F.Tombari, S. Salti

    3D sensors

    Goal: create a point cloud of (samples from) the surface of an object/scene

    Collection of distance measurements from the sensor to the surface

    Distances are then transformed into 3D coordinates (x, y, z) by means of calibration information (a back-projection sketch follows this slide)

    Usually, 3D sensors (or 3D scanners) acquire only a single view of the object (2.5D data)

    Some sensors also acquire information concerning color or light intensity (RGB-D data)

    First step of every 3D reconstruction / 3D recognition pipeline

    Contact sensors

    Active sensors

    LIDAR, rangefinders

    Time-of-Flight cameras

    Laser Triangulation

    Structured light

    Medical imaging (CT, MRI)

    ..

    Passive sensors

    Stereo

    Structure-from-motion

    Shape-from-shading, shape-from-silhouette, shape-from-defocus, ..
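
    A minimal sketch of the calibration step mentioned above for a pinhole depth sensor: a pixel (u, v) with measured depth d is back-projected into camera coordinates using the intrinsic parameters fx, fy, cx, cy (assumed known from calibration):

```cpp
#include <Eigen/Dense>

// Back-projects one depth measurement into a 3D point in the camera frame.
Eigen::Vector3d backproject(double u, double v, double depth,
                            double fx, double fy, double cx, double cy) {
  const double x = (u - cx) * depth / fx;
  const double y = (v - cy) * depth / fy;
  return Eigen::Vector3d(x, y, depth);
}
```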

  • F.Tombari, S. Salti

    LIDAR

    LIDAR: Light Detection And Ranging

    A light pulse is emitted from the sensor and the round-trip time is measured

    The longer the round-trip time, the farther the point is from the sensor

    Usually visible or near-infrared light is used

    Arrays of emitters are employed together to yield a set of simultaneous range measurements
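
    A worked example of the round-trip relation, d = c · t / 2 (the factor 1/2 accounts for the pulse travelling to the surface and back):

```cpp
// Range from round-trip time: d = c * t / 2.
constexpr double kSpeedOfLight = 299792458.0;  // m/s

double rangeFromRoundTrip(double roundTripSeconds) {
  // Example: a 100 ns round trip corresponds to roughly 15 m.
  return 0.5 * kSpeedOfLight * roundTripSeconds;
}
```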
