
Avideh Zakhor

Video and Image Processing Lab

University of California, Berkeley

Fast, Automatic, Photo-Realistic, 3D Modeling of Building Interiors

Acknowledgements

Staff member:
– John Kua

Students:
– Matthew Carlberg, Nick Corso, Nikhil Naikal, George Chen, Jacky Chen, Tim Liu, Stephen Shum, Stewart He, Victor Sanchez


Outline

Problem statement

Localization

Model construction

Applications

Future work


3D Outdoor Modeling

• Combine airborne and ground-based laser scans and camera images
• Acquisition vehicle in motion as data is being collected

[Diagram: ground-based modeling (building facades) and airborne modeling (rooftops & terrain) are registered and fused into a 3D city model]


Interactive Video of 3D Model for Downtown Berkeley


ABC News Clip on Indoor Modeling

Indoor Modeling

Goals and objectives:
– 3D, fast, automated, photo-realistic models of building interiors
– Enables virtual walk-throughs and fly-throughs
– Visualize exterior and interior; seamless transition between the two

Applications of indoor modeling:
– Virtual reality, games and entertainment, training & simulation, architecture, construction, real estate, first responders, emergency management, …

Today’s solutions for indoor mapping:
– Wheeled devices on even, smooth surfaces
– Output: 2D maps rather than textured 3D models

What about uneven surfaces, e.g. staircases?
What about photo-realistic textured 3D models?

Proposed Approach to 3D Indoor Modeling

Use a human operator rather than wheeled devices in order to map/model uneven surfaces and tight environments.

Concept of operation:
– Equip a backpack with sensors
– Walk around a building to acquire the data
– Process data offline

Challenges:
– Weight/power limitations for a human operator with a backpack
– Unlike outdoor modeling: no GPS inside buildings; no aerial imagery to help with localization
– Unlike wheeled systems with only 3 degrees of freedom (x, y, & yaw): need to recover six degrees of freedom for a human operator: x, y, z, yaw, pitch, roll


Human Operators …

Data Acquisition

[Backpack hardware diagram: laser scanners L-H1, L-V1, L-V2, L-V3; cameras C-L, C-R; InterSense OMS; Honeywell HG9900 IMU; Applanix navigation computer; laptop. L = laser, C = camera, H = horizontal, V = vertical]


System Components

• High-performance laptop with three striped RAID hard drives
• Two Point Grey Grasshopper cameras: 5 megapixel, with fisheye lenses with 183 degree field of view
• Five Hokuyo URG indoor laser range finders (LRF): 40 Hz scan rate, 30 m range, 1081 points per scan, 0.25 degree angular resolution, 270 degree field of view
• InterSense InertiaCube3 Orientation Measurement Sensor (OMS)
• External 170 Wh lithium-ion battery pack with regulators
• Handheld Windows Mobile device to control start and stop of data collection
• Honeywell HG9900 navigation-grade Inertial Measurement Unit (IMU) and Applanix navigation computer:
  • Used only as ground truth for localization
  • Price tag: $0.5M
  • Requires zero-velocity updates (ZUPTs) after every 2 minutes of walking: walk 2 min, freeze 1 minute, …
  • Combines 3 ring laser gyros with bias stability < .003 deg/hr with 3 accelerometers with bias of less than 0.245 mm/s

Concept of Operation

• Localize:
  • Estimate position and orientation at each time instant
  • Six degrees of freedom: x, y, z, yaw, pitch, roll
  • Uses horizontal and vertical laser scanners & cameras
• Generate point cloud:
  • Stack vertical scans using localization
  • Uses vertical geometry laser scanners
• Reconstruct surface from the point cloud
• Texture map:
  • Uses cameras

System Diagram

[Pipeline: laser scanners (LH1, LV1, LV2, LV3), the OMS, and cameras (C-L, C-R) feed incremental localization; loop closure (global localization) refines the 6DOF poses; point cloud generation, surface reconstruction, and texture mapping produce the 3D textured model]


Incremental Localization: Iterative Closest Point (ICP) Scan Matching Algorithm

m = first scan, d = second scan

Iteratively finds the rotation R and translation t between the two scans.

Works well if:
– Scans are in the same plane: two translation parameters and one rotation
– Scans are more or less pre-registered: R and t small
– Environment has rich 3D geometric features
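A minimal point-to-point ICP in this spirit can be sketched as follows; the brute-force nearest-neighbour search and the closed-form SVD (Kabsch) alignment step are generic illustrative choices, not necessarily the exact variant used by the system.

```python
import numpy as np

def icp_2d(m, d, iters=20):
    """Align scan d onto scan m (2-D point-to-point ICP sketch).
    m: (N, 2) reference scan, d: (M, 2) moving scan.
    Returns rotation R (2x2) and translation t (2,) with d @ R.T + t ~ m."""
    R, t = np.eye(2), np.zeros(2)
    for _ in range(iters):
        d_tf = d @ R.T + t
        # brute-force nearest-neighbour correspondences (for clarity only)
        idx = np.argmin(((d_tf[:, None, :] - m[None, :, :]) ** 2).sum(-1), axis=1)
        nn = m[idx]
        # closed-form rigid alignment of d_tf onto nn (Kabsch / Procrustes)
        mu_d, mu_m = d_tf.mean(0), nn.mean(0)
        H = (d_tf - mu_d).T @ (nn - mu_m)
        U, _, Vt = np.linalg.svd(H)
        R_step = Vt.T @ np.diag([1.0, np.linalg.det(Vt.T @ U.T)]) @ U.T
        t_step = mu_m - R_step @ mu_d
        R, t = R_step @ R, R_step @ t + t_step
    return R, t
```

With two scans of an L-shaped corner that are "more or less pre-registered" (small R and t, as the slide requires), the loop converges to the relative pose in a few iterations.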

Outdoor Pose Estimation Using Scan Matching of Horizontal Scanner

Overlapping horizontal laser scans are continuously captured during vehicle motion (75 Hz); the relative 2D pose (a translation (u, v) and a rotation θ) is estimated by scan-to-scan matching between scans at t = t0 and t = t1.

[Pipeline: line segments are extracted from Scan 1 and Scan 2; candidate poses Q(u′, v′, θ′) are formed by rotating by R(θ′) and translating by t(u′, v′), and scored by robust least squares]

Robust least squares: min over (u′, v′, θ′) of Σj wj dj², with weights wj = exp(−dj² / (2σ²)), where dj is the distance from scan point j to its matched line segment.

Resulting Path from Scan Matching

Length: 24.3 km
Driving time: 78 minutes
Scans: 665,000
Scan points: 85 million
Camera images: 19,200
2 data acquisitions

Use Airborne Data to Globally Correct Vehicle Pose

Idea: ground-based façade scans should match edges in aerial photographs or digital surface models (DSM):

Maximize overlap between scan points and airborne edges.

Additional advantage: registered airborne DSM and ground-based data.

Path After Global Registration

Length: 24.3 km
Driving time: 78 minutes
Scans: 665,000
Scan points: 85 million
Camera images: 19,200
2 data acquisitions

Estimating Yaw Using Horizontal Scanner: Indoors

[Top view and laser scan plot showing the two sidewalls]

ICP scan matching: x, y, and yaw

Use Three Orthogonally Mounted Laser Scanners

[Side, top, and rear views: LH1, the horizontal scanner, sees the sidewalls (yaw); LV2, the vertical scanner pointing forward, sees floor and ceiling (pitch); LV1, the vertical scanner pointing sideways, sees sidewalls, floor, and ceiling (roll)]

Incremental Localization: 3xICP Algorithm

• Run ICP scan matching 3 times

Use three planar laser scanners:
– Top horizontal yaw scanner: ∆x, ∆y, ∆yaw
– Side-looking vertical roll scanner: ∆y, ∆z, ∆roll
– Side-looking vertical pitch scanner: ∆x, ∆z, ∆pitch

The 6DOF transformation is [∆x; ∆y; ∆z; ∆roll; ∆pitch; ∆yaw]
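Assembling the 6DOF increment from the three planar ICP outputs can be sketched as below; averaging the components observed by two scanners is an illustrative assumption, since the slides do not say how shared terms are reconciled.

```python
import numpy as np

def fuse_3xicp(yaw_scan, roll_scan, pitch_scan):
    """Combine three planar ICP results into one 6DOF increment.
    yaw_scan   = (dx, dy, dyaw)   from the top horizontal scanner
    roll_scan  = (dy, dz, droll)  from the side-looking vertical roll scanner
    pitch_scan = (dx, dz, dpitch) from the side-looking vertical pitch scanner
    Components observed by two scanners are simply averaged here (assumed)."""
    dx = 0.5 * (yaw_scan[0] + pitch_scan[0])
    dy = 0.5 * (yaw_scan[1] + roll_scan[0])
    dz = 0.5 * (roll_scan[1] + pitch_scan[1])
    # order matches the slide: [dx; dy; dz; droll; dpitch; dyaw]
    return np.array([dx, dy, dz, roll_scan[2], pitch_scan[2], yaw_scan[2]])
```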


Incremental Localization: 2xICP + OMS Algorithm

Motivation: 3xICP can have large errors.

OMS gives roll, pitch, and yaw; yaw values are not reliable:
– Affected by local magnetic fields, e.g. steel cabinets

New setup:
– Top horizontal yaw scanner ICP: ∆x, ∆y, ∆yaw
– Side-looking vertical scanner ICP: ∆z
– InterSense OMS: ∆roll, ∆pitch

Pitch/Roll Compensation of Yaw Scan Matching

[Example compensated scans at (roll, pitch) = (0 deg, 0 deg), (0 deg, 10 deg), (10 deg, 10 deg), (−10 deg, 10 deg)]

• Assume walls are vertical
• Rotate the uncompensated scan into 3-D space:
  • Augment with zeros for the z-coordinate
  • Multiply by the rotation matrix defined by the estimated roll and pitch angles
• The compensated scan is the projection of the 3-D scan onto the X/Y plane
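The compensation recipe above, as a sketch; the axis conventions and the roll-then-pitch rotation order are assumptions, since the slides do not specify them.

```python
import numpy as np

def compensate_yaw_scan(scan_xy, roll, pitch):
    """Pitch/roll-compensate a horizontal laser scan:
    lift the 2-D scan into 3-D with z = 0, rotate it by the OMS roll and
    pitch estimates, then project back onto the X/Y plane."""
    pts = np.column_stack([scan_xy, np.zeros(len(scan_xy))])  # augment with z = 0
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])  # roll about x (assumed)
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])  # pitch about y (assumed)
    return (pts @ (Ry @ Rx).T)[:, :2]                      # drop z: project to X/Y
```

With zero roll and pitch the scan is returned unchanged; a 10-degree pitch foreshortens ranges along the direction of motion by cos(10 deg).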

Incremental Localization: 1xICP + OMS + Planar Algorithm

• Top horizontal yaw scanner ICP: ∆x, ∆y, ∆yaw
• OMS: ∆roll, ∆pitch
• Planarity: absolute Z
  • Z is the perpendicular distance from the pitch scanner to the floor
  • Fit a scan line y = mx + b to the floor points seen by the pitch scanner
  • Pitch angle: p = tan⁻¹(1/m)
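The planarity constraint can be sketched as follows: fit the floor scan line and take the scanner-to-line perpendicular distance as the absolute Z. The function name and the least-squares line fit are illustrative choices.

```python
import numpy as np

def scanner_height(x, y):
    """Fit the floor points seen by the pitch scanner with a line
    y = m*x + b (least squares), then return the perpendicular distance
    from the scanner origin to that line as the absolute Z."""
    m, b = np.polyfit(x, y, 1)
    return abs(b) / np.hypot(m, 1.0)  # point-to-line distance from the origin
```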

Incremental Localization: Hybrid Algorithm

1xICP + OMS + Planar:
– Has very low error in Z, but only works with planar floors

2xICP + OMS:
– Works with non-planar floors, e.g. stairs, but has larger error

Hybrid algorithm combines the best of both worlds:
– Use regression and line fitting to detect sections of the environment that have planar floors, and adaptively switch between the two algorithms.

Shortcoming of Incremental Localization

• Dead reckoning has unbounded error: errors accumulate over time
• Global error is large even though incremental error is low

Loop closure:
• Revisit the same spot during acquisition to induce a “cycle” in the graph
• Build a graph G(V, E):
  • A node is the pose at a given time
  • The edge between two nodes is the incremental transformation Ti→j from node i to node j
  • Ti→j can come from any of the 4 incremental localization algorithms

[Chain of nodes 1 … 5 linked by T12, T23, T34, T45]

Loop Closure

Revisit the same location during acquisition:
– Estimate a 6-DOF transformation between the poses of the two visits.

Nonlinear graph optimization to estimate the “closed-loop” poses X:
– Each edge: transformation Tij with covariance Σij
– f(Xi, Xj): transformation from pose Xi to Xj
– Intuitively, the size of each covariance adjusts how strongly that edge constrains the poses

X* = argmin over X of Σ over edges (i, j) in E of (f(Xi, Xj) − Tij)ᵀ Σij⁻¹ (f(Xi, Xj) − Tij)

[Graph of nodes 1 … 10 with a loop-closure edge between node 10 and node 1]
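A toy 1-D analogue of this covariance-weighted graph optimization (poses on a line, with hypothetical variances) shows how a tight loop-closure edge redistributes accumulated drift over the whole chain:

```python
import numpy as np

def optimize_chain(odom, loop, var_odom=1.0, var_loop=0.01):
    """1-D pose-graph sketch: poses x_0..x_n on a line, odometry edges
    x_{i+1} - x_i = odom[i], and one loop-closure edge x_n - x_0 = loop.
    Each edge residual is weighted by its inverse variance; x_0 is pinned
    at 0 to fix the gauge. Solved as weighted linear least squares."""
    n = len(odom)  # number of odometry edges -> n + 1 poses
    rows, rhs, w = [], [], []
    for i, t in enumerate(odom):  # odometry constraints
        r = np.zeros(n + 1); r[i + 1], r[i] = 1.0, -1.0
        rows.append(r); rhs.append(t); w.append(1.0 / var_odom)
    r = np.zeros(n + 1); r[n], r[0] = 1.0, -1.0  # loop-closure constraint
    rows.append(r); rhs.append(loop); w.append(1.0 / var_loop)
    r = np.zeros(n + 1); r[0] = 1.0  # gauge constraint: pin x_0 = 0
    rows.append(r); rhs.append(0.0); w.append(1e6)
    A, b, W = np.array(rows), np.array(rhs), np.diag(w)
    return np.linalg.solve(A.T @ W @ A, A.T @ W @ b)  # normal equations
```

With four drifted odometry steps of 1 m and a loop closure saying the operator returned to the start, the optimizer spreads the 4 m of drift evenly across the chain.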

Summary of Localization

Estimate transformations T and their covariances Σ between:
• Adjacent nodes
• Each loop closure event

Build the graph → graph optimization via TORO → final estimated poses

Question: how to detect loop closures?

Loop Closure Detection

Use FAB-MAP [1] to generate a rank-ordered list of candidate image pairs for potential loop closures.

Computes the probability of each image belonging to each location:
– Uses SIFT features within a Bayesian inference and maximum likelihood framework

Loop closure event from a subset of Dataset 3.

[1] M. Cummins and P. Newman, IJRR 2008

Post-Process FAB-MAP with Keypoint Matching

Run 100 trials of FAB-MAP:
– Record image pairs with high probability

Post-process the image pairs with the highest counts to eliminate false positives:
– Keypoint matching (Lowe 2004)

For each candidate image pair (I1, I2), find the nearest and second-closest neighbors for every feature of I1 in I2, and compute the ratio of their distances.

Different distributions for correct and incorrect matches from an image pair:
– For a correct match, a larger percentage of features fall below the 0.6 ratio threshold

[Bar chart over image pairs in Dataset 1 (85-1, 18-3, 69-20, 84-3, 19-3); PDF of the percentage of features with ratio below the threshold 0.6, for incorrect vs. correct matches]
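The ratio-test statistic used above can be sketched as below; the brute-force descriptor matching is an illustrative stand-in for a real SIFT matcher.

```python
import numpy as np

def ratio_test_fraction(desc1, desc2, thresh=0.6):
    """For each descriptor of image 1, find its nearest and second-closest
    neighbours among image 2's descriptors and form the distance ratio
    (Lowe 2004). Returns the fraction of features whose ratio falls below
    the threshold: large for correct loop-closure pairs, small otherwise."""
    d = np.linalg.norm(desc1[:, None, :] - desc2[None, :, :], axis=-1)
    d.sort(axis=1)                # per-row ascending distances
    ratios = d[:, 0] / d[:, 1]    # nearest / second-closest
    return float(np.mean(ratios < thresh))
```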

Example of Detected Loop Closures

Results (1): Effect of Loop Closure

• Closure reduces global error

Results for a simple loop in a 30 m hallway.

Results (2): Comparing Algorithms

Set 1: operator 1; Set 2: operator 2

• “Planar” algorithms have the lowest global error in Z

Results for a simple loop in a 30 m hallway.

More Complex Environments: Stairs (2xICP + OMS)

[Plots for Sets 1–4]

Dataset | Path length | Average position error
1 | 68.73 m | 0.66 m
2 | 46.28 m | 0.35 m
3 | 46.28 m | 0.58 m
4 | 142.03 m | 0.43 m

Global roll and yaw error is larger for staircases.

3 Story Localization


Loop Closure Detection With Multiple Cameras

[Two pose graphs over nodes 1–9: one left-facing camera vs. two opposing cameras]

One left-facing camera:
• Loop closures must be in a similar location and orientation
• Zero initial translation and rotation for the yaw scanner ICP

Two opposing cameras:
• Loop closures can have opposite heading
• Initial condition for the yaw scanner: zero for translation, pitch, roll; 180 deg. for yaw

Double Camera Loop Closure

• Prune candidate loop closures with erroneous transformations:
  • Metrics from scan matching
  • Metric based on the transformation of 3D locations of matched image features between loop closure images


Generating Point Clouds

3D Colored Point Cloud: from outside

3D Colored Point Cloud: from inside

Staircase Point Clouds: two & three stories

Interactive video

Point cloud for pushcart with wheels


Surface Reconstruction from Point Cloud

1. Piecewise planar surface model; assumes:
– Floor and ceiling are planar & horizontal
– Walls are planar & perpendicular to the floor

2. Surface models from scan line triangulation:
– Take advantage of the structure in the scan line ordering of the laser to generate a mesh of triangles
– Carlberg et al., 3DPVT 2008

Method 2 results in more geometrically rich models, but also has more artifacts.

Overview of Plane Fitting

Pipeline: principal component analysis (PCA) → classification & segmentation → cluster analysis → plane fitting (RANSAC)

PCA: For each input point pi of the point cloud, perform a principal component analysis on the ball neighborhood of radius σ centered on pi.

The eigenvector associated with the smallest eigenvalue is an estimate of the normal vector ni to the surface within the ball neighborhood centered on pi.
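The PCA step can be sketched directly as an eigendecomposition of the neighbourhood covariance; the brute-force neighbourhood query is an illustrative simplification.

```python
import numpy as np

def estimate_normal(points, p, radius):
    """PCA normal estimate: take the ball neighbourhood of the given radius
    around point p; the eigenvector of the neighbourhood covariance with the
    smallest eigenvalue approximates the surface normal at p."""
    nbrs = points[np.linalg.norm(points - p, axis=1) <= radius]
    cov = np.cov((nbrs - nbrs.mean(0)).T)
    vals, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    return vecs[:, 0]                 # smallest-eigenvalue eigenvector
```

For a patch of points on a horizontal plane, the estimate is (up to sign) the vertical axis.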

Point Classification and Segmentation

Each point pi is classified as part of one of three types of structures:
– Wall-X: “parallel” to the y-z plane
– Wall-Y: “parallel” to the x-z plane
– Ceiling-floor: “parallel” to the x-y plane

[Original point cloud of a T-shaped hall; type “Wall-X” structure]

• Each structure is segmented and analyzed separately.

Cluster Analysis

Analyze each structure separately to divide it into “clusters”, e.g. wall, floor, ceiling.

Use Euclidean distance.

[Type “Wall-X” structure; a cluster for a wall]

Plane Fitting

Use RANSAC to fit a plane to each cluster.

Inliers determine the extent of each plane.

Hard to deal with stairs.

[A cluster of points representing a wall; inliers projected onto the fitted plane]
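A minimal RANSAC plane fit in this spirit; the iteration count and inlier tolerance are illustrative assumptions.

```python
import numpy as np

def ransac_plane(pts, iters=200, tol=0.02, rng=None):
    """RANSAC plane fit sketch: repeatedly pick 3 points, form the plane
    through them, and keep the plane with the most inliers within tol.
    Returns (unit normal n, offset d) with n . x = d, plus the inlier mask."""
    rng = np.random.default_rng(rng)
    best = (None, None, np.zeros(len(pts), bool))
    for _ in range(iters):
        a, b, c = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(b - a, c - a)
        if np.linalg.norm(n) < 1e-12:
            continue  # degenerate (collinear) sample
        n = n / np.linalg.norm(n)
        inliers = np.abs(pts @ n - a @ n) < tol
        if inliers.sum() > best[2].sum():
            best = (n, a @ n, inliers)
    return best
```

On a cluster that is mostly one wall plus stray points, the recovered normal matches the wall and the inlier mask gives the plane's extent.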


Overall Approach to Texture Mapping

For every triangle:
1. Find the “candidate” set of images
2. Filter out undesirable images from the candidate set
3. Find the best image in the filtered candidate set

Candidate set → filter → optimize → image to be used for texture mapping

Filter & Optimize

[Geometry: triangle normal N, camera look vector C, distance d, viewing angle α between −C and N]

Filter: choose camera poses that are close to the triangle and are “head on”, to maximize resolution:
– d < dt
– α < αt

Optimize: among all filtered candidates, choose the one that optimizes a joint function of distance and viewing angle:

max over i = 1 … number of cameras of [ (1/di) · (−Ci · N) ]
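The filter-then-optimize selection for a single triangle can be sketched as follows; the threshold values in the signature are illustrative assumptions.

```python
import numpy as np

def pick_texture_image(tri_center, tri_normal, cam_pos, cam_look,
                       d_max=5.0, angle_max=np.deg2rad(60)):
    """Drop cameras that are too far (d > d_max) or too oblique
    (viewing angle above angle_max), then pick the camera maximizing
    (1/d) * (-C . N), as on the slide. Returns the camera index, or -1."""
    n = tri_normal / np.linalg.norm(tri_normal)
    best_i, best_score = -1, -np.inf
    for i, (p, c) in enumerate(zip(cam_pos, cam_look)):
        c = c / np.linalg.norm(c)
        d = np.linalg.norm(tri_center - p)
        cos_a = (-c) @ n  # 1.0 when viewing the triangle head-on
        if d > d_max or cos_a < np.cos(angle_max):
            continue      # filtered out
        score = cos_a / d
        if score > best_score:
            best_i, best_score = i, score
    return best_i
```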

Choosing the Candidate Set

1. First, consider all images already used for a given plane i (neighborhood consistency).
2. If the filtered candidate set is empty, then consider images with a timestamp within Δt seconds of the last used image:
– Deals with path localization error (the path passes the same wall at two different times).
3. If the filtered candidate set from step 2 is empty, then consider all available images.

Piecewise Planar Model with Image Mosaicing

Looking from outside of hallway

Looking down hallway

Piecewise Planar Model with Image Mosaicing

Scan Line Triangulation Model of Cory Hall, Fourth Floor

East wall

Looking from outside of hallway

Ceiling, west wall

Scan Line Triangulation Model of Cory Hall, Fourth Floor

Looking down hallway

Artifacts of Texture Mapped Models

Close-up of west wall:
– Seam between image j and image j+1 of the texture atlas, due to slight illumination differences between subsequent images
– 3D-to-2D mis-registration due to small errors in the world-to-image transformation
– Localization error

Blending Refined Images

– Image blending removes discontinuities in texture caused by illumination differences between successive images.
– For each triangle, the texture is a weighted average of pixels from two images:

p = αi · Ii(xi, yi) + (1 − αi) · Ii+1(xi+1, yi+1)

Ii(xi, yi): pixel from image i
Ii+1(xi+1, yi+1): pixel from image i+1
αi: image weight, a function of the time at which image i was acquired

[Plot: image weights αi−1, αi, αi+1 ramping between 0 and 1 over time]

Unblended vs. blended examples.
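The weighted average can be sketched as below; the linear ramp between the two capture times is an assumed form of the weighting curves shown on the slide.

```python
import numpy as np

def blend_weight(t, t_i, t_next):
    """Temporal blending weight sketch: alpha ramps linearly from 1 at the
    capture time of image i down to 0 at the capture time of image i+1
    (clipped outside that interval)."""
    return float(np.clip((t_next - t) / (t_next - t_i), 0.0, 1.0))

def blend_pixels(p_i, p_next, alpha):
    """Weighted average of the two candidate texture pixels:
    p = alpha * I_i + (1 - alpha) * I_{i+1}."""
    return alpha * p_i + (1.0 - alpha) * p_next
```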

Gain Compensation and Blending

M. Brown and D. G. Lowe, Automatic Panoramic Image Stitching Using Invariant Features, IJCV 2007

Before:

After:

Interactive Rendering of a Planar model with Gain Compensation and Blending

Texture Mis-alignments

Laser-based localization is not pixel-accurate.

Use images to refine localization. Three approaches:
– Graph optimization using images
– Image stitching per plane (M. Brown and D. G. Lowe, Automatic Panoramic Image Stitching Using Invariant Features, IJCV 2007)
– MRF formulation

Graph Optimization Using Images

• Add new edges/transformations to the graph using image matching.
• Use Levenberg-Marquardt nonlinear minimization to enforce consistency of photographic views of a set of 3D points.

[Pose graph with added image-based transformation edges]

Graph Optimization Results


Before

After

Localization Drift due to Image based Graph Optimization

• Z error increases from left to right
• Image matching changes the localization path & point cloud

Example of Graph Based Optimization


before

after

Example of Image Stitching


Interactive video

MRF Formulation

• Cast texture selection and alignment as a labeling problem [Lempitsky and Ivanov, 07]:
  • Include image transformations to generate more image candidates
• The minimization is a Markov Random Field (MRF) problem which can be solved using Graph Cut [Boykov et al. 01]:

min over the labeling l of Σ from i = 1 to N of Ed(li) + Σ over edges (i, j) of Es(li, lj)

Quality function: Ed(li) = || Tri i − Cam li ||²

Smoothness function: Es(li, lj) = Σ from k = 1 to m of ( I li(pk) − I lj(pk) )²

N: number of triangles
m: number of sampling points in an edge
Tri i: position of the i-th triangle
li: image label for the i-th triangle
I l: image with label l
Cam l: camera position of image I l
pk: the k-th sampling point

[Comparison: quality term only vs. quality and smoothness terms]

V. Lempitsky and D. Ivanov, Seamless Mosaicing of Image-Based Texture Maps, CVPR 2007.
Y. Boykov, O. Veksler and R. Zabih, Fast Approximate Energy Minimization via Graph Cuts, PAMI 2001.


Interactive Rendering of Models

Application to Image-Based Rendering

Side products of 3D modeling:
– Pose of captured imagery: which images to render for a given view
– 3D point cloud and fitted planes: occlusions

3 steps:
– Choose the “best” image(s), head on and close by, via angle and distance criteria
– Check for occlusion; remove occluded images
– Mosaic & blend the remaining images

A user navigates with the keyboard and mouse to rotate, translate, and switch between ceiling, floor, and side cameras.

Example: 5 images stitched together per frame results in 5 frames per second.

Interactive video

Application to Mobile Indoor Augmented Reality

– Image-based localization is key in mobile AR applications:
  Superimpose meta-data on the query image from a mobile device, e.g. location-based advertising.
– Key ingredient for image-based localization: a geo-tagged image database with full pose for each image
  – Easy outdoors: GPS, IMU
– Use the pose recovered from 3D indoor modeling to build a geo-tagged indoor image database

Indoor Image Retrieval Performance

[Plot: match % vs. top N retrieved images for Cory Hall floors 2 and 5]

Cory Hall fifth floor: offices; 2nd floor: display cases, posters, pictures.

Examples of 2nd & 5th floor Cory pictures: database images vs. cell phone queries.

Future work

Surface reconstruction needs more work:
– Planar and triangulation are too naive
– Need more sophisticated schemes such as ball pivoting
– Need to deal with modeling staircases, etc.

Reduce texture mis-alignment.

Improve localization:
– Pixel-level accuracy
– Small errors in localization result in noticeable visual artifacts: both geometry and texture degrade

Water-tight models as an a priori constraint.

Simplify models to reduce size:
– Rendering and interactivity

Characterize accuracy more systematically:
– Volumetric characterization rather than localization

Applications

Architecture:

– Reverse engineering floor plans

– High rise offices, residential, government offices.

Plant and factory facilities:

– Keeping track of location and condition of equipment & assets in factories

Public areas such as schools, hospitals:
– Useful for firefighters, law enforcement, first responders

Airport, train/bus stations, transportation facilities

Music halls, stadiums, theatres, public event spaces

Gaming and entertainment

Underground mines and tunnels
