Date post: 15-Dec-2015
Efficient Selection Of Disambiguating Actions for Stereo Vision Ronald Parr Duke University Joint work with Monika Schaeffer (Duke University)

Efficient Selection of Disambiguating Actions for Stereo Vision

Ronald Parr, Duke University

Joint work with Monika Schaeffer (Duke University)

Traditional Stereo

These points match really well

A benchmark stereo pair (Middlebury)

• Lots of texture

• Small disparity range

How realistic is this?

What Robots See

• Huge disparity range

• Large areas with little or no texture

Let’s go down this hallway without hitting a wall!

?????

LSRC hallway

Why not Use a Laser Range Finder?

● Weight

● Cost $$

● 2D or $$$

● (lack of) stealth

● Power consumption

● Low data bandwidth

● Calibrated moving parts

● Sensor can drive robot design

Alternatives

● Sonar

● 3D laser

– Slow

– $100K

● Rotated/Rotating 2D laser

– Retains nearly all disadvantages of 2D laser

– Information per sweep: ~100 Kbits

Motivation

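Laser range finders vs. traditional stereo

● Benefits

– Laser range finders: Accurate

– Traditional stereo: Cheap, Light, Stealthy

● Problems

– Laser range finders: Expensive, Bulky, Impractical, Calibrated mechanics

– Traditional stereo: Fails where robots need accuracy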

Motivation

Accurate

Cheap

Light

Stealthy

Expensive

Bulky

Impractical

Calibrated mechanics

Fails where Robots need accuracy

Active Stereo Vision (stereo + laser pointer)

Benefits

Problems

Active Stereo Vision

● Take base stereo pair of images

● Take stereo pair(s) with addition of laser line (only crude calibration needed)

● Image subtraction isolates laser line

● Line disambiguates pixel matchings between pair

The laser line divides problem into two independent stereo problems.
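The image-subtraction step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: images are plain lists of grayscale rows, and the function names, the threshold value, and the disparity convention (left column minus right column) are all assumptions made for the example.

```python
def isolate_laser(base, lasered, threshold=50):
    """For each scanline, return the column where (lasered - base) is largest,
    or None if no pixel brightened by at least `threshold` (assumed value)."""
    hits = []
    for base_row, laser_row in zip(base, lasered):
        diffs = [l - b for b, l in zip(base_row, laser_row)]
        col = max(range(len(diffs)), key=diffs.__getitem__)
        hits.append(col if diffs[col] >= threshold else None)
    return hits


def laser_disparity(left_hits, right_hits):
    """Pair the laser column found in each image, scanline by scanline.
    Assumed convention: disparity = left column - right column."""
    return [
        None if l is None or r is None else l - r
        for l, r in zip(left_hits, right_hits)
    ]


def with_laser(img, cols):
    """Test helper: paint a bright pixel at cols[row] on a copy of img."""
    return [
        [255 if c == col else v for c, v in enumerate(row)]
        for row, col in zip(img, cols)
    ]


base = [[10, 10, 10, 10, 10] for _ in range(3)]
left = with_laser(base, [3, 2, None])    # third scanline has no laser
right = with_laser(base, [1, 1, None])
matches = laser_disparity(isolate_laser(base, left), isolate_laser(base, right))
```

Each scanline where the line is found in both images yields one pinned match, which is what lets that scanline be treated as two independent stereo problems.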

Using the laser

A real stereo pair of images

An artificial stereo pair of images

Test set: Lain

m x n x d: 600 x 900 x 160

Using the laser

Shine laser

Calculate laser lines using ground truth image

Line in right image = line in ground truth

Brighter in ground truth = farther right in left image

Using the laser

Extract laser lines and update matchings

[Figure: extracted laser lines in the stereo pair, annotated "These points match" and "These segments match"]

Our Prototype

But what about…

● Bulk, size & cost?

– Prototype is much larger than necessary

– P/T head need not be high quality (calibration not needed)

● Stealth & speed?

– Only use laser when/where necessary

– Plan laser aims to reduce entropy

– This is our sensor planning problem

How is Stereo Different?

● Extremely large event space

– Millions of pixels/image

– Hundreds of values for each pixel

● Cost of inference is high

● (naïve) one step lookahead is impossible

● Our main result: We can determine the aim point for the laser that maximizes expected entropy reduction (information gain) in the same asymptotic complexity as one run of stereo

Doing Stereo

● Bobick & Intille present stereo as a shortest path problem

● Construct the Disparity Space Image (DSI)

● Find the shortest path in linear time using dynamic programming

● Path through DSI = Stereo Matching for a scanline

● Costs:

● Assume n pixels/scanline (thousands)

● Max disparity level of d (hundreds)

● O(nd) per scanline

Constructing the DSI

The DSI takes on three equivalent forms:

– A d x n image containing information about the quality of matchings for a scanline

– A d x n x 3 graphical structure where paths through the graph represent valid pixel matchings for the scanline

– An n-state HMM with O(d) possible values per state.

Constructing the DSI

As an image:

– A pixel in the right scanline is a column in disparity space.

– A pixel in the left scanline is a diagonal in disparity space.

– The left and right values are run through a cost function to get the matching score.

– Not shown: Occlusion penalties
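The image form of the DSI can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the squared-difference cost function, the name `build_dsi`, and the use of infinity for cells that fall outside the left scanline are all choices made for the example, and occlusion penalties are left out, as on the slide.

```python
import math


def build_dsi(left, right, max_disparity, cost=lambda a, b: (a - b) ** 2):
    """Build a (max_disparity + 1) x n disparity space image for one scanline.

    dsi[d][x] scores matching right pixel x with left pixel x + d, so a right
    pixel is a column and a left pixel is a diagonal (x + d constant).
    The default squared-difference cost is an assumed placeholder."""
    n = len(right)
    dsi = [[math.inf] * n for _ in range(max_disparity + 1)]
    for d in range(max_disparity + 1):
        for x in range(n):
            if x + d < len(left):          # left pixel must exist
                dsi[d][x] = cost(left[x + d], right[x])
    return dsi


# The left scanline is the right scanline shifted by one pixel,
# so the disparity-1 row should match perfectly where it is defined.
left = [0, 5, 9, 1]
right = [5, 9, 1, 7]
dsi = build_dsi(left, right, max_disparity=2)
```

Filling the table touches each of the O(nd) cells once, which matches the per-scanline cost quoted earlier.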

Constructing the DSI

As a graph:

– Each pixel in the DSI image corresponds to three nodes (M, L, R) representing the state of that pixel.

– Transitions from pixel (i,j):

● M, R, L to M of (i+1, j)

● M, L to L of (i, j-1)

● M, R to R of (i+1, j+1)

[Figure: DSI graph for columns x = i and x = i+1 at disparities d = j-1, j, j+1, with M, L and R nodes in each cell; arc cost labels: DL, s(i+1,j), DR]

Constructing the DSI

As an HMM:

– The M and R nodes within a column i are mutually exclusive and jointly exhaustive; they are treated as the possible values of state Si.

– The L nodes in a column encode a more complicated set of transitions from the Ms in that column to the Ms in the next column.

[Figure: the M/L/R node graph with adjacent columns labeled as HMM states Si and Si+1]

Finding the Shortest Path

● DSI is a highly structured DAG

– We define the set of predecessor nodes, Γ-

– Graph traversed from bottom to top, left to right.

● Shortest path can be found in linear time with dynamic programming. For node c,

$$sp(c) = score(c) + \min_{b \in \Gamma^-(c)} sp(b)$$
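A minimal sketch of this recurrence over the M/L/R graph follows. It uses the transitions listed earlier (M, R, L to M of (i+1, j); M, L to L of (i, j-1); M, R to R of (i+1, j+1)). Several details are assumptions for the example rather than the authors' choices: a path may start at any M node in column 0 and end at any node in the last column, M nodes are scored by the match cost `s[j][i]`, and L and R nodes by constant penalties `DL` and `DR`.

```python
import math


def shortest_path(s, DL, DR):
    """s[j][i] is the match cost at disparity j, column i.
    Returns (cost, path), where path is a list of (kind, i, j) nodes."""
    d, n = len(s), len(s[0])
    sp, back = {}, {}   # best cost to each node, and its predecessor

    def relax(node, score, preds):
        # sp(c) = score(c) + min over predecessors b of sp(b)
        best = min(preds, key=lambda b: sp.get(b, math.inf))
        if sp.get(best, math.inf) < math.inf:
            sp[node] = score + sp[best]
            back[node] = best

    for i in range(n):                      # left to right
        for j in range(d):
            if i == 0:
                sp[("M", 0, j)] = s[j][0]   # assumed start condition
            else:
                relax(("M", i, j), s[j][i], [(k, i - 1, j) for k in "MRL"])
                if j > 0:
                    relax(("R", i, j), DR, [(k, i - 1, j - 1) for k in "MR"])
        # L nodes stay in the same column and step down one disparity,
        # so they are filled after the column's M nodes, highest j first.
        for j in range(d - 2, -1, -1):
            relax(("L", i, j), DL, [(k, i, j + 1) for k in "ML"])

    end = min((c for c in sp if c[1] == n - 1), key=sp.get)
    path = [end]
    while path[-1] in back:
        path.append(back[path[-1]])
    return sp[end], path[::-1]


# Disparity jumps from 0 to 1 between the first and last column.
cost, path = shortest_path([[0, 9, 9], [9, 9, 0]], DL=1, DR=1)
```

Every node has at most three predecessors, so the work is proportional to the number of nodes, O(nd) per scanline.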

Query Selection

● To maximize the expected benefit of laser aims, we need a distribution over outcomes

● Arc costs are treated as unnormalized negative log probabilities

● Forward/backward algorithm to calculate node probabilities. For node c:

$$p_f(c) = e^{-score(c)} \sum_{b \in \Gamma^-(c)} p_f(b)$$

$$p_b(c) = e^{-score(c)} \sum_{b \in \Gamma^+(c)} p_b(b)$$

$$p(c) \propto p_f(c)\, p_b(c)$$

p_b is calculated backwards. Γ+ is the successor set.
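The forward/backward pass can be sketched on a generic DAG as below. This is an illustration under assumptions, not the authors' code: nodes carry the scores, a path's weight is taken to be exp(-sum of its node scores), and paths are assumed to run from nodes with no predecessors to nodes with no successors. Because both the forward and backward quantities include the node's own factor, the sketch divides that factor out once and normalizes by the total path weight Z. The names `node_marginals`, `order`, `preds` and `score` are hypothetical.

```python
import math


def node_marginals(order, preds, score):
    """order: nodes in topological order; preds[c]: predecessors of c;
    score[c]: cost of node c. Returns P(path passes through c) per node."""
    succs = {c: [] for c in order}
    for c in order:
        for b in preds.get(c, []):
            succs[b].append(c)

    w = {c: math.exp(-score[c]) for c in order}   # unnormalized node weight
    pf, pb = {}, {}
    for c in order:                                # forward pass
        inc = preds.get(c, [])
        pf[c] = w[c] * (sum(pf[b] for b in inc) if inc else 1.0)
    for c in reversed(order):                      # backward pass
        out = succs[c]
        pb[c] = w[c] * (sum(pb[b] for b in out) if out else 1.0)

    z = sum(pf[c] for c in order if not succs[c])  # total weight of all paths
    # w[c] appears in both pf[c] and pb[c]; divide it out so it counts once.
    return {c: pf[c] * pb[c] / (w[c] * z) for c in order}


# Diamond graph: a -> b -> d and a -> c -> d; the path through c is 3x costlier.
order = ["a", "b", "c", "d"]
preds = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
score = {"a": 0.0, "b": 0.0, "c": math.log(3), "d": 0.0}
p = node_marginals(order, preds, score)
```

In the diamond example the two paths have weights 1 and 1/3, so the node on the cheaper path receives probability 3/4 and the other 1/4.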

Query Selection

● Stereo matching = Path through DSI

● Path entropy through the DSI measures our confusion over the best path

● Query strategy: Maximize expected reduction in entropy

Query Selection

Use this observation from Anderson & Moore:

For entropy H(x), path space Π, and queries Qt:

IG(Qt) = H(Π) - H(Π|Qt)

Symmetry of mutual information:

IG(Qt) = H(Qt) - H(Qt|Π)

Markov property:

IG(Qt) = H(Qt) - H(Qt|St)

IG(Qt) = H(St) - H(St|Qt)

where H(St|Qt) is the expected entropy after query Qt.

Linear time!
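The identity IG(Qt) = H(Qt) - H(Qt|St) means the gain of a query can be computed from the marginal over St alone, without enumerating paths. A minimal sketch follows, assuming the marginals are already available (for example from a forward/backward pass) and assuming a hypothetical sensor model `p_obs_given_state[s][q]` giving the probability of outcome q when the state is s. Neither the model nor the function names come from the slides.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def information_gain(p_state, p_obs_given_state):
    """IG(Q) = H(Q) - H(Q | S) for one candidate query."""
    n_obs = len(p_obs_given_state[0])
    # Marginal over outcomes: p(q) = sum_s p(s) p(q | s)
    p_obs = [
        sum(ps * row[q] for ps, row in zip(p_state, p_obs_given_state))
        for q in range(n_obs)
    ]
    h_obs_given_state = sum(
        ps * entropy(row) for ps, row in zip(p_state, p_obs_given_state)
    )
    return entropy(p_obs) - h_obs_given_state


def best_query(marginals, obs_model):
    """Pick the aim t whose state marginal yields the largest gain."""
    return max(marginals, key=lambda t: information_gain(marginals[t], obs_model))


# Noiseless sensor over four disparities (assumed model).
identity = [[1.0 if q == s else 0.0 for q in range(4)] for s in range(4)]
marginals = {
    0: [1.0, 0.0, 0.0, 0.0],      # already certain: nothing to gain
    1: [0.25, 0.25, 0.25, 0.25],  # maximally uncertain: 2 bits to gain
}
aim = best_query(marginals, identity)
```

Each candidate aim is scored from its own marginal, so scanning all aims on a scanline adds no more than a constant factor to the forward/backward pass in this sketch.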

Updating the DSI

● If the laser is detected in both images, we split the DSI into two independent sections.

● Paths are funnelled through M node.

[Figure: DSI with left scanline and right scanline axes] There are no valid paths through these dead zones that match our observation.

Updating the DSI

● If the laser is detected in both images, we split the DSI into two independent sections.

● Many subtle details (ask later…)

Each side is now independent of the other.
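One way to picture the update is as a mask over the image form of the DSI. The sketch below is an interpretation, not the authors' procedure: it assumes `dsi[d][x]` pairs right pixel x with left pixel x + d, derives the dead zones from the ordering constraint around the observed match, and adds a finite `border_penalty` (an assumed value) to cells that share the matched column or diagonal instead of forbidding them outright.

```python
import math


def apply_laser_match(dsi, x_star, d_star, border_penalty=10.0):
    """Return a copy of dsi updated for a laser match at right column x_star
    with disparity d_star (so left column x_star + d_star)."""
    out = [row[:] for row in dsi]
    target = x_star + d_star                 # matched left-image column
    for d, row in enumerate(out):
        for x in range(len(row)):
            if (x, d) == (x_star, d_star):
                continue                     # the observed match itself
            left = x + d
            if x == x_star or left == target:
                row[x] += border_penalty     # border: possible but unlikely
            elif (x < x_star) != (left < target):
                row[x] = math.inf            # dead zone: breaks the ordering
    return out


# 4 disparities x 5 columns of uniform cost, laser seen at x = 2, d = 2.
dsi = [[1.0] * 5 for _ in range(4)]
updated = apply_laser_match(dsi, x_star=2, d_star=2)
```

With the dead zones removed, every remaining path must pass through the matched cell, and the cells to its left and right no longer interact.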

Real World Implementation

● Took one base image and roughly 200 lasered images (1000x650 px)

● Used all 200 images to establish “ground truth”

● Recalled the recorded laser aim nearest to each query to simulate real-time aiming

[Figure: original right image and our ground truth; labeled objects: Doorknob, Hinge, Copier]

Disparity map with no lasers

Entropy

After two laser aims

Entropy

After nine laser aims

Entropy

Results: Path Entropy

Results: Pixel Error

Results on existing images

We also ran the algorithm on two sets of existing images: the Middlebury benchmark set “cones”, and some artificially generated airport-security-camera-style images with little texture. We used ground truth to generate fake laser lines.

[Figure: the "cones" and "security cam" test images]

Results on Security Camera

Results on Cones

Conclusion

● Computational properties

– O(nd) complexity

– No asymptotic penalty for planning laser actions

● Practical benefits of hybrid system

– Small

– Inexpensive

– Selective use of laser

– Accuracy increases with laser use

Conclusion

● Results

– Shown to work on both fake and real world images

– Far more accurate than stereo alone

– Better than random or equally spaced aims

Questions?

Thanks to: Carlo Tomasi, NSF, SAIC, IAI, Sloan Foundation.

Updating the DSI

● The ordering constraint is an assumption that keeps the stereo algorithm linear.

● It does not necessarily hold in the real world.

● The laser sometimes picks up on this.

[Figure: DSI with left scanline and right scanline axes] Detectable because violations occur in previously established dead zones.

Updating the DSI

● Pixels in one image do not necessarily map one to one with pixels in the other image.

● The borders of dead zones must be left possible, though improbable

[Figure: DSI with left scanline and right scanline axes]

Query Selection

● We could also calculate the expected path entropy reduction in linear time using dynamic programming...

$$h(c) = p(c) + \sum_{b \in \Gamma^-(c)} \bigl( p(b) \log p(c) + h(b) \bigr)$$

Run forward to get the total path entropy; run in both directions to get the path entropy through each node.

