Constructing immersive virtual space for HAI with photos
Shingo Mori, Yoshimasa Ohmoto, Toyoaki Nishida
Graduate School of Informatics, Kyoto University
GrC2011 2011/11/09
Abstract
• We automatically construct immersive virtual spaces for human-agent interaction (HAI)
– Scenes are drawn from external photo images
– Depth maps are reconstructed to express occlusion
– Rough 3D models are added for agents
– Processing time is about 4.7 days to reconstruct a 20 m × 20 m virtual space
Introduction
• We want to observe human-human interaction (HHI) using HAI in a virtual space
• For example, for a virtual sightseeing task:
– we can select a faraway place, such as a foreign country
– we can easily prepare an environment to observe
• Our goal: create a system that constructs an environment for such a task
Introduction
• To do the sightseeing task and observe interaction, the environment should look like the real world
– virtual spaces should be immersive
– scenes recreated from real-world photos are needed
– the spatial relationship between agents and objects should be correct
– users can walk freely to some extent
• How to construct such a virtual space?
Related Work
• Model Based Rendering (MBR)
– can reconstruct 3D models
– makes arbitrary consistent views easily
– weak at trees or texture-less surfaces
• [1-3] are good methods, but
– [1] can't be used for outdoor scenes because it relies on the Manhattan World Assumption
– [2,3] need expensive equipment or lots of time and effort
[1] Furukawa et al. 2010, Reconstructing building interiors from images
[2] Pollefeys et al. 2008, Detailed real-time urban 3D reconstruction from video
[3] Ikeuchi et al. 2004, Bayon digital archival project
Related Work
• Image Based Rendering (IBR)
– makes a new viewpoint image by interpolation
– draws complex structures such as natural objects clearly
– weak at occlusion
• [4-5] have good image quality, but
– they don't consider agents
– movable space is restricted
[4] Google Street View
[5] Ibuki, 2009, Reduction of Unnatural Feeling in Free-viewpoint Rendering Using View-Dependent Deformable 3-D Mesh Model (in Japanese)
Our Method
• To make the immersive environment, we use IBR
– because high image quality is needed to show the scene
– we use panorama images and an omnidirectional display to show the environment
Our Method
• To collect photo images
– divide the space into a 1-2 m grid
– shoot about 18 photos in each grid cell
• We use interpolation when moving from one shooting point to another
[Figure: shooting layout; labels: obstacle, shooting point, shooting direction, 1-2 meter spacing]
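The collection procedure can be sketched as a small planning routine. This is only an illustration: placing one shooting point at the centre of each grid cell, spreading the 18 headings evenly over 360 degrees, and the names `shooting_plan` and `is_blocked` are all assumptions, not details taken from the slides.

```python
import math

def shooting_plan(width_m, depth_m, spacing_m, shots_per_cell=18, is_blocked=None):
    """Return (x, y, heading_deg) for every planned photo.

    Assumptions: one shooting point at the centre of each grid cell, and
    headings spread evenly over 360 degrees.
    """
    nx = math.ceil(width_m / spacing_m)
    ny = math.ceil(depth_m / spacing_m)
    step_deg = 360.0 / shots_per_cell
    plan = []
    for ix in range(nx):
        for iy in range(ny):
            x = (ix + 0.5) * spacing_m
            y = (iy + 0.5) * spacing_m
            if is_blocked is not None and is_blocked(x, y):
                continue  # skip cells occupied by an obstacle
            for k in range(shots_per_cell):
                plan.append((x, y, k * step_deg))
    return plan

if __name__ == "__main__":
    for spacing in (1.0, 2.0):
        print(spacing, len(shooting_plan(20.0, 20.0, spacing)))
```

Under these assumptions, a 20 m × 20 m space with no obstacles needs 1,800 photos at 2 m spacing and 7,200 photos at 1 m spacing.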
Our Method
• 3D geometry is needed for agents
– use Structure from Motion and a stereo method, in a similar way to [1,5]
– create a depth map for occlusion between objects and agents
• This information is used for better IBR
– camera position & rotation
– 3D positions of a point cloud
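How a depth map expresses occlusion between the photographed scene and a rendered agent can be illustrated with a per-pixel depth test. This is a minimal sketch, not the system's renderer: the function name `composite_agent`, the nested-list image layout, and the use of `None` for pixels the agent does not cover are assumptions.

```python
def composite_agent(scene_rgb, scene_depth, agent_rgb, agent_depth):
    """Overlay a rendered agent on a photo using a per-pixel depth test.

    All arguments are equally sized 2D lists (rows of pixels).
    agent_depth holds None where the agent does not cover the pixel.
    """
    out = []
    for s_row, sd_row, a_row, ad_row in zip(scene_rgb, scene_depth, agent_rgb, agent_depth):
        row = []
        for s, sd, a, ad in zip(s_row, sd_row, a_row, ad_row):
            # The agent is visible only where it is closer to the camera
            # than the scene surface stored in the depth map.
            if ad is not None and ad < sd:
                row.append(a)
            else:
                row.append(s)
        out.append(row)
    return out
```

With this test, an agent standing behind a photographed object is hidden by it, and an agent in front of the object is drawn over it.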
System Pipeline
[Figure: System of Constructing Virtual Space]
Input: Photos
Processes: Structure from Motion, Multi-view Stereo, Segmentation, Creating Depth Map, Creating Panorama, Interpolation
Outputs: camera parameters, CMVS patches, rough 3D model, segmented image, depth map, panorama image, panorama depth map, interpolated image
Final step: Show an Immersive Virtual Space
Legend: "Use previous work" / "Tackle in this research"
Structure from Motion (SfM)
• Estimate camera parameters (projection matrix) from multiple photos
– we use Bundler [6]
[Figure labels: photos; point cloud and camera position]
[6] Snavely et al. 2006, Photo tourism: exploring photo collections in 3D
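The role of the estimated camera parameters can be shown with a pinhole projection of one 3D point. This is a sketch under stated assumptions: the projection matrix is taken as K [R | t] with the camera looking along +z, and the function name `project` is hypothetical. Bundler's output uses its own sign conventions, which are not reproduced here.

```python
def project(K, R, t, X):
    """Project world point X to pixel coordinates with P = K [R | t].

    K and R are 3x3 nested lists, t and X are length-3 sequences.
    Returns ((u, v), depth), where depth is the z coordinate in the
    camera frame (assumed positive in front of the camera).
    """
    # World -> camera coordinates: Xc = R X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Camera -> homogeneous image coordinates: x = K Xc
    x = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return (x[0] / x[2], x[1] / x[2]), Xc[2]

if __name__ == "__main__":
    K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(project(K, R, [0.0, 0.0, 0.0], [1.0, 0.5, 5.0]))
```

Projecting every reconstructed point into a photo this way gives sparse pixel positions with known depth, which is the kind of input a depth-map step needs.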
Multi-view Stereo
• Reconstruct 3D geometry
– we use CMVS [7] and Poisson Surface Reconstruction [8]
– get a point cloud (patches) and a rough 3D model
[Figure labels: photos and translate matrix; patches and rough 3D model]
[7] Furukawa et al. 2010, Towards internet-scale multi-view stereo
[8] Kazhdan et al. 2006, Poisson surface reconstruction
Create Depth Map
• Deal with holes and outliers of the point cloud
• Assume that the real world is composed of planar surfaces
– reconstruct surfaces from projected patches
– vertical surfaces can mostly be reconstructed
[Figure labels: raw image; segmented image; project patches; depth map]
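One way to fill holes under the planar assumption is to fit a plane to the projected patches that fall inside one image segment. The sketch below is an illustration, not the authors' procedure. It relies on the fact that, for a pinhole camera, the inverse depth of a planar surface is an affine function of pixel coordinates, so it fits 1/z = a·u + b·v + c by least squares. The function names are hypothetical, and outlier rejection (for example a robust estimator) is omitted.

```python
def _det3(m):
    """Determinant of a 3x3 nested list."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_inverse_depth_plane(samples):
    """Least-squares fit of 1/z = a*u + b*v + c.

    samples: (u, v, z) tuples from patches projected into one segment.
    Returns (a, b, c).
    """
    S = [[0.0] * 3 for _ in range(3)]  # normal matrix A^T A
    r = [0.0] * 3                      # right-hand side A^T y
    for u, v, z in samples:
        row = (u, v, 1.0)
        y = 1.0 / z
        for i in range(3):
            r[i] += row[i] * y
            for j in range(3):
                S[i][j] += row[i] * row[j]
    det = _det3(S)
    if abs(det) < 1e-12:
        raise ValueError("samples are degenerate (e.g. collinear pixels)")
    params = []
    for k in range(3):  # Cramer's rule: replace column k with r
        M = [list(rw) for rw in S]
        for i in range(3):
            M[i][k] = r[i]
        params.append(_det3(M) / det)
    return tuple(params)

def depth_at(params, u, v):
    """Depth predicted by the fitted plane at pixel (u, v)."""
    a, b, c = params
    return 1.0 / (a * u + b * v + c)
```

Evaluating `depth_at` for every pixel of the segment yields a dense depth map, including pixels where no patch was reconstructed.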
Create Panorama Image
• To show a scene on an omnidirectional display, we create panorama images
– we use Microsoft ICE [9]
– canonicalize the direction of each panorama image using the camera rotation
[Figure: panorama image and depth map]
[9] Microsoft Corporation, Microsoft Image Composite Editor, http://research.microsoft.com/en-us/um/redmond/groups/ivm/ice.html
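Canonicalizing the direction of a panorama can be pictured as a horizontal shift. This sketch assumes an equirectangular panorama whose width spans 360 degrees and a yaw angle already derived from the SfM camera rotation. The function name `canonicalize_panorama` and the sign of the shift are assumptions.

```python
def canonicalize_panorama(rows, yaw_deg):
    """Rotate an equirectangular panorama so that all panoramas share
    the same reference direction.

    rows: list of pixel rows (lists); the width is assumed to cover 360 degrees.
    yaw_deg: heading of the panorama's first column relative to the
             chosen reference direction (assumed to come from SfM).
    """
    width = len(rows[0])
    shift = round(yaw_deg / 360.0 * width) % width
    # A cyclic column shift is a rotation about the vertical axis.
    return [row[shift:] + row[:shift] for row in rows]
```

The same shift would be applied to the panorama depth map so that colour and depth stay aligned.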
Interpolation
• To move freely, we create interpolated images between nearby panorama images
• The direction and distance to objects change correctly as the viewpoint moves
[Figure labels: two raw panorama images about 1-2 m away from each other; project patches to use as feature points; find corresponding points; interpolate by morphing (midpoint between raw images)]
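The control-point part of morphing can be sketched as linear interpolation of corresponding feature points, with t = 0.5 giving the midpoint between the two raw panoramas. This covers only the point and colour blending, not the image warp itself. The function names, the equirectangular wrap-around handling, and the parameter `width` are assumptions.

```python
def interpolate_points(pts_a, pts_b, width, t=0.5):
    """Interpolate corresponding feature points between two panoramas.

    pts_a, pts_b: lists of (x, y) pixel positions of the same 3D points.
    width: panorama width in pixels; x is assumed to wrap around at
           `width` (equirectangular layout).
    t: 0.0 gives pts_a, 1.0 gives pts_b, 0.5 the midpoint.
    """
    result = []
    for (xa, ya), (xb, yb) in zip(pts_a, pts_b):
        # Take the shorter way around the panorama seam.
        dx = ((xb - xa + width / 2.0) % width) - width / 2.0
        x = (xa + t * dx) % width
        y = (1.0 - t) * ya + t * yb
        result.append((x, y))
    return result

def cross_dissolve(pixel_a, pixel_b, t=0.5):
    """Blend two colours of a matched pixel pair."""
    return tuple((1.0 - t) * ca + t * cb for ca, cb in zip(pixel_a, pixel_b))
```

A full morph would warp both raw images toward the interpolated point positions and then blend them with `cross_dissolve`.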
Processing Time
• We experimented with 3 spaces
• Most of the processing time is spent on SfM
– this could be improved drastically by using [10]
• Each shooting session took about one hour
[10] Agarwal et al. 2009, Building Rome in a day
Conclusion
• Conclusion
– created a system to automatically construct virtual spaces for HAI
– unified various methods to create the system
• Future work
– expand virtual spaces
– research how natural and useful it is for HAI
– observe HAI and feed back to the real world