www.crs4.it/vic/
Massive Model Rendering
Fabio Marton
CRS4
Visual Computing
F. Marton– CRS4/Visual Computing, October 2012
Goal: interactive inspection of massive models on PC platforms…
Massive datasets rendered on a commodity PC
Application domains / data sources
• Many important application domains
– Local terrain models: 2.5D, flat, dense regular sampling
– Planetary terrain models: 2.5D, spherical, dense regular sampling
– Laser scanned models: 3D, moderately simple topology, low depth complexity, dense
– CAD models: 3D, complex topology, high depth complexity, structured, ‘ugly’ mesh
– Natural objects / simulation results: 3D, complex topology + high depth complexity + unstructured/high frequency details
• Today’s models exceed
– O(10^8-10^10) samples
– O(10^9-10^11) bytes
• Varying
– Dimensionality
– Topology
– Sampling distribution
The (minimal) challenge: real-time rendering of massive static models
• Explore very large models at interactive rates
– Update screen at “interactive rates” as viewpoint changes
[Diagram: storage (giga/terabytes) → I/O over limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…) → projection + visibility + shading, driven by view parameters → screen (megapixels/frame at 10/100 fps)]
A real-time data filtering problem!
• Models of unbounded complexity on limited computers
– Need for output-sensitive techniques (O(N), not O(K))
• We assume less data on screen (N) than in model (K → ∞)
– Need for memory-efficient techniques (maximize cache hits!)
– Need for parallel techniques (maximize CPU/GPU core usage)
[Diagram: storage, O(K = unbounded) bytes (triangles, points, …) → I/O over limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…) → projection + visibility + shading, driven by view parameters → screen, O(N = 1M-100M) pixels at 10-100 Hz]
[Diagram annotation: small working set]
Output-sensitive techniques
• At preprocessing time: build MR hierarchy
– Data prefiltering!
– Visibility + simplification
– Not output sensitive
• At run-time: selective view-dependent refinement from out-of-core data
– Must be output sensitive
– Access to prefiltered data under real-time constraints
– Visibility + LOD
[Diagram labels: coarse, fine]
[Diagram labels: front, occluded / out-of-view, inaccurate, accurate]
Our contributions: GPU-friendly output-sensitive techniques
• Chunk-based multiresolution structures
– Combine space partitioning + level of detail
– Same structure used for visibility and detail culling
• Seamless combination of chunks
– Dependencies ensure consistency at the level of chunks
• Complex rendering primitives
– GPU programming features
– Curvilinear patches, view-dependent voxels, …
• Chunk-based external memory management
– Compression/decompression, block transfers, caching
[Diagram: off-line partitioning and simplification → multiresolution structure (data + dependency) → network / bus → on-line cache, adaptive rendering, GPU]
Our contributions: GPU-friendly output-sensitive techniques
• *-BDAM – Local and global terrain models. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). EG 2003, IEEE Viz 2003, EG 2005
• Adaptive TetraPuzzles – Dense meshes. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). SIGGRAPH 2004
• Layered Point Clouds – Dense clouds. Gobbetti/Marton (CRS4). SPBG 2004 / Computers & Graphics 2004
• Far Voxels – General. Gobbetti/Marton (CRS4). SIGGRAPH 2005
• MOVR – Volumetric models. Gobbetti/Marton/Iglesias Guitian (CRS4). CGI 2008
• Blockmaps – Hybrid volumetric city model. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Di Benedetto/Scopigno (CNR). EG 2007
[Diagram: the same list of methods, grouped by rendering approach: rasterization and raycasting]
[Diagram: the same list of methods, grouped into a mesh-based framework and a mesh-less framework]
• Chunked Multi-Triangulations. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). IEEE Viz 2005
• View-dependent volumetric model. In progress
[Diagram: arrows labelled "Specialize" and "Generalize" connect these two entries to the methods in the list]
Real-time adaptive meshes
• The problem: efficiently create view-dependent meshes
• Constraints:
– must approximate original surface with controlled screen-space error
– must preserve continuity (conforming meshes)
– must handle meshes of varying topology
– must be efficiently rendered
Chunked Multi Triangulations: The Multi Triangulation Framework
• Theoretical basis
– MT multiresolution framework (Puppo 1996)
• Our contribution
– GPU-friendly implementation based on surface chunks with boundary constraints
– Optimized implicit specializations (TetraPuzzles/V-Partitions)
– Parallel out-of-core pre-processing and out-of-core run-time
Cignoni, Ganovelli, Gobbetti, Marton, Ponchio, and Scopigno. Batched Multi Triangulation. In Proc. IEEE Visualization. Pages 207-214. October 2005.
[Diagram: off-line partitioning and simplification → multiresolution structure (data + dependency) → network / bus → on-line cache, adaptive rendering, GPU]
Chunked Multi Triangulations: The Multi Triangulation Framework
• Consider a sequence of local modifications over a given description D
– Each modification replaces a portion of the domain with a different conforming portion (simplified)
– $f_1$: floor
– $g_1$: the new fragment
$D' = (D \setminus f) \cup g$
$D_{i+1} = D_i \oplus g_{i+1}$
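A single local modification can be sketched with plain sets. This is only an illustration under stated assumptions: triangles are represented by hypothetical string identifiers, and the conformity of the fragment boundary, which the framework requires, is not checked here.

```python
from typing import FrozenSet

Triangle = str  # hypothetical triangle identifier, for illustration only


def apply_modification(d: FrozenSet[Triangle],
                       floor: FrozenSet[Triangle],
                       fragment: FrozenSet[Triangle]) -> FrozenSet[Triangle]:
    """Return D' = (D \\ f) ∪ g, where f is the floor and g the new fragment."""
    if not floor <= d:
        # A modification only makes sense if its floor is currently present.
        raise ValueError("floor is not contained in the current description")
    return (d - floor) | fragment


# Toy description with four triangles; two of them are replaced by one.
d0 = frozenset({"t1", "t2", "t3", "t4"})
d1 = apply_modification(d0, frozenset({"t1", "t2"}), frozenset({"s1"}))
```

Applying further modifications in sequence to `d1` gives the chain of descriptions written above as $D_{i+1} = D_i \oplus g_{i+1}$.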
Chunked Multi Triangulations: The Multi Triangulation Framework
• Dependencies between modifications can be arranged in a DAG
– Adding a sink to the DAG, we can associate each fragment to an arc leaving a node
Chunked Multi Triangulations: MT Cuts
• A cut of the DAG defines a new representation
– Just paste all the fragments above the cut
$D^* = D_0 \oplus g_1 \oplus g_4$
– Collect all the fragment floors of cut arcs and you get a new conforming mesh
$D^* = D_0 \oplus g_1 \oplus g_4 = f_{0\infty} \cup f_{02} \cup f_{03} \cup f_{13} \cup f_{1\infty} \cup f_{4\infty}$
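A valid cut must include every modification that a selected modification depends on. The sketch below computes that dependency closure. The DAG used here is hypothetical (the slide's figure is not available), chosen so that selecting `g4` pulls in `g1`, in the spirit of the example above.

```python
from typing import Dict, List, Set


def closure(selected: Set[str], deps: Dict[str, Set[str]]) -> Set[str]:
    """Return the selected modifications plus everything they depend on."""
    result: Set[str] = set()
    stack: List[str] = list(selected)
    while stack:
        node = stack.pop()
        if node not in result:
            result.add(node)
            # Follow dependency arcs towards earlier modifications.
            stack.extend(deps.get(node, set()))
    return result


# Hypothetical dependency DAG: g4 needs g1; g3 needs g1 and g2.
deps = {"g1": set(), "g2": set(), "g3": {"g1", "g2"}, "g4": {"g1"}}
cut = closure({"g4"}, deps)
```

Cascading dependencies show up directly in this picture: the more ancestors a modification has, the larger the closure that a single local refinement forces.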
Chunked Multi Triangulations: GPU Friendly MT
• Chunked MT assumes fragments are triangle patches with proper boundary constraints
– DAG << original mesh (patches composed of thousands of triangles)
– Structure memory + traversal overhead amortized over thousands of triangles
– Per-patch optimizations
Chunked Multi Triangulations: GPU Friendly MT
• Chunked MT assumes regions provide good hierarchical space-partitioning
– Compact
• Close-to-spherical
– Used for computing fast projected error upper bounds
– Used for visibility queries
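Why close-to-spherical regions help can be seen from a commonly used screen-space error estimate based on a bounding sphere. The function below is a sketch, not the formula used in the presented systems; all parameter names are hypothetical. It assumes the error is seen from the nearest possible point of the sphere, which is what makes the estimate conservative and cheap.

```python
import math


def projected_error_px(model_error: float, center_dist: float, radius: float,
                       fov_y_deg: float, viewport_h_px: int) -> float:
    """Estimate the on-screen size (pixels) of a model-space error.

    The error is assumed to sit at the point of the bounding sphere that is
    closest to the viewer, so the result overestimates the real projection.
    """
    nearest = center_dist - radius
    if nearest <= 0.0:
        # Viewer is inside the bounding sphere: no finite bound, always refine.
        return math.inf
    # Pixels covered by one model unit at unit distance from the viewer.
    px_per_unit = viewport_h_px / (2.0 * math.tan(math.radians(fov_y_deg) / 2.0))
    return model_error * px_per_unit / nearest


err_px = projected_error_px(1.0, 11.0, 1.0, 90.0, 1000)
```

A tight sphere keeps `radius` small relative to `center_dist`, so the estimate stays close to the true projected error; an elongated region would need a large sphere and would be refined far earlier than necessary.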
Chunked Multi Triangulations: GPU Friendly MT
• Construction
– Start with hires triangle soup
– Partition model using a hierarchical space partitioning scheme
– Construct non-leaf cells by bottom-up recombination and simplification of lower level cells
– Assign model space errors to cells
• Rendering
– Refine conformal hierarchy, render selected precomputed cells
– Project errors to screen
– Dual queue
[Diagram: on-line stage with cache, adaptive rendering, GPU]
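The rendering-time refinement can be sketched as a greedy, error-driven loop. This is a deliberately simplified stand-in: it uses a single priority queue on a tree, whereas the slides mention a dual-queue scheme on a DAG with dependencies. The `Cell` type, its fields, and the cell budget are assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    name: str
    screen_error: float  # projected error in pixels, assumed precomputed
    children: List["Cell"] = field(default_factory=list)


def refine(root: Cell, tolerance_px: float, max_cells: int) -> List[Cell]:
    """Split the cell with the largest error until tolerance or budget is met."""
    heap = [(-root.screen_error, 0, root)]  # max-heap via negated error
    counter = 1                             # tie-breaker, avoids comparing cells
    selected: List[Cell] = []
    while heap:
        neg_err, _, cell = heapq.heappop(heap)
        # Size of the cut if this cell were replaced by its children.
        size_after_split = len(heap) + len(selected) + len(cell.children)
        can_split = bool(cell.children) and size_after_split <= max_cells
        if -neg_err > tolerance_px and can_split:
            for child in cell.children:
                heapq.heappush(heap, (-child.screen_error, counter, child))
                counter += 1
        else:
            selected.append(cell)  # accurate enough, a leaf, or over budget
    return selected


tree = Cell("root", 8.0, [
    Cell("a", 3.0, [Cell("a1", 0.5), Cell("a2", 0.5)]),
    Cell("b", 0.5),
])
full = sorted(c.name for c in refine(tree, tolerance_px=1.0, max_cells=10))
capped = sorted(c.name for c in refine(tree, tolerance_px=1.0, max_cells=2))
```

With a generous budget the loop refines down to `a1`, `a2`, and `b`; with a budget of two cells it stops at `a` and `b`, trading accuracy for a bounded working set.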
Chunked Multi Triangulations: DAG problems
• Not all MTs are good MTs!
– The topology of dependencies may lower the adaptivity of the multiresolution structure
• Cascading dependencies are BAD!!!
– The geometry of DAG regions may cause problems in view-dependent rendering
• Compact regions
• Proposed solutions:
– SIGGRAPH 2004: Efficient constrained technique (TetraPuzzles)
– IEEE Viz 2005: General construction technique (V-Partition)
– … see also QVDR, IEEE Viz 2004 and other related work…
Adaptive TetraPuzzles
• Construction
– Start with hires triangle soup
– Partition model using a conformal hierarchy of tetrahedra
– Construct non-leaf cells by bottom-up recombination and simplification of lower level cells
• Rendering
– Refine conformal hierarchy, render selected precomputed cells
[Figure: view-dependent mesh refinement]
Adaptive TetraPuzzles: Results
Michelangelo’s St. Matthew
Source: Digital Michelangelo Project
Data: 374M triangles
Intel Xeon 2.4GHz 1GB
GeForce FX 5800U AGP8X
Advantages of mesh-based multiresolution models
• First GPU-bound methods for very large meshes
– Adaptive conforming meshes
• Reduced overdraw
– Extensive optimization
• Stripification, cache coherence, compression, …
– State-of-the-art performance
• GPU bound, >4Mtri/frame at >30 fps on modern GPUs
• Extremely high quality for large dense models with “well behaved” surface
Limitations of mesh-based multiresolution models
• Visibility and multiresolution solved as separate problems
– Error measured on boundary surfaces
– LOD construction based on local surface coarsening/simplification operations
– LOD construction unaware of visibility (view-independent approximations)
• Hard to apply to models with high detail, complex topology, and high depth complexity!
Overcoming limitations of local mesh refinement techniques
• Tight integration of visibility and LOD construction
– Multi-scale modeling of appearance rather than geometry
– Volume-based rather than surface-based
Far Voxels: Handling Huge Complex 3D models
• General purpose technique that targets many model kinds
• Underlying ideas
– Multi-scale modeling of appearance rather than geometry
– Volume-based rather than surface-based
– Tight integration of visibility and LOD construction
– GPU accelerated (programmability + batching)
Far Voxels: The Far Voxel Concept
• Assumption: opaque surfaces, non-participating medium
• Goal is to represent the appearance of complex far geometry
– Near geometry can be represented at full resolution
• Idea is to discretize a model into many small volumes located in the neighborhood of surfaces
– Approximates how a small subvolume of the model reflects the incoming light
=> View-dependent cubical voxel
Far Voxels: The Far Voxel Concept
• A far voxel returns color attenuation given
– View direction
– Light direction
• Rendered using a customized vertex shader executed on the GPU
Shader = f (view direction, light direction)
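To make "Shader = f(view direction, light direction)" concrete, here is a toy flat proxy written on the CPU. It is an assumption-laden illustration, not the shader of the presented system: the voxel is modeled as a single oriented Lambertian patch with a hypothetical `normal` and `albedo`, and both directions are unit vectors pointing away from the voxel.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def flat_proxy_shade(normal: Vec3, albedo: Vec3,
                     view_dir: Vec3, light_dir: Vec3) -> Vec3:
    """Return the voxel color for one view direction and one light direction."""
    if dot(normal, view_dir) <= 0.0:
        # The patch faces away from the viewer: it contributes nothing.
        return (0.0, 0.0, 0.0)
    # Lambertian attenuation of the stored color by the incoming light angle.
    attenuation = max(0.0, dot(normal, light_dir))
    return (attenuation * albedo[0], attenuation * albedo[1], attenuation * albedo[2])


front = flat_proxy_shade((0.0, 0.0, 1.0), (1.0, 0.5, 0.25), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
back = flat_proxy_shade((0.0, 0.0, 1.0), (1.0, 0.5, 0.25), (0.0, 0.0, -1.0), (0.0, 0.0, 1.0))
```

The same voxel yields different colors from the front and from behind, which is the sense in which the primitive is view-dependent.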
Far Voxels: Construction overview
Far Voxels: Construction overview: Inner nodes
• Sample a model subvolume to build a grid of far voxels
• Voxels are far
– Project to worst case θmax
– Viewed not closer than dmin
[Figure: section of the 3D grid of far voxels, annotated with dmin and θmax]
• Raycasting samples original model and identifies visible voxels
Far Voxels: Construction overview: Object Space Occlusion
• Environment occlusion
• Cull interior part of grid of far voxels
[Figure: section of the 3D grid of far voxels, annotated with dmin and θmax; some voxels marked with X]
• Culls 40% of the high depth complexity Boeing 777 model
– Worst case θmax = 0.5 deg (~10 pixel tolerance for 1024x1024 viewport using 50deg FOV)
• Minimize artifacts due to leaking of occluded parts of different colors
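The relation between the 0.5 deg angle and the ~10 pixel tolerance can be checked with a short calculation. The sketch assumes a standard perspective projection and measures the angle from the center of the screen; the function name is hypothetical.

```python
import math


def angle_to_pixels(theta_deg: float, fov_deg: float, viewport_px: int) -> float:
    """Pixels spanned by an angle theta measured from the screen center."""
    # Focal length in pixels for a viewport of the given size and field of view.
    focal_px = (viewport_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    return focal_px * math.tan(math.radians(theta_deg))


tolerance_px = angle_to_pixels(0.5, 50.0, 1024)
```

For a 1024-pixel viewport and a 50 deg field of view this gives a value close to 10 pixels, consistent with the figure quoted above.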
Far Voxels: Construction overview: Far Voxel
• Consider voxel subvolume
• Samples gathered from unoccluded directions
– Sample: (BRDF, n) = f(view direction)
• Compress shading information by fitting samples to a compact analytical representation
Far Voxels: Construction overview: Far Voxel Shaders
• Build all the K different far voxel representations
– K = flat, smooth, …
– Principal component analysis
• Evaluate each representation error
– Compare real values (samples) with the voxel approximations from the sample direction
• Choose approximation with lowest error
[Figure: flat proxy (2 components), smooth proxy (6 components), others…]
Err(k) = …
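The selection step (build every candidate, measure its error against the samples, keep the best) can be sketched as follows. Since the slide's Err(k) formula is not recoverable, the root-mean-square error used here is an assumption, and the one-dimensional "flat" and "smooth" models are toy stand-ins for the real proxies.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Sample = Tuple[float, float]  # (direction parameter, observed value), toy 1D stand-in


def rms_error(model: Callable[[float], float], samples: Sequence[Sample]) -> float:
    """Root-mean-square difference between a model and the gathered samples."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in samples) / len(samples))


def choose_representation(candidates: Dict[str, Callable[[float], float]],
                          samples: Sequence[Sample]) -> Tuple[str, Dict[str, float]]:
    """Evaluate every candidate and return the name of the lowest-error one."""
    errors = {name: rms_error(model, samples) for name, model in candidates.items()}
    best = min(errors, key=errors.get)
    return best, errors


samples = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
candidates = {"flat": lambda x: 1.0, "smooth": lambda x: x}
best, errors = choose_representation(candidates, samples)
```

Here the "smooth" stand-in reproduces the samples exactly and is selected; with samples that barely vary, the cheaper "flat" model would win instead.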
Far Voxels: Rendering
• Hierarchical traversal with coherent culling
– Stop when out-of-view, occluded (GPU feedback), or accurate enough
• Leaf node: Triangle rendering
– Draw the precomputed triangle strip
• Inner node: Voxel rendering
– For each far voxel type
• Enable its shader
• Draw all its view dependent primitives using glDrawArrays
– Splat voxels as antialiased point primitives
– Limits
• Does not consider primitive opacity
• Rendering quality similar to one-pass point splat methods (no sorting/blending)
[Figure labels: Triangles, Far Voxels]
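The traversal rules above can be sketched as a recursive walk that records what would be drawn. This is a simplified illustration: the `Node` type is hypothetical, the `visible` flag stands in for both view-frustum culling and GPU occlusion feedback, and draw calls are replaced by entries in a list.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    visible: bool          # stand-in for frustum test + occlusion query result
    screen_error: float    # projected error of the node's voxels, in pixels
    children: List["Node"] = field(default_factory=list)


def traverse(node: Node, tolerance_px: float, out: List[Tuple[str, str]]) -> None:
    """Collect (primitive kind, node name) pairs in traversal order."""
    if not node.visible:
        return                                  # out of view or occluded: stop
    if not node.children:
        out.append(("triangles", node.name))    # leaf: precomputed triangle strip
    elif node.screen_error <= tolerance_px:
        out.append(("voxels", node.name))       # inner node, accurate enough
    else:
        for child in node.children:
            traverse(child, tolerance_px, out)  # refine further


scene = Node("root", True, 5.0, [
    Node("far", True, 0.5, [Node("far_leaf", True, 0.1)]),
    Node("near_leaf", True, 2.0),
    Node("hidden", False, 9.0, [Node("x", True, 1.0)]),
])
drawn: List[Tuple[str, str]] = []
traverse(scene, 1.0, drawn)
```

Distant parts of the model end up as voxel batches, nearby leaves as triangles, and occluded subtrees cost nothing beyond the visibility test.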
Far Voxels: Results
• Tested on extremely complex heterogeneous surface models
– St.Matthew, Boeing 777, Richtmyer Meshkov isosurf., all at once
• Tested in a number of situations
– Single processor / cluster construction
– Workstation viewing, large scale display
[Figure captions, in slide order: 373M triangles, 14.5 GB; 350M triangles, 13.7 GB; 472M triangles, 18.4 GB; 1.2G triangles, 46.6 GB]
Far Voxels: Results
• 1-16 Athlon 2200+ CPU, 3 x 70GB ATA 133 Disk (IDE+NFS)
• 1-20K triangles/sec
– Scales well, limited by slow disk I/O for large meshes
– Slow!! (but similar to recent adaptive tessellation methods)
• Avg. triangles per leaf 5K
• Avg. voxels per inner node 2.5K
[Figure captions, in slide order: 5h18m (16 CPU), 10.6 GB; 6h51m (16 CPU), 14.9 GB; 8h06m (16 CPU), 16.1 GB; 41.6 GB]
Far Voxels: Results
• Xeon 2.4GHz, 70GB SCSI 320 Disk, GeForce 6800GT AGP 8x
• Window size: from video resolution to stereo projector display
– St.Matthew, Boeing, Isosurface: 640 x 480
– All at once: 640 x 480 and Stereo 2 x 1024 x 768
• Pixel tolerance: [Target 1 | Actual ~0.9 | Max ~10]
• Resident set size limited to ~200 MB
[Figure captions, in slide order: 45 fps, 51 MPrim/s; 44 fps, 42 MPrim/s; 34 fps, 41 MPrim/s; 2 x 1024 x 768: 20 fps, 40 MPrim/s; 640 x 480: 20 fps, 42 MPrim/s]
Far Voxels: Conclusions
• General purpose technique that targets many model kinds
– Seamless integration of
• multiresolution
• occlusion culling
• out-of-core data management
– High performance
– Scalability
• Main limitations
– Slow preprocessing
– Non-photorealistic rendering quality
Intel Xeon 2.4GHz 1GB, GeForce 6800GT AGP8X
[Diagram: contributions overview, with the volumetric entry labelled "MOVR – COVRA – Volumetric models"]
www.crs4.it/vic/
Recent Advances in Massive Volume Visualization
Introduction: Goal
• Visualization of massive scalar volumes without size limitations
– A single-pass raycasting technique working out-of-core on GPU parallel architectures
• Compress data to facilitate data streaming and 4D visualizations
– Novel compression architecture and novel compression methods
Introduction: Teaser
[Figure: compression-domain adaptive volume rendering based on sparse representation of voxel blocks. NVIDIA GTX 560]
MOVR: A single-pass raycasting technique working out-of-core on GPU parallel architectures
The Visual Computer 2008 & 2010
Massive Volumes Visualization: Volume rendering problem
[Diagram labels: pixel, accumulation, order dependent, order independent, early ray termination, empty space skipping]
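The terms in this diagram can be illustrated with standard front-to-back compositing along one ray. This is a generic textbook-style sketch on scalar intensities, not the MOVR implementation; the sample format and the opacity cutoff are assumptions. Real renderers skip whole empty bricks rather than testing samples one by one.

```python
from typing import Iterable, Tuple


def composite_ray(samples: Iterable[Tuple[float, float]],
                  opacity_cutoff: float = 0.99) -> Tuple[float, float, int]:
    """Accumulate (intensity, alpha) samples ordered front to back.

    Returns the accumulated intensity, the accumulated opacity, and the
    number of samples that actually contributed.
    """
    color = 0.0
    alpha = 0.0
    used = 0
    for intensity, a in samples:
        if a == 0.0:
            continue  # empty space: nothing to accumulate
        # Order-dependent accumulation: later samples are attenuated by
        # the opacity gathered so far.
        color += (1.0 - alpha) * a * intensity
        alpha += (1.0 - alpha) * a
        used += 1
        if alpha >= opacity_cutoff:
            break  # early ray termination: the rest of the ray is hidden
    return color, alpha, used


result = composite_ray([(1.0, 0.0), (0.5, 0.5), (1.0, 1.0), (0.2, 0.9)])
```

In this toy ray the first sample is empty, the third is fully opaque, and the fourth is never visited, so only two samples contribute to the pixel.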
Massive Volumes Visualization: Volume rendering problem
• Current interactive solutions are based on GPU architectures
– Massive parallelism
– Huge memory bandwidth
• E.g. GeForce GTX 580
– Has 192.4 GB/s of memory bandwidth
– Has 1581.1 GFLOPS
[ hardwareinsight.com ]
Massive Volumes Visualization: Related work. Moderately sized volumes
• Current high quality solutions based on GPUs implementing …
– Slice-based methods
– Ray casting techniques
• The full volume must fit in GPU memory
[ Li et al, 2003 ]
[ Krüger et al., 2003 ]
Massive Volumes Visualization: Contribution to the state-of-the-art
• Multiresolution out-of-core volume renderer
– Preprocessing
• Build multiresolution octree of volume bricks
– Rendering
• Adaptive CPU loading of the data from local/remote repository cooperates with separate render thread fully executed in the GPU
• Stackless traversal of an adaptive working set
• Exploitation of the visibility feedback
E. Gobbetti, F. Marton, and J. A. Iglesias Guitián. A single-pass GPU ray casting framework for interactive out-of-core rendering of massive volumetric datasets. The Visual Computer, 24, 2008.
J. A. Iglesias Guitián, E. Gobbetti, and F. Marton. View-dependent exploration of massive volumetric models on large-scale light field displays. The Visual Computer, 26, 2010.
Massive Volumes Visualization
Contribution to the state-of-the-art
• Use CPU for …
– Creation & loading
– Octree refinement
– Encoding the current cut using a spatial index
• Use GPU for …
– Stackless octree traversal
• Using neighbour pointers
– Rendering
• Flexible ray traversal / compositing strategies
• Improved visibility feedback
[Figures: Architecture overview; Neighbour pointer navigation]
Massive Volumes Visualization
Method overview
[Figure: pipeline diagram. Offline: preprocessing turns the volume storage into an octree node database. Creation and maintenance (CPU): adaptive loader, octree refinement. Rendering (GPU): volume render, returning visibility feedback. Decision: has the current working set enough accuracy? Yes: prepare to render. No: continue octree refinement.]
Massive Volumes Visualization
Visibility feedback
• Working set reduction
– Opaque: 1731 -> 1035 bricks
– Transparent: 1984 -> 1789 bricks
• Rendered at a window size of 1024x576
Massive Volumes Visualization
Results (2/2)
Interactive exploration of a 16-bit 2 GB CT volume on a consumer NVIDIA 8800 GTS graphics board with 640 MB (2008)
Compression-Domain Volume Rendering
• 60 time steps of the 432^3 supernova dataset
Volume Compression
Introduction
• Limited bandwidth and memory =>
– LOD (MOVR)
– Compression
• Compression is fully exploited if data is maintained in compressed form through the entire pipeline
– Compression-domain volume renderers + deferred filtering
• Highly asymmetric encoding/decoding schemes
– We can afford slow offline compression and precomputation
– Fast real-time data decoding, interpolation and shading
– Spatially independent random access to data
State-of-the-art
• CPU decompression
– Does not limit bandwidth or memory use
• [Ning & Hesselink, 92] and many others...
• [Gobbetti et al. 08, Iglesias et al. 10]
• Hardware based
– E.g. S3TC [Brown], NVIDIA VTC [Craighead]
– Full random access
– Limited compression
• GPU decompression
– Full working set GPU decompression
• Tensor Approximation [Suter et al. 2010]
• Does not limit memory use
• Limits bandwidth use
– Partial working set
• Limits both memory and bandwidth use
Tensor Approximation (CRS4 & UZH 2010)
• Multiresolution
• Brick based
• Extract dominant data features
• Real-time GPU reconstruction
– Full working set
• Bandwidth optimization
• Memory consumption
S. Suter, J. A. Iglesias Guitián, F. Marton, M. Agus, A. Elsener, C. Zollikofer, M. Gopi, E. Gobbetti, and R. Pajarola. Interactive Multiscale Tensor Reconstruction for Multiresolution Volume Visualization. IEEE Transactions on Visualization and Computer Graphics, vol. 17, pp. 2135–2143, 2011.
Volume Compression
Contribution to the state-of-the-art
• COVRA: Compression-domain Output-sensitive Volume Rendering Architecture
– Novel architecture with parameterized cache behaviour
– Supports and extends state-of-the-art compression methods
• ☺ Efficient multisampling (HQ shading)
• ☺ No perspective limitations
• ☺ Fully adaptive multiresolution approach
• ☺ Multipass working set decompression
• ☺ High compression ratios and signal quality
J. A. Iglesias Guitián, F. Marton, and E. Gobbetti. COVRA: a Compression-Domain Output-Sensitive Volume Rendering Architecture based on sparse representation of voxel blocks. In: Proceedings of EuroVis 2012.
Volume Compression
COVRA: Overview
• Main concepts:
– Preprocessor builds a multiresolution octree of compressed nodes
– Data travels in compressed format until the last stage
– Fully adaptive rendering
– Highly integrated decompression / rendering supporting high quality filtering and shading
Run-time
COVRA: Subtree management
• Three rendering steps:
1. CPU multiresolution octree adaptive refinement
2. Partitioning of the octree into a set of subtrees
• Use GPU decompressed cache size as constraint
• Front-to-back order decided in real time during the octree traversal
3. Subtree decompression, raycasting and compositing
• Decompress to temporary buffer or available GPU cache
• Raycast decompressed octree nodes
• Compose with previous results
[Figure label: Framebuffer]
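Step 2 above can be sketched as a greedy partition of the working set. This is a simplified illustration, not the COVRA implementation: it assumes the nodes have already been sorted front to back, that each node reports its decompressed size, and it groups them into flat batches instead of real octree subtrees. The function name and the (node_id, size) layout are hypothetical.

```python
def partition_front_to_back(nodes, cache_capacity):
    """Split front-to-back ordered nodes into batches that fit the cache.

    nodes: list of (node_id, decompressed_size), nearest node first.
    cache_capacity: size of the GPU decompressed cache, same unit as the sizes.
    Returns a list of batches, each a list of node ids, still front to back.
    """
    batches = []
    current = []
    used = 0
    for node_id, size in nodes:
        if size > cache_capacity:
            raise ValueError("a single node does not fit in the cache")
        # Close the current batch when the next node would overflow the cache.
        if current and used + size > cache_capacity:
            batches.append(current)
            current = []
            used = 0
        current.append(node_id)
        used += size
    if current:
        batches.append(current)
    return batches


passes = partition_front_to_back([("a", 4), ("b", 4), ("c", 4), ("d", 2)], 8)
```

Each batch would then be decompressed, raycast, and composited over the result of the previous passes, which is why the batches must preserve the front-to-back order.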
Volume Compression
Sparse coding of volume blocks
• Each multiresolution octree node is decomposed into blocks.
• Each block, made of few^3 voxels, is compressed
[Figure: single octree node containing overlapping information; compressed block]
• Each block is represented by a sparse linear combination of few dictionary elements
– Data-specific representation
– Compression is achieved by storing indices and magnitudes
Volume Compression
Sparse coding of volume blocks
• Generalization of vector quantization
– Combine vectors instead of choosing single ones
– Overcomes limitations due to dictionary sizes
• Generalization of data-specific bases
– Dictionary is an overcomplete basis
– Sparse projection
• Encoding in two steps
– Training: find a data-specific dictionary
– Sparse coding: find the best representation of each block using a linear combination of dictionary elements under a sparsity constraint
• We employ ORMP via Cholesky decomposition
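The encode/decode asymmetry can be illustrated with plain matching pursuit, a simpler greedy relative of the ORMP solver named above (it does not re-fit the coefficients at each step and uses no Cholesky factorization). The sketch assumes blocks flattened to lists of floats and unit-norm dictionary atoms; all function names are illustrative.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(block, dictionary, max_iters):
    """Greedy sparse coding; returns {atom_index: magnitude}.

    dictionary: list of unit-norm atoms, each as long as the block.
    At most max_iters atoms end up with a non-zero magnitude.
    """
    residual = list(block)
    code = {}
    for _ in range(max_iters):
        scores = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < 1e-12:
            break  # nothing left to explain
        code[best] = code.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return code


def decode(code, dictionary, block_len):
    """Rebuild a block from the stored indices and magnitudes."""
    out = [0.0] * block_len
    for k, magnitude in code.items():
        for i, a in enumerate(dictionary[k]):
            out[i] += magnitude * a
    return out


atoms = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
code = matching_pursuit([2.0, 2.0, 2.0, 2.0], atoms, max_iters=2)
restored = decode(code, atoms, 4)
```

Decoding is only a handful of multiply-adds per voxel, which is what makes real-time decompression on the GPU plausible, while the search over atoms is paid once, offline.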
Volume Compression
Finding an optimal dictionary
• We employ the K-SVD algorithm for dictionary training
– Algorithm for designing overcomplete dictionaries for sparse representations [Aharon et al. 06]
• But running K-SVD calculations directly on massive volumes would be infeasible, therefore …
– … we applied the concept of coreset [Agarwal et al. 05] to smartly subsample and reweight the original training set [Feldman & Langberg 11, Feigin et al. 11]
Volume Compression
Dictionary learning (K-SVD)
• K-SVD can be seen as a K-Means generalization
• Basic steps:
– Sparse coding of signals in X, producing Γ
– Update dictionary atoms given the sparse representations
• Optimize one atom at a time, keeping the rest fixed
• The size of the residual matrix E is proportional to the number of training signals
– As in [Rubinstein et al. 08] we replace the SVD computation with a simpler numerical approximation
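The atom update can be sketched as follows. The slide does not say which approximation replaces the SVD; this sketch assumes the single alternating step commonly used in approximate K-SVD (new atom d = E g / ||E g||, then new coefficients g = E^T d), where E is the residual of the signals that use the atom, computed without that atom's contribution. Signals and atoms are plain lists, codes are {atom_index: magnitude} dictionaries, and all names are illustrative.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def update_atom(signals, dictionary, codes, k):
    """Approximate K-SVD update of atom k, in place.

    signals: list of training vectors; codes[j]: sparse code of signals[j].
    Only signals whose code uses atom k take part in the update.
    """
    users = [j for j, code in enumerate(codes) if k in code]
    if not users:
        return
    dim = len(dictionary[k])
    # E: residual of each user signal with atom k's contribution left out.
    residuals = []
    for j in users:
        r = list(signals[j])
        for atom_index, magnitude in codes[j].items():
            if atom_index != k:
                for i in range(dim):
                    r[i] -= magnitude * dictionary[atom_index][i]
        residuals.append(r)
    g = [codes[j][k] for j in users]
    # d = E g, then normalize to unit length.
    d = [sum(g[t] * residuals[t][i] for t in range(len(users)))
         for i in range(dim)]
    norm = math.sqrt(dot(d, d))
    if norm == 0.0:
        return
    d = [x / norm for x in d]
    dictionary[k] = d
    # g = E^T d: refreshed magnitudes for the signals using atom k.
    for t, j in enumerate(users):
        codes[j][k] = dot(residuals[t], d)


signals = [[3.0, 4.0], [6.0, 8.0]]
dictionary = [[1.0, 0.0]]
codes = [{0: 3.0}, {0: 6.0}]
update_atom(signals, dictionary, codes, 0)
```

Both example signals are multiples of (3, 4), so one update turns the atom into the unit vector (0.6, 0.8) and the magnitudes into 5 and 10.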
Volume Compression
Coreset construction
• Calculations on massive input volumes are still infeasible, but we can …
– … reduce the amount of data used for training
– … use importance sampling
• We associate an importance with each of the original blocks: the standard deviation of its entries
– Pick C elements with probability proportional to their importance
– More important blocks should end up in our coreset
Volume Compression
Coreset construction
• Non-uniform sampling introduces a severe bias
– Scale each selected block by a weight that depends on its associated picking probability
– Applying K-SVD to the scaled coefficients will converge to a dictionary associated with the original problem
• Coreset scalability
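The two coreset slides can be combined into a short sketch. The weight formula did not survive in the slide text, so the sketch assumes the usual importance-sampling correction for a squared-error objective, w = 1/sqrt(p), with p the picking probability of the block. It also assumes sampling with replacement; function and parameter names are illustrative.

```python
import math
import random
import statistics


def build_coreset(blocks, coreset_size, rng):
    """Importance-sample blocks and reweight them to compensate the bias.

    blocks: list of flattened blocks (lists of numbers).
    Returns (coreset, picked): the scaled blocks and their source indices.
    """
    # Importance of a block: standard deviation of its entries.
    importance = [statistics.pstdev(block) for block in blocks]
    total = sum(importance)
    if total == 0.0:
        raise ValueError("all blocks are constant; nothing to sample")
    probs = [s / total for s in importance]
    picked = rng.choices(range(len(blocks)), weights=probs, k=coreset_size)
    coreset = []
    for j in picked:
        weight = 1.0 / math.sqrt(probs[j])  # assumed form of the weight
        coreset.append([weight * v for v in blocks[j]])
    return coreset, picked


blocks = [[1, 1, 1, 1], [0, 2, 0, 2], [0, 4, 0, 4]]
coreset, picked = build_coreset(blocks, 5, random.Random(0))
```

The constant block has zero importance and is never selected, while the high-variance block is picked twice as often as the medium one and scaled down accordingly.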
Volume Compression
COVRA: Results
• PSNR vs. Bits Per Sample
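For reference, the two axes of such a plot are conventionally defined as below. The slide does not spell the definitions out, so the notation is an assumption: $v_i$ are the $n$ original voxel values, $\hat{v}_i$ the decompressed ones, and $v_{\max}$ the peak value of the data range.

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(v_i - \hat{v}_i\right)^2,
\qquad
\mathrm{PSNR} = 10\,\log_{10}\!\left(\frac{v_{\max}^{2}}{\mathrm{MSE}}\right)\ \text{dB},
\qquad
\text{bits per sample} = \frac{\text{compressed size in bits}}{n}
```

Higher PSNR at the same number of bits per sample means better signal quality for the same storage and bandwidth cost.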
Volume Compression
COVRA: Results
• Comparison against state-of-the-art GPU-based decompression methods
Volume Compression
COVRA: Results
Volume Compression
COVRA: Results
• Gradient mapped to RGB color
Volume Compression
COVRA: Video
Compression-domain adaptive volume rendering based on sparse representation of voxel blocks. NVIDIA GTX 560. (2012)
Summary and Conclusions
Summary
• Improved the scalability of state-of-the-art volume rendering techniques
– MOVR: a novel single-pass GPU ray casting framework supporting flexible ray traversal and incorporating visibility feedback for interactive exploration of large volumes without size limitations
• Improved compression and streaming of large and time-varying volumes
– COVRA: a novel compression-domain architecture supporting state-of-the-art compression methods, random access to compressed data and HQ shading
– A novel compression method for massive volumes based on sparse coding (K-SVD) and coreset training sets
Our contributions
GPU-friendly output-sensitive techniques
• *-BDAM – Local and global terrain models. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). EG 2003, IEEE Viz 2003, EG 2005
• Adaptive Tetrapuzzles – Dense meshes. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). SIGGRAPH 2004
• Chunked Multi-Triangulations. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Ponchio/Scopigno (CNR). IEEE Viz 2005
• Layered Point Clouds – Dense clouds. Gobbetti/Marton (CRS4). SPBG 2004 / Computers & Graphics 2004
• Far Voxels – General. Gobbetti/Marton (CRS4). SIGGRAPH 2005
• MOVR – COVRA – Volumetric models. Gobbetti/Marton/Iglesias Guitián (CRS4). CGI 2008
• Blockmaps – Hybrid volumetric city model. Gobbetti/Marton (CRS4), Cignoni/Ganovelli/Di Benedetto/Scopigno (CNR). EG 2007
• View-dep. volumetric model. In progress
[Diagram labels: Specialize / Generalize]
A real-time data filtering problem!
• Models of unbounded complexity on limited computers
– Need for output-sensitive techniques (O(N), not O(K))
• We assume less data on screen (N) than in model (K → ∞)
– Need for memory-efficient techniques (maximize cache hits!)
– Need for parallel techniques (maximize CPU/GPU core usage)
[Figure: Storage, O(K=unbounded) bytes (triangles, points, …) → I/O over limited bandwidth (network/disk/RAM/CPU/PCIe/GPU/…) → small working set → Projection + Visibility + Shading, driven by view parameters → Screen, O(N=1M-100M) pixels at 10-100 Hz]
THANK YOU!
Questions and Answers
Next Session
Technologies for improving real-time immersive exploration of massive (volumetric) models.
Presented by Marco Agus