Accelerating Marching Cubes with Graphics Hardware
Gunnar Johansson, Linköping University
Hamish Carr, University College Dublin
Presentation outline
• Goal
• Background
• Previous work
• Our approach
• Results
• Conclusions, Future work
Goal
• Isosurface visualization for studying 3D scalar functions
  – Marching Cubes is the standard algorithm
• This work presents GPU acceleration in combination with CPU-based algorithmic acceleration (interval/Kd-trees)
Isosurface visualization
• Goal: study a volumetric scalar function, f(x)
• An isosurface is the set of points where f takes a given isovalue (h)
{ x : f(x) = h }
Illustration by Stefan Roettger, University of Erlangen
Marching Cubes
• Each corner of a cube is classified as above (black) or below (white) a given isovalue
• Vertices of the surface are linearly interpolated along the edges
• Normals are computed using central differences and interpolated along the edges
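The three per-cell steps above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the bit ordering of the case index, and the "strictly greater than" test for "above" are assumptions made for the example.

```python
import numpy as np

def classify_cell(corner_values, isovalue):
    """Build an 8-bit case index: bit i is set when corner i is above the isovalue.
    (Bit order and the strict '>' test are assumptions of this sketch.)"""
    case = 0
    for i, value in enumerate(corner_values):
        if value > isovalue:
            case |= 1 << i
    return case

def interpolate_edge(p0, p1, v0, v1, isovalue):
    """Linearly interpolate the surface vertex along the edge p0-p1.
    Assumes v0 != v1, which holds when the edge is crossed by the isosurface."""
    t = (isovalue - v0) / (v1 - v0)
    return p0 + t * (p1 - p0)

def central_difference(volume, x, y, z):
    """Approximate the gradient at an interior grid point (unit grid spacing)."""
    return 0.5 * np.array([
        volume[x + 1, y, z] - volume[x - 1, y, z],
        volume[x, y + 1, z] - volume[x, y - 1, z],
        volume[x, y, z + 1] - volume[x, y, z - 1],
    ])
```

The gradient at each edge endpoint would then be interpolated with the same parameter t used for the vertex position.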
Previous work: Algorithmic acceleration
• The original Marching Cubes visits all cells in the dataset
  – O(N) time complexity, N = number of cells
• However, an isosurface is expected to intersect only a fraction of the cells
• Efficient search structures can be used to store the minimum and maximum value of each cell
  – Kd-tree: O(√N + k), k = size of isosurface
  – Interval tree: O(log N + k)
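To make the min/max idea concrete, here is a small sketch of a centered interval tree over per-cell value ranges. It is an illustration under assumptions: the class and function names are hypothetical, the construction is the simple textbook variant (not tuned for build time), and it is not the structure used in the cited work.

```python
import numpy as np

def cell_intervals(volume):
    """Return (min, max, cell_index) for every cell of a 3D grid."""
    nx, ny, nz = volume.shape
    out = []
    for x in range(nx - 1):
        for y in range(ny - 1):
            for z in range(nz - 1):
                block = volume[x:x + 2, y:y + 2, z:z + 2]  # the 8 corners
                out.append((float(block.min()), float(block.max()), (x, y, z)))
    return out

class IntervalTree:
    """Centered interval tree answering: which intervals contain h?"""

    def __init__(self, intervals):
        self.center, self.left, self.right = None, None, None
        self.by_lo, self.by_hi = [], []
        if not intervals:
            return
        endpoints = sorted(e for lo, hi, _ in intervals for e in (lo, hi))
        self.center = endpoints[len(endpoints) // 2]
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]
        mid = [iv for iv in intervals if iv[0] <= self.center <= iv[1]]
        # Intervals straddling the center, sorted both ways for early exit.
        self.by_lo = sorted(mid, key=lambda iv: iv[0])
        self.by_hi = sorted(mid, key=lambda iv: iv[1], reverse=True)
        self.left = IntervalTree(left) if left else None
        self.right = IntervalTree(right) if right else None

    def query(self, h):
        """Collect the payloads of all intervals with lo <= h <= hi."""
        out, node = [], self
        while node is not None and node.center is not None:
            if h < node.center:
                for lo, _, cell in node.by_lo:
                    if lo > h:
                        break
                    out.append(cell)
                node = node.left
            elif h > node.center:
                for _, hi, cell in node.by_hi:
                    if hi < h:
                        break
                    out.append(cell)
                node = node.right
            else:
                out.extend(cell for _, _, cell in node.by_lo)
                break
        return out
```

A query such as `IntervalTree(cell_intervals(volume)).query(h)` returns only the cells whose value range contains h, so the extraction step never touches the remaining cells.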
Previous work: GPU acceleration
• Restricted to tetrahedral cells
  – Marching tetrahedra
    • Pascucci, 04
    • Klein et al., 04
    • Reck et al., 04
• Cannot create/delete vertices on GPU
  – “Worst-case” strategy
  – Always feeds 4 vertices (a quad) to the GPU
Previous work: GPU acceleration
• CPU tasks
  – Selects cell and sends data to GPU
• GPU tasks
  – Classifies cell
  – Interpolates surface vertices
  – Computes normals (per face)
• Bottleneck?
  – Data transfer between CPU and GPU
Previous work: GPU acceleration
• Parallel to our work
  – Goetz et al., “Real-time marching cubes on the vertex shader”, 05
    • Classifies cell on both CPU and GPU
    • Does not apply interval/Kd-trees
    • Only computes face normals
Our approach
• Marching cubes on GPU: Basic challenges
  – Cannot create vertices on GPU
  – Too costly to send all possible triangulations (“worst-case” strategy)
Our approach
• “Caching cell topology”
  – Store each case triangulation on the GPU using display lists
  – Classify cell on CPU and invoke corresponding display list
  – Minimize CPU – GPU bandwidth bottleneck by storing dataset on GPU
Our approach
• Display list stores corner indices
• Use indices for texture lookup
• Use values from texture to interpolate vertices and normals
[Figure: cube with corners numbered 0 to 7; the surface vertices are labeled with the corner index pairs (0,1), (0,3), (0,4)]
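The role of the stored corner indices can be mimicked on the CPU. The sketch below is only an analogy for the display list plus vertex program: the corner numbering in `CORNER_OFFSETS`, the table `CASE_EDGE_PAIRS` (filled in for a single case, the one in the figure), and the function name are assumptions for illustration.

```python
import numpy as np

# Assumed corner numbering; chosen so that (0,1), (0,3), (0,4) are cube edges.
CORNER_OFFSETS = np.array([
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
])

# Stand-in for the cached per-case triangulations. Only the case
# "corner 0 above the isovalue" is listed; a full table has 256 entries.
CASE_EDGE_PAIRS = {
    0b00000001: [(0, 1), (0, 3), (0, 4)],
}

def emit_case(case, cell_origin, volume, isovalue):
    """Turn the stored corner index pairs into interpolated vertex positions."""
    vertices = []
    for a, b in CASE_EDGE_PAIRS[case]:
        pa = cell_origin + CORNER_OFFSETS[a]
        pb = cell_origin + CORNER_OFFSETS[b]
        va = volume[tuple(pa)]  # plays the role of the texture lookup
        vb = volume[tuple(pb)]
        t = (isovalue - va) / (vb - va)
        vertices.append(pa + t * (pb - pa))
    return np.array(vertices)
```

The point of the design is that the cached geometry holds only index pairs, so the same cached case can be reused for every cell and every isovalue.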
Our approach
• Accelerate case classification by “pre-computing cell topology”
• Pre-compute possible cases for each cell
• Store all intervals with corresponding case in interval or Kd-tree
Our approach
• “Case interval/Kd-tree”
  – Shifts case classification to pre-computation
  – Storage requirements increase for noisy datasets (as much as 7 times)
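One way to read the pre-computation: the case of a cell can only change when the isovalue passes one of its corner values, so the sorted distinct corner values split the cell's range into intervals with a constant case. A sketch under that reading follows; the function name, the half-open interval convention, and the bit ordering are assumptions.

```python
def case_intervals(corner_values):
    """Return (lo, hi, case) triples: for any isovalue h with lo <= h < hi,
    the cell has the given 8-bit case (bit i set when corner i is above h)."""
    levels = sorted(set(corner_values))
    out = []
    for lo, hi in zip(levels, levels[1:]):
        case = 0
        for i, value in enumerate(corner_values):
            if value > lo:  # above every isovalue in [lo, hi)
                case |= 1 << i
        out.append((lo, hi, case))
    return out
```

Eight distinct corner values yield seven intervals, which is consistent with the storage increase of up to 7 times quoted above; a cell in a smooth region with repeated corner values yields fewer.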
Results
• First approach
  – Store dataset packed in 2D 1-channel float texture
  – Central differencing on GPU for normals
  – Results disappointing
    • 1.2-1.6 times speedup for Marching Cubes without accelerating structures
    • Speedup even decreases when using accelerating structures: GPU bottleneck
Results
• Vertex texture support is currently poor
  – Only 2D 1- or 4-channel float textures
  – High latency
• Central differencing
  – 12 texture lookups per vertex normal
    • 14 lookups in total for each vertex
Results
• Second approach
  – Pre-compute normals and store dataset and normals packed in 2D 4-channel float texture
  – Only need 2 lookups for each vertex
  – Results improved
    • Speedup of 3-4 times compared with CPU counterpart
• 128x128x128 “Hydrogen atom” dataset
  – Interval tree + CPU: 27 fps
  – Interval tree + GPU: 112 fps (4 times speedup)
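The second approach can be illustrated with a CPU-side sketch of the packing. The slides do not specify the texture layout, so the slice tiling, the channel order (gradient in the first three channels, scalar value in the fourth), and all function names below are assumptions; `np.gradient` stands in for the pre-computed central differences.

```python
import math
import numpy as np

def pack_volume(volume, tiles_per_row):
    """Tile the z-slices of volume[z, y, x] into one 2D 4-channel float array."""
    nz, ny, nx = volume.shape
    gz, gy, gx = np.gradient(volume.astype(np.float32))
    rows = math.ceil(nz / tiles_per_row)
    texture = np.zeros((rows * ny, tiles_per_row * nx, 4), dtype=np.float32)
    for z in range(nz):
        ty, tx = divmod(z, tiles_per_row)
        tile = texture[ty * ny:(ty + 1) * ny, tx * nx:(tx + 1) * nx]  # view
        tile[..., 0] = gx[z]
        tile[..., 1] = gy[z]
        tile[..., 2] = gz[z]
        tile[..., 3] = volume[z]
    return texture

def fetch_texel(texture, x, y, z, shape, tiles_per_row):
    """One lookup returns the gradient and the scalar value at a grid point."""
    _, ny, nx = shape
    ty, tx = divmod(z, tiles_per_row)
    return texture[ty * ny + y, tx * nx + x]

def edge_vertex(texture, p0, p1, isovalue, shape, tiles_per_row):
    """Interpolate position and normal along an edge using two lookups."""
    t0 = fetch_texel(texture, p0[0], p0[1], p0[2], shape, tiles_per_row)
    t1 = fetch_texel(texture, p1[0], p1[1], p1[2], shape, tiles_per_row)
    t = (isovalue - t0[3]) / (t1[3] - t0[3])
    position = np.asarray(p0, dtype=float) + t * (np.asarray(p1) - np.asarray(p0))
    normal = t0[:3] + t * (t1[:3] - t0[:3])
    return position, normal
```

Compared with the first approach, each edge endpoint needs one fetch instead of a value fetch plus six neighbor fetches, which is where the drop from 14 lookups to 2 per vertex comes from.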
Conclusions
• Accelerating isosurface extraction using GPU
  – Cache all possible cell triangulations (cases)
  – Use CPU for classification
  – Use GPU for interpolation
  – Optimize CPU classification by pre-computing all possible cases (case interval/Kd-tree)
Conclusions
• Applicable to any interpolant (in this work described using Marching Cubes)
• Current hardware imposes restrictions
  – Float textures, high latency for vertex texture lookup
Future work
• Move computation to fragment processor
  – More powerful than vertex processor
  – Better, more efficient texture support
  – Ability to download (to CPU) the extracted surface
• Optimize memory usage (texture/system)
• Apply to higher-order interpolants