University of Stuttgart - Visualization Research Center (VISUS) 19 April 2023
Granular Visibility Queries on the GPU
Thomas Engelhardt & Carsten Dachsbacher
Visualization Research Center
University of Stuttgart
Motivation: Culling
Remove rendering workload from the pipeline
Prevent draw calls from execution
- Frustum Culling
- Hardware Occlusion Queries (HOQ)
- Occlusion Predicates
Prevent shaders from execution
- Backface Culling
- Early-Z
Motivation: Culling
Control of shader execution based on visibility
- Geometry Shader
- Pixel Shader when early-z is disabled
Visibility not only per object / draw call but per
- Primitive / primitive cluster
- Screen space region
Evaluate and use visibility on GPU, no application feedback
Image Space Visibility
How to determine image space visibility?
Take some objects
Rasterize
Count pixels that passed the depth test
But how to count?
[Figure: example scene with three regions labeled 8, 6, and 11]
Contribution
Two output-sensitive pixel counting methods for from-point visibility
- Pixel Counting Summed Area Tables (PiC-SAT)
- Hierarchical Item Buffer (HIB)
Can also be done with HOQs. Why not use them?
- Granularity limitation & synchronization
Application to
- Culling of individual instances
- Control of GS and PS execution for per-pixel displacement mapping
Pixel Counting using Summed Area Tables
Pixel Counting using SATs
SAT stores sum of pixel values
Pixel sum of any rectangular region with just 4 lookups
- Screen space bounding box
$$SAT(x, y) = \sum_{i=0}^{x-1} \sum_{j=0}^{y-1} C_{Obj}(i, j)$$
S = 1 + 17 - 1 - 6 = 11
[Figure: per-pixel coverage values (0/1) of the rendered objects and the resulting summed area table, with the four lookups of the example query highlighted]
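The four-lookup query can be illustrated with a small CPU-side sketch in Python. This is not the GPU implementation from the talk: the names build_sat and count_pixels, the half-open rectangle convention, and the row-major coverage list are assumptions made for illustration. The table uses the exclusive form of the SAT definition (sums over i < x and j < y), so it carries one extra row and column of zeros.

```python
def build_sat(coverage):
    """Build an exclusive summed area table.

    coverage[y][x] is 1 where a pixel passed the depth test, else 0.
    The result satisfies sat[y][x] == sum of coverage[j][i] for i < x, j < y.
    """
    height = len(coverage)
    width = len(coverage[0]) if height else 0
    sat = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += coverage[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def count_pixels(sat, x0, y0, x1, y1):
    """Count covered pixels in the half-open rectangle [x0, x1) x [y0, y1).

    Uses exactly four table lookups: two added, two subtracted.
    """
    return sat[y1][x1] + sat[y0][x0] - sat[y0][x1] - sat[y1][x0]


coverage = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
]
sat = build_sat(coverage)
total = count_pixels(sat, 0, 0, 3, 3)
lower_right = count_pixels(sat, 1, 1, 3, 3)
```

The query cost is independent of the rectangle size, which is why a screen space bounding box per object is enough to read back its pixel count.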
Pixel Counting using SATs
Crucial: Query regions must not overlap!
Cannot tell which object the pixels in the overlap belong to
Conflict Objects
Conflict objects: objects whose bounding rectangles overlap
How to resolve conflict?
- Distribute objects among color channels without overlap
- 4 parallel SATs per RGBA texture
What is the distribution strategy?
Graph Coloring Algorithms
Assign colors to vertices in a graph
- Vertices connected by an edge must not share the same color
Difficult problem
- Requires heuristic approaches like Chaitin's algorithm
- What if more edges than colors available?
[Figure: two example graphs, labeled "correct coloring" and "false coloring"]
Object Distribution by Graph Coloring
Construct a conflict graph
- Each object's bounding rectangle is one vertex
- Each overlap is one edge
How to color the graph?
[Figure: graph construction; overlapping bounding rectangles are marked "OVERLAP"]
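Conflict graph construction can be sketched as a pairwise overlap test on the CPU. The rectangle layout (x0, y0, x1, y1), the half-open convention where touching edges do not count as overlap, and the function names are assumptions for illustration; the talk does not specify them.

```python
from itertools import combinations


def rects_overlap(a, b):
    """Return True if two half-open rectangles (x0, y0, x1, y1) overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def build_conflict_graph(rects):
    """Map each rectangle index to the set of indices it overlaps with."""
    graph = {i: set() for i in range(len(rects))}
    for i, j in combinations(range(len(rects)), 2):
        if rects_overlap(rects[i], rects[j]):
            graph[i].add(j)  # one edge per overlap, stored in both directions
            graph[j].add(i)
    return graph


rects = [
    (0, 0, 4, 4),      # overlaps rectangle 1
    (2, 2, 6, 6),      # overlaps rectangle 0
    (10, 10, 12, 12),  # isolated
    (4, 0, 8, 2),      # only touches rectangles 0 and 1
]
graph = build_conflict_graph(rects)
```

The pairwise test is quadratic in the number of objects, which matches the order of the coloring step that follows it.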
Chaitin's Algorithm
Heuristic algorithm designed for register allocation
Input
- Conflict graph
- Set of colors
Output
- Color-coded graph
Some vertices may remain uncolored
Complexity: O(N²)
[Figure legend: color 1, color 2]
Chaitin's Algorithm: Deconstruction
1. Find any vertex with the least number of incident edges
2. Remove vertex and put it onto a stack
3. Repeat until graph is deconstructed
[Figure: example graph with vertices 1 to 5 and the resulting stack (4, 1, 5, 3, 2 as extracted)]
Chaitin's Algorithm: Reconstruction
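1. Reinsert top vertex on stack into graph
2. Find a color not used by any reconstructed neighbor
3. Repeat until entire graph is reconstructed
[Figure: example graph with vertices 1 to 5 rebuilt from the stack (4, 1, 5, 3, 2 as extracted); legend: color 1, color 2; one vertex is marked "No color available"]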
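The deconstruction and reconstruction phases can be sketched together in a few lines of Python. This follows the simplified steps listed on the slides rather than the full register allocation algorithm with spilling; the function name chaitin_color, the adjacency-set graph format, the tie-break by vertex id, and the use of None for uncolored vertices are assumptions for illustration.

```python
def chaitin_color(graph, num_colors=4):
    """Color a conflict graph; graph maps vertex -> set of neighbors.

    Returns vertex -> color index, or None where no color was available.
    """
    remaining = {v: set(n) for v, n in graph.items()}
    stack = []

    # Deconstruction: repeatedly remove a vertex with the fewest incident edges.
    while remaining:
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        for n in remaining.pop(v):
            remaining[n].discard(v)
        stack.append(v)

    # Reconstruction: reinsert vertices in reverse order and pick a free color.
    colors = {}
    while stack:
        v = stack.pop()
        used = {colors[n] for n in graph[v] if colors.get(n) is not None}
        free = [c for c in range(num_colors) if c not in used]
        colors[v] = free[0] if free else None
    return colors


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
two_colors = chaitin_color(triangle, num_colors=2)
four_colors = chaitin_color(triangle, num_colors=4)
```

With four colors, one per RGBA channel, the triangle is fully colored. With only two colors one vertex stays uncolored and needs the additional treatment discussed next. The linear search for the minimum in each iteration gives the O(N²) behavior noted earlier.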
What to do about uncolored objects?
About Uncolored Objects
Uncolored objects need additional treatment
1. Split bounding rectangle of uncolored object
2. Attempt to color sub rectangles
3. Assign any color if no unique color can be found
   - Visibility overestimation
4. Attempt to merge sub regions
The Pixel Counting SAT Pipeline
CPU (Application)
- Construct Conflict Graph
- Graph Coloring
- Treat Uncolored Objects
- Calculate Look Up Coordinates
GPU
- Render to texture and compute SAT [Hensley05]
- Count Pixels by SAT Look Up
[Figure: pipeline diagram; labels: Objects; Shader Constants (color information, look up information)]
[Hensley05: Fast Summed Area Table Generation and its Applications]
Pixel Counting using the Hierarchical Item Buffer
The Hierarchical Item Buffer (HIB)
Exploits histogram computation algorithm
- GPU implementation demonstrated by Scheuermann [Scheuermann07: Efficient histogram generation using scattering on GPUs]
Render unique IDs to texture
$$ID = ID_{Object} \cdot Resolution + ID_{Sub}$$
Object ID: 0 1 2
Sub ID: Enumerate pixels
Resolution: 8x10 = 80
[Figure: 8x10 ID texture showing three objects; example IDs 21, 30 to 32, 41, 42 (object 0), 115, 116, 124 to 126, 134 to 136, 144 to 146 (object 1), and 167 to 169, 177 to 179, 187, 188 (object 2)]
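The ID layout can be checked with a short Python sketch. The helper names encode_id and decode_id are hypothetical; the formula and the resolution of 80 come from the slide, and the sample IDs are taken from the figure.

```python
def encode_id(object_id, sub_id, resolution):
    """Combine object and sub ID: ID = ID_Object * Resolution + ID_Sub."""
    if not 0 <= sub_id < resolution:
        raise ValueError("sub_id must lie in [0, resolution)")
    return object_id * resolution + sub_id


def decode_id(item_id, resolution):
    """Split a combined ID back into (object_id, sub_id)."""
    return divmod(item_id, resolution)


RESOLUTION = 8 * 10  # sub IDs enumerate the pixels of an 8x10 texture

first = encode_id(0, 21, RESOLUTION)
second = encode_id(1, 35, RESOLUTION)
third = decode_id(167, RESOLUTION)
```

Because every object owns a contiguous block of Resolution bins, all IDs of one object fall into one range of the histogram.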
GPU Item Buffer
Reinterpret ID texture as point list
- Vertex or Geometry Shader for scattering
- Blending operations for counting
VS/GS: maps to histogram bin
Rasterizer: renders point primitive
Blending: increments bin
[Figure: point list of IDs (21 30 … 31 32 … 124 115 125 188 179 … 167) scattered into the histogram/item buffer; bins start at 0 and each point adds 1]
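A CPU emulation of the scatter-and-blend counting may help to see what the GPU passes compute. On the GPU the loop body corresponds to one point primitive per texel and additive blending into the bin; here it is a plain Python loop. The function name and the use of -1 for background texels are assumptions for illustration.

```python
def item_buffer_histogram(id_texture, num_bins):
    """Count how often each ID occurs in the ID texture.

    id_texture is a list of rows; texels with a negative value are treated
    as background (an assumption) and skipped.
    """
    histogram = [0] * num_bins
    for row in id_texture:
        for item_id in row:
            if item_id < 0:
                continue  # background texel, nothing to scatter
            histogram[item_id] += 1  # additive blending increments the bin
    return histogram


id_texture = [
    [-1, 21, 21],
    [30, 21, -1],
]
histogram = item_buffer_histogram(id_texture, num_bins=32)
```

Many texels mapping to the same bin serialize on that bin, which is the overdraw cost mentioned in the discussion.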
Hierarchical Queries
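Intelligently distributing IDs enables hierarchical queries by mip mapping
[Figure: histogram bins combined level by level through mip mapping; values as extracted: "1 10 … 0 1 10 … 1 1 10 … 0", then "1 13 1 2 43 2 4 12 1", then "6 11 8"]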
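The hierarchy can be sketched as repeated reduction of the histogram, where each coarser level sums groups of adjacent bins. Hardware mip mapping averages rather than sums, so the sum here stands for the average scaled by the group size; that choice, the one-dimensional layout, the fanout of 2, and the function name are assumptions for illustration.

```python
def build_hierarchy(histogram, fanout=2):
    """Return a list of levels; level 0 is the input histogram.

    Each following level sums groups of `fanout` adjacent bins.
    """
    levels = [list(histogram)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append(
            [sum(prev[i:i + fanout]) for i in range(0, len(prev), fanout)]
        )
    return levels


# Three objects with four sub IDs each (resolution 4).
histogram = [1, 0, 1, 1,  0, 0, 0, 0,  1, 1, 1, 1]
levels = build_hierarchy(histogram)
per_object = levels[2]
```

With a resolution that is a power of the fanout, one level holds exactly one bin per object, so a single lookup answers the per-object query while finer levels answer per-sub-ID queries.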
Applications and Results
Culling of Instances
Shadow volumes with instanced rendering
- Volumes entirely contained in others have no effect [Lloyd04: CC Shadow Volumes]
- Test caster visibility from light
- HOQ / Occl. Predicates cannot be applied directly
  - Granularity: a single draw call, not individual instances
[Figure: instances of the same object and their shadow volumes; volumes with no contribution are culled]
Culling of Instances
Granularity: Per Instance (Sub-ID: Instance ID)
500 shadow casters (606 triangles each)
ID texture/SAT resolution: 512x512 pixels
[Bar chart: ATI (HD 4780 X2); series: Predicates, HIB, PiC-SAT, No Cull; axis from 0 to 50; bar labels as extracted: 24, 11, 29, 26, 43, 31, 12, 27]
Culling of Individual Primitives
Displacement Mapping
- Setup costs in GS (mesh extrusion, tetrahedra, texture gradients)
- Ray-Casting in PS
  - Cannot exploit early-z due to depth write in PS
Don't output triangles if extruded prism is not visible
- Exact visibility requires ray-casting in HIB/SAT pass
- Conservative visibility estimation by mesh extrusion
Culling of Individual Primitives
Granularity: Per prism
Lizard: 7132 triangles
ID texture / SAT resolution: 512x512
[Bar chart: categories Fully Visible, Partially Visible, Occluded; series PiC-SAT, HIB, Predicates, No Cull, each for NVIDIA and ATI; axis from 0 to 200; bar labels as extracted: 22, 42, 45, 71, 90, 90, 22, 32, 42, 20, 30, 110, 24, 87, 158, 66, 131, 186, 19, 50, 67, 57, 89, 112]
NVIDIA: GTX280 ATI: HD3780
Discussion
Pixel Counting SAT
- Not enough colors, if many objects
  - Visibility overestimation
- Difficult implementation
  - Not entirely transparent to application (overlap, coloring, …)
- Performance
  - Dominated by treatment of uncolored objects (rectangle split/merge, texture access)
- Can handle arbitrary screen regions for query
Hierarchical Item Buffer
- No penalty for many objects
- Easy implementation
  - Transparent to application, GPU handles everything
- Performance
  - Dominated by overdraw in item buffer caused by choice of IDs
  - Usage of many IDs better exploits parallelism. Mip map does the rest (memory consumption)
- Query regions defined by ID assignment