7/29/2019 Deferred Rendering for Current and Future Rendering Pipelines
Beyond Programmable Shading
ACM SIGGRAPH 2010
Deferred Rendering for Current and
Future Rendering Pipelines
Andrew Lauritzen
Advanced Rendering Technology (ART)
Intel Corporation
Overview
Forward shading
Deferred shading and lighting
Tile-based deferred shading
Deferred multi-sample anti-aliasing (MSAA)
Beyond Programmable Shading, SIGGRAPH 2010
Forward Shading
Do everything we need to shade a pixel
for each light
  Shadow attenuation (sampling shadow maps)
  Distance attenuation
  Evaluate lighting and accumulate
Multi-pass requires resubmitting scene geometry
Not a scalable solution
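The per-pixel light loop above can be sketched in a few lines. This is a minimal illustration, not the talk's shader code: the light dictionary layout, the linear distance falloff, and the `shadow_fn` hook are all assumptions made for the example.

```python
import math

def shade_pixel_forward(position, normal, albedo, lights,
                        shadow_fn=lambda position, light: 1.0):
    """Accumulate diffuse lighting for one pixel over every light."""
    color = [0.0, 0.0, 0.0]
    for light in lights:
        to_light = [l - p for l, p in zip(light["position"], position)]
        dist = math.sqrt(sum(c * c for c in to_light))
        if dist == 0.0 or dist >= light["radius"]:
            continue  # light cannot reach this point
        direction = [c / dist for c in to_light]
        shadow = shadow_fn(position, light)        # shadow attenuation (hypothetical hook)
        falloff = 1.0 - dist / light["radius"]     # distance attenuation (linear, assumed)
        n_dot_l = max(0.0, sum(n * d for n, d in zip(normal, direction)))
        for i in range(3):                         # evaluate lighting and accumulate
            color[i] += albedo[i] * light["color"][i] * n_dot_l * falloff * shadow
    return color
```

Every light is evaluated for every shaded pixel, which is why the cost grows with the product of geometry and light count unless lights are culled per object.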
Forward Shading Problems
Ineffective light culling
Object space at best
Trade-off with shader permutations/batching
Memory footprint of all inputs
Everything must be resident at the same time
Shading small triangles is inefficient
Covered earlier in this course: [Fatahalian 2010]
Conventional Deferred Shading
Store lighting inputs in memory (G-buffer)
for each light
  Use rasterizer to scatter light volume and cull
  Read lighting inputs from G-buffer
  Compute lighting
  Accumulate lighting with additive blending
Reorders computation to extract coherence
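As a rough CPU-side sketch of the two phases, the code below writes a G-buffer once and then runs one pass per light over the screen rectangle that light covers. The G-buffer layout (position, normal, scalar albedo), the `rect` field, and the falloff model are assumptions for illustration; a real implementation would use the rasterizer and blend hardware.

```python
import math

def build_gbuffer(width, height, surface_fn):
    """Geometry pass: store per-pixel lighting inputs (position, normal, albedo)."""
    return [[surface_fn(x, y) for x in range(width)] for y in range(height)]

def point_light(position, normal, albedo, light):
    """Diffuse point light with a linear falloff (assumed lighting model)."""
    to_light = [l - p for l, p in zip(light["position"], position)]
    dist = math.sqrt(sum(c * c for c in to_light))
    if dist == 0.0 or dist >= light["radius"]:
        return 0.0
    n_dot_l = max(0.0, sum(n * c / dist for n, c in zip(normal, to_light)))
    return albedo * light["intensity"] * n_dot_l * (1.0 - dist / light["radius"])

def light_pass(gbuffer, lights):
    """One pass per light over its half-open screen rect, accumulating additively."""
    height, width = len(gbuffer), len(gbuffer[0])
    accum = [[0.0] * width for _ in range(height)]
    gbuffer_reads = 0
    for light in lights:
        x0, y0, x1, y1 = light["rect"]  # pixels the light volume would rasterize to
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                position, normal, albedo = gbuffer[y][x]  # read lighting inputs
                gbuffer_reads += 1
                accum[y][x] += point_light(position, normal, albedo, light)  # additive blend
    return accum, gbuffer_reads
```

The returned read counter makes the cost visible: a pixel covered by several lights has its G-buffer entry fetched once per light.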
Modern Implementation
Cull with screen-aligned quads
Cover light extents with axis-aligned bounding box
Full light meshes (spheres, cones) are generally o
Can use oriented bounding box for narrow spot lights
Use conservative single-direction depth test
Two-pass stencil is more expensive than it is worth
Depth bounds test on some hardware, but not batc
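One way to obtain a screen-aligned rectangle for a point light is to project its bounding sphere. The sketch below is an assumption-laden illustration rather than the talk's code: it uses a view space with the camera looking down +z, a hypothetical `focal` pair of projection scale factors, and pixel coordinates that grow with view-space x and y. It falls back to the full screen when the sphere crosses the near plane.

```python
import math

def axis_bounds(c, cz, r):
    """Min and max of c/z over a circle centered at (c, cz) with radius r, for cz > r."""
    d = math.hypot(c, cz)
    theta = math.atan2(c, cz)   # angle of the circle center from the view axis
    alpha = math.asin(r / d)    # half-angle subtended by the circle
    return math.tan(theta - alpha), math.tan(theta + alpha)

def light_screen_rect(center, radius, focal, width, height, near=0.1):
    """Half-open pixel rect (x0, y0, x1, y1) covering the light, or None if invisible."""
    cx, cy, cz = center
    if cz + radius <= near:
        return None                     # entirely behind the near plane
    if cz - radius <= near:
        return (0, 0, width, height)    # crosses the near plane: be conservative
    spans = []
    for c, f, size in ((cx, focal[0], width), (cy, focal[1], height)):
        lo, hi = axis_bounds(c, cz, radius)
        p0 = (lo * f * 0.5 + 0.5) * size    # NDC [-1, 1] mapped to pixels
        p1 = (hi * f * 0.5 + 0.5) * size
        spans.append((max(0, math.floor(p0)), min(size, math.ceil(p1))))
    (x0, x1), (y0, y1) = spans
    if x0 >= x1 or y0 >= y1:
        return None                     # projects entirely off screen
    return (x0, y0, x1, y1)
```

Rounding outward with `floor` and `ceil` keeps the rectangle conservative, so no lit pixel is missed at the cost of a few extra pixel shader invocations.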
Lit Scene (256 Point Lights)
Quad-Based Light Culling
Deferred Shading Problems
Bandwidth overhead when lights overlap
for each light
  Use rasterizer to scatter light volume and cull
  Read lighting inputs from G-buffer (overhead)
  Compute lighting
  Accumulate lighting with additive blending (overhead)
Not doing enough work to amortize overhead
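A back-of-the-envelope model makes the overhead concrete. The byte sizes and the assumption that additive blending costs one read plus one write of the accumulation target are hypothetical simplifications; real hardware caches and compresses some of this traffic.

```python
def quad_light_pass_bytes(pixels_per_light, gbuffer_bytes, accum_bytes):
    """Traffic when every light re-reads the G-buffer and blends into the target.

    pixels_per_light: number of pixels each light's quad covers.
    Blending is modeled as a read plus a write of the accumulation target.
    """
    shaded = sum(pixels_per_light)
    return shaded * (gbuffer_bytes + 2 * accum_bytes)

def single_pass_bytes(num_pixels, gbuffer_bytes, accum_bytes):
    """Traffic when each pixel reads the G-buffer once and writes its result once."""
    return num_pixels * (gbuffer_bytes + accum_bytes)
```

With three lights fully overlapping the same 100 pixels, a 16-byte G-buffer, and an 8-byte accumulation target, this model gives 9600 bytes for the per-light passes against 2400 bytes for a single pass.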
Improving Deferred Shading
Reduce G-buffer overhead
Access fewer things inside the light loop
Deferred lighting / light pre-pass
Amortize overhead
Group overlapping lights and process them together
Tile-based deferred shading
Deferred Lighting / Light Pre-Pass
Goal: reduce G-buffer overhead
Split diffuse and specular terms
Common concession is monochromatic specular
Factor out constant terms from summation
Albedo, specular amount, etc.
Sum inner terms over all lights
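The factoring can be checked numerically. In this sketch each light contributes a scalar diffuse term and a scalar specular term (a single color channel, with monochromatic specular); the function names are made up for the example.

```python
def shade_unfactored(albedo, spec_amount, lights):
    """Full shading inside the light loop: needs albedo and specular amount per light."""
    return sum(albedo * diffuse + spec_amount * specular for diffuse, specular in lights)

def light_prepass(lights):
    """Light loop of deferred lighting: sums only the inner, light-dependent terms."""
    diffuse_sum = sum(diffuse for diffuse, _ in lights)
    specular_sum = sum(specular for _, specular in lights)
    return diffuse_sum, specular_sum

def resolve(albedo, spec_amount, diffuse_sum, specular_sum):
    """Resolve pass: applies the factored-out constant terms once per pixel."""
    return albedo * diffuse_sum + spec_amount * specular_sum
```

Both paths produce the same value because the albedo and specular amount do not depend on the light, so they distribute over the sum.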
Deferred Lighting / Light Pre-Pass
Resolve pass combines factored components
Still best to store all terms in G-buffer up front
Better SIMD efficiency
Incremental improvement for some hardware
Relies on pre-factoring lighting functions
Ability to vary resolve pass is not particularly
See [Hoffman 2009] and [Stone 2009]
Tile-Based Deferred Shading
Goal: amortize overhead
Use screen tiles to group lights
Use tight tile frusta to cull non-intersecting lights
Reduces number of lights to consider
Read G-buffer once and evaluate all relevant lights
Reduces bandwidth of overlapping lights
See [Andersson 2009] for more details
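A simplified sketch of the tiling idea follows. It bins lights by their 2D screen rectangles only; the tight tile frusta mentioned above would also use each tile's depth range, which is omitted here. The 16-pixel tile size, the rect representation, and the `shade_fn` callback are assumptions for illustration.

```python
def bin_lights(light_rects, width, height, tile=16):
    """Map each tile (tx, ty) to the indices of lights whose screen rect overlaps it."""
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    bins = {(tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)}
    for index, (x0, y0, x1, y1) in enumerate(light_rects):  # half-open pixel rects
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x0 >= x1 or y0 >= y1:
            continue  # light is off screen
        for ty in range(y0 // tile, (y1 - 1) // tile + 1):
            for tx in range(x0 // tile, (x1 - 1) // tile + 1):
                bins[(tx, ty)].append(index)
    return bins

def shade_tiled(gbuffer, lights, light_rects, shade_fn, tile=16):
    """Read each G-buffer texel once and evaluate every light binned to its tile."""
    height, width = len(gbuffer), len(gbuffer[0])
    bins = bin_lights(light_rects, width, height, tile)
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            texel = gbuffer[y][x]  # single G-buffer read per pixel
            for index in bins[(x // tile, y // tile)]:
                out[y][x] += shade_fn(texel, lights[index])
    return out
```

The accumulation happens in a local variable per pixel rather than through blending, so overlapping lights add arithmetic but no extra G-buffer or render-target traffic.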
Lit Scene (1024 Point Lights)
Tile-Based Light Culling
Quad-Based Light Culling
Light Culling Only at 1080p
[Chart: frame time (ms) against number of point lights (16 to 1024), comparing quad-based and tiled culling on ATI and NVIDIA hardware]
Tile setup dominates
Total Performance at 1080p
[Chart: frame time (ms) against number of point lights (16 to 1024), comparing deferred shading, deferred lighting, and tiled shading on ATI and NVIDIA hardware]
Deferred lighting slightly faster, but trends similarly
Slope ~ 4 s /
Slope ~ 20 s
Few lights overlap
Anti-aliasing
Multi-sampling with deferred rendering requires some work
Regular G-buffer couples visibility and shading
Handle multi-frequency shading in user space
Store G-buffer at sample frequency
Only apply per-sample shading where necessary
Offers additional flexibility over forward rendering
Identifying Edges
Forward MSAA causes redundant work
It applies to all triangle edges, even for continuous, tessellated surfaces
Want to find surface discontinuities
Compare sample depths to depth derivatives
Compare (shading) normal deviation over samples
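A per-pixel edge test along these lines might look like the sketch below. The thresholds (`slack`, `eps`, `min_cos`) are hypothetical tuning values, and the depth gradient is assumed to be the screen-space derivative of depth in units of depth per pixel.

```python
def needs_per_sample_shading(depths, normals, depth_gradient,
                             slack=1.5, eps=1e-4, min_cos=0.99):
    """Return True if the samples of one pixel appear to span a surface discontinuity.

    depths: depth of each MSAA sample in the pixel.
    normals: unit shading normal of each sample.
    depth_gradient: (dz/dx, dz/dy) of the surface at sample 0.
    """
    dzdx, dzdy = depth_gradient
    # Samples lie within one pixel, so a single plane can change depth by at most
    # |dz/dx| + |dz/dy| between two of them; allow some slack on top of that.
    allowed = slack * (abs(dzdx) + abs(dzdy)) + eps
    if max(depths) - min(depths) > allowed:
        return True
    first = normals[0]
    for normal in normals[1:]:
        if sum(a * b for a, b in zip(first, normal)) < min_cos:
            return True  # normals deviate too much across the samples
    return False
```

Interior edges of a smooth tessellated mesh pass both tests, so they are shaded once per pixel even though MSAA coverage differs between their samples.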
Per-Sample Shading Visualization
MSAA with Quad-Based Methods
Mark pixels for per-sample shading
Stencil still faster than branching on most hardware
Probably gets scheduled better
Shade in two passes: per-pixel and per-sample
Unfortunately, duplicates culling work
Scheduling is still a problem
Per-Sample Scheduling
Lack of spatial locality causes hardware
scheduling inefficiency
MSAA with Tile-Based Methods
Handle per-pixel and per-sample in one pass
Avoids duplicate culling work
Can use branching, but incurs scheduling problems
Instead, reschedule per-sample pixels
Shade sample 0 for the whole tile
Pack a list of pixels that require per-sample shading
Redistribute threads to process additional samples
Scatter per-sample shaded results
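The four rescheduling steps can be simulated sequentially as below. This is only a sketch of the control flow: the strided assignment of work items to `num_threads` workers stands in for GPU threads in a tile, and the final box-filter average per pixel is an assumption, not something the slide specifies.

```python
def shade_tile_msaa(tile_samples, is_edge, shade_fn, num_threads=4):
    """tile_samples[p][s] holds the G-buffer data for sample s of pixel p in one tile."""
    num_pixels = len(tile_samples)
    num_samples = len(tile_samples[0])
    # 1. Shade sample 0 for the whole tile.
    results = [[shade_fn(tile_samples[p][0])] for p in range(num_pixels)]
    # 2. Pack a list of pixels that require per-sample shading.
    edge_pixels = [p for p in range(num_pixels) if is_edge[p]]
    # 3. Redistribute threads over the remaining (pixel, sample) work items.
    work = [(p, s) for p in edge_pixels for s in range(1, num_samples)]
    shaded = {}
    for thread in range(num_threads):
        for p, s in work[thread::num_threads]:  # strided slice taken by this thread
            shaded[(p, s)] = shade_fn(tile_samples[p][s])
    # 4. Scatter per-sample results back to their pixels.
    for p in edge_pixels:
        results[p].extend(shaded[(p, s)] for s in range(1, num_samples))
    # Resolve (assumed): average whatever samples were shaded for each pixel.
    return [sum(r) / len(r) for r in results]
```

Because the extra samples form one dense work list, every thread stays busy regardless of where the edge pixels sit inside the tile.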
Tile-Based MSAA at 1080p, 1024 Lights
[Chart: frame time (ms) for Crytek Sponza and a 2009 game scene on ATI 5870 and NVIDIA 480, comparing no MSAA against two 4x MSAA variants]
4x MSAA Performance at 1080p
[Chart: frame time (ms) against number of point lights (16 to 1024) with 4x MSAA, comparing deferred shading, deferred lighting, and tiled shading on ATI and NVIDIA hardware]
Slope ~ 5 s /
Slope ~ 35 s
Tiled takes less of a hit from MSAA
Deferred lighting even less compelling
Conclusions
Deferred shading is a useful rendering tool
Decouples shading from visibility
Allows efficient user-space scheduling and culling
Tile-based methods win going forward
Fastest and most flexible
Enable efficient MSAA
Future Work
Hierarchical light culling
Straightforward but would need lots of small
Improve MSAA memory usage
Irregular/compressed sample storage?
Revisit binning pipelines?
Sacrifice higher resolutions for better AA?
Acknowledgements
Microsoft and Crytek for the scene assets
Johan Andersson from DICE
Craig Kolb, Matt Pharr, and others in the
Advanced Rendering Technology team a
Nico Galoppo, Anupreet Kalra and Mike Bfrom Intel
References
[Andersson 2009] Johan Andersson, Parallel Graphics in Frostbite - Current & Future, http://s09.idav.ucdavis.edu/
[Fatahalian 2010] Kayvon Fatahalian, Evolving the Direct3D Pipeline for Real-time Micropolygon Rendering, http://bps10.idav.ucdavis.edu/
[Hoffman 2009] Naty Hoffman, Deferred Lighting Approaches, http://www.realtimerendering.com/blog/deferred-lighting-approaches/
[Stone 2009] Adrian Stone, Deferred Shading Shines. Deferred Lighting? Not So Much., http://gameangst.com/?p=141
Questions?
Full source and demo available at:
http://visual-computing.intel-research.net/art/publications/deferred_rende
Backup
Quad-Based Light Culling
Accumulate many lights per draw call
Render one point per light
Vertex shader computes quad bounds for light
Geometry shader expands into two triangles
Pixel shader reads G-buffer and evaluates lighting
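The expansion step is small enough to mimic on the CPU. The sketch below assumes each light's quad bounds arrive as a rect `(x0, y0, x1, y1)` and emits a flat triangle list, six vertices per light, which is roughly what a geometry shader would output for a batch of point primitives. The function names are made up for the example.

```python
def expand_quad(rect):
    """Turn one light's screen bounds into two triangles with the same winding."""
    x0, y0, x1, y1 = rect
    top_left, top_right = (x0, y0), (x1, y0)
    bottom_left, bottom_right = (x0, y1), (x1, y1)
    return [top_left, top_right, bottom_left,
            bottom_left, top_right, bottom_right]

def expand_lights(rects):
    """One input point per light in, one flat vertex list for the whole batch out."""
    vertices = []
    for rect in rects:
        vertices.extend(expand_quad(rect))
    return vertices
```

Batching this way lets many lights share a single draw call, since the per-light data travels with each input point instead of requiring state changes.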
Tile-Based Deferred Lighting?
Can do deferred lighting with tiling...
Not usually worth sacrificing the flexibility
Bandwidth already minimized
Additional resolve pass can make it slower o
Exception: hardware considerations
SPU lighting on PlayStation 3
Moving less data across the bus can be an overall