7/29/2019 05-schedulingGraphicsPipeline-BPS2011-ragankelley
Beyond Programmable Shading 2011
Scheduling the Graphics Pipeline
Jonathan Ragan-Kelley, MIT CSAIL
9 August 2011
This talk
How to think about scheduling GPU-style pipelines
Four constraints which drive scheduling decisions
Examples of these concepts in real GPU designs
Goals
Know why GPUs, APIs impose the constraints they do.
Develop intuition for what they can do well.
Understand key patterns for building your own pipelines.
First, some definitions
Scheduling [n.]: Assigning computations and data to resources in space and time.
Task [n.]: A single, discrete unit of work.
The workload: Direct3D
[Figure: the Direct3D pipeline, with its dataflow annotated]
The machine: a modern GPU
Scheduling a draw call as a series of tasks
[Figure: schedule diagram; axis: time]
An efficient schedule keeps hardware busy
[Figure: schedule diagram; axis: time]
Choosing which tasks to run when (and where)
Resource constraints
Tasks can only execute when there are sufficient resources for their computation and their data.
Coherence
Control coherence is essential to shader core efficiency.
Data coherence is essential to memory and communication efficiency.
Load balance
Irregularity in execution time creates bubbles in the pipeline schedule.
Ordering
Graphics APIs define strict ordering semantics, which restrict possible schedules.
Resource constraints limit scheduling
[Figure: schedule diagram; axis: time; slots marked "???"]
Key concept: Preallocation of resources helps guarantee forward progress.
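One way to picture the preallocation concept is a launcher that reserves everything a task could need before it starts, and refuses to launch otherwise. The sketch below is a hypothetical model, not any GPU's actual mechanism: `Core`, `Task`, `tryLaunch`, and `retire` are made-up names, and registers and scratch bytes stand in for whatever resources a real machine tracks.

```cpp
#include <cstddef>

// Hypothetical resource pool for one shader core.
struct Core {
    std::size_t freeRegisters;
    std::size_t freeScratchBytes;
};

// Worst-case requirements, declared before the task runs.
struct Task {
    std::size_t registers;
    std::size_t scratchBytes;
};

// Launch only if the task's entire worst-case footprint fits right now.
// A launched task never has to wait for more resources mid-flight,
// so it can always run to completion and release what it holds.
bool tryLaunch(Core& core, const Task& task) {
    if (task.registers > core.freeRegisters ||
        task.scratchBytes > core.freeScratchBytes) {
        return false;  // caller keeps the task queued and retries later
    }
    core.freeRegisters -= task.registers;
    core.freeScratchBytes -= task.scratchBytes;
    return true;
}

// Return the task's reservation to the pool when it finishes.
void retire(Core& core, const Task& task) {
    core.freeRegisters += task.registers;
    core.freeScratchBytes += task.scratchBytes;
}
```

A task that does not fit simply stays queued; because running tasks already hold all they need, they eventually retire and free space for it.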
Coherence is a balancing act
Intrinsic tension between:
Horizontal (control, fetch) coherence and vertical (producer-consumer) locality.
Locality and load balance.
Graphics workloads are irregular
But: Shaders are optimized for regular, self-similar work.
Imbalanced work creates bubbles in the task schedule.
Solution:
Dynamically generating and aggregating tasks isolates irregularity and recaptures coherence.
Redistributing tasks restores load balance.
Redistribution after irregular amplification
[Figure: schedule diagram; axis: time]
Key concept: Managing irregularity by dynamically generating, aggregating, and redistributing tasks.
Ordering
Rule:
All framebuffer updates must appear as though all triangles were drawn in strict sequential order.
Key concept: Carefully structuring task redistribution to maintain API ordering.
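A common way to reconcile out-of-order execution with in-order results is to tag work with a sequence number and commit in sequence. The sketch below is an illustrative software model only: `ReorderBuffer`, `complete`, and `drain` are hypothetical names, the `int` result stands in for a framebuffer update, and real hardware would typically enforce this per pixel or per tile rather than globally.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical in-order commit stage: work may finish in any order, but
// results are released strictly in submission (sequence number) order.
class ReorderBuffer {
public:
    // Record that the work item with sequence number `seq` has finished.
    void complete(std::uint32_t seq, int result) { pending_[seq] = result; }

    // Release every result whose predecessors have all been released.
    std::vector<int> drain() {
        std::vector<int> committed;
        auto it = pending_.find(next_);
        while (it != pending_.end()) {
            committed.push_back(it->second);
            pending_.erase(it);
            ++next_;
            it = pending_.find(next_);
        }
        return committed;
    }

private:
    std::uint32_t next_ = 0;                // next sequence number allowed to commit
    std::map<std::uint32_t, int> pending_;  // finished but not yet committed
};
```

Results for items 1 and 2 are held back until item 0 completes, so the visible update order matches the order in which the triangles were submitted.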
Building a real pipeline
Static tile scheduling
Multiple cores: 1 front-end, n back-end.
The simplest thing that could possibly work.
Static tile scheduling
Locality: captured within tiles.
Resource constraints: static = simple.
Ordering: single front-end, sequential processing within each tile.
Static tile scheduling
The problem: load imbalance.
Only one task creation point.
No dynamic task redistribution.
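The load imbalance is easy to reproduce in a toy model. The sketch below assumes an interleaved mapping (tile index modulo the number of back-end cores), which is one plausible static assignment rather than the mapping of any specific GPU; `coreForTile`, `workPerCore`, and the per-tile work values are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Static assignment: tile t always goes to back-end core (t mod n).
std::size_t coreForTile(std::size_t tile, std::size_t numCores) {
    return tile % numCores;
}

// Sum the (hypothetical) per-tile work that lands on each core.
std::vector<int> workPerCore(const std::vector<int>& tileWork,
                             std::size_t numCores) {
    std::vector<int> load(numCores, 0);
    for (std::size_t t = 0; t < tileWork.size(); ++t) {
        load[coreForTile(t, numCores)] += tileWork[t];
    }
    return load;
}
```

With tile work `{1, 1, 50, 1, 1, 1, 40, 1}` on 4 cores, core 2 receives both expensive tiles while the other three cores sit mostly idle, and nothing in a static scheme can move that work elsewhere.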
Sort-last fragment shading
Exemplar: NVIDIA
Redistrib… fragment…
But how… maintain…
Comp
shadasync
Unified shaders
Exemplar: NVIDIA
Solve load balance by time-multiplexing different stages onto shared processors according to load.
Unified Shaders: time-multiplexing
[Figure: schedule diagram of pipeline stages time-multiplexed onto shared processors; axis: time]
Prioritizing the logical pipeline
[Figure: logical pipeline annotated with priorities 5 through 0 and fixed-size queue storage]
Scheduling the pipeline
[Figure: schedule diagram; axis: time]
Queue sizes and backpressure provide a natural knob for balancing horizontal batch coherence and producer-consumer locality.
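The interplay of priorities, fixed-size queues, and backpressure can be sketched in a few lines. This is a single-threaded toy model under stated assumptions: later stages get higher priority, each stage consumes one item and produces one item, and `Pipeline`, `pickStage`, and `runStage` are hypothetical names, not part of any real scheduler.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: queueDepth[i] is the number of items waiting at the
// input of stage i; every queue holds at most `capacity` items.
struct Pipeline {
    std::vector<std::size_t> queueDepth;
    std::size_t capacity;
};

// Pick the runnable stage closest to the end of the pipeline, or -1 if none.
// A stage is runnable when its input queue is non-empty and the queue it
// writes to has room; a full downstream queue is the backpressure signal.
int pickStage(const Pipeline& p) {
    const int last = static_cast<int>(p.queueDepth.size()) - 1;
    for (int s = last; s >= 0; --s) {
        const bool hasInput = p.queueDepth[s] > 0;
        const bool hasRoom = (s == last) || p.queueDepth[s + 1] < p.capacity;
        if (hasInput && hasRoom) return s;
    }
    return -1;
}

// Run one item through stage s: consume one input, produce one output.
void runStage(Pipeline& p, int s) {
    --p.queueDepth[s];
    if (s + 1 < static_cast<int>(p.queueDepth.size())) ++p.queueDepth[s + 1];
}
```

In this model the downstream stage always wins as soon as it has input. On a machine with many cores and batched launches, `capacity` would bound how far producers can run ahead of their consumers, which is the knob the slide refers to.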
A real computational graphics pipeline
[Figure: pipeline diagram with labels Host, Entry Points, Trace, Traversal, Ray Shading; programs: Ray Generation, Exception, Intersection, Any Hit, Selector Visit, Closest Hit, Miss; also Buffers, Texture Samplers, Variables]
Pipeline abstraction for ray tracing
Application functionality: shader-style programs
Intersecting primitives
Shading surfaces, firing rays
Pipeline structure
Traversal, acceleration structures
Order of execution
Resource management
Issues in scheduling a ray tracer
Breadth-first or depth-first traversal?
Wide execution aggregates more potentially coherent work.
Depth-first execution reduces the footprint needed.
OptiX: as wide as the machine, but no wider.
Extracting SIMD coherence
Shader core requires SIMD batches for efficiency.
Rays may diverge.
Ray tracing on a SIMD machine
[Figure: scalar ray tracing and packet tracing, each shown as a sequence of traverse steps]
Breaking packets: SIMT ray tracing [Aila & Laine 2…]
Allow data divergence
Different rays traverse, intersect different parts of the scene.
Maintain control (SIMD) coherence
All rays in bundle either traverse or intersect together.

    while (state != done) {
        if (state == traverse) traverse();
        if (state == intersect) intersect();
    }
A pipeline program as a state machine
    while (myState != DONE) {
        nextState = scheduler();      // scheduler picks one state to run next
        if (myState == nextState)     // only threads in that state execute
            switch (myState) {
            case 0: myState = traverse();     break;
            case 1: myState = intersector1(); break;
            case 2: myState = intersector2(); break;
            case 3: myState = shader1();      break;
            case 4: myState = shader2();      break;
            }
    }
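The slide leaves `scheduler()` unspecified. One plausible policy, shown here purely as an assumption, is a majority vote: run the state that the most lanes in the SIMD batch are waiting on. The sketch takes the lane states as an explicit argument so it can run on a CPU; `kNumStates`, `kDone`, and the tie-breaking rule are hypothetical choices.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr int kNumStates = 5;  // traverse, two intersectors, two shaders
constexpr int kDone = -1;      // marks lanes that have finished

// Hypothetical scheduler: return the state held by the most active lanes,
// or kDone when every lane has finished. Ties go to the lowest state id.
int scheduler(const std::vector<int>& laneStates) {
    std::array<int, kNumStates> votes{};
    for (int s : laneStates) {
        if (s != kDone) ++votes[static_cast<std::size_t>(s)];
    }
    int best = kDone;
    int bestVotes = 0;
    for (int s = 0; s < kNumStates; ++s) {
        if (votes[static_cast<std::size_t>(s)] > bestVotes) {
            bestVotes = votes[static_cast<std::size_t>(s)];
            best = s;
        }
    }
    return best;
}
```

Running the most popular state keeps the most SIMD lanes busy per step; lanes in other states idle until their state is chosen, which is the cost of divergence.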
Summary
Key concepts
Think of scheduling the pipeline as mapping tasks onto resources.
Preallocate resources before launching a task.
Preallocation helps ensure forward progress and prevent deadlock.
Graphics is irregular.
Dynamically generating, aggregating and redistributing tasks at amplification points regains coherence and load balance.
Order matters.
Carefully structure task redistribution to maintain ordering.
Why don't we have dynamic resource allocation?
e.g. recursion, malloc() in shaders
Static preallocation of resources guarantees forward progress.
Tasks which outgrow available resources can stall, causing deadlock.
Geometry Shaders are slow because they allow dynamic amplification in shaders.
Pick your poison:
Always stream through DRAM. Exemplar: ATI R600
Smooth falloff for large amplification, but very slow for small amplification (DRAM latency).
Scale down parallelism to fit. Exemplar: NVIDIA G80
Fast for small amplification, poor shader throughput (low parallelism) for large amplification.
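The "scale down parallelism to fit" option comes down to simple arithmetic: if every thread's worst-case output must be preallocated in a fixed buffer, the declared amplification limit caps the thread count. The function and all numbers below are hypothetical illustrations, not figures for any real GPU.

```cpp
#include <algorithm>
#include <cstddef>

// How many shader threads can run at once if each must have its worst-case
// output preallocated in a fixed on-chip buffer (all parameters hypothetical).
std::size_t threadsThatFit(std::size_t bufferBytes,
                           std::size_t maxOutputsPerThread,
                           std::size_t bytesPerOutput,
                           std::size_t maxThreads) {
    const std::size_t perThread = maxOutputsPerThread * bytesPerOutput;
    if (perThread == 0) return maxThreads;  // no output, no constraint
    return std::min(maxThreads, bufferBytes / perThread);
}
```

With a 16 KB buffer and 64-byte outputs, a limit of 1 output per thread allows 256 threads, while a limit of 64 outputs per thread allows only 4, regardless of how many outputs each thread actually emits.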
Why isn't rasterization programmable?
Yes, partly because it is computationally intensive, but:
It is highly irregular.
It must generate and aggregate regular output.
It must integrate with an order-preserving redistribution mechanism.
Questions for the future
Can we relax the strict ordering requirements?
Can you build a generic scheduler for application-defined pipelines?
What application-specific information would a generic scheduler need to work well?
Starting points to learn more
The next step: parallel primitive processing
Eldridge et al. Pomegranate: A Fully Scalable Graphics Architecture. SIGGRAPH 2000.
Tim Purcell. Fast Tessellated Rendering on Fermi GF100. Hot3D, HPG 2010.
Scheduling cyclic graphs, in software, on current GPUs
Parker et al. OptiX: A General Purpose Ray Tracing Engine. SIGGRAPH 2010.
Details of the ARM Mali design
Tom Olson. Mali-400 MP: A Scalable GPU for Mobile Devices. Hot3D, HPG 2010.
Thank you
Special thanks:
Tim Purcell, Steve Molnar, Henry Moreton, Steve Parker, Austin Robi…
Jeremy Sugerman - Stanford
Mike Houston - AMD
Mike Doggett - Lund University
Tom Olson - ARM