05-schedulingGraphicsPipeline-BPS2011-ragankelley

    Beyond Programmable Shading 2011

    Scheduling the Graphics Pipeline

    Jonathan Ragan-Kelley, MIT CSAIL

    9 August 2011


    This talk

    How to think about scheduling GPU-style pipelines

    Four constraints which drive scheduling decisions

    Examples of these concepts in real GPU designs

    Goals

    Know why GPUs, APIs impose the constraints they do.

    Develop intuition for what they can do well.

    Understand key patterns for building your own pipelines.


    First, some definitions

    Scheduling [n.]: Assigning computations and data to resources in space and time.

    Task [n.]: A single, discrete unit of work.

    The workload: Direct3D

    [Figure: the Direct3D pipeline; label: dataflow]


    The machine: a modern GPU

    Scheduling a draw call as a series of tasks

    [Figure: task timeline; axis: time]

    An efficient schedule keeps hardware busy

    [Figure: task timeline; axis: time]

    Choosing which tasks to run when (and where)

    Resource constraints
    Tasks can only execute when there are sufficient resources for their computation and their data.

    Coherence
    Control coherence is essential to shader core efficiency.
    Data coherence is essential to memory and communication efficiency.

    Load balance
    Irregularity in execution time creates bubbles in the pipeline schedule.

    Ordering
    Graphics APIs define strict ordering semantics, which restrict possible schedules.

    Resource constraints limit scheduling

    [Figure: task timeline; axis: time; some slots marked "???"]

    Key concept: Preallocation of resources helps guarantee forward progress.
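    The preallocation idea can be sketched in a few lines of C. This is a minimal illustration, not how any particular GPU does it: the CorePool and TaskNeeds types, their fields, and the try_launch/retire helpers are all hypothetical names. The point is only that a task reserves its worst-case footprint up front or does not start at all, so a running task never has to wait for more resources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical resource pool for one core (fields and units are assumptions). */
typedef struct {
    int free_registers;
    int free_output_slots;
} CorePool;

/* Hypothetical worst-case footprint of a task, known before it runs. */
typedef struct {
    int registers;
    int output_slots;
} TaskNeeds;

/* Reserve everything the task could need, or launch nothing at all. */
bool try_launch(CorePool *pool, TaskNeeds needs) {
    assert(needs.registers >= 0 && needs.output_slots >= 0);
    if (needs.registers > pool->free_registers ||
        needs.output_slots > pool->free_output_slots) {
        return false;   /* task stays queued and holds nothing while it waits */
    }
    pool->free_registers    -= needs.registers;
    pool->free_output_slots -= needs.output_slots;
    return true;
}

/* Return the reservation when the task retires. */
void retire(CorePool *pool, TaskNeeds needs) {
    pool->free_registers    += needs.registers;
    pool->free_output_slots += needs.output_slots;
}
```

    Because a waiting task holds no partial reservation, tasks that are already running can always finish and free their resources.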

    Coherence is a balancing act

    Intrinsic tension between:

    Horizontal (control, fetch) coherence and vertical (producer-consumer) locality.

    Locality and load balance.

    Graphics workloads are irregular

    But: Shaders are optimized for regular, self-similar work.

    Imbalanced work creates bubbles in the task schedule.

    Solution:

    Dynamically generating and aggregating tasks isolates irregularity and recaptures coherence.

    Redistributing tasks restores load balance.

    Redistribution after irregular amplification

    [Figure: task timeline; axis: time]

    Key concept: Managing irregularity by dynamically generating, aggregating, and redistributing tasks.
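    A toy C model shows why redistribution helps after an amplifying stage. The core count and the round-robin policy are assumptions chosen for illustration; real hardware may redistribute differently.

```c
#include <assert.h>

#define NUM_CORES 4   /* hypothetical core count */

/* Without redistribution: each core keeps the work it amplified itself,
   so the busiest core determines the finish time. */
int max_load_static(const int produced[NUM_CORES]) {
    int worst = 0;
    for (int c = 0; c < NUM_CORES; c++)
        if (produced[c] > worst) worst = produced[c];
    return worst;
}

/* With redistribution: outputs are aggregated into one stream and dealt
   back out round-robin, so per-core load differs by at most one task. */
int max_load_redistributed(const int produced[NUM_CORES], int load[NUM_CORES]) {
    int next = 0, worst = 0;
    for (int c = 0; c < NUM_CORES; c++) load[c] = 0;
    for (int c = 0; c < NUM_CORES; c++) {
        assert(produced[c] >= 0);
        for (int i = 0; i < produced[c]; i++) {
            load[next]++;
            next = (next + 1) % NUM_CORES;
        }
    }
    for (int c = 0; c < NUM_CORES; c++)
        if (load[c] > worst) worst = load[c];
    return worst;
}
```

    With outputs of 1, 9, 0 and 2 tasks per core, the static schedule is bounded by the core holding 9 tasks, while the redistributed schedule gives every core 3.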

    Ordering

    Rule:

    All framebuffer updates must appear as though all triangles were drawn in strict sequential order.

    Key concept: Carefully structuring task redistribution to maintain API ordering.
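    One common way to keep results in submission order while work completes out of order is a reorder buffer. The sketch below is an assumption about how such a mechanism could look, not a description of any specific GPU; the ReorderBuffer type, the WINDOW size, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define WINDOW 8   /* max primitives in flight; hypothetical size */

typedef struct {
    bool done[WINDOW];    /* done[seq % WINDOW]: primitive seq has finished */
    int  next_to_commit;  /* sequence number of the oldest uncommitted primitive */
} ReorderBuffer;

void rob_init(ReorderBuffer *rob) {
    for (int i = 0; i < WINDOW; i++) rob->done[i] = false;
    rob->next_to_commit = 0;
}

/* A core reports that primitive `seq` finished shading (in any order). */
void rob_complete(ReorderBuffer *rob, int seq) {
    assert(seq >= rob->next_to_commit && seq < rob->next_to_commit + WINDOW);
    rob->done[seq % WINDOW] = true;
}

/* Commit finished primitives strictly in sequence order.
   Writes the committed sequence numbers to out[] and returns how many. */
int rob_commit(ReorderBuffer *rob, int out[], int max_out) {
    int n = 0;
    while (n < max_out && rob->done[rob->next_to_commit % WINDOW]) {
        rob->done[rob->next_to_commit % WINDOW] = false;
        out[n++] = rob->next_to_commit++;
    }
    return n;
}
```

    Primitives 1 and 2 may finish before primitive 0, but nothing reaches the framebuffer until 0 is done; then all three commit in order.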


    Building a real pipeline

    Static tile scheduling

    Multiple cores: 1 front-end, n back-end.

    The simplest thing that could possibly work.

    [Figure: static tile scheduling]

    Locality: captured within tiles.

    Resource constraints: static = simple.

    Ordering: single front-end, sequential processing within each tile.
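    A minimal C sketch of the front-end binning step, under assumed parameters: the tile size, grid dimensions, back-end count, and the interleaved tile-to-core mapping are all hypothetical choices, not taken from the talk.

```c
#include <assert.h>

#define TILE_SIZE    16  /* pixels per tile edge; hypothetical */
#define TILES_X      4
#define TILES_Y      4
#define NUM_BACKENDS 4

/* Static assignment: a tile always belongs to the same back-end core. */
int backend_for_tile(int tx, int ty) {
    assert(tx >= 0 && tx < TILES_X && ty >= 0 && ty < TILES_Y);
    return (ty * TILES_X + tx) % NUM_BACKENDS;
}

/* Front-end step: for a triangle's on-screen bounding box (inclusive pixel
   coordinates), add one tile task to the owning back-end of each covered tile. */
void bin_triangle(int x0, int y0, int x1, int y1, int tasks[NUM_BACKENDS]) {
    int tx0 = x0 / TILE_SIZE, ty0 = y0 / TILE_SIZE;
    int tx1 = x1 / TILE_SIZE, ty1 = y1 / TILE_SIZE;
    for (int ty = ty0; ty <= ty1; ty++)
        for (int tx = tx0; tx <= tx1; tx++)
            tasks[backend_for_tile(tx, ty)]++;
}
```

    With this mapping a tall, thin triangle covering one tile column sends all four of its tile tasks to back-end 0, while a wide, short one spreads evenly. The static assignment cannot react to either case.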

    The problem: load imbalance

    Only one task creation point.

    No dynamic task redistribution.

    Sort-last fragment shading

    Exemplar: NVIDIA

    Redistrib… fragment…

    But how… maintain…

    Comp… shad… async…

    Unified shaders

    Exemplar: NVIDIA

    Solve load balance by time-multiplexing different stages onto shared processors according to load.

    Unified Shaders: time-multiplexing

    Exemplar: NVIDIA

    [Figure: stages time-multiplexed onto shared processors; axis: time]

    Prioritizing the logical pipeline

    [Figure: logical pipeline with stage priorities 0 to 5 and fixed-size queue storage]

    Scheduling the pipeline

    [Figure: pipeline schedule; axis: time]

    Queue sizes and backpressure provide a natural knob for balancing horizontal batch coherence and producer-consumer locality.
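    The interaction of priorities, fixed-size queues, and backpressure can be sketched as follows. This is a simplified model under stated assumptions: three stages, one output per input, later stages given higher priority, and a hypothetical QUEUE_CAP. None of these specifics come from the talk.

```c
#include <assert.h>

#define NUM_STAGES 3
#define QUEUE_CAP  4   /* fixed-size queue storage; hypothetical capacity */

/* queue_len[s] = items waiting as input to stage s. */
typedef struct {
    int queue_len[NUM_STAGES];
} Pipeline;

/* A stage can run when it has input and its consumer's queue has room
   (backpressure). The last stage is assumed never to block on output. */
int can_run(const Pipeline *p, int s) {
    assert(s >= 0 && s < NUM_STAGES);
    int has_input = p->queue_len[s] > 0;
    int has_room  = (s == NUM_STAGES - 1) || p->queue_len[s + 1] < QUEUE_CAP;
    return has_input && has_room;
}

/* Assumed policy: later stages have higher priority, so in-flight work
   drains before new work is admitted. Returns -1 if nothing can run. */
int pick_stage(const Pipeline *p) {
    for (int s = NUM_STAGES - 1; s >= 0; s--)
        if (can_run(p, s)) return s;
    return -1;
}

/* Run one item through stage s: consume one input, produce one output. */
void run_stage(Pipeline *p, int s) {
    assert(can_run(p, s));
    p->queue_len[s]--;
    if (s + 1 < NUM_STAGES) p->queue_len[s + 1]++;
}
```

    In this model a larger QUEUE_CAP lets more items of one stage accumulate before the producer is stalled, which favors batch coherence. A smaller one forces consumers to run sooner, which favors producer-consumer locality.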

    A real computational graphics pipeline

    [Figure: ray tracing pipeline diagram. Labels: Host, Entry Points, Ray Shading, Traversal, Trace, Ray Generation Program, Intersection Program, Any Hit Program, Closest Hit Program, Selector Visit Program, Miss Program, Exception Program, Buffers, Texture Samplers, Variables]

    Pipeline abstraction for ray tracing

    Application functionality in shader-style programs:
    Intersecting primitives
    Shading surfaces, firing rays

    Pipeline structure:
    Traversal, acceleration
    Order of execution
    Resource management

    Issues in scheduling a ray tracer

    Breadth-first or depth-first traversal?
    Wide execution aggregates more potentially coherent work.
    Depth-first execution reduces footprint needed.
    OptiX: as wide as the machine, but no wider.

    Extracting SIMD coherence
    Shader core requires SIMD batches for efficiency.
    Rays may diverge.

    Ray tracing on a SIMD machine

    [Figure: two panels, scalar ray tracing and packet tracing, each showing traverse steps]

    Breaking packets: SIMT ray tracing [Aila & Laine]

    Allow data divergence
    Different rays traverse, intersect different parts of the scene.

    Maintain control (SIMD) coherence
    All rays in bundle either traverse or intersect together.

    /* Pseudocode: traverse() and intersect() are assumed to update state. */
    while (state != DONE) {
        if (state == TRAVERSE)  traverse();
        if (state == INTERSECT) intersect();
    }

    A pipeline program as a state machine

    while (myState != DONE) {
        nextState = scheduler();        /* scheduler chooses the state to run */
        if (myState == nextState)       /* only work in that state executes */
            switch (myState) {
            case 0: myState = traverse();     break;
            case 1: myState = intersector1(); break;
            case 2: myState = intersector2(); break;
            case 3: myState = shader1();      break;
            case 4: myState = shader2();      break;
            }
    }
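    The slides do not say how scheduler() chooses the next state. The sketch below assumes one simple policy, running whichever state the most lanes are waiting in, and simulates a single SIMD batch with toy transitions. The lane count, state count, and the run_state helper are hypothetical.

```c
#include <assert.h>

#define LANES      8
#define NUM_STATES 3
#define DONE       NUM_STATES   /* sentinel: lane has finished */

/* Assumed policy: run the state held by the most lanes (lowest state wins ties). */
int scheduler(const int lane_state[LANES]) {
    int count[NUM_STATES] = {0};
    for (int i = 0; i < LANES; i++)
        if (lane_state[i] != DONE) count[lane_state[i]]++;
    int best = DONE;
    for (int s = 0; s < NUM_STATES; s++)
        if (count[s] > 0 && (best == DONE || count[s] > count[best])) best = s;
    return best;   /* DONE once every lane has finished */
}

/* Toy transition: each state simply advances to the next one. */
int run_state(int state) {
    assert(state >= 0 && state < NUM_STATES);
    return state + 1;
}

/* Execute the whole batch; returns the number of SIMD steps issued. */
int run_batch(int lane_state[LANES]) {
    int steps = 0;
    for (;;) {
        int next = scheduler(lane_state);
        if (next == DONE) return steps;
        for (int i = 0; i < LANES; i++)   /* lanes in other states sit idle */
            if (lane_state[i] == next) lane_state[i] = run_state(lane_state[i]);
        steps++;
    }
}
```

    Starting from a divergent batch, the vote lets lagging lanes catch up with the rest, so later steps run with more lanes active.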

    Summary

    Key concepts

    Think of scheduling the pipeline as mapping tasks onto resources.

    Preallocate resources before launching a task.
    Preallocation helps ensure forward progress and prevent deadlock.

    Graphics is irregular.
    Dynamically generating, aggregating and redistributing tasks at amplification points regains coherence and load balance.

    Order matters.
    Carefully structure task redistribution to maintain ordering.

    Why don't we have dynamic resource allocation?
    e.g. recursion, malloc() in shaders

    Static preallocation of resources guarantees forward progress.

    Tasks which outgrow available resources can cause deadlock.

    Geometry Shaders are slow because they allow dynamic amplification in shaders.

    Pick your poison:

    Always stream through DRAM. Exemplar: ATI R600.
    Smooth falloff for large amplification, but very slow for small amplification (DRAM latency).

    Scale down parallelism to fit. Exemplar: NVIDIA G80.
    Fast for small amplification, poor shader throughput (reduced parallelism) for large amplification.
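    The "scale down parallelism to fit" option comes down to simple arithmetic: if every task must preallocate its worst-case output, the declared maximum amplification limits how many tasks fit on chip at once. The function below is an illustrative sketch; the buffer size, record size, and hardware task limit used with it are made-up numbers, not figures for any real GPU.

```c
#include <assert.h>

/* Number of tasks that can run concurrently when each one must
   preallocate space for its worst-case output (all parameters hypothetical). */
int concurrent_tasks(int buffer_bytes, int max_outputs_per_task,
                     int bytes_per_output, int hw_max_tasks) {
    assert(buffer_bytes > 0 && max_outputs_per_task > 0);
    assert(bytes_per_output > 0 && hw_max_tasks > 0);
    int fit = buffer_bytes / (max_outputs_per_task * bytes_per_output);
    return fit < hw_max_tasks ? fit : hw_max_tasks;
}
```

    With a 16 KB buffer, 64-byte outputs, and a limit of 32 tasks, a maximum amplification of 4 still allows all 32 tasks, while a maximum amplification of 256 leaves room for only one.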


    Why isn't rasterization programmable?

    Yes, partly because it is computationally intensive, but:

    It is highly irregular.

    It must generate and aggregate regular output.

    It must integrate with an order-preserving redistribution mechanism.


    Questions for the future

    Can we relax the strict ordering requirements?

    Can you build a generic scheduler for application-defined pipelines?

    What application-specific information would a generic scheduler need to work well?

    Starting points to learn more

    The next step: parallel primitive processing
    Eldridge et al. Pomegranate: A Fully Scalable Graphics Architecture. SIGGRAPH 2000.
    Tim Purcell. Fast Tessellated Rendering on Fermi GF100. Hot3D, HPG 2010.

    Scheduling cyclic graphs, in software, on current GPUs
    Parker et al. OptiX: A General Purpose Ray Tracing Engine. SIGGRAPH 2010.

    Details of the ARM Mali design
    Tom Olson. Mali-400 MP: A Scalable GPU for Mobile Devices. Hot3D, HPG 2010.

    Thank you

    Special thanks:
    Tim Purcell, Steve Molnar, Henry Moreton, Steve Parker, Austin Robi…
    Jeremy Sugerman - Stanford
    Mike Houston - AMD
    Mike Doggett - Lund University
    Tom Olson - ARM