Date post: 04-Jan-2016
Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek

Saarland University, Germany

B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes

Sven Woop, Gerd Marmitt, Philipp Slusallek

Outline

• Previous Work

• B-KD Tree as new Spatial Index Structure

• DynRT Architecture

• Traversal Processing Unit

• Update Processor

• Prototype Implementation

• Live Demo

• Conclusion

Previous Work

• Ray Tracers for Static Scenes

• CPU based: [OpenRT], [MLRT SIGGRAPH05]

• GPU based: Purcell (Grids) [SIGGRAPH02], Foley et al. (KD Trees) [GH05]

• Custom Hardware: Commercial Hardware (ART-VPS); Schmittler (KD Trees) [GH04]; RPU (KD Trees) [SIGGRAPH05]

• Ray Tracers for Dynamic Scenes

• CPU based: Wald (Grids) [SIGGRAPH06]; Wald (AABVHs) [TOG / Tech. Rep. 2006]

• Custom Hardware: Woop (B-KD Trees) [GH06]

Definition of B-KD Trees

B-KD Tree (Bounded KD-Tree)

• Binary Tree

• 1D bounding intervals for each child

• Leaf nodes point to a single primitive

[Figure: a B-KD tree node with its split axis and the bounding intervals of its children T0 and T1]
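
The definition above can be sketched as a small data structure. This is only an illustration: the class and field names (`Leaf`, `Inner`, `axis`, `bounds`, `children`, `primitive`) are assumptions made for this example and do not describe the node layout used by the hardware.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass
class Leaf:
    """Leaf node: points to exactly one primitive (here: an index)."""
    primitive: int


@dataclass
class Inner:
    """Inner node of a binary tree with one 1D interval per child."""
    axis: int  # split axis: 0 = x, 1 = y, 2 = z
    # (lo, hi) interval along `axis` for child 0 and child 1
    bounds: Tuple[Tuple[float, float], Tuple[float, float]]
    children: Tuple["Node", "Node"]


Node = Union[Leaf, Inner]


def count_primitives(node: Node) -> int:
    """Count the leaves, i.e. the primitives referenced by the tree."""
    if isinstance(node, Leaf):
        return 1
    return sum(count_primitives(child) for child in node.children)


# Two children whose intervals overlap between 1.5 and 2.0 on the x axis.
tree = Inner(axis=0, bounds=((0.0, 2.0), (1.5, 4.0)),
             children=(Leaf(0), Leaf(1)))
```

Unlike a KD tree, which stores a single split plane, each child carries its own interval, so the two intervals may overlap or leave a gap.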

B-KD Tree Subdivision

• Bounding Volume Hierarchy (partially unbounded)

• Each node can be associated with a full bounding box

• Bounds may overlap

Primitives in single leaf nodes

More traversal steps than for a KD tree

Support for dynamic scenes

[Figure: subdivision by the bounds of nodes T0 and T1 and their children T00, T01, T10, T11]

B-KD Tree Construction

• If #primitives > 1 then

• Compute center of mass

• Spatial Median

• Object Median

B-KD Tree Construction

• If #primitives > 1 then

• Compute center of mass

• Sort geometry along all three dimensions

• Partitionings can be determined by splitting a list at a position

• Build all possible partitionings in all three dimensions

• Find the partitioning with smallest SAH cost

• Create node and recurse

• Else if #primitives = 1 then

• Create leaf node
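
The construction steps listed above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' builder: the slides do not give the SAH formula, so the common form (surface area of each side's box times its primitive count) is assumed, "center of mass" is read as the triangle centroid, and the tuple-based node encoding and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]
Tri = Tuple[Vec, Vec, Vec]
Box = Tuple[Vec, Vec]


def tri_box(t: Tri) -> Box:
    """Axis-aligned box around the three vertices of a triangle."""
    lo = tuple(min(v[k] for v in t) for k in range(3))
    hi = tuple(max(v[k] for v in t) for k in range(3))
    return lo, hi


def merge(a: Box, b: Box) -> Box:
    return tuple(map(min, a[0], b[0])), tuple(map(max, a[1], b[1]))


def area(b: Box) -> float:
    dx, dy, dz = (b[1][k] - b[0][k] for k in range(3))
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def build(tris: Sequence[Tri], ids: List[int]):
    """Return ("leaf", id) or ("inner", axis, bounds, left, right)."""
    if len(ids) == 1:
        return ("leaf", ids[0])
    best = None
    n = len(ids)
    for axis in range(3):
        # Sort primitives by centroid along this axis.
        order = sorted(ids, key=lambda i: sum(v[axis] for v in tris[i]) / 3.0)
        # left[j] bounds order[:j + 1], right[j] bounds order[j:].
        left = [tri_box(tris[order[0]])]
        for i in order[1:]:
            left.append(merge(left[-1], tri_box(tris[i])))
        right = [tri_box(tris[order[-1]])]
        for i in reversed(order[:-1]):
            right.append(merge(right[-1], tri_box(tris[i])))
        right.reverse()
        # Every split position of the sorted list is one partitioning.
        for pos in range(1, n):
            lbox, rbox = left[pos - 1], right[pos]
            cost = area(lbox) * pos + area(rbox) * (n - pos)  # assumed SAH
            if best is None or cost < best[0]:
                best = (cost, axis, order[:pos], order[pos:], lbox, rbox)
    _, axis, l_ids, r_ids, lbox, rbox = best
    bounds = ((lbox[0][axis], lbox[1][axis]), (rbox[0][axis], rbox[1][axis]))
    return ("inner", axis, bounds, build(tris, l_ids), build(tris, r_ids))


# Two clusters of triangles, far apart along x.
tris = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    ((0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)),
    ((10.0, 0.0, 0.0), (11.0, 0.0, 0.0), (10.0, 1.0, 0.0)),
    ((10.0, 0.0, 1.0), (11.0, 0.0, 1.0), (10.0, 1.0, 1.0)),
]
root = build(tris, [0, 1, 2, 3])
```

For this input the cheapest partitioning separates the two clusters along x, so the root stores the intervals [0, 1] and [10, 11]. The sketch re-sorts at every level for brevity; a production builder would sort once and reuse the lists.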

B-KD Tree Construction

• Rendering Performance

• 20% to 100% better than center splitting approaches

• Two-level B-KD Trees

• Top-level B-KD tree over object instances

• Bottom-level B-KD tree for each object

B-KD Trees for Dynamic Scenes

• On changed object geometry

• B-KD tree bounds are updated from bottom up

• B-KD tree structure remains constant

Linear updating complexity

Examples

• Bounding approaches perform well for

• Continuous motion

• Structure of motion must match tree structure

• E.g. skinned meshes, characters, water surfaces, ...

Examples

• Bounding volume approaches are less efficient for

• Non-continuous motion

• Structure of motion does not match tree structure

• High traversal cost due to large overlapping boxes

Examples

• Bounding volume approaches fail for

• Non-continuous motion

• Structure of motion does not match tree structure

• High traversal cost due to large overlapping boxes

Comparison for Gael Scene

52k triangles

Index type   Index size   # trav-cost   # tri-ints
KD           1.4 MB       31            4.8
B-KD         1.1 MB       116           6.8
AABVH        2.2 MB       253           5.3

[Figure panels: KD tree, B-KD tree, AABVH]

DynRT Architecture

• Extension of RPU approach

[Figure: DynRT block diagram. Traversal Processing Unit, Geometry Unit, and Shading Unit read from memory through the Node Cache, Vertex Cache, and Shader Cache (each 128 bit wide); the Shading Unit writes to the framebuffer. The Skinning Processor reads instructions and vertices from memory and writes vertices to memory; the Update Processor reads instructions from memory and writes nodes to memory.]

DynRT Architecture

• Rendering Units

• Highly multi-threaded

• Higher hardware usage

• Synchronous execution of packets of 4 rays

• Memory bandwidth reduction

• First level caches

• Memory bandwidth reduction

DynRT Architecture

• Programmable Shading Unit

• Similar to RPU shading processor

• Ray generation tasks

• Material shading

• Calls Ray Casting Units to cast rays

DynRT Architecture

• Programmable Shading Unit

• Ray Casting Units

• Traversal Processing Unit

• Efficient traversal of B-KD trees

• Two-level B-KD trees supported

• Geometry Unit

• Ray transformations

• Vertex-based ray/triangle intersection [Möller Trumbore]

• Shared vertices save memory 6x

DynRT Architecture

• Programmable Shading Unit

• Ray Casting Units

• Scene Changes

• Skinning Processor (see paper)

• Skeleton Subspace Deformation

• Re-uses Geometry Unit

• Pure stream architecture

• Update Processor

• Stream-like architecture

• Partial breadth-first execution

• One B-KD node update per clock cycle (peak)

Traversal of B-KD Trees

• Early ray termination

• Clipping of near/far interval against both bounding intervals

• Take closer child, push farther child to stack

• Traversal order does not affect correctness

Complexity

• 4x the computational cost of a KD tree traversal step

• 2x stack memory

[Figure: ray R with its near/far interval and the bounding intervals of the closer and farther child]
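
The traversal rules above can be sketched for a single ray in software. This is a hedged illustration, not the hardware algorithm: the tuple node encoding `("leaf", prim)` / `("inner", axis, bounds, left, right)`, the `intersect` callback, and the function names are assumptions made for this example.

```python
def clip(o, d, lo, hi, near, far):
    """Clip the ray interval [near, far] against the slab lo <= x <= hi."""
    if d == 0.0:
        # Ray parallel to the slab: either fully inside or fully outside.
        return (near, far) if lo <= o <= hi else None
    t0, t1 = (lo - o) / d, (hi - o) / d
    if t0 > t1:
        t0, t1 = t1, t0
    near, far = max(near, t0), min(far, t1)
    return (near, far) if near <= far else None


def traverse(root, origin, direction, intersect, t_max=float("inf")):
    """Return (t, primitive) of the closest hit, or None if nothing is hit."""
    best_t, best_prim = t_max, None
    stack = [(root, 0.0, t_max)]
    while stack:
        node, near, far = stack.pop()
        if near > best_t:
            continue  # early ray termination: subtree lies behind the hit
        far = min(far, best_t)
        if node[0] == "leaf":
            t = intersect(node[1], origin, direction)
            if t is not None and 0.0 <= t < best_t:
                best_t, best_prim = t, node[1]
            continue
        _, axis, bounds, left, right = node
        hits = []
        for child, (lo, hi) in zip((left, right), bounds):
            iv = clip(origin[axis], direction[axis], lo, hi, near, far)
            if iv is not None:
                hits.append((iv[0], iv[1], child))
        # Push the farther child first so the closer child is popped next.
        hits.sort(key=lambda h: h[0], reverse=True)
        for n, f, child in hits:
            stack.append((child, n, f))
    return None if best_prim is None else (best_t, best_prim)


# Toy scene: each primitive is a plane x = c, hit at t = (c - ox) / dx.
centers = {0: 1.5, 1: 3.5}
calls = []


def hit_plane(prim, o, d):
    calls.append(prim)
    return (centers[prim] - o[0]) / d[0]


root = ("inner", 0, ((1.0, 2.0), (3.0, 4.0)), ("leaf", 0), ("leaf", 1))
result = traverse(root, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), hit_plane)
```

Because child bounds may overlap, the sketch cannot stop at the first hit; it instead skips any stacked subtree whose entry distance lies beyond the best hit found so far. Each stack entry stores a node together with its near/far interval.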

Traversal Processing Unit

• Stack control computes next address

• Next node is fetched from cache

• 4 traversal slices compute 4x4 distances to bounding planes

• 4 Decision Units compute per ray traversal decision

• Packet Decision Unit computes packet traversal decision

• Packet goes left if there exists a ray that goes left

• Packet goes right if there exists a ray that goes right

• Packet goes from left to right if there exists a ray that goes into both children from left to right

Incoherent packets possible

[Figure: Traversal Processing Unit. Stack Control Unit (start traversal; finished if stack empty), Memory Access Unit fed by the Node Cache (128 bit wide, from memory), traversal Slices 0 to 3, Decide units 0 to 3, and the Packet Decision Unit, with output to the Geometry Unit.]
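
The packet rules above can be sketched as a small function that combines the four per-ray decisions. The `RayDecision` type and the returned visit order are hypothetical; in particular, the order used when both children are visited but no single ray enters both from left to right is an assumption, since the slides do not specify that case.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RayDecision:
    """Per-ray result of one Decision Unit (illustrative encoding)."""
    left: bool                # ray overlaps the left child's interval
    right: bool               # ray overlaps the right child's interval
    left_first: bool = True   # only meaningful if left and right are set


def packet_decision(decisions: Iterable[RayDecision]) -> List[str]:
    """Return the children the whole packet visits, in visiting order."""
    rays = list(decisions)
    go_left = any(r.left for r in rays)
    go_right = any(r.right for r in rays)
    if go_left and go_right:
        # Left to right if some ray enters both children in that order.
        l2r = any(r.left and r.right and r.left_first for r in rays)
        return ["left", "right"] if l2r else ["right", "left"]
    if go_left:
        return ["left"]
    if go_right:
        return ["right"]
    return []


# An incoherent packet: the rays disagree, so both children are visited.
mixed = packet_decision([
    RayDecision(left=True, right=False),
    RayDecision(left=True, right=True, left_first=True),
    RayDecision(left=False, right=True),
    RayDecision(left=False, right=False),
])
```

A packet may therefore visit children that some of its rays miss, or visit them in an order that is not front to back for every ray. Since traversal order does not affect correctness, this costs extra work but not wrong results.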

Update of B-KD Trees

• Leaf Node

• Fetch vertices

• Compute leaf boxes

• Inner Node

• Update 1D node bounds

• Merge boxes of both children

[Figure: boxes of three leaf nodes in the x/y plane]
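
The update steps above amount to a bottom-up pass over a tree whose structure stays fixed. A minimal recursive sketch, assuming a dictionary-based node encoding (`kind`, `prim`, `axis`, `bounds`, `children`) invented for this example:

```python
def refit(node, tris):
    """Recompute all 1D bounds bottom-up; return the node's full 3D box."""
    if node["kind"] == "leaf":
        verts = tris[node["prim"]]  # fetch vertices
        lo = tuple(min(v[k] for v in verts) for k in range(3))
        hi = tuple(max(v[k] for v in verts) for k in range(3))
        return lo, hi               # leaf box
    boxes = [refit(child, tris) for child in node["children"]]
    axis = node["axis"]
    # Update the 1D bounds of both children along the node's split axis.
    node["bounds"] = [(box[0][axis], box[1][axis]) for box in boxes]
    (lo0, hi0), (lo1, hi1) = boxes
    # Merge the boxes of both children into this node's box.
    return tuple(map(min, lo0, lo1)), tuple(map(max, hi0, hi1))


tris = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)),
    ((2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (2.0, 1.0, 0.0)),
]
tree = {
    "kind": "inner", "axis": 0, "bounds": [(0.0, 1.0), (2.0, 3.0)],
    "children": [{"kind": "leaf", "prim": 0}, {"kind": "leaf", "prim": 1}],
}

# The second triangle moves along x; only bounds change, not the structure.
tris[1] = ((5.0, 0.0, 0.0), (6.0, 0.0, 0.0), (5.0, 1.0, 0.0))
scene_box = refit(tree, tris)
```

Every node is visited once, which matches the linear updating complexity stated earlier in the talk.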

Update Processor

• ¼ more memory for instructions

• Optimized Instruction Set

• Load vertex

• Merge 3 vertices to a box

• Merge 2 boxes (plus update node)

• 64 Vertex and 64 Box Registers

• Optimal re-use of data

• Stream Based

• Reads one instruction stream

• Writes a sequential node stream

• Vertices are accessed as sequentially as possible

[Figure: Update Processor pipeline. Instruction Fetch (from memory, 4x32 bit instruction), Instruction Scheduler, Fetch Vertex Unit (from memory; vertex address, vertex destination), Register Access (read 2 boxes, read 3 vertices), Merge Unit (merge boxes, merge vertices), Box Writeback (merged box, 128 bit), Vertex Writeback (fetched vertex, 128 bit), and Node Update (child bounds, nodes to memory).]
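
The three instruction types listed above can be sketched as a tiny software interpreter. The tuple encodings (`"load"`, `"tri"`, `"merge"`), the operand order, and the function name are hypothetical; the real instructions are 4x32 bit words whose format the slides do not show.

```python
def run_update_program(program, vertex_memory):
    """Execute an update instruction stream; return the node stream."""
    vregs = [None] * 64   # 64 vertex registers
    bregs = [None] * 64   # 64 box registers
    nodes_out = []        # sequential node stream written to memory
    for ins in program:
        op = ins[0]
        if op == "load":      # load vertex
            _, dst, addr = ins
            vregs[dst] = vertex_memory[addr]
        elif op == "tri":     # merge 3 vertices to a box
            _, dst, a, b, c = ins
            vs = (vregs[a], vregs[b], vregs[c])
            bregs[dst] = (tuple(min(v[k] for v in vs) for k in range(3)),
                          tuple(max(v[k] for v in vs) for k in range(3)))
        elif op == "merge":   # merge 2 boxes (plus update node)
            _, dst, a, b, axis = ins
            (lo0, hi0), (lo1, hi1) = bregs[a], bregs[b]
            nodes_out.append(((lo0[axis], hi0[axis]), (lo1[axis], hi1[axis])))
            bregs[dst] = (tuple(map(min, lo0, lo1)), tuple(map(max, hi0, hi1)))
        else:
            raise ValueError(f"unknown instruction: {op!r}")
    return nodes_out


# Two triangles sharing two vertices: each vertex is loaded only once.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (3.0, 1.0, 0.0)]
program = [
    ("load", 0, 0), ("load", 1, 1), ("load", 2, 2), ("load", 3, 3),
    ("tri", 0, 0, 1, 2),
    ("tri", 1, 1, 2, 3),
    ("merge", 2, 0, 1, 0),  # parent node splits along x
]
node_stream = run_update_program(program, vertices)
```

Keeping vertices and boxes in registers is what allows shared data to be reused without fetching it from memory again.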

Page 59: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Update ProcessorUpdate Processor

• ¼ more memory for instructions

• Optimized Instruction Set

• Load vertex

• Merge 3 vertices to a box

• Merge 2 boxes (plus update node)

• 64 Vertex and 64 Box Registers

• Optimal re-use of data

• Stream Based

• Reads one instruction stream

• Writes a sequential node stream

• Vertices are accessed as sequential as possible

InstructionFetch

from memory

Fetch Vertex Unit

from memory

Merge unitMerge Boxes

Merge Vertices

Box Writeback

Vertex writeback

Register Access Read 2 Boxes

Read 3 Vertices

Node Update

nodesto memory

child bounds

vertex addressvertex destination

4x32 Bit

instruction

merged box

fetched vertex

128 Bit

128 Bit

InstructionScheduler

Page 60: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Update ProcessorUpdate Processor

• ¼ more memory for instructions

• Optimized Instruction Set

• Load vertex

• Merge 3 vertices to a box

• Merge 2 boxes (plus update node)

• 64 Vertex and 64 Box Registers

• Optimal re-use of data

• Stream Based

• Reads one instruction stream

• Writes a sequential node stream

• Vertices are accessed as sequential as possible

InstructionFetch

from memory

Fetch Vertex Unit

from memory

Merge unitMerge Boxes

Merge Vertices

Box Writeback

Vertex writeback

Register Access Read 2 Boxes

Read 3 Vertices

Node Update

nodesto memory

child bounds

vertex addressvertex destination

4x32 Bit

instruction

merged box

fetched vertex

128 Bit

128 Bit

InstructionScheduler

Page 61: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Update ProcessorUpdate Processor

• ¼ more memory for instructions

• Optimized Instruction Set

• Load vertex

• Merge 3 vertices to a box

• Merge 2 boxes (plus update node)

• 64 Vertex and 64 Box Registers

• Optimal re-use of data

• Stream Based

• Reads one instruction stream

• Writes a sequential node stream

• Vertices are accessed as sequential as possible

InstructionFetch

from memory

Fetch Vertex Unit

from memory

Merge unitMerge Boxes

Merge Vertices

Box Writeback

Vertex writeback

Register Access Read 2 Boxes

Read 3 Vertices

Node Update

nodesto memory

child bounds

vertex addressvertex destination

4x32 Bit

instruction

merged box

fetched vertex

128 Bit

128 Bit

InstructionScheduler

Page 62: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Update ProcessorUpdate Processor

• ¼ more memory for instructions

• Optimized Instruction Set

• Load vertex

• Merge 3 vertices to a box

• Merge 2 boxes (plus update node)

• 64 Vertex and 64 Box Registers

• Optimal re-use of data

• Stream Based

• Reads one instruction stream

• Writes a sequential node stream

• Vertices are accessed as sequential as possible

InstructionFetch

from memory

Fetch Vertex Unit

from memory

Merge unitMerge Boxes

Merge Vertices

Box Writeback

Vertex writeback

Register Access Read 2 Boxes

Read 3 Vertices

Node Update

nodesto memory

child bounds

vertex addressvertex destination

4x32 Bit

instruction

merged box

fetched vertex

128 Bit

128 Bit

InstructionScheduler

Page 63: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Update ProcessorUpdate Processor

• ¼ more memory for instructions

• Optimized Instruction Set

• Load vertex

• Merge 3 vertices to a box

• Merge 2 boxes (plus update node)

• 64 Vertex and 64 Box Registers

• Optimal re-use of data

• Stream Based

• Reads one instruction stream

• Writes a sequential node stream

• Vertices are accessed as sequential as possible

InstructionFetch

from memory

Fetch Vertex Unit

from memory

Merge unitMerge Boxes

Merge Vertices

Box Writeback

Vertex writeback

Register Access Read 2 Boxes

Read 3 Vertices

Node Update

nodesto memory

child bounds

vertex addressvertex destination

4x32 Bit

instruction

merged box

fetched vertex

128 Bit

128 Bit

InstructionScheduler

Page 68: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Prototype Implementation

Hardware

• FPGA board from Alpha Data

• Xilinx Virtex4 LX160

• 128 MB DDR Memory

Implementation

• Packets of 4 rays

• 32 packets of rays

• 24 bit floating point

• 66 MHz

Virtex4 Board

Page 69: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Results

Update Performance

• 66 million B-KD tree node updates per second

• 200 updates per second for characters with 80k triangles

• 1% to 15% of rendering time

Ray Casting Performance

• 2 to 8 million rays per second

• 10 to 40 fps at 512x386
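The ray casting figures above can be cross-checked with a short calculation. This sketch assumes one primary ray per pixel and no secondary rays, which the slide does not state explicitly; the helper name `primary_rays_per_second` is made up for this example.

```python
def primary_rays_per_second(width, height, fps):
    """Rays per second when exactly one primary ray is cast per pixel."""
    return width * height * fps


# Resolution and frame rates as given on the slide.
low = primary_rays_per_second(512, 386, 10)
high = primary_rays_per_second(512, 386, 40)
print(f"{low / 1e6:.2f} to {high / 1e6:.2f} million rays per second")
# prints "1.98 to 7.91 million rays per second"
```

Under that assumption the frame rates and the quoted 2 to 8 million rays per second agree.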

Page 70: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Conclusions and Future Work

• Ray Tracing Hardware Design

• Efficient for coherent dynamic scenes

• Less efficient for non-continuous scene changes

• Working Prototype Implementation

• Even the FPGA prototype achieves high performance

• 2x to 3x faster than OpenRT on a 2.6 GHz Pentium 4

• Post-layout ASIC Results [RT06]

• 90 nm, 400 MHz, 200 mm², 19.5 GB/s

• Performs up to 40x faster (80-200 fps at 1024x768)

Page 71: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Live Demo

Page 72: Saarland University, Germany B-KD Trees for Hardware Accelerated Ray Tracing of Dynamic Scenes Sven Woop Gerd Marmitt Philipp Slusallek.

Questions?

• Project Homepage: http://www.saarcor.de

• Computer Graphics Lab at Saarland University: http://graphics.cs.uni-sb.de

