Kristóf Ralovich
Budapest University of Technology and Economics
Agenda
• Quick overview of ray tracing
• Programmable pipeline
• Storing the scene
• GPU ray tracing
• Results
• Future work
Concept of ray tracing
[Figure: ray tracing setup. Labels: camera, viewport, light, occluder, diffuse object, reflective object, primary ray, shadow ray, reflected ray]
Accelerating ray shooting
• uniform grid space subdivision
• 3D DDA traversal [Amanatides & Woo]
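The 3D DDA idea can be sketched on the CPU as follows. This is a minimal illustration in the spirit of Amanatides & Woo, not the presenter's shader code: the function name `traverse_grid` and its parameters are assumptions, and the ray origin is assumed to lie inside the grid already.

```python
import math

def traverse_grid(origin, direction, grid_min, cell_size, resolution):
    """Yield the integer voxel coordinates a ray visits, in order.

    Assumes the origin is inside the grid (hypothetical helper, not from the slides).
    """
    if all(d == 0 for d in direction):
        raise ValueError("direction must be non-zero")
    voxel, step, t_max, t_delta = [], [], [], []
    for o, d, g, c, n in zip(origin, direction, grid_min, cell_size, resolution):
        i = min(int((o - g) / c), n - 1)        # starting voxel index on this axis
        voxel.append(i)
        if d > 0:
            step.append(1)
            t_max.append((g + (i + 1) * c - o) / d)   # t at the next voxel boundary
            t_delta.append(c / d)                     # t needed to cross one voxel
        elif d < 0:
            step.append(-1)
            t_max.append((g + i * c - o) / d)
            t_delta.append(-c / d)
        else:
            step.append(0)                      # ray never crosses a boundary on this axis
            t_max.append(math.inf)
            t_delta.append(math.inf)
    while all(0 <= v < n for v, n in zip(voxel, resolution)):
        yield tuple(voxel)
        axis = t_max.index(min(t_max))          # nearest boundary decides the step axis
        voxel[axis] += step[axis]
        t_max[axis] += t_delta[axis]
```

Each step only compares three `t_max` values and does one addition, which is why this traversal maps well to a small per-fragment kernel.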
Pipeline > computing stage
• draw full screen quad
• screen covered by rasterized fragments
• 1 : 1 mapping of rays to fragments
• fragment program (kernel) reads texture memory
• fragments to render target(s)
• render targets can be fed back as textures
Programmable pipeline
[Diagram: vertex processor → rasterization → fragment processor → render target (on/off-screen), with texture memory alongside; one rendering pass = one kernel execution]
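The "rendering pass = kernel execution" view can be mimicked on the CPU. The sketch below is only an analogy under stated assumptions: `run_pass`, `increment` and the `"state"` texture name are hypothetical, and nested lists stand in for textures and render targets.

```python
def run_pass(kernel, textures, width, height):
    """One 'rendering pass': run the kernel once per fragment of a full-screen quad
    and collect the results in a new render target."""
    return [[kernel(x, y, textures) for x in range(width)] for y in range(height)]

def increment(x, y, textures):
    """Toy fragment program: read the previous state at the same texel and add 1."""
    return textures["state"][y][x] + 1

state = [[0, 10], [20, 30]]
for _ in range(3):
    # The render target of one pass is fed back as the input texture of the next.
    state = run_pass(increment, {"state": state}, 2, 2)
```

Because a kernel cannot read and write the same target within a pass, each iteration produces a fresh target, which is the usual ping-pong arrangement.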
Scene encoded in textures
• 3D RGB grid texture, one texel per voxel (vox0, vox1, vox2, vox3, ... voxK): R = ptr, G = cnt, B = prox. cld.; example texels: 0 3 0, 3 2 0, 0 0 0, 5 4 0
• 2D LUMINANCE triangle list texture: L = triID; example contents: 0 2 3 0 2 4 7 8 ... 9 ...; each voxel references its own tri list (e.g. the tri lists referenced in vox0 and vox2)
• 2x 3D RGB texture for triangle data (tri0, tri1, ... triN), 4 slices each: v1, v2, v3, col and n1, n2, n3, refl; x, y, z stored in R, G, B
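One way to read the grid texture and the triangle list texture together is sketched below. It assumes that `ptr` is an offset into the flat triangle list and `cnt` is the number of entries for that voxel; the function names and the sample data are hypothetical, and the B channel is left at 0 because its exact content is not spelled out here.

```python
def pack_voxel_lists(voxel_lists):
    """Flatten per-voxel triangle ID lists into (texels, tri_list).

    texels[i] is an (R, G, B) triple: assumed to be (ptr, cnt, unused).
    """
    texels, tri_list = [], []
    for ids in voxel_lists:
        texels.append((len(tri_list), len(ids), 0))  # ptr = current end of the flat list
        tri_list.extend(ids)
    return texels, tri_list

def triangles_in_voxel(texels, tri_list, voxel_index):
    """Look up the triangle IDs referenced by one voxel."""
    ptr, cnt, _ = texels[voxel_index]
    return tri_list[ptr:ptr + cnt]

# Illustrative data only: four voxels, one of them empty.
texels, tri_list = pack_voxel_lists([[0, 2, 3], [0, 2], [], [4, 7, 8, 9]])
```

A triangle that overlaps several voxels simply appears in several runs of the list, so the triangle data itself is stored once.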
GPU ray tracer (ray generation)
Stages: ray generator → scene AABB hit? (masking) and 3D DDA initialization → traverse + intersect → shading
Stage output in render targets:
• ray origins
• ray directions
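The ray generation stage can be illustrated with a simple pinhole camera. The camera model is an assumption (the slides do not specify one): the eye looks down the negative z axis, and `generate_rays` with its `fov_y_deg` parameter is a hypothetical helper. The two returned lists play the role of the two render targets.

```python
import math

def generate_rays(width, height, eye, fov_y_deg=60.0):
    """Return (origins, directions), one entry per pixel in row-major order."""
    half_h = math.tan(math.radians(fov_y_deg) / 2)
    half_w = half_h * width / height
    origins, directions = [], []
    for j in range(height):
        for i in range(width):
            # Pixel centre mapped to the image plane at z = -1.
            x = (2 * (i + 0.5) / width - 1) * half_w
            y = (1 - 2 * (j + 0.5) / height) * half_h
            n = math.sqrt(x * x + y * y + 1)
            origins.append(eye)                      # all primary rays start at the eye
            directions.append((x / n, y / n, -1 / n))  # normalized direction
    return origins, directions
```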
GPU ray tracer (initialization)
Stage output in render targets (one texel per ray):
• 3D DDA tMax (x, y, z)
• current voxel + finished? flag (x, y, z, fin?)
• hit record: barycentric u, v + ray parameter t + triID
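The "scene AABB hit?" test that masks out rays missing the scene can be sketched with the standard slab method. The slides do not say which test is used, so this is an assumed stand-in, and `ray_aabb` is a hypothetical name. The returned entry distance is where 3D DDA initialization would start.

```python
import math

def ray_aabb(origin, direction, box_min, box_max):
    """Slab test: return (t_near, t_far) if the ray hits the box, else None."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            if o < lo or o > hi:
                return None          # parallel to this slab and outside it
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return None              # slab intervals do not overlap: ray is masked
    return t_near, t_far
```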
GPU ray tracer (traversal + intersection 1)
Stage output in render targets (one texel per ray):
• 3D DDA tMax + number of processed tris (x, y, z, #tris)
• current voxel + finished? flag (x, y, z, fin?)
• hit record: barycentric u, v + ray parameter t + triID
The pass is repeated until all rays are finished (halting condition determined by occlusion query).
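The repeat-until-finished control flow can be sketched on the CPU as below. Counting unfinished rays stands in for the occlusion query; `run_until_finished`, `countdown_kernel` and the state layout are hypothetical and only illustrate the loop, not the actual traversal kernel.

```python
def run_until_finished(states, kernel, max_passes=1000):
    """Re-run the per-ray kernel until every ray carries the finished flag."""
    passes = 0
    while passes < max_passes:
        # Stand-in for the occlusion query: how many rays still need work?
        active = sum(1 for s in states if not s["finished"])
        if active == 0:
            break
        # Finished rays are passed through untouched, like masked fragments.
        states = [s if s["finished"] else kernel(s) for s in states]
        passes += 1
    return states, passes

def countdown_kernel(state):
    """Toy kernel: a ray finishes once its remaining work reaches zero."""
    remaining = state["remaining"] - 1
    return {"remaining": remaining, "finished": remaining <= 0}

rays = [{"remaining": 3, "finished": False}, {"remaining": 1, "finished": False}]
final, n_passes = run_until_finished(rays, countdown_kernel)
```

The number of passes is set by the slowest ray, so a few long traversals keep the whole frame iterating.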
GPU ray tracer (traversal + intersection 2)
Stage output in render targets (one texel per ray):
• hit record: barycentric u, v + ray parameter t + triID
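A hit record of the form (u, v, t) is what the Möller-Trumbore ray-triangle test produces. The slides do not name the intersection routine, so the sketch below is a swapped-in, well-known method rather than the presenter's code; `intersect_triangle` and the helper names are assumptions.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect_triangle(origin, direction, v1, v2, v3, eps=1e-9):
    """Return (u, v, t) for a hit in front of the origin, else None.

    u weights v2 and v weights v3; v1 gets 1 - u - v.
    """
    e1, e2 = sub(v2, v1), sub(v3, v1)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None                  # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = sub(origin, v1)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    if t <= eps:
        return None                  # intersection is behind the ray origin
    return u, v, t
```

Keeping the closest `t` per ray, together with the triangle ID, gives the four values stored in the hit record texel.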
GPU ray tracer (shading)
Stage output in render targets (one texel per ray):
• hit record: barycentric u, v + ray parameter t + triID
• accumulated color values
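How the hit record feeds shading can be sketched as follows: the barycentric u, v interpolate the per-vertex normals n1, n2, n3, and the triangle color is scaled by a diffuse term. The Lambertian model and the name `shade_diffuse` are assumptions; the slides do not state the exact shading formula.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def shade_diffuse(u, v, n1, n2, n3, color, to_light):
    """Lambertian shading at a hit point given its barycentric coordinates.

    n1 gets weight 1 - u - v, n2 gets u, n3 gets v (assumed convention).
    """
    w = 1.0 - u - v
    normal = normalize(tuple(w * a + u * b + v * c for a, b, c in zip(n1, n2, n3)))
    light = normalize(to_light)
    k = max(0.0, sum(a * b for a, b in zip(normal, light)))  # clamp back-facing light
    return tuple(k * c for c in color)
```

In a multi-pass setup the returned color would be added to the accumulated color target, weighted by the reflection coefficient for reflected rays.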
Evaluated scenes
• Knight: 14x16x14 grid, 636 tris, 2.6 - 9.7 FPS
• Torus Knot: 16x16x16 grid, 1024 tris, 1.5 - 8.4 FPS
• Bunny (low poly): 20x20x20 grid, 1764 tris, 3.3 - 11.6 FPS
• Stanford Bunny: 128x128x128 grid, 69451 tris, 0.8 - 3.0 FPS
Experiences: Lessons learned
Pros
• ray tracing and the GPU are both parallel
• uniform grid traversal is fast and simple
Cons
• abstraction over graphics API
• no stack for recursion
• grid is not adaptive
Future
• SM 4.0 GPU: integer arithmetic, geometry feedback
• other RASs on GPU: KD-tree [Foley & Sugerman 2005], BVH [Thrane & Simonsen 2005], geometry images [Carr et al. 2006]
• better API to HW: CUDA
• different HW: Cell
Thank you for your attention!
• Questions?