Photon Mapping on Programmable Graphics Hardware

Timothy J. Purcell
Mike Cammarano
Pat Hanrahan
Stanford University

Craig Donner
Henrik Wann Jensen
University of California, San Diego

Motivation

• Interactive global illumination on the GPU
  • Nearly have sufficient compute power and flexibility
• Explore GPU-based computation algorithms

Related Work

• CPU-based interactive global illumination
  • Supercomputers [Parker et al.]
  • Clusters [Tole et al., Wald et al.]
• Global illumination on programmable GPUs
  • Ray tracing [Carr et al., Purcell et al.]
  • Photon mapping [Ma et al.]
  • Radiosity [Carr et al., Coombe et al.]
  • Translucency [Carr et al., Stamminger et al.]

Photon Mapping Algorithm Review

• Photon tracing
  • Emission, scattering, storing into kd-tree
  • Similar to ray tracing
• Rendering
  • Ray tracing for direct illumination
  • Photon map visualization
  • Indirect bounce

Computational Challenge for GPUs #1

• Constructing an irregular or sparse data structure

Computational Challenge for GPUs #2

• Adaptive nearest neighbor search
  • Noise vs. blur

Photon Mapping on the CPU

• Balanced kd-tree
  • Compact storage of photons
  • Efficient
  • O(log n) search
• Priority queue
  • Nearest neighbor search
  • Incremental insertion and removal of photons
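
The role of the priority queue can be sketched in a few lines of Python. This is a minimal illustration, not the CPU implementation referred to above: a linear scan stands in for the kd-tree traversal, and the function name `k_nearest` and its parameters are hypothetical. The max-heap keeps the k closest photons seen so far, and the search radius shrinks to the farthest kept photon once k photons are held.

```python
import heapq

def k_nearest(photons, sample, k, max_radius):
    """Return up to k (squared_distance, index) pairs nearest to sample.

    Assumption: a linear scan replaces the kd-tree traversal; only the
    priority-queue bookkeeping is the point of this sketch.
    """
    heap = []                      # max-heap via negated squared distance
    r2 = max_radius * max_radius   # current squared search radius
    for idx, p in enumerate(photons):
        d2 = sum((a - b) ** 2 for a, b in zip(p, sample))
        if d2 > r2:
            continue               # outside the current search radius
        if len(heap) < k:
            heapq.heappush(heap, (-d2, idx))      # incremental insertion
            if len(heap) == k:
                r2 = -heap[0][0]   # shrink radius to the farthest kept photon
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, (-d2, idx))   # remove farthest, insert closer
            r2 = -heap[0][0]
    return sorted((-neg_d2, idx) for neg_d2, idx in heap)
```

For example, `k_nearest([(0, 0, 0), (1, 0, 0), (2, 0, 0), (5, 0, 0)], (0, 0, 0), 2, 10)` keeps the photons at indices 0 and 1.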

Algorithmic Changes for the GPU

• Direct visualization of photon map
  • Keeps rendering costs low
• Use grid instead of kd-tree
  • Tried kd-tree…
    • Kd-tree construction is difficult
    • Radiance estimate
      – Fixed radius search works fine
      – Adaptive search needs priority queue
• No priority queue
  • Can’t build on GPU
    • Too much state
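
A uniform grid with a fixed radius search needs no priority queue, which is what makes it attractive here. The sketch below is a CPU-side illustration under stated assumptions: the names `cell_of`, `build_grid`, and `fixed_radius_search` are hypothetical, photons are 3D tuples, and a Python dictionary stands in for the grid storage.

```python
import math
from collections import defaultdict

def cell_of(point, cell_size):
    """Integer grid cell containing a 3D point."""
    return tuple(math.floor(c / cell_size) for c in point)

def build_grid(photons, cell_size):
    """Map each occupied cell to the indices of the photons inside it."""
    grid = defaultdict(list)
    for idx, p in enumerate(photons):
        grid[cell_of(p, cell_size)].append(idx)
    return grid

def fixed_radius_search(photons, grid, cell_size, sample, radius):
    """Indices of all photons within radius of sample (no priority queue)."""
    lo = cell_of([c - radius for c in sample], cell_size)
    hi = cell_of([c + radius for c in sample], cell_size)
    found = []
    # Visit every cell overlapped by the bounding box of the search sphere.
    for i in range(lo[0], hi[0] + 1):
        for j in range(lo[1], hi[1] + 1):
            for k in range(lo[2], hi[2] + 1):
                for idx in grid.get((i, j, k), ()):
                    if math.dist(photons[idx], sample) <= radius:
                        found.append(idx)
    return sorted(found)
```

The search keeps no per-query state beyond the result list, in contrast to the heap an adaptive search has to maintain.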

Contributions

• Mapped complete grid-based photon mapping algorithm onto the GPU
  • Including photon tracing, ray tracing, etc.
• Implemented an adaptive k-nearest neighbor search
  • kNN-grid
• Show how to construct a sparse data structure on the GPU
  • Bitonic merge sort with binary search
  • Stencil routing

Configuring the GPU for Computing

• GPU as data parallel compute engine
  • Fragment programs execute compute kernels
  • Screen sized quad initializes computation
    • SIMD execution
• Floating point texture memory
  • Render-to-texture for intermediate results
  • Data structure storage
    • Pointer dereferencing via dependent fetches

Computational Challenge #1
Building a Sparse Data Structure

Building a Sparse Data Structure

• Requires scatter
  • Dependent texture write
• Why don’t we have fragment scatter?
  • Fragment processing has highly coherent blocked memory writes
  • Extra hardware support would be needed
    • Write hazards
    • Memory latencies

Scatter on the GPU

• Sort photons into grid cells
  • Grid cell is sort key
• Simulate scatter with fragment programs
  • Bitonic merge sort followed by binary search
    • Compact grid
    • O(log² n) rendering passes

Bitonic Merge Sort

[Figure: bitonic merge sort network applied to eight keys (1 through 8), one column per pass, ending in the sorted order 1 2 3 4 5 6 7 8.]

O(log² n) rendering passes
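
The pass structure can be sketched on the CPU as follows. This is an illustration under assumptions, not the fragment program from the talk: the function name `bitonic_sort_passes` is hypothetical, the input length must be a power of two, and each inner loop over `i` plays the role of one rendering pass in which every output element is computed independently from two reads of the previous pass (a gather, with no scatter).

```python
def bitonic_sort_passes(keys):
    """Sort keys with a bitonic network; return (sorted_list, pass_count).

    Assumption: len(keys) is a power of two.
    """
    n = len(keys)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    data = list(keys)
    passes = 0
    k = 2                      # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2             # distance to the compare-exchange partner
        while j >= 1:
            out = [None] * n
            for i in range(n):             # one "fragment" per element
                partner = i ^ j
                ascending = (i & k) == 0   # sort direction of this block
                lo = min(data[i], data[partner])
                hi = max(data[i], data[partner])
                # The lower index keeps the minimum in ascending blocks
                # and the maximum in descending blocks.
                take_min = (i < partner) == ascending
                out[i] = lo if take_min else hi
            data = out
            passes += 1
            j //= 2
        k *= 2
    return data, passes
```

With eight keys the network uses 6 passes, which is log₂ n (log₂ n + 1) / 2 for n = 8. Sorting photons by grid cell only requires that the keys be comparable, so `(cell, photon)` tuples work as well as integers.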

Binary Search

• Grid cell searches for self in photon list
  • If none, find first element in next cell
    • Empty grid cells waste compute
  • Log(n) + 1 steps

[Figure: searching for the first v5 photon in the sorted photon list, whose entries carry the cell keys v0, v2, and v5; the search is shown at initialization and after steps 1, 2, 3, and 4.]
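
A fixed-step version of this search can be sketched as below. It is a minimal illustration with assumed names (`first_photon_index`, `build_cell_starts`) and an assumed power-of-two list length; the example keys 0, 2, and 5 only mimic the cell labels in the figure. Every cell runs the same log₂ n halving steps plus one final comparison, so all cells can execute in lockstep, and an empty cell ends up pointing at the first photon of the next occupied cell.

```python
def first_photon_index(sorted_cells, cell):
    """Index of the first entry whose key is >= cell, in log2(n) + 1 steps.

    Assumption: sorted_cells is sorted and its length is a power of two.
    Returns len(sorted_cells) when every key is smaller than cell.
    """
    n = len(sorted_cells)
    pos = 0
    step = n // 2
    while step >= 1:                     # log2(n) halving steps
        if sorted_cells[pos + step - 1] < cell:
            pos += step
        step //= 2
    if sorted_cells[pos] < cell:         # the "+ 1" final step
        pos += 1
    return pos

def build_cell_starts(sorted_cells, num_cells):
    """Start offset of every grid cell in the sorted photon list."""
    return [first_photon_index(sorted_cells, c) for c in range(num_cells)]
```

Cells 1, 3, and 4 are empty in the list `[0, 0, 0, 2, 2, 2, 5, 5]`, so each of them receives the start offset of the next occupied cell. The photon count of a cell is the difference between consecutive start offsets.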

Scatter on the GPU

• Vertex programs can scatter
  • Draw point to buffer
    • Collisions?
  • Stencil routing
    • Limit photon count per grid cell
      – Pre-allocate grid cell space
    • Draw photons as points
      – Vertex program computes grid cell
    • Stencil buffer controls location within cell
    • Single rendering pass

Stencil Routing

• Fix each grid cell size to n² pixels
• Draw fat points to cover each fat cell
  • glPointSize(n)
• Control location written to with stencil
  • Pass when stencil is n² - 1
  • Stencil always increments
• Location written depends on draw order

[Figure: Vertex(photon_pos) passes through the vertex program into the flattened grid, where each cell covers 4 pixels and the stencil lets 1 pixel through. Stencil values in a cell start at 0 1 / 2 3 and read 1 2 / 3 4 after a point has been drawn to that cell.]
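
The routing rule can be simulated on the CPU to see why each photon lands in a different pixel. This is a sketch under assumptions, not GPU code: the function name `stencil_route` is hypothetical, cells are flattened to lists of n² pixels, and the stencil is assumed to start at 0, 1, ..., n² - 1 in every cell as in the figure.

```python
def stencil_route(photon_cells, num_cells, n):
    """Simulate stencil routing; return (storage, stencil) per cell.

    photon_cells[i] is the grid cell computed for photon i (the job of
    the vertex program). Each cell owns n*n pixels.
    """
    capacity = n * n
    # Assumed initial stencil pattern: 0 .. n*n - 1 in every cell.
    stencil = [list(range(capacity)) for _ in range(num_cells)]
    storage = [[None] * capacity for _ in range(num_cells)]
    for photon_id, cell in enumerate(photon_cells):   # draw order matters
        # The fat point covers every pixel of the cell.
        for pixel in range(capacity):
            if stencil[cell][pixel] == capacity - 1:  # stencil test passes
                storage[cell][pixel] = photon_id
            stencil[cell][pixel] += 1                 # stencil always increments
    return storage, stencil
```

With n = 2, the first photon drawn to a cell is written to the pixel whose stencil started at 3, the second to the pixel that started at 2, and so on. After n² photons every stencil value in the cell exceeds n² - 1, so further photons for that cell are dropped, which is the per-cell photon limit mentioned above.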

Computational Challenge #2
Adaptive Nearest Neighbor Search

Adaptive Nearest Neighbor Search

• Iterative algorithm
• Accept or reject photons in cell visit order

kNN-grid Algorithm

[Figure, repeated on every step: a sample point on the grid, with the photons in the estimate and the candidate photon marked. Want a 4 photon estimate. The number noted after each step below is the counter shown in the figure.]

• Candidate photons must be within max search radius
• Visit voxels in order of distance to sample point
• If current number of photons in estimate is less than number requested, grow search radius (figure shows 1, then 2)
• Don’t add photons outside maximum search radius
• Don’t grow search radius when photon is outside maximum radius (figure shows 2)
• Add photons within search radius (figure shows 3, then 4)
• Don’t expand search radius if enough photons already found (figure shows 4)
• Add photons within search radius (figure shows 5)
• Visit all other voxels accessible within determined search radius
• Add photons within search radius (figure shows 6)
• Finds all photons within a sphere centered about sample point
• May locate more than requested k-nearest neighbors (figure shows 6)
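
The steps above can be sketched on the CPU as follows. This is an illustration under assumptions rather than the GPU implementation: the name `knn_grid` is hypothetical, a dictionary stands in for the grid, only occupied voxels are visited, and "distance to a voxel" is taken to be the distance from the sample point to the closest point of that voxel.

```python
import math
from collections import defaultdict

def knn_grid(photons, cell_size, sample, k, max_radius):
    """Return (sorted photon indices in the estimate, final search radius)."""
    grid = defaultdict(list)
    for idx, p in enumerate(photons):
        grid[tuple(math.floor(c / cell_size) for c in p)].append(idx)

    def voxel_distance(cell):
        # Distance from the sample to the nearest point of the voxel.
        d2 = 0.0
        for c, s in zip(cell, sample):
            lo, hi = c * cell_size, (c + 1) * cell_size
            d = max(lo - s, 0.0, s - hi)
            d2 += d * d
        return math.sqrt(d2)

    estimate = []
    radius = 0.0
    # Visit voxels in order of distance to the sample point.
    for cell in sorted(grid, key=voxel_distance):
        limit = max_radius if len(estimate) < k else radius
        if voxel_distance(cell) > limit:
            break                  # no remaining voxel is reachable
        for idx in grid[cell]:
            d = math.dist(photons[idx], sample)
            if d > max_radius:
                continue           # never add or grow beyond the maximum radius
            if len(estimate) < k:
                estimate.append(idx)
                radius = max(radius, d)   # grow the search radius
            elif d <= radius:
                estimate.append(idx)      # inside the radius: add, do not grow
    return sorted(estimate), radius
```

Because photons are accepted in voxel visit order rather than in distance order, the result is every photon inside the final sphere, which can be more than k. For instance, `knn_grid([(0.5, 0.5), (1.1, 0.5)], 1.0, (0.9, 0.5), 1, 2.0)` returns both photons for k = 1.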

System Implementation

• NVIDIA GeForce FX 5900 Ultra (NV35)
• Cg compiler 1.1

[Figure: pipeline stages TracePhotons, BuildPhotonMap, RayTraceScene, and ComputeRadianceEstimate, grouped under the labels Compute Lighting and Render Image.]

Demos

Glass Ball – Bitonic Sort
18s @ 512x384, 5K photons

Glass Ball – Stencil Routing
11s @ 512x384, 5K photons

Ring – Bitonic Sort
9s @ 512x384, 16K photons

Ring – Stencil Routing
8s @ 512x384, 16K photons

Cornell Box – Bitonic Sort
64s @ 512x512, 65K photons

Cornell Box – Stencil Routing
47s @ 512x512, 65K photons

Cornell Box – Increased Search Radius

Open Issues (1)

• How to prevent program execution over a subset of pixels?
  • Non-uniform pixel computation distribution
    • Radiance estimate
  • KILL is only a write mask
  • Early-z occlusion culling
    • No pixel level control
  • Compute mask, branching, or stream buffer?
  • Improve radiance estimate speed by 30-70% over tiling

Open Issues (2)

• Scatter
  • Makes (a programmer’s) life easier
  • Is it worth implementing?
    • Gain factor of log² n avoiding sort
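
As a rough check on the size of that factor, the pass count of a standard bitonic sorting network can be written out. The block below assumes n is a power of two and that each compare-exchange stage costs one rendering pass; the value n = 2^16 is chosen only as an illustration of the 65K photon case.

```latex
\[
\text{passes}(n) \;=\; \sum_{s=1}^{\log_2 n} s
\;=\; \frac{\log_2 n \,(\log_2 n + 1)}{2}
\;=\; O(\log^2 n),
\qquad
\text{passes}(2^{16}) \;=\; \frac{16 \cdot 17}{2} \;=\; 136 .
\]
```

A hardware scatter would place each photon in a single write, which is where the log² n factor comes from.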

Future Work

• Kd-trees
• Photon power redistribution
• Adaptive sampling
• Progressive refinement

Conclusions

• The GPU can compute an entire global illumination solution
  • Nearly interactive
• Implemented an adaptive k-nearest neighbor query for the GPU
  • kNN-grid
• Shown how to construct sparse data structures on the GPU
  • Bitonic merge sort and binary search
  • Stencil routing
• Sorting and searching algorithms applicable to other computations

Acknowledgments

• Stanford FlashG
  • Ian Buck, Mike Houston, Kekoa Proudfoot
• Stencil routing
  • Kurt Akeley, Matt Papakipos
• Hardware and drivers
  • David Kirk, Nick Triantos
• Funding
  • NVIDIA, DARPA, NSF, 3Com

