GLSL Applications: Today’s slides Matrix Operations on the ...cis565/Lectures2011/Lecture7.pdf ·...

Date post: 11-May-2020

GLSL Applications: 2 of 2

Patrick Cozzi
University of Pennsylvania
CIS 565 - Spring 2011

Agenda

Today’s slides:
- Matrix Operations on the GPU
- OpenGL Textures and Multitexturing
- OpenGL Framebuffers and Deferred Shading
- Ambient Occlusion

Matrix Operations (thanks to…)

Slide information sources
- Suresh Venkatasubramanian, CIS700 – Matrix Operations Lectures
- Fast matrix multiplies using graphics hardware, by Larsen and McAllister
- Dense Matrix Multiplication, by Ádám Moravánszky
- Cache and Bandwidth Aware Matrix Multiplication on the GPU, by Hall, Carr and Hart
- Understanding the Efficiency of GPU Algorithms for Matrix-Matrix Multiplication, by Fatahalian, Sugerman, and Hanrahan
- Linear algebra operators for GPU implementation of numerical algorithms, by Krüger and Westermann

Overview
3 Basic Linear Algebra Operations

Vector-Vector Operations
- c = a . b

Matrix-Matrix Operations
- C = A + B - addition
- D = A * B - multiplication
- E = E^-1 - inverse

Matrix-Vector Operations
- y = Ax

Note on Notation:
1) Vectors - lower case, underlined: v
2) Matrices – upper case, underlined 2x: M
3) Scalar – lower case, no lines: s

Efficiency/Bandwidth Issues

GPU algorithms are severely bandwidth limited!

Minimize Texture Fetches

Effective cache bandwidth…so no algorithm would be able to read data from texture very much faster with texture fetches

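Vector-Vector Operations
Inner Product Review

An inner product on a vector space V over a field K (which must be either the field R of real numbers or the field C of complex numbers) is a function $\langle \cdot, \cdot \rangle : V \times V \to K$ such that, for all $\alpha$ in K and all u, v, w in V, the following properties hold:

1. $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$

2. $\langle \alpha v, w \rangle = \alpha \langle v, w \rangle$ (linearity constraints)

3. $\langle v, w \rangle = \overline{\langle w, v \rangle}$ (conjugate symmetry)

4. $\langle v, v \rangle \geq 0$, with equality only when v = 0 (positive definite)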


Vector-Vector Operations
Inner Product Review

A vector space together with an inner product on it is called an inner product space. Examples include:

1. The real numbers R, where the inner product is given by $\langle x, y \rangle = xy$

2. The Euclidean space $R^n$, where the inner product is given by the dot product:

$c = a \cdot b = \langle (a_1, a_2, \ldots, a_n), (b_1, b_2, \ldots, b_n) \rangle = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n = \sum_{i=1}^{n} a_i b_i$

3. The vector space of real functions on a closed domain [a,b], where $\langle f, g \rangle = \int_a^b f g \, dx$


Vector-Vector Operations
Dot Product: Technique 1 (optimized for memory)

- Store each vector as a 1D texture, a and b
- In the i-th rendering pass, render a single point at coordinates (0,0) which has a single texture coordinate i
- The fragment program uses i to index into the 2 textures and returns the value s + a_i * b_i (s is the running sum maintained over the previous i-1 passes)

Vector-Vector Operations

Dot Product: Technique 1: Problems?
- We cannot read from and write to the location where s is stored in a single pass; we need a ping-pong trick to maintain s accurately
- Takes n passes
- ☺ Requires only a fixed number of texture locations (1 unit of memory)
- Does not take advantage of the 2D spatial texture caches on the GPU that are optimized by the rasterizer
- Limited length of 1D textures, especially on older cards
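The ping-pong trick for Technique 1 can be emulated on the CPU. The sketch below is not GPU code: it assumes each "pass" is one loop iteration and that the running sum lives in two one-texel buffers (the array `s` and the function name `dot_pingpong` are hypothetical), so a pass only ever reads the buffer written by the previous pass.

```c
#include <assert.h>
#include <stddef.h>

/* CPU emulation of Technique 1: one "rendering pass" per element.
 * s[0] and s[1] stand in for two one-texel render targets (hypothetical).
 * Each pass reads the target written by the previous pass and writes the
 * other one, because a pass cannot read and write the same target. */
float dot_pingpong(const float *a, const float *b, size_t n)
{
    float s[2] = {0.0f, 0.0f};
    size_t src = 0;                      /* buffer holding the current sum */

    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; ++i) {     /* pass i */
        size_t dst = 1 - src;
        s[dst] = s[src] + a[i] * b[i];   /* the "fragment program" */
        src = dst;                       /* swap roles for the next pass */
    }
    return s[src];
}
```

The loop makes the cost visible: n passes, but only two texels of storage for the sum.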

Vector-Vector Operations

Dot Product: Technique 2 (optimized for passes)

- Wrap a and b as 2D textures
- Multiply the two 2D textures by rendering a single quad with the answer
- Add together the elements of the resulting 2D texture (c)


Vector-Vector Operations

Adding up the elements of a texture to a scalar value

- Additive blending
- Or a parallel reduction algorithm (log n passes)

// Example fragment program for performing a reduction
float main(float2 texcoord : TEXCOORD0, uniform sampler2D img) : COLOR
{
    float a, b, c, d;

    // Fetch a 2x2 neighborhood of texels.
    a = tex2D(img, texcoord);
    b = tex2D(img, texcoord + float2(0, 1));
    c = tex2D(img, texcoord + float2(1, 0));
    d = tex2D(img, texcoord + float2(1, 1));

    // Each output texel is the sum of its four inputs.
    return (a + b + c + d);
}
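Technique 2 followed by the reduction can be sketched on the CPU as below. This is an illustration under assumptions, not the lecture's code: the texture is a square row-major array whose side is a power of two, the names `reduce_sum` and `dot_2d` are hypothetical, and the reduction runs in place for brevity where a GPU version would ping-pong between two render targets.

```c
#include <assert.h>
#include <stddef.h>

/* Sums a size x size row-major grid by repeated 2x2 reduction.
 * Each loop over `half` plays the role of one rendering pass that
 * halves the width and height, so there are log2(size) passes. */
float reduce_sum(float *tex, size_t size)
{
    assert(tex != NULL && size > 0 && (size & (size - 1)) == 0);
    for (size_t half = size / 2; half >= 1; half /= 2) {
        for (size_t y = 0; y < half; ++y) {
            for (size_t x = 0; x < half; ++x) {
                size_t x0 = 2 * x, y0 = 2 * y;
                float sum = tex[y0 * size + x0]
                          + tex[y0 * size + x0 + 1]
                          + tex[(y0 + 1) * size + x0]
                          + tex[(y0 + 1) * size + x0 + 1];
                tex[y * size + x] = sum;   /* the row stride stays `size` */
            }
        }
    }
    return tex[0];
}

/* Dot product, Technique 2: pointwise multiply into c, then reduce c. */
float dot_2d(const float *a, const float *b, float *c, size_t size)
{
    for (size_t i = 0; i < size * size; ++i)
        c[i] = a[i] * b[i];
    return reduce_sum(c, size);
}
```

Compared with Technique 1, the pass count drops from n to 1 multiply pass plus log2(size) reduction passes, at the cost of a full-size intermediate texture.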

Matrix-Matrix Operations

Store matrices as 2D textures

glTexImage2D(GL_TEXTURE_2D, 0,GL_RED , 256, 256, 0, GL_RED, GL_UNSIGNED_BYTE, pData);

Addition is now a trivial fragment program / additive blend

Matrix-Matrix Operations
Matrix Multiplication Review

So in other words we have:

In general: $(AB)_{ij} = \sum_{r} a_{ir} b_{rj}$

Naïve O(n^3) CPU algorithm

for i = 1 to n
    for j = 1 to n
        C[i,j] = ∑_k A[i,k] * B[k,j]

Matrix-Matrix Operations

GPU Matrix Multiplication: Technique 1

Express the multiplication of two matrices as dot products of matrix rows and columns

Compute matrix C by: for each cell c_ij, take the dot product of row i of matrix A with column j of matrix B

Pass 1: Output = a_x1 * b_1y
Pass 2: Output = Output_1 + a_x2 * b_2y
…..
Pass k: Output = Output_(k-1) + a_xk * b_ky

Uses: n passes
Uses: N = n^2 space


Matrix-Matrix Operations
GPU Matrix Multiplication: Technique 2

Blocking

Instead of making one computation per pass, compute multiple additions per pass in the fragment program.

Pass 1: Output = a_x1 * b_1y + a_x2 * b_2y + … + a_xb * b_by

…..

Passes = n / block size

Now there is a tradeoff between passes and program size/fetches
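The pass structure of Techniques 1 and 2 can be mimicked on the CPU. In this sketch (hypothetical name `matmul_blocked`, row-major n x n matrices, and a block size that is assumed to divide n), the outer loop is one rendering pass, the two middle loops visit every output fragment, and the inner loop is the work of one fragment program. A block size of 1 reproduces Technique 1.

```c
#include <assert.h>
#include <stddef.h>

/* C = A * B, accumulated over n / block passes. Returns the pass count. */
size_t matmul_blocked(const float *A, const float *B, float *C,
                      size_t n, size_t block)
{
    size_t passes = 0;

    assert(block > 0 && n % block == 0);
    for (size_t i = 0; i < n * n; ++i)
        C[i] = 0.0f;

    for (size_t k0 = 0; k0 < n; k0 += block) {      /* one rendering pass */
        for (size_t x = 0; x < n; ++x) {
            for (size_t y = 0; y < n; ++y) {        /* one output fragment */
                float acc = C[x * n + y];           /* previous pass result */
                for (size_t k = k0; k < k0 + block; ++k)
                    acc += A[x * n + k] * B[k * n + y];   /* 2 fetches each */
                C[x * n + y] = acc;
            }
        }
        ++passes;
    }
    return passes;
}
```

A larger block means fewer passes but more fetches and instructions per fragment, which is the tradeoff the slide refers to.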

Matrix-Matrix Operations
GPU Matrix Multiplication: Technique 3

Modern fragment shaders allow up to 4 instructions to be executed simultaneously

(1) output = v1.abgr * v2.ggab

This is issued as a single GPU instruction and is numerically equivalent to the following 4 instructions being executed in parallel

(2) output.r = v1.a * v2.g
    output.g = v1.b * v2.g
    output.b = v1.g * v2.a
    output.a = v1.r * v2.b

In v1.abgr the color channels are referenced in arbitrary order. This is referred to as swizzling.

In v2.ggab the color channel (g) is referenced multiple times. This is referred to as smearing.
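To check the equivalence of (1) and (2) by hand, swizzling and smearing can be emulated in plain C. The `vec4` struct and the `swizzle`, `channel`, and `mul` helpers below are hypothetical stand-ins for what the GPU does in a single instruction.

```c
#include <assert.h>
#include <string.h>

typedef struct { float r, g, b, a; } vec4;   /* hypothetical RGBA register */

/* Returns the channel of v named by c ('r', 'g', 'b' or 'a'). */
static float channel(vec4 v, char c)
{
    switch (c) {
    case 'r': return v.r;
    case 'g': return v.g;
    case 'b': return v.b;
    case 'a': return v.a;
    default:
        assert(0 && "unknown channel");
        return 0.0f;
    }
}

/* Builds a new vec4 from a 4-letter pattern such as "abgr" or "ggab". */
vec4 swizzle(vec4 v, const char *pattern)
{
    vec4 out;
    assert(strlen(pattern) == 4);
    out.r = channel(v, pattern[0]);
    out.g = channel(v, pattern[1]);
    out.b = channel(v, pattern[2]);
    out.a = channel(v, pattern[3]);
    return out;
}

/* Component-wise multiply, as a 4-wide instruction would do. */
vec4 mul(vec4 x, vec4 y)
{
    vec4 out = { x.r * y.r, x.g * y.g, x.b * y.b, x.a * y.a };
    return out;
}
```

With these helpers, `mul(swizzle(v1, "abgr"), swizzle(v2, "ggab"))` corresponds to expression (1), and its four components match the four assignments in (2).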

Matrix-Matrix Operations

GPU Matrix Multiplication: Technique 3
Smearing/Swizzling

Up until now we have been using 1 channel, the red component, to store the data. Why not store data across all the channels (RGBA) and compute 4 instructions at a time?

The matrix multiplication can be expressed as follows:

Suppose we have 2 large matrices A and B, wlog with dimensions that are powers of 2. A11, A12, … are submatrices of 2^(i-1) rows/columns

Matrix-Matrix Operations
Note on Notation:

C(r) = A(r) * B(r) is used to denote the channels
Example:

So now the final matrix multiplication can be expressed recursively by:

Matrix-Matrix Operations
Efficiency/Bandwidth Issues

- The problem with matrix multiplication is that each input contributes to multiple outputs: O(n)
- Arithmetic performance is limited by cache bandwidth
- Multipass algorithms tend to be more cache friendly

2 Types of Bandwidth
- External bandwidth: data transfers between the CPU and GPU, limited by the AGP or PCI Express bus
- Internal bandwidth (black box): reads from and writes to textures tend to be expensive
- Back of the envelope calculation: ((2 texture read/write lookups) * blocksize + 2 (previous pass lookup)) * (precision) * (n^2)
- (2*32 + 2)(32)(1024) = 4GB of data being thrown around

GPU Benchmarks

Figure: peak arithmetic rate (GFLOPS) for Pent IV, 5900, 6800, 7800, 8800, ATI 9800, ATI X800, and ATI X1900.


Previous Generation GPUs

Figure: multiplication of 1024x1024 matrices, GFLOPS and bandwidth (GB/sec), for P4 3GHz, 5900 Ultra, and 9800 XT.

Next Generation GPUs

Figure: multiplication of 1024x1024 matrices, GFLOPS and bandwidth (GB/sec), for P4 3GHz, 6800 Ultra, and X800 XT PE.

Matrix-Vector Operations
Matrix Vector Operation Review

Example 1:

Example 2:

Matrix-Vector Operations
Technique 1: Just use a Dense Matrix Multiply

Pass 1: Output = a_x1 * b_11 + a_x2 * b_21 + … + a_xb * b_b1

…..

Passes = n / block size

Matrix-Vector Operations
Technique 2: Sparse Banded Matrices (A*x = y)

A band matrix is a sparse matrix whose nonzero elements are confined to diagonal bands

Algorithm:
- Convert the diagonal bands to vectors
- Convert the (N) vectors to 2D textures; pad with 0 if they do not fill the texture completely
- Convert the multiplication vector (x) to a 2D texture
- Pointwise multiply the (N) diagonal textures with the (x) texture
- Add the (N) resulting matrices to form a 2D texture
- Unwrap the 2D texture for the final answer
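The banded algorithm above can be sketched on the CPU with 1D arrays standing in for the 2D textures. The storage layout is an assumption made for illustration: `diags[d * n + i]` holds the entry of row i on the diagonal with column offset `offsets[d]`, zero-padded where the diagonal leaves the matrix. Note that the pointwise multiply needs x shifted by each diagonal's offset, which the sketch does through the index `j`. The name `banded_matvec` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* y = A * x for an n x n band matrix stored as ndiags diagonal vectors. */
void banded_matvec(const float *diags, const int *offsets, size_t ndiags,
                   const float *x, float *y, size_t n)
{
    assert(diags != NULL && offsets != NULL && x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i)
        y[i] = 0.0f;

    for (size_t d = 0; d < ndiags; ++d) {        /* one multiply per diagonal */
        for (size_t i = 0; i < n; ++i) {
            long j = (long)i + offsets[d];       /* column of this band entry */
            if (j >= 0 && j < (long)n)
                y[i] += diags[d * n + i] * x[j]; /* accumulate the N products */
        }
    }
}
```

For a tridiagonal matrix this does 3n multiplies instead of the n^2 of the dense technique.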


Matrix-Vector Operations

Technique 3: Sparse Matrices

Create a texture lookup scheme

Textures

unsigned char *pixels = // ...

GLuint id;
glGenTextures(1, &id);

glBindTexture(GL_TEXTURE_2D, id);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, width, height, 0, GL_BGRA, GL_UNSIGNED_BYTE, pixels);

// ...
glDeleteTextures(1, &id);

pixels: Pixels for an image in system memory.

glGenTextures / glDeleteTextures: Standard business.

glBindTexture: I hate global state. You should too. What is the alternative design?

glTexParameterf calls: Sampler state. More info to follow.


glTexImage2D: Transfer from system memory to driver-controlled (likely, GPU) memory. Does it need to block?

GL_BGRA, GL_UNSIGNED_BYTE: Pixel data format and datatype

GL_RGBA: Internal (GPU) texture format

Texture Wrap Parameters

Images from: http://http.download.nvidia.com/developer/NVTextureSuite/Atlas_Tools/Texture_Atlas_Whitepaper.pdf

GL_MIRRORED_REPEAT

GL_REPEAT

GL_CLAMP

Set with:

glTexParameteri()
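The three wrap modes differ only in how a texture coordinate outside [0, 1] is mapped back into range. The sketch below illustrates that mapping for one normalized coordinate; it ignores filtering and border texels, and the function names are hypothetical, not OpenGL API.

```c
#include <assert.h>

/* Floor of s as a long, without needing the math library. */
static long floor_to_long(float s)
{
    long i;
    assert(s > -1.0e9f && s < 1.0e9f);   /* keep the cast well defined */
    i = (long)s;                         /* truncates toward zero */
    if ((float)i > s)
        --i;                             /* fix up negative non-integers */
    return i;
}

/* GL_REPEAT: keep only the fractional part, so the image tiles. */
float wrap_repeat(float s)
{
    return s - (float)floor_to_long(s);
}

/* GL_MIRRORED_REPEAT: every other tile is flipped. */
float wrap_mirrored_repeat(float s)
{
    long period = floor_to_long(s);
    float f = s - (float)period;
    return (period % 2 == 0) ? f : 1.0f - f;
}

/* GL_CLAMP: coordinates are clamped to [0, 1]. */
float wrap_clamp(float s)
{
    return s < 0.0f ? 0.0f : (s > 1.0f ? 1.0f : s);
}
```

For example, a coordinate of 1.25 samples at 0.25 with repeat, at 0.75 with mirrored repeat, and at 1.0 with clamp.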

Multitexturing

- Using multiple textures in the same rendering pass
- Each is bound to a different texture unit and accessed with a different sampler uniform in GLSL

Multitexturing: Light Map

Recall our Light Map example:

Precomputed light x Surface color = “lit” surface


Multitexturing: Light Map

uniform sampler2D lightMap;
uniform sampler2D surfaceMap;

in vec2 fs_TxCoord;
out vec3 out_Color;

void main(void)
{
    float intensity = texture(lightMap, fs_TxCoord).r;
    vec3 color = texture(surfaceMap, fs_TxCoord).rgb;
    out_Color = intensity * color;
}

Each texture is accessed with a different sampler

Pass the sampler to texture() to read from a particular texture

Multitexturing: Terrain

How was this rendered?

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

Multitexturing: Terrain

First hint: two textures

Images courtesy of A K Peters, Ltd. www.virtualglobebook.com

Grass Stone


Multitexturing: Terrain

Second hint: terrain slope

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

- 0 is flat
- 1 is steep


Multitexturing: Terrain

Third and final hint: a blend ramp

Image courtesy of A K Peters, Ltd. www.virtualglobebook.com

Multitexturing: Terrain

uniform sampler2D grass;
uniform sampler2D stone;
uniform sampler2D blendRamp;

out vec3 out_Color;

void main(void)
{
    // ...

    out_Color = intensity * mix(
        texture(grass, repeatTextureCoordinate).rgb,
        texture(stone, repeatTextureCoordinate).rgb,
        texture(blendRamp, vec2(0.5, slope)).r);
}

- Three samplers
- blendRamp could be 1D; it is just 1xn

Use terrain slope to look up a blend value in the range [0, 1]

Linearly blend between grass and stone
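The linear blend is what GLSL's built-in mix(x, y, a) computes: x * (1 - a) + y * a. A CPU sketch of the same arithmetic, with a hypothetical `vec3` struct and `mix3` function standing in for the GLSL built-ins:

```c
#include <assert.h>

typedef struct { float r, g, b; } vec3;   /* hypothetical RGB color */

/* Same arithmetic as GLSL mix(x, y, a): a = 0 gives x, a = 1 gives y. */
vec3 mix3(vec3 x, vec3 y, float a)
{
    vec3 out;
    assert(a >= 0.0f && a <= 1.0f);       /* blend ramp values lie in [0, 1] */
    out.r = x.r * (1.0f - a) + y.r * a;
    out.g = x.g * (1.0f - a) + y.g * a;
    out.b = x.b * (1.0f - a) + y.b * a;
    return out;
}
```

With grass as x and stone as y, a blend value of 0 (flat terrain, assuming the ramp maps flat slopes to 0) yields pure grass, and a value of 1 yields pure stone.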

Multitexturing: Globe

How will you render this?

Imagery from http://planetpixelemporium.com/


Day texture
Night texture

Multitexturing: Globe

Imagery from http://planetpixelemporium.com/

Day Texture Day Night

Multitexturing: Globe

Videos
- Night and Day
- Clouds
- Specularity

Framebuffer Objects (FBOs)

Framebuffer Objects (FBOs)
- Allow the fragment shader to write to one or more off-screen buffers
- Can then use the off-screen buffer as a texture in a later rendering pass
- Allows render to texture
- Don’t worry about the OpenGL API; we’ve already coded it for you

Framebuffer Objects (FBOs)

FBOs are lightweight containers of textures

FBO
- Depth Texture
- Color Texture 0
- Color Texture 1

Framebuffer Objects (FBOs)

FBO use case: post processing effects
- Render scene to FBO with depth and color attachment
- Render a viewport-aligned quad with the texture that was the color attachment and apply the effect
- How would you design a post processing framework?


Deferred Shading

FBO use case: deferred shading
Render scene in two passes
- 1st pass: Visibility tests
- 2nd pass: Shading

Deferred Shading

1st Pass: Render geometry into G-Buffers

G-Buffers shown: Fragment Colors, Normals, Depth, Edge Weight

Images from http://http.developer.nvidia.com/GPUGems3/gpugems3_ch19.html

Deferred Shading

2nd pass: shading == post processing effects
- Render viewport-aligned quads that read from G-Buffers
- Objects are no longer needed

Deferred Shading

Light accumulation result

Image from http://http.developer.nvidia.com/GPUGems3/gpugems3_ch19.html

Deferred Shading

What are the benefits:
- Shading and depth complexity?
- Memory requirements?
- Memory bandwidth?
- Material and light decoupling?

Ambient Occlusion

Ambient Occlusion (AO)
- "shadowing of ambient light"
- "darkening of the ambient shading contribution"

Image from Bavoil and Sainz. http://developer.download.nvidia.com/SDK/10.5/direct3d/Source/ScreenSpaceAO/doc/ScreenSpaceAO.pdf


Ambient Occlusion

Ambient Occlusion
- "the crevices of the model are realistically darkened, and the exposed parts of the model realistically receive more light and are thus brighter"
- "the soft shadow generated by a sphere light of uniform intensity surrounding the scene"

Ambient Occlusion

Image from Iñigo Quílez. http://iquilezles.org/www/articles/ssao/ssao.htm

Ambient Occlusion

Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

Evenly lit from all directions Ambient Occlusion Global Illumination

Ambient Occlusion

Image from Bavoil and Sainz. http://developer.download.nvidia.com/SDK/10.5/direct3d/Source/ScreenSpaceAO/doc/ScreenSpaceAO.pdf

"the integral of the occlusion contributed from inside a hemisphere of a given radius R, centered at the current surface point P and oriented towards the normal n at P"

Object Space Ambient Occlusion

- AO does not depend on light direction
- Precompute AO for static objects using ray casting
  - How many rays?
  - How far do they go?
  - Local objects? Or all objects?

Object Space Ambient Occlusion

Image courtesy of A K Peters, Ltd. http://www.realtimerendering.com/

- Cosine weight rays
- or use importance sampling: cosine distribute number of rays


Object Space Ambient Occlusion

- Depends on scene complexity
- Stored in textures or vertices
- How can we
  - Support dynamic scenes
  - Be independent of scene complexity

Screen Space Ambient Occlusion

- Apply AO as a post processing effect using a combination of depth, normal, and position buffers
- Not physically correct but plausible
- Visual quality depends on
  - Screen resolution
  - Number of buffers
  - Number of samples

Images shown: Depth Buffer, Normal Buffer, View Space Eye Position Buffer, Screen Space Ambient Occlusion

Images courtesy of A K Peters, Ltd. http://www.realtimerendering.com/


Screen Space Ambient Occlusion

Image from Martin Mittring. http://developer.amd.com/documentation/presentations/legacy/Chapter8-Mittring-Finding_NextGen_CryEngine2.pdf

Screen Space Ambient Occlusion

Image from Martin Mittring. http://developer.amd.com/documentation/presentations/legacy/Chapter8-Mittring-Finding_NextGen_CryEngine2.pdf

Screen Space Ambient Occlusion

Image from Mike Pan. http://mikepan.com

- Blur depth buffer
- Subtract it from original depth buffer
- Scale and clamp image, then subtract from original
- Superficially resembles AO but fast
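The four steps above can be tried on a single scanline. This is a rough sketch under stated assumptions, not the method from the cited image: larger depth values are taken to be farther from the camera, the blur is a 3-tap box filter with clamped edges, and `scale` and the function name `fake_ao_1d` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* depth, color and out each hold n pixels of one scanline. */
void fake_ao_1d(const float *depth, const float *color, float *out,
                size_t n, float scale)
{
    assert(depth != NULL && color != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        /* 1. Blur the depth buffer (3-tap box filter, clamped edges). */
        size_t l = (i == 0) ? 0 : i - 1;
        size_t r = (i + 1 == n) ? i : i + 1;
        float blurred = (depth[l] + depth[i] + depth[r]) / 3.0f;

        /* 2. Subtract the blurred depth from the original depth. */
        float diff = depth[i] - blurred;

        /* 3. Scale and clamp, then subtract from the original image. */
        out[i] = clamp01(color[i] - clamp01(scale * diff));
    }
}
```

A pixel that is deeper than its blurred neighborhood (a crevice) gets a positive difference and is darkened; flat regions are left unchanged.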

