Perception
Kurt Akeley
CS248 Lecture 18
29 November 2007
http://graphics.stanford.edu/courses/cs248-07/
CS248 Lecture 18 Kurt Akeley, Fall 2007
Today
This is the last for-credit lecture
Material from next week's lectures will not be tested
Emphasize perception
Pull together and re-emphasize ideas from earlier lectures
Introduce some new ideas
Tie everything back to performance
Optical quality of the eye
Image from www.wikipedia.com
What is the image of this (ideal) line?
Fovea
Range of focus:
• 5” to infinity (you)
• 40” to infinity (me, corrected)
Retinal image of an ideal line
Eye image from www.wikipedia.com
Line spread function
Retinal image of a sine wave grating
Eye image from www.wikipedia.com
Lower contrast
Modulation transfer function
Ricco’s Law
Area and intensity are indistinguishable for objects that subtend less than (roughly) 6 arc min.
This allows antialiasing to work
Especially fractional-width points and lines
Antialiased pixels should subtend less than 6 arc min
Ricco’s Law and line spread (a coincidence?)
6 arc min
Spatial resolution of the eye
Cone spacing in the fovea:
L and M cones: 0.5 arc min
S cones: 10 arc min
Nyquist frequency for foveal photopic vision is 60 cpd
Half the 120 cone/deg density
Nyquist frequency is much lower outside the fovea
Effective receptor density falls to 1/20th that of the fovea ?
Rendering can take advantage of this, e.g., insets in flight-simulation graphics accelerators
Thus the lower spectral response seen in the color theory lecture
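The Nyquist figures above follow directly from the cone spacing. A minimal sketch of that arithmetic, assuming evenly spaced receptors and two samples per cycle (the function name is hypothetical):

```python
def nyquist_cpd(spacing_arcmin: float) -> float:
    """Nyquist limit in cycles per degree for receptors spaced spacing_arcmin apart."""
    samples_per_degree = 60.0 / spacing_arcmin  # 60 arc min in one degree
    return samples_per_degree / 2.0             # two samples are needed per cycle


if __name__ == "__main__":
    print(nyquist_cpd(0.5))   # L and M cones in the fovea
    print(nyquist_cpd(10.0))  # S cones
```

With 0.5 arc min spacing this gives the 60 cpd quoted on the slide; the same formula applied to the 10 arc min S-cone spacing gives a far lower limit.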
No aliasing in foveal vision
Foveal Nyquist frequency
Peripheral Nyquist frequency
(approximate)
No aliasing in foveal S cones either
Optics of the eye are substantially worse for 400 nm light
MTF did not show this (it is an aggregate)
Vernier acuity
Can detect an offset of 5 arc sec
But sensor spacing is 30 arc sec
How does this work?
Not due to random sensor locations (works with very short lines)
5 arc sec
How vernier acuity (probably) works
Cone spacing
Display resolution
Pixel angle $\theta$ from screen height $h$, viewing distance $d$, and pixel count $p$:
$$\theta = \tan^{-1}\left(\frac{h}{p\,d}\right)$$
$$\theta = \tan^{-1}\left(\frac{12''}{1024 \times 24''}\right) = 0.028^\circ = 1.68\ \text{arc min}$$
Satisfies Ricco’s Law (less than 6 arc min)
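The pixel-angle calculation on this slide can be checked numerically. This is a small sketch, assuming the 12" screen height, 24" viewing distance, and 1024-pixel count given above; the function and constant names are illustrative only.

```python
import math

RICCO_LIMIT_ARCMIN = 6.0  # approximate limit quoted on the Ricco's Law slide


def pixel_angle_arcmin(height: float, distance: float, pixels: int) -> float:
    """Angle subtended by one pixel, theta = atan(h / (p * d)), in arc minutes."""
    theta_rad = math.atan(height / (pixels * distance))
    return math.degrees(theta_rad) * 60.0


theta = pixel_angle_arcmin(height=12.0, distance=24.0, pixels=1024)
print(f"{theta:.2f} arc min, below Ricco limit: {theta < RICCO_LIMIT_ARCMIN}")
```

Height and distance only need to share a unit, since the formula uses their ratio.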
Matching foveal resolution
Pixel angle $\theta$ from screen height $h$, viewing distance $d$, and pixel count $p$:
$$\theta = \tan^{-1}\left(\frac{h}{p\,d}\right)$$
$$\theta = \tan^{-1}\left(\frac{12''}{3438 \times 24''}\right) = 0.00833^\circ = 0.5\ \text{arc min}$$
Foveal resolution
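Going the other way, the pixel count needed for a target pixel angle comes from solving the same relation for the pixel count. A hedged sketch, assuming the same 12" by 24" viewing geometry (the function name is hypothetical):

```python
import math


def pixels_for_angle(height: float, distance: float, arcmin: float) -> float:
    """Pixel count p for which atan(h / (p * d)) equals the target angle."""
    theta_rad = math.radians(arcmin / 60.0)
    return height / (distance * math.tan(theta_rad))


# Target the 0.5 arc min foveal cone spacing.
print(round(pixels_for_angle(height=12.0, distance=24.0, arcmin=0.5)))
```

Rounded to the nearest pixel, this reproduces the 3438 figure used above.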
Flicker
Flicker fusion threshold
Statistically 16 Hz
Increases
In peripheral vision
With brighter scenes
With viewer fatigue
Flicker rates:
Movies: 48 Hz (typical), 72 Hz (using computer displays)
Video: 60 Hz (US NTSC), 50 Hz (Europe and Asia, PAL)
Computer displays: 60-100 Hz (CRT), no flicker (LCD)
Fluorescent lights: 120 Hz (US), 100 Hz (Europe, Asia)
Hence “jumping” numeric or CRT displays, when you aren’t looking directly at them
Frame rate vs. flicker rate
Increasing flicker rate above frame rate:
Avoids flicker-rate problems
But introduces visual artifacts: image doubling (2x) or even tripling (3x)
Media               Frame rate (Hz)   Flicker rate (Hz)
Movie               24                48 or 72
Television          25 or 30*         50 or 60
Visual simulation   60                60
Interlaced displays
Two fields per “frame”
Display odd lines in the first field
Display even lines in the second field
“Frame” is misleading: True interlaced sampling is “flying spot”
Each pixel is sampled and displayed at proportional times
Motion artifacts are avoided
Interlaced frames (e.g., video display of a movie)
All pixels are sampled at the same moment
But display is sequential, causing motion artifacts
Still common in video
1080i is standard
1080p is becoming more common
Big battle during definition of HDTV!
Interlacing and antialiasing
Small moving objects can disappear
Object subtends a single pixel
Fields are rendered properly (not from a single frame)
One solution is antialiasing with a large filter kernel
Rendered objects necessarily subtend more than a single pixel
Field n Field n+1 Field n+2
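A toy model makes the disappearing object concrete. The sketch below assumes that field f draws only the scanlines whose parity matches f, that the object is rendered at its true position in every field, and that it moves a whole number of lines per field; all names are hypothetical.

```python
def visible_fields(start_line: int, lines_per_field: int, num_fields: int,
                   footprint: int = 1) -> list[bool]:
    """For each field, report whether any scanline covered by the object is drawn.

    footprint is the number of adjacent scanlines the rendered object touches:
    1 for an unfiltered single-pixel object, more when a wide filter spreads it.
    """
    result = []
    for f in range(num_fields):
        top = start_line + lines_per_field * f
        covered = range(top, top + footprint)
        # Field f draws even lines when f is even, odd lines when f is odd.
        result.append(any(line % 2 == f % 2 for line in covered))
    return result


print(visible_fields(start_line=1, lines_per_field=1, num_fields=4))
print(visible_fields(start_line=1, lines_per_field=1, num_fields=4, footprint=2))
```

In this model the single-line object moving one line per field always sits on a line the current field skips, while an object spread over two lines is drawn in every field.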
Color sequential displays
Time-sequential red, green, blue (and sometimes white)
Examples:
Many digital projectors
Professional head-mounted displays
Should render each “frame” separately
Movies don’t, so time-sequential projectors yield “rainbow” effects
Simulation systems do, so motion artifacts are avoided
Mach banding – slope discontinuities
Same peak intensities
Human response is not linear
Twice as many photons/sec does not appear twice as bright
Instead 5.7 times as many photons appear twice as bright
Brightness (human perception) and intensity (actual photon rate) are related by Stevens’ Power Law:
$$B = k \cdot I^{0.4}$$
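The "5.7 times" figure can be checked from the power law, since the constant k cancels when two brightnesses are compared. A minimal sketch, with hypothetical function names:

```python
STEVENS_EXPONENT = 0.4  # exponent of the power law on this slide


def brightness_ratio(intensity_ratio: float) -> float:
    """Perceived brightness ratio produced by a given intensity ratio."""
    return intensity_ratio ** STEVENS_EXPONENT


def intensity_ratio_for(target_brightness_ratio: float) -> float:
    """Intensity ratio needed to reach a given perceived brightness ratio."""
    return target_brightness_ratio ** (1.0 / STEVENS_EXPONENT)


print(brightness_ratio(2.0))     # doubling the photon rate
print(intensity_ratio_for(2.0))  # photon-rate factor needed to look twice as bright
```

Doubling the intensity raises brightness by only about a third, and an intensity factor close to 5.7 is needed for a doubling in brightness.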
Human sensitivity is not linear either
Can distinguish intensity differences of 1%
Static images
Photopic (intensities bright enough for cones to see)
This corresponds to a linear change in brightness
$$\Delta B = k \cdot (1.01 \cdot I)^{0.4} - k \cdot I^{0.4}$$
$$\Delta B = k \cdot (0.01)^{0.4}$$
Motion matters
Numeric representation
Optimal numeric representation would arrange for adjacent intensities to be (barely) indistinguishable.
Thus optimal numeric representation is
nonlinear in intensity (relative differences of 1 percent)
but linear in brightness (absolute differences of $k \cdot (0.01)^{0.4}$)
Contrast ratio
Visible contrast:
4-5 orders of magnitude within a scene (at the same time)
6 orders of magnitude of “adaptation”
Can take up to 40 minutes, though
Bits   Delta I per step   Delta I total
8      1.0 %              12.6
8      1.8 %              100
10     0.7 %              1000
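The table entries are consistent with compounding a fixed relative step across every code. A sketch of that relationship, assuming 2^bits - 1 equal-ratio steps between the lowest and highest codes (function names are illustrative):

```python
def total_ratio(bits: int, step: float) -> float:
    """Overall intensity ratio covered when each code is (1 + step) times the previous."""
    return (1.0 + step) ** (2 ** bits - 1)


def step_for(bits: int, ratio: float) -> float:
    """Relative step per code needed to cover a given overall ratio."""
    return ratio ** (1.0 / (2 ** bits - 1)) - 1.0


print(f"{total_ratio(8, 0.01):.1f}")     # 8 bits at 1 % steps
print(f"{step_for(8, 100.0):.1%}")       # 8 bits covering 100:1
print(f"{step_for(10, 1000.0):.1%}")     # 10 bits covering 1000:1
```

Under this assumption, 8 bits at the 1 % threshold cover only about 12.6:1, and stretching 8 bits to 100:1 pushes the step well above the 1 % that can be distinguished.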
Solutions
Brightness-linear storage
Use linear arithmetic (get incorrect answers)
Use non-linear arithmetic (get correct answers)
Convert to intensity-linear, operate, convert back
Implement nonlinear arithmetic
Intensity-linear storage
Gamma correct (convert to brightness-linear form) when displaying
Brightness-linear storage
Intensities can be added, brightnesses cannot
Store image linear in brightness (unusual in 3-D systems)
Best use of available storage precision
256 representable levels are enough
Requires conversion for each pixel operation (e.g., blend)
[Diagram: 8-bit framebuffer, gamma converter (n-bit and 8-bit paths), DAC, display]
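The per-operation conversion can be sketched for a simple blend. This is an illustration rather than any particular hardware path: it assumes 8-bit brightness-linear codes related to intensity by a gamma of 2.4 (the value assumed on the "What is n ?" slide), and every function name is hypothetical.

```python
GAMMA = 2.4  # assumed relation between stored code and intensity


def decode(code: int) -> float:
    """8-bit brightness-linear code -> intensity in [0, 1]."""
    return (code / 255.0) ** GAMMA


def encode(intensity: float) -> int:
    """Intensity in [0, 1] -> nearest 8-bit brightness-linear code."""
    return int(255.0 * intensity ** (1.0 / GAMMA) + 0.5)


def blend_naive(a: int, b: int, alpha: float) -> int:
    """Linear arithmetic applied directly to the stored codes (incorrect)."""
    return int(alpha * a + (1.0 - alpha) * b + 0.5)


def blend_correct(a: int, b: int, alpha: float) -> int:
    """Convert to intensity-linear, blend, convert back."""
    return encode(alpha * decode(a) + (1.0 - alpha) * decode(b))


print(blend_naive(0, 255, 0.5), blend_correct(0, 255, 0.5))
```

An even mix of black and white lands near mid-code with naive arithmetic but noticeably higher when the blend is done on intensities, which is the error the slide warns about.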
Intensity-linear storage
Store image linear in intensity (typical in 3-D systems)
Native arithmetic format
Requires conversion during display
Large brightness steps at low intensities
256 DAC levels is OK, but frame buffer needs more
[Diagram: n-bit framebuffer -> gamma converter -> 8-bit DAC -> display]
What is n ?
Assume
8-bit DAC
Gamma of 2.4
Table input   Output n=8   Output n=10   Output n=12   Output n=14   Output n=16
2**n-1        255          255           255           255           255
2**n-2        254          255           255           255           255
…
3             40           22            13            7             4
2             34           19            11            6             3
1             25           14            8             4             2
0             0            0             0             0             0
$$\text{output} = 255 \left(\frac{\text{input}}{2^{n} - 1}\right)^{0.4166}$$
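The gamma-converter table can be regenerated from the formula, where 0.4166 is approximately 1/2.4. The sketch below assumes round-to-nearest when quantizing to the 8-bit DAC; the slide does not state its rounding convention, so entries that fall near a half step may differ by one. The function name is hypothetical.

```python
def gamma_table_entry(value: int, n: int, gamma: float = 2.4, dac_max: int = 255) -> int:
    """DAC code for an n-bit intensity-linear framebuffer value."""
    full_scale = 2 ** n - 1
    return int(dac_max * (value / full_scale) ** (1.0 / gamma) + 0.5)


for n in (8, 10, 12, 14, 16):
    low_end = [gamma_table_entry(v, n) for v in (0, 1, 2, 3)]
    print(f"n={n:2d}: inputs 0..3 -> {low_end}")
```

The low-end rows show the point of the slide: with n = 8 the first nonzero input already jumps a tenth of the way up the DAC range, and the jump shrinks as n grows.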
Display gamut
No finite set of primaries can reproduce the entire gamut. But more primaries do a better job.
Perception and Performance (adapted from my VR2004 keynote)
Latency
For an out-the-window display
100 to 150 milliseconds
For a head-mounted display
5 to 15 milliseconds**
Total response latency, sum of
Tracking/input delay, plus
Rendering delay, plus
Display delay
A 72 Hz display refreshes every 14 ms
** source: Fred Brooks
Latency solution
Reduce system latency to 5-15 ms range
Requires 2-4 ms frame time (250-500 Hz)
Assuming 3-frame latency
Estimated cost: 5x
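The link between frame rate and total latency can be sketched as follows, assuming latency is simply the 3-frame pipeline depth from this slide times the frame time (the function name is hypothetical):

```python
def total_latency_ms(frame_rate_hz: float, frames_of_latency: int = 3) -> float:
    """End-to-end latency when the pipeline is frames_of_latency frames deep."""
    return frames_of_latency * 1000.0 / frame_rate_hz


for rate in (72.0, 250.0, 500.0):
    print(f"{rate:5.0f} Hz -> {total_latency_ms(rate):.1f} ms")
```

Under this assumption, 250 to 500 Hz lands inside the 5 to 15 ms head-mounted-display target, while a 72 Hz pipeline of the same depth is several times too slow.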
Running total
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
Stereo solution
Binocular disparity is a very strong visual cue
Must render separately for each eye
Occlusion
View-dependent lighting (e.g. reflections, specularity)
Alternatives tend to be hacks
Estimated cost: 2x
Running total
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
2x Stereo Two independent views
Incorrect retinal cue – blur gradient
Correct Incorrect
Focus cue solution
Multiple image plane display
Fixed relationship to viewer (e.g. head mounted)
Low resolution in depth
Non-occluding images with depth filtering
Separate left and right displays (2x cost already accounted)
Leverages 2D technology
Amounts to a 2.5D display
Cost estimate: 3x
Running total
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
2x Stereo Two independent views
3x Correct focus cues Multi-plane display
High Dynamic Range (HDR)
Human limitations
1,000,000:1 range of sensitivity
100,000:1 contrast within scene
Current displays
CRT 300:1 contrast ratio
LCD 1000:1 contrast ratio
SIGGRAPH 2003 ET
Sunnybrook Technologies
Numbers from Sunnybrook Technologies
Sunnybrook Technologies
Dual-density display
Conventional LCD panel in front (full-resolution)
White LED array used as back-light (~1/50 resolution)
Sunnybrook Technologies
Scattering masks low resolution LEDs
HDR solution
Requires 16-bit framebuffer components
Rendering
Blending
Full-scene anti-aliasing
Requires multi-resolution rendering
Full-resolution for LCD, corrected for back-lighting
Low-resolution for back-lighting
Estimated cost: 2x
Running total
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
2x Stereo Two independent views
3x Correct focus cues Multi-plane display
2x High dynamic range Multi-resolution rendering
Field of view
Human field of view (FOV)
Monocular: 160 deg (wide) x 135 deg (high)
Binocular: 200 deg (wide)
Binocular overlap: 120 deg (wide)
Typical screen FOV
55 deg (wide) x 41 deg (high)
Optical flow matters
“Women Go With the (Optical) Flow”, Desney S. Tan, Mary Czerwinski, George Robertson. http://research.microsoft.com/users/marycz/chi2003flow.pdf
FOV solution
Double horizontal FOV to 110 degrees
Double vertical FOV to 80 degrees
Cleverness to distribute resolution ?
e.g. cylindrical projection
Estimated cost: 5x
Pixels subtend different angles
Assumes planar display
[Plot: center-pixel and edge-pixel curves versus field of view (0 to 140); vertical axis 0 to 9]
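The geometry behind this slide can be sketched directly. The code below is an illustration under stated assumptions, not a reconstruction of the plotted data: a flat screen viewed on-axis, equal-width pixels, one pixel straddling the view axis, and a hypothetical function name.

```python
import math


def pixel_angles_deg(fov_deg: float, pixels: int) -> tuple[float, float]:
    """Angles (center, edge) in degrees subtended by a center and an edge pixel."""
    half_width = math.tan(math.radians(fov_deg) / 2.0)  # in units of viewing distance
    pitch = 2.0 * half_width / pixels                   # pixel width, same units
    center = 2.0 * math.atan(pitch / 2.0)               # pixel centered on the view axis
    edge = math.atan(half_width) - math.atan(half_width - pitch)
    return math.degrees(center), math.degrees(edge)


for fov in (20, 60, 90, 120, 140):
    center, edge = pixel_angles_deg(fov, pixels=1000)
    print(f"FOV {fov:3d} deg: center/edge angle ratio = {center / edge:.2f}")
```

For small pixels the ratio approaches 1 / cos^2(FOV / 2), so the mismatch grows quickly at wide fields of view. That is one reason to distribute resolution with something other than a single planar projection.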
Running total
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
2x Stereo Two independent views
3x Correct focus cues Multi-plane display
2x High dynamic range Multi-resolution rendering
5x Full field of view 110 deg (wide) x 80 deg (high)
Foveal resolution
Foveal sampling density is ½ arc min
Display pixel should subtend ½ arc min
Typical monitor pixel subtends 2 arc min
1600 pixels at (dist = width)
IBM T221 (aka Big Bertha) LCD Display
Resolution: 3840 (wide) x 2400 (high)
Dimensions: 19” (wide) x 12” (high)
Estimated cost: 15x
Running total
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
2x Stereo Two independent views
3x Correct focus cues Multi-plane display
2x High dynamic range Multi-resolution rendering
5x Full field of view 110 deg (wide) x 80 deg (high)
15x Foveal resolution ½ arc min
Full-scene antialiasing
SAGE
Render
16 sample / pixel
Reconstruction
5x5 pixel filter
400 samples / pixel
~1000 FLOPs / pixel
Estimated cost: 5x
“The SAGE Graphics Architecture”, Michael Deering and David Naegle, Proceedings of SIGGRAPH 2002
Running total
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
2x Stereo Two independent views
3x Correct focus cues Multi-plane display
2x High dynamic range Multi-resolution rendering
5x Full field of view 110 deg (wide) x 80 deg (high)
15x Foveal resolution ½ arc minute
5x Full-scene AA 16 samples / pixel, 5x5 pixel filter
Let’s sum it all up
Cost Feature Notes
5x Low latency Frame rate 250-500 Hz
2x Stereo Two independent views
3x Correct focus cues Multi-plane display
2x High dynamic range Multi-resolution rendering
5x Full field of view 110 deg (wide) x 80 deg (high)
15x Foveal resolution ½ arc minute
5x Full-scene AA 16 samples / pixel, 5x5 pixel filter
22,500x
This will keep GPU vendors busy ...
Multiple   2.2 CAGR   2.0 CAGR   1.8 CAGR
1000       9 years    10 years   12 years
5000       11 years   12 years   15 years
10000      12 years   13 years   16 years
50000      15 years   16 years   18 years
22,500x
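The 22,500x total is the product of the per-feature cost estimates, and the time to reach it follows from compound growth. A sketch, assuming the CAGR column headings are per-year performance multipliers (so 2.0 means doubling every year); names are hypothetical.

```python
import math

# Per-feature cost multipliers from the running total.
COST_FACTORS = {
    "Low latency": 5,
    "Stereo": 2,
    "Correct focus cues": 3,
    "High dynamic range": 2,
    "Full field of view": 5,
    "Foveal resolution": 15,
    "Full-scene AA": 5,
}


def total_cost(factors: dict[str, int]) -> int:
    """Combined multiplier, treating the features as independent."""
    return math.prod(factors.values())


def years_to_reach(multiple: float, annual_growth: float) -> float:
    """Years of compound growth needed to gain the given performance multiple."""
    return math.log(multiple) / math.log(annual_growth)


target = total_cost(COST_FACTORS)
for growth in (2.2, 2.0, 1.8):
    print(f"{target}x at {growth}x per year: {years_to_reach(target, growth):.1f} years")
```

Under this reading, the combined multiplier takes somewhere between about 13 and 17 years to reach, depending on the growth rate.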
Vision research
Almost all experimental vision research is now done using computer graphics
There are research opportunities in this area:
http://bankslab.berkeley.edu/
Purpose of computer graphics?
Communication is the purpose
Human perception is the context
Techniques leverage visual perception abilities
Fidelity is a tool, not (necessarily) the goal
Virtual reality is great, but
Don’t want to be limited to reality
Want to do super reality
Non-photorealistic rendering (NPR) is valuable
– Bill Buxton, Sketching User Experiences, 2006
No apology is required for “approximations”
Especially for interactive graphics
Summary
Rule 1: All discontinuous frame-to-frame changes correspond to discontinuous scene or visibility changes
Assignments
Next lecture: Computational Photography (Marc Levoy)
Reading assignment: none (work on your projects)
End