Except for the historical information contained herein, certain matters set forth in today’s presentation
including, but not limited to, statements as to: visual computing; the discrete GPU, integrated graphics
and mobile devices; optimization of the PC; our strategies; our growth and growth factors; market
opportunities; the features, benefits, capabilities and performance of our current and future products
and technologies and consumer demands and expectations as well as other predictions and estimates
are forward-looking statements within the meaning of the Private Securities Litigation Reform Act of 1995. These
forward-looking statements and any other forward-looking statements that go beyond historical facts
that are made today during the presentation, demonstrations or in response to questions are subject to
risks and uncertainties that may cause actual results to differ materially. For a complete discussion of
risk factors that could affect our present and future results, please refer to our Annual Report on Form
10-K for the fiscal year ended January 27, 2008 and to the reports we file from time to time with the
Securities and Exchange Commission. All forward-looking statements are made as of today, based on
the information currently available to us. Except as required by law, we assume no obligation to update
any such statements.
Safe Harbor
First, graphics that we have all come to know and
love today, I have news for you. It's coming to an
end. Our multi-decade old 3D graphics rendering
architecture that's based on a rasterization
approach is no longer scalable and suitable for the
demands of the future.
Pat Gelsinger keynote, IDF Shanghai
The Graphics Industry
[Chart: CPU shipments, 273 million; GPU shipments, 366 million. Segment labels: GPU, Intel IGP. Shares shown: 59%, 41%.]
Source: Mercury Research
The Graphics Industry
[Chart: CPU shipments, 273 million, split 73% and 27%. Segment labels: Intel IGP, GPU. Callout: 73 Million Idle IGPs.]
Source: Mercury Research
Probably be no need [to purchase a dedicated
graphics card in a short while].
Ron Fosner, Intel Graphics and Gaming
Technologist, Shanghai IDF, 2008
[Chart: 3DMark06, Doom3, and CoD4 results (axis 0 to 50) for Intel 965g, Intel G35, GeForce 9500GT, GeForce 9600GT, and GeForce 9800GTX. Callouts: "10x by 2010" and "FAIL". Test system: Core 2 Duo E6550, No AA, No Aniso Filter.]
Intel’s integrated graphics just don't work. I don't think they will
ever work.
All the Intel integrated graphics are still incapable of running any
modern games.
Industry Truth: The Creator of Unreal
Tournament 2004 says Intel is “incapable of
running modern games.”
Tim Sweeney
President Epic Games
March 10, 2008
Intel GMA 3100 Can’t Properly Run 2/3 of Top Selling Games
Unplayable or Games with Problems
Sims2
Sim City 5
Call Of Duty 4
Half Life 2
Civilization IV
Enemy Territory: Quake Wars
Crysis
Battlefield 2
Command & Conquer 3
Unreal Tournament 3
Battlefield 2142
The Witcher
Bioshock
Halo: Combat Evolved
Guild Wars: Nightfall
Medieval II: Total War
World In Conflict
Heroes Of Might & Magic V
Supreme Commander
Configuration:
Microsoft Vista 32-bit
Intel G33: Intel 15.8.0.1437 drivers
Core2 Duo, 2GB Memory
*based on NPD retail sales reports
Multi-core processors [could] handle life-like
animations, such as weather or effects better
than dedicated GPUs. For instance, multi-core
processors can handle the graphics tasks in a
better manner than a high-end graphics board
could ever do.
Ron Fosner, Intel Graphics and Gaming Technologist,
Shanghai IDF, 2008
GPU Delivers 27x the Bang For Buck
[Chart: relative 3DMark06 performance (0% to 3000%) versus average total GPU + CPU spend ($113, $163, $240, $342, $999). Baseline: GMA 3100 with Core 2 Duo E4400. Series "Upgrade CPU, Graphics Constant": Core 2 Duo E6550, Core 2 Quad Q6600, Core 2 Quad QX9650. Series "Upgrade GPU, CPU Constant": GeForce 8400 GS, GeForce 8600 GT, GeForce 8800 GT.]
Benchmark run at 1280x1024, 4xAA/8x AF.
See Appendix for system specs.
Discrete Notebook Unit Share
[Chart: FQ1'08 through FQ4'09; axes 0% to 100% and 0 to 12,000; legend text: NVIDIA TAM Market Share.]
Historical data from Mercury Research
FQ1’09 onward NVIDIA estimates
Discrete Notebook Revenue Share
[Chart: FQ1'08 through FQ4'09; axes 0% to 100% and $0 to $300; legend text: NVIDIA TAM Market Share.]
Historical data from Mercury Research
FQ1’09 onward NVIDIA estimates
Shockwaves
                         Gateway P-6831 FX    Dell XPS M1530     HP dv9700t           Sony CR490EBR
Price                    $1249                $1299              $1289                $1324
CPU                      1.67GHz Core 2 Duo   2GHz Core 2 Duo    1.83GHz Core 2 Duo   2.5GHz Core 2 Duo
GPU                      8800 GTS             8400 GS            8400 GS              Intel Integrated
Performance (3DMark06)   8000                 1600               1600                 500
3DMark06 estimates based on NVIDIA testing of similar graphics configurations
Prices available online
“… features exceptional graphics performance
as its main focus …”
“ … to give you high definition movie pleasure
and smooth gaming fun …”
“… superior graphic quality and HDMI display output
allows for entertainment enjoyment in high definition.”
Source: ASUS
GPU Growth Driven by Insatiable Visual Computing Demand
[Chart: relative revenue growth, 2005 to 2007 (axis 80% to 120%). CPU: 7% decline. GPU: 14% growth.]
Source: Mercury Research
Heterogeneous Computing
Multi-Core plus Many-Core
                Core 2 Duo E8400    GeForce 8800 GTS
                (Dual Core)         (Many-Core)
# of Cores      2                   128
# of GFLOPS     48                  576
Isotropic turbulence simulation in Matlab using .mex file CUDA function [4]
Transcoding HD video stream to H.264 for portable video [3]
Ionic placement for molecular dynamics simulation on GPU [2]
Interactive visualization of volumetric white matter connectivity [1]
Ultrasound medical imaging for cancer diagnostics [8]
Astrophysics nbody simulation [5]
Financial simulation of LIBOR Model with swaptions [6]
Cmatch exact string matching to find similar proteins and gene sequences [10]
Highly optimized object oriented molecular dynamics [9]
GLAME@lab: An M-script API for Linear Algebra Operations on GPU [7]
The Power of Heterogeneous Computing
Computer Industry Luminaries Driving the
Heterogeneous Computing Movement
“One reason to have different ‘sized’ processors in a many-core architecture is to
improve parallel speedup ....” - David Patterson, Berkeley
“Many applications can run orders of magnitude faster on a heterogeneous CPU+GPU
system today. CUDA has been shown to be a very effective programming model for
heterogeneous computing.” - Wen-Mei Hwu, University of Illinois at Urbana-Champaign
CUDA GPUs
Oil & Gas Finance Medical Biophysics Numerics Audio Video Imaging
Heterogeneous Computing
[Figure: CPU (Multi-Core) and GPU (Many-Core) lineages. Systems shown: Intel 4004, DEC PDP-1, ILLIAC IV, IBM System 360, Cray-1, VAX, Maspar, Thinking Machines, Blue Gene, IBM POWER4.]
Numerics Engine: FFT, BLAS, CuDPP
CUDA Compiler: C, Fortran
CUDA Tools: Debugger, Profiler
System: PCI-E Switch, 1U
Application Software
Industry Standard C Language
4 cores
nbody Demo on CPU and GPU
[Chart: axis 0 to 300; bars for CPU (1, 2, 3, and 4), mGPU, Mobile GPU, and Desktop GPU.]
Source: NVIDIA
See appendix for system specs
Power Performance - VMD
http://www.hardware.fr/articles/678-8/nvidia-cuda-plus-pratique.html
[Chart: billions of evaluations per watt (axis 0 to 18) for V8, QX6850, E6850, 3x8800 GTX, 2x8800 GTX, 8800 Ultra eVGA, 8800 Ultra, 8800 GTX, 8800 GTS, 8600 GTS, 8600 GT, 8500 GT, and 8400 GS.]
AutoDock: Software Used for Drug Discovery
AutoDock:
Finds ways to fit or dock small molecules into proteins
Tries millions of different configurations
Used for virtual screening of new drug leads
Determine if drug molecules can dock into proteins of bacteria
Like finding the right key out of a pile of millions of different keys
Used by thousands of institutions worldwide
Authored by Scripps Research Institute
Accelerating AutoDock with CUDA
Silicon Informatics created siAutoDock
Implemented key kernels in CUDA
National Cancer Institute reports 12x speedup
Run went from 2 hours to 10 minutes
Speed-up expected to scale linearly with multiple GPUs
“We can only hope that in the long run, Silicon Informatics' efforts will accelerate
the discovery of new drugs to treat a wide range of diseases, from cancer to
Alzheimer's, HIV to malaria.” Dr. Garrett Morris, Scripps, Author of AutoDock
John Michalakes
Lead Software Developer
Weather Research and Forecast (WRF) Model
National Center for Atmospheric Research
Weather Modeling
One of the first HPC applications
Continues to have highest direct public impact
Accurate & timely severe storm/hurricane prediction
Air pollution & dispersion modeling at urban scales
Ever hungry for cycles
5% of Top500® systems dedicated to weather / climate
Weather modeling requires
Solving bigger problem sizes -- solved today by building bigger clusters
Solving problems faster -- improvements tailing off with conventional processors
WRF Hurricane Katrina Forecast
4km resolution moving west
CUDA Results
Weather Research and Forecast (WRF) model
4000+ registered users worldwide
First-ever release with GPU acceleration
Adapted 1% of WRF code to CUDA
Resulted in 20% overall speedup
Ongoing work to adapt larger percentage of
WRF to CUDA
12km CONUS WRF benchmark
Running on NCSA CUDA cluster
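The two WRF figures above (1% of the code adapted, 20% overall speedup) can be related through Amdahl's law. The sketch below is only a reading aid: it assumes "20% overall speedup" means the run became 1.2x faster, and it computes the smallest share of the original run time that the adapted code could have occupied. The function name is hypothetical and nothing here comes from the WRF project itself.

```python
def min_accelerated_fraction(overall_speedup: float) -> float:
    """Smallest runtime fraction p consistent with a given overall speedup.

    Amdahl's law: overall = 1 / ((1 - p) + p / s), where s is the speedup
    of the accelerated part. Even if s were unbounded, overall could not
    exceed 1 / (1 - p), so p must be at least 1 - 1 / overall.
    """
    if overall_speedup < 1.0:
        raise ValueError("overall_speedup must be >= 1.0")
    return 1.0 - 1.0 / overall_speedup


# Assumption: "20% overall speedup" is read as a 1.2x ratio.
wrf_fraction = min_accelerated_fraction(1.2)
print(f"Adapted code covered at least {wrf_fraction:.1%} of run time")
```

Under that reading, the 1% of source code that was adapted must have accounted for roughly one sixth or more of the original run time, which is why porting a small hot spot can pay off.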
SPICE Simulation
SPICE: Among oldest software tools in electronic design automation
Fundamental to semiconductor circuit design & verification
Simulates transistors, interconnects & parasitics in a chip
NVIDIA’s chips have close to 1 billion transistors
Demand for speedup has led to variants with less accuracy
FastSPICE: faster, but less accurate
Accurate SPICE simulation takes weeks to months on a workstation
OmegaSim and OmegaSim GX
OmegaSim: Parallel CPU implementation of SPICE models by Nascentric
OmegaSim GX: GPU implementation in CUDA
Speeds up transistor evaluation ~40x
Up to 90% of SPICE execution time spent in transistor evaluation
8x overall speedup
[Figure: OmegaSim GX on 4 CPUs + 4 GPUs is faster than OmegaSim on 32 CPUs.]
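The OmegaSim GX figures above are consistent with Amdahl's law: if transistor evaluation takes 90% of execution time and is sped up about 40x, the overall speedup is limited by the remaining 10%. The following is a minimal sketch of that arithmetic, assuming the rest of the run is unchanged; the function name is illustrative only.

```python
def amdahl_speedup(fraction: float, kernel_speedup: float) -> float:
    """Overall speedup when `fraction` of run time is sped up by `kernel_speedup`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / kernel_speedup)


# Figures quoted on the slide: 90% of time in transistor evaluation, ~40x faster.
overall = amdahl_speedup(0.90, 40.0)
print(f"Overall speedup: {overall:.2f}x")  # prints "Overall speedup: 8.16x"
```

The result, a little over 8x, matches the "8x overall speedup" quoted on the slide. It also shows the ceiling: with 10% of the run untouched, no kernel speedup can push the total past 10x.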
Growing Market
Requirements Exceed CPU Capacity
[Chart: Growing TAM, 2007 to 2011 (axis $0 to $1B), segments Design and Fab/IDM; values called out: $240M and $800M.]
$10B in Correctable Yield Losses
Gauda’s Solution: 200x Faster and Lower Cost
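[Figure: Time-to-Money. Revenue over a typical 1-year life cycle; a 3-month time-to-market delay is shown with revenue levels of 100% and 60%.]
[Figure: time (hours to days) versus cost ($100K, $1M, $10M) for CPU and FPGA; callouts: 1000's of CPUs, 10's of GPUs.]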
Quantitative Finance Dilemma
HANWECK ASSOCIATES, LLC
As financial products and processes have grown in complexity...
...their computational needs have become more demanding...
• Credit Default Swaps (CDS)
• Collateralized Debt Obligations (CDOs)
• Asset/Mortgage-Backed Securities (ABS/MBS)
• Structured Finance
• Algorithmic Trading
• High-Frequency Trading
• Program Trading
• Risk Management
• Asset Valuation
• Monte Carlo simulations
• Binomial / trinomial trees & lattices
• Numerical integration
• Matrix algebra
• Numerical optimization
• Finite-difference / finite-element methods
• Digital signal processing
• Real-time data processing
...and compute-related resources are at a premium:
• Shrinking IT budgets
• Limited availability of server rack space
• Energy costs for servers and cooling
• Productivity pressure
• Regulatory pressure
• “Green” pressure
The GPU Solution
Raw computational speed for improved productivity:
• A current generation NVIDIA Tesla GPU has 128 floating-point cores.
• 50-100x performance increase over a single CPU core with NVIDIA GPU
Option pricing (binomial tree): 50x speedup (1.25m/sec vs. 25k/sec)
Monte Carlo simulation: 100x speedup (30m/sec vs. 300k/sec)
Numerical integration (Heston model): 100x speedup (5k/sec vs. 50/sec)
Increased efficiency in real-estate and power consumption:
• The GPU’s higher performance translates to:
Lower up-front hardware spend
Lower IT charges for rack space and maintenance
Lower power consumption
Lower operating costs
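For readers unfamiliar with the binomial tree pricing mentioned above, here is a minimal sketch of the Cox-Ross-Rubinstein method for a European call, written in plain Python. It is not Hanweck's implementation, and all input values are hypothetical. The point of interest is the backward-induction loop: within one time step every node is computed independently from the next step's values, which is the kind of work that maps well to many parallel cores.

```python
import math


def crr_european_call(spot, strike, rate, vol, maturity, steps):
    """Price a European call with a Cox-Ross-Rubinstein binomial tree."""
    dt = maturity / steps
    up = math.exp(vol * math.sqrt(dt))      # up-move factor per step
    down = 1.0 / up                         # down-move factor per step
    growth = math.exp(rate * dt)            # risk-free growth per step
    prob_up = (growth - down) / (up - down) # risk-neutral up probability
    discount = 1.0 / growth

    # Option payoffs at maturity, one per terminal node (j up-moves).
    values = [
        max(spot * up ** j * down ** (steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]

    # Backward induction: each node depends only on two nodes of the
    # later step, so all nodes in a step can be evaluated in parallel.
    for step in range(steps, 0, -1):
        values = [
            discount * (prob_up * values[j + 1] + (1.0 - prob_up) * values[j])
            for j in range(step)
        ]
    return values[0]


# Hypothetical inputs: at-the-money call, 5% rate, 20% volatility, 1 year.
price = crr_european_call(100.0, 100.0, 0.05, 0.20, 1.0, 500)
print(f"Call price: {price:.4f}")
```

Exchange-listed equity options are often American-style, which adds an early-exercise check at each node; the sketch omits that to stay short.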
Savings Case Study
Case Study: Hanweck Associates Volera™ real-time option valuation engine
Capable of valuing the entire U.S. listed options market in real-time
using 3 NVIDIA Tesla S870’s
                        GPUs        CPUs          GPU savings
Number of Processors    12          600
Rack Space              6U          54U           9x
Hardware Spend          $42,000     $262,000      6x
Annual Cost             $140,000    $1,200,000    9x
Figures assume:
• NVIDIA Tesla S870s with one 8-core host server per unit
• CPUs are 8-core blade servers; 10 blades per 7U
• $1,800/U/month rack and power charges
• 5-year depreciation
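The annual cost figures above can be approximately reproduced from the stated assumptions. The sketch below assumes annual cost is rack and power charges plus straight-line depreciation of the hardware spend; the slide does not state its formula, so that reading and the function name are assumptions.

```python
RACK_AND_POWER_PER_U_PER_MONTH = 1_800  # from the slide's assumptions
DEPRECIATION_YEARS = 5                  # from the slide's assumptions


def annual_cost(rack_units: int, hardware_spend: float) -> float:
    """Yearly rack/power charges plus straight-line hardware depreciation (assumed formula)."""
    rack_and_power = rack_units * RACK_AND_POWER_PER_U_PER_MONTH * 12
    depreciation = hardware_spend / DEPRECIATION_YEARS
    return rack_and_power + depreciation


gpu_annual = annual_cost(6, 42_000)     # 6U, $42,000 hardware
cpu_annual = annual_cost(54, 262_000)   # 54U, $262,000 hardware

print(f"GPU annual cost: ${gpu_annual:,.0f}")          # $138,000
print(f"CPU annual cost: ${cpu_annual:,.0f}")          # $1,218,800
print(f"Savings ratio:   {cpu_annual / gpu_annual:.1f}x")
```

Under this reading the results come to about $138,000 and $1,218,800, close to the slide's rounded $140,000 and $1,200,000, and the ratio rounds to the quoted 9x. Rack and power charges dominate both totals, so the savings track the 9x difference in rack space more than the 6x difference in hardware spend.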
RapiHD™ Video Platform: Software that harnesses the GPU
• First company to leverage key GPU technology trends for video
RapiHD™ eliminates the need for specialized hardware
• Disruptive technology for the entire video industry
Elemental’s Solution
Why the GPU?
Elemental recognized three key trends:
1. GPUs have become much more programmable
2. GPUs have become immensely powerful
3. CPU and GPU communication is no longer a bottleneck
[Chart axis: GIGAFLOPS]
Architectural Fit
CUDA GPUs will revolutionize video processing
Video compression divides frames into blocks of pixels
CPU processes these serially; GPU processes them in parallel
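The block-based structure described above can be illustrated with a small sketch. It splits two tiny synthetic frames into blocks and computes a per-block sum of absolute differences (a cost commonly used in motion estimation), first in a serial loop and then through a worker pool. Everything here is hypothetical: the 4x4 block size, the synthetic frames, and the thread pool, which merely stands in for the GPU's parallel execution and is not how RapiHD or CUDA works internally.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4  # hypothetical block size for this toy example


def split_into_blocks(frame, block=BLOCK):
    """Return the frame's pixel blocks in row-major order."""
    height, width = len(frame), len(frame[0])
    return [
        [row[left:left + block] for row in frame[top:top + block]]
        for top in range(0, height, block)
        for left in range(0, width, block)
    ]


def block_cost(pair):
    """Sum of absolute differences between two co-located blocks."""
    current_block, reference_block = pair
    return sum(
        abs(a - b)
        for cur_row, ref_row in zip(current_block, reference_block)
        for a, b in zip(cur_row, ref_row)
    )


def make_frame(width, height, offset):
    """Synthetic grayscale frame; each pixel is (x + y + offset) mod 256."""
    return [[(x + y + offset) % 256 for x in range(width)] for y in range(height)]


current = make_frame(8, 8, offset=1)
reference = make_frame(8, 8, offset=0)
pairs = list(zip(split_into_blocks(current), split_into_blocks(reference)))

# Serial: one block after another, as on a single CPU core.
serial = [block_cost(pair) for pair in pairs]

# Parallel: blocks are independent, so they can be handed to separate workers.
with ThreadPoolExecutor() as pool:
    parallel = list(pool.map(block_cost, pairs))

print(serial, parallel)
```

Both paths produce the same per-block costs. Because no block's result depends on another's, the work can be spread across as many execution units as are available.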
                                            Core 2 Duo T5450    GeForce 8800 GTS
                                                                (Many Core)
Number of cores                             2                   128
Blu-ray HD decoding (CPU utilization)       > 100%              ~ 28%
H.264 encoding (normalized performance)     1                   19x
Heterogeneous Computing
Video Processing
Source: NVIDIA
See appendix for details
Era of Visual Computing
Eras: Fixed-Function Pipelines (“3D Accelerators”); Programmable Shaders (DX8 – DX9 – DX10); Programmable Graphics (CUDA – DX11)
Milestones: RIVA 128 (August 1997); GeForce 3 (2001); GeForce 6 (2005); G80 / CUDA (2007); Tesla 2 (2008); Next Gen (2009)
Fracture
Indirect Lighting
Optical Complexity
Fluids
Ambient Occlusion
Participating Media
Soft Shadows
Subsurface Scatter
Caustics
Detailed Characters
Rich Environments
mental ray Photorealism in Motion Pictures
SPEED RACER
Image rendered with mental ray® by Digital Domain
© 2008 Warner Brothers. All Rights Reserved.
“So far I haven’t seen a compelling example for using pure classical
ray tracing… Both methods [rasterization and ray tracing] have
been in the race for some time, but rasterization is significantly
ahead based on real world efficiencies.”
Cevat Yerli, Crytek
“Head to head rasterization is just a vastly more efficient use of
whatever transistors you have available.”
John Carmack, id Software
http://www.pcper.com/article.php?aid=532
The Future of Visual Computing
Programmable and Specialized Processing
Graphics and C/C++
Ray Tracing and Rasterization
Rendering and Simulation
…Evolution
Industry’s Most Advanced Physics Engine
Powerful Core Physics Engine
Rigid Body Dynamics
Collision Detection
Anti-tunneling, Joints, Springs and Motors
Advanced Dynamics
Cloth
Metallic Deformation
Soft Bodies
Force Fields
Physics Shaders
Smooth Particle Hydrodynamics
                Core 2 Quad      GeForce 9800 GTX
                (Quad Core)      (Many Core)
# of Cores      4                128
Particles       1                20x
Fluid           1                6x
Soft Bodies     1                5x
Cloth           1                5x
Heterogeneous Computing
Physics Processing
Source: NVIDIA estimates
See appendix for system specs
GPU PhysX
Complete port in one month via CUDA!
Rapid CUDA adoption by GPU PhysX ecosystem
Exponential increase in developer adoption
[Chart: axis 0 to 250.]
Power for Playing Games, Cool Operation When Watching a Movie
[Chart: total system power consumption over time, on a scale from Low Power to High Performance; activities shown: Surfing the web, Playing Bluray, Playing a Game.]
Source: NVIDIA
See Appendix for system specs
The World’s Most Affordable Vista Premium PC
             Via CN + GeForce    Celeron + G945 + ICH4
Cores        1+8 cores           1 core
GFLOPS       36                  6.4
Cost         <$45                <$45
Feature rows: Vista Premium, Blu-ray HD, DX10
Source: See Appendix
Enabled by computing
technologies
Architected for extreme
low power
Uncompromising
computing and visual
experience
The Next Personal Computer Revolution
is Starting
NVIDIA’s Computer on a Chip
500 man years and culmination
of 15 years of innovation
Most advanced ultra-low power
computer ever built
The mobile device will become
our most personal computer
GPU - Poised to be a Disruptive Technology
2007 PC TAM
$34B
2012 PC TAM
$53B
Source: Mercury Research, NVIDIA
Appendix I
Slides 22-24. Benchmarks run on Asus P5K-V motherboard (Intel G33 based) with 2GB DDR2 system memory using Windows Vista Ultimate. Intel chipset
driver is 17.14.10.1283. NVIDIA graphics driver is 174.00.
Slide 41. Sources:
1. Interactive Visualization of Volumetric White Matter Connectivity in DT-MRI Using a Parallel-Hardware Hamilton-Jacobi Solver paper by Won-Ki Jeong, P.
Thomas Fletcher, Ran Tao and Ross T. Whitaker
2. GPU Acceleration of Molecular Modeling Applications paper.
3. Video encoding uses iTunes on the CPU, and Elemental on the GPU running under Windows XP. CPUs tested were Intel Core 2 Duo 1.66GHz and Intel
Core 2 Quad Extreme 3GHz. GPUs tested were GeForce 8800M on the Gateway P-Series FX notebook, and GeForce 8800 GTS 512MB. CPUs and
GeForce 8800 GTS 512 were run on Asus P5K-V motherboard (Intel G33 based) with 2GB DDR2 system memory. Based on an extrapolation of 1 min
50 sec 1280x720 HD movie clip.
4. http://developer.nvidia.com/object/matlab_cuda.html
5. High performance direct gravitational nbody simulations on graphics processing units paper. Communicated by E.P.J. van den Heuvel
6. LIBOR paper by Mike Giles and Su Xiaoke.
7. FLAG@lab: An M-script API for Linear Algebra Operations on Graphics Processors paper
8. http://www.techniscanmedicalsystems.com/
9. General Purpose Molecular Dynamics Simulations Fully Implemented on Graphics Processing Units paper by Joshua A. Anderson, Chris D. Lorenz and
A. Travesset
10. Fast Exact String Matching On the GPU presentation by Michael C. Schatz and Cole Trapnell
Slide 51. Benchmarks:
1. Nbody on GeForce 8800GTS 512MB and nbody on the CPU both ran on a Sun Ultra24 workstation with one Intel Core2 Extreme QX6850 3GHz, with
3GB memory.
2. Nbody on motherboard graphics benchmark was run on GeForce 8200 chipset-based motherboard. CPU was AMD Phenom 9600 Quad-Core 2.3GHz
with 2GB memory.
Appendix II
Slide 77. Benchmarks:
1. “Art of Disney” sees consistent dropped frames with CPU decode, indicating that decode requires more CPU power than is available. CPU
usage in same system is only 28% when GPU performs decode. System specs: Intel Core 2 Duo T5550 (1.8GHz), GeForce 8800 GTS, 2 GB
DRAM on an Intel 965-based motherboard.
2. 19x transcode result based on comparing iTunes on a Core 2 Duo T5450 (1.67GHz) versus Elemental Technology's RapiHD using GeForce
8800 GTS, both running under Windows XP. At present audio is not being performed by RapiHD. Source video was the same for both, a
1min:50s clip, 1280x720p MPEG2 transcoded to MPEG4 for iPod. Same resulting resolution (640x360) and the same data rates and frames
per second in both. Audio decode is a very small workload compared to HD video decode, and its omission from part of this test is not likely to
make a material difference to the result.
Slide 91. Benchmarks:
1. PhysX CPU estimates based on a system with a Core 2 Quad Q6700.
2. GPU estimates based on PhysX running on one GeForce 9800 GTX with the graphics rendering on a separate GeForce 9800 GTX
3. Both running Windows XP Pro
Slide 95. Benchmarks run on: Phenom 9500 quad core, GeForce 9800GX2, 024MB Corsair DDR2-800, Seagate 7200.10 160GB, Vista
Enterprise, NVIDIA 174.91 and nForce 18.11 drivers.
Slide 97. As of June 2008, “Vista Premium” certification will require DirectX 10 support; 945G is not DirectX 10 capable. Blu-ray playback requires
HDCP support not present in 945G chipset. Market prices based on checks with customers.
Legal information
Performance tests and ratings are measured using specific computer systems and/or
components and reflect the approximate performance of NVIDIA products as measured by
those tests. Any difference in system hardware or software design or configuration may
affect actual performance. Buyers should consult other sources of information to
evaluate the performance of systems or components they are considering purchasing.
Copyright © 2008 NVIDIA Corporation. All rights reserved.
NVIDIA, The NVIDIA logo, GEFORCE, HYBRIDPOWER, SLI, TESLA, CUDA, TEGRA and
PUREVIDEO are trademarks or registered trademarks of NVIDIA Corporation in the U.S.
and/or other countries.
Other names and brands may be claimed as the property of others.