Post on 01-Apr-2015
CUDA GPU Accelerates Computing: The Right Processor for the Right Task
CUDA GPU: Hundreds of parallel cores
CPU: Several sequential cores
How CPUs and GPUs Work
void serial_function(… ) { ... }
__global__ void parallel_function(float ... ) { ... }
int main( ) { ... parallel_function<<<...>>>(..); serial_function(..); ... }
CUDA Application Code
Heavy parallel workload on the GPU
Other serial routines on the CPU
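The split described on this slide can be sketched as a small, complete CUDA program. This is only an illustration under stated assumptions: the names scale_kernel, scale_on_gpu, and serial_sum are hypothetical and do not come from the slides, and the workload (scaling an array, then summing it) stands in for whatever the elided functions actually do.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Parallel part (hypothetical example): each GPU thread scales one element.
__global__ void scale_kernel(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;  // guard against threads past the end
}

// Host wrapper (hypothetical helper): copy in, launch the kernel, copy out.
// Returns false if any CUDA call reports an error.
bool scale_on_gpu(std::vector<float> &host, float factor) {
    int n = static_cast<int>(host.size());
    if (n == 0) return true;
    size_t bytes = n * sizeof(float);
    float *dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;
    bool ok = cudaMemcpy(dev, host.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        int threads = 256;                         // threads per block
        int blocks = (n + threads - 1) / threads;  // enough blocks to cover n
        scale_kernel<<<blocks, threads>>>(dev, factor, n);
        // The blocking device-to-host copy also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(host.data(), dev, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dev);
    return ok;
}

// Serial part (hypothetical example): an ordinary CPU loop.
double serial_sum(const std::vector<float> &v) {
    double s = 0.0;
    for (float x : v) s += x;
    return s;
}

int main() {
    std::vector<float> data(1 << 20, 1.0f);
    if (!scale_on_gpu(data, 2.0f)) {               // heavy parallel work on the GPU
        std::fprintf(stderr, "CUDA step failed (is a CUDA-capable GPU available?)\n");
        return 1;
    }
    std::printf("sum = %.1f\n", serial_sum(data)); // serial routine on the CPU
    return 0;
}
```

Assuming the file is saved as example.cu, it can be built with nvcc example.cu -o example. The kernel is marked __global__ and launched with the triple-angle-bracket syntax, while the serial routine is plain C++ that runs on the host.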
GPU Performance Far Outstrips CPUs
[Chart: peak double-precision FP GFLOPS by year, 2007 to 2012 (y-axis 0 to 1200). Double Precision: NVIDIA GPU series: M1060, Tesla M2070, Tesla M2090, Kepler. Double Precision: x86 CPU series: Nehalem 3 GHz, Westmere 3 GHz, 8-core Sandy Bridge 3 GHz.]
Much More Power Efficient
[Chart: Gigaflops (axis 0 to 2500) and Megawatts (axis 0 to 8) for Tianhe-1A, Jaguar, Nebulae, Tsubame 2.0, and Hopper II. Three systems are labeled Tesla GPUs and two are labeled CPU Only.]
Transformational for Customers
4 Months 2 Years
$1M Budget
~120 Compute Nodes
~45 CPU + GPU Nodes
Time to Discovery
#1 Numerical Computation: MATLAB
#1 Molecular Dynamics: AMBER
#1 Engineering Simulation: ANSYS
#1 3D DCC: 3ds Max
Leading HPC Applications Ramping
Status legend: GPU Port Complete; GPU Port Started & Results Published; Early GPU Port
Molecular Dynamics: NAMD, AMBER, DL_POLY, CHARMM, GROMACS
Chemistry: LAMMPS, MOLPRO, GAMESS, CPMD, DESMOND, GAUSSIAN
Fluid Dynamics: OpenFOAM, S3D, FEFLO, ANSYS CFD
Structural Mechanics: ANSYS Mechanical, Simulia Abaqus/Std (beta), CTH, LS-DYNA Implicit, MSC Marc, PAM-CRASH IMPLICIT, MSC Nastran
Earth Science: WRF, HOMME, HYCOM, COSMO-2, PFLOTRAN, ASUCA, CCSM/CESM
Material Science: PARATEC, LAMMPS, PWscf, CPMD, VASP, GAUSSIAN
Oil & Gas: Schlumberger Petrel, VoxelGeo, Paradigm
Analytics: Matlab, Mathworks
Others: GADGET2, GTC, MILC, DENOVO
CUDA Taking HPC by Storm
[Chart: NVIDIA GPGPU papers and articles by year, 2005 to 2009 (y-axis 0 to 3000)]
100,000 Active GPU Developers
400 Universities Teaching CUDA
1000+ Clusters Worldwide
35+ CUDA Research Centers
200,000,000 CUDA GPUs Deployed
100% OEMs offer CUDA GPUs