Date posted: 08-Sep-2014
Category: Technology
Uploaded by: can-ozdoruk
TESLA K40 ACCELERATE COMPUTING FOR LARGE DATA SETS
Tesla K40
FASTER: 1.4 TF | 2880 Cores | 288 GB/s
[Chart: AMBER Benchmark, ns/day (axis 0 to 5), for CPU, K20X and K40]
AMBER Benchmark: SPFP-Nucleosome. CPU: Dual E5-2687W @ 3.10GHz, 64GB System Memory, CentOS 6.2. GPU systems: Single Tesla K20X or Single Tesla K40.
SMARTER: Unlock Extra Performance Using Power Headroom
LARGER: 2x Memory Enables More Apps (6GB to 12GB)
Fluid Dynamics
Seismic Analysis
Rendering
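The headline figures on this slide (1.4 TF, 2880 cores, 288 GB/s) can be roughly reproduced from the GPU's layout. The sketch below is a back-of-the-envelope check, not an official derivation. It assumes the 1.4 TF figure refers to double precision, and it takes the SMX count, units per SMX, bus width and memory data rate from NVIDIA's published K40 specifications rather than from this deck.

```python
# Back-of-the-envelope check of the K40 headline figures.
# Assumptions (not stated on the slide): 15 SMX units, 192 FP32 cores and
# 64 FP64 units per SMX, a 384-bit memory bus and a 6.0 GT/s effective
# GDDR5 data rate. The 745 MHz base clock appears later in the deck.

SMX_COUNT = 15
FP32_CORES_PER_SMX = 192
FP64_UNITS_PER_SMX = 64
BASE_CLOCK_HZ = 745e6
BUS_WIDTH_BITS = 384
MEM_RATE_GTPS = 6.0


def cuda_cores() -> int:
    """Total single-precision CUDA cores."""
    return SMX_COUNT * FP32_CORES_PER_SMX


def peak_fp64_tflops(clock_hz: float = BASE_CLOCK_HZ) -> float:
    """Peak double-precision TFLOPS; one fused multiply-add counts as 2 flops."""
    return SMX_COUNT * FP64_UNITS_PER_SMX * 2 * clock_hz / 1e12


def peak_bandwidth_gbs() -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per ns."""
    return BUS_WIDTH_BITS / 8 * MEM_RATE_GTPS


if __name__ == "__main__":
    print("CUDA cores:", cuda_cores())
    print("Peak FP64 TFLOPS at base clock:", round(peak_fp64_tflops(), 2))
    print("Peak bandwidth (GB/s):", peak_bandwidth_gbs())
```

These are theoretical peaks; the ns/day results in the AMBER chart depend on the application and will not scale exactly with them.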
Tesla K40 : Acceleration for Large Problems
[Chart: speedup by application, axis 0 to 12; series: E5-2687W @ 3.10GHz, Tesla K20X, Tesla K40; bar labels: 10.29x, 10.23x, 8.67x, 8.12x, 1.66x]
Applications shown:
Physics: Chroma
Earth Science: SPECFEM3D
Structural Mechanics: ANSYS 14 SMP-V14sp-4
Molecular Dynamics: AMBER - SPFP-Cellulose_production_NPT
Material Science: QMCPACK 4x4x1
Bigger Challenges – Less Time
M&E: More complex scenes, accelerates color grading
Graph Analytics: Accelerate larger graphs
Neural Networks: Larger training sets
Quantum Chemistry: Larger problems, more acceleration
Molecular Dynamics: Larger problems, more acceleration
Bioinformatics: Newer algorithms, apps
Material Science: Larger ion/electron systems
CFD: Larger models, higher throughput
High Energy Physics: More advanced event triggers
Tesla K40 in Media and Entertainment: Creation and Distribution
Video Frame Rate Conversion
3D Rendering
Transcoding and Encoding broadcast video
Color grading for film and video
1 Billion Tweets
Faster Decisions
Live Streaming and Analysis
Tesla K40 : Interactive and Real-time Analysis
To Learn more: Register for map-D Webinar on 29th Jan @ 9am PST
8 Tesla K40
[Chart: Avg GPU Power in Watts for Real Applications on K20X; y-axis: Board Power (Watts), 0 to 200; 235W Power Envelope marked]
Power headroom to higher Performance
GPU Boost on Tesla K40
Convert Power Headroom to Higher Performance
Base Clock (745 MHz): Workload #1, worst case reference app, 235W
Boost Clock #1 (810 MHz): Workload #2, e.g. AMBER, 235W
Boost Clock #2 (875 MHz): Workload #3, e.g. ANSYS Fluent, 235W
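To put the three clock levels (745, 810 and 875 MHz) in perspective, the short sketch below computes how much each boost level raises the core clock over base. It assumes Boost Clock #1 is 810 MHz and Boost Clock #2 is 875 MHz, which is the natural ordering but is not spelled out on the slide. The ratio is only an upper bound on the gain for a purely compute-bound kernel; memory-bound workloads would be expected to see less.

```python
# Clock headroom offered by GPU Boost on the K40, relative to base clock.
# The pairing of names to frequencies is an assumption based on ordering.

BASE_MHZ = 745
BOOST_MHZ = {"Boost Clock #1": 810, "Boost Clock #2": 875}


def clock_gain(boost_mhz: float, base_mhz: float = BASE_MHZ) -> float:
    """Ratio of a boost clock to the base clock."""
    return boost_mhz / base_mhz


if __name__ == "__main__":
    for name, mhz in BOOST_MHZ.items():
        gain_pct = 100 * (clock_gain(mhz) - 1)
        print(f"{name}: {mhz} MHz, up to {gain_pct:.1f}% above base")
```

Whether a given application can hold the higher clock depends on it staying inside the 235W power envelope, which is the point of the slide.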
Tesla K40 Performance Relative to Tesla K20X
Real Apps Run Up to 1.4x faster with GPU Boost
[Chart: performance relative to K20X, axis 0.0 to 1.6; series: K20X, K40 @ base, K40 @ boost; applications: ANSYS 14 SMP-V14sp-4, LAMMPS-EAM, NAMD 2.9-APOA1, AMBER-SPFP-Nucleosome, LSMS-Fe32, QMCPACK 3x3x1, CUBLAS DGEMM; bar labels: 1.40, 1.27, 1.23, 1.27, 1.26, 1.28, 1.25, 1.07]
Compute Workload Behavior with GPU Boost
GPU Clock: Non-Tesla: Automatic clock switching. Tesla K40: Deterministic clocks.
Default @ Shipping: Non-Tesla: Boost. Tesla K40: Base.
Preset Options: Non-Tesla: Lock to base clock. Tesla K40: 3 levels (Base, Boost1 or Boost2).
Boost Interface: Non-Tesla: Control Panel. Tesla K40: NV-SMI, NVML.
Target duration for boost clocks: Non-Tesla: ~50% of run-time. Tesla K40: 100% of workload run time (must-have for HPC workload).
[Diagram: Tesla K40 deterministic clocks: Base Clock, Boost Clock #1, Boost Clock #2]
Using GPU Boost on Tesla K40
View the clocks
nvidia-smi -q -d CLOCK,SUPPORTED_CLOCKS
Set the Boost clocks
nvidia-smi -ac <MEM clock, Graphics clock>
Boost all 2880 Cores
End User selects the clocks
Higher memory b/w
GPU GPU
Host GPU
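The two nvidia-smi commands on this slide can be wrapped in a small script so that a job launcher sets the clocks before a run. The sketch below is one possible way to do that, not an NVIDIA-provided tool. The helper names are hypothetical, and the 3004 MHz memory clock in the example is an assumption: the pair passed to `-ac` should be one that the `SUPPORTED_CLOCKS` query reports on the target system. By default the script only prints the commands; running them for real requires a machine with the NVIDIA driver and typically administrator rights.

```python
import shlex
import subprocess
from typing import List


def query_clocks_cmd() -> List[str]:
    """Command that shows current and supported clocks."""
    return ["nvidia-smi", "-q", "-d", "CLOCK,SUPPORTED_CLOCKS"]


def set_app_clocks_cmd(mem_mhz: int, graphics_mhz: int) -> List[str]:
    """Command that sets application clocks as <memory,graphics> in MHz."""
    if mem_mhz <= 0 or graphics_mhz <= 0:
        raise ValueError("clock values must be positive MHz")
    return ["nvidia-smi", "-ac", f"{mem_mhz},{graphics_mhz}"]


def run(cmd: List[str], dry_run: bool = True) -> str:
    """Return the shell form of cmd, or execute it when dry_run is False."""
    if dry_run:
        return shlex.join(cmd)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(run(query_clocks_cmd()))
    # 3004 MHz memory clock is an assumed value; check SUPPORTED_CLOCKS first.
    print(run(set_app_clocks_cmd(3004, 875)))
```

Keeping command construction separate from execution makes the script easy to check on a machine without a GPU.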
Customer Feedback on K40 w/GPU Boost
17% Faster 13% Faster 11% Faster
http://www.eyesopen.com/fastrocs
http://blog.xcelerit.com/benchmarks-nvidia-tesla-k40-vs-k20x-gpu/
K40 w/GPU Boost 40% higher perf
*Tesla K40 Performance Relative to Tesla K20X
Tesla Resources
- Want to know more about Tesla products: http://www.nvidia.com/object/tesla-servers.html and http://www.nvidia.com/object/tesla-workstations.html
- Need help on using GPU Boost on Tesla K40: http://www.nvidia.com/object/tesla_product_literature.html
- Product details, specs, etc.: http://www.nvidia.com/object/tesla_product_literature.html
- Where to buy: http://www.nvidia.com/object/where-to-buy-tesla.html
1. Sign up for a FREE GPU Test Drive at http://www.Nvidia.com/GPUTestDrive
2. Accelerate your apps on the latest K40 GPUs
3. Tell us how K40 and GPU Boost worked for you
Test Drive the World’s Fastest GPU
Upcoming GTC Express Webinars
January 29: map-D: A GPU Database for Real-time Big Data Analytics and Interactive Visualization
January 30: Debugging CUDA Fortran using Allinea DDT
February 5: OpenMM - Accelerating and Customizing Molecular Dynamic Simulations on GPUs
February 25: Using GPUs to Supercharge Visualization and Analysis of Molecular Dynamics Simulations with VMD
Register at www.gputechconf.com/gtcexpress
GTC 2014 Registration is Open
Hundreds of sessions in the areas of
- Science and research
- Professional graphics
- Mobile computing
- Automotive applications
- Game development
- Cloud computing
Register with GM20EXP for 20% discount
www.gputechconf.com