INTRODUCING NVIDIA DGX-1 THE WORLD’S FIRST DEEP LEARNING ... · power-on to deep learning in...

Date post: 21-Aug-2020

INTRODUCING NVIDIA DGX-1

THE WORLD’S FIRST DEEP LEARNING SUPERCOMPUTER IN A BOX

EXPERIENCE A TRUE TURNKEY SOLUTION WITH FULLY INTEGRATED SOFTWARE AND HARDWARE

HARDWARE | SOFTWARE

Accelerate Your Deep Learning Today www.nvidia.com/dgx1

© 2016 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, NVIDIA Pascal and DGX-1 are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. All other trademarks and copyrights are the property of their respective owners.

POWERED BY 8 NVIDIA TESLA P100 GPUs BUILT ON THE LATEST NVIDIA PASCAL™ GPU ARCHITECTURE

ITERATE AND INNOVATE FASTER WITH UNPARALLELED DEEP LEARNING TRAINING PERFORMANCE

GET STARTED WITH DEEP LEARNING MORE QUICKLY AND EASILY THAN EVER BEFORE WITH NVIDIA DGX-1

16 nanometer FinFET 3D transistors for faster performance with lower power consumption

Revolutionary NVIDIA NVLink™ high-speed bidirectional interconnect for maximum multi-GPU application

Performance-optimized deep learning software that accelerates all major deep learning frameworks

CoWoS® with HBM2 high-bandwidth memory for 3x bandwidth of previous generation at lower power

58X FASTER TRAINING

Relative Performance (Based on Time to Train)
CPU-Only Server: 1310 Hours (54.58 Days)
DGX-1: 23 Hours, less than 1 day

34X MORE PERFORMANCE

Performance in teraFLOPS
CPU-Only Server: 5 TFLOPS
DGX-1: 170 TFLOPS
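The two headline multipliers can be checked against the figures printed with the charts. The following is a minimal Python sketch, not anything published by NVIDIA; the constant and function names are hypothetical, and the inputs are the rounded values shown on the infographic. Note that the rounded hours give a time-to-train ratio of about 57X rather than 58X, which would be consistent with the 23-hour figure having been rounded for display (an assumption; the source does not say).

```python
# Figures as printed with the charts (rounded values from the infographic).
CPU_TRAIN_HOURS = 1310
DGX1_TRAIN_HOURS = 23
CPU_TFLOPS = 5
DGX1_TFLOPS = 170


def ratio(larger: float, smaller: float) -> float:
    """Return how many times `larger` exceeds `smaller`."""
    return larger / smaller


def hours_to_days(hours: float) -> float:
    """Convert a duration in hours to days."""
    return hours / 24


# Time-to-train speedup: CPU-only hours divided by DGX-1 hours.
train_speedup = ratio(CPU_TRAIN_HOURS, DGX1_TRAIN_HOURS)
# Throughput gain: DGX-1 TFLOPS divided by CPU-only TFLOPS.
flops_gain = ratio(DGX1_TFLOPS, CPU_TFLOPS)

print(f"CPU-only training time: {hours_to_days(CPU_TRAIN_HOURS):.2f} days")
print(f"Time-to-train speedup from rounded hours: {train_speedup:.1f}X")
print(f"TFLOPS ratio: {flops_gain:.0f}X")
```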

DEPLOY QUICKLY AND SIMPLY

Plug-and-play setup that takes you from power-on to deep learning in minutes

CLOUD SERVICES AND SUPPORT

Access to NVIDIA’s vast deep learning knowledge, expertise, and the latest software updates

GPUs: 8X NVIDIA Tesla® P100, 16 GB/GPU, 28,672 total NVIDIA CUDA® cores
GPU INTERCONNECT: NVIDIA NVLink™ Hybrid Cube Mesh
CPUs: 2X 20-core Intel® Xeon® E5-2698 v4, 2.2 GHz
STREAMING CACHE: 4X 1.92 TB SSDs, RAID 0
NETWORK INTERCONNECT: 4X InfiniBand™ 100 Gbps EDR, 2X 10GbE
SYSTEM MEMORY: 512 GB 2133 MHz DDR4
POWER: 4X 1600 W PSUs (3200 W TDP)
COOLING: Efficient front-to-back airflow
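A few per-unit and aggregate figures follow directly from the hardware list. The sketch below is illustrative only: the variable names are hypothetical, the storage total is raw capacity before any filesystem overhead, and the InfiniBand total is simply the sum of nominal port rates, not a measured throughput.

```python
# Values copied from the hardware list above; names are illustrative.
NUM_GPUS = 8
TOTAL_CUDA_CORES = 28_672
GPU_MEMORY_GB = 16
NUM_SSDS = 4
SSD_CAPACITY_GB = 1_920  # 1.92 TB per SSD, in decimal gigabytes
NUM_IB_PORTS = 4
IB_PORT_GBPS = 100

# CUDA cores on each GPU, assuming the total is split evenly.
cores_per_gpu = TOTAL_CUDA_CORES // NUM_GPUS

# Combined GPU memory across all eight GPUs.
total_gpu_memory_gb = NUM_GPUS * GPU_MEMORY_GB

# RAID 0 stripes data across all member drives with no redundancy,
# so raw capacity is the sum of the members.
raid0_capacity_gb = NUM_SSDS * SSD_CAPACITY_GB

# Nominal aggregate InfiniBand rate (sum of port line rates).
infiniband_aggregate_gbps = NUM_IB_PORTS * IB_PORT_GBPS

print(f"CUDA cores per GPU: {cores_per_gpu}")
print(f"Total GPU memory: {total_gpu_memory_gb} GB")
print(f"RAID 0 raw capacity: {raid0_capacity_gb / 1000:.2f} TB")
print(f"Aggregate InfiniBand rate: {infiniband_aggregate_gbps} Gbps")
```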

The CPU-only server is a dual-socket Intel Xeon E5-2699 v4. The 170 TFLOPS figure is half precision (FP16).

Training benchmark: Caffe with VGG-D, training on 1.28M images for 70 epochs | CPU server uses 2x Xeon E5-2699 v4 CPUs
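The benchmark footnote gives enough detail to estimate average training throughput. This is a rough sketch under two stated assumptions that the source does not confirm: that the 23-hour and 1310-hour training times shown in the chart refer to this VGG-D benchmark, and that each epoch processes all 1.28M images exactly once. The function name is hypothetical.

```python
IMAGES_PER_EPOCH = 1_280_000  # "1.28M images" from the footnote
EPOCHS = 70
SECONDS_PER_HOUR = 3600


def images_per_second(train_hours: float) -> float:
    """Average images processed per second over a full training run."""
    total_images = IMAGES_PER_EPOCH * EPOCHS
    return total_images / (train_hours * SECONDS_PER_HOUR)


# Training times as printed in the time-to-train chart (assumed to
# correspond to this benchmark).
dgx1_rate = images_per_second(23)
cpu_rate = images_per_second(1310)

print(f"Total images processed: {IMAGES_PER_EPOCH * EPOCHS:,}")
print(f"DGX-1: about {dgx1_rate:.0f} images/s")
print(f"CPU-only server: about {cpu_rate:.0f} images/s")
```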

DEEP LEARNING USER SOFTWARE: NVIDIA DIGITS™
GPU DRIVER: NVIDIA GPU Compute Driver Software
SYSTEM: GPU-Optimized Linux Server OS
DEEP LEARNING LIBRARIES: NVIDIA cuDNN and NCCL
ACCELERATED SOLUTIONS
CONTAINERIZATION TOOL: NVIDIA Docker
MANAGEMENT: NVIDIA Cloud Management Service
DEEP LEARNING FRAMEWORKS
