João Paulo Navarro, Solutions Architect
NVIDIA PLATFORM FOR AI - LinkedIn
NVIDIA
GPU Computing
Gaming VR AI & HPC Self-Driving Cars
GPU COMPUTING AT THE HEART OF AI
Big Bang of Modern AI
Performance Beyond Moore’s Law
[Figure: 40 Years of CPU Trend Data, 1980 to 2020, log scale from 10^3 to 10^7. Single-threaded perf: 1.5X per year, then 1.1X per year. GPU-Computing perf: 1.5X per year, 1000X by 2025.]
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten. New plot and data collected for 2010-2015 by K. Rupp.
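The growth rates on this chart (1.5X per year, 1.1X per year, 1000X) are linked by compound-growth arithmetic. The sketch below is only an illustration of that arithmetic, not NVIDIA's methodology. The helper name `years_to_reach` is hypothetical, and it assumes a constant yearly rate, which the chart itself only approximates.

```python
import math


def years_to_reach(total_gain: float, gain_per_year: float) -> float:
    """Years of compound growth at gain_per_year needed to reach total_gain.

    Solves gain_per_year ** years == total_gain for years.
    """
    return math.log(total_gain) / math.log(gain_per_year)


# Rates taken from the slide; the constant-rate assumption is ours.
gpu_years = years_to_reach(1000, 1.5)
cpu_years = years_to_reach(1000, 1.1)

print(f"1.5X per year reaches 1000X in about {gpu_years:.1f} years")
print(f"1.1X per year reaches 1000X in about {cpu_years:.1f} years")
```

Under that assumption, 1.5X per year compounds to 1000X in roughly 17 years, while 1.1X per year would need more than 70.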
AlexNet
CAMBRIAN EXPLOSION
Convolutional Networks Recurrent Networks
Generative Adversarial Networks
Reinforcement Learning
There is a Cambrian explosion of neural networks. Since AlexNet, thousands of new models have emerged. With hundreds of layers and billions of parameters, their complexity has soared by 500X in just 5 years. The hyperscale datacenters that host them serve billions of people, cost billions to operate, and are among the most complex computers the world has ever made. Maintaining great quality of service while minimizing cost is incredibly difficult. Jensen sums up the requirements with the acronym PLASTER:
PROGRAMMABILITY
LATENCY
ACCURACY
SIZE
THROUGHPUT
RATE OF LEARNING
ENERGY EFFICIENCY
Convolutional Networks
Recurrent Networks
Generative Adversarial Networks
Reinforcement Learning
New Species
REVOLUTIONARY AI PERFORMANCE
Performance up to 100 CPUs
21 billion transistors – 5120 CUDA cores
New Tensor Core architecture inspired by the demands of deep learning
Volta is the Most Advanced Data Center GPU Ever Built
MAXIMIZING PERFORMANCE ON VOLTA
Greater Than 10x Performance K80 vs. V100
[Figure: GPU Generational Training Scaling. Bar chart comparing K80 with V100 Tensor Core, axis from 0 to 12.]
ResNet-152 training, 8x K80 (16 GPUs total) compared with 8x V100 NVLink GPUs using NVIDIA 17.10 containers
DEEP LEARNING
AI AND DEEP LEARNING
NVIDIA AI PLATFORM
NVIDIA GPU Cloud
NVIDIA AI Inference
TITAN V
Every Cloud
Every Computer Maker
Tesla V100
DGX-1 and DGX Station
Announcing NEW 32GB
2X
DEEP LEARNING SOFTWARE
developer.nvidia.com/deep-learning
WHAT IS THE BEST DEEP LEARNING FRAMEWORK?
DL FRAMEWORKS: How to choose?
Jeff Dean and Francois Chollet of Google have shared adoption statistics for DL frameworks.
DL FRAMEWORKS: How to choose?
https://developer.nvidia.com/deep-learning-frameworks
INFERENCE
AI INFERENCING AT THE SPEED OF LIGHT
HTTPS://WWW.YOUTUBE.COM/WATCH?V=-4UG6QFHPUM
THE BRAIN OF AI CARS
NVIDIA DRIVE™ scalable AI platform for entire range of autonomous driving
320+ companies have adopted DRIVE, for data centers and in vehicles
Includes automakers and suppliers, mapping and sensor companies, startups and research orgs
NVIDIA DRIVE AUTOMOTIVE PERCEPTION
HTTPS://WWW.YOUTUBE.COM/WATCH?V=D1JDS-KXXJA
NVIDIA TENSORRT PROGRAMMABLE INFERENCE ACCELERATOR
TESLA V100
DRIVE PX 2
TESLA P4
JETSON TX2
NVIDIA DLA
TensorRT
Frameworks Platforms
NVIDIA TENSORRT: 10X BETTER DATA CENTER TCO
160 CPU Servers
45,000 Images / Second
65 kW
NVIDIA TENSORRT: 10X BETTER DATA CENTER TCO
1 NVIDIA HGX with 8 Tesla V100 GPUs
45,000 Images / Second
3 kW
1/6 the Cost | 1/20 the Power
4 Racks in a Box
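The power claim on these two slides can be checked with simple arithmetic. The sketch below uses only the throughput and power figures quoted above. The `Deployment` class and its field names are hypothetical, introduced purely for illustration, and the cost claim is not modeled because the slides give no prices.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """One inference deployment, described by the figures on the slide."""
    name: str
    images_per_second: float
    power_kw: float

    @property
    def images_per_second_per_kw(self) -> float:
        # Throughput delivered per kilowatt of power drawn.
        return self.images_per_second / self.power_kw


# Figures quoted on the slides; both deployments serve the same load.
cpu = Deployment("160 CPU servers", 45_000, 65)
gpu = Deployment("1 NVIDIA HGX with 8 Tesla V100 GPUs", 45_000, 3)

power_ratio = cpu.power_kw / gpu.power_kw
efficiency_gain = gpu.images_per_second_per_kw / cpu.images_per_second_per_kw

print(f"{cpu.name}: {cpu.images_per_second_per_kw:.0f} images/s per kW")
print(f"{gpu.name}: {gpu.images_per_second_per_kw:.0f} images/s per kW")
print(f"Power ratio: about {power_ratio:.1f}x")
```

At equal throughput the power ratio and the efficiency gain are the same number, a little over 21x, which is consistent with the rounded "1/20 the Power" on the slide.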
TENSORRT - NVIDIA AI INFERENCE
30M HYPERSCALE SERVERS
[Figure: TensorRT timeline, Sept ‘16, Apr ‘17, Sept ‘17, Apr ‘18]
TensorRT: CNNs
TensorRT 2: INT8
TensorRT 3: Tensor Core
TensorRT 4: TensorFlow Integration, Kaldi Optimization
ONNX, WinML
Workloads: IMAGE / VIDEO (CNN), NLP (RNN), RECOMMENDER (MLP-NCF), ASR (RNN++), SPEECH SYNTH (DGN, S2S)
Speed-ups:
190X IMAGE / VIDEO: ResNet-50 with TensorFlow Integration
50X NLP: GNMT
45X RECOMMENDER: Neural Collaborative Filtering
36X SPEECH SYNTH: WaveNet
60X ASR: DeepSpeech 2 DNN
All speed-ups are chip-to-chip CPU to GV100.
BIG DATA & ANALYTICS
DATA DELUGE TO DATA HUNGRY
[Figure: data volume against INCREASING DATA VARIETY]
Volume scale: GIGABYTES, TERABYTES, PETABYTES, EXABYTES, ZETTABYTES
Eras: BUSINESS PROCESS, WEB, DIGITAL, AI
Data sources: Search Marketing, Behavioral Targeting, Dynamic Funnels, User Generated Content, Mobile Web, SMS/MMS, Sentiment, HD Video, Speech To Text, Product/Service Logs, Social Network, Business Data Feeds, User Click Stream, Sensors, Infotainment Systems, Wearable Devices, CyberSecurity Logs, Connected Vehicles, Machine Data, IoT Data, Dynamic Pricing, Payment Record, Purchase Detail, Purchase Record, Support Contacts, Segmentation, Offer Details, Web Logs, Offer History, A/B Testing, Streaming Video, Natural Language Processing
WORKAROUNDS ARE NOT THE ANSWERS
EXPLORE THE OUTLIERS AND LONG-TAIL EVENTS
Pre-aggregation struggles at scale
RELY ON ACCURATE DATA
Scale out on CPU infrastructure has tremendous hidden costs
SCALE WITH A ROI
Sampling misses the whole picture
NVIDIA ACCELERATED ANALYTICS
GPUs in the Data Center
AI-ACCELERATE | VISUALIZE | ANALYZE
GPU FOR ANALYTICS SOLUTIONS + ARCHITECTURES
Spark Scheduler
CORE TECHNOLOGIES
GPU-ACCELERATED DATA CENTER
ACCELERATED VISUALIZATION
ACCELERATED DATABASES
DEEP LEARNING
Cloud
NVIDIA DGX Products
CORE TECHNOLOGIES
TRADITIONAL DATA CENTER
VISUALIZATION
DATABASES
NVIDIA Tesla GPUs
Mesos
GPU-ACCELERATION HAS NO LIMITS
MapD, BlazeGraph, Kinetica, SQream
[Figure: Aggregate of queries, time (s), less is better. Leading In-Memory DB: > 50x slower. NoSQL DBs: > 100x slower.]
[Figure: Speed-up over baseline Spark CPU configuration (higher is faster). Bar values shown: 1, 700, 1403, 1843.]
GPUs 700X-800X faster than graphs in all cases
Configurations compared:
1.98B Edges, Spark CPU Baseline
700M Edges, Single Node, Xeon 2650 vs 2 K80
1.98B Edges, 16 EC2 r3.xlarge vs 16 K40s
1.98B Edges, 16 EC2 r3.4xlarge vs 16 K40s2
MAPD: GPU Accelerated Database
ML ACROSS INDUSTRIES
Finance Healthcare Telco
H2O4GPU PERFORMANCE
GLM XGBoost K-Means
40x 10x 5x
NVIDIA VOLTA IN EVERY CLOUD, EVERY DATACENTER
NVIDIA GPU CLOUD
Optimized Stacks for Every Cloud
20,000+ Registered Organizations | 30 Containers
NOW on AWS, GCP, AliCloud, Oracle Cloud, DGX
HOW TO START?
Develop on GeForce, Deploy on Tesla
GeForce: Start development using GeForce
Cloud: Scale out on cloud
Data Center: Deploy on data center
developer.nvidia.com
INCEPTION PROGRAM
https://www.nvidia.com/en-us/deep-learning-ai/startups/