The Different Forms of Machine Learning: How They Fit with ...

Date post: 20-Feb-2022
Page 1: The Different Forms of Machine Learning: How They Fit with ...

#vmworld

BCA1332BU

The Different Forms of Machine Learning: How They Fit with VMware

Justin Murray, VMware, Inc.

Uday Kurkure, VMware, Inc.

#BCA1332BU

VMworld 2019 Content: Not for publication or distribution

Page 2: The Different Forms of Machine Learning: How They Fit with ...

©2019 VMware, Inc.

Disclaimer

This presentation may contain product features or functionality that are currently under development.

This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Technical feasibility and market demand will affect final delivery.

Pricing and packaging for any new features/functionality/technology discussed or presented, have not been determined.

The information in this presentation is for informational purposes only and may not be incorporated into any contract. There is no commitment or obligation to deliver any items presented herein.

Page 3: The Different Forms of Machine Learning: How They Fit with ...

Agenda

Vision

AI, Machine Learning and Deep Learning defined

Two Types of Training Data for Machine Learning

Why do GPUs Help Performance?

GPU Performance Tests and Results on vSphere

Machine Learning Infrastructure on vSphere

Where to Learn More

Page 4: The Different Forms of Machine Learning: How They Fit with ...

Any GPU/Accelerator

Page 5: The Different Forms of Machine Learning: How They Fit with ...

AI, Machine Learning and Deep Learning

Page 6: The Different Forms of Machine Learning: How They Fit with ...

Page 7: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning Lifecycle – Preparing the Training Dataset

Training Data Set

Labelled Examples

Prepare Training data

Page 8: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning Lifecycle – Training

Training Data Set

Train the Model

Model

Labelled Examples

Prepare Training data

Training Phase

Page 9: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning – Using the Testing Data

Testing Data – No labels

Classification / Prediction

Model

How Accurate is the Model?

Page 10: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning – Inference

Previously Unseen Data from Operations

Classification / Prediction

Model

Execute the model for analysis of the new data

Inference – Operations Phase

Page 11: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning – Training and Inference

Training Data Set

Labelled Examples

Prepare Training data

Train the Model

Model

Previously Unseen Data from Operations

Execute the model for analysis on the new data

Classification / Prediction

Training Phase

Inference – Operations Phase
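The two phases above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data is synthetic, the model is a simple perceptron (not a method named in the slides), and the names `train_perceptron`, `predict`, and `accuracy` are hypothetical. The held-out test examples keep their labels aside so that the model's predictions can be scored.

```python
import random

def predict(weights, bias, features):
    """Inference: apply the trained model to one example."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0

def train_perceptron(examples, epochs=20, lr=0.1):
    """Training phase: adjust weights from labelled examples."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)
            weights = [w + lr * error * x for w, x in zip(weights, features)]
            bias += lr * error
    return weights, bias

def accuracy(weights, bias, examples):
    """Testing: fraction of held-out examples classified correctly."""
    correct = sum(predict(weights, bias, f) == y for f, y in examples)
    return correct / len(examples)

# 1. Prepare labelled training data (synthetic: label is 1 when x0 + x1 > 1).
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(200)]
labelled = [(p, 1 if p[0] + p[1] > 1.0 else 0) for p in points]
train_set, test_set = labelled[:150], labelled[150:]

# 2. Train the model, then check how accurate it is on the test data.
model = train_perceptron(train_set)
test_accuracy = accuracy(*model, test_set)

# 3. Inference phase: classify previously unseen data from operations.
new_prediction = predict(*model, (0.9, 0.8))
print(f"test accuracy: {test_accuracy:.2f}, new prediction: {new_prediction}")
```

In practice the training step is the compute-intensive part and is rerun when the model is re-trained, while `predict` is the only piece that has to run in the inference phase.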

Page 12: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning – Deployment of the Inference

Training Data Set

Labelled Examples

1. Prepare Training data

2. Train the Model

Mathematical Model

Previously Unseen Data from Operations

Execute the model for analysis on the new data

Classification / Prediction

Training Phase

Scoring/Inference Phase

Re-Train

Page 13: The Different Forms of Machine Learning: How They Fit with ...

The Machine Learning Infrastructure Landscape

[Diagram labels: Machine Learning, Deep Learning, Data Lake, Data Analytics; training; ON-PREM, OFF-PREM]

Two Main Phases in ML

• Training / Model Building
  – Often very large training data sets
  – Compute, storage, and network intensive
  – Server-class infrastructure
• Inference / Scoring
  – Apply existing models to new data
  – Used for prediction
  – Edge or core infrastructure

Page 14: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning Infrastructure Landscape

[Diagram labels: Machine Learning, Deep Learning, Big Data, Data Analytics; training, inference; ON-PREM, OFF-PREM]

Page 15: The Different Forms of Machine Learning: How They Fit with ...

Different Data Types in Machine Learning: Different ML Infrastructure

Tabular Data

• Rows and columns
• Database tables
• Spreadsheets and CSV files
• CPUs may suffice

Image, Voice and Text Data

• Image classification
• Video analytics
• Advanced Financial Modeling
• GPUs apply mainly here

a b c d e f h i j k l

1 4 2 2 1 10 0 1 1 1 0

5 8 3 1 1 5 1 0 0 1 1

9 1 3 6 2 14 1 1 1 1 0

Page 16: The Different Forms of Machine Learning: How They Fit with ...

The First Form of ML Training Data: Data in Tabular or CSV Form

Page 17: The Different Forms of Machine Learning: How They Fit with ...

Training Data Can Be Tabular

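Acct Number | TxnID | Education | Age | Sex | Balance Limit | Married | Paid 1 Month Ago | Paid 2 Months Ago | Paid 3 Months Ago | Default
1234 | 45 | 2 | 21 | 1 | 100 | 0 | 1 | 1 | 1 | 0
5678 | 89 | 3 | 31 | 1 | 5000 | 1 | 0 | 0 | 0 | 1
9012 | 150 | 3 | 61 | 2 | 1400 | 1 | 1 | 1 | 1 | 0

Rows: Examples xi

Columns: Features

Last column (Default): Label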

Page 18: The Different Forms of Machine Learning: How They Fit with ...

The Training Process

Weights w

Feature Values - Numerical

1234 | 45 | 2 | 21 | 1 | 100 | 80 | 1 | 1 | 1
5678 | 89 | 3 | 31 | 1 | 5000 | 110 | 0 | 0 | 1
9012 | 150 | 3 | 61 | 2 | 1400 | 50 | 1 | 0 | 1

Row x Column Matrix Multiplication
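The row-by-column multiplication on this slide can be sketched as follows. The feature rows are the numerical values shown above, on the assumption that the leading account number is an identifier rather than a feature. The weight values are hypothetical placeholders; in real training they are what the process learns.

```python
# Feature values per example, taken from the slide (account number omitted).
feature_rows = [
    [45, 2, 21, 1, 100, 80, 1, 1, 1],
    [89, 3, 31, 1, 5000, 110, 0, 0, 1],
    [150, 3, 61, 2, 1400, 50, 1, 0, 1],
]

# Hypothetical weights w, one per feature column (not from the slide).
weights = [0.01, 0.5, 0.02, 0.1, 0.001, 0.01, 1.0, 1.0, 1.0]

def matvec(rows, w):
    """Multiply each feature row by the weight column (a dot product per row)."""
    for row in rows:
        if len(row) != len(w):
            raise ValueError("each row needs one value per weight")
    return [sum(x * wi for x, wi in zip(row, w)) for row in rows]

scores = matvec(feature_rows, weights)
for row, score in zip(feature_rows, scores):
    print(row, "->", round(score, 3))
```

Each score costs one multiply-add per feature, so the work grows with rows times columns. That independent, repeated arithmetic is the kind of computation that parallel hardware handles well.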

Page 19: The Different Forms of Machine Learning: How They Fit with ...

H2O.ai’s Driverless AI – Automating Training using Tabular Data

Page 20: The Different Forms of Machine Learning: How They Fit with ...

Tabular Training Data – Seen in H2O.ai’s Driverless AI Tool

Page 21: The Different Forms of Machine Learning: How They Fit with ...

H2O.ai’s Driverless AI – The Prediction Target and Accuracy

Page 22: The Different Forms of Machine Learning: How They Fit with ...

H2O.ai’s Driverless AI – Prediction Target and Accuracy

Page 23: The Different Forms of Machine Learning: How They Fit with ...

The Second Form of Training Data for ML: Images, Text, Voice, Video

Page 24: The Different Forms of Machine Learning: How They Fit with ...

AI, Machine Learning and Deep Learning

Page 25: The Different Forms of Machine Learning: How They Fit with ...

Input Layer

Hidden Layers

Output Layer

A Deep Neural Network (DNN)
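The layered structure named on this slide can be illustrated with a small forward pass in plain Python. This is a sketch only: the layer sizes, the random weights, and the choice of ReLU and softmax are assumptions for illustration and do not come from the slides.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per output neuron."""
    return [sum(x * w for x, w in zip(inputs, row)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(rng, n_in, n_out):
    """Untrained layer with random weights (placeholder for learned values)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(inputs, layers):
    """Input layer -> hidden layers (ReLU) -> output layer (softmax)."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return softmax(activations)

rng = random.Random(42)
layer_sizes = [4, 8, 8, 3]  # 4 inputs, two hidden layers of 8, 3 outputs
layers = [random_layer(rng, a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
probabilities = forward([0.5, -1.2, 3.0, 0.1], layers)
print(probabilities)
```

Every layer is a row-by-column multiplication followed by a simple function, so deeper and wider networks multiply the amount of matrix arithmetic per example.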

Page 26: The Different Forms of Machine Learning: How They Fit with ...

GPUs and Machine Learning

Data scientists commonly use GPUs for machine learning applications in

• Image classification
• Video analytics
• Financial modeling and analysis

GPUs are particularly useful in Deep Learning

• Where multi-level “deep” neural networks are being trained

Training a deep neural network on GPUs can take about an order of magnitude less time than training the same network on CPUs alone

Page 27: The Different Forms of Machine Learning: How They Fit with ...

General Purpose GPU

A high-end GPU has thousands of cores

These cores are used for matrix multiplication and other complex math

GPU cores are optimized exclusively for data computations

Page 28: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning Infrastructure in VMware vSphere

Page 29: The Different Forms of Machine Learning: How They Fit with ...

High Performance Requirements for Machine Learning/Deep Learning

One inference for one image takes 10 million to 30 billion multiply-add operations

Training

• Large training data (thousands to millions of samples)
• Repetition (large number of epochs and iterations)

Inference

• Requires real-time processing
• Can face a massive number of concurrent requests in cloud production

Source of the graph: https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/README.md
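The multiply-add counts quoted on this slide can be estimated layer by layer. A minimal sketch, assuming the usual counts for a standard convolution (output height x output width x output channels x kernel area x input channels) and for a fully connected layer (inputs x outputs). The layer shapes below are illustrative only and are not taken from the slide or from a specific network.

```python
def conv2d_macs(out_h, out_w, in_ch, out_ch, kernel):
    """Multiply-adds for one standard 2D convolution layer (bias ignored)."""
    return out_h * out_w * out_ch * kernel * kernel * in_ch

def dense_macs(n_in, n_out):
    """Multiply-adds for one fully connected layer (bias ignored)."""
    return n_in * n_out

# Illustrative layer shapes (assumptions, not from the slide).
example_layers = {
    "3x3 conv, 64 -> 64 channels, 56x56 output": conv2d_macs(56, 56, 64, 64, 3),
    "fully connected, 2048 -> 1000": dense_macs(2048, 1000),
}

for name, macs in example_layers.items():
    print(f"{name}: {macs:,} multiply-adds")
```

A single convolution layer of this size already needs more than a hundred million multiply-adds for one image, so a network with dozens of such layers lands in the range the slide describes.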

Page 30: The Different Forms of Machine Learning: How They Fit with ...


Computing Architectures for Machine Learning

• Multi-core CPU Systems
• GPU (Graphics Processing Unit): M60, P100, P40, V100 cards
• FPGA (Field-Programmable Gate Array)
• ASICs: TPU (Tensor Processing Unit)

Page 31: The Different Forms of Machine Learning: How They Fit with ...

Performance: GPUs vs CPUs

Why do you need GPUs for ML training?

Page 32: The Different Forms of Machine Learning: How They Fit with ...

Training Workloads: Handwriting Recognition & Language Modelling

Handwriting Recognition

Neural Network: Convolutional Neural Network

Dataset: MNIST database of handwritten digits

• Training set: 60,000 examples
• Test set: 10,000 examples

Complex Language Modeling

• Given a history of words, predicts the next word

Neural Network: Recurrent Neural Network

• Large Model
  – 1500 Long Short-Term Memory (LSTM) units per layer
• Medium
  – 650 LSTM units per layer
• Small
  – 200 LSTM units per layer

Dataset:

• Penn Treebank (PTB) Database:
  – 929K training words
  – 73K validation words
  – 82K test words
  – 10K vocabulary

Page 33: The Different Forms of Machine Learning: How They Fit with ...


Training Times with GPU vs. without GPU on a Virtualized Server

[Chart: Handwriting Recognition with CNN on MNIST. Normalized training time, lower is better. Bar values: 1.0 and 10.1]

[Chart: Language Modeling with RNN on PTB. Normalized training time, lower is better. Bar values: 1 and 7.9]

56 hours with no GPU

8 hours with vGPU

Energy Efficient

Page 34: The Different Forms of Machine Learning: How They Fit with ...

VMware vSphere with NVIDIA GPUs

Our customers are using GPUs on VMware vSphere

Accelerating 2D/3D Graphics workloads for VMware Horizon

Enabling VMware Blast Extreme protocol – Encoding / Decoding H.264 and H.265 Based

General Purpose GPU (GPGPU)

– Machine learning / Deep Learning

– High performance computing workloads

Page 35: The Different Forms of Machine Learning: How They Fit with ...

Benefits of Virtualized GPUs in VMware vSphere

Virtualization Technology efficiently manages servers in the data centers

• Enables Diverse Workloads

– Windows and Linux VMs running on the same host

• Higher Consolidation Ratios

• Suspend/Resume of Virtualized GPU enabled VMs

– ML Training at night

– Interactive CAD jobs during the day

Page 36: The Different Forms of Machine Learning: How They Fit with ...

Benefits of Virtualized GPUs in VMware vSphere

vMotion of vGPU VMs

• ML Training or HPC jobs can take days

• Before server maintenance, vMotion the VMs to another host, then move them back after the maintenance. This avoids restarting the job and saves days of work

Combine the Power of GPUs with Management Benefits of Virtualization

Page 37: The Different Forms of Machine Learning: How They Fit with ...

Leverage GPU investment across different use cases

• ML Workloads on Linux for Data Scientist/ML researchers

• Virtual Desktop Infrastructure (VDI) for Office Workers on Windows

• 3-D CAD Workloads on Windows and Linux for Scientists

• Simulations on Linux

• End Users in Different Time Zones using GPUs at different times

• Improve Data Center Resource Utilization Using vGPUs in Data Centers

• Virtualized GPUs enable all of the above

A Typical Customer Scenario

Page 38: The Different Forms of Machine Learning: How They Fit with ...

What is a Virtualized NVIDIA GPU (vGPU)?

[Diagram labels: Server; Hardware (CPUs, NVIDIA GPU with H.265 Encode/Decode); Virtualization Layer (VMware Hypervisor (ESX), NVIDIA vGPU manager (vib), vGPUs); Linux Virtual Machines and Windows Virtual Machines, each with an NVIDIA Driver]

Page 39: The Different Forms of Machine Learning: How They Fit with ...

Virtualized GPUs in vSphere

[Diagram: three ways to attach GPUs to virtual machines on the vSphere hypervisor. In each, a virtual machine contains a guest OS, a GPU driver, and applications.]

• VMware DirectPath I/O: pass-through of GPUs to virtual machines; multiple GPUs/VM
• NVIDIA GRID vGPU: a GRID GPU presented as vGPUs through the NVIDIA GRID vGPU manager; sharing; vMotion; multiple vGPUs/VM
• vSGA: virtual machines use the VMware GPU driver, with the NVIDIA driver in the hypervisor

Diverse Workloads

Page 40: The Different Forms of Machine Learning: How They Fit with ...

Pascal (Virtualized for CUDA & Graphics): P40 Card with 24GB GPU Memory

Virtual GPU type | Physical board | Graphics | CUDA | Maximum virtual GPUs per physical GPU
GRID P40-1q | Tesla P40 | yes | yes | 24
GRID P40-2q | Tesla P40 | yes | yes | 12
GRID P40-3q | Tesla P40 | yes | yes | 8
GRID P40-4q | Tesla P40 | yes | yes | 6
GRID P40-6q | Tesla P40 | yes | yes | 4
GRID P40-8q | Tesla P40 | yes | yes | 3
GRID P40-12q | Tesla P40 | yes | yes | 2
GRID P40-24q | Tesla P40 | yes | yes | 1
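The last column of this table follows from simple division. A minimal sketch, assuming the number in each profile name is the GPU memory in GB given to one vGPU, and that the card's 24 GB is split evenly among vGPUs of the same profile. The function name `max_vgpus` is hypothetical.

```python
P40_MEMORY_GB = 24  # from the slide title: P40 card with 24GB GPU memory

# Profile sizes in GB, read from the profile names P40-1q ... P40-24q.
profile_sizes_gb = [1, 2, 3, 4, 6, 8, 12, 24]

def max_vgpus(total_gb, profile_gb):
    """How many vGPUs of one profile size fit in a physical GPU's memory."""
    if profile_gb <= 0:
        raise ValueError("profile size must be positive")
    return total_gb // profile_gb

table = {f"P40-{size}q": max_vgpus(P40_MEMORY_GB, size)
         for size in profile_sizes_gb}

for profile, count in table.items():
    print(f"GRID {profile}: up to {count} vGPUs per physical GPU")
```

The same arithmetic applies to other cards: a 16 GB GPU split into 2 GB profiles gives `max_vgpus(16, 2)`, which is 8.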

Page 41: The Different Forms of Machine Learning: How They Fit with ...

Virtualization Benefits & Performance: vMotion for vGPU-Enabled VMs

Page 42: The Different Forms of Machine Learning: How They Fit with ...

vMotion for NVIDIA vGPU – Test-bed

Two hosts connected through a switch, each configured as:

• Dell R730 – Intel Broadwell CPUs + 1 x NVIDIA P40
• 40 cores (2 x 20-core socket) E5-2698 v4
• 768 GB RAM

ESX: 6.7u1

NVIDIA Driver: 410.68

Page 43: The Different Forms of Machine Learning: How They Fit with ...

Workload: SPECapc for 3ds Max 2015

Categories:

• Modelling

• Interactive Graphics

• Visual Effects

• GPU Rendering

• CPU Rendering

48 Tests:

• Underwater Animation

• Moving City

• Gizmo Transforms

• For complete list:

– Refer to https://www.spec.org/gwpg/apc.static/max2015info.html

Page 44: The Different Forms of Machine Learning: How They Fit with ...

vMotioning of Different vGPUs Running SPECapc

Page 45: The Different Forms of Machine Learning: How They Fit with ...

Six Concurrent vMotions of VMs Running SPECapc

Page 46: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning Infrastructure in VMware vSphere

Page 47: The Different Forms of Machine Learning: How They Fit with ...

Machine Learning Infrastructure in vSphere

NVIDIA vGPU

vGPU Management

DirectPath I/O

Auto Scaling

DRS

Page 48: The Different Forms of Machine Learning: How They Fit with ...

Performance: Native GPU vs Virtual GPU

Page 49: The Different Forms of Machine Learning: How They Fit with ...

Performance: Training Times on Native GPU vs Virtualized GPU

4% overhead for both vGPU & DirectPath I/O compared to native GPU

[Chart: Language Modeling with RNN on PTB. Normalized training times, lower is better. Bar values: 1, 1.04, 1.04]

Page 50: The Different Forms of Machine Learning: How They Fit with ...

NVIDIA Data Center GPUs: Recommended for Virtualization

 | V100 | P40 | T4 | M10 | P6
GPUs / Board (Architecture) | 1 (Volta) | 1 (Pascal) | 1 (Turing) | 4 (Maxwell) | 1 (Pascal)
CUDA Cores | 5,120 | 3,840 | 2,560 | 2,560 (640 per GPU) | 2,048
Tensor Cores | 640 | --- | 320 | --- | ---
RT Cores | --- | --- | 40 | --- | ---
Memory Size | 32 GB/16 GB HBM2 | 24 GB GDDR5 | 16 GB GDDR6 | 32 GB GDDR5 (8 GB per GPU) | 16 GB GDDR5
vGPU Profiles | 1 GB, 2 GB, 4 GB, 8 GB, 16 GB, 32 GB | 1 GB, 2 GB, 3 GB, 4 GB, 6 GB, 8 GB, 12 GB, 24 GB | 1 GB, 2 GB, 4 GB, 8 GB, 16 GB | 0.5 GB, 1 GB, 2 GB, 4 GB, 8 GB | 1 GB, 2 GB, 4 GB, 8 GB, 16 GB
Form Factor | PCIe 3.0 Dual Slot & SXM2 (rack servers) | PCIe 3.0 Dual Slot (rack servers) | PCIe 3.0 Single Slot (rack servers) | PCIe 3.0 Dual Slot (rack servers) | MXM (blade servers)
Power | 250W/300W | 250W | 70W | 225W | 90W
Thermal | passive | passive | passive | passive | bare board

Categories shown: PERFORMANCE Optimized, DENSITY Optimized, BLADE Optimized

Page 51: The Different Forms of Machine Learning: How They Fit with ...

Turing T4 vs Pascal P40 vs Volta V100 Using Highest vGPU Profile

[Chart: Training Times for Language Modelling Using RNN. Training times in seconds (axis 0 to 900), lower is better. Three bars]

Page 52: The Different Forms of Machine Learning: How They Fit with ...

Test-bed for NVIDIA on Horizon View

ML Using Containers in vSphere

Deep Learning Components

• Machine Learning Workloads
• TensorRT: 19.02-py3
• TensorRT-Server: 19.02-py3
• TensorFlow: 1.10

Container in a VM Configuration

• NVIDIA Docker: 18.09.1
• vGPU T4-16Q
• CentOS 7.4
• ESX 6.X

Dell R730 – Intel Broadwell CPUs + Turing T4 GPU

• 40 cores (2 x 12-core socket) E5-2698 V5
• 768 GB RAM

NVIDIA GPU CLOUD (Container Repository)

Page 53: The Different Forms of Machine Learning: How They Fit with ...

Inferencing Workload: Image Classification Using ResNet50

Workload:

– Image Classification

– 1000 classes/labels

Convolutional Neural Network

• ResNet: Residual Network

• Precision: FP32

• 50 Layers

• Human Brain has similar structure

GPU: Turing T4 with 16GB of GPU Memory

• ResNet50 FP32 needs 2GB of GPU Memory

• T4-2Q profile => Max 8 Users Per T4 GPU

Page 54: The Different Forms of Machine Learning: How They Fit with ...

Fixed Share Scheduling: Image Classification Using NVIDIA TensorRT Server

[Chart: Context Switching Overhead. Normalized latency against # of VMs (1 to 4), lower is better. Bar values: 1.00, 1.00, 1.00, 1.16]

[Chart: Normalized throughput, higher is better. Axis labels: # of T4-2Q VMs and 1xT4-16Q. Bar values: 1.00, 2.00, 4.00, 7.00, 7.71]

Page 55: The Different Forms of Machine Learning: How They Fit with ...

Improving Inferencing Performance with Turing T4’s Tensor Cores

Page 56: The Different Forms of Machine Learning: How They Fit with ...

How to Improve Inference Latency, Throughput and Multi-tenancy Using TensorRT and vGPUs?

Uses FP32 and Needs 2 GB

Requires T4-2Q

Supports up to 8 Users on T4

Precision: FP16 or INT8 or INT4

Batch Size: specify a batch_size

Uses FP16 or INT8 or INT4

Needs 1 GB

Supports up to 16 Users on T4

Latency Improvements

Now we can support up to 16 Users on T4 with Major Latency Improvements!

Page 57: The Different Forms of Machine Learning: How They Fit with ...


Key Takeaways

Virtualized GPUs deliver near bare metal performance

VMware vSphere supports a full spectrum of workloads and users using GPUs and CPUs

Virtualization magnifies the benefits of the lower and mixed precision features of Tensor Cores in GPUs by improving latency, throughput and multitenancy

For more consolidation and multitenancy, use the vGPU solution

vGPUs enable concurrent diverse workloads like ML and Graphics

vMotion and Suspend/Resume of vGPU-enabled VMs are a huge advantage

VMware vSphere is also a great platform for traditional machine learning involving tabular datasets

VMware vSphere combines the performance of GPUs with the data center management features of virtualization

Page 58: The Different Forms of Machine Learning: How They Fit with ...

Extreme Performance Series: Sessions

HBI2526BU Performance Best Practices

BCA1482BU SQL Server, Oracle, and SAP Monster Database VMs

HBI2090BU vSphere Compute & Memory Schedulers

HBI2880BU DRS 2.0 Performance Deep Dive

HBI2090BU vSphere & Intel Optane DC PMEM=Max Performance

BCA1430BU Accelerating Application/Database Performance In the Self-Learning SDDC

HBI1421BU Innovations in vMotion: Features, Performance and Best Practices

BCA1393BU SAP HANA on vSphere 6.7u2 and Intel Cascade Lake Best Practices

MLA1594BU Optimize Virtualized Deep Learning Performance with New Intel Architectures

BCA1332BU The Different Forms of Machine Learning: How They Fit with VMware

BCA2551BU Low Latency Media & Entertainment Workloads

HCI1606BU SAP HANA on vSAN: Best Practice Recommendations and Lessons Learned

HCI1619BU Troubleshooting performance issues with vSAN Performance Diagnostics

BCA1563BU High Performance Virtualized Spark Clusters on Kubernetes for Deep Learning

Page 59: The Different Forms of Machine Learning: How They Fit with ...

Extreme Performance Series: Hands On Labs

SPL-2004-01-SDC Mastering vSphere Performance

SPL-2004-02-CHG vSphere Challenge Lab

ELW-2004-01-SDC Expert Led Workshop: Mastering vSphere Performance

SPL-2047-01-EMT Accelerate Machine Learning in vSphere Using GPUs

SPL-2048-01-EMT Launch Your Machine Learning Workloads in Minutes on vSphere

ELW-2048-02-EMT Expert Led Workshop: Launch Your Machine Learning Workloads in Minutes on vSphere & Accelerate them using GPUs

SURVEY: The VMware Performance Engineering team is always looking for feedback about your experience with the performance of our products, our various tools, interfaces and where we can improve:

www.vmware.com/go/perf

Page 60: The Different Forms of Machine Learning: How They Fit with ...

VMmark ML*

• Prototype version of our popular VMmark virtualization benchmark that focuses on machine learning

• Features simple, push button deployment of Kubernetes cluster and ML applications

• Can start with single host, and scale to many

• Initial applications:

– MLPerf inference workloads

– Deep Learning image classification workloads

Continue to participate in development of MLPerf training and inference benchmarks

• See https://mlperf.org/

Further Machine Learning/Deep Learning Work

*External Release of VMmark ML will be covered in the future

Page 61: The Different Forms of Machine Learning: How They Fit with ...

Page 62: The Different Forms of Machine Learning: How They Fit with ...

Page 63: The Different Forms of Machine Learning: How They Fit with ...

Backup slides

Page 64: The Different Forms of Machine Learning: How They Fit with ...

Incorporate additional Kubernetes functionality into Spark tests

• Autoscaling, resource management, test on other K8s platforms

Continue to participate in development of MLPerf training and inference benchmarks

• See https://mlperf.org/

VMmarkML

• A new version of our popular VMmark virtualization benchmark that focuses on machine learning

• Features simple, push button deployment of Kubernetes cluster and ML applications

• Can start with single host, and scale to many

• Initial applications:

– MLPerf inference workloads

– Deep Learning image classification workloads used in this work

Further Machine Learning/Deep Learning Work

Page 65: The Different Forms of Machine Learning: How They Fit with ...

Q&A and backup

Contact

Uday Kurkure [email protected]

Justin Murray [email protected]

Thanks to our colleagues

• Lan Vu, Hari Sivaraman, Juan Garcia-Rovetta, Ravi Soundararjan

Page 66: The Different Forms of Machine Learning: How They Fit with ...

Many Types of Neural Networks

• DNNs – Deep Neural Networks
  – Have multiple hidden layers, along with their input and output layers
• CNNs – Convolutional Neural Networks
  – Particularly useful in image recognition
• RNNs – Recurrent Neural Networks
  – Used for speech recognition or NLP
