Date post: 27-Jul-2020
PowerAI World’s Fastest AI Platform for Enterprise Sumit Gupta VP, HPC, AI, and Analytics IBM Cognitive Systems May 2017
Transcript
Page 1: PowerAI - NVIDIA… · Tuning Engine AI Vision Computer Vision App Development Toolkit IBM Spectrum Conductor with Spark System mgmt, Distributed ETL, Distributed Training, Hyper-Parameter

PowerAI
World’s Fastest AI Platform for Enterprise

Sumit Gupta
VP, HPC, AI, and Analytics
IBM Cognitive Systems

May 2017

Page 2

New additions to PowerAI

Page 3

Transmission Line Inspection

Page 4

Figure: workflow diagram in two stages.

Off-Line Training: Data Lake (Images of Damaged Components), Transform & Prep Data (ETL), Model Training, Trained Model

Production: Live Video, Transform & Prep Data (ETL), Trained Model

Page 5

Data Lake & Data Stores: Hadoop HDFS, NoSQL DBs

Distributed Computing: Spark, MPI

ML & DL Libraries & Frameworks: TensorFlow, Caffe, SparkML

Cognitive APIs (e.g., Watson) and In-House Cognitive APIs: Speech, Vision, NLP, Sentiment

Applications (Segment Specific): Finance, Retail, Healthcare, etc.

Accelerated Infrastructure: Accelerated Servers, Storage

Transform & Prep Data (ETL)

Page 6

Data Lake & Data Stores

Distributed Computing

ML & DL Libraries & Frameworks

Cognitive APIs (Eg: Watson)

In-House Cognitive APIs

Applications

Accelerated Servers, Storage

Data Prep, ETL, Curation, Data Labeling

Performance to Reduce Training Time

Multi-tenant, Cluster Virtualization, DL Framework Scaling

Feature extraction, Selecting Right Model, Hyper-parameter tuning

Finding Right “Tagged” Data, Model Integrity

Use Case Identification, Access to Enough Data

Transform & Prep Data (ETL)

Page 7

PowerAI: Enterprise Class, Ease of Use, Faster Training

Enterprise Software Distribution

Binary Package of Major Deep Learning Frameworks with Enterprise Support

Tools for Ease of Development

Graphical tools to Enhance Data Scientist Developer Experience

Faster Training Times for Data Scientists

Performance Optimized for Single Node & Distributed Computing Scaling

Page 8

PowerAI: Making AI More Accessible to Developers

• AI Vision: Targeted at Application Developers

• Data Extraction, Transformation and Preparation tool

• DL Insight

• Distributed Deep Learning

Multi-tenant, Enterprise-ready Deep Learning Platform for Data Scientists

Page 9

PowerAI

DL Frameworks + Libraries (TensorFlow, Caffe, ..)

IBM Data Science Experience (DSX)

Distributed Computing with Spark & MPI

DL Developer Tools

Spectrum Scale High-Speed File System via HDFS APIs

Cluster of NVLink Servers

PowerAI Enterprise (Coming soon)

IBM Enterprise Support

Application Dev Services

Enterprise Support & Services to Augment Enterprise Expertise

Packaged, Pre-Compiled Deep Learning Frameworks (TensorFlow, Caffe, Torch, ..)

Optimized for Scaling & Fast Training Time

Data Scientists Productivity Tools Targeted to DL Developers

IBM Confidential

Page 10

DL Frameworks (TF, Caffe, etc)

Data Prep & ETL via Spectrum Conductor with Spark

Input Data

Deep Learning GUI: Data & Model Management, ETL Tools, Monitor, Visualize, Advise

DL Insight: Tuning Engine

AI Vision: Computer Vision App Development Toolkit

IBM Spectrum Conductor with Spark: System mgmt, Distributed ETL, Distributed Training, Hyper-Parameter Optimization

Distributed Training

Page 11

Page 12

Tumor Proliferation Assessment – mitosis detection
Images from electron-microscope
Size of image - 70K * 60K

Framework     Format      Input Size (Faster R-CNN)
Caffe         LMDB        1K*1K
TensorFlow    TFRecord    1K*1K

Data Transformation

Data Distribution among training, validation and testing

Data Shuffle
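
The three preparation steps listed above (transformation to the 1K*1K input size, distribution among training, validation and testing, and shuffling) can be sketched roughly as follows. This is only an illustration, not the PowerAI tool itself: it assumes "70K * 60K" means 70,000 x 60,000 pixels, and the function names, the 80/10/10 split and the fixed seed are all hypothetical choices.

```python
import random


def tile_boxes(width, height, tile=1000):
    """Return (left, top, right, bottom) boxes that cover a width x height image."""
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            # Edge tiles are clipped so they never extend past the image.
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def shuffle_and_split(items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items reproducibly, then cut them into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


# Assumed image size: 70,000 x 60,000 pixels, cut into 1,000 x 1,000 tiles.
boxes = tile_boxes(70_000, 60_000)
train, val, test = shuffle_and_split(boxes)
print(len(boxes), len(train), len(val), len(test))
```

Each box would then be cropped from the source image and written to the framework's own format (LMDB for Caffe, TFRecord for TensorFlow); that step is omitted here because it depends on the image library in use.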

Page 13

Import data from different formats

Transform, split and shuffle data

Page 14
Page 15

Random

TPE (Tree-based Parzen Estimator)

Bayesian

Multi-tenant Spark Cluster (IBM Spectrum Conductor with Spark)

Spark search jobs are generated dynamically and executed in parallel
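
As a rough illustration of the "Random" strategy with search jobs generated dynamically and run in parallel, the sketch below uses a local thread pool as a stand-in for the Spark cluster. Nothing here is the DL Insight API: the parameter ranges, the function names and the toy loss function (which replaces a real training run) are assumptions made for the example.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sample_params(rng):
    """Draw one hypothetical configuration: log-uniform learning rate, discrete batch size."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),
        "batch_size": rng.choice([16, 32, 64, 128]),
    }


def toy_validation_loss(params):
    """Stand-in for a training job; smallest near learning_rate=1e-3, batch_size=64."""
    lr_term = (math.log10(params["learning_rate"]) + 3) ** 2
    bs_term = abs(params["batch_size"] - 64) / 64
    return lr_term + bs_term


def random_search(n_trials=20, max_workers=4, seed=0):
    """Generate n_trials random configurations and evaluate them concurrently."""
    rng = random.Random(seed)
    trials = [sample_params(rng) for _ in range(n_trials)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the submitted trials.
        losses = list(pool.map(toy_validation_loss, trials))
    best_loss, best_params = min(zip(losses, trials), key=lambda pair: pair[0])
    return best_params, best_loss


best_params, best_loss = random_search()
print(best_params, best_loss)
```

TPE and Bayesian search differ in that each new batch of configurations is chosen using the losses already observed, so they cannot generate every trial up front the way random search does.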

Page 16

Data preparation

Model training/tuning

Inference

Marked result

Page 17

AI Vision

Service Management Layer: Data set management, Training task management, Model management, Inference API management, Image preprocessing management, Data label management

Vision Recognition Layer: Self-defined Training with visualized monitoring, Custom Learning for Image Classification, Inference API deployment, Image Labeling and Preprocessing, Video Labeling Service, Custom Learning for Object Detection

Data Lake & Data Stores

Distributed Computing

ML & DL Libraries & Frameworks

Accelerated Servers, Storage

Page 18

AI Vision

Result on public cloud API: white, red, yellow and teal bird

Result on public cloud API: white and black short beak bird

I’m Aethopyga

I’m Pycnonotus

We need to get a new model to classify birds with professional knowledge.

Acridotheres, Acrocephalus, Aethopyga, Butorides, Corvus… (>20 categories)

User defines categories in AI Vision

Aethopyga: 0.90708

Pycnonotus: 0.99988

Page 19

AI Vision

Medical image analysis for cytologic examination

AI Talents:

We need tools to speed up

(study number from China)

Page 20

SAMPLE USE CASE: SALES ORDER PROCESSING

• Traditional capture is difficult on Sales Orders (SO)

• Sales orders contain line data; one SO can have hundreds or thousands of different line items

• Large enterprises might have tens of thousands of clients ordering items or services by email

• Each client might have multiple locations that each have unique order template(s)

• Sample calculation: 40 000 clients x 20 locations -> 800 000 unique Sales Order templates

• To implement using traditional capture by templating:
  • 10 hours / template -> 8 million hour exercise -> very bad business case!
  • Each order could have hundreds of complex order items
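
The sample calculation above can be reproduced in a few lines. The client, location and hours-per-template figures come from the slide; the 2,000 working hours per person-year used to put the total in perspective is an assumption added for illustration.

```python
clients = 40_000
locations_per_client = 20
hours_per_template = 10

# One unique template per client location.
templates = clients * locations_per_client
total_hours = templates * hours_per_template

# Assumption (not from the slide): roughly 2,000 working hours per person-year.
hours_per_person_year = 2_000
person_years = total_hours // hours_per_person_year

print(templates, total_hours, person_years)
```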

Page 21

EXAMPLE: SALES ORDER PROCESSING USING DATACAP & ELINAR.AI

Old orders/invoices + extracted information

= Several weeks of SuperComputer capacity (Power8 Minsky + power.ai)

Trained AI Model

Datacap Validation & Verification

Incoming Order/Invoice

Datacap OCR/Layout

Datacap Extraction

Customer ERP/Finance

Order/Invoice History

AI Training

Page 22

DETAILS ON IMPLEMENTATION

• Lots of training material needed
  • IBM Datacap is used to create page layout.xml for each order
  • Previously human-extracted values need to be matched into each layout.xml for training purposes

• Clever data preparation allows higher quality/accuracy
  • We can use simple rules to tag certain types of data before it is fed into the neural network; for example, Unit of Measurement (UOM) and ZIP code are easy
  • The neural network can use these “hints” to increase training accuracy when the data set is small; for example, if a page has 23 UOM tokens, it is quite obvious that there are 23 different order lines

• Implemented using Torch LSTMs
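
A minimal sketch of the rule-based "hints" idea, assuming whitespace-separated tokens: simple patterns tag UOM and ZIP code tokens before the data reaches the network, and the UOM count gives an estimate of the number of order lines. The unit list, the US-style ZIP pattern, the tag names and the sample page are all hypothetical; the actual Elinar rules are not described in the slides.

```python
import re

# Hypothetical rules; a real deployment would need locale-specific patterns.
UOM_PATTERN = re.compile(r"pcs|pc|ea|kg|g|l|m|box|pack", re.IGNORECASE)
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")  # US-style ZIP code, an assumption


def tag_tokens(tokens):
    """Attach a coarse tag to each token: UOM, ZIP, or O (other)."""
    tagged = []
    for token in tokens:
        if UOM_PATTERN.fullmatch(token):
            tagged.append((token, "UOM"))
        elif ZIP_PATTERN.fullmatch(token):
            tagged.append((token, "ZIP"))
        else:
            tagged.append((token, "O"))
    return tagged


def estimate_order_lines(tagged):
    """Use the number of UOM tokens as a hint for the number of order lines."""
    return sum(1 for _, tag in tagged if tag == "UOM")


# Made-up page text for illustration only.
page = "Ship to Springfield 62704 Widget 10 pcs Bolt 2 kg Cable 5 m".split()
tagged = tag_tokens(page)
print(tagged)
print(estimate_order_lines(tagged))
```

The tags would be supplied to the sequence model as an extra input feature alongside each token, so the network does not have to learn these easy patterns from a small data set.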

Page 23

COMING SOON: ELINAR A.I. MINER FOR GDPR DATA

• Set of AIs that can reliably extract personal data and privacy information from:
  • Business documents and records
  • Databases and NoSQL data sources
  • Images

• Pipeline uses Neural Networks implemented using Caffe and Torch augmented with IBM BigInsights text miners and business rules

• Fully developed on IBM Power platform, AIs using power.ai

Page 24

WHY IBM POWER.AI?

• Nice packaging that has everything a Deep Learning Nerd needs :)

• Very fast time to value due to simple installation; everything works “out-of-the-box”

• Leverages unique Power8 CPU-GPU NVLink communications on “Minsky” and P100 GPUs

• Allows a developer to run the insanely powerful “Minsky” supercomputer with standard AI tooling like Caffe and Torch

• We previously developed on high-end x86; there is no going back

• Can run larger models faster

Page 25

Thank You

