Ulrich Walter
Cognitive Systems HPC & Cloud Sales Leader
Böblingen, 29.06.2017
IBM Cognitive Systems – Technology for AI
Cognitive, AI and Analytics examples, trends and directions
IBM Systems
The world is changing
Past: Processes
Social media explosion
Mobile revolution
Power of analytics
Cloud enablement
XaaS
Location based
User generated
Social feedback loop
Collaborative buying
Digital money
Palm sized and wearable
Predictive analytics
Cognitive AI
Entire integration
Boundaryless
Present: Collecting, Assistants
Future: Intelligent, Autonomous
IT Centralized System
Batch oriented
Manual Processes
Autonomous
Point to Point Communication
It's all about prediction and recognition
Cogito, ergo sum (I think, therefore I am). René Descartes, 1637
Γνῶθι σεαυτόν (Know thyself). Chilon of Sparta, 555 B.C.
AI MOMENTUM
Today: >30,000 AI startups
By 2020: $47B spend on AI technologies
By 2020: 85% of all customer service interactions will be powered by AI bots
By 2020: 20% of companies will dedicate workers to monitor and guide neural networks
Obama: My Successor Will Govern a Country Being Transformed by AI
Overall Artificial Intelligence (AI) Space
Artificial Intelligence: human intelligence exhibited by machines
Machine Learning (cognitive / ML / DL): "human trained" using large amounts of data and the ability to learn how to perform the task
Deep Learning: IT systems break tasks into artificial neural networks
New data sources: NoSQL, Hadoop & analytics
New class of applications: machine learning & training
▪ Pattern matching
▪ Image
▪ Real-time decision support
▪ Complex workflows
▪ Data lakes
Extend enterprise applications
▪ Finance: fraud detection / prevention
▪ Retail: shopping advisors
▪ Healthcare: diagnostics and treatment
▪ Supply chain and logistics
Extend predictive analytics to advanced analytics with AI
Growing across compute, network, middleware, and storage
By 2022, HPC-driven simulations and deep learning will be the core innovation engines, driving a 10,000x increase in compute requirements
Solution areas for AI
NLS and text mining systems
Image recognition
Autonomous systems
Robots and robot collaboration
Multiple agent systems
Intelligent training
Softbots and digital twins
Predictive analytics
Learning and inference libraries
Knowledge representation language
Subsymbolic pattern recognition
Learning
Knowledge representation
Knowledge processing: search, acknowledge, plan
Ontology
AI hardware
Industry examples – Deep Learning/Big Data
Automotive and Transportation
• Autonomous driving:
• Pedestrian detection
• Accident avoidance
• Maintenance prediction
Security and Public Safety
• Video surveillance
• Image analysis
• Facial recognition and detection
Consumer Web, Mobile, Retail
• Image tagging
• Speech recognition
• Natural language
• Sentiment analysis
Medicine and Biology
• Drug discovery
• Diagnostic assistance
• Cancer cell detection
Broadcast, Media and Entertainment
• Captioning
• Search
• Recommendations
• Real time translation
The idea – A Computer with some human attributes
How can computers and robots explore untouched and dangerous areas?
How can a computer analyze and create movies?
How can robots collaborate autonomously in teams and solve problems?
How can a computer recognize, understand and interpret human language?
How can computers detect human mood and feelings?
How can computer systems draw conclusions from experience and data?
How can computers and robots become intelligent assistants?
How can systems learn by experience?
Intelligent software systems of DFKI
Mastering the Turing test requires deep learning
Deep learning in action
For machine/deep learning you need the following components
1. A large set of tagged data
2. A neural network: input layer, hidden layer(s), output layer
3. An HPC server with GPUs and extremely high internal bandwidth
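The second component, a network with an input layer, hidden layer(s) and an output layer, can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the layer sizes, the random weights, and the choice of ReLU and softmax are assumptions made for the example and do not come from the slide.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the input weights of neuron j."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Hidden-layer activation: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Output-layer activation: turns scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer (ReLU) -> output layer (softmax)."""
    hidden = relu(dense(inputs, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))

random.seed(0)
n_in, n_hidden, n_out = 4, 5, 3  # illustrative sizes, not taken from the slide
hidden_w = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
hidden_b = [0.0] * n_hidden
out_w = [[random.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
out_b = [0.0] * n_out

probs = forward([0.2, -1.0, 0.5, 0.7], hidden_w, hidden_b, out_w, out_b)
print(probs)  # one probability per output class
```

The weights here are random, so the output is meaningless until the network is trained on the tagged data from component 1; training at realistic scale is what component 3 is for.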
Enterprise data sources, analytics and deep learning
Data sources: people, ecosystems, IT systems, things
Data: structured & unstructured data
Analytics: analytics platforms and frameworks
Intelligence: data + analytics
Some principles of AI
Data collection, storage and distribution
Storage nodes
POWER AI Framework
Inputs: text, image & video, voice & sound, sensor
IBM Watson, complementing canned or tinned knowledge
Detect and collect
Store
Compress/MapReduce
Tag/aggregate
Knowledge base
Analyze/learn
Distributed deep learning
Comparison and interpretation
Combine
Conclude
1 2 3
Deep learning in multi-layered convolutional neural networks (CNN)
Raw data, iterated data, tagged data
Elephants
Chairs
Attributes as side information
Images | Attributes | Class
Koala: long fluffy ears, brown fur, lives in Australia, feeds on eucalyptus, mammal; attribute vector 1 0 1 1 0 1 0 1
Penguin: no ears, black and white feathers, lives in Antarctica, feeds on fish, bird; attribute vector 0 0 0 1 0 1 1 0
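One way to read this slide is that each class is described by a binary attribute signature, and an image is assigned to the class whose signature best matches the attributes detected in it. The sketch below illustrates that idea with a Hamming-distance match. The attribute order and the bit patterns are hypothetical, chosen for readability, and are not the vectors shown on the slide.

```python
# Hypothetical attribute order, for illustration only
ATTRIBUTES = [
    "long fluffy ears", "brown fur", "feathers", "lives in Australia",
    "lives in Antarctica", "feeds on eucalyptus", "feeds on fish", "mammal",
]

# Hypothetical class signatures: 1 = attribute present, 0 = absent
CLASS_SIGNATURES = {
    "Koala":   [1, 1, 0, 1, 0, 1, 0, 1],
    "Penguin": [0, 0, 1, 0, 1, 0, 1, 0],
}

def hamming(a, b):
    """Number of positions in which two attribute vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def classify(detected):
    """Return the class whose signature is closest to the detected attributes."""
    return min(CLASS_SIGNATURES,
               key=lambda name: hamming(CLASS_SIGNATURES[name], detected))

# Attributes detected in an image, with "brown fur" missed by the detector
detected = [1, 0, 0, 1, 0, 1, 0, 1]
print(classify(detected))  # Koala
```

Because the decision uses the whole signature, a single wrongly detected attribute does not change the result in this example.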
Deep Learning in a Nutshell: shallow (supervised) machine learning pipeline
Feature extraction, learning, model
Feature extraction: done by human experts; very difficult to find robust mathematical representations
Feature vector: 0.34, 9.34, 1.45, 0.01, 2.55
Label: "Coffee Mugs"
Deep Learning in a Nutshell: closed optimization of this problem by neural networks with many layers
1. Unstructured data
2. Tagging or semantic label
3. Feature extraction
4. Modelling
5. Model
Semantic label: coffee mug, right handle, white
Pixel analysis, color and channel depths, patterns, etc.
Building the model with a CNN
Inputs x1, x2, …, xn; model f(x1, x2, …, xn)
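The pixel-level pattern analysis that a CNN layer performs is a sliding-window operation over the image. The following sketch shows that single operation on a toy image. The image and the hand-picked edge kernel are made up for illustration; in a real CNN the kernel weights are learned from the tagged data.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation, the sliding-window operation used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Toy 4x4 image: dark left half (0), bright right half (1)
image = [[0, 0, 1, 1] for _ in range(4)]

# Hand-picked kernel that responds to a dark-to-bright vertical edge
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)
for row in feature_map:
    print(row)  # each row is [0, 2, 0]: the edge is found in the middle column
```

Stacking many such layers, each with many learned kernels, gives the "many layers" the slide refers to.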
A typical training cycle
Problem identification
▪ Is it a machine perception problem? No: look at other approach. Yes: continue.
▪ Is there sufficient data to train on? No: gather more data.
Data transformation
▪ Align relevant data sets using big data ETL middleware to standard schema
▪ Tag/aggregate
Training and inference
▪ Select and define training algorithm
▪ Execute training models
▪ Evaluate results and fine-tune algorithms
▪ Deploy for production
Simple example of classification for supervised learning
f(image 1) = "Merkel"
f(image 2) = "Gabriel"
f(image 3) = "Merkel"
Slide credit: L. Lazebnik
f(x) = y; y = output, x = input, f = classification function
Training phase: generate the function f that minimizes the classification error on the training data.
Test phase: execute f on data not contained in the training data.
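The two phases can be sketched with a deliberately simple classifier. The example below uses a nearest-centroid rule as a stand-in for f; this is an assumption made for brevity, not the method behind the slide, and the 2D feature vectors and labels "A" and "B" are made up.

```python
def train(samples, labels):
    """Training phase: compute one centroid per class from labelled examples."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def f(centroids, x):
    """Classification function f(x) = y: the label of the nearest centroid."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(centroids, key=lambda y: dist2(centroids[y]))

# Made-up training data: 2D feature vectors with labels
train_x = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
train_y = ["A", "A", "A", "B", "B", "B"]

# Test data that does not appear in the training data
test_x = [(1.0, 1.0), (3.0, 3.0)]
test_y = ["A", "B"]

model = train(train_x, train_y)
predictions = [f(model, x) for x in test_x]
error_rate = sum(p != y for p, y in zip(predictions, test_y)) / len(test_y)
print(predictions, error_rate)
```

Measuring the error on held-out data, rather than on the training data, is what tells you whether f generalizes.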
Deep learning for picture recognition: networks on multiple layers with feature-recognizing neurons
Input layer: Pic 1, Pic 2, Pic 3
Image elements, arrays and borders
Scene elements, object artifacts
Objects
Scene and object model
Output layer: accident on the highway
Google's TensorFlow as a workbench for machine learning
http://playground.tensorflow.org
Blog post: goo.gl/WffecA
TensorFlow as a graph of operations on tensor data
y = Wx + b
x: input, W: weight, y: output
Example: find the house price (y) as a function of the house size (x); simplified, because the real problem is multidimensional (e.g. area, age, features)
Objective: search for the values of W and b that give the best predictions
[Figure: price of a house (y) against size of a house (x), with the prediction error marked]
● A scalar is a tensor
● A vector is a tensor
● A matrix is a tensor
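The search for the best W and b can be illustrated without any framework. The sketch below fits y = Wx + b by gradient descent on the mean squared error. The data points are made up (generated from y = 2x + 1), and the learning rate and step count are arbitrary choices for this toy problem.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = W*x + b by gradient descent on the mean squared error."""
    W, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [W * x + b - y for x, y in zip(xs, ys)]
        grad_W = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
    return W, b

# Made-up data in arbitrary units, generated from y = 2x + 1
sizes = [0.5, 1.0, 1.5, 2.0, 2.5]
prices = [2.0, 3.0, 4.0, 5.0, 6.0]

W, b = fit_line(sizes, prices)
print(round(W, 3), round(b, 3))  # close to 2.0 and 1.0
```

A framework such as TensorFlow computes the gradients automatically from the operation graph, which matters once the model has many inputs and layers.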
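TensorFlow program code in Python
# Design of the graph model
import tensorflow.compat.v1 as tf  # TF 1.x graph API, also available in TF 2.x
tf.disable_v2_behavior()
x = tf.placeholder(shape=[None], dtype=tf.float32, name='x')
W = tf.Variable(tf.random_normal([1]), name='W')
b = tf.Variable(tf.random_normal([1]), name='b')
y = W * x + b
x_in = [1.0, 2.0, 3.0]  # example input; x_in is not defined on the original slide
# Start of the learning environment
with tf.Session() as sess:
    # Initialization of variables
    sess.run(tf.global_variables_initializer())
    # Start of training (here a single evaluation of y)
    print(sess.run(y, feed_dict={x: x_in}))
[Figure: computation graph with the nodes x, W, matmul, b, + and y]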
Classification of three types of Iris using a TensorFlow model
Iris setosa (0), Iris versicolor (1), Iris virginica (2)
Development of Convolutional Networks
# of transistors per CPU: 10^6
# of pixels: 10^7
# of transistors per CPU: 10^9
# of pixels: 10^14
Sic transit gloria mundi (Thus passes the glory of the world)
Google Brain 2012: 16,000 servers, ~8 MW, ~50 TFLOPS
3 NVIDIA Pascal GPUs: ~0.9 kW, ~62 TFLOPS
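The comparison becomes clearer when expressed as compute per unit of power. The arithmetic below uses the figures quoted on the slide and reads the power values as average draw (about 8 MW and about 0.9 kW); that reading is an assumption.

```python
# Figures as quoted on the slide; power is read as average draw (assumption)
google_brain_2012 = {"tflops": 50.0, "power_kw": 8000.0}  # ~8 MW
three_pascal_gpus = {"tflops": 62.0, "power_kw": 0.9}

def tflops_per_kw(system):
    """Compute delivered per kilowatt of power draw."""
    return system["tflops"] / system["power_kw"]

old = tflops_per_kw(google_brain_2012)
new = tflops_per_kw(three_pascal_gpus)
ratio = new / old
print(f"{old:.5f} vs {new:.1f} TFLOPS/kW, ratio about {ratio:,.0f}x")
```

On these numbers the three GPUs deliver roughly four orders of magnitude more compute per kilowatt than the 2012 cluster.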
Leveraging the first CPU designed for accelerated computing
Faster Cores than x86
Larger Caches Per Core than x86
5X Faster CPU-GPU Data Communication
High Performance Cores
Fast & Large Memory System
Fast Power Interconnects for Accelerators
CAPI
NVLink
PCIe
P8
POWER8
IBM Power Systems LC Line for AI, HPC and BigData
S822LC For High Performance Computing
• Incorporates the new POWER8 processor with NVIDIA NVLink
• Delivers 2.8X the bandwidth to GPU accelerators
• Up to 4 integrated NVIDIA "Pascal" GPUs
S822LC For Big Data
• Ideal for storage-centric and high data throughput workloads
• Brings 2 POWER8 sockets for Big Data workloads
• Big data acceleration with CAPI and GPUs
S812LC
• Storage-rich single-socket system for big data applications
• Memory-intensive workloads
S822LC
• 2X memory bandwidth of Intel x86 systems
• Memory-intensive workloads
S821LC
• 2 POWER8 sockets in a 1U form factor
• Ideal for environments requiring dense computing
High Performance Computing
OpenPOWER servers for cloud and cluster deployments that are different by design
Power S822LC for HPC (aka Minsky) vs x86 with P100 GPU
▪ 2.8X the CPU-GPU bandwidth compared to x86-based systems
– S822LC for HPC with CPU-GPU NVLink capability not available on x86 servers
▪ Faster than any PCI-E platform with 4 GPUs
– S822LC for HPC packaging allows for higher power/frequency
▪ Performance compared with x86 P100 PCI-E systems
– Kinetica: 2.7X vs x86 with 4 PCI-E based P100
– CPMD: 3X the performance of the CPU-only implementation
▪ The first ever GPU-accelerated version of CPMD
– NAMD: 30% increase when combined with visualization code
PowerAI takes advantage of NVLink between the POWER8 CPU and the P100 GPUs to increase system bandwidth
▪ NVLink between CPUs and GPUs enables fast memory access to large data sets in system memory
▪ Two NVLink connections between the GPUs and between each CPU and GPU lead to faster data exchange
[Figure: POWER chip with NVLink: GPUs with NVLink (graphics memory) connected to the POWER chip and system memory at 40+40 GB/s. PCIe x16: NVIDIA GPU (graphics memory) connected to system memory at 16+16 GB/s.]
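The effect of the two link speeds can be estimated with simple arithmetic. The sketch below uses the per-direction peak figures from the slide (40 GB/s for NVLink, 16 GB/s for PCIe x16) and a hypothetical 128 GB working set; it ignores latency and protocol overhead, so it is an idealized bound rather than a measurement.

```python
NVLINK_GB_PER_S = 40.0    # per direction, CPU <-> GPU, as quoted on the slide
PCIE_X16_GB_PER_S = 16.0  # per direction, PCIe x16, as quoted on the slide

def transfer_seconds(size_gb, bandwidth_gb_per_s):
    """Idealized one-way transfer time, ignoring latency and protocol overhead."""
    return size_gb / bandwidth_gb_per_s

dataset_gb = 128.0  # hypothetical working set held in system memory

t_nvlink = transfer_seconds(dataset_gb, NVLINK_GB_PER_S)
t_pcie = transfer_seconds(dataset_gb, PCIE_X16_GB_PER_S)
speedup = t_pcie / t_nvlink
print(t_nvlink, t_pcie, speedup)  # 3.2 8.0 2.5
```

The peak figures alone give a factor of 2.5 for data movement between system memory and the GPU.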
Throughput test with MINSKY and x86 platforms
Advantages
- Training times reduced 3x in comparison to PCIe
- Rapid deployment of models
- GPU efficiency at > 95%
- Well balanced system
POWER8 CAPI: Coherent Accelerator Processor Interface
▪ Virtual addressing
– Accelerator can work with the same memory addresses that the processors use
▪ Hardware-managed cache coherence
– Enables the accelerator to participate in "locks" as a normal thread
– Lowers latency over the I/O communication model
Customizable Hardware Application Accelerator
• Specific system SW, middleware, or user application
• Written to durable interface provided by PSL
Processor Service Layer (PSL)
• Present robust, durable interfaces to applications
• Offload complexity / content from CAPP
Coherence Bus
POWER8
CAPP
PCIe Gen 3: transport for encapsulated messages
PSL
FPGA or ASIC
CAPI vs. I/O Device Driver: Data Prep
AI and ML: do's and don'ts
AI/ML do's / AI/ML don'ts
Try to turn waste into knowledge
Perfection required
No differentiator between good and bad
No data or limited amount of data available
Too complex dependencies
Impossible to program
Specialized
Highly customized solutions required
Customization required
Long-term autonomous learning
Too much data
No scalability by human interaction
Development of Hybrid Cloud Metasystems as data sources for AI
Rapid deployment
Time to market
Access to external data
Improved flexibility
Service
System of Records & AI
Governance & Control
Temporary connection
Permanent connection
Cloud Service a
Cloud Service b
Cloud Service c
API
Ecosystems
Connecting data islands for a hyperconnected and cognitive digital universe
Data islands, each connected through an API:
▪ Security, defence, protection against cyber crime
▪ Health & research
▪ Weather, climate research & agriculture
▪ Connected, autonomous vehicles and intelligent traffic systems
▪ Retail and marketing
▪ Banking, finance & insurance
▪ Industrie 4.0
▪ Wearables & mobility
▪ Infotainment, industrial & military
▪ Health and fitness
▪ Connected home
▪ Energy, utilities and smart cities
Platforms: IBM Bluemix, IBM Watson, IBM Hybrid Cloud
Challenges ahead
1. Digital transformation
2. Data and ecosystems as competitive advantage
3. Business value estimation
4. Combine & Conclude on AI methods and data (e.g. picture + voice/sound + sensor = x)
5. Organizational changes
6. Lifecycle Management
7. Service Orchestration
8. Governance and Control
9. Integration of legacy systems
10. Security and Compliance
Conclusion
1. Deep learning and AI will touch every area of our life
2. Autonomous systems require AI/deep learning based on big data
3. Autonomous systems must combine subsymbolic and symbolic AI processes in hybrid architectures
4. Business processes must adopt AI/DL as an important business value and driver for new business models
5. The AI/DL system infrastructure must also be scalable, reliable and efficient for compute, network and storage
6. Collaboration with a variety of enterprises (x2x) and customers and deep integration of AI/DL processes will become standard
7. Multiple autonomous systems can operate and collaborate as hybrid teams
8. Autopilots and autonomous driving will become possible. Humans just need to intervene in exceptional situations.
9. Besides the technical and business-oriented questions of autonomous systems, there are still multiple ethical, legal and social areas to be considered.
Copyright © 2016 by International Business Machines Corporation. All rights reserved.
No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation.
Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice. This document could include technical inaccuracies or typographical errors. IBM may make improvements and/or changes in the product(s) and/or program(s) described herein at any time without notice. Any statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this document is not intended to state or imply that only that program product may be used. Any functionally equivalent program, that does not infringe IBM's intellectually property rights, may be used instead.
THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted, if at all, according to the terms and conditions of the agreements (e.g., IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement, etc.) under which they are provided. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. IBM makes no representations or warranties, expressed or implied, regarding non-IBM products and services.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents or copyrights. Inquiries regarding patent or copyright licenses should be made, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785, U.S.A.
Legal Notices
IBM, the IBM logo, ibm.com, IBM System Storage, IBM Spectrum Storage, IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Archive, IBM Spectrum Virtualize, IBM Spectrum Scale, IBM Spectrum Accelerate, Softlayer, and XIV are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at http://www.ibm.com/legal/copytrade.shtml
The following are trademarks or registered trademarks of other companies.
Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.
IT Infrastructure Library is a Registered Trade Mark of AXELOS Limited.
Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
ITIL is a Registered Trade Mark of AXELOS Limited.
UNIX is a registered trademark of The Open Group in the United States and other countries.
* All other products may be trademarks or registered trademarks of their respective companies.
Notes:
Performance is in Internal Throughput Rate (ITR) ratio based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput that any user will experience will vary depending upon considerations such as the amount of multiprogramming in the user's job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve throughput improvements equivalent to the performance ratios stated here.
All customer examples cited or described in this presentation are presented as illustrations of the manner in which some customers have used IBM products and the results they may have achieved. Actual environmental costs and performance characteristics will vary depending on individual customer configurations and conditions.
This publication was produced in the United States. IBM may not offer the products, services or features discussed in this document in other countries, and the information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area.
All statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.
Information about non-IBM products is obtained from the manufacturers of those products or their published announcements. IBM has not tested those products and cannot confirm the performance, compatibility, or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
Prices subject to change without notice. Contact your IBM representative or Business Partner for the most current pricing in your geography.
This presentation and the claims outlined in it were reviewed for compliance with US law. Adaptations of these claims for use in other geographies must be reviewed by the local country counsel for compliance with local laws.
Thank you!
ibm.com/systems/hpc
Experiences
IBM Investment in Innovation
Accelerated and Open Source Data Bases and storage
Accelerated DB: Kinetica, Blazegraph
OSDB: EnterpriseDB, MongoDB, Redis, Neo4J, Cassandra
Top R&D Applications
GROMACS, Gaussian, NAMD, VMD, WRF, VASP, OpenFOAM, LS Dyna, AMBER, NCBI – BLAST, GATK4, NWChem, GAMESS, Quantum ESPRESSO, LAMMPS, CHARMM, CP2K, LQCD, QMCPack, MILC, Chroma, QPACE, COSMO, Abinit, COMSOL, CPMD, GTC, HOMME, HYCOM
Machine Learning/Deep Learning
PowerAI ML/DL software distribution
• Built for deployment speed and with real performance optimization
• Caffe, Torch, Theano, DIGITS
• Python, OpenBLAS and other dependencies
Caffe, Torch, Theano, DIGITS, TensorFlow, DL4J, more on POWER
Custom Caffe: CPU/GPU NVLink optimized
Several Options to Realize Performance Enhancements via GPU Acceleration
Options range from easiest to use (libraries) to best application performance (CUDA).
Libraries
• ESSL/PESSL
• NVIDIA libraries: math library, cuBLAS, NPP, etc.
– Easy to implement
– Tested and supported
– Limited: your needs may not be covered
Programming models supporting directives
• OpenACC
• OpenMP
– Modification of existing programs with directives
– Compiler assists with mapping to device
Programming language which targets the GPU
• CUDA
– Most time intensive
– Requires expertise
– Achieves best performance results
IBM Power 822LC - 2 Socket Power 8, 4 GPU System
POWER8 with NVLink (2x)
• 190W Sort
• Integrated NVLink 1.0
Memory DIMM risers (8x)
• 4 IS DDR4 DIMMs per riser
• Single Centaur per riser
• 32 IS DIMMs total
• 32-1024 GB memory capacity
PCIe slot (1x)
• Gen3 PCIe
• HHHL adapter
PCIe slot (2x)
• Gen3 PCIe
• HHHL adapter
NVIDIA GPU
• SXM2 form factor
• NVLink 1.0
• 300 W
• Max of 2 per socket
Power supplies (2x)
• 1300W
• Common Form Factor supply
Cooling fans
• 80mm counter-rotating fans
• Hot swap
HDD option (2x)
• 0-2, 1TB SATA HDD
• Tray design for install/removal
• Hot swap
Service controller card
• BMC content