Alveo Overview for Hyperscale and HPC Applications
Viraj Paropkari
Senior Manager, Global Data Center Marketing
Arm HPC User Group (AHUG) 2019
20th June 2019, Frankfurt, Germany
XILINX CONFIDENTIAL
Agenda
˃ Data Center Focus & Strategy
˃ Alveo Data Center Boards & Ecosystem Overview
˃ Use Cases: Compute, Storage, Networking
˃ Future of Computing
˃ Getting Started
© Copyright 2019 Xilinx
XILINX
[Map legend: Headquarters, Research and Development, Sales and Support]
Revenue: $3.0B
Employees: ~4,400
Worldwide Customers: 20K+
Industry Firsts: 60+
Patents: 4,000+
Data Center Opportunity
Compute
Storage
Network
Telco Cloud / Edge
High Performance Computing
Enterprise Private Cloud
Hyperscale Public Cloud
> Heterogeneous Computing post Moore’s Law
> Exponential Data Growth
> Dawn of AI
Era of Reconfigurable Accelerators
Apps: Video Analytics, Machine Learning, Financial, HPC & Life Sciences, Database
Network, Storage, Compute
Algorithm Diversity and Fast Evolution
Diverse AI models and neural networks (NNs) for a broad range of applications
Applications: Classification, Object Detection, Speech Recognition, Recommendation Engine, Anomaly Detection, Data Analytics
Models: CNN; RNN, LSTM; MLP; RF, LR
New Algorithms Need New Architectures
Example networks: AlexNet, GoogLeNet, DenseNet
The best throughput, latency, and efficiency require different hardware architectures
Optimized Performance Requires Custom Memory and Datapath
1) Custom Data Path
2) Custom Precision
3) Custom Memory Hierarchy
[Diagram labels: Off-Chip DDR, On-Chip Memory]
Reconfigurable Data Center Accelerators
Tier-1 hyperscalers are adopting FPGAs for general-purpose acceleration across multiple in-house and external workloads
Xilinx Transformation
From Devices to Platforms
[Chart: SW programmability versus device category: FPGA, SoC, MPSoC, RFSoC, ACAP]
ACCESSIBLE
Deploy in the cloud or on-premises
Rich set of accelerated applications
FAST
Built for high throughput, ultra-low latency
Accelerate compute, networking, storage
ADAPTABLE
Deploy optimized domain-specific architectures
Adapt to changing algorithms
U250
38 TB/s internal SRAM bandwidth
54 MB internal SRAM capacity
1,341K LUTs
4,100 img/s CNN throughput*

U280
30 TB/s internal SRAM bandwidth
41 MB internal SRAM capacity
1,079K LUTs
460 GB/s HBM2 memory bandwidth

U200
31 TB/s internal SRAM bandwidth
35 MB internal SRAM capacity
892K LUTs
3,100 img/s CNN throughput*

*Low-latency GoogLeNet v1
ALVEO Solution Stack
CLOUD, ON-PREMISE
Solution Providers
App & IP Developers
Channel Partners
End Customers
Tencent Cloud
Data Analytics
Video & Image Processing
Machine Learning
Financial Computing
Life Science & HPC
Growing Ecosystem
Life Sciences & HPC
Video Processing
Data Analytics
Machine Learning
Financial Computing
Image Processing
Growing Solution Partner Network
OEM Partners
VARs
Distributors
Customers
AI Accelerated Dark Matter Search (CERN)
https://www.xilinx.com/content/dam/xilinx/publications/powered-by-xilinx/cern-case-study-final.pdf
Real-time ML Inference + Sensor pre-processing
Achieving 100 ns inference latency at data rates of 150 terabytes/second
Unachievable by CPUs & GPUs
[Diagram labels: CMS Sensor, Xilinx FPGA]
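To put the latency and data-rate figures in proportion, here is a rough back-of-the-envelope sketch. It is not taken from the CERN case study: it simply multiplies the quoted data rate by the quoted latency to estimate how much data arrives while one inference is in progress. Decimal units (1 TB = 10^12 bytes) and the function name `bytes_in_flight` are assumptions made for illustration.

```python
def bytes_in_flight(data_rate_bytes_per_s: float, latency_s: float) -> float:
    """Data arriving during one latency window (rate multiplied by time)."""
    return data_rate_bytes_per_s * latency_s


# Figures quoted on the slide; decimal terabytes are an assumption.
DATA_RATE_BYTES_PER_S = 150e12   # 150 terabytes/second
INFERENCE_LATENCY_S = 100e-9     # 100 ns

in_flight = bytes_in_flight(DATA_RATE_BYTES_PER_S, INFERENCE_LATENCY_S)
print(f"{in_flight / 1e6:.0f} MB arrive during one inference window")
```

Under these assumptions roughly 15 MB of sensor data arrives per 100 ns window, which suggests why the processing has to sit directly in the sensor data path.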
Computational Fluid Dynamics
ALVEO Accelerated CFD Kernels
Faster Time to insight, Fewer Nodes
• 4x faster simulation time
• 80% lower energy consumption
• 6x better performance per watt
Precision Medicine
Genomic Data Analytics
[Chart: Performance for CPU, GPU, and FPGA; labeled values 1x and 90x]
Accelerates Sequencing by 90x
Source: Xilinx Analysis
Live Video Streaming
Video Transcoding for VP9 Live Stream
[Chart: Frames per second for CPU, GPU, ASIC, and FPGA; labeled values 1x and 30x]
30x higher performance
25% cost reduction
Source: Xilinx Analysis
Video Processing + AI Inference
[Chart: Performance for CPU, GPU, and FPGA; labeled values 1x, 20x, and 100x]
100x higher performance
Significant cost reduction
Source: Xilinx Analysis
Computational Storage
[Diagram labels: CPU, DRAM, and three FPGA/SSD pairs]
5x Speedup for Data-Intensive Workloads
Compression
Encryption
Database query offloads
Reduces TCO >2x
Increased performance
Frees up Host CPU cycles
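The idea behind a database query offload can be illustrated with a toy model. The sketch below is purely hypothetical and uses no Xilinx or vendor API: `Row`, `host_side_filter`, and `drive_side_filter` are names invented here, and the 1 KiB row size and 10% selectivity are arbitrary assumptions. It only counts bytes crossing the storage-to-host link, and does not model or reproduce the speedup figures quoted above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Row:
    key: int
    payload: bytes


Predicate = Callable[[Row], bool]


def host_side_filter(rows: List[Row], predicate: Predicate) -> Tuple[List[Row], int]:
    """Conventional path: every row is moved to the host, then filtered there."""
    bytes_moved = sum(len(r.payload) for r in rows)
    return [r for r in rows if predicate(r)], bytes_moved


def drive_side_filter(rows: List[Row], predicate: Predicate) -> Tuple[List[Row], int]:
    """Offloaded path: the filter runs next to the drive, only matches are moved."""
    matches = [r for r in rows if predicate(r)]
    bytes_moved = sum(len(r.payload) for r in matches)
    return matches, bytes_moved


# Hypothetical table: 1,000 rows of 1 KiB each; the query keeps every tenth row.
table = [Row(key=i, payload=bytes(1024)) for i in range(1000)]
keep_every_tenth = lambda r: r.key % 10 == 0

host_matches, host_bytes = host_side_filter(table, keep_every_tenth)
drive_matches, drive_bytes = drive_side_filter(table, keep_every_tenth)

print(f"host-side filter moved  {host_bytes} bytes")
print(f"drive-side filter moved {drive_bytes} bytes")
```

Both paths return the same rows; the difference is how much data travels to the host, and how much filtering work the host CPU is spared.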
Smart Networking Acceleration
More Efficient Infrastructure
Adaptable to support new protocols
Extensible with programming in P4 & C/C++

SolarFlare SmartNIC
Processing >100 million packets/sec
Dual 100G QSFP
Under 75 watts
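The slide does not state the frame size behind the packet-rate figure. As a point of reference, the sketch below relates packet rate to link speed using standard Ethernet framing overhead (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap per frame). The function name and the chosen frame sizes are illustrative assumptions, not Solarflare specifications.

```python
# Per-frame overhead on the wire: 8-byte preamble/SFD + 12-byte inter-frame gap.
PREAMBLE_AND_IFG_BYTES = 20


def max_packets_per_second(link_bits_per_s: float, frame_bytes: int) -> float:
    """Theoretical line-rate packet rate for one Ethernet port."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_IFG_BYTES) * 8
    return link_bits_per_s / bits_on_wire


LINK_BITS_PER_S = 100e9  # one 100G port
PORTS = 2                # "Dual 100G QSFP"

for frame_bytes in (64, 256, 1518):
    per_port = max_packets_per_second(LINK_BITS_PER_S, frame_bytes)
    print(f"{frame_bytes:>5}-byte frames: "
          f"{per_port / 1e6:7.2f} Mpps per port, "
          f"{PORTS * per_port / 1e6:7.2f} Mpps across both ports")
```

At the minimum 64-byte frame size a single 100G port tops out at about 148.81 million packets per second, so a rate above 100 million packets per second is in the range of small-packet line rate for this class of link.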
Now Shipping to multiple tier-1 customers.
General availability (GA) in 2H 2019
WWW.XILINX.COM/ACCELERATOR-PROGRAM
More Information Available on Xilinx.com
Xilinx.com
Product Brief
Product Selection Guide
Getting Started Guide
Data Sheet
ML Solution Brief
SDAccel Solution Brief
ABR Transcoding Solution Brief
Accelerating DNNs with Alveo White Paper
Applications Directory
https://www.xilinx.com/alveo