October 2018
Ziv Serlin - VP Architecture & Co-founder
E8 – Serving Data Hungry Applications
Presenter
Ziv Serlin, Co-Founder & VP Architecture (IBM XIV, Primary Data, Intel)
At E8, Mr. Serlin spent the first two years architecting the E8 product; more recently he has focused on extensive engagements with E8 end customers, exploring and defining their E8 deployment models together with them, and he also leads strategic engagements with E8 hardware partners. An expert in designing complex storage systems, Mr. Serlin has extensive experience as a system architect at Intel and was the HW R&D manager for the IBM/XIV product (a scale-out, high-end enterprise block storage system). He earned a BSc in Computer Engineering at the Technion and holds a storage patent from his work at IBM.
About E8 Storage
• Founded in November 2014 by storage industry veterans from IBM-XIV
• Leading NVMe over Fabrics solution in the market
• In production with customers in the U.S. and Europe, now expanding to Asia
• Awarded 10 granted patents + 4 pending for the E8 architecture
• Worldwide team:
  • R&D in Tel Aviv
  • Sales & marketing in the U.S., Europe and Asia
• Flash Memory Summit 2016 & 2017 Most Innovative Product Award
Storage Growth is Data Growth
[Chart] While business volume grows 5% a year, the size of the database grows 60% a year, and the performance load on that database grows 100% a year.
E8 Storage Accelerates Data Hungry Applications
Big Data Analytics | Image Recognition | Real-time Security | AI/ML | Video 4K Post-Prod
The Problem (Part 1): Why not use local SSDs in servers?
• “The DevOps problem”: things that work on laptops become 10x slower on the production infrastructure
• “The islands of storage problem”: local SSDs in servers mean inefficient capacity utilization and no sharing of SSD data
• Local SSDs couple storage and compute
• Server purchasing requires an upfront investment in SSDs
Local SSDs today achieve 10x lower latency than all-flash arrays.
[Chart: Local SSD ~0.1ms vs. AFA ~1ms, with the gap between them marked "???"]
The Problem (Part 2): Why not use SSDs in SAN/NAS?
• Not enough performance: traditional all-flash arrays (SAN/NAS) get 10%-20% of the potential performance of NVMe SSDs
• Classic “scale-up” bottleneck
• Dual-controller bottleneck: all I/O is gated by the controller CPU, and switching the SSDs from SAS to NVMe cannot alleviate the controller bottleneck
First gen architectures cannot unlock the full performance of NVMe
The Solution: NVMe Over Fabrics
• By attaching SSDs over the network, rather than using local SSDs in each server, customers are able to (see the rough cost sketch below):
  • Reduce the total acquisition cost of SSDs by 60%
    • Defer capacity purchases until needed
    • Pay as you grow, at lower future prices
  • Improve capacity utilization of SSDs to 90%
    • Shared volumes vs. local replicas
    • Easily add capacity only as needed
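As a rough illustration of how the two levers above interact, here is a minimal back-of-envelope sketch. The baseline utilization, capacity, and $/TB figures are assumptions invented for the example, not published E8 numbers; only the 90% shared-utilization figure comes from the slide.

```python
# Back-of-envelope SSD cost sketch (all inputs are illustrative assumptions).
usable_tb_needed = 100      # usable capacity the applications need (TB), assumed
price_per_tb = 400.0        # assumed $/TB flash price

# Local SSDs: capacity is stranded per server, so utilization is low
# and everything must be bought upfront.
local_utilization = 0.45    # assumed utilization of server-local SSDs
local_raw_tb = usable_tb_needed / local_utilization
local_cost = local_raw_tb * price_per_tb

# Shared NVMe: the slide claims ~90% utilization, plus the ability to
# defer purchases (pay as you grow at lower future prices).
shared_utilization = 0.90
shared_raw_tb = usable_tb_needed / shared_utilization
shared_cost = shared_raw_tb * price_per_tb

print(f"local raw TB:  {local_raw_tb:.0f}  cost: ${local_cost:,.0f}")
print(f"shared raw TB: {shared_raw_tb:.0f}  cost: ${shared_cost:,.0f}")
print(f"savings from utilization alone: {1 - shared_cost / local_cost:.0%}")
```

Under these assumptions, utilization alone saves about 50%; deferred purchases at falling flash prices would plausibly close the gap to the 60% figure on the slide.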
4 Year Cost of SSDs in a Data-Center
Moving to shared NVMe storage delivers strong TCO for large scale customer deployments
[Chart: 4-year SSD cost comparison, local SSDs vs. shared NVMe]
E8 Storage Unlocks the Performance of NVMe
Metric                        AFA (24 SSDs)   Single NVMe SSD   E8 (24 NVMe SSDs)
Read latency @4K (us)         1000            100               120
IOPS @4K read                 300K            750K              10M
Read/Write bandwidth (GB/s)   2.4             3.1               40
See the Demo!
E8 Storage Product Overview
What is NVMe™? (Non-Volatile Memory Express)
• High performance, low latency
  • Efficient protocol with lower stack overhead
  • Exponentially more queues/commands than SAS
  • Parallel processing for SSDs vs. serial for HDDs
• Support for fabrics (NVMe-oF™), as in the attach sketch below
  • Originally designed for PCIe (internal to servers)
  • Expands support to other transport media:
    • RDMA-based: RoCE, iWARP, InfiniBand
    • Non-RDMA: FC, TCP
  • Maintains the NVMe protocol end to end
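As a minimal sketch of what attaching a fabric-hosted NVMe namespace looks like on a generic Linux host using the standard nvme-cli initiator: the address, port, and NQN below are made-up placeholders, and E8's own host agent uses its patented data path rather than this stock initiator.

```python
# Minimal sketch: attaching a remote NVMe-oF namespace on a Linux host
# with the standard nvme-cli tool (requires root and the nvme-cli package).
# The address and NQN are placeholders, not E8-specific values.
import subprocess

TARGET_ADDR = "192.168.1.10"                  # placeholder fabric address
TARGET_NQN = "nqn.2018-01.example:subsys0"    # placeholder subsystem NQN

# RDMA transport (RoCE/InfiniBand); NVMe-oF also defines FC and TCP transports.
subprocess.run(
    ["nvme", "connect",
     "--transport", "rdma",
     "--traddr", TARGET_ADDR,
     "--trsvcid", "4420",      # conventional NVMe-oF port
     "--nqn", TARGET_NQN],
    check=True,
)

# The remote namespace now appears as a local block device (e.g. /dev/nvme1n1),
# still speaking native NVMe end to end.
print(subprocess.run(["nvme", "list"], capture_output=True, text=True).stdout)
```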
Communication protocol designed specifically for flash storage
        I/O Queues   Commands per Queue
SAS     1            256
NVMe    65,535       64,000
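To make the parallelism gap concrete, a quick calculation of the theoretical number of in-flight commands each protocol allows, using the figures from the table (real controllers and SSDs expose far fewer queues in practice):

```python
# Theoretical in-flight command capacity, from the table above.
# Real controllers and drives support far fewer queues in practice.
sas_capacity = 1 * 256               # 1 queue x 256 commands
nvme_capacity = 65_535 * 64_000      # 65,535 queues x 64,000 commands each

print(f"SAS : {sas_capacity:,} outstanding commands")
print(f"NVMe: {nvme_capacity:,} outstanding commands")
print(f"ratio: ~{nvme_capacity // sas_capacity:,}x")
```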
The E8 Storage Difference
• Unleash the parallelism of NVMe SSDs
  • Direct drive access for near line-rate performance
  • Separation of data and control paths; no controller bottleneck
  • E8 Agent offloads up to 90% of data path operations
• Simple, centralized management
  • Intuitive management GUI for host/volume management
  • E8 Agents auto-discover assigned LUNs
• Scalable in multiple dimensions
  • Up to 126 host agents per E8 Controller
  • Up to 8 Controllers per host; 2PB in a single management domain
A new architecture built specifically for high performance NVMe
Designed for Availability and Reliability
• Host agents operate independently
  • Failure of one agent (or more) does not affect other agents
  • Access to shared storage is not impacted
• RAID-6/RAID-5/RAID-10 data protection (usable-capacity sketch below)
• Network multi-pathing
• Enclosure high availability
  • Option 1: HA enclosure + dual-ported SSDs
  • Option 2: Cross-enclosure HA + single-ported SSDs
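For a sense of the capacity cost of each protection scheme listed above, a small sketch computing the usable fraction of raw capacity. It assumes a single group spanning n drives, a textbook simplification that says nothing about E8's actual distributed RAID layout.

```python
# Usable-capacity fractions for the protection schemes listed above,
# assuming one group spanning n drives (a simplification; E8's
# distributed RAID layout is not described at this level in the deck).
def usable_fraction(scheme: str, n: int) -> float:
    if scheme == "RAID-10":
        return 0.5                 # mirrored pairs
    if scheme == "RAID-5":
        return (n - 1) / n         # one drive's worth of parity
    if scheme == "RAID-6":
        return (n - 2) / n         # two drives' worth of parity
    raise ValueError(scheme)

for scheme in ("RAID-10", "RAID-5", "RAID-6"):
    print(f"{scheme} over 24 drives: {usable_fraction(scheme, 24):.1%} usable")
```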
No single point of failure anywhere in the architecture
[Diagram: host servers with E8 host agents]
E8 Storage Customers and Use-Cases
Genomic Acceleration with E8 Storage
• "We were keen to test E8 by trying to integrate it with our Univa Grid Engine cluster as a consumable resource of ultra-performance scratch space. Following some simple tuning and using a single EDR link we were able to achieve about 5GB/s from one node and 1.5M 4k IOPS from one node. Using the E8 API we were quickly able to write a simple Grid Engine prolog/epilog that allowed for a user-requestable scratch volume to be automatically created and destroyed by a job. The E8 box behaved flawlessly and the integration with InfiniBand was simpler than we could have possibly expected for such a new product."
• - Dr. Robert Esnouf, Director of Research Computing, Oxford Big Data Institute and Wellcome Centre for Human Genetics
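As a hedged illustration of the kind of prolog/epilog integration described in the quote: the sketch below provisions a per-job scratch volume through a hypothetical REST endpoint. The URL, routes, and field names are invented for the example; the actual E8 API is not shown in this deck.

```python
# Hypothetical sketch of a Grid Engine prolog that requests a scratch
# volume before a job runs. All endpoint paths and fields are invented
# for illustration; consult the real E8 API documentation.
import os
import requests

E8_API = "https://e8-controller.example:8443/api/v1"   # placeholder URL

def create_scratch_volume(job_id: str, size_gb: int) -> dict:
    """Ask the controller for a job-scoped scratch volume."""
    resp = requests.post(
        f"{E8_API}/volumes",
        json={"name": f"scratch-{job_id}", "size_gb": size_gb},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

def delete_scratch_volume(job_id: str) -> None:
    """Epilog counterpart: tear the volume down when the job ends."""
    requests.delete(f"{E8_API}/volumes/scratch-{job_id}", timeout=30).raise_for_status()

if __name__ == "__main__":
    # Grid Engine exposes the job id to prolog/epilog scripts via $JOB_ID.
    job_id = os.environ.get("JOB_ID", "demo")
    volume = create_scratch_volume(job_id, size_gb=500)
    print(f"created scratch volume: {volume}")
```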
Shared NVMe as a fast tier for parallelizing genomic processing
From 10 hours per genome to 1 hour for 10 genomes!
E8 for AI/ML with IBM GPFS and Nvidia
• A GPU cluster requires 0.5PB-1PB of shared fast storage
• But GPU servers have no real estate for local SSDs…
• E8 Storage provides concurrent access for 1000 (!) GPUs per cluster
• 10x Performance of Pure Storage FlashBlade
• 4x Performance of IBM ESS SSD Appliances, for half the cost
Shared NVMe Accelerates Deep Learning
[Chart: Cost ($/GBu): Pure Storage FlashBlade vs. IBM GPFS + ESS vs. E8 + GPFS]
[Chart: Images per second, per GPU node (ResNet-50 image recognition training) at 1, 10 and 100 GPU nodes: Pure Storage, IBM GPFS + ESS, E8 + GPFS]
GPU farm: Nvidia DGX-1
• Up to 8 GPUs per node
• GPFS Client + E8 Agent run on x86 within the GPU server
• Up to 126 GPU nodes in a cluster
• Mellanox 100G IB interconnect
Shared NVMe storage: E8-D24 2U24-HA
• Dual-port 2.5” NVMe drives
• Up to 184TB (raw) per 2U
• Patented distributed RAID-6
E8 for AI/ML with IBM GPFS and Nvidia
• The GPU environment is perfect for E8
  • x86 processors on GPU nodes are free: run the GPFS client + E8 agent
  • GPU clusters are typically connected with 100G IB or RoCE
  • GPU nodes have no real estate for SSDs: an external shared SSD solution is required
• Highly scalable, up to 1000 (!) GPUs in a single cluster (see the arithmetic below)
  • 1-8 GPUs per node, up to 126 nodes per cluster
  • Scale the number of NVMe drives and NVMe enclosures easily: 20TB - 2PB
• 10x the performance of Pure Storage FlashBlade
  • AI/ML data sets are incompatible with de-dup and compression
  • Which also ruin performance…
• 4x the performance of IBM ESS SSD appliances, for half the cost
  • Open hardware architecture
  • No NSDs: a single hop from GPFS clients to SSDs through E8’s patented architecture
  • All network attach; no SAS cabling, no one-to-one mappings
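The 1000-GPU figure follows directly from the per-node and per-cluster limits above:

```python
# Where the "up to 1000 GPUs per cluster" figure comes from.
nodes_per_cluster = 126   # max host agents per E8 Controller
gpus_per_node = 8         # e.g. an Nvidia DGX-1
print(nodes_per_cluster * gpus_per_node, "GPUs per cluster")  # -> 1008
```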
Shared NVMe Accelerates AI/ML Workloads
[Charts: Epoch time (hours) and cost ($/GBu): Pure Storage FlashBlade vs. IBM GPFS + ESS vs. Weka.IO (NVMe) vs. E8 + GPFS]
E8 Storage accelerates data-hungry GPU applications
• Better throughput: E8 Storage processed more images per second than local SSD
• Lower latency: training time was faster with E8 Storage than with local SSD
Using E8 with IBM Spectrum Scale
• Scalable to larger configurations
  • Can mix connectivity depending on requirements
• Standalone pool: shared LUNs
• LROC: non-shared LUNs (direct-connect clients only)
• HAWC
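For concreteness, a minimal sketch of what the standalone-pool and LROC options could look like as a Spectrum Scale NSD stanza file. The device names, NSD names, server names, and pool name are placeholders invented for the example; exact stanzas depend on your Spectrum Scale version and the E8 deployment guide.

```
# Hypothetical stanza file for mmcrnsd -F; all names are placeholders.
%nsd:
  device=/dev/e8vol01        # shared E8 LUN surfaced by the E8 Agent
  nsd=e8_nsd01
  servers=nsd01,nsd02
  usage=dataAndMetadata
  pool=e8pool

# LROC variant: a non-shared E8 LUN used as a local read cache on a client.
%nsd:
  device=/dev/e8cache01
  nsd=e8_lroc01
  usage=localCache
```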
[Diagram: GPFS clients and NSD servers running the E8 Agent, connected over IB/RoCE to an E8-D24 enclosure (dual-port SSDs, E8 MDS pair, RAID-6)]
Fastest Shared Block Storage in the World
• 0.57ms - record-breaking response time!*
  • 45% lower ORT for the same builds
  • 8x lower latency on average
• The power of Intel Optane: ultra-low latency for data-intensive apps
• More performance, less hardware
* As of SPEC SFS®2014_swbuild results published August 2018. SPEC SFS2014 is the industry standard benchmark for file storage performance. See all published results at https://www.spec.org/sfs2014/results/
• E8 Storage: 24 NVMe SSDs in 2U
• WekaIO: 64 NVMe SSDs in 8U
• Huawei 6800 F: 60 SAS SSDs in 14U
8x lower latency!
STAC®-M3™ Benchmark for High Speed Analytics
• High performance with eXtremeDB®
  • Real-time analytics for tick databases
  • Designed for today’s financial systems
  • Accelerated with Intel Optane technology
• Record-breaking performance!
  • Faster response times in 5 of 17 operations*
  • More consistent response times overall
• Lowest overall execution time: more than 2.5x faster than the competition!
* Of the published, audited results on https://stacresearch.com/ as of May 2018. Graphs show the 2 closest competitors for overall results.
[Charts: Best STAC-M3 response times (ms) for 10T.MKTSNAP*, 1T.WKHIBID*, 1T.YRHIBID-2, 100T.VWAB-12D-NO, 10T.VOLCURV, 1T.NBBO, 1T.WRITE.LAT2: E8 Storage vs. Competitor A and Competitor B]
17x Faster!
Roadmap: NVMe Takes Over the Data Center
• Storage: NVMe tiers
  • Performance tier: Intel Optane, Samsung Z-NAND
  • Capacity tier: TLC/QLC, Intel ruler / Samsung M.3
  • E8 automates the tiering and resiliency scheme per tier
• Networking: NVMe over Fabrics
  • E8 software is an enabler for managed JBOFs
  • E8 Agents can talk to standard commodity NVMe-oF JBOFs
• Compute: Smart NICs
  • NICs with CPUs allow E8 Storage to run without host friction
  • Support for VMware and Windows
Data Center Technology Trends Open the Door for E8 Storage Architecture
Smart NICs Accelerate NVMe-oF Adoption
• Before
  • E8 Agent runs in host server user space
  • Typically requires 1-2 CPU cores
  • Supports Linux servers
• After
  • E8 Agent runs inside the Smart NIC
  • Uses the CPU on the Smart NIC, not the host server
  • Support for Windows and VMware
Accelerate application performance with more processing power
View Live Demo with Broadcom and Mellanox Smart NICs
E8 Storage – Rack Scale Flash. No Compromise.
• Centralized storage reliability
• Hyper-scalability
• Affordable: 100% COTS
• PCIe SSD performance
THANK YOU!
• For more information, please feel free to contact:
  • Alfred Hui – [email protected] (Regional Director, Asia)
  • +86-184-2013-7210
  • WeChat: [QR code]