Caribou: Intelligent Distributed Storage
Zsolt István, David Sidler, Gustavo Alonso
Systems Group, Department of Computer Science, ETH Zurich
1
2
Rack-scale thinking
[Figure: a rack with a ToR switch, four storage nodes, and four compute nodes]
In the Cloud
In an Appliance
+ Provisioning
+ Independent Scalability
- Data movement bottleneck
3
Storage Design Options
Oracle Exadata
IBM PureData
Deuteronomy
…
Samsung YourSQL
Wisconsin SmartSSD
Kinetic Drives
BlueCache
…
Features similar to software
Balanced design
+ Full-fledged
- SW+HW overhead
- Large footprint
- Outside management
+ No-overhead access
+ Small footprint
Compute > Bandwidth Compute < Bandwidth
Compute ~ Bandwidth
Intelligent Distributed Storage with FPGAs
Easy integration on commodity network
Random access to tuples & in-storage scans
Selection predicate pushdown
Data replicated consistently to nodes
Extensible (open-source) design
4
What is Caribou?
[Figure: clients and four Caribou nodes connected through a 10Gbps switch]
fpgasystems
Field Programmable Gate Array
Reprogrammable hardware
Large number of configurable logic blocks
Tight integration, massive parallelism
Network/App Co-design
Innovation…
5
FPGA 101
[Figure: the cluster diagram again (clients, 10Gbps switch, four Caribou nodes), labeled "FPGA"]
6
Inside a Caribou node
Components of a Caribou node (with DRAM attached):
Network: TCP/IP, 1000s of connections, SW clients
Key-value management: Cuckoo hash table, slab memory allocation, bitmap indexes
Processing: Conditionals, Regex, Decompression
Replication: Primary/backup, Atomic Broadcast
Software clients, Key-value interface (Single-key lookup or Scanning)
The pipeline runs at the same speed as the network (line-rate)
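The slide names a cuckoo hash table as the index behind single-key lookups. As a rough illustration of why that structure suits a fixed-latency pipeline, here is a minimal software sketch in Python. It is not Caribou's hardware design: the class name, the two-table layout, the hash function, and the kick limit are all assumptions chosen for readability.

```python
import hashlib


class CuckooTable:
    """Toy two-table cuckoo hash (illustrative only, not Caribou's circuit)."""

    def __init__(self, slots_per_table=8, max_kicks=16):
        self.n = slots_per_table
        self.max_kicks = max_kicks
        self.tables = [[None] * self.n, [None] * self.n]

    def _slot(self, which, key):
        # Two hash functions derived by salting one digest (an assumption).
        digest = hashlib.blake2b(key, salt=bytes([which])).digest()
        return int.from_bytes(digest[:8], "little") % self.n

    def get(self, key):
        # A lookup probes at most two slots, so its cost is bounded.
        for which in (0, 1):
            entry = self.tables[which][self._slot(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key, value):
        # Update in place if the key is already stored.
        for which in (0, 1):
            idx = self._slot(which, key)
            entry = self.tables[which][idx]
            if entry is not None and entry[0] == key:
                self.tables[which][idx] = (key, value)
                return True
        # Otherwise insert, moving displaced entries to their other table.
        item, which = (key, value), 0
        for _ in range(self.max_kicks):
            idx = self._slot(which, item[0])
            item, self.tables[which][idx] = self.tables[which][idx], item
            if item is None:
                return True
            which = 1 - which
        # Gave up: the last displaced entry is dropped in this toy version.
        # A real design would resize, rehash, or keep a small overflow area.
        return False


table = CuckooTable(slots_per_table=64)
table.put(b"customer:1", b"Alice")
table.put(b"customer:2", b"Bob")
table.put(b"customer:1", b"Alicia")  # overwrite an existing key
```

Reads touch at most two candidate locations no matter how full the table is, which is the property that makes the structure attractive when every request has to move through a pipeline at a steady rate. Inserts carry the variable cost instead.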
7
Throughput of random access to storage
8
Random access response times
[Figure: response time [us] (axis 0 to 60) versus value size [B] (axis 0 to 256) for Get, Put/Update, and Put/Update (Replicated)]
• Response times comparable to SW on InfiniBand, but Caribou uses commodity networking
SELECT … FROM customer
WHERE age<35 AND purchases>2
AND address LIKE '%Luzern%CH%'
Multiple comparisons to constants (conjunction)
Substrings or regular expression matching [1]
Can filter compressed data (LZ77)
Extensible pipeline design
[1] Accelerating Pattern Matching Queries in Hybrid CPU-FPGA Architectures. D. Sidler, Zs. Istvan, M. Ewaida, G. Alonso. 2017 ACM SIGMOD/PODS Conference (SIGMOD'17)
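To make the pushed-down selection concrete, the following Python sketch evaluates the same conjunction in software: two comparisons to constants plus a LIKE pattern translated to a regular expression. It only mirrors the logic of the example query. The helper names (`like_to_regex`, `make_filter`) and the sample rows are hypothetical, and nothing here reflects how the FPGA circuits are built.

```python
import re


def like_to_regex(pattern):
    # SQL LIKE wildcards: % matches any run of characters, _ exactly one.
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def make_filter(max_age, min_purchases, address_like):
    # The constants are parameters, so one filter shape serves many queries.
    address_re = like_to_regex(address_like)

    def keep(row):
        return (row["age"] < max_age
                and row["purchases"] > min_purchases
                and address_re.fullmatch(row["address"]) is not None)

    return keep


# Made-up rows, for illustration only.
customers = [
    {"name": "a", "age": 29, "purchases": 5, "address": "Bahnhofstrasse 1, 6003 Luzern, CH"},
    {"name": "b", "age": 41, "purchases": 7, "address": "Seestrasse 9, 6004 Luzern, CH"},
    {"name": "c", "age": 30, "purchases": 1, "address": "Pilatusplatz 2, 6003 Luzern, CH"},
    {"name": "d", "age": 22, "purchases": 3, "address": "Limmatquai 4, 8001 Zurich, CH"},
]

keep = make_filter(max_age=35, min_purchases=2, address_like="%Luzern%CH%")
selected = [row["name"] for row in customers if keep(row)]
```

Running the filter next to the data means only the rows in `selected` would have to cross the network, rather than the whole table.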
9
Operator push-down
The filtering circuits are parameterized at runtime, with no overhead.
10
Exploiting Parallelism
[Figure: values read from DRAM pass through stages labeled Transform (parallel LZ77 units), Predicate (Comparison), and Regular Expressions (parallel Regex Cores); each Value becomes Value' and ends in a Keep? decision. Axis labels: Throughput, Complexity.]
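The Transform stage exists so that compressed values can be filtered. As a reminder of what LZ77 decoding involves, here is a minimal Python decoder followed by a substring check on the decoded value. The token format used here, (offset, length, literal) triples as in the textbook description of LZ77, is an assumption; the slides do not specify the encoding Caribou accepts.

```python
def lz77_decode(tokens):
    """Decode textbook LZ77 triples: (offset, length, literal byte or None)."""
    out = bytearray()
    for offset, length, literal in tokens:
        start = len(out) - offset
        for i in range(length):
            # Copy byte by byte so a match may overlap the bytes it produces.
            out.append(out[start + i])
        if literal is not None:
            out.append(literal)
    return bytes(out)


# "abc" as three literals, then a 6-byte back-reference to offset 3, then "d".
tokens = [
    (0, 0, ord("a")),
    (0, 0, ord("b")),
    (0, 0, ord("c")),
    (3, 6, ord("d")),
]

value = lz77_decode(tokens)
keep = b"cab" in value  # filter applied to the decompressed value
```

Each value can be decoded independently of the others, which is what allows several decompression units to work side by side as the figure suggests.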
11
Scan and filter
Choice of filter and value size do not impact scan rate.
Bound by the filter performance
Bound by the network/client
Scan rate in GB/s is the same regardless of value size
Filtering can be combined with random access reads as well
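The network bound can be estimated with back-of-the-envelope arithmetic. The sketch below takes the 10Gbps link speed shown in the cluster diagram and computes an upper limit on how many values per second can reach clients. It ignores TCP/IP and framing overhead, so real numbers will be lower; the function names are illustrative.

```python
LINK_GBPS = 10  # switch speed shown on the slides


def link_bytes_per_second(gbps=LINK_GBPS):
    # 10 Gbit/s corresponds to 1.25 GB/s of raw capacity.
    return gbps * 1e9 / 8


def max_values_per_second(value_size_bytes, gbps=LINK_GBPS):
    # Upper bound that ignores protocol headers and client-side limits.
    return link_bytes_per_second(gbps) / value_size_bytes


for size in (64, 128, 256):
    print(size, max_values_per_second(size))
```

When few values pass the filter, the data returned stays well below this ceiling and the scan is limited by the filter instead, which matches the two regimes named above.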
12
Near Data Processing without Surprises
In-Storage Processing
• Stand-alone boards, MPSoC (ARM+FPGA)
• Add NVMe flash, non-volatile memory
• Explore different KVS (memcached, redis, …)
In-Network Processing
• Microsoft Catapult NICs
• Work on streaming data
• Distributed service in the cloud
Accelerator
• Intel Xeon+FPGA
• Offload computation without partitioning or copying data
13
“The Times They Are A-Changin’”
Data movement bottleneck on many levels
Caribou – Intelligent Distributed Storage
• Software-like service in a small footprint
• Balanced design with “right amount” of compute
Caribou – Platform to Explore Near-data Processing
• Open source, modular and portable
• Data processing operators applicable on other HW platforms
https://github.com/fpgasystems/caribou
14
Time to Explore…
https://www.systems.ethz.ch/fpga/