
FAWN: A Fast Array of Wimpy Nodes Presented by: Clint Sbisa & Irene Haque.

Date post: 21-Dec-2015
Page 1: FAWN: A Fast Array of Wimpy Nodes Presented by: Clint Sbisa & Irene Haque.

FAWN: A Fast Array of Wimpy Nodes

Presented by: Clint Sbisa & Irene Haque

Page 2

Motivation

• Large-scale data-intensive applications
    - Facebook, LinkedIn, Dynamo
• CPU-I/O gap
    - storage, network and memory bottlenecks
    - low CPU utilization
• CPU power
    - slower CPUs execute more queries per second per Watt
    - 1 billion vs. 100 million instructions per Joule (slow vs. fast CPUs)
    - inefficient energy-saving techniques
• Memory power

Page 3

FAWN

• Data-intensive, computationally simple workloads
    - Small objects: 100 B to 1 KB
• Cluster of embedded CPUs using flash storage
    - Efficient
    - Fast random reads
    - Slow random writes
• FAWN-KV
    - Key-value storage
    - Consistent hashing
• FAWN-DS
    - Data store
    - Log-structured

Page 4

FAWN - DS

• Log-structured key-value store
• Contains all values in a key range for each virtual ID
• Maps 160-bit keys
    - Hash Index bucket = i low-order index bits
    - key fragment = next 15 low-order bits
• 6-byte in-memory Hash Index entry stores the key fragment and a pointer
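The key-splitting scheme above can be sketched in a few lines. This is only an illustration, not the authors' code: the 160-bit key and the 15-bit fragment come from the slide, while `split_key`, `hash_key`, and the choice of 16 index bits for `INDEX_BITS` are assumptions made for the example.

```python
import hashlib

KEY_BITS = 160    # from the slide: keys are 160 bits wide
FRAG_BITS = 15    # from the slide: the fragment is the next 15 low-order bits
INDEX_BITS = 16   # assumption: a value of i picked purely for illustration


def split_key(key, index_bits=INDEX_BITS):
    """Return (bucket, fragment) for a 160-bit integer key."""
    if not 0 <= key < (1 << KEY_BITS):
        raise ValueError("key must fit in 160 bits")
    bucket = key & ((1 << index_bits) - 1)                    # i low-order bits
    fragment = (key >> index_bits) & ((1 << FRAG_BITS) - 1)   # next 15 bits
    return bucket, fragment


def hash_key(name):
    """Hypothetical helper: derive a 160-bit key from a byte string via SHA-1."""
    return int.from_bytes(hashlib.sha1(name).digest(), "big")
```

Because only 15 bits of the key are kept in memory, a fragment match is a hint rather than proof; the full key stored with the entry still has to be compared after the read.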

Page 5

FAWN - DS

• Basic functions
    - Store
    - Lookup
    - Delete
• Concurrent operations
• Virtual node maintenance
    - Split
    - Merge
    - Compact
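As a rough sketch of how these operations fit a log-structured design, the toy class below keeps an append-only log plus an in-memory index. It is an assumption-laden illustration: `LogStore` is a hypothetical name, a Python list stands in for the on-flash log, a dict stands in for the compact Hash Index, and Split and Merge are not shown.

```python
class LogStore:
    """Toy append-only key-value store; a list stands in for the on-flash log."""

    _TOMBSTONE = object()  # marker appended to the log when a key is deleted

    def __init__(self):
        self.log = []      # append-only sequence of (key, value) entries
        self.index = {}    # key -> offset of the newest live entry in the log

    def store(self, key, value):
        self.index[key] = len(self.log)   # point the index at the new tail entry
        self.log.append((key, value))     # sequential append, never an in-place write

    def lookup(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        return self.log[offset][1]        # a single random read into the log

    def delete(self, key):
        if key in self.index:
            self.log.append((key, self._TOMBSTONE))  # record the delete in the log
            del self.index[key]                      # and drop the index entry

    def compact(self):
        """Rewrite the log with only live entries, then rebuild the index."""
        live = [self.log[off] for off in sorted(self.index.values())]
        self.log = live
        self.index = {key: i for i, (key, _) in enumerate(live)}
```

Every write is an append, which suits flash (fast random reads, slow random writes); the cost is that stale entries pile up until `compact` reclaims them.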

Page 6

FAWN - KV

• Consistent hashing of back-end VIDs
• Management node
    - assigns each front-end to the circular key space
• Front-end nodes
    - manage their key space
    - forward out-of-range requests
• Back-end nodes (VIDs)
    - contact a front-end when joining
    - own a key range
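The consistent-hashing idea behind this slide can be sketched as a sorted ring of VID positions, where a key belongs to the first VID at or after its own position. This is a minimal sketch under stated assumptions: `Ring`, `ring_position`, and the use of SHA-1 over string names are illustrative choices, not details taken from the slides.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a string to a point on a 160-bit ring using SHA-1 (assumed hash)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    """Toy consistent-hashing ring of virtual IDs (VIDs)."""

    def __init__(self):
        self._points = []   # sorted ring positions of all VIDs
        self._owners = {}   # ring position -> VID name

    def add_vid(self, vid):
        pos = ring_position(vid)
        if pos in self._owners:
            raise ValueError("VID position already taken: " + vid)
        bisect.insort(self._points, pos)
        self._owners[pos] = vid

    def owner(self, key):
        """Return the first VID at or after the key's position, wrapping around."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_left(self._points, ring_position(key))
        return self._owners[self._points[i % len(self._points)]]
```

With this layout, adding a VID only takes keys away from its immediate successor on the ring; every other key keeps its owner, which is what keeps joins cheap.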

Page 7

FAWN - KV

• Chain replication
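The slide only names the technique. In chain replication as it is generally described, writes enter at the head of a chain of replicas, flow through each replica in order, and are acknowledged by the tail, which also serves reads. The sketch below illustrates that flow in-process; `Replica` and `Chain` are hypothetical names, and failures, networking, and the FAWN-specific join protocol are left out.

```python
class Replica:
    """One node in the chain, holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Chain:
    """Toy chain: writes enter at the head, reads are served by the tail."""

    def __init__(self, replicas):
        if not replicas:
            raise ValueError("a chain needs at least one replica")
        self.replicas = list(replicas)

    def put(self, key, value):
        for replica in self.replicas:       # head -> ... -> tail, in order
            replica.data[key] = value
        return self.replicas[-1].name       # the tail acknowledges the write

    def get(self, key):
        return self.replicas[-1].data.get(key)   # reads go to the tail
```

Serving reads from the tail means a reader only ever sees values that every replica in the chain has already applied.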

Page 8

FAWN - KV

• Join
    - split key range
    - pre-copy
    - chain insertion
    - log flush
• Leave
    - merge key range
    - join a new replica into each chain the departing node belonged to

Page 9

Individual Node Performance

• Lookup speed

• Bulk store speed: 23.2 MB/s, or 96% of raw speed

Page 10

Individual Node Performance

• Put speed

• Compared to BerkeleyDB: 0.07 MB/s – shows necessity of log-based filesystems

Page 11

Individual Node Performance

• Read- and write-intensive workloads

Page 12

System Benchmarks

• System throughput and power consumption

Page 13

Impact of Ring Membership Changes

• Query throughput during node join and maintenance operations

Page 14

Impact of Ring Membership Changes

• Query latency

Page 15

Alternative Architectures

• Large dataset, low query rate → FAWN+Disk

• Small dataset, high query rate → FAWN+DRAM

• Middle range → FAWN+SSD

Page 16

Conclusion

• Fast and energy efficient processing of random read-intensive workloads

• Over an order of magnitude more queries per Joule than traditional disk-based systems

