
FAWN: Fast Array of Wimpy Nodes

A technical paper presentation in fulfillment of the requirements of CIS 570 – Advanced Computer Systems – Fall 2013

Scott R. Sideleau [email protected] 14-Nov-2013

Overview
• Identify the problem space
• FAWN as a solution
  – Architecture principles
  – Unique key-value storage
• Evaluate and benchmark a 21-node FAWN cluster
• Identify when FAWN makes sense

Theoretical Problem Space
• CPU-I/O gap
  – Modern processors are so fast that they spend much of their time idle, waiting on I/O
• CPU power consumption grows super-linearly with speed
  – Larger caches, needed to keep the superscalar pipelines fed, are a driver
• Dynamic Voltage and Frequency Scaling (DVFS) is inefficient
  – e.g. Intel SpeedStep technology
  – Even at low load, the system generally still draws about 50% of its peak power

What’s the real problem?
• Electricity is expensive!
  – Home usage is measured in kW, data center usage in MW
• Facebook uses up to $1 million a month in electricity
  – Only three data centers!
    • Oregon, USA
    • Virginia, USA
    • Sweden

Facebook’s Not Playing Around
• Fourth data center to be powered by renewable wind
  – Iowa, USA

http://goo.gl/sFmmxz (dated 14-Nov-2013)

Proposed Solution
• Fast Array of Wimpy Nodes (FAWN)
  – Bridge the I/O gap
    • Use slower CPUs and faster Flash storage
  – Reduce power consumption per node
    • Embedded CPUs consume significantly less power
  – Address distributed storage for the new architecture
    • New key-value storage system (FAWN-KV)
  – Complementary per-node data store (FAWN-DS)

System Architecture

Basic Functions

Replication & Consistency

Understanding Flash Storage
• Fast random reads
  – 175x faster than HDDs
  – Performance varies widely between makes and models
• Efficient I/O
  – Very low power
  – High queries-per-Joule rate vs. HDDs
• Slow random writes
  – Expensive erase/write cycle
  – Motivation for log-structured (i.e. sequential) data storage
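The motivation for log-structured storage can be illustrated with a toy example. This is a minimal in-memory sketch, assuming a Python list stands in for the append-only log on flash and a dict stands in for an in-memory index; the class and method names (`LogStore`, `store`, `lookup`, `delete`) are hypothetical and are not taken from the FAWN-DS code.

```python
class LogStore:
    """Toy append-only key-value log with an in-memory index (illustrative only)."""

    def __init__(self):
        self.log = []     # stands in for the sequential log on flash
        self.index = {}   # key -> offset of the most recent entry for that key

    def store(self, key, value):
        # Writes are always appended; nothing is updated in place.
        self.index[key] = len(self.log)
        self.log.append((key, value))

    def lookup(self, key):
        # One index probe, then one random read from the log.
        offset = self.index.get(key)
        if offset is None:
            return None
        return self.log[offset][1]

    def delete(self, key):
        # A delete is also an append: a tombstone entry whose value is None.
        if key in self.index:
            self.log.append((key, None))
            del self.index[key]


db = LogStore()
db.store("a", 1)
db.store("a", 2)   # supersedes the first entry, which is now garbage in the log
db.store("b", 3)
db.delete("b")
```

Every mutation becomes a sequential append, which avoids the slow random writes noted above, while reads rely on the fast random-read behavior of flash. The cost is that stale entries accumulate in the log until something cleans them up.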

Optimized Maintenance Functions
• Split
  – Used when adding a node to the cluster
  – Read each entry, then write it sequentially to one of two new data stores depending on whether its key is in range
• Merge
  – Used when removing a node from the cluster
  – The stores cover mutually exclusive key ranges, so one data store is simply appended to the other
• Compact
  – Cleans up entries in a data store
  – Skips orphaned, out-of-range, and deleted entries and writes the rest to a new data store
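The three maintenance operations can be sketched over a toy log. This is only an illustration under stated assumptions: a data store is modeled as a Python list of `(key, value)` tuples, keys are integers, a value of `None` marks a deleted entry, and the function names and half-open range `[lo, hi)` are hypothetical choices rather than the paper's interface.

```python
def split(log, lo, hi):
    """Scan the log once; in-range entries go to one new store, the rest to another."""
    in_range, rest = [], []
    for key, value in log:
        (in_range if lo <= key < hi else rest).append((key, value))
    return in_range, rest


def merge(log_a, log_b):
    """The key ranges are disjoint, so merging is a plain sequential append."""
    return log_a + log_b


def compact(log, lo, hi):
    """Write a new store holding only the latest live, in-range entry per key."""
    latest = {}
    for offset, (key, _) in enumerate(log):
        latest[key] = offset          # later entries supersede earlier ones
    return [
        (key, value)
        for offset, (key, value) in enumerate(log)
        if latest[key] == offset      # skip orphans (superseded entries)
        and value is not None         # skip deleted entries
        and lo <= key < hi            # skip out-of-range entries
    ]


log = [(1, "a"), (7, "b"), (1, "c"), (7, None), (12, "d")]
low, high = split(log, 0, 10)
rejoined = merge(low, high)
cleaned = compact(log, 0, 10)
```

Each function reads its input front to back and only ever appends to its output, so all three map onto sequential reads and writes.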

Optimized Sequential Reads & Writes

Front-end Consistent Hashing
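As a rough illustration of consistent hashing in general, the sketch below places nodes on a ring and assigns each key to the first node at or after the key's position. The 32-bit ring, the use of SHA-1, the single position per node, and the names `ring_position`, `Ring`, and `owner` are all assumptions made for this example and are not details confirmed by the slides.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a point on a 2**32 ring (SHA-1 is an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest()[:4], "big")


class Ring:
    def __init__(self, nodes):
        self.points = sorted((ring_position(n), n) for n in nodes)
        self.positions = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        """Return the first node at or after the key's position, wrapping around."""
        i = bisect.bisect_left(self.positions, ring_position(key))
        return self.points[i % len(self.points)][1]


nodes = ["A", "B", "C", "D"]          # hypothetical back-end node names
ring = Ring(nodes)
bigger = Ring(nodes + ["E"])          # the same ring after one node joins
keys = [f"key{i}" for i in range(200)]
moved = [k for k in keys if ring.owner(k) != bigger.owner(k)]
```

The useful property is that a join only moves keys onto the new node: every key in `moved` is now owned by `"E"`, and all other keys stay where they were.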

Node Join

Node Leave
• Rather than split the data stores, nodes merge them
• In reality, this means…
  – Add a new replica into each chain the departing node belonged to
  – So, the processing is the same as a join event
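The chain repair described above can be sketched with a simplified model. The assumptions here are loud ones: each node has a single ID on the ring, a chain consists of a head node plus its next `r - 1` successors, and the key range of a departing head is taken over by its successor. The function names and node IDs are hypothetical.

```python
def chains(ring, r):
    """Map each head node to its chain: itself plus the next r - 1 successors."""
    n = len(ring)
    return {ring[i]: [ring[(i + j) % n] for j in range(r)] for i in range(n)}


def new_replicas(ring, r, departing):
    """For each chain the departing node was in, find the replica that is added."""
    before = chains(ring, r)
    after = chains([node for node in ring if node != departing], r)
    added = {}
    for head, members in before.items():
        if departing not in members:
            continue
        survivors = [m for m in members if m != departing]
        # If the head itself left, its successor becomes the head of that range.
        new_head = head if head != departing else survivors[0]
        added[head] = next(m for m in after[new_head] if m not in survivors)
    return added


ring = ["A", "B", "C", "D", "E"]      # hypothetical node IDs in ring order
repairs = new_replicas(ring, r=3, departing="C")
```

With three-way replication, node `C` belonged to the chains headed by `A`, `B`, and `C`, and each of those gains exactly one new member, which is the same work a join would trigger.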

Failure Detection
• Nodes are assumed to be fail-stop
  – Front-end and back-end nodes gossip at a known rate
    • On timeout, the front-end initiates a leave operation for the failed node
• Current design only copes with node failures
  – Coping with network failures requires future work
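A timeout-based detector of this kind can be sketched as follows. This is a minimal illustration, assuming the front-end records the time of the last message from each back-end and treats silence longer than a fixed timeout as a failure; the class name, method names, node names, and the 3-second timeout are all hypothetical. Timestamps are passed in explicitly so the example is deterministic.

```python
class FailureDetector:
    """Toy timeout-based detector (illustrative only)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}   # node -> time of the most recent heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def failed(self, now):
        # Under the fail-stop assumption, a silent node is treated as dead.
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


fd = FailureDetector(timeout=3.0)
fd.heartbeat("node-1", now=0.0)
fd.heartbeat("node-2", now=0.0)
fd.heartbeat("node-1", now=2.5)
suspects = fd.failed(now=4.0)   # node-2 has been silent for 4.0 s > 3.0 s
```

In this sketch the front-end would start a leave operation for every node in `suspects`. A detector like this cannot tell a crashed node from an unreachable one, which is why network failures need separate handling.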

Single Node Evaluation
• Performance is almost entirely dependent on the flash media

21-Node Evaluation
• In general, the back-ends prove to be well-matched

21-Node Evaluation
• Relatively responsive through maintenance operations

21-Node Evaluation
• Slightly slower than production key-value systems
  – Worst-case response times are on par

21-Node Evaluation
• Power draw is low and consistent across operations
  – Queries per Joule is an order of magnitude higher than in traditional production distributed systems
    • 1 billion instructions per Joule
    • 1/3 the frequency
    • 1/10 (or less) the power
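The efficiency metric is simple arithmetic: since 1 W is 1 J/s, queries per Joule is queries per second divided by power in watts. The sketch below works this through with made-up numbers. The throughput and wattage figures are hypothetical, chosen only to show what an order-of-magnitude gap looks like, and are not measurements from the paper or the slides.

```python
def queries_per_joule(queries_per_second, watts):
    """1 W = 1 J/s, so (queries/s) / (J/s) = queries/J."""
    return queries_per_second / watts


# Hypothetical figures, for illustration only.
wimpy = queries_per_joule(queries_per_second=1_000, watts=4)
brawny = queries_per_joule(queries_per_second=10_000, watts=400)
advantage = wimpy / brawny

# From the bullets above: 1/3 the frequency at 1/10 the power. Assuming work
# done scales with frequency, the per-Joule gain is (1/3) / (1/10).
cpu_gain = (1 / 3) / (1 / 10)
```

Here the hypothetical wimpy node serves a tenth of the traffic for a hundredth of the power, so it comes out ten times more efficient per Joule. The frequency and power ratios from the slide give a gain of roughly 3.3x on that simple scaling assumption, with more to gain if the power ratio is below 1/10.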

When does FAWN matter?
• It depends on the workload…

Questions?
Thanks very much!

