FAWN: Fast Array of Wimpy Nodes



A technical paper presentation in fulfillment of the requirements of CIS 570 – Advanced Computer Systems – Fall 2013

Scott R. Sideleau
ssideleau@umassd.edu
14-Nov-2013

Overview
• Identify the problem space
• FAWN as a solution
  – Architecture principles
  – Unique key-value storage
• Evaluate and benchmark a 21-node FAWN cluster
• Identify when FAWN makes sense

Theoretical Problem Space
• CPU-I/O gap
  – Modern processors are so fast that they spend much of their time idle, waiting on I/O
• CPU power consumption grows super-linearly with speed
  – Larger caches to keep the superscalar pipelines fed are one driver
• Dynamic Voltage and Frequency Scaling (DVFS) is inefficient
  – Intel SpeedStep technology is one example
  – CPU still generally operates at about 50% of peak power consumption

What’s the real problem?
• Electricity is expensive!
  – Home usage is measured in kW, data center usage in MW
• Facebook uses up to $1 million a month in electricity
  – Only three data centers!
    • Oregon, USA
    • Virginia, USA
    • Sweden

Facebook’s Not Playing Around
• Fourth data center to be powered by renewable wind energy
  – Iowa, USA

http://goo.gl/sFmmxz dtd 14-Nov-2013

Proposed Solution
• Fast Array of Wimpy Nodes (FAWN)
  – Bridge the I/O gap
    • Use slower CPUs and faster Flash storage
  – Reduce power consumption per node
    • Embedded CPUs consume significantly less power
  – Address distributed storage for the new architecture
    • New key-value storage system (FAWN-KV)
  – Complementary per-node data store (FAWN-DS)

System Architecture

Basic Functions

Replication & Consistency

Understanding Flash Storage
• Fast random reads
  – 175x faster than HDDs
  – Performance varies wildly between makes and models
• Efficient I/O
  – Very low power
  – High queries-per-Joule rate vs. HDDs
• Slow random writes
  – Expensive erase/write cycle
  – Motivation for log-structured (i.e. sequential) data storage
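The motivation for log-structured storage can be illustrated with a minimal sketch. It assumes an in-memory buffer standing in for a file on flash, and the class and method names (`LogStore`, `put`, `get`, `delete`) are hypothetical, not the FAWN-DS interface. Every write is appended at the tail, so flash never sees a random write, while an in-memory index turns each lookup into a single random read.

```python
import io


class LogStore:
    """Toy append-only key-value store: sequential writes, one random read per get."""

    def __init__(self):
        self.log = io.BytesIO()   # assumption: stands in for a log file on flash
        self.index = {}           # key -> (offset, length) of the newest value

    def put(self, key: str, value: bytes) -> None:
        offset = self.log.seek(0, io.SEEK_END)  # always write at the tail
        self.log.write(value)
        self.index[key] = (offset, len(value))  # older copies become stale

    def get(self, key: str):
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.log.seek(offset)                   # the single random read
        return self.log.read(length)

    def delete(self, key: str) -> None:
        # Simplification: only the index entry is dropped; the stale bytes
        # stay in the log until some later cleanup pass rewrites it.
        self.index.pop(key, None)
```

Overwriting a key leaves the old bytes in the log, which is why a store of this shape needs a periodic cleanup pass.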

Optimized Maintenance Functions
• Split
  – Used when adding a node to the cluster
  – Read, then sequentially write to two new data stores depending on whether the key is in range
• Merge
  – Used when deleting a node from the cluster
  – The stores are mutually exclusive, so append one data store to the other
• Compact
  – Cleans up entries in a data store
  – Skip orphaned, out-of-range, and deleted entries; write the rest to a new data store
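The three maintenance functions can be sketched as single sequential passes over a log. This is an illustration under stated assumptions, not the FAWN-DS code: a log is modeled as a Python list of `(hashed key, value)` pairs, a value of `None` marks a delete, and the function names and the half-open range `[lo, hi)` are hypothetical.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, Optional[bytes]]  # (hashed key, value); None marks a delete


def split(log: List[Entry], lo: int, hi: int) -> Tuple[List[Entry], List[Entry]]:
    """One sequential pass; entries with lo <= key < hi go to the first new store."""
    inside, outside = [], []
    for key, value in log:
        (inside if lo <= key < hi else outside).append((key, value))
    return inside, outside


def merge(a: List[Entry], b: List[Entry]) -> List[Entry]:
    """Key ranges are disjoint, so appending one log to the other is enough."""
    return a + b


def compact(log: List[Entry], lo: int, hi: int) -> List[Entry]:
    """Keep only the newest live entry per in-range key, in log order."""
    newest = {}
    for position, (key, _) in enumerate(log):
        newest[key] = position  # later entries supersede (orphan) earlier ones
    return [
        (key, value)
        for position, (key, value) in enumerate(log)
        if newest[key] == position and value is not None and lo <= key < hi
    ]
```

Each function only reads its input front to back and appends to its output, which matches the sequential access pattern flash handles well.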

Optimized Sequential Reads & Writes

Front-end Consistent Hashing
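As a rough illustration of consistent hashing, the sketch below places nodes and keys on the same ring and assigns each key to the first node at or after its position. The use of SHA-1, the `Ring` class, and the node names are assumptions made for the example; virtual nodes and replication are left out.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a string to a point on a 160-bit ring using SHA-1 (an assumption)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


class Ring:
    def __init__(self, nodes):
        # Sorted (position, node) pairs describe the ring in clockwise order.
        self.points = sorted((ring_position(n), n) for n in nodes)
        self.positions = [p for p, _ in self.points]

    def owner(self, key: str) -> str:
        """Return the first node at or clockwise of the key's position."""
        i = bisect.bisect_left(self.positions, ring_position(key))
        return self.points[i % len(self.points)][1]  # wrap past the last node
```

The useful property is that adding a node only moves the keys that fall into the new node's range; every other key keeps its owner.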

Node Join

Node Leave
• Rather than split the data stores, nodes merge them
• In reality, this means…
  – Add a new replica into each chain the departing node belonged to
  – So, the processing is the same as a join event

Failure Detection
• Nodes are assumed to be fail-stop
  – Front-end and back-end nodes gossip at a known rate
    • On timeout, the front-end initiates a leave operation for the failed node
• Current design only copes with node failures
  – Coping with network failures requires future work
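A timeout-based detector of the kind described above might look like the following sketch. The class name, the explicit `now` argument, and the timeout value are hypothetical choices for illustration; time is passed in rather than read from a clock so the behavior is easy to follow.

```python
class FailureDetector:
    """Tracks the last message seen from each node and flags silent ones."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self.last_seen = {}  # node name -> time of the most recent message

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def expired(self, now: float):
        """Return nodes silent for longer than the timeout and stop tracking them."""
        failed = [n for n, t in self.last_seen.items() if now - t > self.timeout]
        for n in failed:
            del self.last_seen[n]  # the caller would start the leave operation here
        return failed
```

Under the fail-stop assumption a silent node is treated as gone for good, so a detector like this cannot tell a crashed node from a network partition.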

Single Node Evaluation
• Performance almost entirely dependent on flash media

21-Node Evaluation
• In general, the back-ends prove to be well-matched

21-Node Evaluation
• Relatively responsive through maintenance operations

21-Node Evaluation
• Slightly slower than production key-value systems
  – Worst-case response times on par

21-Node Evaluation
• Power draw is low and consistent across operations
  – Queries per Joule is an order of magnitude higher than traditional production distributed systems
    • 1 billion instructions per Joule
    • 1/3 the frequency
    • 1/10 (or less) the power
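The last two ratios can be combined into a back-of-the-envelope efficiency figure. The sketch assumes, as a simplification not stated on the slide, that work done per second scales with clock frequency; the function name is hypothetical.

```python
from fractions import Fraction


def relative_work_per_joule(freq_ratio: Fraction, power_ratio: Fraction) -> Fraction:
    """Work per Joule relative to a baseline CPU.

    Assumes throughput is proportional to frequency, so
    (work / second) / (Joules / second) = freq_ratio / power_ratio.
    """
    return freq_ratio / power_ratio


# 1/3 the frequency at 1/10 the power, the ratios from the slide
gain = relative_work_per_joule(Fraction(1, 3), Fraction(1, 10))
```

Under that assumption `gain` is 10/3, so a core at a third of the speed still does roughly three times the work per Joule, and more if its power draw is below one tenth.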

When does FAWN matter?
• It depends on the workload…

QUESTIONS?
Thanks very much!

