Introduction | Design and Architecture | FAWN-DS | FAWN-KV | Evaluation

FAWN - a Fast Array of Wimpy Nodes

Tomasz Dubrownik

University of Warsaw

January 12, 2011

Tomasz Dubrownik FAWN - a Fast Array of Wimpy Nodes

Outline

1 Introduction

2 Design and Architecture

3 FAWN-DS

4 FAWN-KV

5 Evaluation

Key issues

Growing CPU vs. I/O gap

Contemporary systems must serve millions of users

Electricity consumed adds up to significant costs

Key issues

Is there a way to exploit the CPU vs. I/O gap to the users’ advantage?

Observations

Many industry problems exhibit massive data parallelism with relatively small computational demands

A fair number of real-life problems depend heavily on efficient, distributed key-value stores that span several gigabytes

Such stores often contain millions of small items (on the order of kilobytes)

A motivating example

Twitter

A wonderfully popular service, Twitter has all the above-mentioned properties. Each tweet is limited to 140 B. Fairly little processing is performed on the tweets, yet the search system alone handles an average of 12,000 queries per second, and a stream of over a thousand tweets per second enters the system. A high-performance key-value store is crucial to the operation. At the same time, the cost of running a conventional cluster capable of meeting this demand is extremely high.

Disclaimer

To my knowledge, FAWN is not being used at Twitter. But it would probably make a lot of sense if it were. Thank you.

The problem, defined

To engineer a fast, scalable key-value store for small items (hundreds to thousands of bytes).

This store is expected to:

respond to thousands of random queries per second (QPS) or more

conserve as much power as possible

meet service level agreements regarding latency

scale up well as the system grows

scale down well as demand fluctuates during operating hours

Possible solutions (1)

A cluster of traditional servers with HDD as storage.

Problems:

very poor performance for random accesses, unless RAID or a similar disk array is used

if RAID is used, both the initial price and the total cost of ownership skyrocket

most of the power consumption is fixed, so not much power is conserved during low-load periods

Possible solutions (2)

A cluster of traditional servers with RAM as storage (think memcached).

Problems:

very high cost in terms of $/GB

robustness is lost unless additional systems are employed

power consumption is just as bad as before

Possible solutions (3)

A cluster of traditional servers with SSD as storage.

Problems:

while random reads are great, random writes are terrible (BerkeleyDB running on an SSD averages just 0.07 MBps)

power consumption is just as bad as before

Possible solutions (4)

A combination of the above.

Problems:

a combination of the above :)

Introducing FAWN

A slightly different approach:

Let’s use energy-efficient, wimpy processors coupled with fast SSD storage.

Design a custom key-value store exploiting the characteristics of flash storage.

That way power consumption can be kept to a minimum while retaining high performance and robustness.

The resulting system has a lower total cost of ownership and good scalability.

2 Design and Architecture

Anatomy of a key-value data store

A request can be either a get, put or delete

Keys are 160-bit integers

Values are small blobs (typically between 256B and 1KB)

Each request pertains to a single key-value pair; there is no relational overlay at this level
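
The interface described above can be sketched in a few lines. This is only an illustration: the slides do not say how 160-bit keys are produced, so hashing an application-level name with SHA-1 (whose digest happens to be 160 bits) is an assumption, and `make_key` and `TinyStore` are hypothetical names.

```python
import hashlib
from typing import Dict, Optional

KEY_BITS = 160  # key width stated on the slide


def make_key(name: str) -> int:
    """Map an application-level name to a 160-bit integer key.

    Assumption: SHA-1 is used only because its digest is exactly 160 bits.
    """
    digest = hashlib.sha1(name.encode("utf-8")).digest()  # 20 bytes
    return int.from_bytes(digest, "big")


class TinyStore:
    """In-memory stand-in that exposes only get, put and delete."""

    def __init__(self) -> None:
        self._items: Dict[int, bytes] = {}

    def put(self, key: int, value: bytes) -> None:
        self._items[key] = value

    def get(self, key: int) -> Optional[bytes]:
        return self._items.get(key)

    def delete(self, key: int) -> None:
        # Deleting a missing key is treated as a no-op in this sketch.
        self._items.pop(key, None)
```

Every call names exactly one key, which is what lets the key space be partitioned across machines without any cross-key coordination.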

Overview

Overview

The cluster is composed of Front-ends and Back-ends

Front-ends forward requests to the appropriate back-ends and return responses to clients

The front-ends are responsible for maintaining order in the cluster

Back-ends run the FAWN-DS datastores (one per key-range)

Together the machines form a single FAWN-KV key-value store

Front-end

Responsibilities:

passing requests and responses

keeping track of back-ends’ Virtual IDs and their mapping to key ranges

managing joins and leaves.

Example configuration used for evaluation:

Intel Atom CPU (27 W)

Back-end

A back-end runs one FAWN-DS data store per key range. Each data store supports the basic key-value requests, as well as maintenance operations (Split, Merge, Compact).

Example configuration used for evaluation:

AMD Geode LX CPU (500MHz)

256MB DDR SDRAM (400MHz)

100Mbps Ethernet

Sandisk Extreme IV CompactFlash (4GB)

Back-ends, cont.

Back-ends are organized in a logical ring which coincides with the key space (mod 2^160)

Each back-end is assigned a fixed number of Virtual IDs in hopes of maintaining balance

Virtual IDs are the lowest keys a node handles

This allows for a well-defined successor relation on keys and virtual nodes

More on this later.
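
A minimal sketch of such a ring, under stated assumptions: virtual IDs are derived by hashing the node name with SHA-1 (the slides do not say how they are chosen), and a key is resolved with a Chord-style successor rule, meaning the first virtual ID at or after the key, wrapping around. If virtual IDs instead mark the lower end of each range, the lookup becomes a predecessor search. `Ring` and `virtual_ids` are hypothetical names.

```python
import bisect
import hashlib
from typing import Dict, List

RING_SIZE = 2 ** 160  # the key space, taken modulo 2^160


def virtual_ids(node: str, count: int) -> List[int]:
    """Derive `count` virtual IDs for a node (assumption: SHA-1 of name#i)."""
    return [
        int.from_bytes(hashlib.sha1(f"{node}#{i}".encode()).digest(), "big")
        for i in range(count)
    ]


class Ring:
    def __init__(self, nodes: List[str], vids_per_node: int = 2) -> None:
        self._owners: Dict[int, str] = {}
        for node in nodes:
            for vid in virtual_ids(node, vids_per_node):
                self._owners[vid] = node
        self._vids = sorted(self._owners)  # sorted once, searched by bisection

    def successor(self, key: int) -> int:
        """First virtual ID at or after `key`, wrapping past the top of the ring."""
        i = bisect.bisect_left(self._vids, key % RING_SIZE)
        return self._vids[i % len(self._vids)]

    def owner(self, key: int) -> str:
        """Physical back-end responsible for `key` under the successor rule."""
        return self._owners[self.successor(key)]
```

Giving each physical node several virtual IDs spreads its key ranges around the ring, which is the balancing effect the slide alludes to.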

3 FAWN-DS

Peculiarities of flash storage

Flash media differ from traditional HDDs in a number of ways, some of which seriously impact persistent data store designs.

Random reads are nearly as fast as sequential reads

Random writes are very inefficient (because updating even a single page requires erasing and rewriting a whole block)

Sequential writes perform admirably

On modern devices, semi-random writes (random appends to a small number of files) are nearly as fast as sequential writes

These features can be exploited by using a log-structured data store.

FAWN-DS

To take advantage of the properties of flash storage, FAWN-DS is structured as follows:

The key-value mappings are stored in a Data Log on the flash medium. This store is append-only.

To provide fast random access, a hash index into the data log is kept in RAM. To reduce the memory footprint, only a fragment of each key is stored in the index, at the cost of a (configurable) chance that a lookup needs more than one flash access.

To reclaim unused storage space, a Compact operation is introduced. It is designed to be as efficient as possible on flash, using only bulk sequential writes.

To facilitate reconstruction of the in-memory index, checkpointing is used.

Lookup

Lookup cont.

Two smaller numbers are extracted from the key:

The index bits: the lowest i bits

The key fragment: the next lowest k bits

The index bits select the first bucket to examine in the in-memory hash index.

If the bucket pointed to by the index bits is valid and the key fragments match, the data log entry is retrieved and the full keys are compared.

If the keys match, the record is returned; otherwise the next bucket in the hash chain is examined in the same way.

If nothing is found, an appropriate response is generated.
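
The bit extraction in the first step can be sketched as follows. The function name `split_key` and the small widths in the test values are illustrative only; the slides leave i and k as parameters.

```python
from typing import Tuple


def split_key(key: int, index_bits: int, fragment_bits: int) -> Tuple[int, int]:
    """Return (index, fragment): the lowest i bits and the next lowest k bits."""
    index = key & ((1 << index_bits) - 1)
    fragment = (key >> index_bits) & ((1 << fragment_bits) - 1)
    return index, fragment
```

For the 8-bit toy key 0b10110110 with i = 4 and k = 3, the index is 0b0110 and the fragment is 0b011. Only these i + k bits need to be kept in RAM per key; the full key lives in the data log and is compared after the flash read.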

Lookup, now in pseudocode!

Store and Delete

When a value is inserted into the store, it is simply appended to the data log and the corresponding bucket is changed to point to the new record. The valid bit is set to true.

When a record is to be deleted, a delete entry is appended to the log (for fault tolerance) and the valid bit in the corresponding bucket is set to false.

Actual storage space is not reclaimed until a Compact is performed.
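
Lookup, store and delete can be put together in a toy model. This is a sketch under assumptions, not the FAWN-DS implementation: a Python list stands in for the on-flash data log, collisions are resolved by linear probing (a swapped-in technique for the hash chain mentioned above), and the bit widths are toy values. `ToyFawnDS` is a hypothetical name.

```python
from typing import List, Optional, Tuple

INDEX_BITS = 4                 # toy value: 16 buckets
FRAGMENT_BITS = 8              # toy value
NUM_BUCKETS = 1 << INDEX_BITS


class ToyFawnDS:
    def __init__(self) -> None:
        # Append-only log of (key, value); value None marks a delete entry.
        self.log: List[Tuple[int, Optional[bytes]]] = []
        # Each slot is None (never used) or [fragment, valid, log offset].
        self.buckets: List[Optional[list]] = [None] * NUM_BUCKETS

    @staticmethod
    def _split(key: int) -> Tuple[int, int]:
        index = key & (NUM_BUCKETS - 1)
        fragment = (key >> INDEX_BITS) & ((1 << FRAGMENT_BITS) - 1)
        return index, fragment

    def _find(self, key: int) -> Tuple[Optional[int], Optional[int]]:
        """Return (slot holding key, first reusable slot); either may be None."""
        index, fragment = self._split(key)
        free = None
        for step in range(NUM_BUCKETS):
            slot = (index + step) % NUM_BUCKETS   # linear probing (assumption)
            bucket = self.buckets[slot]
            if bucket is None:                    # never used: key is absent
                return None, (slot if free is None else free)
            frag, valid, offset = bucket
            if not valid:
                if free is None:
                    free = slot                   # invalidated slot can be reused
            elif frag == fragment and self.log[offset][0] == key:
                return slot, free                 # full key confirmed in the log
        return None, free

    def get(self, key: int) -> Optional[bytes]:
        slot, _ = self._find(key)
        if slot is None:
            return None
        return self.log[self.buckets[slot][2]][1]

    def put(self, key: int, value: bytes) -> None:
        slot, free = self._find(key)
        if slot is None:
            slot = free
        if slot is None:
            raise RuntimeError("index full")
        self.log.append((key, value))             # append only, never overwrite
        self.buckets[slot] = [self._split(key)[1], True, len(self.log) - 1]

    def delete(self, key: int) -> None:
        slot, _ = self._find(key)
        self.log.append((key, None))              # delete entry kept for recovery
        if slot is not None:
            self.buckets[slot][1] = False         # clear the valid bit
```

Note that an overwrite leaves the old record in the log as an orphan; nothing is reclaimed until a Compact runs.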

Maintenance operations

Split is issued when the key range is divided as a new virtual node joins the ring. It scans the data log sequentially and writes out the appropriate entries into a new one.

Merge is responsible for merging two data stores into one, encompassing the combined key range. It achieves this by copying entries from one log into the other.

Compact copies the valid data store entries into a new log, skipping those that have been orphaned by puts and those that were actively deleted.

Owing to the append-only design, these operations can run concurrently with normal requests, locking only to switch data stores when maintenance is finalized.
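
Compact and Split can be illustrated on a log modelled as a list of (key, value) pairs, where a value of None stands for a delete entry. This is a simplified sketch: it decides validity with a first pass over the log rather than by consulting the in-memory index, and `split` assumes a range that does not wrap around the ring. Both function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[int, Optional[bytes]]  # (key, value); None marks a delete entry


def compact(log: List[Entry]) -> List[Entry]:
    """Copy only live entries into a new log, in their original order."""
    latest: Dict[int, int] = {}
    for offset, (key, _) in enumerate(log):
        latest[key] = offset                  # later entries supersede earlier ones
    new_log: List[Entry] = []
    for offset, (key, value) in enumerate(log):
        if latest[key] == offset and value is not None:
            new_log.append((key, value))      # sequential append to the new log
    return new_log


def split(log: List[Entry], boundary: int) -> Tuple[List[Entry], List[Entry]]:
    """One sequential scan: entries below `boundary` stay, the rest move."""
    stay: List[Entry] = []
    move: List[Entry] = []
    for entry in log:
        (stay if entry[0] < boundary else move).append(entry)
    return stay, move
```

Both operations only read the old log and append to a new one, which is why they can proceed while the old store keeps serving requests.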

4 FAWN-KV

To provide a robust, scalable service, the back-ends running FAWN-DS instances are joined together and managed by front-end nodes, which in industry applications would in turn be connected to a master node.

Fault tolerance is introduced via replication

Each front-end is ideally responsible for some 80 back-ends and manages joins and leaves, exposing a simple put, get, delete interface

Additionally, front-ends can route requests among themselves and cache responses, which makes the master node an optimization and a convenience rather than a single point of failure

Life-cycle of a request

Life-cycle of a request, elaborated

Each front-end is assigned a contiguous portion of the key space

Upon receiving a request, it either processes it using its managed back-ends or forwards it if the key belongs to a different front-end

Front-ends maintain a list of virtual nodes and their corresponding addresses, and thus can instantly translate the request to the appropriate FAWN-DS calls

While the request is processed by the back-ends, the front-end ensures replication is maintained

Replication in Chains

Replication in Chains, cont.

Each key defines a chain in the virtual node ring

A fixed number of nodes maintains copies of the mapping

The nodes are obtained by iterating the successor function of the key

The first node that contains a replica is the head of the chain

The last node is the tail

Every put request is issued to the head of the chain and waits for an acknowledgement from the tail. Every get is passed to the tail. This ensures consistency and proper ordering of changes throughout the chain.
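
Choosing the members of a chain can be sketched as repeated successor lookups on the sorted list of virtual IDs. The function name `chain_for` is hypothetical, the replica count is a parameter, and the sketch assumes there are at least as many virtual IDs as replicas. It also ignores the possibility that consecutive virtual IDs belong to the same physical machine.

```python
import bisect
from typing import List


def chain_for(key: int, vids: List[int], replicas: int) -> List[int]:
    """Return the virtual IDs holding `key`, head first and tail last."""
    ring = sorted(vids)
    start = bisect.bisect_left(ring, key)      # successor of the key
    return [ring[(start + i) % len(ring)] for i in range(replicas)]
```

With virtual IDs 10, 20, 30 and 40 and three replicas, key 25 maps to the chain 30, 40, 10: node 30 is the head that receives puts, and node 10 is the tail that answers gets.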

Replication of a put

After receiving the put request, the head forwards the put along the chain and waits for an acknowledgement.

If all goes well, the tail acknowledges both to the front-end and to its predecessor, and the acknowledgement propagates back up the chain.
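
The flow of a put through a chain can be modelled in-process. In this sketch ordinary method calls stand in for network messages, so the acknowledgement is simply the return value travelling back through the callers; failure handling is left out entirely. `Replica`, `build_chain` and `FrontEnd` are hypothetical names.

```python
from typing import Dict, List, Optional


class Replica:
    """One node in a replication chain; `next` is None for the tail."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: Dict[int, bytes] = {}
        self.next: Optional["Replica"] = None

    def put(self, key: int, value: bytes) -> str:
        """Store the update, forward it, and return the acknowledging tail's name."""
        self.data[key] = value
        if self.next is None:
            return self.name               # the tail acknowledges
        return self.next.put(key, value)   # the ack travels back through the callers


def build_chain(names: List[str]) -> List[Replica]:
    chain = [Replica(n) for n in names]
    for node, successor in zip(chain, chain[1:]):
        node.next = successor
    return chain


class FrontEnd:
    def __init__(self, chain: List[Replica]) -> None:
        self.head, self.tail = chain[0], chain[-1]

    def put(self, key: int, value: bytes) -> bool:
        # Puts enter at the head; success means the tail acknowledged.
        return self.head.put(key, value) == self.tail.name

    def get(self, key: int) -> Optional[bytes]:
        # Gets go to the tail, which only holds fully replicated updates.
        return self.tail.data.get(key)
```

Because a value reaches the tail only after every earlier node has stored it, a get served by the tail never returns an update that some replica is still missing.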

How a join is handled

When a (virtual) node joins the FAWN-KV ring, precisely one key range is split in two. To maintain replication, the following happens:

The current tail transmits its whole log to the new node (pre-copy)

The front-end informs the nodes in the chain of the join via a chain membership message

In response to said message, nodes flush updates received during pre-copy down the chain

Please refer to the paper for details on how updates arriving during the flush are handled, as well as the special cases of joining as head or tail.

What happens when a node leaves

When a node leaves the ring, each node that is supposed to take over the replicas in essence joins the replica chain at a different position in the key space, so the protocol is essentially the same as for a join.

At this stage, failure detection is achieved by a heartbeat. If a node misses a set number of heartbeat signals, the front-end initiates a leave and appropriate action is taken.
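
The heartbeat rule can be sketched as a per-node counter of consecutive missed beats. The class name `HeartbeatMonitor`, the `tick` method and the threshold value are assumptions for illustration; the slides only say that a set number of missed heartbeats triggers a leave.

```python
from typing import Dict, Iterable, List, Set


class HeartbeatMonitor:
    def __init__(self, nodes: Iterable[str], max_missed: int = 3) -> None:
        self.missed: Dict[str, int] = {n: 0 for n in nodes}
        self.max_missed = max_missed

    def tick(self, heard_from: Set[str]) -> List[str]:
        """Process one heartbeat interval and return the nodes to remove."""
        failed: List[str] = []
        for node in list(self.missed):
            if node in heard_from:
                self.missed[node] = 0          # any heartbeat resets the count
            else:
                self.missed[node] += 1
                if self.missed[node] >= self.max_missed:
                    failed.append(node)        # the front-end would initiate a leave
                    del self.missed[node]
        return failed
```

Counting consecutive misses rather than reacting to a single one trades detection speed for fewer spurious leaves, each of which would trigger log copying between back-ends.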

5 Evaluation

Procedure description

FAWN’s performance was evaluated under a number of criteria:

Single node efficiency (compared to baseline hardware capabilities)

Cluster performance (tested on a 21 back-end/1 front-end system)

Energy efficiency

The results were then compared with a number of more traditional configurations.

Single node performance

Baseline:

Seq. read    Rand. read   Seq. write   Rand. write
28.5 MBps    1424 QPS     24 MBps      110 QPS

FAWN:

Data size    Rand. read (1KB)   Rand. read (256B)
125MB        51968 QPS          65412 QPS
1GB          1595 QPS           1964 QPS
3.5GB        1150 QPS           1298 QPS

Gets vs Puts

Cluster — performance and power consumption

Important points on power consumption

The plot displayed does not take into account the front-end (a further 27 W)

The networking hardware used takes 20 W to operate (included in the plotted figure)

Even factoring in the front-end, the system achieved 330 queries per Joule. A desktop computer can provide about 50 Q/J using SSD.
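
The queries-per-Joule metric follows directly from throughput and power: since 1 W is 1 J/s, dividing queries per second by watts gives queries per Joule. The short sketch below works through the two figures quoted above; the function names are hypothetical and no numbers beyond the 330 and 50 Q/J from the slide are used.

```python
def queries_per_joule(qps: float, watts: float) -> float:
    """(queries/s) / (J/s) = queries/J."""
    return qps / watts


def joules_for(queries: float, q_per_j: float) -> float:
    """Energy needed to serve a given number of queries."""
    return queries / q_per_j


FAWN_QPJ = 330.0      # from the slide
DESKTOP_QPJ = 50.0    # from the slide

advantage = FAWN_QPJ / DESKTOP_QPJ                     # about 6.6x
fawn_energy = joules_for(1_000_000, FAWN_QPJ)          # about 3030 J per million queries
desktop_energy = joules_for(1_000_000, DESKTOP_QPJ)    # 20000 J per million queries
```

In other words, at these rates the FAWN cluster serves a million queries for roughly 3 kJ, where the SSD-equipped desktop needs 20 kJ.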

CDF of Query Latency

Comparison with alternative approaches (projected)

Important point

The FAWN entries in this table are projected performance figures for systems built using state-of-the-art components.

Solution space for system builders (projected)

Conclusions

FAWN is demonstrated to be a viable approach to providing cost-efficient data stores

Using wimpy processors in an array can reduce power consumption while retaining performance

Barring breakthrough discoveries, FAWN-like technologies are expected to deliver the lowest TCO for a large portion of the problem space

Larger scale testing is necessary to establish the correctness of these claims and to demonstrate scalability

References

[FAWN] D. G. Andersen, J. Franklin, M. Kaminsky, A. Phanishayee, L. Tan, and V. Vasudevan. FAWN: A Fast Array of Wimpy Nodes. Proceedings of ACM SOSP 2009, Big Sky, MT, USA, October 2009.

All images are taken from the FAWN paper.
