Page 1: Amazon ElastiCache backed by DynamoDB - …dig.csail.mit.edu/2013/Talks/dig-seminar-0926-daniela.pdf · Amazon ElastiCache backed by DynamoDB Summer Internship Presentation Daniela

Amazon ElastiCache backed by DynamoDB

Summer Internship Presentation

Daniela Miao

September 26th, 2013


Agenda

• Product Overview

• Intern Project Scope

• Achievements – Working Prototype

• Major Challenges

• Design

• Implementation

• Preliminary Performance Results

• Future Work

• Open Questions

• Q&A


ElastiCache Product Overview

• In-memory cache in the cloud, backed by the popular Memcached engine (used by Facebook, LiveJournal, etc.)

• Improves the performance of web applications by allowing retrieval of information from fast cache nodes and clusters

• Existing customers: Airbnb, PBS, Tapjoy, etc.


ElastiCache Product Overview

• Existing Problems
  • In-memory cache lacks data persistence and durability
  • Loses all data in case of power outages, node failures or inadvertent machine reboots

• Customers are interested in getting the best of both worlds: scalable performance of in-memory cache and data reliability across node reboots.


Intern Project Scope

• Explore feasibility of the product and major challenges

• Focus on “set” requests to memcached (write requests to DynamoDB)

• Prototype focused on solving the major issue of maintaining consistency across memcached engine and DynamoDB, without extensively considering error cases

• Generate initial performance results to gain a basic understanding of the impacts

• Document design progress on wiki, including a quick overview of the basic memcached architecture (included in Appendix)


Achievements – Working Prototype

• Connection between a local instance of ElastiCache (memcached engine) and a DynamoDB instance created via the AWS Console

• AWS Console serves all AWS products/services

• Supports manual requests (demo completed at Amazon)

• Supports automated requests (stress test including hundreds of concurrent requests)

• Difficult!


Challenges – Design & Implementation

• Memcached is an open-source, scalable caching engine written in C; unfortunately, it is not documented very well

• Libevent enables powerful and efficient connection management

• Major Issues:

1. How to integrate a DynamoDB backend without compromising existing memcached performance (using libevent)

• Current solution: A second thread pool for database operations

2. No existing C client for DynamoDB

• Current solution: Custom C wrapper around the C++ client (in development)

3. Maintaining consistency across the memcached engine and the DynamoDB table (behavior with concurrent sets on the same key)

• Current solution: Additional counter hash table to keep track of item updates
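
The first issue and its current solution can be illustrated with a minimal Python sketch. This is not the prototype's C code: the class `CacheWithDbPool`, the function `slow_backend_write` and the 50 ms delay are hypothetical stand-ins chosen for illustration. The point is only that the request path updates the in-memory cache and returns, while a separate pool of threads absorbs the slow database write.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_backend_write(store, key, value, delay=0.05):
    """Hypothetical stand-in for a DynamoDB write; the sleep imitates network latency."""
    time.sleep(delay)
    store[key] = value
    return key


class CacheWithDbPool:
    """Cache whose set() never blocks on the backend (illustrative only)."""

    def __init__(self, db_threads=4):
        self.cache = {}      # stands in for the memcached item table
        self.backend = {}    # stands in for the DynamoDB table
        # The "second thread pool": only these threads wait on the backend.
        self.db_pool = ThreadPoolExecutor(max_workers=db_threads)

    def set(self, key, value):
        self.cache[key] = value  # fast path, served from memory
        # Hand the slow write to the DB pool and return immediately.
        return self.db_pool.submit(slow_backend_write, self.backend, key, value)


if __name__ == "__main__":
    c = CacheWithDbPool(db_threads=2)
    futures = [c.set(f"key{i}", i) for i in range(4)]
    print("cache is already populated:", c.cache)
    for f in futures:
        f.result()  # wait for the background writes to finish
    print("backend caught up:", c.backend == c.cache)
    c.db_pool.shutdown()
```

Because several DB threads run at once, writes for the same key may reach the backend in a different order than they reached the cache, which is the consistency issue listed as item 3.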


Consistency Problem

Memcached (set order in memcached; the three sets are concurrent)

order  command  key  value
1      set      foo  bar
2      set      foo  newBar
3      set      foo  newestBar

DynamoDB (sets are dispatched to multiple DB threads and could arrive at DynamoDB in any order)

order  command  key  value
2      set      foo  newBar
3      set      foo  newestBar
1      set      foo  bar

Memcached has “newestBar”

DynamoDB has “bar”
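
The divergence above can be reproduced with a few lines of Python. This is only a sketch: `apply_in_order` is a hypothetical helper that models a store where the last write to arrive wins, which is the assumption behind the slide.

```python
def apply_in_order(writes, arrival_order):
    """Apply writes in the given arrival order; the last write to arrive wins."""
    table = {}
    for index in arrival_order:
        key, value = writes[index]
        table[key] = value
    return table


# The three concurrent sets, numbered by their order in memcached.
writes = {1: ("foo", "bar"), 2: ("foo", "newBar"), 3: ("foo", "newestBar")}

memcached = apply_in_order(writes, [1, 2, 3])  # order seen by memcached
dynamodb = apply_in_order(writes, [2, 3, 1])   # order seen by DynamoDB

print(memcached["foo"], dynamodb["foo"])  # newestBar bar
```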


Consistency Solution (Naïve)

• Proposal: keep a counter value per key in a global table

• Flow for each set request:

1. Lock the item based on key

2. Store to memcached

3. Increment the counter based on key

4. Unlock the item

5. Dispatch write request to DynamoDB
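
A minimal Python sketch of the per-key counter proposal follows. It assumes that the item lock is held while the value is stored and the counter is incremented, so that counter order matches store order; the slide's flowchart does not survive extraction well enough to confirm this. The names `CountingCache` and `dispatch` are hypothetical, and the prototype itself is written in C.

```python
import threading
from collections import defaultdict


class CountingCache:
    """Cache that tags every set with a per-key update counter (illustrative only)."""

    def __init__(self):
        self.items = {}
        self.counters = defaultdict(int)            # global table: key -> counter
        self._locks = defaultdict(threading.Lock)   # one lock per key
        self._locks_guard = threading.Lock()        # protects the lock table itself
        self.dispatched = []                        # stands in for the DB task queue

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks[key]

    def set(self, key, value):
        with self._lock_for(key):          # lock the item based on key
            self.items[key] = value        # store to the cache
            self.counters[key] += 1        # increment the counter based on key
            version = self.counters[key]
        # The lock is released before the (slow) backend write is dispatched.
        self.dispatch(key, value, version)
        return version

    def dispatch(self, key, value, version):
        """Hypothetical hook: queue the write, with its counter, for a DB thread."""
        self.dispatched.append((key, value, version))


if __name__ == "__main__":
    cache = CountingCache()
    for value in ("bar", "newBar", "newestBar"):
        print(value, "-> counter", cache.set("foo", value))
```

Each dispatched write carries the counter value it was assigned under the lock, so the backend can later tell which of two writes to the same key is newer.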


Consistency Problem Revisited

Memcached (set order in memcached; the three sets are concurrent)

order  command  key  value      counter
1      set      foo  bar        1
2      set      foo  newBar     2
3      set      foo  newestBar  3

DynamoDB (requests in order of arrival)

order  command  key  value      counter  outcome
2      set      foo  newBar     2        written
3      set      foo  newestBar  3        written
1      set      foo  bar        1        aborted

• Set #2 arrives at DynamoDB first. It performs a write conditional on key “foo” not existing, and the write succeeds.

• Set #3 arrives next. It performs a get first and checks that its own counter value (3) is greater than the stored value (2). It then performs a write conditional on the stored counter value still being 2.

• Set #1 arrives last. It performs a get first, finds its counter value (1) to be less than the stored counter value, and aborts its write to DynamoDB.
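
The get-then-conditional-write sequence can be sketched in Python as below. `VersionedTable` is a hypothetical in-memory stand-in for a DynamoDB table, not the AWS SDK; its `put_if_version` method imitates a conditional write that succeeds only if the stored counter is still the one that was read. The retry loop for a lost race is an assumption; the slides do not say how the prototype handles that case.

```python
import threading


class VersionedTable:
    """In-memory stand-in for a DynamoDB table (hypothetical, not the AWS SDK)."""

    def __init__(self):
        self._rows = {}                 # key -> (value, counter)
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._rows.get(key)

    def put_if_version(self, key, value, counter, expected_counter):
        """Write only if the stored counter equals expected_counter (None: key absent)."""
        with self._lock:
            row = self._rows.get(key)
            current = row[1] if row else None
            if current != expected_counter:
                return False
            self._rows[key] = (value, counter)
            return True


def write_with_counter(table, key, value, counter):
    """Return True if the write was applied, False if it was stale and aborted."""
    while True:
        row = table.get(key)                       # perform a get first
        current = row[1] if row else None
        if current is not None and counter <= current:
            return False                           # a newer set already landed
        if table.put_if_version(key, value, counter, current):
            return True
        # Another writer changed the row between get and put; re-check.


if __name__ == "__main__":
    table = VersionedTable()
    sets = {1: "bar", 2: "newBar", 3: "newestBar"}
    for counter in (2, 3, 1):                      # arrival order from the slides
        print(counter, write_with_counter(table, "foo", sets[counter], counter))
    print(table.get("foo"))
```

Whatever the arrival order, the table ends up holding the value with the highest counter, which matches what memcached holds.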


Preliminary Performance Results

• Local testing using memaslap (memcached engine running on Developer Desktop, DynamoDB in Oregon)

• Different workloads representing different set/get ratios

• In write-behind, all cases hit the DynamoDB write request limit (causing many failed sets), except for the low set/get ratio case

[Figure: two bar charts, "Low Set/Get Ratio Workload" and "High Set/Get Ratio Workload", each comparing Average Latency (us) and Throughput (OPS/s) for Baseline, Write-Behind and Write-Through]


Future Work

Some critical items for turning the prototype into a production system:

• Extending the consistency solution (across memcached and the DynamoDB table) to the write-through scenario as well

• Optimizing the DynamoDB operations in memcached to reduce latency and increase throughput

• Designing a highly concurrent C client for DynamoDB

• Setting up proper credential management for the memcached engine to access DynamoDB tables


Open Questions

• Intern project raises product definition questions:
  • Using DynamoDB as primary data storage, versus just as a backup
    • Cache misses could trigger “get” requests to DynamoDB
    • At startup, an ElastiCache node could be warmed up from existing DynamoDB table(s)
  • Backend configuration (write-behind versus write-through)
    • Write-Behind: asynchronous data reads and writes to DynamoDB
    • Write-Through: synchronous reads and writes, more persistence!
      • Key/value not stored in memcached until write to DynamoDB succeeds
  • Default behavior when an asynchronous DynamoDB request fails (remove from memcached as well?)

• Is DynamoDB the correct backend support for ElastiCache? Currently DynamoDB has a 64KB data limit, while memcached’s limit is 1MB.
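
The two backend configurations can be contrasted with a small Python sketch. Everything here is hypothetical: `BackedCache`, its `backend_put` callback and the thread pool size are illustrative names and values, not the prototype's design. Evicting the key when an asynchronous write fails is just one possible answer to the open question above, and it ignores the case where a newer set for the same key has already succeeded.

```python
from concurrent.futures import ThreadPoolExecutor


class BackedCache:
    """Cache with a pluggable backend write policy (illustrative only)."""

    def __init__(self, backend_put, mode="write-behind"):
        if mode not in ("write-behind", "write-through"):
            raise ValueError(f"unknown mode: {mode}")
        self.cache = {}
        self.mode = mode
        self._put = backend_put            # callable(key, value); raises on failure
        self._pool = ThreadPoolExecutor(max_workers=2)

    def set(self, key, value):
        if self.mode == "write-through":
            # Synchronous: the caller waits, and nothing is cached if the write fails.
            self._put(key, value)
            self.cache[key] = value
            return None
        # Write-behind: cache first, persist in the background.
        self.cache[key] = value
        return self._pool.submit(self._write_behind, key, value)

    def _write_behind(self, key, value):
        try:
            self._put(key, value)
            return True
        except Exception:
            # Assumed policy: drop the cached value if it could not be persisted.
            self.cache.pop(key, None)
            return False

    def close(self):
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    store = {}
    cache = BackedCache(store.__setitem__, mode="write-behind")
    future = cache.set("foo", "bar")
    print("cached immediately:", cache.cache)
    print("persisted:", future.result(), store)
    cache.close()
```

Write-through pays the backend latency on every set, while write-behind hides it but has to decide what a failed background write means for the cached copy.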


Appendix


Preliminary Performance Numbers

Low Set/Get Ratio Workload

               Average Latency (us)  Standard Deviation (us)  Throughput (OPS/s)
Baseline       929                   2081.81                  17191
Write-Behind   1081                  11612.8                  14782
Write-Through  4239                  18704.88                 3773

High Set/Get Ratio Workload

               Average Latency (us)  Standard Deviation (us)  Throughput (OPS/s)
Baseline       1206                  1156.5                   13244
Write-Behind   1371*                 16804.64*                11654*
Write-Through  37424                 40095.22                 428

*Error rate is very high (~88%), either because DynamoDB task queues are too full or because memcached is out of memory slabs.
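
To make the tables easier to compare, the short Python sketch below expresses each configuration relative to its baseline. The numbers are copied from the tables above; the helper name `relative_to_baseline` is hypothetical. The high set/get write-behind row should be read with its ~88% error rate in mind.

```python
# (average latency in us, throughput in OPS/s), copied from the tables above
results = {
    "low": {
        "Baseline": (929, 17191),
        "Write-Behind": (1081, 14782),
        "Write-Through": (4239, 3773),
    },
    "high": {
        "Baseline": (1206, 13244),
        "Write-Behind": (1371, 11654),   # measured with a ~88% error rate
        "Write-Through": (37424, 428),
    },
}


def relative_to_baseline(workload):
    """Return {config: (latency ratio, throughput ratio)} against the baseline row."""
    base_latency, base_throughput = results[workload]["Baseline"]
    return {
        name: (latency / base_latency, throughput / base_throughput)
        for name, (latency, throughput) in results[workload].items()
    }


if __name__ == "__main__":
    for workload in results:
        for name, (lat, tput) in relative_to_baseline(workload).items():
            print(f"{workload:>4} {name:<13} latency x{lat:.2f}  throughput x{tput:.3f}")
```

On these figures, write-through costs roughly 4.6 times the baseline latency in the low set/get workload and roughly 31 times in the high set/get workload, while write-behind stays within about 16% of baseline latency in both.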


Original Memcached Design

Components:

• Dispatcher Thread Class: Listens on server socket and dispatches connections to Worker Thread via Libevent

• Worker Thread Class: Handles incoming client connections and uses Libevent to communicate with Connection Class

• Connection Class: Contains information on all the connections and handles client requests

• Item Class: Handles high-level item management in memcached

• Slab Class: Handles low-level memory management with slab allocation

• Libevent Class: Listens on given file descriptor for a set of events, and executes callback on activation

Interactions (arrow labels in the diagram):

• Adds incoming connection to queue

• Notifies Worker Thread of new connection queue item

• Notifies Connection Class when a client request comes in


Extended Memcached Design

Components:

• Dispatcher Thread Class: Listens on server socket and dispatches connections to Worker Thread via Libevent

• Worker Thread Class: Handles incoming client connections and uses Libevent to communicate with Connection Class. Works with Connection Class to dispatch tasks to DB Worker Thread, then uses Libevent to notify.

• Connection Class: Contains information on all the connections and handles client requests

• Item Class: Handles high-level item management in memcached

• Slab Class: Handles low-level memory management with slab allocation

• Libevent Class: Listens on given file descriptor for a set of events, and executes callback on activation

• DB Worker Thread Class: Handles read/write to DB and uses Libevent to notify completion of tasks

• DB: The database that the DB Worker Thread reads from and writes to

Interactions (arrow labels in the diagram):

• Adds incoming connection to queue

• Notifies Worker Thread of new connection queue item

• Notifies DB Worker Thread to read/write from DB; DB Worker responds after completion

• Depending on write-behind vs. write-through configuration, notifies Connection Class when request can continue
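
The hand-off between a worker thread and a DB worker thread can be sketched in Python as follows. This is an analogy, not the prototype: Python's `selectors` module plays the role of Libevent, a socket pair stands in for whatever descriptor the prototype watches for completion, and `run_demo` and the single DB thread are assumptions made for illustration.

```python
import queue
import selectors
import socket
import threading


def run_demo(tasks):
    """Send (key, value) tasks to a DB thread; wake up via a socket when each finishes."""
    task_q, done_q = queue.Queue(), queue.Queue()
    notify_recv, notify_send = socket.socketpair()
    backend = {}                             # stands in for the DynamoDB table

    def db_worker():
        while True:
            task = task_q.get()
            if task is None:                 # shutdown signal
                break
            key, value = task
            backend[key] = value             # stand-in for the real DB write
            done_q.put(key)                  # publish the result first...
            notify_send.send(b"x")           # ...then wake the worker, one byte per task

    db_thread = threading.Thread(target=db_worker)
    db_thread.start()

    # The worker registers the notification socket with its event loop.
    sel = selectors.DefaultSelector()
    sel.register(notify_recv, selectors.EVENT_READ)
    for task in tasks:
        task_q.put(task)

    completed = []
    while len(completed) < len(tasks):
        for _key, _events in sel.select(timeout=5):
            # Each byte read corresponds to one finished task in done_q.
            for _ in notify_recv.recv(64):
                completed.append(done_q.get_nowait())

    task_q.put(None)
    db_thread.join()
    sel.unregister(notify_recv)
    sel.close()
    notify_recv.close()
    notify_send.close()
    return completed, backend


if __name__ == "__main__":
    print(run_demo([("foo", "bar"), ("baz", "qux")]))
```

In a write-behind configuration the worker would answer the client before the notification arrives; in a write-through configuration it would hold the response until the matching completion is read.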

