AWS re:Invent 2016: How Citus Enables Scalable PostgreSQL on AWS (DAT207)

Post on 24-Jan-2018


HOW CITUS ENABLES SCALABLE POSTGRES ON AWS

WILL LEINWEBER @LEINWEBER

CITUSDATA.COM

ORCHESTRATION

CITUS

OPEN SOURCE POSTGRES EXTENSION

github.com/citusdata/citus

citusdata.com

DISTRIBUTED POSTGRES

WORKER NODES AND SHARDS

Worker 0

Shard 0

Shard 1

Shard 2

Worker 1

Shard 3

Shard 4

Shard 5

POSTGRES AND TABLES

PG Server 0

Table 0

Table 1

Table 2

PG Server 1

Table 3

Table 4

Table 5

citus (coordinator node)=# \d
         List of relations
 Schema |    Name    | Type  | Owner
--------+------------+-------+-------
 public | cw_metrics | table | citus
 public | events     | table | citus

citus (worker node)=# \d
             List of relations
 Schema |       Name        | Type  | Owner
--------+-------------------+-------+-------
 public | cw_metrics_102008 | table | citus
 public | cw_metrics_102012 | table | citus
 public | cw_metrics_102016 | table | citus
 public | cw_metrics_102064 | table | citus
 public | cw_metrics_102068 | table | citus
 public | events_102104     | table | citus
 public | events_102108     | table | citus
 public | events_102112     | table | citus
 public | events_102116     | table | citus

WRITES

citus=> select * from pg_dist_shard limit 10;

 logicalrelid | shardid | shardstorage | shardalias | shardminvalue | shardmaxvalue
--------------+---------+--------------+------------+---------------+---------------
        19395 |  102040 | t            | (null)     | -2147483648   | -2013265921
        19395 |  102041 | t            | (null)     | -2013265920   | -1879048193
        19395 |  102042 | t            | (null)     | -1879048192   | -1744830465
        19395 |  102043 | t            | (null)     | -1744830464   | -1610612737
        19395 |  102044 | t            | (null)     | -1610612736   | -1476395009
        19395 |  102045 | t            | (null)     | -1476395008   | -1342177281
        19395 |  102046 | t            | (null)     | -1342177280   | -1207959553
        19395 |  102047 | t            | (null)     | -1207959552   | -1073741825
        19395 |  102048 | t            | (null)     | -1073741824   | -939524097
        19395 |  102049 | t            | (null)     | -939524096    | -805306369
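The ranges in the pg_dist_shard listing are contiguous and each is 134,217,728 values wide, which is what you get when the signed 32-bit hash space is cut into 32 equal slices. The following is a minimal sketch of that arithmetic only. The shard count of 32 is inferred from the range width, and the helper names (shard_ranges, shard_index_for) are hypothetical, not part of Citus.

```ruby
# Sketch: split the signed 32-bit hash space into equal, contiguous ranges.
# SHARD_COUNT = 32 is an assumption inferred from the range width shown
# in the pg_dist_shard listing; these helpers are illustrative only.
INT32_MIN   = -(2**31)
HASH_SPACE  = 2**32
SHARD_COUNT = 32

# Returns one Range per shard, ordered by shardminvalue.
def shard_ranges(shard_count = SHARD_COUNT)
  width = HASH_SPACE / shard_count
  (0...shard_count).map do |i|
    min = INT32_MIN + i * width
    (min..(min + width - 1))
  end
end

# Finds the position of the range that covers an already-hashed value.
def shard_index_for(hash_value, ranges = shard_ranges)
  ranges.index { |range| range.cover?(hash_value) }
end

shard_ranges.first(3).each do |range|
  puts "#{range.begin} | #{range.end}"
end
```

The first three lines printed reproduce the shardminvalue and shardmaxvalue columns of the first three rows in the listing.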

MULTI-TENANT MODEL

System of record for B2B SaaS apps scaling from 1 to 100,000s of tenants

MULTI-TENANT MODEL

easy to migrate

easy to maintain (HA/DR/Backup/etc)

easy to scale vertically and horizontally

MULTI-TENANT MODEL

shard all tables on the same column, e.g. tenant_id
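Sharding every table on the same column means all rows for one tenant map to the same shard, so a per-tenant query touches one place. The following is a minimal sketch of that idea under loud assumptions: CRC32 is a stand-in hash (it is not the hash function Citus uses), the shard count is arbitrary, and the tables and rows are invented for illustration.

```ruby
require "zlib"

# Assumption: 32 shards, CRC32 as a stand-in hash. Neither is taken from Citus.
SHARD_COUNT = 32

def shard_for(tenant_id, shard_count = SHARD_COUNT)
  Zlib.crc32(tenant_id.to_s) % shard_count
end

# Hypothetical rows from two tables that share the tenant_id column.
ORGS = [
  { tenant_id: 3, name: "acme" },
  { tenant_id: 7, name: "globex" }
].freeze

EVENTS = [
  { tenant_id: 3, kind: "login" },
  { tenant_id: 7, kind: "signup" },
  { tenant_id: 3, kind: "logout" }
].freeze

# Groups rows by the shard their tenant_id maps to.
def place(rows)
  rows.group_by { |row| shard_for(row[:tenant_id]) }
end

orgs_by_shard   = place(ORGS)
events_by_shard = place(EVENTS)

# Everything for tenant 3 lives under one shard key in both tables.
shard = shard_for(3)
puts "tenant 3 orgs:   #{orgs_by_shard[shard].select { |r| r[:tenant_id] == 3 }.size}"
puts "tenant 3 events: #{events_by_shard[shard].select { |r| r[:tenant_id] == 3 }.size}"
```

Because both tables use the same function on the same column, a join or filter on tenant_id never needs rows from a second shard.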

MULTI-TENANT MODEL

where org_id=3

MULTI-TENANT MODEL

SQL on trillions of events for high-value, real-time analytics

PARALLEL MODEL

postgres

lower cost

extendable & flexible

CITUS MX

https://www.citusdata.com/blog/2016/09/22/announcing-citus-mx/

>500,000 writes/second (YCSB benchmark)

>7 million records/second bulk loading with COPY

32 node cluster

RUN YOURSELF OR HOSTED

HOSTED TRADEOFFS

HEROKU

SINATRA & SEQUEL

INTERACTIVE CONSOLE

[1] pry(main)> Resource.all

(MOSTLY) IMMUTABLE SERVERS

OUTSIDE-IN SSH & PSQL

(MICRO)SERVICES

ASIDE: CONWAY’S LAW

DOWNSIDES OF MICROSERVICES

CURRENTLY: TWO SERVICES

external console/dashboard

internal control plane

CONSOLE

Customer UI

Create / Destroy

Billing

Metrics

CONTROL PLANE

Server Provisioning

Monitoring

Healing

COMMONALITIES

CONTROL PLANE MODELS

FORMATION & SERVER

WAL-E

github.com/wal-e/wal-e

TIMELINES

REPLICAS

AFTER RECOVERY

SERVER REPLACEMENT

VIDEO GAME DEVELOPMENT

FEEL / TICK

STATE MACHINES

creating

available

uncertain

unavailable

deprovisioned

ALWAYS CONVERGING
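A minimal sketch of the tick-driven state machine idea, using the five state names from the slide. The slides do not show which transitions are allowed or what triggers them, so the transition table, the ServerRecord class, and the healthy: flag below are all hypothetical.

```ruby
# Hypothetical transition table: the slide lists the states, not the edges.
TRANSITIONS = {
  creating:      [:available, :deprovisioned],
  available:     [:uncertain, :deprovisioned],
  uncertain:     [:available, :unavailable, :deprovisioned],
  unavailable:   [:available, :deprovisioned],
  deprovisioned: []
}.freeze

class ServerRecord
  attr_reader :state

  def initialize
    @state = :creating
  end

  # One tick: look at the observed health and take at most one step.
  # Calling this repeatedly is what makes the record converge.
  def tick(healthy:)
    next_state =
      case state
      when :creating    then healthy ? :available : :creating
      when :available   then healthy ? :available : :uncertain
      when :uncertain   then healthy ? :available : :unavailable
      when :unavailable then healthy ? :available : :unavailable
      else state # :deprovisioned is terminal
      end
    transition_to(next_state) unless next_state == state
    state
  end

  def deprovision
    transition_to(:deprovisioned)
  end

  private

  # Rejects any step that the transition table does not list.
  def transition_to(new_state)
    unless TRANSITIONS.fetch(state).include?(new_state)
      raise ArgumentError, "illegal transition #{state} -> #{new_state}"
    end
    @state = new_state
  end
end

server = ServerRecord.new
[true, false, false, true].each { |healthy| puts server.tick(healthy: healthy) }
```

A single failed check only moves the server to uncertain; it takes a second one to reach unavailable, which is one way to avoid reacting to a transient blip.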

SIDEKIQ AND REDIS

[Sidekiq ASCII-art logo]

QUEUE PROBLEMS

https://upload.wikimedia.org/wikipedia/commons/0/01/James_May's_Lego_House_a_long_queue_of_volunteers.jpg

SEMAPHORES & CONCURRENCY

class Formation < Actor
  def add_node
    add_server
    # Ask every server in the formation to reconfigure
    # by bumping its :configure semaphore.
    servers.each do |s|
      s.sem.incr(:configure)
    end
  end
end

SEMAPHORES & CONCURRENCY

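class Server < Actor
  # Work done while the server is in the :running state.
  def work_running
    # A pending :configure request moves the server to :configuring.
    if sem.pos?(:configure)
      self.state = :configuring
      save_changes
    end
  end

  # Work done while the server is in the :configuring state.
  def work_configuring
    configure_stuff
    # Consume one request, then return to :running.
    sem.decr(:configure)
    self.state = :running
    save_changes
  end
end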
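The two slides above rely on Actor, sem, save_changes, and configure_stuff, none of which are shown. The following is a self-contained sketch of how the pieces could fit together, with in-memory stand-ins for all of them. The Sem class, the tick dispatcher, and the configure_runs counter are hypothetical and exist only to make the flow runnable. The real system persists state, which this sketch does not.

```ruby
# Hypothetical in-memory counting semaphore keyed by name.
class Sem
  def initialize
    @counts = Hash.new(0)
  end

  def incr(name)
    @counts[name] += 1
  end

  # Never drops below zero.
  def decr(name)
    @counts[name] -= 1 if @counts[name] > 0
  end

  def pos?(name)
    @counts[name] > 0
  end
end

# Hypothetical base class: one tick runs the work_<state> method.
class Actor
  attr_accessor :state
  attr_reader :sem

  def initialize(state)
    @state = state
    @sem = Sem.new
  end

  def tick
    send("work_#{state}")
  end
end

class Server < Actor
  attr_reader :configure_runs

  def initialize
    super(:running)
    @configure_runs = 0
  end

  def work_running
    self.state = :configuring if sem.pos?(:configure)
  end

  def work_configuring
    @configure_runs += 1 # stand-in for configure_stuff
    sem.decr(:configure)
    self.state = :running
  end
end

server = Server.new
2.times { server.sem.incr(:configure) } # two add_node calls arrive back to back
5.times { server.tick }
puts "configure ran #{server.configure_runs} times, state is #{server.state}"
```

Requests that arrive while the server is busy are not lost: the counter holds them until the server is back in the running state and picks up the next one.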

IN-DEPTH DETAILS

www.citusdata.com/blog/2016/08/12/state-machines-to-run-databases/

CUSTOMER-VISIBLE METRICS

citus=> select * from cw_metrics limit 1;

-[ RECORD 1 ]+-------------------------------------
server_id    | f2239a4b-7297-4b66-b9e3-851291760b70
aws_id       | i-723a927805464ac8b
name         | NetworkOut
timestamp    | 2016-07-28 14:13:00-07
sample_count | 5
average      | 127505
sum          | 637525
minimum      | 111888
maximum      | 144385
unit         | Bytes

Indexes:
    "cw_metrics_pkey" PRIMARY KEY, btree (server_id, "timestamp", aws_id, name)

ADMIN-METRICS

citus=> select * from events order by created_at desc limit 2;

-[ RECORD 1 ]-
id         | 9a3dfdbd-c395-40bb-8d25-45ee7c913662
name       | Timeout::Error
created_at | 2016-07-28 13:18:47.289917-07
data       | {"id": "5747a999-9768-429c-b13c-c7c0947dd950", "class": "Server", "message": "execution expired"}
-[ RECORD 2 ]-
id         | ba9d6a13-0832-47fb-a849-02f1362c9019
name       | Sequel::DatabaseConnectionError
created_at | 2016-07-28 12:58:40.506267-07
data       | {"id": "232835ec-31a1-44d0-ae5b-edafb2cf6978", "class": "Timeline", "message": "PG::

SELECT count(*), name, date_trunc('hour', created_at) AS hour
FROM events
WHERE created_at > now() - '1 week'::interval
GROUP BY name, hour;
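A count grouped by name and hour is the kind of rollup that splits cleanly across shards: each shard can count its own rows, and the partial counts are then summed. The following sketch shows that scatter-gather shape in plain Ruby. It does not use Citus or a database. The shard contents are invented, and threads stand in for worker nodes.

```ruby
# Hypothetical per-shard rows as [name, created_at]; not real data.
SHARDS = [
  [["Timeout::Error", Time.utc(2016, 7, 28, 20, 18)],
   ["Timeout::Error", Time.utc(2016, 7, 28, 20, 40)]],
  [["Sequel::DatabaseConnectionError", Time.utc(2016, 7, 28, 19, 58)],
   ["Timeout::Error", Time.utc(2016, 7, 28, 20, 5)]]
].freeze

# Equivalent of date_trunc('hour', ...) for a Time value.
def hour_of(time)
  Time.utc(time.year, time.month, time.day, time.hour)
end

# Per-shard step: count rows grouped by [name, hour].
def partial_counts(rows)
  rows.each_with_object(Hash.new(0)) do |(name, created_at), acc|
    acc[[name, hour_of(created_at)]] += 1
  end
end

# Merge step: sum the partial counts for matching groups.
def merge_counts(partials)
  partials.reduce({}) do |total, part|
    total.merge(part) { |_group, a, b| a + b }
  end
end

# Scatter to one thread per shard, then gather and merge.
def distributed_counts(shards)
  partials = shards.map { |rows| Thread.new { partial_counts(rows) } }.map(&:value)
  merge_counts(partials)
end

distributed_counts(SHARDS).each do |(name, hour), count|
  puts "#{count} | #{name} | #{hour}"
end
```

This works because count is decomposable: the sum of per-shard counts equals the count over all rows. Aggregates without that property need a different merge step.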

ONE MISADVENTURE

WILL LEINWEBER @LEINWEBER

CITUSDATA.COM

thank you