5 Ways to Accelerate and Scale Out PostgreSQL
Denis Magda
Apache Ignite Committer and PMC Chair
GridGain VP of Product Management
2019 © GridGain Systems @denismagda
Agenda
• Tapping into RAM with caching techniques
• Sharding and replication solutions
• Cache and scale out with in-memory data grids
• Q&A
Caching Techniques
Ultimate Purpose of Caching
Speed up operations by reducing disk access and computation (i.e., CPU usage)
Computer Latency at Human Scale
System Event                               Actual Latency   Scaled Latency
One CPU cycle                              0.4 ns           1 s
Level 1 cache access                       0.9 ns           2 s
Level 2 cache access                       2.8 ns           7 s
Level 3 cache access                       28 ns            1 min
Main memory access (DDR DIMM)              ~100 ns          4 min
Intel Optane DC persistent memory access   ~350 ns          15 min
Intel Optane DC SSD I/O                    < 10 µs          7 hrs
NVMe SSD I/O                               ~25 µs           17 hrs
SSD I/O                                    50-150 µs        1.5-4 days
Rotational disk I/O                        1-10 ms          1-9 months
Internet: SF to NY                         65 ms            5 years
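The scaled column follows from stretching one CPU cycle (0.4 ns) to one second. The arithmetic can be sketched as below; the `humanize` helper and the choice of three sample rows are illustrative assumptions, not part of the original slide.

```python
# Rescale real latencies so that one CPU cycle (0.4 ns) lasts one "human" second.
CPU_CYCLE_NS = 0.4  # from the table: one CPU cycle = 0.4 ns


def scaled_seconds(latency_ns: float) -> float:
    """Return the latency expressed in human-scale seconds."""
    return latency_ns / CPU_CYCLE_NS


def humanize(seconds: float) -> str:
    """Format a duration with the largest unit that fits (hypothetical helper)."""
    units = [("years", 365 * 24 * 3600), ("days", 24 * 3600),
             ("hrs", 3600), ("min", 60)]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} s"


events_ns = {
    "Main memory access": 100,         # ~100 ns
    "NVMe SSD I/O": 25_000,            # ~25 µs
    "Internet: SF to NY": 65_000_000,  # 65 ms
}
for name, ns in events_ns.items():
    print(f"{name}: {humanize(scaled_seconds(ns))}")
```

The results land close to the rounded figures in the table, which is the point of the exercise: a disk or network hop costs days or years on the scale where a RAM access costs minutes.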
Basic Types of Caching in Postgres
• Query result caching
• Query plan caching
• Relation caching
– Data and indexes
Relation caching: Shared Buffer and OS Buffer
• Postgres Shared Buffer Cache
– Allocated and managed by Postgres
• OS Buffer (a.k.a. Page Cache)
– Caches chunks (pages) of files
• Suggestions/considerations:
– No silver bullet – select and tune
– Possible duplication between shared and OS caches
– Limited by local RAM capacity
Diagram (data flow on reads/writes): Postgres <-> Shared Buffer <-> OS Buffer <-> Disk
Horizontal Scalability
Defining Requirements for Solution
• Strong Consistency (ACID)
• Load Balancing
• High-Availability and Failover
Pgpool-II for Read-Heavy Workloads
• Pgpool coordinator
• Primary for reads and writes
• Hot replicas for reads
• Suggestions/considerations:
– Good for load balancing of read-heavy workloads
– ACID requires synchronous replication, which limits the number of replicas
– Primary machine capacity defines your total cluster capacity
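The routing rule behind this setup (writes to the primary, reads spread over replicas) can be sketched as follows. This is a deliberately naive illustration and not Pgpool's actual logic: the `ReadWriteRouter` class is hypothetical and classifies statements by their first keyword only, whereas a real coordinator has to consider transactions, functions with side effects, and replication lag.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical router: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._read_targets = itertools.cycle(replicas or [primary])

    def route(self, sql: str) -> str:
        first_word = sql.lstrip().split(None, 1)[0].upper()
        if first_word == "SELECT":
            return next(self._read_targets)  # round-robin over replicas
        return self.primary  # INSERT/UPDATE/DELETE/DDL all hit the primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
for stmt in ["SELECT * FROM city",
             "UPDATE city SET name = 'X' WHERE id = 1",
             "SELECT 1"]:
    print(router.route(stmt), "<-", stmt)
```

The sketch also makes the last consideration visible: every write lands on a single node, so adding replicas raises read throughput only.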
Sharding With PostgreSQL-XL or CitusData
• Coordinator keeps metadata and distributes queries
• Data nodes store shards/partitions
• Supports data co-location and JOINs
• Suggestions/considerations:
– Suited for mixed workloads
– Total capacity is your cluster capacity
– Scaling and failover are not trivial
– Disk-based solution
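The coordinator's role can be sketched as a lookup over shard metadata: a query that filters on the shard key goes to one data node, and any other query is sent to all of them. The sketch below is a hypothetical illustration that uses key ranges for readability; real systems often hash the distribution key instead, and the `Coordinator` class, its bounds, and the node names are assumptions.

```python
import bisect


class Coordinator:
    """Hypothetical coordinator: keeps shard metadata and picks target nodes."""

    def __init__(self):
        # Metadata: exclusive upper bound of each key range and the owning node.
        self.upper_bounds = [1000, 2000, 3000]
        self.nodes = ["data-node-0", "data-node-1", "data-node-2"]

    def node_for_key(self, key: int) -> str:
        idx = bisect.bisect_right(self.upper_bounds, key)
        if idx >= len(self.nodes):
            raise KeyError(f"no shard covers key {key}")
        return self.nodes[idx]

    def targets(self, shard_key=None):
        """A query filtered by the shard key hits one node, otherwise all nodes."""
        if shard_key is None:
            return list(self.nodes)
        return [self.node_for_key(shard_key)]


coordinator = Coordinator()
print(coordinator.targets(shard_key=1500))  # single-shard query
print(coordinator.targets())                # fan-out query
```

Changing the metadata (splitting a range or moving it to a new node) also means moving the data, which is part of why scaling and failover are listed as non-trivial.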
Caching and Scaling With In-Memory Data Grids
Apache Ignite/GridGain In-Memory Computing Platform
• Application Layer: Web, SaaS, Social, Mobile, IoT
• Platform components: Key-Value, SQL, Transactions, Messaging, Streaming, Events, Machine and Deep Learning, Compute Grid, Service Grid
• In-Memory Data Store
• Persistent Layer: RDBMS, NoSQL, Hadoop, Mainframe, Ignite Persistence
• Legend: Apache Ignite Features and GridGain Enterprise Features
• GridGain Enterprise Features:
– Rolling Upgrades
– Security & Auditing
– Monitoring & Management
– Segmentation Protection
– Data Center Replication
– Network Backups
– Full, Incremental, Continuous Backups
– Point-in-Time Recovery
– Heterogeneous Recovery
Primary Ignite Deployment Modes
• Enhance Legacy Architecture (IMDG)
– Application Layer: Web-Scale Apps, IoT, Mobile Apps, Social Media
– Ignite In-Memory Computing Platform
– External Database: RDBMS, NoSQL, Hadoop
• Simplified Modern Architecture (IMDB)
– Application Layer: Web-Scale Apps, IoT, Mobile Apps, Social Media
– Ignite In-Memory Computing Platform
– Ignite Persistence
How Is Postgres Accelerated?
Distributed SQL
• Ignite distributed SQL support: ANSI-99 SQL
• DDL & DML support: SELECT, UPDATE, INSERT, MERGE, CREATE, DELETE & ALTER
• Cross-platform compatibility: JDBC, ODBC, Java, .NET, C++, REST, Binary Protocol (thin client)
• Indexes in RAM or on disk
• Dynamic scaling
• Layers: Compute Grid, In-Memory Data Store, Persistent Store
Holy Grail of Distributed World: Affinity Collocation
• Related data is on the same node
– Countries and Cities
– Departments and Employees
• Collocated Processing
– Efficient Distributed JOINs
– Collocated Computations
– Reduced network traffic
– Performance boost!
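The idea can be sketched by placing each city on the node chosen by its country code, so a Countries-Cities join never leaves the node. This is a minimal sketch under stated assumptions: `node_for` with a CRC32 modulo is a stand-in for a real affinity function, and the two-node layout and data are illustrative.

```python
import zlib

NODE_COUNT = 2  # hypothetical cluster size


def node_for(affinity_key: str) -> int:
    """Stand-in affinity function: a stable hash of the affinity key."""
    return zlib.crc32(affinity_key.encode("utf-8")) % NODE_COUNT


countries = {"CA": "Canada", "IN": "India"}
cities = [("Toronto", "CA"), ("Ottawa", "CA"),
          ("Mumbai", "IN"), ("New Delhi", "IN")]

# Local storage of each node.
nodes = [{"countries": {}, "cities": []} for _ in range(NODE_COUNT)]
for code, name in countries.items():
    nodes[node_for(code)]["countries"][code] = name
for city, code in cities:
    # The country code is the affinity key, so a city follows its country.
    nodes[node_for(code)]["cities"].append((city, code))


def local_join(node):
    """Join cities with countries using only data stored on this node."""
    return [(city, node["countries"][code]) for city, code in node["cities"]]


joined = sorted(pair for node in nodes for pair in local_join(node))
print(joined)
```

The lookup `node["countries"][code]` never misses because both tables are placed by the same key. Without collocation, the same join would need rows shipped between nodes.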
Ignite SQL Queries
1. Initial Query
2. Query execution over local data
3. Reduce multiple results into one
Diagram: one Ignite node stores Canada with Toronto, Ottawa, Montreal, and Calgary; another Ignite node stores India with Mumbai and New Delhi.
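The three steps can be sketched as a small scatter-gather routine over the data shown on the slide. This is an illustration of the pattern only, not Ignite's query engine: the `map_step`, `reduce_step`, and `run_query` names are hypothetical, and the "query" is a simple count of cities per country.

```python
# Each dict stands in for the local data of one Ignite node (names from the slide).
node_data = [
    {"Canada": ["Toronto", "Ottawa", "Montreal", "Calgary"]},
    {"India": ["Mumbai", "New Delhi"]},
]


def map_step(local_data):
    """Step 2: execute the query over local data only (count cities per country)."""
    return {country: len(cities) for country, cities in local_data.items()}


def reduce_step(partials):
    """Step 3: merge the partial results into a single result set."""
    merged = {}
    for partial in partials:
        for country, count in partial.items():
            merged[country] = merged.get(country, 0) + count
    return merged


def run_query():
    """Step 1: send the initial query to every node, then reduce the answers."""
    return reduce_step(map_step(data) for data in node_data)


print(run_query())
```

Only the small partial results travel over the network; the rows themselves stay where they are stored.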
Life Without Stored Procedures: Compute Grid
• GridGain cluster of server nodes, each with an In-Memory Data Store and a Persistent Store
• Computation is split across nodes: C = C1 + C2 (C = Compute)
• Results are combined: R = R1 + R2 (R = Result), in T/2 time
• Automatic Failover
• Load Balancing
• Zero Deployment
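The split C = C1 + C2 and the merge R = R1 + R2 can be sketched with a thread pool standing in for the cluster. This is a sketch under stated assumptions: worker threads play the role of server nodes, `compute` is an arbitrary example task, and the T/2 figure is the ideal for two nodes rather than something this snippet measures.

```python
from concurrent.futures import ThreadPoolExecutor


def compute(chunk):
    """The task each node runs over its own part of the data (example: sum of squares)."""
    return sum(x * x for x in chunk)


def run_on_cluster(data, node_count=2):
    """Split the computation, run the parts in parallel, and add up the results."""
    size = max(1, (len(data) + node_count - 1) // node_count)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]  # C1, C2, ...
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(compute, chunks))                  # R1, R2, ...
    return sum(partials)                                            # R = R1 + R2


print(run_on_cluster(list(range(10))))
```

In a data grid the chunks are not shipped to the workers; each task is sent to the node that already holds its part of the data.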
Transactions and Consistency
• Distributed Key-Value Transactions
– Two-phase commit (2PC) protocol
– Transactions can span to Postgres
• Transactional SQL (Beta)
– MVCC
• Strong or relaxed consistency
– Atomic and transactional tables
– Tunable write-ahead log (WAL) settings
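The two-phase commit protocol mentioned in the list can be sketched in a few lines: every participant first votes in a prepare phase, and the updates are applied only if all votes are positive. This is a textbook-style sketch rather than Ignite's implementation; the `Participant` class and `two_phase_commit` function are hypothetical, and timeouts, logging, and recovery are left out.

```python
class Participant:
    """Hypothetical transaction participant (for example, one cluster node)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.committed = {}   # durable state
        self._pending = None  # updates staged during the prepare phase

    def prepare(self, updates) -> bool:
        """Phase 1: stage the updates and vote yes or no."""
        if not self.can_commit:
            return False
        self._pending = dict(updates)
        return True

    def commit(self):
        """Phase 2: make the staged updates visible."""
        self.committed.update(self._pending)
        self._pending = None

    def rollback(self):
        self._pending = None


def two_phase_commit(participants, updates) -> bool:
    prepared = []
    for p in participants:
        if p.prepare(updates):
            prepared.append(p)
        else:
            for q in prepared:  # one "no" vote aborts everyone
                q.rollback()
            return False
    for p in participants:
        p.commit()
    return True


a, b = Participant("node-a"), Participant("node-b")
print(two_phase_commit([a, b], {"city:1": "Toronto"}))
```

With a participant that votes no, the function returns `False` and no participant changes its committed state.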
Consistency Across Postgres and Ignite/GridGain
• Coordinator writes to the database first
• Commits in the cluster afterwards
• The database must be transactional
– Postgres!
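The ordering above (database first, cluster afterwards) can be sketched as a write-through update in which the cache changes only after the database commit succeeds. This is a minimal sketch under loud assumptions: `sqlite3` stands in for Postgres purely so the example runs anywhere, a plain dict stands in for the cluster, and `put` is a hypothetical helper.

```python
import sqlite3

# sqlite3 is a stand-in for Postgres (assumption made for runnability only).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cache = {}  # stand-in for the in-memory cluster


def put(city_id, name) -> bool:
    """Write-through: commit to the database first, then update the cache."""
    try:
        with db:  # commits on success, rolls back on an exception
            db.execute("INSERT OR REPLACE INTO city (id, name) VALUES (?, ?)",
                       (city_id, name))
    except sqlite3.Error:
        return False  # the database rejected the write; the cache is untouched
    cache[city_id] = name
    return True


print(put(1, "Toronto"))  # accepted by the database, then cached
print(put(2, None))       # violates NOT NULL: rolled back, never cached
```

Because a rejected write is rolled back by the transactional database before the cache is touched, the two stores cannot disagree about it.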
Demo
Apache Ignite – We’re Hiring ;)
• Rapidly Growing Community
• Great Way to Learn Distributed Storage, Computing, SQL, ML, and Transactions
• How To Contribute:
– https://ignite.apache.org/
Apache Ignite Is a Top 5 Apache Project
• Over 2M downloads per year and 4M total downloads
• A Top 5 Apache Software Foundation project: ranked in the Top 5 Dev Mailing Lists and Top 5 User Mailing Lists
• Source: Apache Software Foundation blog post of January 1, 2019, “Apache in 2018 – By The Digits”
• Chart: Monthly Ignite/GridGain Downloads, Apr-14 through Dec-18 (y-axis: 0 to 200,000)
Q&A
@apacheignite @gridgain