Migrating to PostgreSQL, the new story
Percona Live, 2016
Dimitri Fontaine @tapoueh
October 4, 2016
Dimitri Fontaine @tapoueh Migrating to PostgreSQL, the new story October 4, 2016 1 / 42
Dimitri Fontaine
PostgreSQL Major Contributor
• pgloader
• prefix, skytools
• apt.postgresql.org
• CREATE EXTENSION
• CREATE EVENT TRIGGER
• Bi-Directional Replication
• pginstall
PostgreSQL is YeSQL!
Load data into PostgreSQL. Fast.
http://pgloader.io/
pgloader: Open Source, github
https://github.com/dimitri/pgloader
Let’s talk about MySQL for a minute
Just in the context of migrating from it, of course
Why Migrate from MySQL to PostgreSQL?
MySQL
• Storage Engine
• Single Application
• Data Loss with Replication
• Weak Data Types Validation
• Either transactions or
• Lack of
PostgreSQL
• Data Access Service
• Application Suite
• Durability and Availability
• Consistency
• Full Text Search, PostGIS
• Proper Documentation
The migration budget
What are the costs?
• Migrating the Data
• Migrating the Code
• Quality Assurance
• Opportunity Cost
The boring parts... are not
MySQL used not to be so serious about data consistency...
Type CASTing
Casting types from MySQL to PostgreSQL is... interesting.
• Empty Strings or NULL?
• Default Values as binary strings
• Which calendar are you really using? Zero dates
• Integers sized in digits rather than bytes: int(11)
• What is a float(20,2) anyway?
• Did you know a WHERE clause is a boolean?
• Then per column enum, and set
• Oh, and encodings too
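The int(11) bullet lends itself to a small illustration. In MySQL, the number in parentheses after an integer type is a display width, not a storage size, so it can be ignored when choosing a PostgreSQL type. The Python sketch below only illustrates that idea: the function name and the mapping table are assumptions made for this example, not pgloader's own casting rules, and unsigned types are left out.

```python
import re

# Illustrative mapping (an assumption for this sketch, not pgloader's rule set):
# each signed MySQL integer type goes to the smallest PostgreSQL integer type
# that can hold its range.
INTEGER_TYPES = {
    "tinyint": "smallint",    # 1 byte in MySQL; PostgreSQL has no 1-byte integer
    "smallint": "smallint",   # 2 bytes
    "mediumint": "integer",   # 3 bytes
    "int": "integer",         # 4 bytes
    "bigint": "bigint",       # 8 bytes
}

# A base type name, optionally followed by a display width in parentheses.
TYPE_RE = re.compile(r"^\s*([a-z]+)\s*(?:\(\s*(\d+)\s*\))?\s*$", re.IGNORECASE)


def pg_integer_type(mysql_type: str) -> str:
    """Map a signed MySQL integer declaration such as 'int(11)' to a PostgreSQL type."""
    match = TYPE_RE.match(mysql_type)
    if match is None:
        raise ValueError(f"not a simple type declaration: {mysql_type!r}")
    base = match.group(1).lower()
    if base not in INTEGER_TYPES:
        raise ValueError(f"not an integer type: {mysql_type!r}")
    # match.group(2) holds the display width; it does not affect storage.
    return INTEGER_TYPES[base]
```

With this sketch, both int(11) and a bare int map to integer, since only the base type name is consulted.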
Difficulties when migrating MySQL data
Dates and The Gregorian Calendar
MariaDB [talk]> create table dates(d datetime);
MariaDB [talk]> insert into dates
    values('0000-00-00'), ('0000-10-31'), ('2013-10-00');
MariaDB [talk]> select * from dates;
+---------------------+
| d                   |
+---------------------+
| 0000-00-00 00:00:00 |
| 0000-10-31 00:00:00 |
| 2013-10-00 00:00:00 |
+---------------------+
3 rows in set (0.00 sec)
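None of the three values above is a date in the Gregorian calendar, so a migration has to decide what to do with them. A minimal Python sketch of one possible policy follows: any value with a zero year, month or day becomes a missing value. The function name and the policy are assumptions for illustration; pgloader's own zero-dates-to-null transformation, named later in this deck, may apply a different rule. The sketch expects whole seconds only.

```python
from datetime import datetime
from typing import Optional


def zero_date_to_none(value: str) -> Optional[datetime]:
    """Return a datetime, or None when MySQL stored a date with a zero part."""
    date_part, _, time_part = value.partition(" ")
    year, month, day = (int(part) for part in date_part.split("-"))
    if year == 0 or month == 0 or day == 0:
        # No such date in the Gregorian calendar: treat it as missing.
        return None
    # A bare date without a time part is read as midnight.
    hour, minute, second = (int(part) for part in (time_part or "00:00:00").split(":"))
    return datetime(year, month, day, hour, minute, second)
```

Under this policy all three rows of the example table would load as NULL, while a regular value such as 2013-10-31 12:30:00 passes through unchanged.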
The God algorithm
$ createdb pagila
$ pgloader mysql://user@localhost/sakila pgsql:///pagila
pgloader mysql://root@localhost/sakila pgsql:///pagila
table name                          read   imported     errors      total time       read      write
-----------------------------  ---------  ---------  ---------  --------------  ---------  ---------
before load                            3          3          0          0.011s
fetch meta data                       86         86          0          0.429s
Create SQL Types                       2          2          0          0.006s
Create tables                         32         32          0          0.091s
Set Table OIDs                        16         16          0          0.001s
Create MatViews Tables                14         14          0          0.044s
-----------------------------  ---------  ---------  ---------  --------------  ---------  ---------
actor                                200        200          0          0.009s     0.014s     0.008s
address                              603        603          0          0.020s     0.030s     0.020s
...............
film_text                           1000       1000          0          0.062s     0.040s     0.062s
inventory                           4581       4581          0          0.050s     0.195s     0.050s
payment                            16049      16049          0          0.218s     0.414s     0.218s
rental                             16044      16044          0          0.315s     0.465s     0.314s
actor_info                           200        200          0          0.014s     0.904s     0.014s
mv.customer_list                     599        599          0          0.021s     0.078s     0.021s
mv.film_list                         997        997          0          0.050s     0.141s     0.050s
mv.nicer_but_slower_film_list        997        997          0          0.035s     0.154s     0.035s
mv.sales_by_film_category             16         16          0          0.002s     0.143s     0.002s
-----------------------------  ---------  ---------  ---------  --------------  ---------  ---------
COPY Threads Completion               69         69          0          1.779s
Create Indexes                        41         41          0          0.431s
Index Build Completion                41         41          0          0.002s
Reset Sequences                        0         13          0          0.026s
Primary Keys                           1          1          0          0.014s
Create Foreign Keys                   22         22          0          0.082s
Create Triggers                       30         30          0          0.037s
Install Comments                       0          0          0          0.000s
-----------------------------  ---------  ---------  ---------  --------------  ---------  ---------
Total import time                  50086      50086          0          2.750s     3.071s     1.008s
pgloader: what about loading data?
http://pgloader.io/
pgloader main features
pgloader is built around COPY
• Error handling and reject files
• On the fly data transformations
• Very simple command line for simple use cases
• Advanced command language for advanced use cases
• Parallelism to benefit from async IO
• Lots of input formats
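The first two features, reject handling and on-the-fly transformations, combine naturally: a row that fails its transformation is set aside with the reason, and the load carries on. The Python sketch below shows that pattern in miniature. All names in it (transform_with_rejects, parse_payment) are hypothetical, and it keeps rejects in memory where a real loader would write reject files; it is not a description of pgloader's internals.

```python
from typing import Callable, Iterable, List, Tuple


def transform_with_rejects(
    rows: Iterable[tuple],
    transform: Callable[[tuple], tuple],
) -> Tuple[List[tuple], List[Tuple[tuple, str]]]:
    """Apply transform to each row; keep failures aside instead of aborting."""
    accepted: List[tuple] = []
    rejected: List[Tuple[tuple, str]] = []
    for row in rows:
        try:
            accepted.append(transform(row))
        except (ValueError, TypeError) as error:
            # A real loader would write the row to a reject file
            # and the message to a reject log.
            rejected.append((row, str(error)))
    return accepted, rejected


def parse_payment(row: tuple) -> tuple:
    """Hypothetical transformation: (id, amount) given as text, returned typed."""
    payment_id, amount = row
    return int(payment_id), float(amount)
```

For example, calling transform_with_rejects on a few (id, amount) text pairs with parse_payment returns the rows that parsed cleanly plus a list of the offending rows, each paired with its error message.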
CSV
http://pgloader.io/howto/csv.html
COPY
http://pgloader.io/howto/quickstart.html
dBase III
http://pgloader.io/howto/dBase.html
SQLite
http://pgloader.io/howto/sqlite.html
MySQL
http://pgloader.io/howto/mysql.html
MS SQL Server
http://pgloader.io/
pgloader database migration process
The data migration process, step by step
1 Fetch metadata from source database catalogs
2 Prepare PostgreSQL database
3 COPY data
4 Complete PostgreSQL database
5 Display summary (human readable, json, csv, copy)
Fetching Metadata
Currently supported metadata
• Schemas
• Tables
• Columns
• Default Values
• Indexes
• Constraints
• Comments
• Materializing Views
Prepare PostgreSQL database
Prepare PostgreSQL for receiving the data
• Schemas
• Tables
• Columns
• Rename indexes with table oids in memory
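Index renaming is needed because MySQL index names only have to be unique within their table, whereas PostgreSQL index names must be unique within a schema. Folding the target table's OID into the name is one way to avoid collisions. The Python sketch below illustrates the idea; the helper name and the exact idx_<oid>_<name> pattern are assumptions for this example rather than a statement of pgloader's naming scheme.

```python
def unique_index_name(table_oid: int, index_name: str) -> str:
    """Build a schema-wide unique index name from a table OID and a MySQL index name."""
    # The OID is unique per table, so two tables that both have an index
    # called PRIMARY end up with distinct PostgreSQL index names.
    return f"idx_{table_oid}_{index_name.lower()}"
```

Two tables that each carry a PRIMARY index would then get names such as idx_16543_primary and idx_16550_primary (the OID values here are made up).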
Copy Data
Copy the data from the source to the target
• For each table, COPY data in
• 3+ threads work in parallel (reader/transformer/writer)
• Then for each table, create all indexes in parallel
• max parallel create index
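The reader/transformer/writer split can be pictured as three threads connected by bounded queues, so that reading from the source, transforming rows and writing to the target overlap in time. The Python sketch below shows that shape for a single table. It is a simplified stand-in under stated assumptions: the writer appends to a list instead of issuing COPY, there is no error handling (a failing transform would stall the pipeline), and every name is hypothetical. pgloader itself is not written in Python.

```python
import queue
import threading
from typing import Callable, Iterable, List

_DONE = object()  # sentinel marking the end of a stream


def copy_pipeline(
    source: Iterable[tuple],
    transform: Callable[[tuple], tuple],
) -> List[tuple]:
    """Move rows through reader, transformer and writer threads, in order."""
    raw: queue.Queue = queue.Queue(maxsize=100)    # reader -> transformer
    ready: queue.Queue = queue.Queue(maxsize=100)  # transformer -> writer
    written: List[tuple] = []

    def reader() -> None:
        for row in source:
            raw.put(row)
        raw.put(_DONE)

    def transformer() -> None:
        while (row := raw.get()) is not _DONE:
            ready.put(transform(row))
        ready.put(_DONE)

    def writer() -> None:
        # Stand-in for sending rows to PostgreSQL with COPY.
        while (row := ready.get()) is not _DONE:
            written.append(row)

    threads = [threading.Thread(target=step) for step in (reader, transformer, writer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return written
```

The bounded queues apply back-pressure: a fast reader blocks once 100 rows are waiting, which keeps memory use flat whatever the table size.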
Complete PostgreSQL Database
Install constraints
• Reset Sequences
• Upgrade unique indexes into Primary Keys where required
• Foreign Keys
• Triggers
• Comments
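Resetting sequences matters because the rows arrive with their original key values: after the load, each sequence has to be moved past the largest value already present, or the next insert would collide. The Python sketch below builds the kind of SQL statement that does this, using PostgreSQL's setval and pg_get_serial_sequence functions. The helper name is hypothetical and the statement is an illustration, not necessarily the query pgloader issues; it assumes trusted, plain lowercase identifiers and does no quoting.

```python
def reset_sequence_sql(table: str, column: str) -> str:
    """Build a statement that moves a serial column's sequence past max(column)."""
    # setval(..., value, false) makes the next nextval() return exactly `value`,
    # so an empty table restarts at 1 and a loaded table continues at max + 1.
    return (
        f"SELECT setval(pg_get_serial_sequence('{table}', '{column}'), "
        f"coalesce(max({column}), 0) + 1, false) FROM {table};"
    )
```

Calling reset_sequence_sql("actor", "actor_id") yields a single SELECT that can be run once the actor table has been loaded.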
Migrating columns
What do you mean columns?
A simple use case
Remember that example? We’ll see a more detailed one...
$ createdb pagila
$ pgloader mysql://user@localhost/sakila pgsql:///pagila
$ pgloader migration.load
An advanced use case 1/4
load database
     from mysql://root@localhost/sakila
     into postgresql:///sakila
WITH concurrency = 1, workers = 6,
     max parallel create index = 4,
     -- options to use to load into an existing schema
     -- create no tables, include drop, truncate,
     downcase identifiers, -- quote identifiers
     -- data only, schema only,
     -- create [ no ] indexes, reset [ no ] sequences,
     -- [ no ] foreign keys
SET maintenance_work_mem to '128MB', work_mem to '12MB',
    search_path to 'sakila, public, "$user"'
An advanced use case 2/4
-- MATERIALIZE VIEWS film_list, staff_list
MATERIALIZE ALL VIEWS
ALTER TABLE NAMES MATCHING ~/_list$/,
                           'sales_by_store',
                           ~/sales_by/
  SET SCHEMA 'mv'
ALTER TABLE NAMES MATCHING 'sales_by_store'
  RENAME TO 'sales_by_store_list'
ALTER TABLE NAMES MATCHING 'film'
  RENAME TO 'films'
-- INCLUDING ONLY TABLE NAMES MATCHING ~/film/, 'actor'
-- EXCLUDING TABLE NAMES MATCHING ~<ory>
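The MATCHING clauses mix two kinds of matchers: quoted strings, which name one table exactly, and ~/.../ regular expressions, which match anywhere in the name unless anchored (hence the explicit $ in ~/_list$/). The Python sketch below mimics that selection logic for the SET SCHEMA rule above. It is a reading of the syntax shown on this slide, with hypothetical names throughout, not pgloader code.

```python
import re
from typing import List, Sequence, Tuple, Union

Matcher = Union[str, re.Pattern]


def matches(name: str, matcher: Matcher) -> bool:
    """A plain string must equal the name; a regex may match anywhere in it."""
    if isinstance(matcher, str):
        return name == matcher
    return matcher.search(name) is not None


def target_schema(
    name: str,
    rules: Sequence[Tuple[List[Matcher], str]],
    default: str,
) -> str:
    """Return the schema of the first rule with a matching matcher, else default."""
    for matchers, schema in rules:
        if any(matches(name, matcher) for matcher in matchers):
            return schema
    return default


# Mirrors: ALTER TABLE NAMES MATCHING ~/_list$/, 'sales_by_store', ~/sales_by/
#          SET SCHEMA 'mv'
RULES = [
    ([re.compile(r"_list$"), "sales_by_store", re.compile(r"sales_by")], "mv"),
]
```

With these rules, film_list and sales_by_film_category resolve to the mv schema while actor stays in the default one, which agrees with the mv.-prefixed names in the load summary shown earlier.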
An advanced use case 3/4
CAST type datetime to timestamptz
drop default drop not null
using zero-dates-to-null,
column bools.a to boolean drop typemod
using tinyint-to-boolean,
type char when (= precision 1)
to char keep typemod,
column ascii.s using byte-vector-to-bytea,
column enumerate.foo
using empty-string-to-null
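Each "using" clause names a transformation function applied to every value of the matching column or type while the data streams through. To make the idea concrete, here is a Python sketch of two of them and of a per-column dispatch. The behaviour is inferred from the function names on this slide and is an assumption: pgloader's own implementations may differ in detail, and the dictionary layout and apply_casts helper are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple


def tinyint_to_boolean(value: Optional[str]) -> Optional[bool]:
    """Turn a MySQL tinyint value read as text ('0', '1', ...) into a boolean."""
    if value is None:
        return None  # NULL stays NULL
    return int(value) != 0


def empty_string_to_null(value: Optional[str]) -> Optional[str]:
    """Treat the empty string as a missing value."""
    return None if value == "" else value


# (table, column) -> transformation, mirroring the column rules above.
COLUMN_CASTS: Dict[Tuple[str, str], Callable] = {
    ("bools", "a"): tinyint_to_boolean,
    ("enumerate", "foo"): empty_string_to_null,
}


def apply_casts(table: str, row: Dict[str, Optional[str]]) -> Dict[str, object]:
    """Apply the registered transformation to each column; pass others through."""
    return {
        column: COLUMN_CASTS.get((table, column), lambda v: v)(value)
        for column, value in row.items()
    }
```

Columns without a registered rule pass through untouched, which reflects how the CAST clause only overrides the default behaviour for what it names.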
An advanced use case 4/4
BEFORE LOAD DO
$$ create schema if not exists sakila; $$,
$$ create schema if not exists mv; $$,
$$ alter database sakila
set search_path to sakila, mv, public;
$$;
pgloader: load data into PostgreSQL
http://pgloader.io/
And more to come
File formats with on-the-fly normalisation
Other database systems
You can become a sponsor!
http://pgloader.io/pgloader-moral-license.html
What sponsors have to say about it: iwoca
Thanks to pgloader we were able to migrate our main database from MySQL to Postgres, which involved moving hundreds of tables used by our complex Django project. Dimitri was very helpful. He implemented a new feature for us quickly and smoothly.
http://pgloader.io/pgloader-moral-license.html
What sponsors have to say about it: Fusionbox
Fusionbox used pgloader on a project for a large government agency. We needed to migrate a large set of data from an existing SQL Server cluster to a new PostgreSQL solution. Pgloader greatly reduced the time required to accomplish this complex migration.
http://pgloader.io/pgloader-moral-license.html
Questions?
Now is the time to ask!