Open Source Database Ecosystem in 2016
Peter Zaitsev
3 October 2016
2
Great things are happening with Open Source Databases
It is a great industry and community to be a part of
3
Why?
4
Data Continues Exponential Growth
Source: IDC, http://situationalintelligence.net/
5
It’s Not Humans, It’s Devices
Source: IDC, http://situationalintelligence.net/
6
For Decades we…
Used Proprietary Relational Databases to Manage Structured Data
7
It does not work!
Too expensive
Too Inefficient
Too Inflexible
8
As a Result
Top Internet applications embraced open source databases long ago
Traditional Enterprises are catching up too
9
Benefits of Open Source for Business
No Software Vendor Lock-In
More Flexibility
More Compatibility
Faster Innovation
10
Isn’t Open Source Free as in Beer?
Free for Developer != Free for Business
Total cost of ownership is reported to be 3-10x lower
11
So
It is no surprise that Open Source Databases are gaining momentum!
12
Gartner: State of Open Source RDBMS 2015
By 2018, 70%+ of all newly developed applications will run on Open Source Databases
80% of existing applications are candidates for migration to an Open Source Database
50% of existing RDBMS instances will be converted to an Open Source RDBMS
13
Black Duck Open Source Survey 2016
“Open Source Database Adoption is second only to Adoption of Open Source Operating Systems”
14
DB-Engines: The Gap Is Closing
15
New Categories Dominated by Open Source
16
Fast Change of Momentum
17
Truly International Innovation
All Top Open Source Database Systems have Globally Distributed Development Teams
18
Open Source Innovation at Percona Live
Keynotes from Open Source Innovators
• MySQL, MariaDB, Facebook
In-Depth Technical Presentations
• MongoDB, Redis, RocksDB, PostgreSQL
19
Trying Something New
Invited developers and ecosystem members to talk about technologies which inspire them
PostgreSQL
The World’s Most Advanced Open Source Database
21
What does that mean for…
… people who work with PostgreSQL
• It supports transactions in a proper way
• It is flexible
… people who never work with PostgreSQL
• It is old-school and difficult to use
• You need to be a PostgreSQL hacker to use it
22
PostgreSQL Evolution
Inspired by https://momjian.us/main/writings/pgsql/past_present_future.pdf
23
2016 is a year of
• PostgreSQL 9.6 release
• PostgreSQL 9.5 release is production ready
• Both are very impressive in terms of performance and features
• Postgres is in all kinds of industries
• PostgreSQL community activity grows
• More user-oriented conferences and meetups
• Increased enterprise adoption gives a lot of feedback from users
MongoDB
Leading Open Source Document Oriented Database
25
MongoDB Ecosystem
RethinkDB
Open Source Database for Real-Time Web
27
What is RethinkDB?
• Open-source database for building realtime web applications.
• NoSQL database that stores schemaless JSON documents.
• Distributed database that is easy to scale.
• High availability database with automatic failover and robust fault tolerance.
• Supports both pull and push models – Changefeeds.
• Map-reduce.
• Geospatial queries and GeoJSON.
• The second most popular database on GitHub.
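The difference between the pull and push models can be sketched with a toy in-memory table. This is an illustration only: `ToyTable`, `subscribe`, and `upsert` are hypothetical names and not the RethinkDB driver API. The one detail borrowed from RethinkDB is the shape of a change notification, which pairs the old and new versions of a document as `old_val` and `new_val`.

```python
from typing import Any, Callable, Dict, List, Optional

Document = Dict[str, Any]
Change = Dict[str, Optional[Document]]


class ToyTable:
    """In-memory stand-in for a table (hypothetical, not the RethinkDB API)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Document] = {}
        self._subscribers: List[Callable[[Change], None]] = []

    def get(self, key: str) -> Optional[Document]:
        # Pull model: the client asks for the current state when it wants it.
        return self._rows.get(key)

    def subscribe(self, callback: Callable[[Change], None]) -> None:
        # Push model: the client registers once and is notified on every write.
        self._subscribers.append(callback)

    def upsert(self, key: str, doc: Document) -> None:
        old = self._rows.get(key)
        self._rows[key] = doc
        change: Change = {"old_val": old, "new_val": doc}
        for notify in self._subscribers:
            notify(change)


def demo() -> tuple:
    table = ToyTable()
    received: List[Change] = []
    table.subscribe(received.append)      # push: collect every change
    table.upsert("game1", {"score": 1})
    table.upsert("game1", {"score": 2})
    return table.get("game1"), received   # pull: read the latest state


if __name__ == "__main__":
    latest, changes = demo()
    print("latest:", latest)
    for change in changes:
        print("change:", change)
```

With the pull model the client sees only the latest state when it asks; with the push model it sees every intermediate change without polling, which is what realtime applications need.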
28
RethinkDB is good for…
• Collaborative web and mobile apps.
• Streaming analytics apps.
• Multiplayer games.
• Realtime marketplaces.
• Connected devices.
29
Current State
• Initial release: July 2009
• Open-sourced: November 2012
• RethinkDB 2.3.x
• Users and permissions.
• TLS encrypted connections.
• 10x better performance for distributed joins.
• Windows beta.
• Latest release: 2.3.5
• Improved the efficiency of the on-disk garbage collector to reduce the risk of excessive file growth.
• Improved the latency of read queries under heavy write loads.
• Improved the Raft election timeout logic to avoid infinite Raft election loops.
30
Who uses RethinkDB in production?
• NASA
• Jive
• Narrative
• Cmune
• SocialRadar
• Mediafly
• Wise.io
• Platzi
31
Agile web-development with RethinkDB: tomorrow at 12:20pm
ClickHouse
High-Performance Distributed DBMS for Analytics
(Blazing Fast Open Source Analytics Database for Petabytes of Data)
33
Faster than you can imagine
• Column-oriented
• 100x faster than typical RDBMS
• Distributed queries
• Massively parallel
• SQL
34
Linearly scalable
Features
• Petabytes of data in one cluster
• Multi-Datacenter
• Awesome data compression
• High-availability
Main Yandex.Metrica Cluster
• 3 PB
• 6 Datacenters
• 422 Nodes
• 17.2 trillion rows (17 200 000 000 000)
• 20 billion rows inserted daily in real time
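To put the cluster figures above in perspective, here is a back-of-the-envelope sketch. It rests on assumptions not stated on the slide: that the 3 PB figure means decimal petabytes of stored data covering all 17.2 trillion rows, that rows are spread evenly across the 422 nodes, and that inserts arrive at a uniform rate through the day. The function name `cluster_stats` is purely illustrative.

```python
# Figures taken from the slide.
ROWS_TOTAL = 17_200_000_000_000   # 17.2 trillion rows
NODES = 422
ROWS_PER_DAY = 20_000_000_000     # 20 billion rows inserted daily

# Assumption: "3 PB" is decimal petabytes (10**15 bytes) of stored data.
BYTES_TOTAL = 3 * 10**15

SECONDS_PER_DAY = 24 * 60 * 60


def cluster_stats() -> dict:
    """Derive rough per-node, per-row, and per-second figures."""
    return {
        # Assumes an even spread of rows over nodes.
        "rows_per_node": ROWS_TOTAL / NODES,
        # Average stored footprint of one row under the 3 PB assumption.
        "bytes_per_row": BYTES_TOTAL / ROWS_TOTAL,
        # Assumes a uniform insert rate over 24 hours.
        "inserts_per_second": ROWS_PER_DAY / SECONDS_PER_DAY,
    }


if __name__ == "__main__":
    for name, value in cluster_stats().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions the cluster holds roughly 40 billion rows per node at about 174 bytes per row, and absorbs on the order of 230,000 inserts per second on average.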
35
Production proven
• More than 20 projects inside Yandex
• 4+ Years in production
• Highly reliable
• No single point of failure
• No major downtime events or data loss for years
36
Open Source
• Open-sourced in June 2016
• License: Apache 2.0
• Tens of companies already using ClickHouse
• Ready to go!
https://github.com/yandex/clickhouse
https://clickhouse.yandex
Tarantool
Open Source NoSQL Database Running in a Lua Application Server
38
Tarantool: battering ram tool
Open source, open government
● simplified BSD license
● first release October 2010
● In-memory database and application server
● ACID transactions
39
Why another database?
Database visionary Jim Gray: It’s time for a complete rewrite
● lock-free transaction processing as in Gray et al paper circa 2008
● 1 000 000 transactions per second on a single core
● a database for the most volatile/hot data
40
An application server
Get your data in RAM. Get compute close to data. Enjoy the performance.
● OpenResty of the database world
● tons of modules: JSON, HTTP, YAML, PostgreSQL, MySQL, GIS, MQTT, etc.
41
Database features
● document data model
● compression, lowest memory footprint
● transactions, secondary keys
● log streaming replication
● online backup
42
The community
ProxySQL
High Performance Open Source Proxy for MySQL and MariaDB
Also… We Have an Announcement to Make
45
RocksDB is Fantastic!
Source: https://github.com/facebook/rocksdb/blob/master/USERS.md
46
MyRocks is coming to Percona Server
47
Thank You Sponsors!!