
Sql no sql

Date post: 14-Jan-2015
Upload: dave-stokes
Description:
SQL versus NoSQL and how MySQL's InnoDB/Memcached interface can keep you sane.
Transcript
Page 1: Sql no sql

Copyright © 2011, Oracle and/or its affiliates. All rights reserved.

SQL & NoSQL: How 'Big Data' & MySQL Work Together

Dave Stokes
MySQL Community Manager
[email protected]

Page 2: Sql no sql

Program Agenda

• WHY SQL / NoSQL
• Alternatives to SQL
• Big Data
• Best of both worlds – InnoDB/memcached
• MySQL Cluster – NDB/memcached
• Q&A

Synopsis – How to use MySQL as a relational data store according to Codd & Date while gaining the ability to access schema-less data and looking cool while doing it.

Page 3: Sql no sql

SQL – Codd and Date

Images from Wikipedia.com

Page 4: Sql no sql


Codd & Date

Wikipedia: Edgar Frank "Ted" Codd was an English computer scientist who invented the relational model for database management, the theoretical basis for relational databases. ...Codd continued to develop and extend his relational model, sometimes in collaboration with Chris Date

Page 5: Sql no sql


Codd's Relational Model

Wikipedia: The purpose of the relational model is to provide a declarative method for specifying data and queries: users directly state what information the database contains and what information they want from it, and let the database management system software take care of describing data structures for storing the data and retrieval procedures for answering queries.

Page 6: Sql no sql

Not all data is relational or easy to extract using SQL

Common NoSQL uses

● Document stores / fuzzy schemas

● A 'Facebook Query' – find the friends of your friends … and then their friends

● Data size may be too large for the RDBMS or OS

● Coolness factor
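
The 'Facebook Query' is awkward in SQL because every extra hop is another self-join, while a graph store simply walks edges. A minimal sketch of that walk in Python, assuming the friendships fit in an in-memory adjacency list (the `FRIENDS` data and the `friends_within` name are made up for illustration):

```python
from collections import deque

# Hypothetical friendship data, stored as an adjacency list.
FRIENDS = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "eve"],
    "dan": ["bob", "fay"],
    "eve": ["cat"],
    "fay": ["dan"],
}

def friends_within(graph, start, max_hops):
    """Return everyone reachable from `start` in at most `max_hops` hops."""
    seen = {start}            # people already visited (includes the start)
    found = set()             # result set (excludes the start)
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:  # do not expand past the hop limit
            continue
        for friend in graph.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                found.add(friend)
                queue.append((friend, hops + 1))
    return found
```

Calling `friends_within(FRIENDS, "ann", 2)` gives friends plus friends of friends; raising `max_hops` to 3 adds "their friends" without changing the code, which is the part a fixed chain of joins cannot do.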

Page 7: Sql no sql


NoSQL

● Database management without relational model, schema free

● Does not use SQL (some retrofitting)

● Usually not ACID

● Eventually consistent data

● Distributed, fault-tolerant

● Large amounts of data

Page 8: Sql no sql


Common NoSQL Types

● Key value stores

● Document databases

● Graph databases

● XML databases

● Distributed peers

● Object stores

Page 9: Sql no sql

Infobright's Emerging Database Landscape
http://bit.ly/emerging_db_landscape

Columns: Row Based | Columnar | NoSQL – Key Value Store | NoSQL – Document Store | NoSQL – Columnar Store

Common uses
● Row Based: Transaction processing
● Columnar: Historical data analysis, data warehousing, BI
● NoSQL – Key Value Store: Cache for storing frequently requested data
● NoSQL – Document Store: Web apps or apps needing scaling w/o defined schema
● NoSQL – Columnar Store: Real-time data logging

Basic Description
● Row Based: Data structured in rows
● Columnar: Data structured in columns
● NoSQL – Key Value Store: Data stored in memory
● NoSQL – Document Store: Persistent storage, some SQL-like querying
● NoSQL – Columnar Store: Very large data storage, MapReduce support

Strengths
● Row Based: Capturing/inputting new records; robust, proven technology
● Columnar: Fast query support on data sets; compression
● NoSQL – Key Value Store: Scalability, very fast storage/retrieval of data
● NoSQL – Document Store: Persistent store, scalable; better query support than key-value stores
● NoSQL – Columnar Store: Very high throughput; strong partitioning; random read-write access

Weaknesses
● Row Based: Scale issues
● Columnar: Import/export speed; heavy computing resource needed
● NoSQL – Key Value Store: Usually all data must fit into memory, no complex queries
● NoSQL – Document Store: Lack of sophisticated capabilities
● NoSQL – Columnar Store: Low level API; inability to do complex queries; high query latency

Typical Database Size Range
Several GB to 50 TB; Several GB to several TB; Several TB to several PB; Several TB to several PB

Key Players
● Row Based: MySQL, Oracle, SQL Server, Sybase ASE
● Columnar: Infobright, Aster Data, Sybase IQ, Vertica, ParAccel
● NoSQL – Key Value Store: Memcached, Amazon S3, Redis, Voldemort
● NoSQL – Document Store: MongoDB, CouchDB, SimpleDB
● NoSQL – Columnar Store: HBase, Bigtable, Cassandra

Page 10: Sql no sql

An example using MongoDB

db.cars.insert({make: 'Ford', model: 'F-150', cylinders: 8})

db.cars.find({cylinders: {$gte: 8}})

Page 11: Sql no sql


Hadoop

Part 1 – Reliable data storage using the Hadoop Distributed File System (HDFS)

Part 2 – Parallel data processing using map/reduce

● Hardware can get expensive; not suited to all data
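
The map/reduce half can be illustrated without a cluster. The following is a single-process Python sketch of the three stages (map, shuffle, reduce) on a word count; it is not Hadoop's API, and the function names are invented for the example. On a real cluster the map and reduce calls would run in parallel on many machines over blocks stored in HDFS.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run the three stages in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Because each `map_phase` call sees only one line and each `reduce_phase` call sees only one key, the work can be split across machines with no shared state, which is what makes the model scale.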

Page 12: Sql no sql


Tilting relational databases on their side

Columnar databases align by column, not rows

● High compression possible

● OLAP & Data Warehousing

● MySQL engines

– Calpont's InfiniDB

– Infobright
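
Why "tilting" helps compression can be shown with a toy example. The Python sketch below pivots a few rows into per-column lists and run-length encodes one column; the sample rows and function names are assumptions for illustration, and real columnar engines use far more elaborate encodings.

```python
# Hypothetical sample rows: (make, model, cylinders).
ROWS = [
    ("Ford", "F-150", 8),
    ("Ford", "Mustang", 8),
    ("Honda", "Civic", 4),
]

def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}

def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]
```

Stored by column, similar values sit next to each other, so a column such as `make` shrinks to a handful of runs, and an analytic query that only needs `cylinders` never has to read the other columns.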

Page 13: Sql no sql

When not to use NoSQL

● Your data is relational

– Some hierarchy

– Schema

● You need ACID

● You do not like lots of servers, disk farms

Page 14: Sql no sql

memcached before MySQL 5.6

Wikipedia: memcached is a general-purpose distributed memory caching system

Page 15: Sql no sql


MySQL's use of memcached for NoSQL

● InnoDB or NDB storage engines

● Access same data (same disks) either through SQL or memcached

● 1,000,000,000+ transactions a minute for MySQL Cluster

● Many sites already using memcached

- already in use, well known, easy to implement

Page 16: Sql no sql


Diagrammatic Overview

Page 17: Sql no sql

Why is this cool?

● memcached as a daemon plugin of mysqld: both mysqld and memcached are running in the same process space, with very low latency access to data

● Direct access to InnoDB: bypassing SQL parser and optimizer

● Support standard protocol (memcapable): support both memcached text-based protocol and binary protocol; all 55 memcapable tests are passed

● Support multiple columns: users can map multiple columns into “value”. The value is separated by a pre-defined “separator” (configurable).
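
To make the multiple-column mapping concrete, here is a small Python sketch of what a client sees: several column values travel as one memcached value joined by the separator. The `|` separator and the column names are assumptions for the example; the actual separator is whatever the plugin is configured with.

```python
SEPARATOR = "|"  # assumed separator; the real one is configurable

def pack_value(columns, separator=SEPARATOR):
    """Join several column values into the single memcached value."""
    return separator.join(columns)

def unpack_value(value, column_names, separator=SEPARATOR):
    """Split a memcached value back into named columns."""
    parts = value.split(separator)
    if len(parts) != len(column_names):
        raise ValueError("value does not match the mapped columns")
    return dict(zip(column_names, parts))
```

One consequence of this design is that the separator must not occur inside the data itself, otherwise the split no longer lines up with the mapped columns.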

Page 18: Sql no sql


Why is this cool? 2

● Optional local caching: three options – “cache-only”, “innodb-only”, and “caching” (both “cache” and “innodb store”). These local options can apply to each of four Memcached operations (set, get, delete and flush).

● Batch operations: user can specify the batch commit size for InnoDB memcached operations via “daemon_memcached_r_batch_size” and “daemon_memcached_w_batch_size” (default 32)

● Support all memcached configuration options through the MySQL configuration variable “daemon_memcached_option”

Page 19: Sql no sql


PHP Example of using memcached

/* db_select(), memcached_fetch() and memcached_add() are application-defined helpers */

function get_foo(int $userid) {
    $data = db_select("SELECT * FROM users WHERE userid = ?", $userid);
    return $data;
}

Rewritten to use memcached:

function get_foo(int $userid) {
    /* first try the cache */
    $data = memcached_fetch("userrow:" . $userid);
    if (!$data) {
        /* not found: request database */
        $data = db_select("SELECT * FROM users WHERE userid = ?", $userid);
        /* then store in cache until next get */
        memcached_add("userrow:" . $userid, $data);
    }
    return $data;
}
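
The same cache-then-database pattern can be exercised end to end with stand-ins. In this Python sketch a plain dict plays the role of memcached and a callable plays the role of the database; the `UserStore` class and its `db_reads` counter are invented for illustration and are not part of any library.

```python
class UserStore:
    """Cache-aside lookup: try the cache first, fall back to the database."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # stand-in for the SQL query
        self._cache = {}                   # stand-in for memcached
        self.db_reads = 0                  # counts how often the database is hit

    def get_user(self, userid):
        key = f"userrow:{userid}"
        row = self._cache.get(key)
        if row is None:
            # Not found: request the database, then cache for the next get.
            self.db_reads += 1
            row = self._load_from_db(userid)
            if row is not None:
                self._cache[key] = row
        return row

    def invalidate(self, userid):
        """Drop the cached row, e.g. after the row is updated through SQL."""
        self._cache.pop(f"userrow:{userid}", None)
```

The `invalidate` method points at the weak spot of this pattern: the application has to keep two copies in sync by hand, which is the chore the InnoDB/memcached interface is meant to remove.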

Page 20: Sql no sql

Example InnoDB/memcached

● SQL

mysql> INSERT INTO demo_test VALUES ('dave','it works', 10, 200, NULL)\g

● Memcached
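
On the memcached side, the same key would be read or written with ordinary memcached text-protocol commands. The Python sketch below only builds those commands as bytes, following the standard protocol (`get <key>` and `set <key> <flags> <exptime> <bytes>`); the `send_command` helper, the host, and port 11211 are assumptions about a local test setup and are not taken from the slides.

```python
import socket

def build_get(key):
    """Text-protocol request for one key."""
    return f"get {key}\r\n".encode("ascii")

def build_set(key, value, flags=0, exptime=0):
    """Text-protocol request that stores `value` under `key`."""
    data = value.encode("utf-8")
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"

def send_command(command, host="127.0.0.1", port=11211, timeout=2.0):
    """Send one command to an assumed local memcached endpoint, return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command)
        return conn.recv(4096)
```

With the plugin running, `send_command(build_get("dave"))` would be the memcached-side counterpart of a `SELECT` on `demo_test`; a successful reply in the text protocol has the shape `VALUE <key> <flags> <bytes>`, the data, then `END`.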

Page 21: Sql no sql

Getting it running

● Install MySQL 5.6.6

● /scripts/innodb_memcached_config.sql

● Creates test.demo_test

– Key (c1) – CHAR/VARCHAR

– Value (c2) – CHAR/VARCHAR

– Flag (c3) – 32-bit integer

– CAS (c4) – 64-bit integer

– Exp (c5) – 32-bit integer

● mysql> install plugin daemon_memcached soname "libmemcached.so";

● mysql> set session TRANSACTION ISOLATION LEVEL read uncommitted; /* ignore batches */

Page 22: Sql no sql

MySQL Cluster – NDB and/or memcached

Page 23: Sql no sql


MySQL Cluster quick review

Fault tolerant, auto sharding, shared nothing, data on redundant boxes, 99.999% up time, ACID, geographical replication between clusters, & no single point of failure

Page 24: Sql no sql

Option 1: Co-locate the memcached API with the data nodes

The applications can connect to any of the memcached API nodes – if one should fail just switch to another as it can access the exact same data instantly. As you add more data nodes you also add more memcached servers and so the data access/storage layer can scale out (until you hit the 48 data node limit).
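
The "just switch to another" behaviour lives in the client. A rough Python sketch of that failover loop, assuming each memcached API node is represented by a callable that either returns the value or raises `ConnectionError` (the node callables here are fakes for illustration):

```python
def fetch_with_failover(key, nodes):
    """Try each memcached API node in turn; all of them see the same data."""
    last_error = None
    for fetch in nodes:
        try:
            return fetch(key)
        except ConnectionError as exc:
            last_error = exc  # this node is down, move on to the next one
    raise ConnectionError("no memcached API node reachable") from last_error

def node_down(key):
    """Fake node that is unreachable."""
    raise ConnectionError("node down")

def node_up(key):
    """Fake node that answers every request."""
    return "value-for-" + key
```

This only works because every node fronts the same MySQL Cluster data; with independent memcached servers a switch to another node would look like a cache miss.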

Page 25: Sql no sql


Separate Layer

For maximum flexibility, you can have a separate Memcached layer so that the application, the Memcached API & MySQL Cluster can all be scaled independently.

Page 26: Sql no sql

Co-locate with Application

Another simple option is to co-locate the Memcached API with the application. In this way, as you add more application nodes you also get more Memcached throughput. If you need more data storage capacity you can independently scale MySQL Cluster by adding more data nodes. One nice feature of this approach is that failures are handled very simply – if one App/Memcached machine should fail, all of the other applications just continue accessing their local Memcached API.

Page 27: Sql no sql

In all of the last three examples, there has been a single source for the data (it’s all in MySQL Cluster).

● If you choose, you can still have all or some of the data cached within the memcached server (and specify whether that data should also be persisted in MySQL Cluster); you choose how to treat different pieces of your data:

– Data that is written to and read from frequently: store it just in MySQL Cluster.

– Data that is written to rarely but read very often: you might choose to cache it in memcached as well.

– Data that has a short lifetime and wouldn’t benefit from being stored in MySQL Cluster: hold it only in memcached.

The beauty is that you get to configure this on a per-key-prefix basis (through tables in MySQL Cluster) and that the application doesn’t have to care: it just uses the memcached API and relies on the software to store data in the right place(s) and to keep everything in sync.

● Of course if you want to access the same data through SQL then you’d make sure that it was configured to be stored in MySQL Cluster.
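
The per-key-prefix idea can be sketched as a lookup table plus a router. In this Python illustration the prefixes, the policy names, and the two dicts standing in for memcached and MySQL Cluster are all hypothetical; the real configuration lives in tables inside MySQL Cluster.

```python
# Hypothetical prefix-to-policy table; the empty prefix is the default.
POLICIES = {
    "session:": "cache_only",         # short-lived data, memcached only
    "catalog:": "cache_and_cluster",  # rarely written, often read
    "": "cluster_only",               # frequently written and read
}

def policy_for(key, policies=POLICIES):
    """Pick the policy of the longest prefix that matches the key."""
    matching = [prefix for prefix in policies if key.startswith(prefix)]
    return policies[max(matching, key=len)]

def store(key, value, cache, cluster, policies=POLICIES):
    """Write the value to the cache, the cluster, or both, per its policy."""
    policy = policy_for(key, policies)
    if policy in ("cache_only", "cache_and_cluster"):
        cache[key] = value
    if policy in ("cluster_only", "cache_and_cluster"):
        cluster[key] = value
    return policy
```

The application only ever calls `store`; where the data ends up is decided by the table, so changing a policy is a configuration change rather than a code change.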

Page 28: Sql no sql

OTHER OPTIONS

• There are other options for Big Data and NoSQL that are beyond the scope of this presentation ...

• Although I cannot think of anything to point to as an example :-)

Page 29: Sql no sql

Q&A

[email protected]
@Stoker

slideshare.net/davestokes/presentations
