Introduction to Azure DocumentDB

Post on 16-Apr-2017


Alexander Zyl, .NET Developer

azyl@scnsoft.com

A new child

NoSQL solutions in Azure

HBase

TS

• Column Family Store
• Key/Value Store

Redis

TS

DocumentDB

Features:

• Fully managed
• Schema agnostic
• Scalable
• Tunable consistency levels
• Tunable indexing policies
• Familiar SQL syntax for querying
• JavaScript execution

DocumentDB resource model

REST API

DocumentDB Infrastructure

DocumentDB Account
  Databases: /dbs/{id}
    Users: /users/{id}
      Permissions: /permissions/{id}
    Collections: /colls/{id}
      Documents: /docs/{id}
        Attachments: /attachments/{id}
      Stored Procedures: /sprocs/{id}
      Triggers: /triggers/{id}
      Functions: /functions/{id}
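Because every resource is addressed by a path that follows the hierarchy, a resource link can be assembled by walking from the database down to the document. The sketch below only illustrates that idea; the helper name `resource_link` and the sample ids (`fleet`, `vehicles`, `44`) are hypothetical and not part of any SDK.

```python
from typing import Optional


def resource_link(database: str,
                  collection: Optional[str] = None,
                  document: Optional[str] = None) -> str:
    """Build an id-based path by walking down the resource hierarchy."""
    parts = ["dbs", database]
    if collection is not None:
        parts += ["colls", collection]
        if document is not None:
            parts += ["docs", document]
    elif document is not None:
        # A document always lives inside a collection.
        raise ValueError("a document path needs its parent collection")
    return "/" + "/".join(parts)


print(resource_link("fleet"))
print(resource_link("fleet", "vehicles"))
print(resource_link("fleet", "vehicles", "44"))
```

The same pattern would extend to users, permissions, stored procedures, and triggers by appending their path segments under the right parent.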

Physical containers: Server A, Server B, Server C, Server Z

Logical containers: DocumentDB Account, Collections

What about cost?

Request Unit (RU)

RU = f(Memory, CPU, IO)

Performance levels

• 250 RUs per second
• 1k RUs per second
• 2.5k RUs per second
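Since throughput is reserved in RUs per second, a rough capacity estimate is the reserved level divided by the RU cost of one operation. The sketch below does that arithmetic for the three levels listed above. The per-operation costs are hypothetical placeholders; real costs depend on document size, indexing, and query shape.

```python
def max_operations_per_second(reserved_rus: int, cost_per_operation: float) -> int:
    """Whole operations that fit into one second of reserved throughput."""
    if cost_per_operation <= 0:
        raise ValueError("cost_per_operation must be positive")
    return int(reserved_rus // cost_per_operation)


PERFORMANCE_LEVELS = [250, 1000, 2500]   # RUs per second, from the slide
HYPOTHETICAL_READ_COST = 1.0             # assumed RU cost of one small read
HYPOTHETICAL_WRITE_COST = 5.0            # assumed RU cost of one small write

for rus in PERFORMANCE_LEVELS:
    reads = max_operations_per_second(rus, HYPOTHETICAL_READ_COST)
    writes = max_operations_per_second(rus, HYPOTHETICAL_WRITE_COST)
    print(f"{rus} RU/s: about {reads} reads/s or {writes} writes/s")
```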

How to model data?

{
  "Id": 44,
  "ReleaseYear": "2014",
  "Make": "Aston Martin",
  "Vin": "2G1WT58KX79250102",
  "Model": "DBS",
  "Dealer": { "Name": "Atlant-M", "Address": "Some st. 9" },
  "GPSLocation": { "Latitude": 44.6516185, "Longitude": -63.5820275 },
  "StatusData": […]
}

Vehicle: VehicleId (PK), ReleaseYear, Make, Vin, Model, Latitude, Longitude

Dealer: DealerId (PK), Name, Address

StatusData: StatusId (PK), VehicleId, EngineOn, TimeStamp

Approaches to document modeling

• Reference data
• Embed data

Modeling relations

Vehicle document:

{
  "Id": 44,
  "ReleaseYear": "2014",
  "Make": "Aston Martin",
  "Vin": "2G1WT58KX79250102",
  "Model": "DBS",
  "Dealer": { "Name": "Atlant-M", "Address": "Some st. 9" },
  "GPSLocation": { "Latitude": 44.6516185, "Longitude": -63.5820275 },
  "StatusData": [
    { "Id": 1, "TimeStamp": "2014-07-04", "EngineOn": true, "FuelLevel": 40 },
    { "Id": 2, "TimeStamp": "2014-07-04", "EngineOn": false, "FuelLevel": 33 },
    …
    { "Id": 999, "TimeStamp": "2014-08-12", "EngineOn": true, "FuelLevel": 23 }
  ]
}

Bad design

When to embed:
• One-to-few relations
• Infrequent changes
• Embedded data has bounds


When to reference:
• One-to-many relations
• Many-to-many relations
• Data changes frequently
• Unbounded reference


Vehicle document:

{
  "Id": 44,
  "ReleaseYear": "2014",
  "Make": "Aston Martin",
  "Vin": "2G1WT58KX79250102",
  "Model": "DBS",
  "Dealer": { "Name": "Atlant-M", "Address": "Some st. 9" },
  "StatusData": [
    { "Id": 1, "TimeStamp": "2014-07-04", "EngineOn": true, "FuelLevel": 40 },
    { "Id": 2, "TimeStamp": "2014-07-04", "EngineOn": false, "FuelLevel": 33 },
    { "Id": 3, "TimeStamp": "2014-07-04", "EngineOn": true, "FuelLevel": 23 },
    …
    { "Id": 999, "TimeStamp": "2014-08-12", "EngineOn": true, "FuelLevel": 23 }
  ]
}

VehicleLocation document:

{
  "VehicleId": 44,
  "GPSLocation": { "Latitude": 44.651617, "Longitude": -63.582027 }
}


Vehicle document:

{
  "Id": 44,
  "ReleaseYear": "2014",
  "Make": "Aston Martin",
  "Vin": "2G1WT58KX79250102",
  "Model": "DBS",
  "Dealer": { "Name": "Atlant-M", "Address": "Some st. 9" }
}

VehicleStatus documents:

{ "Id": 1, "TimeStamp": "2014-07-04", "EngineOn": true, "FuelLevel": 40, "VehicleId": 44 },
{ "Id": 2, "TimeStamp": "2014-07-04", "EngineOn": false, "FuelLevel": 33, "VehicleId": 44 },
{ "Id": 3, "TimeStamp": "2014-07-04", "EngineOn": true, "FuelLevel": 23, "VehicleId": 44 },
{ "Id": 4, "TimeStamp": "2014-07-05", "EngineOn": true, "FuelLevel": 10, "VehicleId": 44 },
{ "Id": 5, "TimeStamp": "2014-07-06", "EngineOn": false, "FuelLevel": 55, "VehicleId": 44 },
{ "Id": 999, "TimeStamp": "2014-08-12", "EngineOn": true, "FuelLevel": 23, "VehicleId": 44 }
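Moving from the embedded design to the referenced one is a mechanical transformation: pull the unbounded array out of the parent and stamp each element with the parent's id. The sketch below shows this on a shortened version of the vehicle document; the function name `split_status_data` is hypothetical.

```python
import copy
from typing import Dict, List, Tuple


def split_status_data(vehicle: Dict) -> Tuple[Dict, List[Dict]]:
    """Return the vehicle without StatusData, plus one status document per
    entry, each carrying a VehicleId reference back to its parent."""
    slim = copy.deepcopy(vehicle)          # leave the caller's document untouched
    statuses = slim.pop("StatusData", [])
    status_docs = [{**status, "VehicleId": slim["Id"]} for status in statuses]
    return slim, status_docs


embedded = {
    "Id": 44,
    "Make": "Aston Martin",
    "Model": "DBS",
    "StatusData": [
        {"Id": 1, "TimeStamp": "2014-07-04", "EngineOn": True, "FuelLevel": 40},
        {"Id": 2, "TimeStamp": "2014-07-04", "EngineOn": False, "FuelLevel": 33},
    ],
}

vehicle_doc, status_docs = split_status_data(embedded)
print(vehicle_doc)
for doc in status_docs:
    print(doc)
```

With this layout a new status reading becomes an insert of one small document instead of a rewrite of an ever-growing vehicle document.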

Indexing

Documents in a collection

{
  "id": 16,
  "text": "Bonjour",
  "user": { "name": "Francois", "nickname": "@franky" },
  "entities": { "hashtags": [ { "text": "#heof" } ] }
}

{
  "id": 4,
  "text": "Hello",
  "user": { "name": "Jerome", "nickname": "@juim" },
  "entities": { "hashtags": [ { "text": "#rutib", "indices": [ 10, 26 ] } ] }
}

Index tree

• id: 4, 16
• text: Hello, Bonjour
• user
  • name: Jerome, Francois
  • nickname: @juim, @franky
• entities
  • hashtags
    • 0
      • text: #rutib, #heof
      • indices: 10, 26
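One way to picture the index tree is to flatten each document into (path, value) pairs and record which documents contain each pair. The sketch below is a toy model of that idea only, not the service's actual index structure; `iter_paths` and `build_index` are hypothetical names.

```python
from collections import defaultdict


def iter_paths(node, prefix=()):
    """Yield (path, scalar) pairs; array positions become integer segments."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_paths(value, prefix + (key,))
    elif isinstance(node, list):
        for position, value in enumerate(node):
            yield from iter_paths(value, prefix + (position,))
    else:
        yield prefix, node


def build_index(documents):
    """Map every (path, value) pair to the set of document ids holding it."""
    index = defaultdict(set)
    for doc in documents:
        for path, value in iter_paths(doc):
            index[(path, value)].add(doc["id"])
    return index


docs = [
    {"id": 16, "text": "Bonjour",
     "user": {"name": "Francois", "nickname": "@franky"},
     "entities": {"hashtags": [{"text": "#heof"}]}},
    {"id": 4, "text": "Hello",
     "user": {"name": "Jerome", "nickname": "@juim"},
     "entities": {"hashtags": [{"text": "#rutib", "indices": [10, 26]}]}},
]

index = build_index(docs)
# An equality lookup is then a single dictionary access.
print(index[(("user", "name"), "Jerome")])
```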

What we can control

• Automatic indexing
• Manual include
• Manual exclude

Indexing modes:
• Consistent
• Lazy
• None

Building paths

{
  "id": 16,
  "text": "Bonjour",
  "user": { "name": "Francois", "nickname": "@franky" },
  "entities": { "hashtags": [ { "text": "#heof" } ] }
}

_.text
_.user.name
_.entities.hashtags
_.entities.hashtags[0].text

Building paths

Applicable wildcards:
• ? – single selection
• * – recursive selection

/text/?
/user/nickname/?
/entities/hashtags/[]/text/?

/user/*
/entities/*
/entities/hashtags/[]/*

Building paths: examples

{
  "id": 4,
  "text": "Hello",
  "user": { "name": "Jerome", "nickname": "@juim" },
  "entities": { "hashtags": [ { "text": "#rutib", "indices": [ 10, 26 ] } ] }
}

/text/?
/user/nickname/?
/user/*
/entities/hashtags/*
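The difference between the two wildcards can be sketched as a small matcher over document paths. This is a simplified reading of the slide, not the service's implementation: `?` is taken to match exactly the value at the prefix, `*` anything at or below the prefix, and `[]` any array position. The function names are hypothetical.

```python
def segment_matches(rule_segment, path_segment) -> bool:
    """'[]' stands for any array position; other segments must be equal."""
    if rule_segment == "[]":
        return isinstance(path_segment, int)
    return rule_segment == path_segment


def path_matches(rule: str, path: tuple) -> bool:
    """Check a document path (tuple of keys and array positions) against a
    rule such as '/user/nickname/?' or '/entities/hashtags/*'."""
    *prefix, wildcard = rule.strip("/").split("/")
    if wildcard not in ("?", "*"):
        raise ValueError("rule must end in ? or *")
    if wildcard == "?" and len(path) != len(prefix):
        return False      # single selection: only the value at the prefix
    if len(path) < len(prefix):
        return False      # path is shallower than the rule prefix
    return all(segment_matches(r, p) for r, p in zip(prefix, path))


print(path_matches("/text/?", ("text",)))
print(path_matches("/user/nickname/?", ("user", "name")))
print(path_matches("/user/*", ("user", "name")))
print(path_matches("/entities/hashtags/*", ("entities", "hashtags", 0, "indices", 1)))
```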

Applying rules

Indexing options:
• Include to index
• Exclude from index

Index kinds:
• Hash – equality queries
• Range – range + OrderBy queries
• Spatial – ST_DISTANCE, ST_WITHIN

Hash:

SELECT * FROM collection c WHERE c.prop = 'value'

Range:

SELECT * FROM collection c WHERE c.prop >= 15
ORDER BY c.prop

Spatial:

SELECT *
FROM collection c
WHERE ST_DISTANCE(c.Location, { "type": "Point", "coordinates": [-122.19, 47.36] }) < 100 * 1000

Index precision:
• For numbers: 1-8 bytes
• For strings: 1-100 bytes
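Put together, the rules above (mode, included and excluded paths, index kind, precision) form one policy document per collection. The sketch below assembles such a policy as a plain dictionary and validates precision against the ranges on the slide. The field names (`indexingMode`, `includedPaths`, `excludedPaths`, `kind`, `dataType`, `precision`) reflect my understanding of the policy shape and should be treated as an assumption to check against the official documentation; `index_spec` is a hypothetical helper.

```python
import json

# Precision bounds in bytes, taken from the slide.
PRECISION_BOUNDS = {"Number": (1, 8), "String": (1, 100)}


def index_spec(kind: str, data_type: str, precision: int) -> dict:
    """Describe one index on a path, rejecting out-of-range precision."""
    if kind not in ("Hash", "Range", "Spatial"):
        raise ValueError(f"unknown index kind: {kind}")
    low, high = PRECISION_BOUNDS[data_type]
    if not low <= precision <= high:
        raise ValueError(f"{data_type} precision must be {low}-{high} bytes")
    return {"kind": kind, "dataType": data_type, "precision": precision}


policy = {
    "indexingMode": "consistent",      # assumed spelling of the Consistent mode
    "automatic": True,
    "includedPaths": [
        {"path": "/text/?", "indexes": [index_spec("Hash", "String", 3)]},
        {"path": "/entities/hashtags/[]/indices/[]/?",
         "indexes": [index_spec("Range", "Number", 8)]},
    ],
    "excludedPaths": [{"path": "/user/*"}],
}

print(json.dumps(policy, indent=2))
```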

Data consistency

• Strong consistency
• Eventual consistency

DocumentDB

Offered consistency models:
• Strong consistency
• Eventual consistency
• Session
• Bounded staleness

Strong consistency

Strong consistency: Write operation

[Diagram: User, Gateway, Replica A, Replica B, Replica C; labels: Version1, Version2, Async, Infra]

Strong consistency: Read operation

[Diagram: User, Gateway, Replica A, Replica B, Replica C; labels: Version1, Version2, Infra]

Eventual consistency

Eventual consistency: Write operation

[Diagram: User, Gateway, Replica A, Replica B, Replica C; labels: Version1, Version2, Async, Infra]

Eventual consistency: Read operation

[Diagram: User, Gateway, Replica A, Replica B, Replica C; labels: Version1, Version2, Infra]

Session consistency

Session: Write operation

[Diagram: User, Gateway, Replica A, Replica B, Replica C; labels: Session Id, Version1, Version2, Async, Infra]

Session: Read operation

[Diagram: User A, User B, Gateway, Replica A, Replica B, Replica C; labels: Session Id, Version1, Version2, Infra]

Bounded staleness

Bounded staleness: Write operation

[Diagram: User, Gateway, Replica A, Replica B, Replica C; labels: Version1, Version2, Async, Infra]

Bounded staleness: Read operation

[Diagram: User, Gateway, Replica A, Replica B, Replica C; labels: Version1, Version2, Version3, Sync, Infra]
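The practical difference between eventual and session consistency can be shown with a toy model: a write lands on one replica, the others catch up later, and a session token tells the reader which version it must at least see. This is an illustrative simulation under those assumptions only; the class `ToyStore` and its methods are hypothetical and do not describe how DocumentDB is implemented.

```python
import random


class ToyStore:
    """Replicas hold a version number; replication is a separate, later step."""

    def __init__(self, replica_count: int = 3, seed: int = 0):
        self.replicas = [1] * replica_count      # every replica starts at Version1
        self.rng = random.Random(seed)

    def write(self) -> int:
        """Apply a write to the first replica and return its new version,
        which the writer keeps as a session token."""
        self.replicas[0] += 1
        return self.replicas[0]

    def replicate(self) -> None:
        """Asynchronous catch-up: bring all replicas to the newest version."""
        newest = max(self.replicas)
        self.replicas = [newest] * len(self.replicas)

    def read_eventual(self) -> int:
        """Any replica may answer, so the result can be stale."""
        return self.rng.choice(self.replicas)

    def read_session(self, token: int) -> int:
        """Only replicas that have reached the session token may answer."""
        candidates = [version for version in self.replicas if version >= token]
        return self.rng.choice(candidates)


store = ToyStore()
token = store.write()                                 # the writer now holds a token
print("session read:", store.read_session(token))     # never older than the token
print("eventual read:", store.read_eventual())        # may still be the old version
store.replicate()
print("eventual read after catch-up:", store.read_eventual())
```

A reader without the token (User B in the diagram) behaves like the eventual reader until replication completes.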

Scalability issues

• Out of space: Data > Collection
• Too many requests

Solutions?

• Vertical scaling
• Horizontal scaling

Collection = Partition

Partitioning our data

[Diagram: a single Collection receiving a Request, next to a Logical grouping of Partition 1 and Partition 2, each receiving its own Request]

Partitioning strategies

Hash partitioning

UserId: 14, Hash(14) => P3

[Diagram: Logical grouping of Partition 1, Partition 2, Partition 3]

Range partitioning

Name: "David", K > "David" >= A

Ranges: A-I, K-Q, R-Z

[Diagram: Logical grouping of Partition 1, Partition 2, Partition 3]

Lookup partitioning

Region: "Europe"

Region name   | Partition Id
Asia          | Partition1
Europe        | Partition2
United States | Partition3

[Diagram: Logical grouping of Partition 1, Partition 2, Partition 3]
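The three strategies differ only in how a key is turned into a partition. The sketch below puts them side by side as plain functions. Everything here is illustrative: the function names are hypothetical, the MD5-based hash is my choice of a stable hash (so it will not necessarily send UserId 14 to P3 as on the slide), and the assignment of the ranges A-I, K-Q, R-Z to Partition1..3 is an assumption.

```python
import hashlib

PARTITIONS = ["Partition1", "Partition2", "Partition3"]


def hash_partition(key, partitions=PARTITIONS) -> str:
    """Hash partitioning: a stable hash of the key picks the partition."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return partitions[int.from_bytes(digest[:4], "big") % len(partitions)]


# Range partitioning: inclusive first-letter ranges (assignment is assumed).
RANGES = [("A", "I", "Partition1"), ("K", "Q", "Partition2"), ("R", "Z", "Partition3")]


def range_partition(name: str) -> str:
    first = name[:1].upper()
    for low, high, partition in RANGES:
        if low <= first <= high:
            return partition
    raise KeyError(f"no range covers {name!r}")


# Lookup partitioning: an explicit table, copied from the slide.
LOOKUP = {"Asia": "Partition1", "Europe": "Partition2", "United States": "Partition3"}


def lookup_partition(region: str) -> str:
    return LOOKUP[region]


print(hash_partition(14))
print(range_partition("David"))
print(lookup_partition("Europe"))
```

Hash partitioning spreads load evenly but scatters range queries; range and lookup partitioning keep related keys together at the cost of possible hot partitions.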

Use Cases

• User-defined data
• Storing and analyzing logs
• Storing materialized views