www.informatik-aktuell.de
The Microsoft data platform capabilities
Transform + analyze
Visualize + decide
Capture + manage
Data
Diagram: SQL Server deployment options across On Premises, Hybrid Cloud, and Off Premises, ranging from Dedicated (Higher Cost, Higher Administration) to Shared (Lower Cost, Lower Administration), including SQL DW
Key Benefits
Reduce project overhead
Speed time to market
Secure, redundant source code
“Telenor saved 70% on test,
development and demo that could be
turned off when finished to minimize
their capital outlays,”
Marius Pedersen, Telenor Group
70% savings
Ready in hours, not weeks
No resource limits
Diagram labels: SQL Server Dev Tools; On-Premises Development Work Stations; SQL Server On-Premises; Deploy; SQL Server in a Windows Azure Virtual Machine; Test; TFS in Windows Azure
Leads to greater revenue
What is SQL Database?
A relational database-as-a-service, fully managed by Microsoft.
For cloud-designed apps when near-zero administration and enterprise-grade capabilities are key.
Perfect for organizations looking to dramatically increase the DB:IT ratio.
Comparison: SQL Server in a VM vs. Azure SQL Database (Best for…, TCO benefits, Scalability, Resources)
Service tiers
Elastic scale & performance: Six performance levels across three tiers for scaling up and down based on throughput needs. Better resource isolation. Improved billing experience.
Business continuity & data protection: A spectrum of business continuity and data protection features from light-weight to mission-critical across the tiers. Customers can dial up the control over data recovery and failover.
Familiar & Self-managed: Unprecedented efficiencies as your applications scale with a near-zero maintenance service and a variety of familiar management tools & programmatic APIs.
SQL Database – ready for business-class apps
SLA: Increased from 99.9% to 99.99% uptime SLA
Performance: New service design point enables scale up of resources, delivering predictable throughput & performance
Protection: Point-in-time restore, geo-restore, and standard and active geo-replication protect against human- and environment-initiated events
Compliance: Azure certifications: ISO, HIPAA BAA, EU Model Clause; auditing on SQL Database
Flexibility: Hourly billing & broad set of price points
Customer dimensions to consider
Pure max data size
Active portion of total data
Amount of transactional workload the app will generate
Largest amount of data that needs to live in the same transactional space (i.e. database)
SQL Database scale up limits
DTU (throughput) currently from 5 up to 1750 DTU (~1400 tx/sec*)
DB size from 2GB to 1TB per node
Basic, Standard, Premium: B, S0-S3, P1-P11, ..
https://azure.microsoft.com/de-de/pricing/details/sql-database/
Scale out options
With SQL Database Elastic Scale technology, scale out to 10s of terabytes
* Based on Azure SQL Database Benchmark estimation and specific OLTP workload configuration
Service tiers: Basic, Standard, Premium
Performance is easily scaled up or down to meet changing workload and business needs
Performance levels shown: B, S0, S1, S2, P1, P2, P11
Azure SQL Database – Elastic Scale
Library Components
1. Shard Map Management
2. Data Dependent Routing
3. Multi-Shard Queries
4. Split-Merge Service
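The first three components can be illustrated with a small toy model. This is only a sketch in plain Python, not the Elastic Scale client library: the `RangeShardMap` class, the shard names, and the customer data are all hypothetical, and real shards would be separate databases rather than in-memory dictionaries.

```python
from bisect import bisect_right


class RangeShardMap:
    """Toy range shard map (hypothetical): maps integer key ranges to shard names."""

    def __init__(self):
        self._lows = []    # sorted lower bounds of the ranges
        self._shards = []  # shard name owning each range

    def add_range(self, low, shard):
        # Keep lower bounds sorted so lookups can use binary search.
        i = bisect_right(self._lows, low)
        self._lows.insert(i, low)
        self._shards.insert(i, shard)

    def lookup(self, key):
        # The owning range is the last one whose lower bound is <= key.
        i = bisect_right(self._lows, key) - 1
        if i < 0:
            raise KeyError(f"no shard covers key {key}")
        return self._shards[i]


# Hypothetical shards, each holding rows keyed by customer id.
shards = {
    "shard0": {1: "Ann", 57: "Bob"},
    "shard1": {100: "Cy", 180: "Dee"},
}

# Shard map management: register which key range lives on which shard.
shard_map = RangeShardMap()
shard_map.add_range(0, "shard0")
shard_map.add_range(100, "shard1")


def route(key):
    """Data-dependent routing: send a single-key request to the owning shard."""
    return shards[shard_map.lookup(key)].get(key)


def multi_shard_query(predicate):
    """Multi-shard query: fan out to every shard and merge the results."""
    results = []
    for name in sorted(shards):
        rows = sorted(shards[name].items())
        results.extend(value for key, value in rows if predicate(key))
    return results
```

In this model, `route(57)` touches only `shard0`, while `multi_shard_query(lambda k: k > 50)` visits both shards. A split-merge step would correspond to moving rows between the dictionaries and updating the range boundaries in the map.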
Azure SQL Database – V12 Features
• The new Azure portal is available to create SQL Database databases and servers at version V12, with additional SQL 2016 capabilities. In the portal you specify your SQL Database and then proceed to specify a SQL Database server to host it.
• Choose a version of SQL Database server when you use the new Azure portal to create a new database. The default is V12.
• Security gains the new feature of users in contained databases. Other features are row-level security, dynamic data masking, auditing, threat detection, and transparent data encryption, although some of these are not yet at GA.
• Easier management of large databases to support heavier workloads with parallel queries (Premium only), table partitioning, online indexing, worry-free large index rebuilds with the 2GB size limit removed, and more options on the ALTER DATABASE command.
• Support for key programmability functions to drive more robust application design with CLR integration, Transact-SQL window functions, XML indexes, and change tracking for data.
• Breakthrough performance with support for in-memory columnstore index queries (Premium tier only) for data mart and smaller analytic workloads.
• Monitoring and troubleshooting are improved with visibility into over 100 new table views in an expanded set of Dynamic Management Views (DMVs). In Preview: Index Tuning Advisor, Query Performance Insight.
• New S3 performance level in the Standard tier: offers more pricing flexibility between Standard and Premium. S3 will deliver more DTUs (database throughput units) and all the features available in the Standard tier. Plus Elastic Scale for high-end OLTP transaction workloads.
Market leading price and performance
Scale-out relational or non-relational
Powered by the cloud
Scale-out relational data warehouse
Introducing Azure SQL Data Warehouse
Scale-out to petabytes of data
Massively Parallel Processing
Instant-on compute scales up/down in seconds
Query relational / Hadoop
Up and running in minutes
No hardware to acquire, maintain, or tune
Pre-tuned for optimal performance and scale
No large CapEx acquisition
Pay what you need: spin up/down compute on-demand
Low costs to migrate on-prem DW without rewriting T-SQL
Scale-out relational data warehouse
Introducing Azure SQL Data Warehouse
A relational data warehouse-as-a-service, fully managed by Microsoft.
The industry's first elastic cloud data warehouse with enterprise-grade capabilities.
Support your smallest to your largest data storage needs while handling queries up to 100x faster.
Demo Azure SQL Data Warehouse
Not only SQL vs SQL overview
SQL Server Database Engine
Azure SQL Database
Relational (SQL) Non-relational (NoSQL)
Analytical
Azure managed data service
Operational
Microsoft Analytics Platform System
Fast, predictable performance
Tunable consistency
Elastic scale
DocumentDB overview
A NoSQL document database-as-a-service, fully managed by Microsoft Azure.
For cloud-designed apps where querying over schema-free data, reliable and predictable performance, and rapid development are key.
First of its kind database service to offer native support for JavaScript, SQL query and transactions over JSON documents.
Perfect for cloud architects and developers who need an enterprise-ready NoSQL document database.
Query JSON data with no secondary indices
Native JavaScript transactional processing
Familiar SQL-based query language
Build with familiar tools – REST, JSON, JavaScript
Easy to start and fully-managed
Enterprise-grade Azure platform
A document store
Diagram: an application talks to DocumentDB, which stores JSON documents (Document 1 to Document 4) in collections. Example documents:
{ "name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014" }
{ "name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013" }
{ "docCount": 3, "last": "May 1, 2014" }
{ "name": "Eva", "country": "Germany", "age": 25 }
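To make the idea of querying schema-free documents concrete, here is a minimal sketch in plain Python over the four example documents. It does not use any DocumentDB SDK; the `query` helper is hypothetical and only mimics what a filter such as "documents whose age is greater than 40" has to do when not every document carries an `age` field.

```python
import json

# The four example documents from the slide, as raw JSON text.
raw_documents = [
    '{ "name": "John", "country": "Canada", "age": 43, "lastUse": "March 4, 2014" }',
    '{ "name": "Lou", "country": "Australia", "age": 51, "firstUse": "May 8, 2013" }',
    '{ "docCount": 3, "last": "May 1, 2014" }',
    '{ "name": "Eva", "country": "Germany", "age": 25 }',
]
documents = [json.loads(text) for text in raw_documents]


def query(docs, where):
    """Hypothetical helper: return the documents for which `where` is true."""
    return [doc for doc in docs if where(doc)]


# Documents without an "age" field simply do not match, instead of failing.
older_than_40 = query(documents, lambda doc: doc.get("age", 0) > 40)
names = [doc["name"] for doc in older_than_40]
```

The point of the sketch is that the third document, which shares no fields with the others, can live in the same collection and is skipped by the filter without any schema change.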
Value proposition over MongoDB
Capability: Advantage
Managed service: Spin up on demand with no setup and an availability guarantee of 99.95%. Smooth linear price curve without VM step functions. Integration with other managed Azure services like HDInsight and Search.
SQL query language: Leverage SQL experience and .NET LINQ
ACID transaction control through stored procedures: Simpler programming model versus using state variables
JavaScript triggers: Simple programming model for running JavaScript code as part of insert/update/delete actions
Greater consistency control: Four levels provide more options for consistency, availability, and performance requirements
Access rights down to document level: Greater control for access of all documents and attachments within collections
Open API with RESTful HTTP and standards based: Open standards protocol for accessing and managing DocumentDB databases. Uses JSON standard, so no mapping of BSON to JSON is needed
DocumentDB at Microsoft
over 425 million unique users
store 20TB of JSON document data
under 15ms writes and single digit ms reads
store for 40+ app / device combinations
available globally to serve all markets
user data store
Pricing for General Availability
Standard pricing tier with hourly billing
S1, S2 and S3 units differentiated by performance (good, better, best)
Performance levels assigned during collection (data partition) creation
Performance levels can be adjusted based on application needs
Each collection includes 10GB of SSD storage
Limit of 100 collections (1 TB) for each account – can be lifted as needed
Rich query over JSON data
No forced, pre-defined indices allow for differentiated querying
Build modern, scalable apps with robust transactional querying and data processing on JSON documents. Unlike other document database options, DocumentDB provides a full-featured NoSQL document database service with transactional processing over multiple documents using a SQL-like query grammar and native JavaScript support.
Data Lake + Data Warehouse Better Together
What happened?
What is happening?
Why did it happen?
What are key relationships?
What will happen?
What if?
How risky is it?
What should happen?
What is the best option?
How can I optimize?
Data sources
Hadoop Distributed File System (HDFS) For The Cloud
Built from the ground-up as native HDFS
Integrated w/ HDInsight, Hortonworks, Cloudera
Accessible to all HDFS compliant projects (Spark, Storm, Flume, Sqoop, Kafka, R, etc.)
Unlimited Storage, Petabyte Files
Unlimited account sizes
Individual file sizes from GBs to PBs
Immediate read/write access
Optimized for Massive Throughput
Built for running large analytic systems that require massive throughput
Automatically optimize for any throughput
Optimized for parallel computation over PBs of data
Manage and secure your data assets
Monitor performance, receive alerts, and audit usage
Azure Active Directory integration for identity and access management over all of your data
Deployed in minutes
Deployed with no hardware to install or tune
No hardware acquisition or maintenance costs
Up and running in a few clicks (and within minutes)
Scale out to any amount of data on demand
Microsoft’s cloud Hadoop-as-a-Service offering
De-coupled Compute and Storage
100% open source Apache Hadoop – HDP
Fully supported by Microsoft
Built on the latest releases across Hadoop (2.6)
Up and running in minutes with no hardware to deploy
Harness existing .NET and Java skills
Utilize familiar BI tools for analysis including Microsoft Excel
https://spark.apache.org
A unified, open source, parallel data processing framework for Big Data Analytics
Diagram: a Driver Program (SparkContext) connects through a Cluster Manager to Worker Nodes, which read from HDFS
Diagram: transformations turn an RDD into another RDD; actions turn an RDD into a Value
Note: Obviously does not apply to persistent RDDs.
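The difference between transformations and actions can be shown with a toy model. This is not the Spark API: the `LazyDataset` class below is a hypothetical, single-machine stand-in for an RDD, written in plain Python so it runs without a cluster. It only illustrates that transformations record work while actions trigger it.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical): transformations are recorded, not run."""

    def __init__(self, source, ops=()):
        self._source = source
        self._ops = tuple(ops)  # recorded steps, applied only when an action runs

    # Transformations: return a new dataset, compute nothing.
    def map(self, fn):
        return LazyDataset(self._source, self._ops + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._source, self._ops + (("filter", pred),))

    def _run(self):
        # Replay every recorded step over the source data.
        for item in self._source:
            keep = True
            for kind, fn in self._ops:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                yield item

    # Actions: produce a value, which forces the recorded steps to run.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


calls = []  # records every element the map function actually processes


def square(x):
    calls.append(x)
    return x * x


numbers = LazyDataset(range(1, 6))
evens_squared = numbers.map(square).filter(lambda x: x % 2 == 0)
calls_before_action = list(calls)  # still empty: nothing has been computed yet
result = evens_squared.collect()   # the action triggers the computation
```

In this toy, every action replays the recorded steps from the source again; keeping a computed result around to avoid that replay is the role persistence plays for RDDs.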
Demo HDInsight Spark Overview
End-to-End Architecture Overview
Stages: Data Source, Collect, Process, Consume, Deliver
Event Inputs: Event Hub, Azure Blob
Reference Data: Azure Blob, …
Azure Stream Analytics: Temporal Semantics, Guaranteed delivery, Guaranteed up time
Transform: Temporal joins, Filter, Aggregates, Projections, Windows, Etc.
Enrich
Correlate
Upcoming: Call ML models
Outputs: SQL Database, Blob Storage, Event Hub, Power BI, Table Storage, Service Bus Queue, Service Bus Topic
Azure Storage
Easily implement temporal functions
Tumbling Windows: Repeating, non-overlapping, fixed interval windows
Hopping Windows: Generic window, overlapping, fixed size
Sliding Windows: Slides by an epsilon and produces output at the occurrence of an event
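The first two window types can be sketched in a few lines of plain Python. This is an illustration of the windowing idea, not the Stream Analytics query language: the function names are hypothetical, timestamps are plain integers, windows are assumed to start at time 0, and each window is taken as the half-open interval [start, start + size), which may differ from the boundary convention the service uses. Sliding windows are omitted.

```python
from collections import defaultdict


def hopping_window_counts(timestamps, size, hop):
    """Count events per hopping window; returns {window_start: count}.

    A window starts every `hop` time units and covers [start, start + size),
    so with hop < size each event falls into several overlapping windows.
    """
    counts = defaultdict(int)
    for t in timestamps:
        first = max(0, (t - size) // hop + 1)  # earliest window still covering t
        last = t // hop                        # latest window that has started by t
        for k in range(first, last + 1):
            counts[k * hop] += 1
    return dict(sorted(counts.items()))


def tumbling_window_counts(timestamps, size):
    """Tumbling windows are hopping windows whose hop equals their size."""
    return hopping_window_counts(timestamps, size, hop=size)


# Hypothetical event timestamps (for example, seconds since stream start).
events = [1, 4, 7, 12, 18]

tumbling = tumbling_window_counts(events, size=10)
hopping = hopping_window_counts(events, size=10, hop=5)
```

With these inputs the tumbling result assigns each event to exactly one window, while the hopping result counts the event at time 7 in both the window starting at 0 and the one starting at 5.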
Scaling Functions: WITH, PARTITION BY
Date and Time Functions: DATENAME, DATEPART, DAY, MONTH, YEAR, DATETIMEFROMPARTS, DATEDIFF, DATEADD
Windowing Extensions: Tumbling Window, Hopping Window, Sliding Window
Aggregate Functions: SUM, COUNT, AVG, MIN, MAX, STDEV, STDEVP, VAR, VARP, CollectTop
String Functions: LEN, CONCAT, CHARINDEX, SUBSTRING, PATINDEX, LOWER, UPPER
Analytic Functions: ISFIRST, LAG
Conversion Functions: CAST
Multi-Tenant Service | No | Yes | No
Deployment Model | IaaS | PaaS | PaaS*
Extensibility | Medium | Low | High
Deployment Complexity | Medium | Low | Low*
Cost | Medium | Low | Med
Open Source Support | No | No | Yes
Programmability | .NET / LINQ | SQL* | SparkSQL, Scala, Python, Java…
Power BI Integration | REST API | Yes, Native | Yes, Native