Date post: | 19-Dec-2015 |
Upload: | nigel-atkins |
Spark the future.
May 4 – 8, 2015, Chicago, IL
Microsoft Analytics Platform System Overview
Matt Usher, Senior Program Manager
VISUALIZE + DECIDE
Mobile
Reports
Natural language query
Dashboards
Applications
Streaming
CAPTURE + MANAGE
Relational
Internal & external
Non-relational NoSQL
TRANSFORM + ANALYZE
Orchestration
Machine learning
Modeling
Information management
Complex event processing
Data
The Microsoft data platform
Cloud first
Speed
Agility
Proven
Feedback
Data sources
OLTP ERP CRM LOB
ETL
Data warehouse
BI and analytics
1. Increasing data volumes
2. Real-time data
3. New data sources and types: non-relational data (devices, web, sensors, social)
4. Cloud-born data
The traditional data warehouse
DATA MANAGEMENT AND PROCESSING
Data sources
OLTP ERP CRM LOB
Non-Relational Data
Devices
Web Sensors
Social
Microsoft’s modern data warehouse
Appliances  Cloud  Box software
Hadoop
Appliances  Cloud  Box software
Relational Data Warehouse
DATA ENRICHMENT AND FEDERATED QUERY
BI & ANALYTICS
Self-service  Collaboration  Corporate  Predictive  Mobile
Extract, transform, load
Single query model
Data quality
Master data management
Data Platform
Azure SQL Data Warehouse
Analytics Platform System
Azure HDInsight
SQL Server 2016
Keep legacy investment
Buy new tier-one hardware appliance
Acquire Big Data solution
Acquire business intelligence
Roadblocks to evolving to a modern data warehouse
Limited scalability and ability to handle new data types
Significant training and data silos
High acquisition and migration costs
Complex with low adoption
Prebuilt & performance-tuned appliance
Linear scale-out to petabytes of data
MPP design & in-memory columnstore for significant speed improvement
Dedicated region for Hadoop
Joining relational & non-relational data with PolyBase
Analytics Platform System
Microsoft’s Big Data Appliance
MPP SQL Server
Hadoop
PolyBase
Next-generation performance at scale
Enterprise-ready Big Data
Engineered for optimal value
Microsoft Analytics Platform System
The turnkey modern data warehouse appliance
High performance and tuned within the appliance
End-user authentication with Active Directory
Accessible insights for everyone with Microsoft BI tools
Managed and monitored using System Center
100-percent Apache Hadoop
SQL Server Parallel Data Warehouse
Hadoop
PolyBase
APS delivers enterprise-ready Hadoop with HDInsight
Manageable, secured, and highly available Hadoop integrated into the appliance
Move HDFS into the warehouse before analysis
HDFS (Hadoop)
ETL
Warehouse
HDFS (Hadoop)
Learn new skills
T-SQL
Build  Integrate  Manage  Maintain  Support
Hadoop alone is not the answer to all Big Data challenges
Steep learning curve, slow and inefficient
Hadoop ecosystem
New data sources
Devices
Web Sensor Social
“New” data sources
New data sources
Devices
Web Sensor Social
Big Data insights for anyone
New insights with familiar tools through native Microsoft BI integration
Minimizes IT intervention for discovering data with tools such as Microsoft Excel
Enables DBA and power users to join relational and Hadoop data with T-SQL
Offers Hadoop tools like MapReduce, Hive, and Pig for data scientists
Takes advantage of high adoption of Excel, Power View, PowerPivot, and SQL Server Analysis Services
Power users
Data scientist
Everyone else using Microsoft BI tools
Provides a single T-SQL query model for PDW and Hadoop with rich features of T-SQL, including joins without ETL
Uses the power of MPP to enhance query execution performance
Supports Windows Azure HDInsight to enable new hybrid cloud scenarios
Provides the ability to query non-Microsoft Hadoop distributions, such as Hortonworks and Cloudera
SQL Server Parallel Data Warehouse
Microsoft Azure
HDInsight
PolyBase
Hadoop
Hortonworks for Windows and Linux
Cloudera
Connecting islands of data with PolyBase
Bringing Hadoop point solutions and the data warehouse together for users and IT
Result set
Select…
Office 365
Unifying all of your data assets, cloud and on-premises
Data Management Gateway for APS
Tier 1 enterprise data hub
Query on-premises data via Power BI
Advanced analytics with Power BI combined with APS PolyBase
Improved resiliency, high availability, and management
Intranet
Power BI site reports
Azure HDInsight
APS
Data Management Gateway
Other On Premise Data Sources
Azure
AzureDB
Data Management Gateway service
Azure Machine Learning
Azure Stream Analytics
New for AU3: Connecting APS data with Power BI for Office 365 users
HDInsight
PDW
Oracle  Hadoop  SQL Server
OData Feeds
OData Feeds
Customer Scenario
Young, High Growth Company
Consumer facing devices
Near real-time data
Relational & Non-Relational
How to handle this growth?
[Chart: yearly values for 2013 through 2018; vertical axis from 2,000 to 16,000]
Build a Modern Data Warehouse
Cloud-born data
Expected 15x Growth in 5 Years
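To put the "15x in 5 years" figure in yearly terms, the short sketch below converts it to an equivalent annual rate. It assumes growth compounds evenly each year, which the slide does not state; the function name and the normalized starting value of 1.0 are illustrative only.

```python
def annual_growth_rate(total_multiple: float, years: int) -> float:
    """Constant yearly growth rate that compounds to total_multiple over `years`.

    Assumption (not from the slide): growth compounds evenly every year.
    """
    return total_multiple ** (1 / years) - 1


rate = annual_growth_rate(15, 5)

# Relative size at the end of each year, starting from 1.0 in year 0.
projection = [(1 + rate) ** year for year in range(6)]

print(f"Equivalent annual growth: {rate:.1%}")
for year, size in enumerate(projection):
    print(f"year {year}: {size:.2f}x")
```

Under that assumption, the data volume grows by roughly 72 percent each year, which is the kind of pace that makes incremental scale-out attractive compared with repeated hardware replacement.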
Customer was headed down 2 paths
Azure: Scale-out
On-Premises: Scale Up
What does the customer value?
Minimize Cost
Control over:
• Performance
• Reliability
Scalability
Min Admin Effort
Our Approach
Business problem
Financial Modeling
Collaboration Business value
Worked with Local Microsoft Team to Determine Pricing
Determine Success Criteria
Introduced 3rd Option
Analytics Platform System
Azure
APS
Best Option based on
Financial & Functional
Requirements
1  2  3
Based on expected growth – we ran the numbers...
SQL Scale-up
MDM application
Power BI
Power Pivot data discovery
Parallel data warehouse
Device files
Customer Architecture
Source systems  Delivery channels
Denote future phases
Transaction data
Master data MDM
Hub
Staging server
Staging
Data warehouse
Analysis server
Azure blob storage
Windows Azure ML + HDInsight
Big data
POC
Phase 3
Cube
SharePoint Extranet
Phase 2
Power View reports and dashboards
SSRS standard reports
Excel services
In Summary
Three viable options considered
Worked with customer to determine success criteria
Worked with customer’s strategic SI to steward process
Weighed financial & functional requirements
The Modern Data Warehouse
Performance limitations and scale with a traditional data warehouse
Diminishing scale as requirements grow
Scale up Rowstore
Sub-optimal performance for many data warehouse queries
Data
[Figure: querying data by row; rows R1–R6 with columns C1–C4 stored row by row across Page 1, Page 2, and Page 3]
Forklift
Forklift
Scale out Multiple nodes with dedicated CPU, memory, and storage
Ability to incrementally add hardware for near-linear scale to multiple petabytes
Ability to handle query complexity and concurrency at scale
No “forklift” of prior warehouse to increase capacity
Ability to scale out HDInsight and PDW
Scaling out your data to petabytes
Scale-out technologies in the Analytics Platform System
[Figure: scale-out from 0 terabytes to 6 petabytes by adding PDW / HDInsight units]
Blazing-fast performance
MPP and In-Memory Columnstore for next-generation performance
• Store data in columnar format for massive compression
• Load data into or out of memory for next-generation performance with up to 60% improvement in data loading speed
• Updateable and clustered for real-time trickle loading
Up to 100x faster queries
Updateable clustered columnstore vs. table with customary indexing
Up to 15x more compression
[Figure: columnstore index representation, with columns C1–C6 each stored separately]
Parallel query execution
Query
Results
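As a rough illustration of why a columnar layout helps warehouse-style queries, the toy sketch below stores the same small table by row and by column, then sums one column under each layout and run-length encodes another. The table, the function names, and the use of run-length encoding are assumptions made for illustration; this is not how SQL Server's columnstore is actually implemented.

```python
from itertools import groupby

# Hypothetical fact table: (region, product, quantity).
rows = [
    ("East", "A", 10),
    ("East", "A", 12),
    ("East", "B", 7),
    ("West", "A", 3),
    ("West", "B", 9),
    ("West", "B", 4),
]


def to_columns(table):
    """Pivot row tuples into one list per column."""
    return [list(column) for column in zip(*table)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def sum_quantity_rowstore(table):
    """Row layout: every field of every row is read to sum one column."""
    fields_read = 0
    total = 0
    for row in table:
        fields_read += len(row)
        total += row[2]
    return total, fields_read


def sum_quantity_columnstore(columns):
    """Column layout: only the quantity column is read."""
    quantity = columns[2]
    return sum(quantity), len(quantity)


columns = to_columns(rows)
region_rle = run_length_encode(columns[0])
row_total, row_fields = sum_quantity_rowstore(rows)
col_total, col_fields = sum_quantity_columnstore(columns)

print("rowstore fields read:", row_fields)
print("columnstore fields read:", col_fields)
print("region column, run-length encoded:", region_rle)
```

Both layouts return the same total, but the column layout touches a third of the fields here, and the repetitive region column shrinks to two pairs. Real gains depend on the data and the engine.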
Moving from SQL Server to APS
Why?
Distributes data and query processing across multiple servers – eliminates bottlenecks inherent in SMP architecture
Integrated storage is more cost effective than most SAN systems – lowest DW appliance $/TB in the industry
Linear scale out to 6 petabytes of usable storage for DW
Significantly less performance tuning than an SMP solution
Seamless RDBMS and Hadoop integration with PolyBase
APS
Compute Node
Compute Node
Compute Node
Compute Node
Compute Node
Compute Node
Management Node
Control Node
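The idea of distributing data and query processing across compute nodes can be sketched in a few lines. The example below hash-distributes rows on a key, lets each node aggregate its own slice, and has a control step combine the partial results. The node count, the CRC32-based placement, and all names are hypothetical and chosen only for illustration; they do not describe the appliance's internal distribution scheme.

```python
import zlib
from collections import defaultdict

NUM_COMPUTE_NODES = 4  # hypothetical node count for this sketch


def node_for(key, node_count=NUM_COMPUTE_NODES):
    """Pick a node from a stable hash of the distribution key."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(table, node_count=NUM_COMPUTE_NODES):
    """Place each (customer, amount) row on the node its key hashes to."""
    nodes = [[] for _ in range(node_count)]
    for customer, amount in table:
        nodes[node_for(customer, node_count)].append((customer, amount))
    return nodes


def partial_sums(node_rows):
    """Work done independently on one node: sum amounts per customer."""
    totals = defaultdict(int)
    for customer, amount in node_rows:
        totals[customer] += amount
    return dict(totals)


def combine(partials):
    """Control step: merge the per-node results into one result set."""
    result = {}
    for partial in partials:
        for customer, amount in partial.items():
            result[customer] = result.get(customer, 0) + amount
    return result


sales = [("alice", 5), ("bob", 7), ("alice", 3), ("carol", 2), ("bob", 1)]
nodes = distribute(sales)
totals = combine(partial_sums(node_rows) for node_rows in nodes)
print(totals)
```

Because all rows for a given key land on the same node, each node can aggregate without talking to the others, which is the property that lets this style of processing avoid a single shared bottleneck.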
Lowest price per TB for data warehouse appliance with APS
High performance using commodity hardware
Significantly lower price per terabyte than the closest competitor
[Chart: price per terabyte for leading vendors (Oracle, EMC, IBM, Teradata, Microsoft); vertical axis in thousands of dollars, $0 to $30]
Price per terabyte for user-available storage (compressed)
NOTE: Orange line indicates average price per terabyte.
Lower storage costs with Windows Server 2012 Storage Spaces
Reduce energy costs and usage
Reduce tuning efforts while retaining high performance
Simplify management with System Center Management Pack
Reduce the data center footprint
Value through a single flexible appliance solution
Why Analytics Platform System when I have SQL Server?
Accelerate time to value and insights without a forklift for scaling out
SQL Server PDW
Hadoop
PolyBase
Hardware and software engineered together
The ease of an appliance
Co-engineered with HP, Dell, and Quanta best practices
Leading performance with commodity hardware
Pre-configured, built, and tuned software and hardware
Integrated support plan with a single Microsoft contact
SQL PDW
Hadoop
PolyBase
FDR Infiniband
2 x ProCurve 2810/48G switches
Base Scale Unit
Active Scale Unit
Active Scale Unit
Active Scale Unit
Active Scale Unit
Active scale unit
• 2 active server blocks
• 1 storage block (1, 2, or 3 terabytes)
Base scale unit (minimum order)
• 1 rack block
• 2 active server blocks
• 2 passive server blocks
• 1 storage block
• PDU
HP ConvergedSystem 300 for Microsoft Analytics Platform
Active server with control node and management node functionality
Two servers and one optional passive spare server available
HP racks  Base  Active  Compute  Capacity increase  Spare  Total  Raw disk: 1 TB  Raw disk: 3 TB  Capacity (TB)
1/4       1     0       2        N/A                1      4      15.1            45.3            53–227
1/2       1     1       4        100%               1      6      30.2            90.6            106–453
3/4       1     2       6        50%                1      8      45.3            135.9           159–680
1         1     3       8        33%                1      10     60.4            181.2           211–906
1 1/4     2     3       10       25%                2      13     75.5            226.5           264–1133
1 1/2     2     4       12       20%                2      15     90.6            271.8           317–1359
2         2     6       16       33%                2      19     120.8           362.4           423–1812
2 1/2     3     7       20       25%                3      24     151             453             529–2265
3         3     9       24       20%                3      28     181.2           543.6           634–2718
4         4     12      32       33%                4      37     241.6           724.8           846–3624
5         5     15      40       25%                5      46     302             906             1057–4530
6         6     18      48       20%                6      55     362.4           1087.2          1268–5436
7         7     21      56       17%                7      64     422.8           1268.4          1480–6342
PDW scale from a quarter rack to multiple racks
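The derived columns in the HP configuration table follow a simple pattern, sketched below. The formulas are inferred from the table's own figures rather than stated on the slide: two compute servers per scale unit, one extra server for the control and management role, and 15.1 TB or 45.3 TB of raw disk per scale unit depending on drive size. All names are illustrative.

```python
# Raw disk per scale unit, taken from the quarter-rack row of the table.
RAW_TB_PER_SCALE_UNIT = {"1 TB drives": 15.1, "3 TB drives": 45.3}


def configuration(base_units, active_units, spare_servers):
    """Derive the server counts and raw disk for one table row.

    Inferred rules (assumptions, not stated on the slide):
    - each base or active scale unit contributes 2 compute servers
    - one additional server hosts the control and management nodes
    """
    scale_units = base_units + active_units
    compute = 2 * scale_units
    return {
        "compute": compute,
        "total": compute + spare_servers + 1,
        "raw_1tb": round(scale_units * RAW_TB_PER_SCALE_UNIT["1 TB drives"], 1),
        "raw_3tb": round(scale_units * RAW_TB_PER_SCALE_UNIT["3 TB drives"], 1),
    }


def capacity_increase(previous_compute, compute):
    """Percentage growth in compute servers over the previous row."""
    return round(100 * (compute - previous_compute) / previous_compute)


quarter_rack = configuration(base_units=1, active_units=0, spare_servers=1)
seven_racks = configuration(base_units=7, active_units=21, spare_servers=7)

print("quarter rack:", quarter_rack)
print("seven racks:", seven_racks)
print("6 -> 7 racks increase:", capacity_increase(48, 56), "%")
```

Checking a row this way is a quick sanity test when sizing an appliance: the quarter-rack and seven-rack rows both reproduce the table's compute, total, and raw-disk values under these assumed rules.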
Microsoft Analytics Platform System
No-compromise modern data warehouse solution
Meeting today’s Big Data analytics requirements
The modern data warehouse
Enterprise-ready Hadoop with HDInsight and the simplicity of PolyBase
Enterprise-ready Big Data
Optimized performance with MPP technology and In-Memory Columnstore
Performance at scale
Providing value with a low TCO
Optimal value
Visit Myignite at http://myignite.microsoft.com or download and use the Ignite Mobile App with the QR code above.
Please evaluate this session
Your feedback is important to us!
Learn more at
http://www.microsoft.com/APS
© 2015 Microsoft Corporation. All rights reserved.