Webinar - Introduction to Azure Data Lake

Date post: 13-Apr-2017
Upload: josh-lane
Transcript
Page 1: Webinar - Introduction to Azure Data Lake

Consulting/Training

Josh Lane
Principal Architect, Wintellect

https://github.com/jplane/data-lake-webinar

An Introduction to Azure Data Lake

Page 2: Webinar - Introduction to Azure Data Lake

whois Josh-Lane

Principal Architect at Wintellect
  Consulting, training, content development

Almost 20 years as software architect and developer
  Focused primarily on .NET, Node.js, and cloud

Microsoft Azure MVP
Azure-in-the-ATL meetup
[email protected]
@jplane

Page 3: Webinar - Introduction to Azure Data Lake

About Wintellect

consulting
Wintellect helps you build better software, faster, tackling the tough projects and solving the software and technology questions that help you transform your business.
  Architecture, Analysis and Design
  Full lifecycle software development
  Debugging and Performance tuning
  Database design and development

training
Wintellect's courses are written and taught by some of the biggest and most respected names in the Microsoft programming industry. Learn from the best.
  Access the same training Microsoft’s developers enjoy
  Real world knowledge and solutions on both current and cutting edge technologies
  Flexibility in training options – onsite, virtual, on demand

who we are
Wintellect is the only company that offers the combined value of world class consulting services along with onsite, virtual and on-demand developer training. We help companies build better software, faster, helping you maximize and protect your consulting and training investments through ongoing knowledge transfer.

Page 4: Webinar - Introduction to Azure Data Lake

What is a “data lake”?

“A single store of all data… ranging from raw data (which implies exact copy of source system data) to transformed data which is used for various forms including reporting, visualization, analytics and machine learning”

Page 5: Webinar - Introduction to Azure Data Lake

3 Pillars of Azure Data Lake

ADLS

ADLA

HDInsight

Query

Visualization

Page 6: Webinar - Introduction to Azure Data Lake

What is Azure Data Lake?

Comprehensive, cloud-based big data storage and analytics platform

Purpose-built from real-world experiences
  Office 365, Skype, Bing, etc.

Leverage existing skills and technologies

Benefits of an Azure-hosted service
  Elastic, dynamically provisioned compute resources for varying query needs
  Infinite storage capacity
  Focus on extracting meaning from data, not on infrastructure

Page 7: Webinar - Introduction to Azure Data Lake

Data Lake Store

HDFS-as-a-service

Durable, redundant storage

A variety of data scenarios
  Unlimited capacity
  High-volume + low-latency (IoT, etc.)
  High throughput (massively parallel query)

Store data in its native format
  structured, semi-structured, unstructured storage formats

Page 8: Webinar - Introduction to Azure Data Lake

Data Lake Store – Importing Data

Page 9: Webinar - Introduction to Azure Data Lake

HDInsight

Managed, cloud-scale Apache Hadoop-as-a-service

Full complement of Apache technologies
  Spark, Storm, HBase, etc.

Focus on queries and data, not infrastructure

Pay for only what you need and use

Leverage existing skills and toolchains
  Hive, Pig, Sqoop, R, etc.

Page 10: Webinar - Introduction to Azure Data Lake

Data Lake Analytics

Low-barrier alternative (or complement) to HDInsight and Hadoop ecosystem

Scales dynamically to match data size and query complexity

Built on Apache YARN

Unit of interaction is an analytics job
  Elastic infrastructure management is abstracted away

U-SQL… query language rooted in SQL and C#

Page 11: Webinar - Introduction to Azure Data Lake

U-SQL

Based on SQL and C#
  C# expressions and types
  Tables, views, window functions, etc.
  User-defined functions/operators/aggregators in C#

Typical job
  1. Read data from named file/table/federated source
  2. Transform rowset in an ordered pipeline
  3. Output rowset to named table or file
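
The three-step shape of a typical job (read, transform in order, output) can be mimicked in plain Python. This is only an analogy and is not U-SQL; the function names, the column names (region, duration), and the in-memory CSV source are all hypothetical and do not come from the webinar.

```python
import csv
import io


def extract(text):
    """Step 1: read a rowset from a named source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Step 2: transform the rowset in an ordered pipeline of stages."""
    # Stage 1: keep only rows with a positive duration.
    filtered = [r for r in rows if int(r["duration"]) > 0]
    # Stage 2: sum duration per region.
    totals = {}
    for r in filtered:
        totals[r["region"]] = totals.get(r["region"], 0) + int(r["duration"])
    # Stage 3: emit one row per region, sorted by region name.
    return [{"region": k, "total_duration": v} for k, v in sorted(totals.items())]


def output(rows):
    """Step 3: write the rowset to a named destination (here, CSV text)."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["region", "total_duration"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample input standing in for a file in the store.
SAMPLE = "region,duration\nus,10\neu,5\nus,0\neu,7\n"

if __name__ == "__main__":
    print(output(transform(extract(SAMPLE))))
```

In a real Data Lake Analytics job, each of these steps would be a U-SQL statement and the service would decide how to parallelize the work; the sketch only shows the order of operations.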

Page 12: Webinar - Introduction to Azure Data Lake

Data Lake Analytics – U-SQL, Federated Queries, Power BI integration

Page 13: Webinar - Introduction to Azure Data Lake

Azure Ecosystem Integration

Azure Data Lake

Federated SQL

Data Import

Visualization

Data Discovery

Security

Page 14: Webinar - Introduction to Azure Data Lake

Pricing

Data Lake Store
  $0.04 per GB per month for storage
  $0.07 per 1 million transactions
  50% preview discount

Data Lake Analytics
  $0.017 per “Analytics Unit” per minute
  $0.025 per completed job
  50% preview discount

HDInsight - https://azure.microsoft.com/en-us/pricing/details/hdinsight/
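
As a rough worked example of the arithmetic, the sketch below estimates costs from the rates listed on this slide. The rates are the preview-era figures from the slide, not current pricing. The slide does not say whether the listed prices already include the 50% preview discount, so the discount is left as an optional parameter; the function names and the sample workload are hypothetical.

```python
# Rates as listed on the slide (USD). Preview-era figures, not current pricing.
STORE_PER_GB_MONTH = 0.04
STORE_PER_MILLION_TXN = 0.07
ANALYTICS_PER_AU_MINUTE = 0.017
ANALYTICS_PER_JOB = 0.025


def store_cost(gb_stored, transactions, discount=0.0):
    """Monthly Data Lake Store cost: storage plus transactions."""
    base = (
        gb_stored * STORE_PER_GB_MONTH
        + (transactions / 1_000_000) * STORE_PER_MILLION_TXN
    )
    return base * (1 - discount)


def analytics_cost(jobs, analytics_units, minutes_per_job, discount=0.0):
    """Data Lake Analytics cost: AU-minutes consumed plus a per-job charge."""
    au_minutes = jobs * analytics_units * minutes_per_job
    base = au_minutes * ANALYTICS_PER_AU_MINUTE + jobs * ANALYTICS_PER_JOB
    return base * (1 - discount)


if __name__ == "__main__":
    # Hypothetical workload: 1,000 GB stored, 5 million transactions,
    # and 100 jobs that each run 10 Analytics Units for 5 minutes.
    print(f"Store:     ${store_cost(1000, 5_000_000):.2f}")
    print(f"Analytics: ${analytics_cost(100, 10, 5):.2f}")
    # Same workload, assuming the 50% discount applies on top of the listed rates.
    print(f"Store (discounted):     ${store_cost(1000, 5_000_000, 0.5):.2f}")
    print(f"Analytics (discounted): ${analytics_cost(100, 10, 5, 0.5):.2f}")
```

For the sample workload, storage comes to 1,000 × $0.04 + 5 × $0.07 = $40.35 per month, and analytics comes to 5,000 AU-minutes × $0.017 + 100 × $0.025 = $87.50, before any discount.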

Page 15: Webinar - Introduction to Azure Data Lake

References

https://azure.microsoft.com/en-us/services/data-lake-analytics/
https://azure.microsoft.com/en-us/services/data-lake-store/
https://azure.microsoft.com/en-us/services/hdinsight/
http://usql.io/
http://azure.github.io/AzureDataLake/

Page 16: Webinar - Introduction to Azure Data Lake

Thank You!
