Azure Data Lake Customer Deckazurebootcampdk.azurewebsites.net/presentations/DataLake... ·...

Date post: 28-May-2020
Azure Data Lake How to organize Jan Cordtz, Microsoft Denmark [email protected] Cloud Solution Architect
Page 1: Azure Data Lake Customer Deckazurebootcampdk.azurewebsites.net/presentations/DataLake... · 2018-06-18 · Azure Active Directory Multi-Factor Authentication Automation Portal Key

Azure Data Lake How to organize

Jan Cordtz, Microsoft Denmark

[email protected]

Cloud Solution Architect

Page 2

Azure Search
Hybrid Cloud
Backup
StorSimple
Azure Site Recovery
Import/Export
Azure AD Health Monitoring
AD Privileged Identity Management
Operational Analytics
Domain Services
SQL Database
DocumentDB
Redis Cache
Storage Tables
SQL Data Warehouse
SQL Server Stretch Database
Visual Studio
Application Insights
VS Team Services
Xamarin
HockeyApp
Mobile Engagement
Cognitive Services
Bot Framework
Cortana
Security & Management
Azure Active Directory
Multi-Factor Authentication
Automation
Portal
Key Vault
Store/Marketplace
VM Image Gallery & VM Depot
Azure AD B2C
Scheduler
Security Center
Web Apps
Mobile Apps
API Apps
Notification Hubs
Cloud Services
Service Fabric
Functions
Batch
RemoteApp
Container Service
VM Scale Sets
BizTalk Services
Service Bus
Logic Apps
API Management
Content Delivery Network
Media Services
Media Analytics
HDInsight/Databricks
Machine Learning
Stream Analytics
Data Factory
Event Hubs
Data Lake Analytics Service
IoT Hub
Data Catalog
Power BI Embedded
Data Lake Store
Data Center
Infrastructure as a Service
Platform as a Service

Page 3

Trusted

Certification categories shown on the slide: Global, Industry, Regional

HIPAA / HITECH Act
FISC Japan
CDSA
Shared Assessments
FACT UK
PCI DSS Level 1
MPAA
ENISA IAF
Japan CS Mark Gold
Japan My Number Act
Spain ENS
Canada Privacy Laws
Privacy Shield
India MeitY
Germany IT Grundschutz workbook
Spain DPA
CSA STAR Self-Assessment
SOC 2 Type 2
SOC 3
CSA STAR Certification
CSA STAR Attestation
FERPA
GxP 21 CFR Part 11
GLBA
MARS-E
FFIEC
HITRUST
IG Toolkit UK
Singapore MTCS
UK G-Cloud
Australia IRAP/CCSL
New Zealand GCIO
China GB 18030
EU Model Clauses
Argentina PDPA
China TRUCS
China DJCP
ISO 27001
SOC 1 Type 2
ISO 27018
ISO 22301
ISO 27017

More certifications than any cloud provider

Page 4

Data: has always been there, but is growing "rapidly"

Analytics: has been there for a long time (BI), but is getting much more advanced (Machine Learning/AI)

Cloud: the "new" kid on the block

• Unlimited compute/storage

• Fast deployment

• Pay-as-you-go

• Many services

Page 5

Open and hybrid

Page 6
Page 7

Business needs

Mode 1

- Data warehouse

- Reporting

Self-service

- Dashboard

- Business Intelligence

Mode 2

- IoT

- Machine Learning

- Analytics

- Governance

- Organize

- Common understanding of data

- Trial: Error/Proceed

- Hot/Cold path

- No specific technology

- Flexible economy

- Hybrid

Central platform

”Data Lake” / ”Data Bank”

Page 8

Built on Open Standards

• Built on YARN

• Data Lake Store lets all HDFS-compliant analytics applications, such as Hortonworks, Cloudera, and MapR, connect to it

• Microsoft HDInsight is 100% Apache Hadoop

• Microsoft continues to contribute tens of thousands of lines of code and engineering hours to open source

Diagram labels: U-SQL, Analytics Service, HDInsight, YARN, HDFS, Store

Page 9

A databank

SQL

Cube

Archive storage

Data Ingestion

Operational System A

Operational System B

Operational System X

DW

DataMart

Machine Learning

Data Ingestion

External Data

Cosmos DB

Databricks

Dynamic Sizeable PaaS Economics

Fixed Sizeable IaaS Economics

Whatever Apps

Data Lake

Page 10

Data storage

Data Lake

Blob

DB

Rest

Files

Hortonworks*

Cloudera*

MapR*

HDInsight

SQL NoSQL

Microsoft SQL (General)

MySQL (LAMP/PHP)

PostgreSQL (GIS)

Graph (Tinkerpop)

Documents (MongoDB)

Column-Value (Cassandra)

Key-Value (Table)

Databricks

Machine Learning Studio

Data Science Virtual Machine

Azure Machine Learning services

SQL DB

SQL DW

Cube

Page 11

Storage – from a functionality point of view

File storage

Database

Cube

Data Lake

Functionality and cost

ETL: Data Factory

Page 12

Principles regarding the organization

The organization of the data lake should:

• Be very simple to use for an end user or application (= flat file/CSV file)

• Be as cost-effective as is sensible and possible

• Not compromise security

• Fit well into a DevOps scenario

• Provide "automatic" meta-tagging

• Have a well-defined path for the information needed to support an effective auditing and logging process
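
One way to read the "automatic" meta-tagging and well-defined path points together is to let the folder path itself carry the metadata. The sketch below is only an illustration under assumptions: the path layout (zone/area/source/yyyy/mm/dd/file) and the names LakeFile, build_path, and tags_from_path are hypothetical and do not come from the deck.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePosixPath


@dataclass(frozen=True)
class LakeFile:
    """Hypothetical description of one file stored in the lake."""
    zone: str         # e.g. "landing", "work", "publish" (assumed names)
    area: str         # business area folder (assumed)
    source: str       # originating system (assumed)
    load_date: date   # date the file was ingested
    name: str         # file name, e.g. "orders.csv"


def build_path(f: LakeFile) -> str:
    """Build the folder path under the assumed zone/area/source/yyyy/mm/dd layout."""
    d = f.load_date
    return str(PurePosixPath("/", f.zone, f.area, f.source,
                             f"{d:%Y}", f"{d:%m}", f"{d:%d}", f.name))


def tags_from_path(path: str) -> dict:
    """Derive metadata tags from the path alone, so tagging needs no manual step."""
    parts = PurePosixPath(path).parts  # ('/', zone, area, source, yyyy, mm, dd, name)
    if len(parts) != 8:
        raise ValueError(f"path does not follow the assumed layout: {path}")
    _, zone, area, source, yyyy, mm, dd, name = parts
    return {
        "zone": zone,
        "area": area,
        "source": source,
        "load_date": f"{yyyy}{mm}{dd}",
        "format": PurePosixPath(name).suffix.lstrip("."),
    }
```

Because every tag can be recomputed from the path, an audit or logging job only needs the path of a file to know where it came from and when it was loaded.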

Page 13

Organizing the Azure Data Lake

Azure Data Lake zones and access:

• Landing Zone: Landing Zone System Account(s) read/write; Work System Account(s) read

• Work: Work System Account(s) read/write

• Publish A, Publish B, ……… Publish X (folder per "area"): Users in Groups read only, except Work System Account(s) read/write

• Analytics ("All data"): Users in Groups read only, except Work System Account(s) read/write

Flow labels on the diagram: Data Ingestion, Copy, Transform, Transform & Anonymize, Archive, Data Catalog
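
The access rules on this slide can be written down as a small permission matrix. The following is a minimal sketch under assumptions, not the deck's implementation: the zone keys, the principal names (landing_system, work_system, user_group), and the is_allowed helper are hypothetical, and a real data lake would enforce this through its own access control lists rather than in application code.

```python
from enum import Flag


class Access(Flag):
    READ = 1
    WRITE = 2


READ_WRITE = Access.READ | Access.WRITE

# Zone -> principal -> granted access, transcribed from the slide.
# Principal names are illustrative placeholders.
ZONE_ACCESS = {
    "landing":   {"landing_system": READ_WRITE, "work_system": Access.READ},
    "work":      {"work_system": READ_WRITE},
    "publish":   {"user_group": Access.READ, "work_system": READ_WRITE},
    "analytics": {"user_group": Access.READ, "work_system": READ_WRITE},
}


def is_allowed(zone: str, principal: str, needed: Access) -> bool:
    """Return True if the principal holds every access bit in `needed` for the zone."""
    granted = ZONE_ACCESS.get(zone, {}).get(principal, Access(0))
    return (granted & needed) == needed
```

For example, is_allowed("publish", "user_group", Access.WRITE) is False, which matches "Users in Groups read only" on the slide.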

Page 14

Data Ingestion

• "Gatekeeper": "Are you allowed to enter?"

• Validation: "Is the content you are coming with in accordance with what we have agreed?"

• Standardization: items like date formats (yyyymmdd) and number formats (,. or .,)

Push/Pull

Hot/Cold Path

Examples:

• SSIS, Event Hub, Data Factory…….

• Database, FTP, File Storage…….

• Firewall, AD control…….
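
The validation and standardization steps can be illustrated with a few small functions. This is a sketch under assumptions only: the accepted input date formats, the function names, and the idea of passing the agreed field names as a set are all hypothetical. Only the target date format (yyyymmdd) and the two number conventions (",." and ".,") come from the slide.

```python
from datetime import datetime
from decimal import Decimal

# Input formats the gatekeeper is assumed to accept; adjust to what was agreed.
DATE_INPUT_FORMATS = ("%Y-%m-%d", "%d-%m-%Y", "%d/%m/%Y", "%Y%m%d")


def validate_record(record: dict, agreed_fields: set) -> None:
    """Validation: reject a record whose fields differ from the agreed ones."""
    if set(record) != agreed_fields:
        raise ValueError(f"fields {sorted(record)} do not match the agreement")


def standardize_date(text: str) -> str:
    """Standardization: convert a date in any accepted format to yyyymmdd."""
    for fmt in DATE_INPUT_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y%m%d")
        except ValueError:
            continue  # try the next accepted format
    raise ValueError(f"unrecognized date: {text!r}")


def standardize_number(text: str, decimal_sep: str) -> Decimal:
    """Standardization: parse '1.234,56' (decimal_sep=',') or '1,234.56' (decimal_sep='.')."""
    if decimal_sep not in (",", "."):
        raise ValueError("decimal_sep must be ',' or '.'")
    thousands_sep = "." if decimal_sep == "," else ","
    cleaned = text.strip().replace(thousands_sep, "").replace(decimal_sep, ".")
    return Decimal(cleaned)
```

The decimal separator is passed in explicitly because a value such as "1,234" cannot be interpreted reliably without knowing the sender's convention, which is part of what has to be agreed up front.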

Page 15

Azure Data Lake and DevOps

Azure Data Lake zones and access:

• Landing Zone

• Work

• Publish A, Publish B, ……… Publish X (folder per "area"): Users in Groups read only, except Work System Account(s) read/write

• Analytics ("All data"): Users in Groups read only, except Work System Account(s) read/write

Flow labels on the diagram: Data Ingestion, Copy, Transform (twice), Anonymize, Data Catalog

Page 16

Thank you

