Date post: 29-Jun-2020
Page 1: files.ondemand.cloudera.com...Storing Relational and Key-Value Data: Amazon RDS and DynamoDB Chapter 11. Course Chapters Introduction An Overview of the Cloud with Cloudera Getting

Cloud Fundamentals

191213

Copyright © 2010–2020 Cloudera. All rights reserved. Not to be reproduced or shared without prior written consent from Cloudera.


Chapter 1: Introduction


Course Chapters

▪ Introduction
▪ An Overview of the Cloud with Cloudera
▪ Getting Started with the Cloud
▪ Estimating, Managing, and Monitoring Costs
▪ Understanding Cloud Security: Amazon Web Services
▪ Regions and Availability Zones
▪ Networking
▪ Computing Power in AWS
▪ Protecting Your Infrastructure: Security Groups & Network ACLs
▪ Storing Files and Objects: Instance Store, EBS and S3
▪ Storing Relational and Key-Value Data: Amazon RDS and DynamoDB
▪ Migrating Data to the Cloud
▪ Modeling Infrastructure Using AWS CloudFormation


Trademark Information

▪ The names and logos of Apache products mentioned in Cloudera training courses, including those listed below, are trademarks of the Apache Software Foundation

Apache Accumulo   Apache Hive     Apache Pig
Apache Avro       Apache Impala   Apache Ranger
Apache Ambari     Apache Kafka    Apache Sentry
Apache Atlas      Apache Knox     Apache Solr
Apache Bigtop     Apache Kudu     Apache Spark
Apache Crunch     Apache Lucene   Apache Sqoop
Apache Druid      Apache Mahout   Apache Storm
Apache Flink      Apache NiFi     Apache Tez
Apache Flume      Apache Oozie    Apache Tika
Apache Hadoop     Apache ORC      Apache Zeppelin
Apache HBase      Apache Parquet  Apache ZooKeeper
Apache HCatalog   Apache Phoenix

▪ All other product names, logos, and brands cited herein are the property of their respective owners


Chapter Topics

Introduction

▪ About This Course

▪ Introductions

▪ About Cloudera

▪ About Cloudera Educational Services

▪ Course Logistics


Course Objectives

During this course, you will learn

▪ The advantages of deploying infrastructure as a service in the cloud

▪ How to estimate and optimize the cost of running services in the cloud

▪ How to secure cloud resources

▪ How to create and manage a network in the cloud

▪ How to deploy, modify, and delete new resources in the cloud

▪ How to deploy and manage compute resources

▪ How to store data in the cloud using object stores and databases

▪ How to create and work with cloud managed services

▪ How to deploy infrastructure programmatically


Introductions

▪ About your instructor

▪ About you
  ─ Currently, what do you do at your workplace?
  ─ What is your experience with database technologies, programming, and query languages?
  ─ How much experience do you have with UNIX or Linux?
  ─ What is your experience with big data?
  ─ What do you expect to gain from this course? What would you like to be able to do at the end that you cannot do now?


About Cloudera

 

THE ENTERPRISE DATA CLOUD COMPANY

▪ Cloudera (founded 2008) and Hortonworks (founded 2011) merged in 2019

▪ The new Cloudera improves on the best of both companies
  ─ Introduced the world’s first Enterprise Data Cloud
  ─ Delivers a comprehensive platform for any data from the Edge to AI
  ─ Leads in training, certification, support, and consulting for data professionals
  ─ Remains committed to open source and open standards


Cloudera Data Platform

A suite of products to collect, curate, report, serve, and predict

▪ Cloud native or bare metal deployment

▪ Powered by open source

▪ Analytics from the Edge to AI

▪ Unified data control plane

▪ Shared Data Experience (SDX)


Cloudera Shared Data Experience (SDX)

▪ Full data lifecycle: Manages your data from ingestion to actionable insights

▪ Unified security: Protects sensitive data with consistent controls

▪ Consistent governance: Enables safe self-service access


Self-Serve Experiences for Cloud Form Factors

▪ Services customized for specific steps in the data lifecycle
  ─ Emphasize productivity and ease of use
  ─ Auto-scale compute resources to match changing demands
  ─ Isolate compute resources to maintain workload performance

 


Cloudera DataFlow

▪ Data-in-motion platform

▪ Reduces data integration development time

▪ Manages and secures your data from edge to enterprise


Cloudera Machine Learning

▪ Cloud-native enterprise machine learning
  ─ Fast, easy, and secure self-service data science in enterprise environments
  ─ Direct access to a secure cluster running Spark and other tools
  ─ Isolated environments for running Python, R, and Scala code
  ─ Teams, version control, collaboration, and project sharing


Cloudera Data Hub

Customize your own experience in cloud form factors

▪ Integrated suite of analytic engines

▪ Cloudera SDX applies consistent security and governance

▪ Fueled by open source innovation


Cloudera Educational Services

▪ We offer a variety of ways to take our courses
  ─ Instructor-led, both in physical and virtual classrooms
    ─ Private and customized courses also available
  ─ Self-paced, through Cloudera OnDemand

▪ Courses for all kinds of data professionals
  ─ Executives and managers
  ─ Data scientists and machine learning specialists
  ─ Data analysts
  ─ Developers and data engineers
  ─ System administrators
  ─ Security professionals


Cloudera Education Catalog

▪ A broad portfolio across multiple platforms─ Not all courses shown here─ See our website for the complete catalog

[Figure: course catalog grid organized by role (Administrator, Data Analyst, Developer & Data Engineer, Data Scientist), showing for each course the supported platforms (CDH, HDP, CDP, CDF) and delivery formats (Private Class, Public Class, OnDemand)]


Cloudera OnDemand

▪ Our OnDemand catalog includes
  ─ Courses for developers, data analysts, administrators, and data scientists, updated regularly
  ─ Exclusive OnDemand-only courses, such as those covering security and Cloudera Data Science Workbench
  ─ Free courses such as Essentials and Cloudera Director available to all with or without an OnDemand account

▪ Features include
  ─ Video lectures and demonstrations with searchable transcripts
  ─ Hands-on exercises through a browser-based virtual environment
  ─ Discussion forums monitored by Cloudera course instructors
  ─ Searchable content within and across courses

▪ Purchase access to a library of courses or individual courses

▪ See the Cloudera OnDemand information page for more details or to make a purchase, or go directly to the OnDemand Course Catalog


Accessing Cloudera OnDemand

▪ Cloudera OnDemand subscribers can access their courses online through a web browser

▪ Cloudera OnDemand is also available through an iOS app
  ─ Search for “Cloudera OnDemand” in the iOS App Store


Cloudera Certification

▪ The leader in Apache Hadoop-based certification

▪ Cloudera certification exams favor hands-on, performance-based problems that require execution of a set of real-world tasks against a live, working cluster

▪ We offer two levels of certifications
  ─ Cloudera Certified Associate (CCA)
    ─ CCA Spark and Hadoop Developer
    ─ CCA Data Analyst
    ─ CCA CDH Administrator and CCA HDP Administrator
  ─ Cloudera Certified Professional (CCP)
    ─ CCP Data Engineer


Logistics

▪ Class start and finish time

▪ Lunch

▪ Breaks

▪ Restrooms

▪ Wi-Fi access

▪ Virtual machines

Your instructor will give you details on how to access the course materials for the class


Chapter 2: An Overview of the Cloud with Cloudera


Chapter Topics

An Overview of the Cloud with Cloudera

▪ Cloud Fundamentals

▪ Evolution from the Data Center to the Cloud

▪ Amazon Web Services (AWS)

▪ Essential Points


Cloud Fundamentals Objectives

In this training, you will learn

▪ Fundamentals of cloud computing
  ─ Key concepts of Amazon Web Services (AWS)
  ─ Prerequisites to work with Cloudera products and services

▪ Step-by-step demonstrations and exercises

▪ History of Amazon Web Services


The Evolution to Big Data

▪ The need to organize data
  ─ Analog era

▪ Digital era
  ─ Spreadsheets, databases, and even bigger databases

▪ Information explosion

▪ Big Data era
  ─ Data beyond a manageable size
    ─ Single computing device
    ─ Parallel computing

▪ Multiple machines in a data center


Big Data Era in the Corporate Data Center

▪ Required
  ─ Large number of machines working in parallel
  ─ Sizeable data repositories

▪ Potential drawbacks
  ─ Large upfront capital expense
  ─ Requires planning and approval
  ─ May be over- or under-utilized
  ─ Virtualization added flexibility

▪ Other options are available


The Cloud

▪ What is cloud computing?
  ─ Someone else’s computers
  ─ In charge of the infrastructure
  ─ Offered as services

▪ Different modalities
  ─ Infrastructure-as-a-Service (IaaS)
    ─ Amazon Web Services
  ─ Platform-as-a-Service (PaaS)
    ─ Heroku or OpenShift
  ─ Software-as-a-Service (SaaS)
    ─ Cloudera CDP


On-premises and Cloud Offerings

▪ Resource administration

▪ On-premises
  ─ You manage everything

Layer             IaaS  PaaS  SaaS
Applications      You   You   AWS
Data              You   You   AWS
Operating system  You   AWS   AWS
Virtualization    AWS   AWS   AWS
Servers           AWS   AWS   AWS
Storage           AWS   AWS   AWS
Networking        AWS   AWS   AWS
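
The responsibility table can also be read as a simple lookup. The following is a minimal sketch that transcribes the table into Python; the names `LAYERS`, `RESPONSIBILITY`, and `managed_by_you` are made up for this illustration, and the "On-premises" row is added from the bullet stating that you manage everything.

```python
# Layers of the stack, in the same order as the table rows.
LAYERS = [
    "Applications", "Data", "Operating system",
    "Virtualization", "Servers", "Storage", "Networking",
]

# Who manages each layer under each model ("AWS" stands for the provider).
RESPONSIBILITY = {
    "On-premises": ["You"] * len(LAYERS),
    "IaaS": ["You", "You", "You", "AWS", "AWS", "AWS", "AWS"],
    "PaaS": ["You", "You", "AWS", "AWS", "AWS", "AWS", "AWS"],
    "SaaS": ["AWS", "AWS", "AWS", "AWS", "AWS", "AWS", "AWS"],
}


def managed_by_you(model):
    """Return the layers the customer still manages under the given model."""
    owners = RESPONSIBILITY[model]
    return [layer for layer, owner in zip(LAYERS, owners) if owner == "You"]


for model in RESPONSIBILITY:
    print(model, "->", managed_by_you(model))
```

Moving from on-premises toward SaaS, the list returned by `managed_by_you` shrinks, which is the point the table makes.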


Advantages of Cloud Computing (1)

▪ Flexible environment
  ─ Adapts to your needs

▪ Wide range of services available in AWS

▪ Pay-as-you-go approach
  ─ Cost savings
  ─ Operating expense
    ─ Not a capital expense


Advantages of Cloud Computing (2)

▪ Near-infinite scalability
  ─ Test and develop with a subset of resources
    ─ Grow as needed
  ─ Cost is more commonly the limit

▪ Worldwide availability

▪ Focus on your applications and clusters
  ─ Not infrastructure


Comparing Corporate Data Center and Cloud

▪ Analogy: Using a car

Modality  Similar to   Details
Owning    Data center  Requires large up-front investment
Renting   Cloud        On-demand; pay only for what you use
Leasing   Cloud        Longer-term commitment; take advantage of discounts


Amazon Web Services (AWS)

▪ Prelude started in early 2000s
  ─ Work began on merchant.com
  ─ Intended for use by other retailers

▪ Realization
  ─ Better service decoupling needed
  ─ Vision paper published in 2003

▪ Officially launched in 2006
  ─ Storage was the first service
    ─ Simple Storage Service (S3)


Amazon Web Services (AWS)

▪ Largest cloud provider by market share
  ─ Competitors catching up

▪ Provides infrastructure as a service
  ─ Compute, storage, networking, databases, security, and more

▪ Set up and run clusters
  ─ In the cloud


Essential Points

▪ Amazon Web Services (AWS)
  ─ First commercial cloud
  ─ Largest by market share

▪ Concepts required
  ─ Cloudera products and services
  ─ General introduction to the cloud

▪ Infrastructure
  ─ Required for processing large amounts of data
  ─ Data center
    ─ Potential drawbacks

▪ Cloud provides an alternative
  ─ Flexible, cost-effective, and near-infinitely scalable


Chapter 3: Getting Started with the Cloud


Chapter Topics

Getting Started with the Cloud

▪ Getting Started with AWS

▪ AWS Management Console

▪ AWS Account and Resource Identifiers

▪ AWS Command Line Interface (CLI)

▪ Hands-On Exercise: Accessing the Amazon Cloud

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ How to create an AWS account

▪ How to access the AWS management console

▪ How to obtain the unique AWS identifiers for your account

▪ How to work with the AWS API via the command line


Getting Started with AWS

▪ Begin by creating a primary AWS account
  ─ Also known as a root account

▪ This account has full administrative privileges
  ─ Access all services
  ─ Account and administrative tasks
  ─ Create resources and additional accounts


Creating the Primary Account

▪ Create an AWS account
  ─ https://aws.amazon.com/

▪ Click Create an AWS Account

▪ Provide account information
  ─ Email
  ─ Password
  ─ Account name


Account Information

▪ Contact information

▪ Phone number
  ─ Reachable immediately

▪ Payment method
  ─ Credit card, debit card, EFT, ACH, or SEPA (Europe)

▪ Support plan
  ─ Free or paid

▪ Activate the account


AWS Management Console

▪ Web application
  ─ Manage AWS services
  ─ Account information
    ─ Billing

▪ Starting point
  ─ Work with resources
  ─ Account-related tasks


Management Console Home Screen

▪ Menu bar
  ─ Customizable
  ─ Links of interest

▪ Search area
  ─ Find services

▪ List of all services
  ─ Divided by type

▪ Additional resources
  ─ Simple wizards and automated workflows


A Service from the Management Console

▪ Elastic Compute Cloud (EC2) as example
  ─ Virtual machines in the cloud

▪ Left menu
  ─ Navigation pane

▪ Resources in use
  ─ Instances in the current region

▪ Launch instance button

▪ Health information

▪ Additional information

Copyright © 2010–2020 Cloudera. All rights reserved. Not to be reproduced or shared without prior written consent from Cloudera. 03-12

Page 54: files.ondemand.cloudera.com...Storing Relational and Key-Value Data: Amazon RDS and DynamoDB Chapter 11. Course Chapters Introduction An Overview of the Cloud with Cloudera Getting

A Service from the Management Console


Chapter Topics

Getting Started with the Cloud

▪ Getting Started with AWS

▪ AWS Management Console

▪ AWS Account and Resource Identifiers

▪ AWS Command Line Interface (CLI)

▪ Hands-On Exercise: Accessing the Amazon Cloud

▪ Essential Points


Account Identifiers

▪ Each account has two unique identifiers (IDs)
─ Account ID
  ─ Used in Amazon Resource Names (ARNs)
─ Canonical User ID
  ─ Used by storage services (S3)
▪ Available from
─ My Account
─ My Security Credentials


Amazon Resource Names (ARN)

▪ Uniquely identify resources across AWS

▪ A sample ARN format

arn:partition:service:region:account-id:resource-id

▪ Components and values vary by service
─ Paths allowed

▪ The ARN for a storage resource

arn:aws:s3:::cloudfundamentals-dl
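
The ARN layout above can be taken apart with a few lines of Python. This is a minimal sketch rather than an AWS library call: the `Arn` tuple and the `parse_arn` function are hypothetical names, and the code assumes the six colon-separated fields shown in the sample format, with any further colons belonging to the resource part.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    """Fields of an ARN, in the order shown in the sample format."""
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split arn:partition:service:region:account-id:resource-id into fields."""
    # maxsplit=5 keeps any colons inside the resource part intact.
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return Arn(*parts[1:])


bucket = parse_arn("arn:aws:s3:::cloudfundamentals-dl")
print(bucket)
```

In the S3 sample the region and account ID fields are empty, which is why the ARN contains three consecutive colons.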


Account ID


Canonical User ID


Chapter Topics

Getting Started with the Cloud

▪ Getting Started with AWS

▪ AWS Management Console

▪ AWS Account and Resource Identifiers

▪ AWS Command Line Interface (CLI)

▪ Hands-On Exercise: Accessing the Amazon Cloud

▪ Essential Points


AWS Command Line Interface (CLI)

▪ Unified tool
─ Manage all AWS services
─ Command-line
▪ Text commands
─ Ability to script
─ Share and version

▪ Requires credentials


AWS Command Line Interface (CLI)

▪ Available on
─ Unix-like shells
  ─ Linux, macOS, or Unix
─ Windows
  ─ CMD and PowerShell
─ Remotely
  ─ PuTTY, SSH, or AWS Systems Manager
▪ Written in Python
▪ Open-source
─ https://github.com/aws/aws-cli


Command Line Interface (CLI)

▪ Install the CLI

▪ Execute and provide the necessary parameters

$ ./aws
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help


Chapter Topics

Getting Started with the Cloud

▪ Getting Started with AWS

▪ AWS Management Console

▪ AWS Account and Resource Identifiers

▪ AWS Command Line Interface (CLI)

▪ Hands-On Exercise: Accessing the Amazon Cloud

▪ Essential Points


Hands-On Exercise: Accessing the Amazon Cloud

▪ In this exercise, you will create an account in AWS, access the management console, and explore a service

▪ Please refer to the Hands-On Exercise Manual for instructions


Chapter Topics

Getting Started with the Cloud

▪ Getting Started with AWS

▪ AWS Management Console

▪ AWS Account and Resource Identifiers

▪ AWS Command Line Interface (CLI)

▪ Hands-On Exercise: Accessing the Amazon Cloud

▪ Essential Points


Essential Points

▪ Start by creating an account
─ Provide the requested information
─ Activation required
▪ Access via the management console
─ Web application
─ Access to all services
▪ Individual screens for services
▪ Account ID and canonical user ID

▪ Amazon Resource Name (ARN)

▪ Command-line (CLI)


Estimating, Managing, and Monitoring Costs
Chapter 4


Course Chapters

▪ Introduction
▪ An Overview of the Cloud with Cloudera
▪ Getting Started with the Cloud
▪ Estimating, Managing, and Monitoring Costs
▪ Understanding Cloud Security: Amazon Web Services
▪ Regions and Availability Zones
▪ Networking
▪ Computing Power in AWS
▪ Protecting Your Infrastructure: Security Groups & Network ACLs
▪ Storing Files and Objects: Instance Store, EBS and S3
▪ Storing Relational and Key-Value Data: Amazon RDS and DynamoDB
▪ Migrating Data to the Cloud
▪ Modeling Infrastructure Using AWS CloudFormation


Chapter Topics

Estimating, Managing, and Monitoring Costs

▪ Cloud Economics: Understanding Costs

▪ Estimating Cost

▪ Controlling and Viewing Costs

▪ Hands-On Exercise: Estimating and Viewing Costs

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ How infrastructure-pricing works in the cloud

▪ How to estimate cost of infrastructure

▪ How to manage infrastructure cost

▪ How to monitor cost


Cloud Economics: Understanding and Managing Cost

▪ Cloud changes how cost works
─ Pay only for what you use
  ─ Operating expense
─ Scale up and down as needed
▪ Costs in the cloud
─ Computing, storage, bandwidth, and managed services
─ Some services are free
▪ Important
─ Estimate, monitor, and control cost


Computing Costs

▪ Compute power in the cloud (virtual machines)
─ Amazon Elastic Compute Cloud (EC2)
─ Tends to be a high cost

▪ Pay for running instances─ Resources

─ CPU, memory, and storage─ Different types of instances

─ Turn off when not in use─ Pay for storage


Amazon EC2 Payment Options

▪ Different pricing schemes
─ Free tier to try
─ Requirements and commitment
  ─ Opportunity for cost savings

▪ Instance pricing

On-demand: Pay for use, with no long-term commitment
Reserved: Significant discount, requires upfront payment and commitment
Spot: Deeper discount on spare capacity, but not guaranteed
Dedicated: Physical EC2 server dedicated for your use


AWS Storage Costs

▪ Types of storage
─ Attached to instances (virtual machines)
  ─ Instance store and block-level storage
─ Object storage
▪ Charges vary
─ Pay for what you use
  ─ Tiered, with discounts for larger utilization
─ Provisioned capacity
─ Storage medium
  ─ Magnetic (HDD) or solid state (SSD)


Costs of Various Storage Types

Type: How it is paid
Instance store: Local storage at no extra cost; not available for all machines; potential drawbacks
Block store: Billed by gigabyte-month (GB-month); provisioned capacity
Object store: Pay for what you use; tiered pricing; location; access frequency; stand-alone
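
Billing by gigabyte-month on provisioned capacity can be illustrated with a little arithmetic. The sketch below is an assumption-laden example, not an AWS pricing formula: the function `gb_month_cost` is hypothetical, the rate is a made-up placeholder rather than a real price, and it assumes the charge is prorated by the fraction of a 720-hour month that the volume exists.

```python
def gb_month_cost(provisioned_gb: float,
                  rate_per_gb_month: float,
                  hours_provisioned: float,
                  hours_in_month: float = 720.0) -> float:
    """Cost of provisioned block storage, prorated over the month."""
    # The charge depends on the provisioned size, not on how much data is stored.
    fraction_of_month = hours_provisioned / hours_in_month
    return provisioned_gb * rate_per_gb_month * fraction_of_month


# Placeholder rate for illustration only; look up the real price per region.
HYPOTHETICAL_RATE = 0.10

full_month = gb_month_cost(100, HYPOTHETICAL_RATE, hours_provisioned=720)
half_month = gb_month_cost(100, HYPOTHETICAL_RATE, hours_provisioned=360)
print(f"full month: {full_month:.2f}, half month: {half_month:.2f}")
```

The point of the example is the contrast with object storage: a provisioned volume costs the same whether it is empty or full, while an object store charges for what is actually stored.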


Bandwidth Costs

▪ Free
─ Inbound
─ Between virtual machines
  ─ Within the same geographical area (availability zone)
▪ Associated cost
─ Outbound data
─ Between AWS geographic regions


Managed Services Costs

▪ Pay for a managed service
─ Infrastructure
─ Software
▪ License
─ Bring your own license (BYOL)
─ Included in the cost
▪ Examples
─ Relational Database Service (RDS)
─ DynamoDB


Other Costs

▪ Many different types of costs
─ VPN connections
─ Metrics
─ Requests
─ Queries

▪ High granularity

▪ Inspect costs of each service carefully


Chapter Topics

Estimating, Managing, and Monitoring Costs

▪ Cloud Economics: Understanding Costs

▪ Estimating Cost

▪ Controlling and Viewing Costs

▪ Hands-On Exercise: Estimating and Viewing Costs

▪ Essential Points


Estimating Cost

▪ Unlimited budget?
─ Not a good practice
▪ Estimate costs upfront
─ Based on your known requirements
  ─ Projections
─ Fees may vary based on usage
▪ Tools available from AWS to estimate cost
─ Simple Monthly Calculator
  ─ Legacy
─ Pricing Calculator
▪ Estimate, save, export, and share


AWS Simple Monthly Calculator

▪ Select one service
─ Configuration
─ Commitment
  ─ Not available for all services

▪ Add all other services

▪ Get an estimate


AWS Simple Monthly Calculator


AWS Pricing Calculator

▪ Estimate prices in two ways
─ Quick estimate
─ Advanced estimate
▪ Steps to follow
─ Select and configure each service
─ Add other services
─ Get an estimated cost of the infrastructure in the cloud


AWS Pricing Calculator


Chapter Topics

Estimating, Managing, and Monitoring Costs

▪ Cloud Economics: Understanding Costs

▪ Estimating Cost

▪ Controlling and Viewing Costs

▪ Hands-On Exercise: Estimating and Viewing Costs

▪ Essential Points


Controlling and Viewing Cost

▪ Cost management tools
─ Explore cost
─ Set budgets
─ Create alarms
─ Reports
─ Account billing-related


AWS Cost Explorer

▪ View, understand, and manage your costs
─ Time intervals
─ Filter and drill down
▪ Using the Cost Explorer tool is free
─ Charge associated with programmatic calls (API)
▪ Reports
─ Save and share

▪ Recommendations


AWS Cost Explorer


Budgets

▪ Monitor cost

▪ Create custom budgets
─ Alerts
─ Notified when exceeded
▪ Types of budgets
─ Cost
─ Usage
─ Reservation
─ Savings plan


Budgets


Chapter Topics

Estimating, Managing, and Monitoring Costs

▪ Cloud Economics: Understanding Costs

▪ Estimating Cost

▪ Controlling and Viewing Costs

▪ Hands-On Exercise: Estimating and Viewing Costs

▪ Essential Points


Hands-On Exercise: Estimating and Viewing Costs

▪ In this exercise, you will estimate the cost of infrastructure in the cloud, view cost, and create a budget

▪ Please refer to the Hands-On Exercise Manual for instructions


Chapter Topics

Estimating, Managing, and Monitoring Costs

▪ Cloud Economics: Understanding Costs

▪ Estimating Cost

▪ Controlling and Viewing Costs

▪ Hands-On Exercise: Estimating and Viewing Costs

▪ Essential Points


Essential Points

▪ Cost in the cloud
─ Operating expense
▪ Pay for
─ Computing, storage, and bandwidth
─ Managed services
▪ Estimate
─ Simple monthly calculator
─ Pricing calculator
▪ Monitor and control cost
─ Cost explorer
─ Budgets


Understanding Cloud Security: Amazon Web Services
Chapter 5


Course Chapters

▪ Introduction
▪ An Overview of the Cloud with Cloudera
▪ Getting Started with the Cloud
▪ Estimating, Managing, and Monitoring Costs
▪ Understanding Cloud Security: Amazon Web Services
▪ Regions and Availability Zones
▪ Networking
▪ Computing Power in AWS
▪ Protecting Your Infrastructure: Security Groups & Network ACLs
▪ Storing Files and Objects: Instance Store, EBS and S3
▪ Storing Relational and Key-Value Data: Amazon RDS and DynamoDB
▪ Migrating Data to the Cloud
▪ Modeling Infrastructure Using AWS CloudFormation


Chapter Topics

Understanding Cloud Security: Amazon Web Services

▪ Security in the Cloud

▪ Security Credentials

▪ Permissions and Policies

▪ Identity and Access Management (IAM)

▪ AWS Access Keys

▪ Amazon EC2 Key Pairs

▪ AWS Key Management Service (KMS)

▪ Hands-On Exercise: Securing the Amazon Cloud

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ The steps required to secure your environment in the cloud

▪ Which actions your security credentials can perform

▪ How to provide permissions using policies

▪ How to use identity and access management (IAM) to provide limited permissions

▪ How to provide programmatic access using access keys

▪ How to create key-pairs to access virtual machines


Security in the Cloud

▪ Cloud data security is a top concern
─ Highest priority at AWS
─ High priority at your company too
▪ Shared responsibility model
─ AWS protects the infrastructure
  ─ Security of the cloud
─ Customer security responsibilities
  ─ Security in the cloud


Security Topics

▪ Security Credentials

▪ Permissions and Policies

▪ Identity and Access Management (IAM)
─ Users, groups, and roles

▪ Access Keys

▪ Key Pairs


Chapter Topics

Understanding Cloud Security: Amazon Web Services

▪ Security in the Cloud

▪ Security Credentials

▪ Permissions and Policies

▪ Identity and Access Management (IAM)

▪ AWS Access Keys

▪ Amazon EC2 Key Pairs

▪ AWS Key Management Service (KMS)

▪ Hands-On Exercise: Securing the Amazon Cloud

▪ Essential Points


Security Credentials

▪ Proof of identity

▪ Who you are
─ Authentication
▪ Whether you have permission
─ Authorization
▪ Some actions do not require security credentials
─ Public access


Credential Types

▪ Email and password
─ Access via management console
─ Registration
  ─ Root user credentials
▪ Access keys
─ Programmatic access
─ Command-line


Root Credentials

▪ Required for all AWS accounts

▪ Provide full access
─ Do not share them
─ Access cannot be limited
▪ Tasks that require root credentials
─ Change support plan
─ Billing and cost management
─ Restoring user permissions
─ Close account
─ Other configuration settings


Root Credentials


Root Credentials Recommendations

▪ Do not use for everyday tasks

▪ Enable multi-factor authentication (MFA)
─ Virtual device, U2F security key, hardware device, or SMS text message
▪ Delete root access keys
─ Disable
▪ Recommendation
─ Use identity and access management (IAM) instead


Chapter Topics

Understanding Cloud Security: Amazon Web Services

▪ Security in the Cloud

▪ Security Credentials

▪ Permissions and Policies

▪ Identity and Access Management (IAM)

▪ AWS Access Keys

▪ Amazon EC2 Key Pairs

▪ AWS Key Management Service (KMS)

▪ Hands-On Exercise: Securing the Amazon Cloud

▪ Essential Points


Permissions and Policies

▪ Who or what has access to which resources

▪ Permission
─ Specify access
  ─ Allow or deny
▪ Policy
─ Defines a set of permissions
─ Associates
  ─ An identity, like ClouderaJoe
  ─ A resource, like EC2


Policy Types

▪ Identity-based policies

▪ Resource-based policies

▪ Permission boundaries

▪ Service control policy (SCP)

▪ Access control lists

▪ Session policies


Comparing Identity-based and Resource-based Policies

Identity-based
ClouderaJoe
▪ Can List, Read
▪ On resource X
ClouderaJane
▪ Can Write
▪ On resource X, Y

Resource-based
Resource X
▪ ClouderaJoe
▪ List, Read
Resource X
▪ ClouderaJane
▪ Denied access
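
The comparison above can be modeled with a toy evaluator. This is a deliberately simplified sketch, not the full AWS evaluation logic (which also considers permission boundaries, service control policies, and session policies): a request is allowed only if some matching statement allows it and no matching statement denies it, so an explicit deny wins. The `Statement` class and `is_allowed` function are hypothetical names, and modeling "Denied access" as a deny on List, Read, and Write is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Statement:
    effect: str            # "Allow" or "Deny"
    principal: str         # who the statement applies to
    actions: frozenset     # e.g. {"List", "Read"}
    resources: frozenset   # e.g. {"X"}


def is_allowed(statements: Iterable[Statement], principal: str,
               action: str, resource: str) -> bool:
    """Return True if at least one statement allows and none denies."""
    matching = [s for s in statements
                if s.principal == principal
                and action in s.actions
                and resource in s.resources]
    # An explicit deny overrides any allow.
    if any(s.effect == "Deny" for s in matching):
        return False
    return any(s.effect == "Allow" for s in matching)


# Identity-based policies: attached to the users.
identity_based = [
    Statement("Allow", "ClouderaJoe", frozenset({"List", "Read"}), frozenset({"X"})),
    Statement("Allow", "ClouderaJane", frozenset({"Write"}), frozenset({"X", "Y"})),
]

# Resource-based policy: attached to resource X.
resource_based_x = [
    Statement("Allow", "ClouderaJoe", frozenset({"List", "Read"}), frozenset({"X"})),
    Statement("Deny", "ClouderaJane", frozenset({"List", "Read", "Write"}), frozenset({"X"})),
]

all_statements = identity_based + resource_based_x

print(is_allowed(all_statements, "ClouderaJoe", "Read", "X"))
print(is_allowed(all_statements, "ClouderaJane", "Write", "X"))
print(is_allowed(all_statements, "ClouderaJane", "Write", "Y"))
```

In this model ClouderaJane's identity-based policy permits Write on X and Y, but the deny attached to resource X blocks her there, so only her access to Y survives.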


Create Policy Screen

▪ Two ways to create a policy in AWS
─ Visual editor
─ JSON
▪ Steps in the visual editor
─ Select a service, actions, and resources
─ Request conditions
─ Additional permissions
─ Name and description
─ Review summary


Visual Editor


JSON Policy

▪ Policy specified as a JSON text

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:ListJobs"
      ],
      "Resource": "*"
    }
  ]
}
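A policy document like the one above can also be built and sanity-checked programmatically. The sketch below uses only the Python standard library. The `check_policy` helper is hypothetical and performs a few structural checks only; it is not a substitute for the validation AWS performs, and it ignores elements such as `NotAction` and `NotResource`.

```python
import json

# The same policy as above, expressed as a Python dictionary.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["s3:ListAllMyBuckets", "s3:ListJobs"],
            "Resource": "*",
        }
    ],
}


def check_policy(doc):
    """Return a list of structural problems (empty when none are found)."""
    problems = []
    if doc.get("Version") != "2012-10-17":
        # This sketch only expects the version string used in the example.
        problems.append("unexpected Version")
    for i, stmt in enumerate(doc.get("Statement", [])):
        if stmt.get("Effect") not in ("Allow", "Deny"):
            problems.append(f"statement {i}: Effect must be Allow or Deny")
        for key in ("Action", "Resource"):
            if key not in stmt:
                problems.append(f"statement {i}: missing {key}")
    return problems


policy_json = json.dumps(policy, indent=2)
print(policy_json)
print(check_policy(json.loads(policy_json)))  # []
```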


Chapter Topics

Understanding Cloud Security: Amazon Web Services

▪ Security in the Cloud

▪ Security Credentials

▪ Permissions and Policies

▪ Identity and Access Management (IAM)

▪ AWS Access Keys

▪ Amazon EC2 Key Pairs

▪ AWS Key Management Service (KMS)

▪ Hands-On Exercise: Securing the Amazon Cloud

▪ Essential Points


Identity and Access Management (IAM)

▪ Service to securely control access
  ─ Resources
  ─ Use instead of root credentials
    ─ Root user creates IAM identities

▪ Manage access
  ─ Create policies
  ─ Attach to an IAM identity
    ─ IAM user
    ─ IAM role
    ─ IAM group


Identity and Access Management (IAM)


IAM User

▪ Represents a person or an AWS service
  ─ Interact with AWS
    ─ Management console
    ─ Programmatic requests

▪ Grant permission
  ─ Add to a group
    ─ Permission policies attached

▪ Clone
  ─ Member of same group
  ─ Same policies


IAM User


IAM Groups

▪ Collection of IAM users
  ─ Specify permissions

▪ Associates policies with users in the group
  ─ Multiple users at a time

▪ Simplifies user management
  ─ Add and remove users from a group
  ─ Provide or take away permissions
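The way groups simplify user management can be illustrated with a toy in-memory model. This is a sketch only: the `Directory` class, the `analysts` group, and the `S3ReadOnly` policy name are hypothetical, and no AWS API is called.

```python
from collections import defaultdict


class Directory:
    """Toy model: policies are attached to groups, users inherit them."""

    def __init__(self):
        self.group_policies = defaultdict(set)
        self.group_members = defaultdict(set)

    def attach_policy(self, group, policy_name):
        self.group_policies[group].add(policy_name)

    def add_user(self, group, user):
        self.group_members[group].add(user)

    def remove_user(self, group, user):
        self.group_members[group].discard(user)

    def policies_for(self, user):
        """Union of the policies of every group the user belongs to."""
        granted = set()
        for group, members in self.group_members.items():
            if user in members:
                granted |= self.group_policies[group]
        return granted


directory = Directory()
directory.attach_policy("analysts", "S3ReadOnly")
directory.add_user("analysts", "ClouderaJoe")
directory.add_user("analysts", "ClouderaJane")
print(sorted(directory.policies_for("ClouderaJane")))  # ['S3ReadOnly']

# Removing the user from the group takes the permissions away again.
directory.remove_user("analysts", "ClouderaJane")
print(sorted(directory.policies_for("ClouderaJane")))  # []
```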


IAM Groups


IAM Roles

▪ Entity that defines a set of permissions for making service requests
  ─ Identity with permission policies
  ─ Specify access to resources

▪ Does not have long-term credentials
  ─ No password or access keys

▪ Assumed by anyone who needs the role
  ─ Take on permissions temporarily
    ─ To complete a task


IAM Roles


Cross-Account Role

▪ Grant access to your resources
  ─ To another AWS account

▪ Create role

▪ Specify account
  ─ Account ID
  ─ External ID

▪ Create policy
  ─ Attach permissions
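The account ID and external ID from the steps above typically end up in the role's trust policy. The following is a minimal sketch of such a document, assuming the common pattern of trusting the other account's root principal and requiring the external ID as a condition. The function name, the account ID `123456789012`, and the external ID value are placeholders.

```python
import json


def cross_account_trust_policy(account_id, external_id):
    """Build a trust policy that lets another account assume the role."""
    if not (len(account_id) == 12 and account_id.isdigit()):
        raise ValueError("AWS account IDs are 12 digits")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The trusted (external) account.
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "sts:AssumeRole",
                # The caller must present the agreed external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


trust = cross_account_trust_policy("123456789012", "example-external-id")
print(json.dumps(trust, indent=2))
```

The permissions policy attached to the role is separate from this trust policy: the trust policy says who may assume the role, the permissions policy says what the role may do.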


Cross Account Role


AWS Access Keys

▪ Programmatic and command-line access
  ─ Access Key ID
  ─ Secret Access Key

▪ Used together
  ─ Like a username and password
  ─ CLI and API calls
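Unlike a password, the secret access key is normally not sent with a request; it is used to sign the request, and the access key ID tells AWS which secret to check the signature against. The sketch below follows the HMAC-SHA256 key-derivation chain of AWS Signature Version 4. The function names and the string being signed are placeholders: a real request signs a canonical string with a prescribed structure, which is omitted here.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key, date_stamp, region, service):
    """Derive a signing key scoped to one day, region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign_string(secret_access_key, string_to_sign, date_stamp, region, service):
    """Return the hex signature of string_to_sign (placeholder content)."""
    key = derive_signing_key(secret_access_key, date_stamp, region, service)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


# Example secret taken from the sample key shown in this chapter (not a real key).
signature = sign_string(
    "wJalrXUtnFEMI/K7MDENG/bPxRfiCYE4AMfL34EY",
    "example-string-to-sign",
    "20200629",
    "us-east-1",
    "s3",
)
print(signature)
```

In practice the AWS CLI and SDKs perform this signing for you once the two keys are configured.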


AWS Access Keys

▪ Safeguard them
  ─ Used to create resources
  ─ Which you will have to pay for
    ─ Even if stolen
  ─ Delete if you suspect keys have been stolen

▪ Limit
  ─ Two keys per IAM user


AWS Access Keys

▪ Status
  ─ Newly created
    ─ Active
  ─ Deactivated
    ─ If no longer in use

▪ Access key age
  ─ Rotated for security

▪ Last activity
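Key status and age are what a rotation check looks at. The following sketch flags active keys older than a threshold and respects the two-keys-per-user limit. The 90-day threshold, the key identifiers, and the dictionary layout are assumptions for illustration, not AWS defaults.

```python
from datetime import date, timedelta

MAX_KEYS_PER_USER = 2               # limit of two keys per IAM user
ROTATION_AGE = timedelta(days=90)   # assumed rotation threshold


def keys_due_for_rotation(keys, today):
    """Return the ids of active keys older than ROTATION_AGE."""
    return [
        key["id"]
        for key in keys
        if key["status"] == "Active" and today - key["created"] > ROTATION_AGE
    ]


def can_create_key(keys):
    """A replacement key can only be created while under the limit."""
    return len(keys) < MAX_KEYS_PER_USER


keys = [
    {"id": "key-old", "created": date(2020, 1, 1), "status": "Active"},
    {"id": "key-new", "created": date(2020, 5, 1), "status": "Active"},
]
today = date(2020, 6, 1)
print(keys_due_for_rotation(keys, today))  # ['key-old']
print(can_create_key(keys))                # False
```

A typical rotation creates the second key, switches applications over to it, deactivates the old key, and deletes the old key once nothing depends on it.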


An Example Access Key

Access Key ID: AKIAIOSFODNN7EEX4MPSR

Secret Access Key: wJalrXUtnFEMI/K7MDENG/bPxRfiCYE4AMfL34EY


AWS Access Keys


Amazon EC2 Key Pairs

▪ EC2 uses public-key cryptography

▪ Encrypt and decrypt login information
  ─ Securely access EC2 instances
    ─ Without a password
    ─ Secure Shell (SSH)

▪ Key pair
  ─ Public key
  ─ Private key

▪ Not related to access keys


Public Key

▪ Verifies signatures created with the private key

▪ Sample public key

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQClKsfkNkuSevGj3eYhCe53pcjqP3maAhDFcvBS7O6Vhz2ItxCih+PnDSUaw+WNQn/mZphTk/a/gU8jEzoOWbkM4yxyb/wB96xbiFveSFJuOp/d6RJhJOI0iBXrlsLnBItntckiJ7FbtxJMXLvvwJryDUilBMTjYtwB+QhYXUMOzce5Pjz5/i8SeJtjnV3iAoG/cQk+0FzZqaeJAAHco+CY/5WrUBkrHmFJr6HcXkvJdWPkYQS3xqC0+FmUZofz221CBt5IMucxXPkX4rWi+z7wB3RbBQoQzd8v7yeb7OzlPnWOyN0qFU0XA246RA8QFYiCNYwI3f05p6KLxEXAMPLE my-key-pair

* Modified for display

Private Key

▪ Creates the signature that the public key verifies

▪ Sample private key

-----BEGIN RSA PRIVATE KEY-----
MIIEowIBAAKCAQEAh1PkSPTC5xnpi6fgU9Wz+mQy4HcXM96f9Vxj4gaEaOCqao5L1gwYEHLSBvXeMG9Ja1rBPqdR8hFM9tHDJEy8A22OReXuRays8NfaTRUdzFGMkJX7wCG+qgSOg6yIwgJVsdF4Y3eUH9cPewRk3UMr21NBKayhLVKc3PHz5/XlsXbmCA27wETaHnlF1i/WZHaxUc0YsuRzE8qMyMUATllUITgJTkoGYsu8XC/qocou3v0NQAWM/nGdyhaoFWO/haGklE06RgZf6G9UswlsLttI/+wfpVUKYFiCOS1fcKnjixSEky6mYrUqwo10bq5L8+tP3hOj4Uki7jlA5CkRQSyrwIDAQABAoIBAB3Go7A5yriyYa4r9LcuHMNSnGG41fwcGKvjdDAWZxCz3iRA1Sfa+NeMF9eMj0vwmx+4hh5NkxS2kmVvOFNkLSw+EDdD6HQZi3N75q9VSYurGQKBgFElLTssHXNSCUeecAcGC+0VElj2eY+oZytGhWQr3lU98e00w2zlwNJnHjZxw4AIaHrvss6YldVmDbatscBJ51rLmXDuyoVpTHOroMp3RMoRFYbHs8iq080zuZ7sUXVXSekeFj1LxDrVN0fOCguHQL0OKFmH5hgWWTT90pqX9pRAoGBALi1/fnyl5CcJPBiBli6mfMELdrO9c6qR7YqLF81csSPIF2IQDBO0xdykvD9h9p76j/MpSkgHqHMI+CXTdcbepbyvCSQ514cxTtJROq2W5l0pWncInmfu198hSCX+g2Us6/yyKq888q3SLLQvjbzPGyy2uBjLV5hk8lBz
-----END RSA PRIVATE KEY-----

* Modified for display
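How the two halves of a key pair cooperate can be shown with a deliberately tiny RSA example. This is a teaching sketch only: the key below is the textbook toy key built from the primes 61 and 53, it offers no security, and real SSH keys use padding schemes and much larger numbers. The function names are hypothetical.

```python
import hashlib

# Toy key: N = 61 * 53, public exponent E, private exponent D.
N, E, D = 3233, 17, 2753


def digest(message: bytes) -> int:
    """Hash the message and reduce it into the toy key's range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes, private_exponent: int = D) -> int:
    """Only the holder of the private key can produce this value."""
    return pow(digest(message), private_exponent, N)


def verify(message: bytes, signature: int, public_exponent: int = E) -> bool:
    """Anyone with the public key can check the signature."""
    return pow(signature, public_exponent, N) == digest(message)


signature = sign(b"login request")
print(verify(b"login request", signature))            # True

# A signature that was altered in transit no longer verifies.
forged = (signature + 1) % N
print(verify(b"login request", forged))               # False
```

This mirrors why the public key can be stored on an instance while the private key stays with you: the instance only needs to verify.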


Creating a Key Pair

▪ Several ways to create a key pair
  ─ From the console
    ─ EC2 navigation pane
  ─ Using the CLI
  ─ A key pair can be imported into AWS

▪ Download immediately after creation
  ─ PEM file
  ─ Cannot retrieve afterwards


Amazon EC2 Key Pair


AWS Key Management Service (KMS)

▪ Centralized location to manage encryption keys
  ─ Create, describe, list, enable, disable, and delete master keys
  ─ Customer or AWS managed keys

▪ Control access to your data
  ─ Encrypt and decrypt data stored in AWS
    ─ Managed encryption
  ─ Audit usage

▪ Integrated with multiple AWS services


AWS Key Management Service (KMS)


Hands-On Exercise: Securing the Amazon Cloud

▪ In this exercise, you will secure your cloud account using security credentials, permission policies, IAM identities, access keys, and key pairs

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points (1)

▪ Credentials are the proof of identity
  ─ Authentication
  ─ Username/password or access keys

▪ A policy defines a set of permissions
  ─ Access to resources
  ─ Allow or deny

▪ IAM to control access
  ─ Resources
  ─ Users, groups, and roles
    ─ Cross-account role


Essential Points (2)

▪ Access keys used for programmatic access
  ─ Access key ID and secret access key

▪ Key pairs to log in to EC2 instances
  ─ Public and private key

▪ KMS to create and control encryption keys


Regions and Availability Zones

Chapter 6


Chapter Topics

Regions and Availability Zones

▪ Picking a Location: Regions and Availability Zones

▪ Hands-On Exercise: Working with Regions and AZs

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ How AWS provides worldwide coverage and resiliency using regions and availability zones

▪ How to select and change regions

▪ How to check the health of services within a region


Picking a Location for Your Cloud Infrastructure

▪ Geographic location is an important consideration
  ─ Having clusters close to your customers has many advantages
    ─ Performance, compliance, disaster recovery, and more

▪ Implementation of local clusters is difficult
  ─ Multiple corporate data centers
    ─ Longer planning process
    ─ Lengthier provisioning cycles

▪ The cloud provides a better way


AWS Global Infrastructure

▪ The cloud brings your clusters closer to your customers
  ─ Multiple locations available worldwide
  ─ Known as regions

(Figure: AWS global infrastructure. Source: AWS documentation)

Region

▪ A region is a separate geographical area
  ─ Northern Virginia, Oregon, Ireland, Tokyo...
  ─ Most regions are readily available, but some require opt-in

▪ Each region hosts a collection of resources
  ─ Independent from resources in other regions
  ─ Resources in one region do not exist in other regions
    ─ There are some exceptions, like IAM

▪ View and work with resources per region

▪ A region is composed of one or more availability zones


Availability Zones (AZ)

▪ Isolated location within a region
  ─ Provides maximum resiliency
  ─ One or more AZs per region

▪ Each AZ belongs to a single region
  ─ Made up of multiple data centers, typically three

▪ Relationship between regions and availability zones


Regions and Availability Zones

▪ Network traffic among regions uses the AWS global network backbone
  ─ Cost associated with data transfer

▪ AZs within a region are connected via low-latency links
  ─ Data transfer is free within an AZ
    ─ Using private IP
  ─ Not free between AZs


Region and Availability Zone Identifiers

▪ Regions are represented using an identifier
  ─ Northern Virginia is us-east-1 and Ohio is us-east-2
  ─ Ireland is eu-west-1

▪ AZ uses the region code followed by a letter identifier
  ─ Zone a for Northern Virginia is us-east-1a

▪ AZ naming varies by account for even distribution and load
  ─ us-east-1a in one account may be us-east-1b in another
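The naming scheme above is regular enough to split an AZ name back into its region and zone letter. The sketch below does so with a regular expression. The pattern is inferred from the examples on this slide and is an assumption, not an official grammar; `split_az` is a hypothetical helper.

```python
import re

# Region code (for example us-east-1) followed by a single zone letter.
AZ_PATTERN = re.compile(r"^(?P<region>[a-z]{2}(?:-[a-z]+)+-\d+)(?P<zone>[a-z])$")


def split_az(az_name):
    """Split an availability zone name into (region, zone letter)."""
    match = AZ_PATTERN.match(az_name)
    if match is None:
        raise ValueError(f"not an availability zone name: {az_name!r}")
    return match.group("region"), match.group("zone")


print(split_az("us-east-1a"))  # ('us-east-1', 'a')
print(split_az("eu-west-1b"))  # ('eu-west-1', 'b')
```

Because the letter is assigned per account, the result identifies a zone only within one account; it should not be used to compare physical locations across accounts.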


Creating Resources in a Region and Availability Zone

▪ Creating resources in a specific AZ may be restricted if capacity is low
  ─ Priority given when you already have resources in the AZ
  ─ Possible to create resources in a different AZ

▪ Pick the region that works best for you
  ─ Cloudera products and services do not support all regions


Regions and Availability Zones


Hands-On Exercise: Working with Regions and AZs

▪ In this exercise, you will identify your current region, change region, and check the health of services per region

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points

▪ Being close to your customers has many advantages
  ─ A potential challenge with corporate data centers

▪ The cloud allows you to deploy resources worldwide
  ─ In different geographic locations, called regions

▪ Each region is composed of one or more availability zones
  ─ Isolated, for resiliency

▪ Pick the region that works best for your scenario

▪ Create resources in a supported region


Networking

Chapter 7


Chapter Topics

Networking

▪ Private Networks in the Cloud: VPCs and Subnets

▪ External Networking: Route 53, Elastic IPs, and ELB

▪ Hands-On Exercise: Configuring Your Network

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ How to create your private network in AWS using a VPC

▪ How to segment your VPC using subnets

▪ How AWS provides friendly names to IP addresses

▪ How to obtain static IP addresses

▪ How AWS balances load


Your Private Network in AWS

▪ Applications co-exist in the cloud
  ─ Isolation is important
    ─ Security concerns

▪ Achieved with
  ─ Virtual Private Cloud (VPC)
    ─ Subnets


Amazon Virtual Private Cloud (VPC)

▪ Virtual network
  ─ Dedicated to your account
  ─ Logically isolated from other VPCs
    ─ Even within your own account
  ─ Spans all availability zones in a region

▪ Similar to an on-premises network
  ─ More automation and scale

▪ Default VPC per region
  ─ Create your own


Amazon Virtual Private Cloud (VPC)

▪ Launch resources into a VPC
  ─ Machines, databases, and storage

▪ IP addresses associated with resources
  ─ Range specified using CIDR block

▪ Secure resources
  ─ Network ACLs
  ─ Security groups

▪ Segmented into one or more subnets


Subnets

▪ Logical subdivision of a VPC
  ─ Each subnet resides in one availability zone
    ─ Cannot span zones

▪ Specify non-overlapping IP ranges
  ─ Within the VPC

▪ Two types of subnets
  ─ Private, without direct internet access
  ─ Public, with access to the internet

▪ Different scenarios
  ─ Combinations of private and public subnets
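The two address rules for subnets (inside the VPC range, and not overlapping each other) can be checked with the `ipaddress` module from the Python standard library. The CIDR blocks, the AZ names, and the `validate_subnets` helper below are example values and names chosen for illustration.

```python
import ipaddress


def validate_subnets(vpc_cidr, subnet_cidrs):
    """Check that every subnet lies inside the VPC and none overlap."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(cidr) for cidr in subnet_cidrs]
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            raise ValueError(f"{subnet} is outside {vpc}")
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")
    return subnets


# One public and one private subnet, each placed in a single AZ.
plan = {
    "public": ("us-east-1a", "10.0.0.0/24"),
    "private": ("us-east-1b", "10.0.1.0/24"),
}
checked = validate_subnets("10.0.0.0/16", [cidr for _, cidr in plan.values()])
print([str(subnet) for subnet in checked])  # ['10.0.0.0/24', '10.0.1.0/24']
```

A range such as 10.0.0.128/25 would be rejected alongside 10.0.0.0/24, because the second range already contains it.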


VPC and Subnets

Copyright © 2010–2020 Cloudera. All rights reserved. Not to be reproduced or shared without prior written consent from Cloudera. 07-9

Page 171: files.ondemand.cloudera.com...Storing Relational and Key-Value Data: Amazon RDS and DynamoDB Chapter 11. Course Chapters Introduction An Overview of the Cloud with Cloudera Getting

Chapter Topics

Networking

▪ Private Networks in the Cloud: VPCs and Subnets

▪ External Networking: Route 53, Elastic IPs, and ELB

▪ Hands-On Exercise: Configuring Your Network

▪ Essential Points


External Networking

▪ Location on the internet
  ─ Represented as an IP address
    ─ 13.57.68.67 points to a web site
  ─ Convenient for machines

▪ Problem for humans
  ─ Memorizing IP addresses is not practical


Domain Name System (DNS)

▪ Hierarchical and decentralized naming system
  ─ Computers, services, and other resources
  ─ Connected to the internet

▪ Phonebook of the internet

▪ Translates a host name to an IP address
  ─ www.cloudera.com corresponds to 13.57.68.67
  ─ Human-readable names are useful for services
    ─ Example: my-ml-cluster.services.cloudera.com


Amazon Route 53

▪ AWS DNS service
  ─ Designed to work with other AWS services
  ─ Highly available and scalable

▪ Three functions
  ─ Domain registration
  ─ DNS routing
  ─ Health checking


Amazon Route 53 Routing and Health Checking

▪ Route end users
  ─ Internet-facing applications and services
  ─ Infrastructure running in AWS

▪ Policy routing
  ─ Direct traffic based on rules

▪ Health checks for DNS failover
  ─ CloudWatch metrics

▪ Balance load
  ─ High demand and fault tolerance
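
How health checks drive DNS failover can be shown with a toy model. The sketch below is not the Route 53 API: the record layout, the priority field, and the IP addresses (taken from documentation ranges) are all assumptions made for illustration.

```python
# Hypothetical records for one host name; the IPs come from documentation ranges.
RECORDS = [
    {"role": "primary", "priority": 1, "ip": "203.0.113.10"},
    {"role": "secondary", "priority": 2, "ip": "198.51.100.20"},
]


def resolve_with_failover(records, health):
    """Return the IP of the highest-priority record whose endpoint is healthy."""
    for record in sorted(records, key=lambda r: r["priority"]):
        if health.get(record["ip"], False):
            return record["ip"]
    return None  # every endpoint failed its health check


# Both endpoints healthy: the primary answers.
print(resolve_with_failover(RECORDS, {"203.0.113.10": True, "198.51.100.20": True}))
# Primary unhealthy: the answer fails over to the secondary.
print(resolve_with_failover(RECORDS, {"203.0.113.10": False, "198.51.100.20": True}))
```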


Amazon Elastic Load Balancer (ELB)

▪ Load-balancing service
  ─ Even distribution of load
  ─ Applications and services

▪ Automatically distributes incoming traffic
  ─ Across multiple resources and services

▪ Fault tolerance
  ─ Across availability zones
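
Even distribution and fault tolerance can be illustrated with a small round-robin model. This is a sketch of one common balancing strategy, not of how ELB is implemented; the target IDs and zone names are hypothetical.

```python
from collections import Counter
from itertools import cycle

# Hypothetical targets spread over two availability zones.
TARGETS = [("i-aaa", "zone-a"), ("i-bbb", "zone-a"), ("i-ccc", "zone-b")]


def distribute(request_ids, targets, unhealthy=()):
    """Assign each request to the next healthy target in round-robin order."""
    live = [target for target in targets if target[0] not in unhealthy]
    if not live:
        raise RuntimeError("no healthy targets available")
    rotation = cycle(live)
    return {request_id: next(rotation) for request_id in request_ids}


# Six requests over three healthy targets: two requests each.
assignments = distribute(range(6), TARGETS)
per_target = Counter(target_id for target_id, _ in assignments.values())
print(per_target)

# If both zone-a targets fail, the zone-b target absorbs the traffic.
failover = distribute(range(6), TARGETS, unhealthy={"i-aaa", "i-bbb"})
print(set(failover.values()))
```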


Types of Load Balancers

▪ Application load balancer
  ─ HTTP and HTTPS

▪ Network load balancer
  ─ TCP, UDP, and TLS

▪ Classic load balancer
  ─ Legacy applications


Load Balancer


Elastic IPs (EIP)

▪ Static IPv4 address
  ─ Reachable from the internet

▪ Associated with your account
  ─ Allocate

▪ Map to an instance
  ─ Remap to mask failure

▪ Free as long as they are in use


Elastic IPs


Hands-On Exercise: Configuring Your Network

▪ In this exercise, you will create a virtual private cloud (VPC) and create subnets within it

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points

▪ Applications co-exist in the cloud
  ─ Isolate using a VPC and subnets

▪ Memorizing IP addresses is not practical
  ─ DNS translates human-readable host names
    ─ Into IP addresses
  ─ Route 53
    ─ Policy routing, health checks, and load balancing

▪ Elastic Load Balancer (ELB)
  ─ Even distribution of load
  ─ Fault tolerance


Chapter 8: Computing Power in AWS


Chapter Topics

Computing Power in AWS

▪ Computing Power with Elastic Compute Cloud (EC2)

▪ Running Containers with the Elastic Kubernetes Service (EKS)

▪ Monitoring EC2 Instances with CloudWatch Logs

▪ Hands-On Exercise: Launching EC2 Instances

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ Which types of virtual machines are available in AWS

▪ How to launch and configure a virtual machine

▪ How to manage a virtual machine

▪ The similarities and differences between virtual machines and containers


Virtual Machine

▪ Emulation of a computer system
  ─ Managed by a hypervisor

▪ Virtual machine inside a physical machine (host)

▪ Uses physical host resources
  ─ CPU, memory, and storage

▪ Full copy of the operating system

▪ Multiple machines per host


Compute Power in the Cloud

▪ Virtual machines

▪ Used for applications and services
  ─ Process large amounts of data

▪ Massively parallel processing engines
  ─ Apache Impala or Apache Spark
  ─ Tens, hundreds, or even thousands of machines


Amazon Elastic Compute Cloud (EC2)

▪ Scalable computing capacity
  ─ Virtual machines
    ─ When you need them, in minutes

▪ Create instances
  ─ Different ways to launch an instance

▪ Execute jobs or process data
  ─ Multiple scenarios

▪ Deprovision when no longer required


An Instance in the Cloud

▪ Instance lifecycle
  ─ Create, register, launch, deregister, and copy

▪ Instance state
  ─ Pending, running, shutting-down, terminated, stopping, or stopped
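
The instance states listed above can be read as a small state machine. The sketch below is a simplified model: the event names are hypothetical labels chosen for this example, not AWS API names, and transitions such as reboot, hibernate, and terminating a stopped instance are left out.

```python
# Simplified transition table: (current state, event) -> next state.
# Event names are hypothetical labels for this sketch.
TRANSITIONS = {
    ("pending", "boot_complete"): "running",
    ("running", "stop"): "stopping",
    ("stopping", "stop_complete"): "stopped",
    ("stopped", "start"): "pending",
    ("running", "terminate"): "shutting-down",
    ("shutting-down", "shutdown_complete"): "terminated",
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state!r}") from None


def run(events, state="pending"):
    """Apply a sequence of events, starting from a freshly launched instance."""
    for event in events:
        state = next_state(state, event)
    return state


print(run(["boot_complete", "stop", "stop_complete"]))
print(run(["boot_complete", "terminate", "shutdown_complete"]))
```

In this model, terminated has no outgoing transitions, which reflects that a terminated instance cannot be started again.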


Amazon Machine Images (AMI)

▪ Multiple images available
  ─ Known as an AMI
  ─ Different configurations and operating systems
    ─ Red Hat, CentOS, Windows, and more

▪ Preconfigured services and applications available


Instance Type Families

▪ Instances are grouped together
  ─ Based on their purpose
  ─ Known as families

▪ Varying combinations of CPU, memory, storage, and networking capacity
  ─ 1 to 96 vCPUs
  ─ 2 to 488 GiB RAM
  ─ Other configurations


Instance Type Families

▪ General purpose

▪ Compute optimized

▪ Memory optimized

▪ Accelerated computing

▪ Storage optimized


General Purpose Family

▪ Several types of instances
  ─ A1, T3, T3a, T2, M5, M5a, M5n, and M4

▪ T3 is the next-generation burstable instance type
  ─ Baseline level of CPU performance
  ─ Ability to increase CPU performance (burst)
    ─ For short periods of time
  ─ Balance of compute, memory, and network resources


Available Sizes for the T3 Instance Type

Instance     vCPU   CPU Credits/hour   Mem (GiB)   Storage    Network Performance (Gbps)
t3.nano      2      6                  0.5         EBS-Only   Up to 5
t3.micro     2      12                 1           EBS-Only   Up to 5
t3.small     2      24                 2           EBS-Only   Up to 5
t3.medium    2      24                 4           EBS-Only   Up to 5
t3.large     2      36                 8           EBS-Only   Up to 5
t3.xlarge    4      96                 16          EBS-Only   Up to 5
t3.2xlarge   8      192                32          EBS-Only   Up to 5
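
The CPU credits column can be converted into a baseline utilization figure. The sketch below relies on the AWS definition that one CPU credit equals one vCPU running at 100% for one minute; the function and variable names are chosen for this example only.

```python
# (vCPUs, CPU credits earned per hour) for each size in the table above.
T3_SIZES = {
    "t3.nano": (2, 6),
    "t3.micro": (2, 12),
    "t3.small": (2, 24),
    "t3.medium": (2, 24),
    "t3.large": (2, 36),
    "t3.xlarge": (4, 96),
    "t3.2xlarge": (8, 192),
}


def baseline_percent(vcpus, credits_per_hour):
    """Per-vCPU utilization (in percent) that the hourly credits can sustain."""
    # One credit is one vCPU-minute at 100%, and an hour offers 60 * vcpus vCPU-minutes.
    return 100 * credits_per_hour / (60 * vcpus)


for name, (vcpus, credits) in T3_SIZES.items():
    print(f"{name:<11} baseline {baseline_percent(vcpus, credits):.0f}% per vCPU")
```

An instance that stays below its baseline accumulates credits, and it spends them when it bursts above the baseline.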


Creating An EC2 Instance

▪ Multiple steps to launch an instance manually from the console
  ─ Step 1: Choose an Amazon Machine Image (AMI)
  ─ Step 2: Choose an Instance Type
  ─ Step 3: Configure Instance Details
  ─ Step 4: Add Storage
  ─ Step 5: Add Tags
  ─ Step 6: Configure Security Group
  ─ Step 7: Review Instance Launch

▪ Can launch via CLI or CloudFormation

▪ Some Cloudera products and services
  ─ Launch instances automatically as needed


EC2 Instance Information

▪ Instance name, ID, type, and state
  ─ AMI ID

▪ Availability zone

▪ Key pair name and security groups

▪ VPC and subnet ID

▪ Public and private IP and DNS
  ─ One or more network interfaces

▪ Root and block devices

▪ And more


Connecting to EC2 Instances with Secure Shell (SSH)

▪ SSH
  ─ A network protocol
  ─ Secure way to connect to a computer or server
  ─ Over an unsecured network

▪ Connect to Linux instances
  ─ Useful for system administrators

▪ Command to connect

$ chmod 400 cf-kp.pem
$ ssh -i "cf-kp.pem" [email protected]
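
The same two steps, restricting the key file and then connecting, can be scripted. The sketch below only prepares the key and builds the ssh argument list, without opening a connection. It assumes a POSIX system, and the user name, host name, and throwaway key file are placeholders, not values from the course.

```python
import os
import stat
import tempfile


def prepare_ssh_command(key_path, user, host):
    """Make the key readable by its owner only and return the ssh argument list."""
    os.chmod(key_path, 0o400)  # same effect as `chmod 400 <key>`
    return ["ssh", "-i", key_path, f"{user}@{host}"]


def key_mode(key_path):
    """Return the permission bits of the key file."""
    return stat.S_IMODE(os.stat(key_path).st_mode)


# Demo with a throwaway file standing in for the private key.
with tempfile.TemporaryDirectory() as workdir:
    demo_key = os.path.join(workdir, "demo-key.pem")
    with open(demo_key, "w") as handle:
        handle.write("not a real key\n")
    # Hypothetical user and host; substitute the values for your instance.
    demo_command = prepare_ssh_command(demo_key, "ec2-user", "host.example.com")
    demo_mode = key_mode(demo_key)

print(demo_command[0], demo_command[-1], oct(demo_mode))
```

The permission step matters because ssh refuses to use a private key file that other users can read.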


Containers

▪ A way of packaging software
  ─ Code, libraries, and dependencies
  ─ Bundled together

▪ Only the OS is virtualized
  ─ Share resources


Comparing Containers and Virtual Machines

▪ Virtual machine provides a full abstraction of a machine
  ─ Container provides an abstract operating system
  ─ All containers in a host share the OS kernel

▪ Host can handle a larger number of containers
  ─ Than equivalent virtual machines

▪ Several advantages of containers
  ─ Predictable, repeatable, and immutable
  ─ Lightweight
  ─ Faster startup


Kubernetes (k8s)

▪ Open-source container orchestration system
  ─ Automating application deployment, scaling, and management
  ─ Across clusters of hosts

▪ Containers in the cloud
  ─ Package and deploy services
  ─ Scale up or down quickly
    ─ Based on demand


Amazon Elastic Kubernetes Service (EKS)

▪ AWS-managed k8s service
  ─ Certified Kubernetes conformant
    ─ Works with existing tools and services

▪ Run management infrastructure across multiple AZs
  ─ Eliminate single points of failure
  ─ Automatically detects and replaces unhealthy nodes

▪ Secure by default

▪ Fast


Amazon CloudWatch Logs

▪ Centralize logs from all systems, applications, and services
  ─ In a single, highly scalable service
  ─ View events and search for specific codes or patterns

▪ Monitor, store, and access log files
  ─ EC2 instances
  ─ Route 53, CloudTrail, and other sources

▪ Start with free tier
  ─ Paid tier, cost based on metrics and API calls


Hands-On Exercise: Launching EC2 Instances

▪ In this exercise, you will create, configure, and launch an EC2 instance

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points

▪ EC2 is scalable computing capacity
  ─ Instance families and sizes
  ─ Multiple configurations available for your needs

▪ Create machines as required
  ─ Deprovision when no longer required

▪ Different cost structures based on commitment and needs
  ─ On-demand, reserved, spot, and dedicated

▪ Run containers in the cloud with EKS

▪ CloudWatch Logs to monitor, store, and access EC2 log files


Chapter 9: Protecting Your Infrastructure: Security Groups & Network ACLs


Chapter Topics

Protecting Your Infrastructure: Security Groups & Network ACLs

▪ Protecting Your EC2 Instances: Security Groups

▪ Protecting Your Subnets: Network ACLs

▪ Comparing Security Groups and Network ACLs

▪ Hands-On Exercise: Setting Up Security Groups and Network ACLs

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ How to secure EC2 instances using security groups

▪ How to secure subnets using network access control lists (ACL)

▪ The differences between security groups and network ACLs


Protecting Your EC2 Instances

▪ Securing your instances is a top priority

▪ Possible with a private VPC
  ─ Isolated from the outside world
    ─ Not from other instances
  ─ Not suitable for most scenarios

▪ Need to control traffic to and from an EC2 instance

▪ Firewall
  ─ Possible to configure per instance
    ─ Challenges


Security Groups

▪ Virtual firewall
  ─ Control traffic
  ─ Instance level

▪ No associated cost

▪ Launch an instance
  ─ Assign a security group
    ─ Custom or default security group
  ─ Associate with a network interface


Security Group Lifecycle

▪ Create
  ─ Within a VPC
  ─ Name cannot start with sg-
    ─ Must be unique within the VPC

▪ Rules
  ─ Add, update, and delete

▪ Describe

▪ Delete


Security Group Rules

▪ Type, protocol, port range, source or destination, and description

▪ Outbound default is to allow all traffic
  ─ Can be restricted

▪ Stateful

▪ Permissive
  ─ "Allow" rules
    ─ Most permissive applies
  ─ Cannot create "deny" rules

▪ Multiple security groups on one instance
  ─ Rules aggregated


Sample Rules For a Security Group

▪ Inbound

Type    Protocol   Port Range   Source                     Description
SSH     TCP        22           My IP (190.7.211.138/32)   SSH allowed
MySQL   TCP        3306         sg-0a75536f                Manager database

▪ Outbound

Type      Protocol   Port Range   Destination                  Description
All TCP   TCP        0 - 65535    Anywhere (0.0.0.0/0, ::/0)   All TCP traffic
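
Security group rules are allow-only, and the rules of all attached groups are aggregated. The sketch below shows how such rules could be evaluated for inbound traffic. It is a simplified model, not AWS's implementation: it matches CIDR sources only, so the sg-0a75536f source from the sample is left out, and the second group is a hypothetical addition.

```python
import ipaddress

# Inbound rules as (protocol, port, source CIDR).
GROUP_A = [("tcp", 22, "190.7.211.138/32")]  # SSH rule from the sample table
GROUP_B = [("tcp", 443, "0.0.0.0/0")]        # hypothetical second group


def inbound_allowed(groups, protocol, port, source_ip):
    """Allow if ANY rule in ANY attached group matches; there are no deny rules."""
    address = ipaddress.ip_address(source_ip)
    for rules in groups:
        for rule_protocol, rule_port, cidr in rules:
            if (rule_protocol == protocol and rule_port == port
                    and address in ipaddress.ip_network(cidr)):
                return True
    return False  # nothing matched, so the traffic is implicitly denied


print(inbound_allowed([GROUP_A, GROUP_B], "tcp", 22, "190.7.211.138"))
print(inbound_allowed([GROUP_A, GROUP_B], "tcp", 22, "198.51.100.7"))
print(inbound_allowed([GROUP_A, GROUP_B], "tcp", 443, "198.51.100.7"))
```

Because security groups are stateful, the response to an allowed inbound request is allowed out without a matching outbound rule; this model does not track that.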


Security Group


Security Group Limits

▪ Default limits

Description                                     Limit
Security groups per network interface           5
Inbound and outbound rules per security group   60
Security groups per region                      2500

▪ Possible to request an increase


Protecting Your Subnets with Network ACLs

▪ Security groups protect EC2 instances
  ─ May need optional extra layer of security

▪ Network access control list (ACL)
  ─ Firewall to control traffic in and out of a subnet

▪ Each VPC comes with a default network ACL
  ─ Allows all inbound and outbound traffic


Network ACL Basics

▪ Create custom network ACL
  ─ Denies all inbound and outbound traffic
    ─ Add rules to allow desired traffic

▪ One network ACL
  ─ Associated with one or more subnets

▪ One subnet
  ─ Associated with only one network ACL

▪ Stateless
  ─ Explicit rule required for inbound and outbound traffic


Network ACL Rules

▪ Separate inbound and outbound rules
  ─ Allow or deny traffic

▪ Parts of a rule
  ─ Number, type, protocol, port range, source or destination, allow or deny

▪ Processed in order
  ─ From 1 to 32766
  ─ Asterisk (*)
    ─ If no other rule matches

▪ Rules may affect other AWS services


Sample Network ACL Rules

▪ Inbound

#    Type          Protocol  Port Range  Source             Allow/Deny
100  HTTPS (443)   TCP       443         0.0.0.0/0          ALLOW
101  SSH (22)      TCP       22          190.7.211.138/32   ALLOW
*    ALL Traffic   ALL       ALL         0.0.0.0/0          DENY

▪ Outbound

#    Type          Protocol  Port Range    Destination  Allow/Deny
100  Custom ICMP   ICMP      Echo Request  0.0.0.0/0    DENY
101  ALL TCP       TCP       0 - 65535     0.0.0.0/0    ALLOW
*    ALL Traffic   ALL       ALL           0.0.0.0/0    DENY
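
The ordered, first-match evaluation of network ACL rules can be sketched in Python. This is a minimal model for illustration, not how AWS implements it: `AclRule` and `evaluate` are hypothetical names, ICMP types are not modeled, and the rule list simply mirrors the sample inbound table.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass(frozen=True)
class AclRule:
    """Hypothetical model of one network ACL entry (not an AWS type)."""
    number: int             # 1 to 32766; lower numbers are checked first
    protocol: str           # "tcp", "icmp", or "all"
    ports: Optional[range]  # None means any port
    cidr: str               # source (inbound) or destination (outbound)
    allow: bool


def evaluate(rules: List[AclRule], protocol: str, port: int, address: str) -> str:
    """Return the action of the first matching rule, in ascending rule order."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.protocol not in ("all", protocol):
            continue
        if rule.ports is not None and port not in rule.ports:
            continue
        if ip_address(address) not in ip_network(rule.cidr):
            continue
        return "ALLOW" if rule.allow else "DENY"
    return "DENY"  # the "*" rule: applies when no numbered rule matches


# The sample inbound rules from the table above.
INBOUND = [
    AclRule(100, "tcp", range(443, 444), "0.0.0.0/0", True),
    AclRule(101, "tcp", range(22, 23), "190.7.211.138/32", True),
]

print(evaluate(INBOUND, "tcp", 443, "203.0.113.10"))  # HTTPS from any address
print(evaluate(INBOUND, "tcp", 22, "203.0.113.10"))   # SSH from an unlisted address
```

Because evaluation stops at the first match, a low-numbered DENY placed before a broader ALLOW takes precedence, which is the pattern the outbound sample uses for ICMP.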


Network ACLs


Comparing Security Groups and Network ACLs

Security Group                     Network ACL
Instance level                     Subnet level
Allow only                         Allow or deny
Stateful                           Stateless
All rules evaluated                Rules processed in order
Applies to associated instances    Applies to all instances in the subnet
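
The stateful versus stateless row is the one that most often causes surprises, so here is a toy Python model of it. Both classes are hypothetical simplifications (ports only, no CIDR checks), and the ephemeral port range 1024 to 65535 used in the usage lines is an assumption chosen to cover typical client ports.

```python
from typing import Iterable, Set, Tuple

Client = Tuple[str, int]  # (client IP, client ephemeral port)


class StatelessFilter:
    """Network-ACL-like toy model: each direction is checked on its own."""

    def __init__(self, inbound_ports: Iterable[int], outbound_ports: Iterable[int]):
        self.inbound_ports = set(inbound_ports)
        self.outbound_ports = set(outbound_ports)

    def allows_request(self, client: Client, server_port: int) -> bool:
        return server_port in self.inbound_ports

    def allows_reply(self, client: Client) -> bool:
        # The reply goes to the client's port, so an outbound rule must cover it.
        return client[1] in self.outbound_ports


class StatefulFilter:
    """Security-group-like toy model: replies to allowed requests pass automatically."""

    def __init__(self, inbound_ports: Iterable[int]):
        self.inbound_ports = set(inbound_ports)
        self._tracked: Set[Client] = set()

    def allows_request(self, client: Client, server_port: int) -> bool:
        if server_port in self.inbound_ports:
            self._tracked.add(client)  # remember the connection
            return True
        return False

    def allows_reply(self, client: Client) -> bool:
        return client in self._tracked


client = ("203.0.113.10", 50000)

acl_without_outbound = StatelessFilter([443], [])
acl_with_outbound = StatelessFilter([443], range(1024, 65536))
security_group = StatefulFilter([443])

for name, fw in [("ACL, no outbound rule", acl_without_outbound),
                 ("ACL, ephemeral ports allowed", acl_with_outbound),
                 ("Security group", security_group)]:
    print(name, fw.allows_request(client, 443), fw.allows_reply(client))
```

In this model the first ACL accepts the HTTPS request but drops the reply, which is why a network ACL needs an explicit rule in each direction.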


Security Groups and Network ACLs


Hands-On Exercise: Setting Up Security Groups and Network ACLs

▪ In this exercise, you will create security groups and network ACLs

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points (1)

▪ Security groups
  ─ Virtual firewall
    ─ Control traffic to and from EC2 instances
  ─ Permissive rules
    ─ Allow
    ─ Type, protocol, port range, source or destination
  ─ Stateful


Essential Points (2)

▪ Network ACLs
  ─ Optional extra layer of security
  ─ Firewall at subnet level
  ─ Rules
    ─ Allow or deny
    ─ Priority, type, protocol, port range, source or destination
  ─ Stateless


Storing Files and Objects: Instance Store, EBS and S3
Chapter 10


Chapter Topics

Storing Files and Objects: Instance Store, EBS and S3

▪ Storing Files and Objects in the Cloud

▪ Amazon EC2 Instance Store

▪ Amazon Elastic Block Store (EBS)

▪ Amazon Elastic File System (EFS)

▪ Amazon Simple Storage Service (S3)

▪ Understanding and Comparing S3 and HDFS

▪ Hands-On Exercise: Storing Data in the Cloud

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ Which storage mechanisms AWS provides for storing data

▪ How to store data using local storage for virtual machines

▪ How to store data using block-level storage for virtual machines

▪ How to store data in an NFS file system

▪ How to store data in a large-object store


Storage in the Cloud

▪ Storage
  ─ Key component of any system

▪ Virtual machine disks
  ─ OS and general files
  ─ Intermediate location for processing

▪ Data storage
  ─ Raw and processed data


Different Types of Storage

▪ Local storage
  ─ Disks physically attached to host machines
  ─ Block storage

▪ Network storage

▪ Object storage
  ─ Repository for large amounts of data
    ─ Data lake
    ─ Backup and archive


The Storage Options Available in AWS

▪ Instance store (ephemeral storage)

▪ Elastic Block Store (EBS)

▪ Elastic File System (EFS)

▪ Simple Storage Service (S3)


Amazon EC2 Instance Store

▪ Ephemeral storage

▪ Temporary block-level storage for EC2 instances
  ─ Not supported by all instance types

▪ Disk physically attached to the host computer

▪ Ideal for temporary storage
  ─ Buffer, cache, or intermediate processing
  ─ Cost included in the instance

▪ Different volume types and sizes available
  ─ Determined by instance type


Amazon EC2 Instance Store Lifetime

▪ Attached only on launch
  ─ Cannot detach and attach to another instance

▪ Data persists during instance lifetime
  ─ Including reboot

▪ Data lost
  ─ Disk failure
  ─ Instance stopped or terminated

▪ Do not use for valuable data


Amazon EC2 Instance Store


Amazon Elastic Block Store (EBS)

▪ Block storage service for EC2 instances
  ─ Highly available block-level storage volumes

▪ Mount one or many volumes
  ─ Each volume attached to only one instance at a time

▪ Data persists independently of the instance lifetime

▪ Detach
  ─ Attach to other instances

▪ Supports live configuration changes
  ─ Volume size change up to petabytes of data


Amazon Elastic Block Store (EBS)

▪ Encryption
  ─ At rest and in transit

▪ Snapshots
  ─ Backup of critical workloads

▪ Replication
  ─ Within availability zone for resiliency

▪ Volumes for different scenarios
  ─ Price vs. performance


EBS Volume Types

▪ SSD
  ─ Optimized for transactional workloads with many read and write operations

Type                         Use Case
General Purpose (gp2)        Price-performance balance for frequently accessed workloads
Provisioned IOPS (io1)       Highest performance for latency-sensitive workloads

▪ HDD
  ─ Optimized for large workloads at a lower cost

Type                         Use Case
Throughput Optimized (st1)   Low-cost volume for frequently accessed workloads
Cold (sc1)                   Lowest cost for infrequently accessed workloads


Amazon Elastic Block Store (EBS)


Amazon Elastic File System (EFS)

▪ NFS file system
  ─ Scalable and fully managed

▪ Network file system
  ─ Access files over the network
  ─ Can be mounted on an on-premises machine

▪ Storage classes
  ─ Standard
  ─ Infrequent access


Create File System

▪ Configure file system access
  ─ VPC and availability zones
  ─ Security group

▪ Configure optional settings
  ─ Tags, lifecycle management, and encryption
  ─ Throughput mode and performance mode

▪ Review and create


Amazon Elastic File System (EFS)


Amazon Simple Storage Service (S3)

▪ S3 is 'storage for the Internet'
  ─ Large object repository
    ─ Data lake
    ─ Text files, binary files, media, and more

▪ Object storage service
  ─ Infinite scalability and high availability
  ─ Secure, low latency, and cost efficient

▪ Does not need to be attached to a virtual machine


Amazon Simple Storage Service (S3)

▪ Feature set
  ─ Focuses on simplicity and robustness

▪ Eventual data consistency
  ─ Changes made to files on S3 may not be visible for some period of time
  ─ S3Guard
    ─ Feature used to address the eventual consistency problem
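
To make "may not be visible for some period of time" concrete, here is a toy Python model of eventual consistency. It is only an illustration of the concept: the class, its replica count, and the explicit `propagate` step are hypothetical and say nothing about how S3 is built internally.

```python
from typing import Dict, List, Optional, Tuple


class EventuallyConsistentStore:
    """Toy store: writes land on replica 0 and reach the others later."""

    def __init__(self, replicas: int = 2):
        self._replicas: List[Dict[str, str]] = [dict() for _ in range(replicas)]
        self._pending: List[Tuple[str, str]] = []

    def put(self, key: str, value: str) -> None:
        self._replicas[0][key] = value
        self._pending.append((key, value))  # not yet visible on other replicas

    def get(self, key: str, replica: int = 0) -> Optional[str]:
        return self._replicas[replica].get(key)

    def propagate(self) -> None:
        """Apply pending writes, in order, to every other replica."""
        for key, value in self._pending:
            for other in self._replicas[1:]:
                other[key] = value
        self._pending.clear()


store = EventuallyConsistentStore()
store.put("report.csv", "v1")
print(store.get("report.csv", replica=1))  # read served by a replica that lags
store.propagate()
print(store.get("report.csv", replica=1))  # read after the write has propagated
```

A reader that happens to hit the lagging replica sees no object (or an older version) until propagation completes, which is the window that tools such as S3Guard were meant to hide from applications.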


S3 Buckets

▪ Bucket is the base storage location
  ─ Similar to a folder
    ─ Stores objects
    ─ Subfolders

▪ Cannot nest buckets

▪ Region-specific
  ─ Created in a region you choose; the bucket name must be globally unique
  ─ Optimize latency, minimize cost, and meet regulatory requirements
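
Although a bucket looks like a folder tree, object keys form a flat namespace, and "subfolders" are just shared key prefixes. The Python sketch below imitates listing with a prefix and a delimiter over an in-memory dictionary; `list_prefix` and the sample keys are hypothetical, and no AWS call is made.

```python
from typing import Dict, List, Tuple


def list_prefix(keys: Dict[str, bytes], prefix: str = "",
                delimiter: str = "/") -> Tuple[List[str], List[str]]:
    """Return (object keys, common prefixes) found directly under `prefix`."""
    objects: List[str] = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter acts like a subfolder name.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


# A flat mapping of keys to object data, standing in for one bucket.
bucket = {
    "index.html": b"<html></html>",
    "logs/readme.txt": b"notes",
    "logs/2020/01/app.log": b"january",
    "logs/2020/02/app.log": b"february",
}

print(list_prefix(bucket))           # top level of the bucket
print(list_prefix(bucket, "logs/"))  # contents of the "logs/" pseudo-folder
```

Nothing has to be created for `logs/2020/` to appear: the pseudo-folder exists only because some keys start with that prefix.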


S3 Objects

▪ Object is the basic storage unit
  ─ Resides in a bucket
  ─ File or collection of data
    ─ Key, version ID, value, and metadata
    ─ Subresources and access control information

▪ Store unlimited number of objects in a bucket

▪ Different tiers
  ─ Storage class
  ─ Vary in price
    ─ Depends on your use case


Storage Class

Class                  Use Case
Standard               Frequently accessed data
Intelligent-Tiering    Long-lived data with changing or unknown access patterns
Standard-IA            Long-lived yet infrequently accessed data
One Zone-IA            Long-lived yet infrequently accessed non-critical data
Glacier                Data archival with varying retrieval options
Glacier Deep Archive   Archival of rarely used data
Reduced Redundancy     Frequently accessed but non-critical data


Amazon Simple Storage Service (S3)


Hadoop Distributed File System (HDFS)

▪ Primary data storage used by Apache Hadoop
  ─ Originally closely tied to MapReduce
  ─ Provides storage layer for many distributed processing frameworks
    ─ MapReduce and Apache Spark

▪ Breaks data into blocks
  ─ Distributes blocks across the cluster nodes
  ─ Replicates data
    ─ Fault-tolerant

▪ 'Bring compute to the data'
  ─ Parallel processing

▪ Things changed with the cloud
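
The block and replication arithmetic behind HDFS can be worked through in a few lines of Python. The function name is hypothetical, and the 128 MiB block size and replication factor of 3 are assumed defaults for illustration; actual clusters may be configured differently.

```python
from typing import Tuple


def hdfs_footprint(file_size: int, block_size: int = 128 * 1024**2,
                   replication: int = 3) -> Tuple[int, int, int]:
    """Return (blocks, block replicas, raw bytes stored) for one file."""
    if file_size < 0 or block_size <= 0 or replication <= 0:
        raise ValueError("sizes and replication must be positive")
    blocks = -(-file_size // block_size)  # ceiling division
    # The last block only occupies the bytes it holds, so raw usage is
    # the file size multiplied by the replication factor.
    return blocks, blocks * replication, file_size * replication


one_gib = 1024**3
blocks, replicas, raw = hdfs_footprint(one_gib)
print(blocks, replicas, raw / one_gib)
```

Under these assumptions a 1 GiB file becomes 8 blocks, stored as 24 block replicas spread over the cluster, and consumes 3 GiB of raw disk. That multiplier is part of why cluster-bound HDFS storage tends to cost more than an object store.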


Comparing S3 and HDFS in the Cloud Era (1)

▪ HDFS requires a running cluster

▪ S3 is independent from compute

▪ Data lake can be shared among multiple clusters
  ─ Transient clusters
  ─ Cost-efficient
  ─ Higher number of machines
    ─ Working in parallel


Comparing S3 and HDFS in the Cloud Era (2)

HDFS                                  S3
Distributed file system               Object store
Bound by cluster available storage    Infinite storage
Higher cost                           Cost effective
Requires a running cluster            Available for transient clusters
Replication makes it fault tolerant   99.999999999% durability & 99.99% availability
Bring the compute to the data         As many instances as needed can access data
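
The durability and availability percentages in the table are easier to judge once converted into expected losses and downtime. The short Python calculation below does that with exact decimal arithmetic; the function names are hypothetical, and treating the figures as simple annual rates is a simplifying assumption.

```python
from decimal import Decimal

MINUTES_PER_YEAR = 365 * 24 * 60


def expected_objects_lost_per_year(num_objects: int,
                                   durability_pct: str = "99.999999999") -> Decimal:
    """Expected number of objects lost in a year at the given durability."""
    loss_rate = 1 - Decimal(durability_pct) / 100
    return num_objects * loss_rate


def downtime_minutes_per_year(availability_pct: str = "99.99") -> Decimal:
    """Minutes per year the service may be unavailable at the given availability."""
    return (1 - Decimal(availability_pct) / 100) * MINUTES_PER_YEAR


print(expected_objects_lost_per_year(10_000_000))  # expected losses for 10 million objects
print(downtime_minutes_per_year())                 # allowed downtime in minutes per year
```

Read this way, eleven nines of durability means that storing 10 million objects leads to an expected loss of about one object every 10,000 years, while 99.99% availability allows a little under an hour of unavailability per year.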


Hands-On Exercise: Storing Data in the Cloud

▪ In this exercise, you will work with the different cloud storage services

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points (1)

▪ Different types of storage

▪ Instance store
  ─ Physically attached disks
  ─ Buffer, cache, or intermediate processing
    ─ Data can be lost in certain scenarios

▪ EBS
  ─ Highly available block-level storage
    ─ Persistent and supports live configuration changes
  ─ gp2, io1, st1, and sc1


Essential Points (2)

▪ EFS
  ─ Scalable and fully managed NFS file system
  ─ Mounted on an instance
  ─ Also available for on-premises machines

▪ S3
  ─ Large object repository with infinite scalability and high availability
  ─ Buckets and objects, with eventual consistency


Storing Relational and Key-Value Data: Amazon RDS and DynamoDB
Chapter 11


Chapter Topics

Storing Relational and Key-Value Data: Amazon RDS and DynamoDB

▪ Databases and Key-Value Stores

▪ Storing Relational Data: Amazon RDS

▪ Storing Key-Value Data: Amazon DynamoDB

▪ Hands-On Exercise: Setting up RDS and DynamoDB

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ Which relational and key-value stores are available in AWS

▪ How to deploy a managed database using RDS

▪ How to create a DynamoDB table


Databases and Key-Value Stores

▪ Data is an integral part of an application

▪ Different types of data
  ─ Raw or processed for analysis
  ─ Configuration and metadata to control applications and services

▪ AWS databases and key-value stores
  ─ Relational Database Service (RDS)
  ─ DynamoDB


Relational Databases

▪ Store structured data
  ─ Configuration and state information
  ─ Health and task progress

▪ Components that use a relational database
  ─ Cloudera Manager, Ambari, and CDP
  ─ Oozie Server, Sqoop Server, Activity Monitor, and Reports Manager
  ─ Hive Metastore Server, Hue Server, and Sentry Server

▪ Install and administer your database
  ─ Or use a managed database


Amazon Relational Database Service (RDS)

▪ Databases in the cloud
  ─ As a managed service

▪ No need to worry about
  ─ Setup, operation, and scaling
  ─ Backups, replication, and software patching
  ─ Failure detection and recovery

▪ License
  ─ Included
  ─ Bring your own license (BYOL)


Amazon Relational Database Service (RDS)

▪ Scale, resiliency, and fault-tolerance
  ─ Multi-AZ deployment (high availability)
  ─ Read-only replicas

▪ Security group to control access
  ─ IP address ranges
  ─ EC2 instances


Instance Types

▪ Optimized for memory, performance, or I/O
  ─ Instance class

▪ Scale each component
  ─ As required
  ─ Independently

▪ Several supported database engines
  ─ MariaDB, MySQL, Oracle, and PostgreSQL


Database Instance

▪ Basic building block
  ─ EC2 instance
    ─ Resources according to requirements

▪ Isolated database environment
  ─ Supports multiple user-created databases
  ─ Accessed with existing tools and applications

▪ Management console


Creating an RDS Database (1/3)

▪ Create database
  ─ Standard Create or Easy Create

▪ Engine options
  ─ Database type
  ─ Edition and version

▪ Template
  ─ Production, Dev/Test, or Free Tier

▪ Settings
  ─ Instance identifier
  ─ Credential settings


Creating an RDS Database (2/3)

▪ DB instance size
  ─ Standard, memory optimized, or burstable

▪ Storage
  ─ Type, allocated storage, and provisioned IOPS
  ─ Autoscaling options

▪ Availability and durability
  ─ Multiple availability zone deployment


Creating an RDS Database (3/3)

▪ Connectivity
  ─ VPC
  ─ Cannot be changed after creation

▪ Database authentication
  ─ Password
  ─ IAM users and roles

▪ Additional configuration

▪ Estimated costs
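
The console choices on the three "Creating an RDS Database" slides correspond roughly to parameters of the RDS CreateDBInstance API. The sketch below only assembles such a parameter set as a plain Python dictionary and does not contact AWS. The identifier, user name, engine, instance classes, storage sizes, subnet group name, and security group ID are all placeholder assumptions chosen for illustration; with credentials in place, a dictionary like this could be passed to a client call such as boto3's `rds.create_db_instance(**params)`.

```python
def build_rds_params(identifier, master_password, production=False):
    """Assemble CreateDBInstance-style parameters (every value is a placeholder)."""
    return {
        # Settings: instance identifier and credential settings
        "DBInstanceIdentifier": identifier,
        "MasterUsername": "admin",
        "MasterUserPassword": master_password,
        # Engine options: database type (version left to the service default)
        "Engine": "mysql",
        # DB instance size: burstable for dev/test, memory optimized for production
        "DBInstanceClass": "db.r5.large" if production else "db.t3.micro",
        # Storage: type, allocated size in GiB, and an autoscaling ceiling
        "StorageType": "gp2",
        "AllocatedStorage": 100 if production else 20,
        "MaxAllocatedStorage": 500 if production else 100,
        # Availability and durability: Multi-AZ only for production here
        "MultiAZ": production,
        # Connectivity: subnet group (which determines the VPC) and security group
        "DBSubnetGroupName": "example-subnet-group",
        "VpcSecurityGroupIds": ["sg-0123456789abcdef0"],
        # Database authentication: allow IAM users and roles as well as passwords
        "EnableIAMDatabaseAuthentication": True,
    }


dev = build_rds_params("example-dev-db", "change-me-please")
prod = build_rds_params("example-prod-db", "change-me-please", production=True)
print(dev["DBInstanceClass"], dev["MultiAZ"])
print(prod["DBInstanceClass"], prod["MultiAZ"])
```

Keeping the Dev/Test and Production differences in one function mirrors the template choice in the console: the same settings are filled in either way, and only instance size, storage, and Multi-AZ change.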


Relational Database Service (RDS)


Key-Value Stores

▪ Relational databases can solve many problems
  ─ Not suitable for all
  ─ Alternatives required for other types of problems

▪ Key-value store
  ─ Type of nonrelational database
  ─ NoSQL store


How Key-Value Stores Work

▪ Data stored as key-value pairs
  ─ Key is the unique identifier

▪ Keys and values
  ─ Simple or complex objects

▪ Highly partitionable

▪ Horizontal scaling
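
To make the points above concrete, here is a minimal in-memory sketch of a partitioned key-value store. It is a teaching toy, not how DynamoDB or any particular product is implemented: the class name, the MD5-based partitioning, and the sample keys are assumptions made for illustration. Each key is hashed to choose a partition, which is why such stores partition easily and can scale horizontally by placing partitions on different machines.

```python
import hashlib


class PartitionedKVStore:
    """Toy key-value store that spreads keys across a fixed number of partitions."""

    def __init__(self, num_partitions=4):
        # Each partition is an independent dict; in a real system each one
        # could live on a separate server.
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition_for(self, key):
        # Hash the key's repr so the choice is stable across runs.
        digest = hashlib.md5(repr(key).encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key, default=None):
        return self.partitions[self._partition_for(key)].get(key, default)


store = PartitionedKVStore(num_partitions=4)
# Keys and values can be simple or complex objects.
store.put("user:42", {"name": "Ada", "roles": ["admin"]})
store.put(("order", 7), b"\x00\x01")
print(store.get("user:42")["name"])
print(sum(len(p) for p in store.partitions))
```

A lookup touches only the one partition that owns the key, so adding partitions adds capacity without making individual reads or writes slower.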


Amazon DynamoDB

▪ Managed key-value store

▪ Serverless

▪ Enterprise ready

▪ Utilized for S3Guard
  ─ Consistent store of metadata
  ─ For objects in an S3 bucket
    ─ Eventual consistency
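
The idea behind keeping a consistent metadata store next to an eventually consistent object store can be sketched with a toy model. The code below is not S3Guard's implementation and uses no AWS API; both classes and the `sync` method are hypothetical names. It only illustrates the principle: every write is also recorded in a consistent table (standing in for DynamoDB), and listings merge that table with whatever the object store currently reports.

```python
class EventuallyConsistentListing:
    """Toy object store whose listing lags behind writes until sync() runs."""

    def __init__(self):
        self._objects = {}
        self._visible = set()

    def put(self, key, data):
        # The object is stored immediately...
        self._objects[key] = data

    def sync(self):
        # ...but only appears in listings after this simulated delay.
        self._visible = set(self._objects)

    def list_keys(self):
        return set(self._visible)


class GuardedStore:
    """Records each write in a consistent metadata table and merges it into listings."""

    def __init__(self, object_store):
        self.object_store = object_store
        self.metadata = {}  # stands in for the DynamoDB table

    def put(self, key, data):
        self.object_store.put(key, data)
        self.metadata[key] = {"size": len(data)}

    def list_keys(self):
        # Union of what the object store lists and what the metadata table knows.
        return self.object_store.list_keys() | set(self.metadata)


raw = EventuallyConsistentListing()
guarded = GuardedStore(raw)
guarded.put("logs/part-0000", b"abc")
print(raw.list_keys())      # the new object is not listed yet
print(guarded.list_keys())  # the metadata table already knows about it
```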


Create Table

▪ Table name

▪ Partition key
  ─ String, binary, number

▪ Settings
  ─ Default
  ─ Custom
    ─ Secondary indexes and read/write capacity mode
    ─ Provisioned capacity, auto scaling, encryption, and tags
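
The Create Table form corresponds roughly to the parameters of the DynamoDB CreateTable API. As a hedged sketch, the function below builds such a parameter set without contacting AWS. The table name, key name, tag, and capacity figures are placeholders invented for this example; the resulting dictionary could be handed to a client call such as boto3's `dynamodb.create_table(**params)`.

```python
def build_table_params(table_name, partition_key, key_type="S", on_demand=False):
    """Assemble CreateTable-style parameters for a table with a single partition key."""
    # Partition keys must be a string (S), number (N), or binary (B) attribute.
    if key_type not in ("S", "N", "B"):
        raise ValueError("key_type must be 'S', 'N', or 'B'")
    params = {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": key_type}
        ],
        # KeyType HASH marks the attribute as the partition key.
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        "Tags": [{"Key": "course", "Value": "cloud-fundamentals"}],
    }
    if on_demand:
        # Read/write capacity mode: on-demand
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        # Read/write capacity mode: provisioned, with placeholder capacity units
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        }
    return params


table = build_table_params("example-table", "id")
print(table["KeySchema"])
print(table["BillingMode"])
```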


Amazon DynamoDB


Hands-On Exercise: Setting up RDS and DynamoDB

▪ In this exercise, you will create an RDS instance and a DynamoDB table

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points

▪ Clusters require data stores
  ─ Store data for analysis
  ─ Configuration and metadata

▪ Databases and key-value stores

▪ Relational Database Service (RDS)
  ─ Managed relational database
  ─ Focus on your data
    ─ Not on the database administration
  ─ Cloudera supported databases available
  ─ DB instance classes
    ─ Scale as needed


Essential Points

▪ DynamoDB
  ─ Managed key-value store
  ─ Utilized for S3Guard
    ─ Metadata about S3 objects
    ─ Solves the "eventual consistency" problem


Migrating Data to the Cloud
Chapter 12


Course Chapters

▪ Introduction
▪ An Overview of the Cloud with Cloudera
▪ Getting Started with the Cloud
▪ Estimating, Managing, and Monitoring Costs
▪ Understanding Cloud Security: Amazon Web Services
▪ Regions and Availability Zones
▪ Networking
▪ Computing Power in AWS
▪ Protecting Your Infrastructure: Security Groups & Network ACLs
▪ Storing Files and Objects: Instance Store, EBS and S3
▪ Storing Relational and Key-Value Data: Amazon RDS and DynamoDB
▪ Migrating Data to the Cloud
▪ Modeling Infrastructure Using AWS CloudFormation


Chapter Topics

Migrating Data to the Cloud

▪ Migrating Data to the Cloud

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ Which traditional methods are available for migrating data to the cloud

▪ Which tools AWS provides for online data migration

▪ Which methods AWS provides for offline bulk data migration


Migrating to the Cloud

▪ Cloud adoption
  ─ Clusters
    ─ Lift-and-shift
    ─ Cloud-native

▪ Transfer data to the cloud
  ─ Data migration
    ─ To data lake or other storage type

▪ Different ways to migrate data to AWS
  ─ Online or offline


Traditional Online Data Migration

▪ Traditional tools
  ─ SFTP, SCP, Rsync, and other tools

▪ Inbound data transfer is free
  ─ Upload data
    ─ Not the most efficient mechanism
    ─ Especially for large amounts of data
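
A quick back-of-the-envelope calculation shows why uploading over an ordinary network link becomes impractical for large data sets. The function below is a rough sketch: the 1 Gbps link speed and the 80% sustained utilization are assumptions for illustration, not measured figures, and protocol overhead is ignored.

```python
def transfer_days(data_terabytes, link_mbps, utilization=0.8):
    """Estimate the days needed to upload data over a link of the given speed."""
    bits = data_terabytes * 1e12 * 8                   # decimal terabytes to bits
    effective_bps = link_mbps * 1e6 * utilization      # usable bits per second
    return bits / effective_bps / 86400                # seconds to days


for tb in (1, 100, 1000):
    print(f"{tb:>5} TB over 1 Gbps: {transfer_days(tb, 1000):.1f} days")
```

Under these assumptions 100 TB takes roughly 11.6 days of continuous transfer, and a petabyte takes about ten times as long.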


Online Data Migration using AWS Direct Connect

▪ Dedicated network connection
  ─ On-premises to AWS

▪ Compatible with all AWS services

▪ Reduces bandwidth costs

▪ Private, secure, and cost-efficient


Online Data Migration using AWS DataSync

▪ AWS DataSync
  ─ Automate moving data from on-premises
    ─ To S3 and EFS

▪ Deploy agent as VM on-premises
  ─ In charge of moving your data from on-prem to AWS
  ─ Connect to your storage
  ─ Specify destination
  ─ Process starts
    ─ Preserves metadata
    ─ Integrity checks


Offline Data Migration with AWS Snowball

▪ AWS Snowball
  ─ Petabyte-scale data transport solution
  ─ Secure appliances
  ─ Transport data in and out of AWS
  ─ Solves many data transfer limitations


Offline Data Migration with AWS Snowball Edge

▪ AWS Snowball Edge
  ─ Similar to Snowball
  ─ Provides on-board storage and computing capabilities
  ─ Processing at the edge


Offline Data Migration with AWS Snowmobile

▪ Snowmobile
  ─ Exabyte-scale data transport solution
  ─ 45-foot shipping container pulled by a semi-trailer truck
  ─ Extremely cost-efficient for very large uploads
  ─ Requires a custom engagement


Essential Points

▪ Cloud adoption
  ─ Clusters
  ─ Data

▪ Upload data using traditional methods
  ─ Inbound data is free

▪ Online and offline data migration tools
  ─ Online
    ─ Direct Connect and DataSync
  ─ Offline
    ─ Snowball, Snowball Edge, and Snowmobile


Modeling Infrastructure Using AWS CloudFormation
Chapter 13


Course Chapters

▪ Introduction
▪ An Overview of the Cloud with Cloudera
▪ Getting Started with the Cloud
▪ Estimating, Managing, and Monitoring Costs
▪ Understanding Cloud Security: Amazon Web Services
▪ Regions and Availability Zones
▪ Networking
▪ Computing Power in AWS
▪ Protecting Your Infrastructure: Security Groups & Network ACLs
▪ Storing Files and Objects: Instance Store, EBS and S3
▪ Storing Relational and Key-Value Data: Amazon RDS and DynamoDB
▪ Migrating Data to the Cloud
▪ Modeling Infrastructure Using AWS CloudFormation


Chapter Topics

Modeling Infrastructure Using AWS CloudFormation

▪ Modeling Infrastructure with AWS CloudFormation

▪ Hands-On Exercise: Deploying Infrastructure with CloudFormation

▪ Essential Points


Chapter Objectives

In this chapter, you will learn

▪ How to model infrastructure-as-code in AWS using CloudFormation

▪ How to deploy infrastructure using text files with CloudFormation


Modeling Infrastructure with AWS CloudFormation

▪ Model and provision infrastructure
  ─ Using text files

▪ Treat your infrastructure as code

▪ Automate deployments
  ─ No need to manually create resources

▪ Same result, every time
  ─ Quickly replicate infrastructure
  ─ Consistent and repeatable fashion

▪ Free service


Describing and Creating Clusters With CloudFormation (1)

▪ Create a template
  ─ JSON or YAML
    ─ Add to source control, version, and share

▪ Specify resources
  ─ Created in the correct order

▪ CloudFormation makes all the necessary underlying API calls
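
As an illustration of what such a template can look like, the sketch below builds a very small one as a Python dictionary and serializes it to JSON. The logical ID `ExampleTable`, the parameter default, and the choice of a DynamoDB table as the only resource are assumptions made for this example; the course's own exercise templates are not reproduced here.

```python
import json

# A minimal CloudFormation-style template with one parameter, one resource,
# and one output. All names and values are placeholders.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative stack: one DynamoDB table",
    "Parameters": {
        "TableName": {"Type": "String", "Default": "example-table"}
    },
    "Resources": {
        "ExampleTable": {
            "Type": "AWS::DynamoDB::Table",
            "Properties": {
                # Ref pulls in the parameter value at deployment time.
                "TableName": {"Ref": "TableName"},
                "BillingMode": "PAY_PER_REQUEST",
                "AttributeDefinitions": [
                    {"AttributeName": "id", "AttributeType": "S"}
                ],
                "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
            },
        }
    },
    "Outputs": {
        "TableArn": {"Value": {"Fn::GetAtt": ["ExampleTable", "Arn"]}}
    },
}

template_body = json.dumps(template, indent=2)
print(template_body)
```

Because the result is plain text, it can be saved to a file, committed to source control, and submitted to CloudFormation through the console or an API client, which then issues the underlying calls to create the table.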


Describing and Creating Clusters With CloudFormation (2)

▪ Collection of resources managed as a unit
  ─ Called a stack

▪ Delete the stack when it is no longer required
  ─ All resources decommissioned
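
The notion of resources being created in the correct order and removed together can be modeled with a dependency graph. The toy sketch below is not CloudFormation's internal algorithm; the resource names and their dependencies are hypothetical. It uses Python's standard `graphlib` module (Python 3.9 or later) to derive a creation order in which every resource comes after the resources it depends on, and then reverses that order for deletion.

```python
from graphlib import TopologicalSorter

# Hypothetical stack: each resource maps to the resources it depends on.
dependencies = {
    "Vpc": set(),
    "Subnet": {"Vpc"},
    "SecurityGroup": {"Vpc"},
    "DbSubnetGroup": {"Subnet"},
    "Database": {"DbSubnetGroup", "SecurityGroup"},
}

# static_order() yields dependencies before the resources that need them.
create_order = list(TopologicalSorter(dependencies).static_order())
# Deleting in reverse removes dependents before the things they rely on.
delete_order = list(reversed(create_order))

print("create:", create_order)
print("delete:", delete_order)
```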


AWS CloudFormation


Hands-On Exercise: Deploying Infrastructure with CloudFormation

▪ In this exercise, you will deploy infrastructure with text files using CloudFormation

▪ Please refer to the Hands-On Exercise Manual for instructions


Essential Points

▪ Describe and provision infrastructure
  ─ As code
    ─ JSON or YAML
    ─ Add to source control, version, and share
  ─ Replicate infrastructure
    ─ Consistent and repeatable fashion

▪ Stack
  ─ Specify resources
    ─ Single unit
    ─ Created in the correct order
    ─ Deprovisioned together
