AWS Webcast - Design for Availability

Date post: 18-Jul-2015
Design for Availability Joel Williams, Solutions Architect, AWS March 18, 2015
Transcript

Design for Availability

Joel Williams, Solutions Architect, AWS

March 18, 2015

Designing for Availability

ME: Joel Williams, Solutions Architect at Amazon Web Services

YOU: here to learn more about designing your applications for high availability on AWS

TODAY: best practices and things to think about when building a highly available application on AWS

What is High Availability?

Availability: Percentage of time an application operates during its work cycle

Loss of availability is known as an outage or downtime

• App is offline, unreachable, or partially available

• App is slow to use

• Planned and unplanned

Goal

• No downtime

• Always available
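The definition above can be turned into simple arithmetic. The following is a minimal sketch, not part of the original talk; the function names and the choice of a one-year work cycle are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumed work cycle: one non-leap year


def availability_percent(uptime_hours: float, total_hours: float) -> float:
    """Availability as the percentage of the work cycle the app was operating."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return 100.0 * uptime_hours / total_hours


def allowed_downtime_hours(target_percent: float,
                           total_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime budget implied by an availability target over the work cycle."""
    return total_hours * (1.0 - target_percent / 100.0)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% availability allows {hours:.2f} hours of downtime per year")
```

Planned and unplanned outages both count against the same budget, which is why the goal of "no downtime" drives the design principles that follow.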

Availability is related to

Scalability

• Ability of an application to accommodate growth without changing design

• If app cannot scale, availability may be impacted

• Scalability doesn’t guarantee availability

Fault Tolerance

• Built-in redundancy so apps can continue functioning when components fail

• Fault tolerance is crucial to HA

AWS democratizes High Availability

• Multiple servers, isolated redundant data centers, regions across the globe, fault tolerant services, etc.

AWS GLOBAL INFRASTRUCTURE

Global Infrastructure

AWS Regions and Availability Zones

Customer Decides Where Applications and Data Reside

Reference Model

[Diagram: AWS reference model with the layers AWS Global Infrastructure; Compute, Storage, Database, Networking; App Services; Deployment & Administration]

AWS BUILDING BLOCKS

Inherently Highly Available and Fault Tolerant Services

• Amazon S3
• Amazon DynamoDB
• Amazon CloudFront
• Amazon Route53
• Elastic Load Balancing
• Amazon SQS
• Amazon SNS
• Amazon SES
• Amazon SWF

Highly Available with the right architecture

• Amazon EC2
• Amazon EBS
• Amazon RDS
• Amazon VPC

1. DESIGN FOR FAILURE

2. MULTIPLE AVAILABILITY ZONES

3. SCALING

4. SELF-HEALING

5. LOOSE COUPLING

Principles of Designing for Availability

LET’S BUILD A HIGHLY AVAILABLE SYSTEM

Vertical Scaling

Elastic Compute Cloud (EC2)

Basic unit of compute capacity

Range of CPU, memory & local disk options

42 instance types available from 16 different families

From $0.02/hr

Feature: Details
• Flexible: Run Windows or Linux distributions
• Scalable: Wide range of instance types from micro to cluster compute
• Machine Images: Configurations can be saved as machine images (AMIs) from which new instances can be created
• Full control: Full root or administrator rights
• Secure: Full firewall control via Security Groups
• Monitoring: Publishes metrics to CloudWatch
• Inexpensive: On-Demand, Reserved and Spot purchasing options
• VM Import/Export: Import and export VM images to transfer configurations in and out of EC2

** MANY NEW INSTANCE TYPES

[Diagram, built up over several slides: a single web server EC2 instance and an RDS DB instance, then an internet gateway and Elastic IP, then Route 53 DNS resolution of www.example.com for the user]

#1 DESIGN FOR FAILURE

●○○○○

« Everything fails all the time »

Werner Vogels, CTO of Amazon

AVOID SINGLE POINTS OF FAILURE

ASSUME EVERYTHING FAILS, AND WORK BACKWARDS

YOUR GOAL: Applications should continue to function

AMAZON EBS: ELASTIC BLOCK STORE

Elastic Block Store

High performance block storage device

1GB to 1TB in size

Mount as drives to instances

Feature: Details
• High performance file system: Mount EBS as drives and format as required
• Flexible size: Volumes from 1GB to 1TB in size
• Secure: Private to your instances
• Performance: Use provisioned IOPS to get desired level of IO performance
• Available: Replicated within an Availability Zone
• Backups: Volumes can be snapshotted for point-in-time restore
• Monitoring: Detailed metrics captured via CloudWatch

[Diagram sequence: the single-server architecture with an EBS volume on the web server EC2 instance and an EBS snapshot. The final builds show a second EC2 instance next to the Elastic IP]

AMAZON ELB: ELASTIC LOAD BALANCING

Elastic Load Balancing

Create highly scalable applications

Distribute load across EC2 instances in multiple Availability Zones

Feature: Details
• Auto-scaling: Automatically scales to handle request volume
• Available: Load balance across instances in multiple Availability Zones
• Health checks: Automatically checks health of instances and takes them in or out of service
• Session stickiness: Route requests to the same instance
• Secure sockets layer: Supports SSL offload from web and application servers with flexible cipher support
• Monitoring: Publishes metrics to CloudWatch
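The health-check behavior listed above (taking instances in or out of service) can be illustrated with a small simulation. This is a sketch under assumptions, not the load balancer's actual implementation: the class name, the default thresholds of two consecutive results, and the idea of counting consecutive disagreeing checks are all illustrative choices.

```python
from dataclasses import dataclass


@dataclass
class TargetHealth:
    """Tracks whether one instance should receive traffic (hypothetical model)."""
    healthy_threshold: int = 2      # consecutive passes needed to return to service
    unhealthy_threshold: int = 2    # consecutive failures needed to leave service
    in_service: bool = True
    _streak: int = 0                # consecutive results that disagree with the current state

    def record(self, check_passed: bool) -> bool:
        """Record one health-check result and return the resulting in-service state."""
        if check_passed == self.in_service:
            self._streak = 0        # result agrees with current state: reset the streak
            return self.in_service
        self._streak += 1
        needed = self.healthy_threshold if check_passed else self.unhealthy_threshold
        if self._streak >= needed:
            self.in_service = check_passed
            self._streak = 0
        return self.in_service


if __name__ == "__main__":
    target = TargetHealth()
    for result in (True, False, False, True, True):
        print(result, "->", "in service" if target.record(result) else "out of service")
```

Requiring several consecutive results before changing state keeps a single slow response from removing an otherwise healthy instance.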

** NEW CONNECTION DRAINING AND NEW ACCESS LOGS

[Diagram sequence: Elastic Load Balancing in front of EC2 instances in an Auto Scaling group. In the running example, Elastic Load Balancing and several web server EC2 instances take the place of the single web server and its Elastic IP, with the RDS DB instance, internet gateway, and Route 53 as before]

HEALTH CHECKS

[Diagram sequence: Elastic Load Balancing health checks against the web server EC2 instances]

#2 MULTIPLE AVAILABILITY ZONES

●●○○○

AMAZON RDS MULTI-AZ

Relational Database Service

Database-as-a-Service

No need to install or manage database instances

Scalable and fault tolerant configurations

Feature: Details
• Platform support: Create MySQL, SQL Server, PostgreSQL and Oracle RDBMS
• Preconfigured: Get started instantly with sensible default settings
• Automated patching: Keep your database platform up to date automatically
• Backups: Automatic backups, point-in-time recovery and full DB backups
• Provisioned IOPS: Specify IO throughput depending on requirements
• Failover: Automated failover to slave hosts in the event of a failure
• Replication: Easily create read replicas of your data and seamlessly replicate data across Availability Zones

Icons: RDS DB instance; RDS DB instance standby (Multi-AZ); RDS DB instance read replica

[Diagram sequence: the web servers behind Elastic Load Balancing and the RDS DB instance in Availability Zone A, with synchronous replication to an RDS DB slave in Availability Zone B]

AMAZON ELB AND MULTIPLE AZs

[Diagram sequence: Elastic Load Balancing spreads requests across web server EC2 instances in Availability Zone A and Availability Zone B; the RDS DB instance replicates synchronously to the RDS DB slave in the other zone]

#3 SCALING

●●●○○

[Diagram sequence: the multi-AZ architecture with four web server EC2 instances behind Elastic Load Balancing]

AUTO SCALING: SCALE UP/DOWN EC2 CAPACITY

Auto Scaling

Automatic re-sizing of compute clusters based upon demand

Feature: Details
• Control: Define minimum and maximum instance pool sizes and when scaling and cool down occurs
• Integrated to CloudWatch: Use metrics gathered by CloudWatch to drive scaling
• Instance types: Run Auto Scaling for On-Demand and Spot instances. Compatible with VPC

as-create-auto-scaling-group MyGroup \
    --launch-configuration MyConfig \
    --availability-zones eu-west-1a \
    --min-size 4 \
    --max-size 200
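To make the control knobs above concrete, here is a minimal sketch of how a scaling decision could combine a metric, the minimum and maximum pool sizes, and a cooldown. It is not how the Auto Scaling service is implemented; the CPU thresholds, the step of two instances, and the 300-second cooldown are assumptions chosen for illustration. The minimum of 4 and maximum of 200 mirror the command above.

```python
from typing import Optional


def desired_capacity(current: int, cpu_percent: float, *,
                     min_size: int = 4, max_size: int = 200,
                     scale_out_above: float = 70.0, scale_in_below: float = 30.0,
                     step: int = 2,
                     seconds_since_last_activity: Optional[float] = None,
                     cooldown_seconds: float = 300.0) -> int:
    """Return the instance count a simple threshold policy would ask for."""
    in_cooldown = (seconds_since_last_activity is not None
                   and seconds_since_last_activity < cooldown_seconds)
    target = current
    if not in_cooldown:
        if cpu_percent > scale_out_above:
            target = current + step      # scale out under load
        elif cpu_percent < scale_in_below:
            target = current - step      # scale in when idle
    # Never leave the configured pool size bounds.
    return max(min_size, min(max_size, target))


if __name__ == "__main__":
    print(desired_capacity(4, 85.0))    # busy: asks for more instances
    print(desired_capacity(6, 10.0))    # idle: asks for fewer instances
    print(desired_capacity(4, 85.0, seconds_since_last_activity=60))  # still cooling down
```

The cooldown exists so that instances launched by one scaling activity have time to take load before the policy is evaluated again.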

** NEW CONSOLE

[Diagram sequence: the four web server EC2 instances in both Availability Zones are placed in an Auto Scaling group. An Auto Scaling policy fires, two new EC2 instances launch from the AMI and join the group behind Elastic Load Balancing (six instances), and later two instances terminate so the group returns to four]

Scaling: Data Tier

RDS - Push-Button Scaling

Scale up or down to the desired instance class

Scale up to an 8-core server with 244 GB of RAM with the cr1.8xlarge

Scaling READS

Scale out with one or more read servers in a master-slave architecture

Use cases:

• Reporting and ETL

• Discrete read/write transactions (browsers vs buyers)

Scaling READS: Tech tips

• Optimize master for OLTP and read slaves for table scans

• Resize slaves as needed to boost reporting performance

• Use short-term slaves to save cost during monthly reporting

• Promote to standalone server

• NEW - Cross Region Read Replicas with MySQL

Scaling for Writes on the Data Tier

At large scale, you may start to run into contention on writes to the master database.

How can you solve it?

• Federation (splitting into multiple DBs based on function)

• Sharding (splitting one data set across multiple hosts)

• Moving some functionality to other types of DBs (NoSQL)

Database Federation

• Split up databases by function/purpose

• Harder to do cross-function queries

• Essentially delays the need for something like sharding / NoSQL until much further down the line

• Won’t help with single huge functions/tables

Example: ForumsDB, UsersDB, ProductsDB

Sharded Horizontal Scaling

• More complex at the application layer
• ORM support can help
• No practical limit on scalability
• Operational complexity/sophistication
• Shard by function or key space
• RDBMS or NoSQL

User     ShardID
002345   A
002346   B
002347   C
002348   B
002349   A

Shards: A, B, C
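The user-to-shard table above suggests a directory lookup at the application layer. The sketch below shows one way that could look; the function name is hypothetical, and the hash-based fallback for users not yet in the directory is an added assumption, not something the slide specifies.

```python
import hashlib

SHARDS = ("A", "B", "C")

# Directory entries copied from the example table above.
SHARD_DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_for_user(user_id: str) -> str:
    """Return the shard that holds this user's data."""
    known = SHARD_DIRECTORY.get(user_id)
    if known is not None:
        return known
    # Assumed fallback: place unknown users by a stable hash of the key,
    # so the same user always maps to the same shard.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


if __name__ == "__main__":
    for uid in ("002345", "002347", "999999"):
        print(uid, "->", shard_for_user(uid))
```

A directory makes it possible to move individual users between shards, at the cost of one more lookup per request; pure hashing avoids the lookup but makes rebalancing harder.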

#4 SELF-HEALING

●●●●○

HEALTH CHECKS + AUTO SCALING

[Diagram sequence: four web server EC2 instances in the Auto Scaling group across both Availability Zones; in one build a new EC2 instance is launching, after which the group again shows four instances]

HEALTH CHECKS + AUTO SCALING = SELF-HEALING
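The equation above can be read as a reconcile loop: drop the instances that fail their health checks, then launch replacements until the group is back at its desired size. The following is a simplified sketch of that idea; the function name, the fleet representation, and the `launch` callback are hypothetical and stand in for what the managed services do.

```python
from itertools import count
from typing import Callable, Dict


def reconcile(fleet: Dict[str, bool], desired: int,
              launch: Callable[[], str]) -> Dict[str, bool]:
    """Return a fleet with unhealthy instances replaced.

    fleet maps instance id -> result of its latest health check.
    launch starts a new instance and returns its id (hypothetical callback).
    """
    survivors = {iid: True for iid, healthy in fleet.items() if healthy}
    while len(survivors) < desired:
        survivors[launch()] = True
    return survivors


if __name__ == "__main__":
    ids = count(1)
    fleet = {"i-a": True, "i-b": False, "i-c": True, "i-d": True}
    healed = reconcile(fleet, desired=4, launch=lambda: f"new-{next(ids)}")
    print(sorted(healed))
```

Because the loop only compares the current state with the desired size, it does not matter why an instance became unhealthy; the response is the same.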

DEGRADED MODE

AMAZON S3 STATIC WEBSITE + AMAZON ROUTE 53 DNS Failover

[Diagram sequence: Route 53 resolves www.example.com to the load-balanced application, with an S3 static website for www.example.com as the DNS failover target]

#5 LOOSE COUPLING

●●●●●

BUILD LOOSELY COUPLED SYSTEMS

The looser they are coupled, the bigger they scale, the more fault tolerant they get…

Service-Oriented Architecture (SOA)

Move services into their own tiers/modules. Treat each of these as wholly separate pieces of your infrastructure and scale them independently.

Amazon.com and AWS do this extensively! It offers flexibility and greater understanding of each component.

Loose coupling sets you free!

• Independent components

• Design everything as a black box

• Decouple interactions

• Favor services with built-in redundancy and scalability over building your own

AMAZON SQS: SIMPLE QUEUE SERVICE

Amazon SQS

Reliable, highly scalable queue service for storing messages as they travel between instances

Feature: Details
• Reliable: Messages stored redundantly across multiple Availability Zones
• Simple: Simple APIs to send and receive messages
• Scalable: Unlimited number of messages
• Secure: Authentication of queues to ensure controlled access

[Diagram: one instance puts messages on an SQS queue and another instance gets them. An Amazon SNS topic publishes notifications, and a queue is subscribed to the topic. Pipeline stages RECEIVE, CREATE THUMBS, and PUBLISH & NOTIFY are connected by SQS queues]

Photo CMS with SQS

[Diagram: user, Route 53 (www.example.com), S3 bucket, webservers / CMS, SQS, and workers, with numbered arrows for steps 1 to 6]

1) User / browser posts photo to S3 and is redirected to form on webservers
2) User completes form for photo and submits
3) Message is sent to SQS
4) Worker long polling SQS grabs message and creates different size photo assets
5) Thumbs are uploaded to S3 bucket
6) Worker updates database with photo assets

VISIBILITY TIMEOUT

[Diagram sequence: Photo CMS with SQS, same steps 1 to 6. A worker picks up the message at step 4; in the next build the message reappears in the queue; in the last build a worker picks up the message at step 4 again]
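The visibility timeout can be illustrated with a small in-memory model: a received message is hidden rather than removed, and it becomes visible again unless the worker deletes it in time. This is a sketch for intuition only, not the SQS API; the class, its methods, and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are assumptions.

```python
from typing import Dict, Optional, Tuple


class VisibilityQueue:
    """Toy queue with a visibility timeout (hypothetical, in-memory)."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        self._messages: Dict[int, Tuple[str, float]] = {}  # id -> (body, invisible_until)
        self._next_id = 0

    def send(self, body: str, now: float) -> int:
        message_id = self._next_id
        self._next_id += 1
        self._messages[message_id] = (body, now)  # visible immediately
        return message_id

    def receive(self, now: float) -> Optional[Tuple[int, str]]:
        """Return one visible message and hide it for the timeout period."""
        for message_id, (body, invisible_until) in self._messages.items():
            if invisible_until <= now:
                self._messages[message_id] = (body, now + self.visibility_timeout)
                return message_id, body
        return None

    def delete(self, message_id: int) -> None:
        """Called by the worker only after the work has succeeded."""
        self._messages.pop(message_id, None)


if __name__ == "__main__":
    queue = VisibilityQueue(visibility_timeout=30)
    queue.send("resize photo-1", now=0)
    print(queue.receive(now=0))    # worker 1 takes the message, then crashes
    print(queue.receive(now=10))   # still hidden from other workers
    print(queue.receive(now=31))   # timeout passed: the message reappears
```

Since a message can be delivered more than once in this model, the work in step 4 should be safe to repeat, for example by writing thumbnails to deterministic keys.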

CLOUDWATCH METRICS FOR AMAZON SQS + AUTO SCALING

Photo CMS: Scaling with SQS

[Diagram: the same Photo CMS flow, steps 1 to 6, with a backlog of messages in SQS and two Auto Scaling groups]
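Scaling workers on queue depth can be reduced to a small calculation: divide the backlog by what one worker can absorb and clamp the result to the group's bounds. The sketch below assumes the backlog figure comes from a queue-depth metric such as the number of visible messages; the per-worker capacity of 100 messages and the bounds of 1 and 20 workers are made-up values for illustration.

```python
import math


def workers_for_backlog(visible_messages: int, per_worker: int = 100,
                        min_workers: int = 1, max_workers: int = 20) -> int:
    """Return how many workers a backlog-based policy would ask for."""
    if per_worker <= 0:
        raise ValueError("per_worker must be positive")
    needed = math.ceil(visible_messages / per_worker)
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for backlog in (0, 250, 10_000):
        print(backlog, "messages ->", workers_for_backlog(backlog), "workers")
```

Because the queue absorbs bursts, the worker tier can scale on backlog independently of how the web tier scales on request load.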

Lambda

Event driven compute

Connective tissue for AWS services

Feature: Details
• Stateless: Request driven code called Lambda functions, triggered by events
• Easy: Fixed OS and language - JavaScript
• Management: AWS owns and manages the infrastructure
• Scaling: Implicit scaling; just make requests

[Diagram: Lambda event sources]

• S3 bucket, push: event notification
• DynamoDB, pull: DynamoDB stream
• Kinesis, pull: Kinesis stream

Photo CMS with Lambda

[Diagram: user, Route 53 (www.example.com), S3 bucket, webservers / CMS, and Lambda, with numbered arrows for steps 1 to 5]

1) User / browser posts photo to S3 and is redirected to form on webservers
2) The redirected user completes form for photo and submits
3) At the same time as the redirect, S3 event notifications fire off and are received by Lambda
4) Lambda creates different size photo assets and uploads them to S3
5) Lambda updates database with photo assets
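Steps 3 and 4 hinge on reading the S3 event notification. The sketch below is written in Python for consistency with the other examples here, whereas the talk lists JavaScript as the Lambda language, so treat it as an illustration of the event handling rather than a deployable function. The thumbnail sizes and the `thumbs/<size>/` key layout are assumptions; the `Records`, `s3.bucket.name`, and `s3.object.key` fields follow the S3 event notification message structure.

```python
from typing import Dict, List, Tuple
from urllib.parse import unquote_plus

THUMBNAIL_SIZES = (64, 256, 1024)  # hypothetical sizes, in pixels


def thumbnail_targets(event: Dict) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for the thumbnails to create for each uploaded photo."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        for size in THUMBNAIL_SIZES:
            targets.append((bucket, f"thumbs/{size}/{key}"))
    return targets


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "uploads"},
                    "object": {"key": "photos/my+cat.jpg"}}}
        ]
    }
    for bucket, key in thumbnail_targets(sample_event):
        print(bucket, key)
```

If thumbnails are written back to the same bucket that triggers the function, the handler should skip keys under the thumbnail prefix so it does not process its own output.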

1. DESIGN FOR FAILURE

2. MULTIPLE AVAILABILITY ZONES

3. SCALING

4. SELF-HEALING

5. LOOSE COUPLING

YOUR GOAL: Applications should continue to function

IT’S ALL ABOUT CHOICE

BALANCE COST & AVAILABILITY REQUIREMENTS

AWS Architecture Center: http://aws.amazon.com/architecture

AWS Whitepapers: http://aws.amazon.com/whitepapers

AWS Blog: http://aws.amazon.com/blogs/aws

Thanks for attending!

- Joel Williams

