Post on 07-Jun-2020

Lassoing the Clouds: Best Practices on AWS

Brian DeShong May 26, 2017

Who am I?

• Worked primarily in web development
• PHP (17 years!!!), MySQL, Oracle, Linux, Apache
• Highly trafficked, scalable web applications
• Frequent speaker at PHP conferences, Atlanta PHP user group
• iOS / Mac development for ~5 years
• FloodWatch for iOS
• Yahoo!, Half Off Depot

Agenda

• Running web servers
• Serving static content
• Security-related concerns
• Databases
• Logging

Sometimes I’m going to tell you something specific I’d recommend. Other times, I’ll tell you the pros and cons and let you choose based on your situation. Some of these concepts are AWS-specific, but others are applicable to any hosting situation, VPS or not. Show of hands: users of AWS in Production? Multiple regions vs. AZs?

Regions + Availability Zones

https://aws.amazon.com/about-aws/global-infrastructure/

• Region: made up of 2 or more Availability Zones (AZs)
• Each AZ is one or more discrete data centers
• AZs are typically separated by many miles
• Low-latency links between them

Operating Web Servers

Amazon Machine Images (AMIs)

What is an Amazon Machine Image?

• Provides information required to launch an EC2 instance

• You specify the AMI to use when launching a new instance

• Amazon Linux by default

• CentOS, Ubuntu, Windows, etc.

• Some options on using AMIs…

“Just Enough OS”

• Start up a bare instance (JeOS)

• Install what you need at initial boot time

• “User data”: shell script that runs at initial boot

• Avoids AMI creation and maintenance

• Instances take longer to be ready for service

“Just Enough OS”

[Diagram: instance web01 running Amazon Linux with only the base packages: vim, openssh, openssl, kernel, sudo, ...]

User data script

#!/bin/bash
# Bring the base system up to date, then install Apache 2.4 and PHP 7.0
yum update -y
yum install -y httpd24 php70
# Start Apache automatically at boot
chkconfig httpd on
# Create a "www" group, add ec2-user to it, and hand /var/www to that group
groupadd www
usermod -a -G www ec2-user
chown -R root:www /var/www
chmod 2775 /var/www
find /var/www -type d -exec chmod 2775 {} +
find /var/www -type f -exec chmod 0664 {} +
# Download and put application code into place
service httpd start

The user data script runs only at initial boot time:
• Install Apache, PHP 7
• Set up Apache to start at boot
• Set up groups, filesystem
• Start Apache

This all takes time! The benefit is that, if you want to upgrade to PHP 7.1, you just change the user data script and launch new instances. Virtual machines are by their nature disposable.
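As a rough sketch of how a script like this gets attached to a new instance: the AWS CLI's `aws ec2 run-instances` accepts it through `--user-data`. The AMI ID, instance type, and file name below are placeholders rather than values from the talk, and `run` is a hypothetical wrapper that only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/bash
# Sketch: launch one instance from a stock AMI and hand it a user data script.
# The AMI ID, instance type, and file name are placeholders (assumptions).

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# run it for real otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: launch_web_instance <ami_id> <user_data_file>
launch_web_instance() {
  local ami_id="$1" user_data_file="$2"
  run aws ec2 run-instances \
    --image-id "$ami_id" \
    --instance-type t2.micro \
    --count 1 \
    --user-data "file://${user_data_file}"
}

launch_web_instance ami-0123456789abcdef0 user-data.sh
```

Upgrading PHP then means editing `user-data.sh` and launching replacements, with no image to rebuild.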

Fully Baked AMI

[Diagram: instances web01 through web04, each launched from the same image: Amazon Linux with vim, openssh, openssl, kernel, sudo, PHP 7.1, ImageMagick, nginx]

• Added users, groups
• Filesystem setup
• Nginx starts at boot
• Places code on disk at boot
• Monitoring scripts, packages
• etc…

Benefits:
• The machine is ready to go as soon as it’s fully booted
• Startup time can be a lot faster, because all necessary data was baked into the machine image
• Stability of packages, more control
• Amazon Linux installs security updates at boot out of the box

Downside is maintenance:
• Want to upgrade PHP 7.1? You have to bundle a new AMI
• Then you need to replace all of your machines with machines using the new AMI
• Think back to 2014: OpenSSL vulnerabilities and upgrading across the board. Not fun!

Considerations

• How quickly do you need it in the event of a failure?
• …or due to an increase in demand?
• Do you have the time and resources to maintain a fully baked AMI?
• My recommendation:
  • Start with “JeOS” + configure at boot
  • You can always create custom AMIs later, if needed

Bundling an AMI takes time, on the order of an hour or more. It might not always make sense to invest this time if you’re seldom going to need to add new machines. But there’s a risk that if you do need them quickly, you have to wait for them to start.

Fault Tolerance

No Single Point of Failure!

• Do not have a single point of failure!
• This can be a server, AZ, or even a region
• Always have at least two of everything
• If an EC2 instance dies, the other remains in service
• Holy grail: spread out across multiple regions

Have at least two of everything. Spread web servers between AZs. If one AZ loses power, you stay up! But more resources have a cost associated with them!

Use AZs Effectively

[Diagram: three Availability Zones]
• us-east-1a: 4 web servers, db (write), db (read)
• us-east-1b: 4 web servers, db (read), db (standby)
• us-east-1c: 4 web servers, db (read)

There’s a trade-off between redundancy and cost, though! More hardware in more AZs means more money.

Leverage Auto Scaling

• Machine resources
  • If CPU > 80% for 5 minutes, scale up
• Threshold of unhealthy instances
  • Want 5 web servers minimum?
  • If one dies, replace it
• Scale down
  • If CPU < 50% for X minutes, scale down
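The rules on this slide can be sketched as a small decision function. This is only an illustration of the logic, not how Auto Scaling is configured: in practice the "for 5 minutes" part is handled by CloudWatch alarms attached to scaling policies. The 80% and 50% thresholds and the minimum of 5 come from the slide; the function name, its arguments, and whole-number CPU values are assumptions.

```shell
#!/bin/bash
# Sketch of the scaling rules from the slide. The function is hypothetical;
# CPU is assumed to be an integer percentage already averaged over the window.

# Usage: scaling_decision <avg_cpu_percent> <healthy_instances> <min_instances>
scaling_decision() {
  local cpu="$1" healthy="$2" min="$3"
  if [ "$healthy" -lt "$min" ]; then
    echo "replace"      # below the minimum: launch a replacement first
  elif [ "$cpu" -gt 80 ]; then
    echo "scale_up"     # sustained high CPU: add capacity
  elif [ "$cpu" -lt 50 ] && [ "$healthy" -gt "$min" ]; then
    echo "scale_down"   # low CPU and room above the minimum: remove capacity
  else
    echo "hold"
  fi
}

scaling_decision 85 5 5   # busy fleet at the minimum size
scaling_decision 30 7 5   # idle fleet above the minimum size
scaling_decision 30 5 5   # idle fleet already at the minimum size
scaling_decision 60 4 5   # one instance has died
```

Checking the instance count first means a dead server is replaced even when CPU looks fine, and the scale-down branch never takes the fleet below the minimum.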

Load Balancers

Don’t use Sticky Sessions!

• By default, ELB sends each request to the instance with the smallest load

• “Sticky sessions” pin your user to an instance behind ELB

• Uses a cookie to route the same client to a consistent target

• If instance fails, ELB stops routing to that instance; chooses another

• But you want to spread traffic around!

• As your pool of machines grows, the requests are balanced between them

Sticky Sessions UI

SSL Termination

[Diagram: six web servers behind a Load Balancer. The Load Balancer holds the SSL certificate for www.foo.com from AWS Certificate Manager and talks to the web servers over HTTP, port 80.]

• Terminate SSL on your ELB
• No more mod_ssl, etc. on your web servers!
• Single point of SSL termination
• No need to maintain dependencies on web servers
• No certificate files sitting on the filesystem

Use AWS Certificate Manager

• AWS Certificate Manager is amazing for managing SSL certificates, and free!
• Validates domain ownership by email, using WHOIS contact information
• Automatically renews your certificate
  • A single click, and it’s renewed
  • Updated everywhere it’s used
• Can import external certificates, too
• SSL all the things!

Can be used on ELBs, with API Gateway, and with CloudFront.

Serving Static Content

Serve from static storage!

• Never serve static content from your web servers

• JavaScript, CSS, images, fonts, etc…

• Don’t use your computing resources

• Get the content to the end user as quickly as possible

These assets don’t change between releases. It’s okay for them to be cached in the end user’s browser.

AWS Simple Storage Service (S3)

• AWS’s object storage service

• You pay by storage utilized, number of requests, and bandwidth

• S3 storage is made up of buckets of objects

• Perfect for storing static assets

• Store content at build time

S3

• Standard Storage Class
• Standard - Infrequently Accessed
• Reduced Redundancy Storage

Standard and RRS are designed for 99.99% availability (Standard-IA for 99.9%).

S3 Standard
• Stored redundantly across multiple facilities
• 99.999999999% durability
• 2.3 cents per GB per month
• $0.004 per 10,000 GET requests
• 10 GB of data = 23 cents/month to store it, $4 for 10 million GET requests directly to S3
• Plus data transfer costs
• Point being: it’s CHEAP! You can’t store bytes on your own storage in a data center for this price

Standard-IA
• You pay less for storage
• But you pay more for accessing these objects
• For data that is accessed less frequently, but requires rapid access when needed
• Long-term storage, backups, and as a data store for disaster recovery

RRS (what I’d recommend for static content)
• 99.99% durability
• On the off chance that AWS loses an object, you can put it back!
• Placing content in S3 should be part of your deployment process
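The cost arithmetic quoted for S3 Standard can be checked with a few lines of shell. This is a sketch that uses only the two prices from the notes (2.3 cents per GB per month and $0.004 per 10,000 GET requests) and ignores data transfer; the function name is made up for illustration.

```shell
#!/bin/bash
# Sketch: reproduce the S3 Standard cost figures from the notes.
# Prices are the ones quoted in the talk and will drift over time.

# Usage: s3_monthly_cost <gb_stored> <get_requests>
s3_monthly_cost() {
  awk -v gb="$1" -v gets="$2" 'BEGIN {
    storage  = gb * 0.023               # $0.023 per GB per month
    requests = (gets / 10000) * 0.004   # $0.004 per 10,000 GET requests
    printf "storage=$%.2f requests=$%.2f total=$%.2f\n", storage, requests, storage + requests
  }'
}

# 10 GB stored, 10 million GET requests in the month
s3_monthly_cost 10 10000000
```

For the example in the notes this prints 23 cents of storage and $4.00 of requests, so at that traffic level it is the requests, not the storage, that make up most of the bill.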

Use CloudFront

• CloudFront operates dozens of POPs around the globe
• Place CloudFront in front of your S3 bucket
• Global users retrieve content from the nearest POP
• Optimized network path from POPs back to Regions
• Can be served over both HTTP and HTTPS
• DDoS protection, too

Your content lives at its home, the “origin”. The end user retrieves content from the POP closest to them. If the POP doesn’t have the content, it retrieves it from the origin; if it does, it returns it. CloudFront caches it, respecting cache headers. When you push a new build to Production, it’s good to prefix static content paths with a unique value. But leave your previous releases in place, in case emails or other external things refer to assets from old releases.
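One way to picture the per-release prefix is below: a minimal sketch, assuming a hypothetical CDN host name, bucket name, and build identifier. `deploy_command` only prints the `aws s3 sync` upload it would run, so the example is safe to execute anywhere.

```shell
#!/bin/bash
# Sketch: put every release under its own prefix so old releases stay reachable.
# CDN_HOST, BUCKET, and the build id are placeholders (assumptions).
CDN_HOST="static.example.com"
BUCKET="example-static-assets"

# Build the public URL for an asset under a per-release prefix.
# Usage: asset_url <build_id> <path>
asset_url() {
  local build_id="$1" path="${2#/}"   # strip one leading slash, if present
  echo "https://${CDN_HOST}/${build_id}/${path}"
}

# Print the upload command for a release. Nothing is deleted or overwritten,
# because each build writes to a fresh prefix.
# Usage: deploy_command <build_id> <source_dir>
deploy_command() {
  local build_id="$1" src_dir="$2"
  echo aws s3 sync "$src_dir" "s3://${BUCKET}/${build_id}/" \
    --cache-control "public,max-age=31536000"
}

asset_url 20170526-1 /css/site.css
deploy_command 20170526-1 ./public
```

Because the prefix changes with each build, assets can carry a long cache lifetime without risking stale content after a release.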

Security

Identity and Access Management

• Controls AWS services a user can access

• Which actions they can perform on those services

• Which resources are available

• Concepts of “Users” and “Roles”

How many of you using AWS have access key values in your code?
Example: user Bob can access an S3 bucket named “data-exports” and create objects
Example: determine who can launch EC2 instances, DynamoDB database tables they can access, etc.
Roles are not associated with a user or group
Trusted entities assume roles
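
The “Bob” example can be written as an IAM policy document. The sketch below builds one in Python; it assumes that “create objects” means only the `s3:PutObject` action, and the `put_only_policy` helper name is hypothetical.

```python
import json


def put_only_policy(bucket: str) -> dict:
    """Build an IAM policy document allowing object creation in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: "create objects" maps to PutObject only.
                "Action": ["s3:PutObject"],
                # Objects live under the bucket ARN, hence the trailing /*.
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(put_only_policy("data-exports"), indent=2))
```

The policy covers all three things IAM controls: the service (S3), the action (`PutObject`), and the resource (one bucket's objects).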

Use IAM Roles on EC2 Instances

• Assign an IAM Role to an EC2 instance

• Enables you to obtain temporary access keys

• Can be used to access AWS resources

• AWS SDKs make requests with credentials from IAM Role

• No storing keys in your code base

• Much more flexible and maintainable

API requests are signed with access key and secret access key
Example: a back end server running cron jobs may need access to write to an S3 bucket
You can have an IAM role for “cron server,” which grants access to only the necessary resources
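
For an instance to pick up a role, the role needs a trust policy naming EC2 as the trusted entity. The sketch below shows that document and a small check over it. The `trusted_services` helper is hypothetical and only illustrates the document's shape.

```python
# Trust policy: lets the EC2 service assume the role on the instance's behalf.
EC2_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def trusted_services(trust_policy: dict) -> set:
    """Collect the AWS service principals allowed to assume the role."""
    services = set()
    for statement in trust_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(statement.get("Action", [])):
            continue
        principal = statement.get("Principal", {})
        if isinstance(principal, dict):
            services.update(_as_list(principal.get("Service", [])))
    return services


if __name__ == "__main__":
    print(trusted_services(EC2_TRUST_POLICY))
```

With the role attached, the AWS SDKs fetch the temporary credentials themselves, so the application needs no keys in its code or config.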

Security Groups

• Virtual firewall to control inbound and outbound traffic

• Typically attached to EC2 instances, load balancers, RDS instances

• Only allow traffic in on the necessary ports

• Restrict internal tool access to known IP addresses

Controls who can get in, and what can go out
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Security.html#VPC_Security_Comparison

Example: don’t let the public hit web servers directly, but DO allow ELBs to connect to them
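
The ELB example can be modeled as two sets of inbound rules. This is a toy model for reasoning about the rules, not how AWS evaluates them: the group IDs are made up, the rule dictionaries borrow the field names the EC2 API uses for ingress permissions, and `is_allowed` is a hypothetical helper.

```python
import ipaddress

# Hypothetical security group IDs, for illustration only.
ELB_SG = "sg-elb00000"
WEB_SG = "sg-web00000"

# Inbound rules, keyed by the group they belong to.
INGRESS = {
    # Load balancer: HTTPS from anywhere.
    ELB_SG: [{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
              "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "UserIdGroupPairs": []}],
    # Web servers: HTTP only from members of the load balancer's group.
    WEB_SG: [{"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
              "IpRanges": [], "UserIdGroupPairs": [{"GroupId": ELB_SG}]}],
}


def is_allowed(group_id, port, source_ip=None, source_group=None):
    """Return True if some inbound TCP rule on group_id admits the traffic."""
    for rule in INGRESS[group_id]:
        if rule["IpProtocol"] != "tcp":
            continue
        if not rule["FromPort"] <= port <= rule["ToPort"]:
            continue
        if source_group and any(pair["GroupId"] == source_group
                                for pair in rule["UserIdGroupPairs"]):
            return True
        if source_ip and any(
                ipaddress.ip_address(source_ip) in ipaddress.ip_network(r["CidrIp"])
                for r in rule["IpRanges"]):
            return True
    return False


if __name__ == "__main__":
    print(is_allowed(WEB_SG, 80, source_ip="203.0.113.9"))   # public client
    print(is_allowed(WEB_SG, 80, source_group=ELB_SG))        # load balancer
```

Referencing the load balancer's group instead of IP ranges means the web server rule keeps working as load balancer nodes come and go.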

Principle of Least Privilege

• Give users access to only the resources they need

• This applies to internal and external users

• Examples:

• Don’t let an IAM role access every single S3 bucket! Specify each

• Don’t allow every port in on a Security Group! Only what needs to be public
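
Both examples above can be checked mechanically. The sketch below flags the two anti-patterns in policy and rule dictionaries. The helper names are hypothetical, and the checks are deliberately narrow: a bare `*`, an all-buckets S3 ARN, and all ports open to `0.0.0.0/0`.

```python
def policy_warnings(policy: dict) -> list:
    """Flag Allow statements whose actions or resources are a bare wildcard."""
    warnings = []
    for i, stmt in enumerate(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        for key in ("Action", "Resource"):
            value = stmt.get(key, [])
            values = [value] if isinstance(value, str) else value
            every_bucket = key == "Resource" and "arn:aws:s3:::*" in values
            if "*" in values or every_bucket:
                warnings.append(f"statement {i}: {key} is a wildcard")
    return warnings


def open_port_warnings(rules: list) -> list:
    """Flag inbound rules that open every port to the whole internet."""
    warnings = []
    for i, rule in enumerate(rules):
        world = any(r.get("CidrIp") == "0.0.0.0/0"
                    for r in rule.get("IpRanges", []))
        # IpProtocol "-1" means all protocols and ports.
        all_ports = rule.get("IpProtocol") == "-1" or (
            rule.get("FromPort") == 0 and rule.get("ToPort") == 65535)
        if world and all_ports:
            warnings.append(f"rule {i}: all ports open to 0.0.0.0/0")
    return warnings
```

A check like this could run in a deploy pipeline, though a real audit would need to cover far more cases than these two.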

Relational Databases

Amazon Relational Database Service (RDS)

• Removes the usual maintenance associated with running databases

• Eases burden of software patches

• Backups / snapshots are incredibly convenient

• Can scale instances up and down in size

Operate in Multiple AZs

[Diagram: Primary (us-east-1a) and Standby (us-east-1b), linked by synchronous replication; endpoint your-name.cluster-abc123.us-east-1.rds.amazonaws.com; patch order: standby first, primary next]

Enhanced durability
Synchronous replication to keep standby up-to-date
Automatic failover in the event of a failure
No administrative intervention needed
DNS hostname is modified under the hood
Planned maintenance and backups
For patches, applied first on standby
Then failover, then apply on primary
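
Because failover works by repointing the DNS hostname, existing connections drop and the application has to reconnect. A minimal sketch of reconnecting with exponential backoff follows. `connect` is a hypothetical callable wrapping whatever database driver you use, and `ConnectionError` stands in for the driver's own exception type.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** attempt)
```

Each retry should resolve the endpoint hostname again rather than reuse a cached IP address, otherwise the application keeps dialing the old primary.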

Distribute read operations

[Diagram: Writer (us-east-1a) with Readers in us-east-1b, us-east-1c, us-east-1d, and us-east-1e; read-only endpoint your-name.cluster-ro-abc123.us-east-1.rds.amazonaws.com]

There’s a cost associated with it, though!
And you should have some sort of a need for it
Read-only hostname, round-robin between read replicas
Don’t have to change application code as you add new reader instances
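
On the application side, distributing reads comes down to choosing a hostname per query. The sketch below routes by the first SQL keyword, using the two placeholder hostnames from the slides. `host_for` is a hypothetical helper, and the keyword check is a simplification (it would misroute `SELECT ... FOR UPDATE`, for example).

```python
# Placeholder endpoints from the slides; substitute your cluster's hostnames.
WRITER_HOST = "your-name.cluster-abc123.us-east-1.rds.amazonaws.com"
READER_HOST = "your-name.cluster-ro-abc123.us-east-1.rds.amazonaws.com"

READ_ONLY_VERBS = {"SELECT", "SHOW", "EXPLAIN"}


def host_for(sql: str, in_transaction: bool = False) -> str:
    """Pick the reader endpoint for plain reads and the writer for everything else."""
    words = sql.split(None, 1)
    verb = words[0].upper() if words else ""
    # Reads inside a transaction stay on the writer so they see its own writes.
    if verb in READ_ONLY_VERBS and not in_transaction:
        return READER_HOST
    return WRITER_HOST
```

Since the read-only hostname round-robins across replicas, this routing does not change when reader instances are added or removed.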

Amazon Aurora!

• Completely re-imagined storage of data

• http://bit.ly/atlphpaurora

• Greatly reduces replica lag to single-digit milliseconds

• Read replicas launch in minutes

• Run MySQL or PostgreSQL engines on top

Storage allotment will grow as your data set grows, in 10 GB increments
Each 10 GB chunk replicated 6 ways, across 3 availability zones
Launch a read replica in minutes
Bit of a price premium, depending on your needs
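
The storage figures in the notes above make for quick arithmetic. This sketch uses only the two numbers stated there (10 GB increments, 6 copies per chunk); the `storage_layout` helper is hypothetical and says nothing about how Aurora bills or places data.

```python
import math

SEGMENT_GB = 10         # growth increment, from the notes above
COPIES_PER_SEGMENT = 6  # each chunk is replicated 6 ways


def storage_layout(data_gb: float) -> dict:
    """Return how many 10 GB segments and replicated copies a data set implies."""
    segments = math.ceil(data_gb / SEGMENT_GB)
    return {
        "segments": segments,
        "allotted_gb": segments * SEGMENT_GB,
        "segment_copies": segments * COPIES_PER_SEGMENT,
    }
```

For a 25 GB data set this gives 3 segments, 30 GB allotted, and 18 segment copies spread over the three availability zones.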

Logging

Centralize Application Logs

• Web servers (PHP error log, Apache logs)

• Cron jobs

• Asynchronous processes

• You need to be able to access these at any time

CloudWatch Logs

• CloudWatch Logs agent runs on EC2 instance

• Polls local log files on disk and copies to CW Logs

• Broken up into Log Groups and Log Streams

• Log Group: Apache access log, error log, PHP error log

• Log Stream: log entries from a specific instance

CW Logs Agent Install

    [messages]
    file = /var/log/messages
    log_group_name = /var/log/messages
    log_stream_name = {instance_id}
    datetime_format = %b %d %H:%M:%S

    $ sudo yum install awslogs
    $ sudo service awslogs start
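
The `datetime_format` value uses strptime-style codes, so you can sanity-check a format against a sample log line before deploying the config. In this sketch the sample line is made up and `parse_syslog_timestamp` is a hypothetical helper. It assumes the timestamp occupies the first 15 characters of the line, as in classic syslog output.

```python
from datetime import datetime

DATETIME_FORMAT = "%b %d %H:%M:%S"  # same value as in the agent config above


def parse_syslog_timestamp(line: str, fmt: str = DATETIME_FORMAT) -> datetime:
    """Parse the leading timestamp of a syslog-style line using the agent's format."""
    # "May 26 14:03:12" is 15 characters: month, day, and HH:MM:SS.
    return datetime.strptime(line[:15], fmt)


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    sample = "May 26 14:03:12 ip-10-0-0-12 crond[2101]: example entry"
    print(parse_syslog_timestamp(sample))
```

Syslog timestamps carry no year, so Python fills in 1900 when parsing. If the format does not match the line, `strptime` raises `ValueError`.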

CW Logs Console

Make Them Searchable

• Elasticsearch (Amazon ES)

• OSS utilities to “tail” Elasticsearch indexes

• Amazon ES includes Kibana

• Allows you to spot trends over time

• Dig through data for specific entries, time periods, etc.
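
Digging for specific entries in a time period maps onto an Elasticsearch bool query. The sketch below builds such a search body. The field names `message` and `@timestamp` are assumptions about how your log documents are indexed, and `recent_errors_query` is a hypothetical helper.

```python
import json


def recent_errors_query(phrase: str, minutes: int = 15, size: int = 50) -> dict:
    """Build a search body for log entries matching a phrase in a recent time window."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # Assumed field names; adjust to your index mapping.
                "must": [{"match_phrase": {"message": phrase}}],
                "filter": [{"range": {"@timestamp": {"gte": f"now-{minutes}m"}}}],
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(recent_errors_query("PHP Fatal error"), indent=2))
```

The resulting JSON can be sent to an index's `_search` endpoint.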

Kibana

Proactively monitor and alert!

• Logs should really be empty day-to-day

• If they’re not right now, fix that first

• CloudWatch Alarms for log entries over threshold

• Amazon Simple Notification Service: get paged, wake up!

• Develop so as to avoid being woken up by pages

Don’t let your boss or your customers find a problem first!

Examples:
Number of PHP error log entries too high
Too many photos waiting to be processed
Number of emails sent in the past 15 minutes is over a certain threshold
Database CPU utilization over 80%
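
The “over a certain threshold in the past 15 minutes” examples all share one piece of logic: count events in a trailing window and compare against a limit. CloudWatch evaluates this for you. The sketch below only illustrates the rule, with a hypothetical `over_threshold` helper and made-up threshold values.

```python
from datetime import datetime, timedelta


def over_threshold(timestamps, now, window=timedelta(minutes=15), threshold=100):
    """Return True when more than `threshold` events fall inside the trailing window."""
    recent = [t for t in timestamps if now - window <= t <= now]
    return len(recent) > threshold


if __name__ == "__main__":
    now = datetime(2017, 5, 26, 12, 0)
    # One event per minute over the last half hour.
    events = [now - timedelta(minutes=m) for m in range(30)]
    print(over_threshold(events, now, threshold=10))
```

Picking the window and threshold is the hard part: too tight and you get paged for noise, too loose and customers find the problem first.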

Recap

Operating Servers

• Choose an EC2 AMI strategy that suits your needs

• Don’t have an SPOF

• Spread resources over AZs and/or Regions

• Keep SSL simple

Static Content

• Don’t serve it from your web servers!

• Utilize S3 for all static content storage

• Leverage CloudFront for better global performance

Security

• Leverage IAM Roles to grant access to types of servers

• Limit Security Groups to only the inbound and outbound traffic that’s needed

• Principle of Least Privilege is a great guide

Databases

• Again, spread across AZs

• Distribute read operations to read replicas

• More sleep: automatic failover is a great asset

Logging

• Use CloudWatch Logs for central logging

• Don’t just write the logs, monitor them!

• Alert on anomalies

• Find your bugs and errors before your users do!

Thanks to our Sponsors!

PHP[TEK] 2017

Thanks!

brian@deshong.net

http://www.deshong.net/

@bdeshong

http://www.shootproof.com

We’re Hiring: http://www.shootproof.com/about/careers