Intro to Cloud Computing and Amazon EC2
Michael T. Conigliaro
http://conigliaro.org
2010-09-09
The Old Days…
Need many physical servers for redundancy & future growth
One operating system per physical server
What’s The Problem?
• Hardware is expensive!
  • And don’t forget the electric bill!
• Long lead time to procure new hardware (days? weeks?)
  • Encourages overestimation of hardware needs
• Static infrastructure leads to underutilization
  • According to some studies, the average server is only ~20% utilized!
• Application consolidation?
  • Be careful not to “put all your eggs in one basket”
  • Complicates upgrades
Static infrastructure is wasteful
Enter Virtualization
Many virtual “instances” per physical server
  More efficient utilization of hardware
Rapid provisioning (measured in minutes or seconds!)
Live migration of virtual instances between physical servers!
Management API
Problems with Virtualization
Still need to purchase physical hardware
Less wasteful, but infrastructure is still mostly static
“Agile” instances require shared storage (SAN) which can be very expensive
Note: These may not be problems for large enterprises or SMBs, but they certainly are for startups!
Enter Cloud Computing
Cloud computing is virtualization on someone else’s hardware
Utility model - Pay only for what you’re using (per hour, GB, etc.)
A management API for your entire infrastructure!
  Specific API features depend on your cloud provider
Dynamic infrastructure creates less waste
Why Amazon EC2?
Very popular
Very mature management tools and APIs
Integration with other Amazon services
  S3 (simple online storage)
  CloudFront (content delivery network)
Other nice features
  Elastic Block Store
  Elastic Load Balancers
Amazon EC2 Pricing (1/2)
Not necessarily cheaper!
Save money by reducing time to provision new servers
  APIs are an absolute must-have!
Save money by reducing waste
  Make infrastructure follow your utilization curve
  Provision entire environments on demand, then terminate when not in use (e.g. testing)
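To put a number on "terminate when not in use", the sketch below compares an environment left running 24/7 with the same environment running only during working hours. Every input is an illustrative assumption rather than a figure from this deck: four instances, an hourly rate of $0.085, 8 hours a day, 22 working days a month.

```python
def environment_cost(instances, hourly_price, hours):
    """Cost in USD of running `instances` machines for `hours` hours each."""
    return instances * hourly_price * hours


# Hypothetical inputs, chosen only for illustration.
INSTANCES = 4
HOURLY_PRICE = 0.085          # USD per instance-hour (assumed)
ALWAYS_ON_HOURS = 24 * 30     # running 24/7 for a 30-day month
WORKING_HOURS = 8 * 22        # 8 hours a day, 22 working days

always_on = environment_cost(INSTANCES, HOURLY_PRICE, ALWAYS_ON_HOURS)
working_hours_only = environment_cost(INSTANCES, HOURLY_PRICE, WORKING_HOURS)
savings_fraction = 1 - working_hours_only / always_on

print(f"Always on:          ${always_on:.2f}/month")
print(f"Working hours only: ${working_hours_only:.2f}/month")
print(f"Saved:              {savings_fraction:.0%}")
```

The saving depends only on the ratio of hours used to hours available, so it holds for any instance size or count.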
Amazon EC2 Pricing (2/2)
Instance size    Specs            Price/Hour  Price/Month
Micro            1 core, 613MB    $0.02       $14.40
Standard (S)     1 core, 1.7GB    $0.085      $61.20
Standard (L)     2 core, 7.5GB    $0.34       $244.80
Standard (XL)    4 core, 15GB     $0.68       $489.60
High Mem (XL)    2 core, 17.1GB   $0.50       $360.00
High Mem (2XL)   4 core, 34.2GB   $1.00       $720.00
High Mem (4XL)   8 core, 68.4GB   $2.00       $1440.00
High CPU (M)     2 core, 1.7GB    $0.17       $122.40
High CPU (XL)    8 core, 7GB      $0.68       $489.60
On-demand Linux instances for US-East. Price/Month assumes running 24/7 for a 30-day month (720 hours).
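The Price/Month column is just the hourly price multiplied by the hours in a month. A minimal sketch of that arithmetic, assuming a 30-day month (720 hours) and using the hourly prices from the table; the function and variable names are made up for illustration:

```python
# Hourly on-demand prices in USD, copied from the table above.
HOURLY_PRICES = {
    "Micro": 0.02,
    "Standard (S)": 0.085,
    "Standard (L)": 0.34,
    "Standard (XL)": 0.68,
    "High Mem (XL)": 0.50,
    "High Mem (2XL)": 1.00,
    "High Mem (4XL)": 2.00,
    "High CPU (M)": 0.17,
    "High CPU (XL)": 0.68,
}

HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, running 24/7


def monthly_price(hourly_price, hours=HOURS_PER_MONTH):
    """Cost in USD of one instance running for `hours` hours, rounded to cents."""
    return round(hourly_price * hours, 2)


for name, hourly in HOURLY_PRICES.items():
    print(f"{name:<15} ${hourly:<6} ${monthly_price(hourly):.2f}")
```

Passing a smaller `hours` value gives the bill for an instance that only runs part of the month.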
Getting Started with Amazon EC2
Create an account at http://aws.amazon.com/ec2
Make note of your “access key id” and “secret access key”
  Used for REST-based requests (e.g. many UI-based tools)
Create and download an X.509 certificate and private key
  Used for SOAP-based requests (e.g. the command line tools)
Create an SSH key pair, and download the private key
  Used to connect to your instances via SSH
Amazon EC2 Management Tools
AWS Management Console
  https://console.aws.amazon.com
Elasticfox extension for Firefox
  http://s3.amazonaws.com/ec2-downloads/elasticfox.xpi
Amazon API tools (command line)
  http://s3.amazonaws.com/ec2-downloads/ec2-api-tools.zip
  “apt-get install ec2-api-tools”
Language-specific libraries
  Java, Ruby, Python, etc.
  Too many to list!
Amazon EC2 Dashboard
Demo: EC2 Instance Lifecycle
1. Launch instance
   ec2-run-instances ami-2d4aa444 --instance-type m1.small --key ec2-keypair
2. Describe instance
   ec2-describe-instances <instance_id>
3. Connect to instance
   ssh -i /Users/mikec/projects/socialmedia/infrastructure/ec2-keypair.pem ubuntu@<address>
4. Terminate instance
   ec2-terminate-instances <instance_id>
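The same lifecycle can be driven from one of the language-specific libraries instead of the command line tools. The sketch below assumes the boto3 library for Python, which is not what the demo uses. The helper name `run_lifecycle` and the region are assumptions, and the AMI ID and key pair name are simply copied from the demo, so they may not exist in your account.

```python
def run_lifecycle(ec2, image_id, instance_type, key_name):
    """Launch, describe, and terminate one instance via an EC2 client.

    `ec2` is expected to behave like a boto3 EC2 client.
    Returns (instance_id, state, public_dns) as seen right after launch.
    """
    # 1. Launch instance
    launched = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,
        KeyName=key_name,
        MinCount=1,
        MaxCount=1,
    )
    instance_id = launched["Instances"][0]["InstanceId"]

    # 2. Describe instance
    described = ec2.describe_instances(InstanceIds=[instance_id])
    instance = described["Reservations"][0]["Instances"][0]
    state = instance["State"]["Name"]
    public_dns = instance.get("PublicDnsName", "")

    # 3. Connecting over SSH happens outside this script.

    # 4. Terminate instance
    ec2.terminate_instances(InstanceIds=[instance_id])
    return instance_id, state, public_dns


def main():
    # Imported here so run_lifecycle can be exercised without boto3 installed.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # region is an assumption
    print(run_lifecycle(ec2, "ami-2d4aa444", "m1.small", "ec2-keypair"))
```

`main()` is deliberately not called: running it needs AWS credentials and launches a real, billable instance.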
Demo: Start “Staging” Environment on Demand
1. Issue EC2 commands to start staging instances
2. Wait a few minutes for instances to come up
3. Interact with staging environment
4. Issue EC2 commands to stop staging instances
Amazon EC2 Concepts (1/5)
New instances consist of two main things:
Amazon Machine Image (AMI), usually provided by Amazon or the community:
  Operating system (Linux distro, version, etc.)
  Datacenter location (East/West coast, Europe, Asia, etc.)
Instance “type”:
  Processor architecture (x86 or x86_64)
  Number of virtual CPUs (1 - 8)
  Amount of memory (600MB - 70GB)
  Amount of local disk storage (up to 1690GB)
  I/O performance (“low,” “moderate,” or “high”)
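Choosing an instance type usually means finding the cheapest one that meets a CPU and memory requirement. A rough sketch of that selection, using the core, memory, and hourly price figures from the pricing slide earlier in this deck; the `InstanceType` tuple and `cheapest_fit` helper are hypothetical names, and architecture, disk, and I/O are ignored to keep it short:

```python
from collections import namedtuple

InstanceType = namedtuple("InstanceType", "name cores memory_gb hourly_price")

# Figures taken from the pricing slide (on-demand Linux, US-East).
CATALOG = [
    InstanceType("Micro", 1, 0.613, 0.02),
    InstanceType("Standard (S)", 1, 1.7, 0.085),
    InstanceType("Standard (L)", 2, 7.5, 0.34),
    InstanceType("Standard (XL)", 4, 15.0, 0.68),
    InstanceType("High Mem (XL)", 2, 17.1, 0.50),
    InstanceType("High Mem (2XL)", 4, 34.2, 1.00),
    InstanceType("High Mem (4XL)", 8, 68.4, 2.00),
    InstanceType("High CPU (M)", 2, 1.7, 0.17),
    InstanceType("High CPU (XL)", 8, 7.0, 0.68),
]


def cheapest_fit(min_cores, min_memory_gb, catalog=CATALOG):
    """Return the cheapest type with enough cores and memory, or None."""
    candidates = [
        t for t in catalog
        if t.cores >= min_cores and t.memory_gb >= min_memory_gb
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.hourly_price)


choice = cheapest_fit(min_cores=2, min_memory_gb=4)
print(choice.name if choice else "no instance type is large enough")
```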
Amazon EC2 Concepts (2/5)
Data is ephemeral by default!
  Rebooting is harmless, but termination of an instance results in loss of all data on the disk!
Solution: Elastic Block Store (EBS) volumes
  Create a new EBS volume
  Attach EBS volume to an instance
  When instance terminates, EBS volume is left behind
Many AMIs now have EBS boot volumes!
  Allows instances to be “stopped”
  Highly recommended for critical and “long-running” instances
Amazon EC2 Concepts (3/5)
IP addresses are temporary and assigned randomly via DHCP!
Solution 1: Elastic IPs
  Create a new elastic IP
  Assign it to an instance
  Instance can now be accessed reliably via the elastic IP address
Solution 2: Dynamic DNS (e.g. Dyn.com)
  Each instance updates its own public DNS records via an API
  See: http://rubygems.org/gems/dynect4r
Amazon EC2 Concepts (4/5)
No way of knowing where your instances are physically located!
Example: Launch two new instances at the same time. Will they end up running…
  On the same physical server?
  On different servers in the same rack?
  On different servers in different racks?
Why does this matter? Latency-sensitive applications may not do well on EC2.
Amazon EC2 Concepts (5/5)
On-demand instance requests are not guaranteed to be fulfilled immediately, or even at all!
  EC2 quotas
  Insufficient EC2 capacity
Solution: “Reserved instances”
  Pay for instances up front rather than by the hour
  Requires a one- or three-year contract with Amazon
  1-year: save ~34%
  3-year: save ~49%
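The savings figures above only pay off if the instance actually runs for most of the term. A back-of-the-envelope sketch under a deliberately simplified model: the reserved cost is treated as one fixed amount equal to the 24/7 on-demand cost minus the quoted saving, and the hourly rate passed in is an illustrative assumption. Real reserved pricing may be structured differently.

```python
HOURS_PER_YEAR = 24 * 365

# Approximate savings quoted on the slide, keyed by contract length in years.
SAVINGS = {1: 0.34, 3: 0.49}


def term_costs(hourly_price, years):
    """Return (on_demand_cost, reserved_cost) in USD for running 24/7 all term."""
    on_demand = hourly_price * HOURS_PER_YEAR * years
    reserved = on_demand * (1 - SAVINGS[years])
    return on_demand, reserved


def break_even_utilization(years):
    """Fraction of the term an instance must run before reserving is cheaper.

    Simplified model: on-demand cost scales with hours used, reserved cost is fixed.
    """
    return 1 - SAVINGS[years]


for years in (1, 3):
    on_demand, reserved = term_costs(0.085, years)  # hourly rate is an assumption
    print(f"{years}-year: on-demand ${on_demand:.2f}, reserved ${reserved:.2f}, "
          f"break-even at {break_even_utilization(years):.0%} utilization")
```

Under this model, an instance that runs less than about two thirds of a one-year term is cheaper on demand.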
Some Best Practices
Use “security groups” to secure instances from each other
Prevent unintentional termination of critical instances
  ec2-modify-instance-attribute --disable-api-termination true <instance_id>
Avoid “image sprawl”
  Start with one minimal “golden master image”
  Use a configuration management system to add “roles” as necessary (e.g. Web server, Database server, NFS server, etc.)
What’s Next?
Use a configuration management system
  Chef, Puppet, Cfengine, etc.
  “Infrastructure as Code”
Automate your “bootstrap” process
Plan for failure!
  Every EC2 instance is as likely to fail as any other
  Since failure cannot be prevented, have a good recovery plan
Command and Control
  Capistrano, Fabric, etc.
Example Chef Recipe
package "apache" do
  package_name "apache2"
  action :install
end

service "apache" do
  service_name "apache2"
  action [:enable, :start]
end

template "apache2.conf" do
  source "apache2.conf.erb"
  path "/etc/apache2/apache2.conf"
  notifies :restart, resources(:service => "apache")
end
Summary
Cloud computing is virtualization on someone else’s hardware
Cloud computing allows your infrastructure to be dynamic, and thus, less wasteful
Cloud APIs promote automation and allow you to be more productive
Configuration management systems allow you to treat your infrastructure as code
More Information
EC2 Getting Started Guide
  http://docs.amazonwebservices.com/AWSEC2/latest/GettingStartedGuide/
EC2 Developer Guide
  http://docs.amazonwebservices.com/AWSEC2/latest/DeveloperGuide/
Chef wiki
  http://wiki.opscode.com/