Date posted: 06-Dec-2014
Category: Technology
Uploaded by: jesse-anderson
EC2 PERFORMANCE, SPOT INSTANCE ROI AND EMR SCALABILITY
Jesse Anderson
AMAZON WEB SERVICES (AWS)
Elastic Compute Cloud (EC2): virtual machine in the cloud
Simple Storage Service (S3): network share in the cloud
Elastic MapReduce (EMR): cluster of EC2 instances running Hadoop
EC2 PRICE TYPES
Spot Instances
  System for bidding on unused instances
  Same performance
  Go away (abruptly) if outbid
On Demand
  Ad hoc starting
Reserved
  Not covered
SPOT INSTANCE SAVINGS
MILLION MONKEYS PROJECT
Randomly recreated Shakespeare
Open source
Good metric for CPU and memory
EC2 SPECIFICATIONS
Instance Name     Memory   EC2 Compute Units/Cores   Platform   I/O Performance
Small             1.7 GB   1 ECU on 1 core           32-bit     Moderate
Large             7.5 GB   4 ECU on 2 cores          64-bit     High
Extra Large       15 GB    8 ECU on 8 cores          64-bit     High
High-CPU Medium   1.7 GB   5 ECU on 2 cores          32-bit     Moderate
High-CPU Large    7 GB     20 ECU on 8 cores         64-bit     High
Quad XL           23 GB    33.5 ECU on 8 cores       64-bit     Very High
EC2 Compute Unit (ECU) – One EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.
EC2 PERFORMANCE
My Core 2 Duo 2.66 GHz did 50,000,000,000 character groups
EC2 COST PER HOUR ON DEMAND/SPOT
PRICE PER UNIT
EMR (HADOOP) CLUSTERING
Tests of 1-, 2-, 3-, 4-, 5-, 10-, and 20-node clusters
Price
Scalability
EMR COST
PRICE PER UNIT IN A CLUSTER
CLUSTERED CHARACTER GROUPS
EMR/HADOOP SCALABILITY PERCENTAGE
EMR/HADOOP SCALABILITY ABSOLUTE
BREAKDOWNS
Original project would have run in 3 days 9 hours
  Took 1.5 months before
20-node cluster costs $45.44 per day
  5-day run cost $317
  11-day run cost $528
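
The breakdown figures can be turned into rates with a little arithmetic. The sketch below is only a rough check. It assumes that "1.5 months" means about 45 days and that the daily cluster price is spread evenly over 24 hours and 20 nodes. Neither assumption comes from the slides, and the variable names are illustrative.

```python
# Figures quoted on the slide.
cluster_hours = 3 * 24 + 9      # "3 days 9 hours" on the cluster
daily_cost = 45.44              # dollars per day for the 20-node cluster
nodes = 20

# ASSUMPTION: "1.5 months" is taken as 45 days.
original_days = 45

# How much faster the cluster run would have been than the original run.
speedup = (original_days * 24) / cluster_hours

# ASSUMPTION: the daily price is spread evenly over hours and nodes.
cost_per_cluster_hour = daily_cost / 24
cost_per_node_hour = cost_per_cluster_hour / nodes

# Average daily cost implied by the two quoted run totals.
implied_daily = {days: total / days for days, total in [(5, 317), (11, 528)]}

print(f"cluster run: {cluster_hours} hours, about {speedup:.1f}x faster")
print(f"cluster: ${cost_per_cluster_hour:.2f}/hour, node: ${cost_per_node_hour:.4f}/hour")
for days, per_day in implied_daily.items():
    print(f"{days}-day run: ${per_day:.2f} per day on average")
```

The implied averages for the 5-day and 11-day runs come out higher than the $45.44 daily figure, so the run totals presumably include something beyond that rate. The slides do not say what.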
ENGINEERING FOR THE CLOUD
Establish if a good fit
Test the EC2 performance
Figure out a unit or widget
Find the most cost efficient EC2 performer with price per unit/widget
Engineer with Spot Instances in mind
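
The "price per unit/widget" step can be sketched as follows. This is a minimal illustration, not the method used in the talk. The `Candidate` class, the instance names, and all prices and throughputs are hypothetical placeholders. Real values would come from your own benchmark and the current price list.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    price_per_hour: float   # dollars per instance-hour
    units_per_hour: float   # work units (widgets) completed per hour

    @property
    def price_per_unit(self) -> float:
        # Dollars spent to complete one unit of work.
        return self.price_per_hour / self.units_per_hour


def most_cost_efficient(candidates):
    """Return the candidate with the lowest price per unit of work."""
    return min(candidates, key=lambda c: c.price_per_unit)


# PLACEHOLDER numbers, not measurements from the talk.
candidates = [
    Candidate("instance-a", price_per_hour=0.10, units_per_hour=50.0),
    Candidate("instance-b", price_per_hour=0.40, units_per_hour=260.0),
    Candidate("instance-c", price_per_hour=0.80, units_per_hour=400.0),
]

best = most_cost_efficient(candidates)
for c in candidates:
    print(f"{c.name}: ${c.price_per_unit:.5f} per unit")
print("most cost efficient:", best.name)
```

Ranking by price per unit rather than by raw speed or hourly price is the point of the exercise: the fastest instance and the cheapest instance are not necessarily the best value.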
CONCLUSIONS
Spot Instance Saves
  From $2.20 to $1.30 per hour
  Saved $1,000 in one run
Hadoop/EMR Scalability
  95% efficiency at 2-5 nodes
  87% efficiency at 10 nodes
  84% efficiency at 20 nodes
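
Both conclusions can be restated with simple arithmetic. The sketch below makes two assumptions that the slides do not state. First, it assumes the $0.90 hourly saving applied uniformly across the run. Second, it assumes "efficiency" means cluster throughput divided by the node count times single-node throughput. The function names are illustrative.

```python
# Spot savings, using the two hourly prices quoted above.
on_demand = 2.20    # dollars per hour
spot = 1.30         # dollars per hour
saving_per_hour = on_demand - spot
saving_fraction = saving_per_hour / on_demand

# ASSUMPTION: the saving is uniform, so $1,000 saved implies this many hours.
hours_for_1000 = 1000 / saving_per_hour


def efficiency(nodes, cluster_throughput, single_node_throughput):
    """ASSUMED definition: fraction of perfect linear scaling achieved."""
    return cluster_throughput / (nodes * single_node_throughput)


def effective_nodes(nodes, eff):
    """Number of perfectly scaling nodes the cluster is worth."""
    return nodes * eff


# Efficiencies quoted above (5 is the top of the "2-5 nodes" range).
quoted = {5: 0.95, 10: 0.87, 20: 0.84}

print(f"spot saves ${saving_per_hour:.2f}/hour ({saving_fraction:.1%})")
print(f"about {hours_for_1000:.0f} hours at that rate saves $1,000")
for n, eff in quoted.items():
    print(f"{n} nodes at {eff:.0%} behave like {effective_nodes(n, eff):.2f} ideal nodes")
```

Under that definition, a 20-node cluster at 84% efficiency does the work of roughly 16.8 ideal nodes, which is the overhead being paid for a shorter wall-clock time.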
MORE INFORMATION
http://www.jesse-anderson.com/2012/02/ec2-performance-spot-instance-roi-and-emr-scalability/
@jessetanderson