Topics
1. AWS Global Infrastructure
2. Foundation Services
   1. Compute
   2. Storage
   3. Database
   4. Network
3. AWS Economics
Regions and Availability Zones
AWS resources are either
– Global – Tied to a region – Tied to an availability zone
Regions are completely isolated from each other. Availability zones are data centers within a region.
https://aws.amazon.com/about-aws/globalinfrastructure/
Edge Locations
Content delivery network (CDN)
– Goal: serve content with low latency and high availability.
– Solution: cache content in multiple geographically distributed data centers so that a data center is near each user.
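The idea of serving each user from a nearby data center can be sketched in a few lines of Python. This is a minimal illustration only: the edge location names and coordinates are hypothetical, and a real CDN typically routes by measured latency and DNS rather than by straight-line distance.

```python
import math

# Hypothetical edge locations (name -> (latitude, longitude)); illustrative only.
EDGE_LOCATIONS = {
    "ashburn": (39.04, -77.49),
    "dublin": (53.35, -6.26),
    "tokyo": (35.68, 139.69),
    "sao_paulo": (-23.55, -46.63),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_edge(user_location, edges=EDGE_LOCATIONS):
    """Return the name of the edge location closest to the user."""
    return min(edges, key=lambda name: haversine_km(user_location, edges[name]))


# A user in Paris is served from the Dublin cache in this toy model.
paris = (48.86, 2.35)
print(nearest_edge(paris))
```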
Figure: Traditional Content Delivery vs. Content Delivery Network (Edge Locations)
Compute
Elastic Compute Cloud (EC2) – Create virtual servers in the cloud in seconds. – Set up with any OS and software. – Manage with administrative access.
Auto Scaling – Create and remove EC2 instances based on triggers. – Time and date based triggers. – Resource based triggers.
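The two kinds of triggers can be sketched as plain decision functions. This is a simplified model, not the Auto Scaling API: the ScalingPolicy fields, thresholds, and schedule format are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class ScalingPolicy:
    # All fields are hypothetical values chosen for this sketch.
    min_instances: int = 1
    max_instances: int = 10
    scale_out_cpu: float = 70.0  # add an instance above this average CPU %
    scale_in_cpu: float = 30.0   # remove an instance below this average CPU %


def resource_trigger(current, avg_cpu, policy):
    """Resource-based trigger: adjust the instance count from average CPU load."""
    if avg_cpu > policy.scale_out_cpu:
        current += 1
    elif avg_cpu < policy.scale_in_cpu:
        current -= 1
    # Never leave the configured bounds.
    return max(policy.min_instances, min(policy.max_instances, current))


def schedule_trigger(now, schedule, default):
    """Time-based trigger: return the capacity scheduled for the current time."""
    for start, end, capacity in schedule:
        if start <= now < end:
            return capacity
    return default


policy = ScalingPolicy()
business_hours = [(time(9, 0), time(17, 0), 6)]  # six instances from 9:00 to 17:00
print(resource_trigger(3, 85.0, policy))
print(schedule_trigger(time(10, 30), business_hours, default=2))
```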
Amazon EC2 is
• A web service that enables you to launch and manage server instances
• Designed to make web-scale computing easier for developers.
• A simple web service interface that provides programmable control of your cloud resources
EC2 Features
Elastic: Allows you to instantiate one to thousands of server instances either manually or automatically.
Flexible: Choice of multiple instance types, OS, and software packages.
Available: SLA commitment of 99.95% availability in each region.
Pay as You Go: Pay for resources as you need them, though reserved instances offer lower pricing for longer commitments.
Amazon Machine Image
Virtual root disk image – Contains OS – Contains most applications
Start a VM by – Booting an AMI – Creates an instance
Catalog of pre-built AMIs – OS: Linux (many distros), OpenSolaris, Windows – Software: Apache, MySQL, Oracle, WordPress, etc. – Available at http://aws.amazon.com/amis
Instance
• An instance is a VM running the OS and software on an AMI.
• You can launch many instances of the same AMI.
• Other users can launch instances of that AMI too.
• Each instance is a separate and independent virtual server.
EC2 Instance Types
General Purpose – Balanced compute, memory, network resources. – Useful for typical server applications.
Compute Optimized – Many vCPUs with lowest cost per vCPU. – High traffic web apps, video encoding, analytics.
GPU Instances – Provide access to GPUs with hundreds of CUDA cores. – Gaming and 3D graphics.
Memory Optimized – High memory with lowest cost per GiB of RAM. – Databases and distributed caches.
Storage Optimized – Large storage and high speed storage (SSD) versions. – Large databases and fileservers.
General Purpose (m1) Instance Types
Small Instance
1.7GiB RAM
1 Virtual Core
1 EC2 Compute Unit
160GB instance storage
32-bit or 64-bit
Large Instance
7.5GiB RAM
2 Virtual Cores
4 EC2 Compute Units
2 x 420GB instance storage
64-bit platform
Extra Large Instance
15GiB RAM
4 Virtual Cores
8 EC2 Compute Units
4 x 420GB instance storage
64-bit platform
1 EC2 Compute Unit = Early 2006 1.7 GHz Xeon CPU
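The m1 figures above can be compared with a little arithmetic. The sketch below uses only the numbers from this slide; the "approximate GHz" value is a rough reading of the compute unit definition, not a benchmark.

```python
# m1 instance figures taken from the slide above.
M1_TYPES = {
    "small": {"ram_gib": 1.7, "cores": 1, "ecu": 1, "storage_gb": 160},
    "large": {"ram_gib": 7.5, "cores": 2, "ecu": 4, "storage_gb": 2 * 420},
    "xlarge": {"ram_gib": 15.0, "cores": 4, "ecu": 8, "storage_gb": 4 * 420},
}

ECU_GHZ = 1.7  # one compute unit is roughly an early-2006 1.7 GHz Xeon


def summarize(name):
    """Derive per-core and per-compute-unit ratios for one instance type."""
    spec = M1_TYPES[name]
    return {
        "ecu_per_core": spec["ecu"] / spec["cores"],
        "approx_ghz_total": spec["ecu"] * ECU_GHZ,
        "ram_gib_per_ecu": spec["ram_gib"] / spec["ecu"],
    }


for name in M1_TYPES:
    print(name, summarize(name))
```

Note that a virtual core on the larger types is worth two compute units, while the small type gets one.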
Access Identifiers
AWS uses a set of different access identifiers – Use public key cryptography
– Public identifier kept on the service or on the instance
• Can be shared with anyone
– Private identifier kept on your PC
• Must be kept secret
Elastic Block Store Volume
• An addressable virtual disk • Can be attached to an instance
– Format – Mount – Store files
• Volumes have lifetime independent of instance – Disk storage persists even if instance terminated
Block Device Mapping
Map system devices to AWS block storage.
– VM Device Name – AWS Volume ID – Status – Timestamp – DeleteOnTermination
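A block device mapping can be pictured as a small record per device. The sketch below is a plain data model, not an AWS API call; the volume IDs, device names, and timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlockDeviceMapping:
    device_name: str             # device name as seen by the VM
    volume_id: str               # AWS volume ID (hypothetical values below)
    status: str
    attach_time: str
    delete_on_termination: bool


def volumes_surviving_termination(mappings):
    """Return IDs of volumes that persist after the instance is terminated."""
    return [m.volume_id for m in mappings if not m.delete_on_termination]


mappings = [
    BlockDeviceMapping("/dev/sda1", "vol-11111111", "attached",
                       "2013-01-15T10:00:00Z", True),
    BlockDeviceMapping("/dev/sdf", "vol-22222222", "attached",
                       "2013-01-15T10:05:00Z", False),
]
print(volumes_surviving_termination(mappings))
```

Here the root volume is deleted with the instance, while the data volume outlives it because its DeleteOnTermination flag is false.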
Security Group
• A Security Group defines the set of permitted inbound connections for an instance. – Each group is a named access control list. – Entries specify allowed protocols, ports, and IPs. – Essentially a firewall.
• A single Security Group can be applied to multiple instances.
• Multiple Security Groups can be applied to a single instance.
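The firewall behavior can be sketched with Python's standard ipaddress module. This is a toy model of the rules described above, assuming that a connection is permitted when any rule in any attached group matches and denied otherwise; the group names and address ranges are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str   # e.g. "tcp" or "udp"
    from_port: int  # first port in the allowed range
    to_port: int    # last port in the allowed range
    cidr: str       # allowed source address range

    def allows(self, protocol, port, source_ip):
        return (protocol == self.protocol
                and self.from_port <= port <= self.to_port
                and ipaddress.ip_address(source_ip) in ipaddress.ip_network(self.cidr))


def is_permitted(groups, protocol, port, source_ip):
    """Permit the connection if any rule in any attached group matches."""
    return any(rule.allows(protocol, port, source_ip)
               for rules in groups.values()
               for rule in rules)


# Two hypothetical groups applied to the same instance.
groups = {
    "web": [Rule("tcp", 80, 80, "0.0.0.0/0")],
    "admin": [Rule("tcp", 22, 22, "203.0.113.0/24")],
}
print(is_permitted(groups, "tcp", 80, "198.51.100.7"))
print(is_permitted(groups, "tcp", 22, "198.51.100.7"))
```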
S3 and EBS Instance Lifecycles
http://shlomoswidler.com/2009/07/ec2-instance-life-cycle.html
Data remains accessible if instance is rebooted or (EBS-only) stopped. Data cannot be recovered after an instance is terminated.
S3-backed Instance EBS-backed Instance
EC2 Resources
Persistent Resources
• Elastic IP Addresses • Elastic Block Store Volumes • Elastic Load Balancers • Security Groups • Amazon Machine Images
Ephemeral Resources
• Instances, including – Instance memory state – Instance disk state – Non-elastic IP address – DNS name
How can you maintain a running system if your servers are transient and unreliable?
AMI Types
Public: AMIs made available by Amazon and the EC2 community.
Private: AMIs that you own and create; may be developed from Public AMIs.
Shared: AMIs built by developers and shared with the EC2 community.
Paid: AMIs that you purchase or that come with a service contract from a company such as Red Hat.
Security Credentials
Credentials to Administer Instances – AWS Management Console: Amazon account – Query and Third Party UIs: Secret access key – SOAP, EC2 CLI: X.509 certificate and private key
Credentials to Connect to an Instance – Amazon EC2 key pair – Windows administrator password
Credentials to Build Instances – UNIX: X.509 certificate and private key – Windows: Amazon account
Instance Network Addresses
EC2 instances assigned 2 IPs at launch – Private RFC1918 IP address for internal use – Public IP address NAT-mapped to private IP
EC2 instances assigned 2 DNS names at launch – Internal: resolves only inside EC2 – Public: associated with instance until stopped
Elastic IP addresses – Static IP addresses you map to an instance – Can keep and remap elastic IP addresses – Charged only for allocated but unused elastic IPs
Using Tags
Can tag – AMIs – Instances – EBS Volumes – EBS Snapshots
but not – Elastic IPs – Key pairs – Security groups
Storage
Elastic Block Store (EBS) provides – Off-instance storage – Persistence beyond instance lifetime – High availability and reliability – Attach and detach from running instance – Exposure as a device within an instance
Simple Storage Service (S3) provides – Highly available and reliable storage for objects. – Objects can be up to 5TB in size. – Objects are accessible simply via a URL.
Amazon Glacier provides – Cheap, reliable long term backup with 24 hour turnaround.
Elastic Block Store (EBS)
EBS Volumes are up to 1TB in size – Attach to any EC2 instance in same AZ – Create snapshots at any time – Create new volumes based on snapshots
Reliability – Annual Failure Rate (AFR) of 0.1-0.5% – Commodity hard disk AFR is ~4% – About as reliable as a RAID set – Use snapshots for backups
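To see what these failure rates mean for a fleet, the chance that at least one of n volumes fails in a year is 1 - (1 - AFR)^n. The sketch below assumes failures are independent, which is a simplification.

```python
def p_any_failure(afr, n_volumes):
    """Probability that at least one of n volumes fails within a year,
    assuming independent failures with the given annual failure rate."""
    return 1.0 - (1.0 - afr) ** n_volumes


# Compare 100 EBS volumes (worst case quoted above, 0.5%) with 100 commodity disks (4%).
for label, afr in [("EBS, AFR 0.5%", 0.005), ("commodity disk, AFR 4%", 0.04)]:
    print(f"{label}: {p_any_failure(afr, 100):.1%} chance of at least one failure")
```

Even at the lower rate, some failure in a large fleet is likely within a year, which is why the slide recommends snapshots for backups.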
Pricing per GB-month
EBS Snapshots
Snapshots saved to S3 – Not visible by S3 API. – Snapshots are EBS volumes themselves.
Snapshots are fast – Use Copy on Write (CoW), i.e. – Only blocks changed since the last snapshot need to be updated.
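The "only changed blocks" idea can be illustrated with a chain of deltas. This sketch models a volume as a dictionary of block number to contents; it shows the incremental principle only and is not how EBS is implemented internally.

```python
def restore(snapshots):
    """Rebuild full volume contents by replaying snapshot deltas in order."""
    state = {}
    for delta in snapshots:
        state.update(delta)
    return state


def take_snapshot(volume, snapshots):
    """Append a snapshot holding only the blocks changed since the last one.

    Assumes a fixed-size volume, so blocks change but are never removed.
    """
    previous = restore(snapshots)
    delta = {i: data for i, data in volume.items() if previous.get(i) != data}
    snapshots.append(delta)
    return delta


volume = {0: b"boot", 1: b"data", 2: b"logs"}
snapshots = []
take_snapshot(volume, snapshots)             # first snapshot stores every block
volume[1] = b"DATA"                          # modify a single block
print(len(take_snapshot(volume, snapshots))) # second snapshot stores one block
```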
http://blog.rightscale.com/2008/08/20/amazon-ebs-explained/
S3 Features
• An Internet-scale data storage service – All data is stored redundantly in multiple AZs – Data is located in the region you specify
• Stores objects from 1 byte to 5TB in size
• Objects are stored in a bucket and retrieved via a unique, developer-assigned URL
• You can have 100 named buckets
• Each bucket can store an unlimited number of objects in a flat namespace.
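A bucket's flat namespace behaves much like a dictionary keyed by strings: slashes in a key are just characters, and "folders" are only a prefix convention. The in-memory Bucket class below is a hypothetical stand-in for illustration, not the S3 API; the URL it builds follows the virtual-hosted style, and the bucket name is made up.

```python
from urllib.parse import quote


class Bucket:
    """In-memory stand-in for a bucket: a flat mapping of key -> bytes."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]

    def list(self, prefix=""):
        # There are no directories; a "folder" is just a shared key prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))

    def url(self, key):
        # Virtual-hosted style URL; slashes in the key are left unescaped.
        return f"https://{self.name}.s3.amazonaws.com/{quote(key)}"


bucket = Bucket("example-bucket")  # hypothetical bucket name
bucket.put("videos/intro.mp4", b"...")
bucket.put("videos/outro.mp4", b"...")
bucket.put("index.html", b"<html></html>")
print(bucket.list("videos/"))
print(bucket.url("videos/intro.mp4"))
```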
Applications of S3
Fast, scalable, and reliable web file hosting – Especially useful for audio and video files
http://aws.amazon.com/articles/1073
Amazon Glacier
Cloud based backup and long term storage – Durable: data stored on multiple devices at multiple sites. – Cheap: as low as $0.01 (1¢) per GB-month. – Slow: retrieval guaranteed within 24 hours; usually requires 3-5 hours.
Organize data in vaults. – Store archives (up to 40TB) in vaults. – Can have up to 1000 vaults. – Jobs notify user of completion using Amazon SNS.
Databases
Amazon Relational Database Service (RDS) – Managed relational database services. – Access via standard database protocols and SQL.
Amazon SimpleDB – Non-relational (NoSQL) flexible database service. – Access via web service requests. – Table size limited to 10GB.
Amazon DynamoDB – Scalable NoSQL database service introduced in 2012. – No table size limits, automatically partitions and scales.
Relational Database Service (RDS)
Users create their own DB instances – DB types: MySQL, Oracle, PostgreSQL, MSSQL. – Instance types with different CPU, RAM, storage. – Can create replicated DB instances across AZs.
Amazon manages – Software installation and updates. – Backups. – System administration.
SimpleDB
• Cloud-based non-relational (NoSQL) data store • Data is stored in domains (tables)
– Tables limited to 10GB in size. – Domains have a set of attributes (columns) – Attributes can have up to 256 values – Domains can have up to a billion items (rows)
• SimpleDB can be queried using a simple version of SQL via web service requests. – Does not support JOIN operations
Attributes can be added dynamically
Initial model for person domain
Effect of adding Middle name attribute
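The schema-free behavior can be sketched with nested dictionaries: each item holds a set of values per attribute, and a new attribute appears simply by writing it to one item. The Domain class and the person data below are hypothetical and only mimic the data model, not the SimpleDB web service.

```python
from collections import defaultdict


class Domain:
    """Toy model of a domain: item name -> attribute -> set of values."""

    def __init__(self, name):
        self.name = name
        self.items = defaultdict(lambda: defaultdict(set))

    def put_attributes(self, item_name, **attrs):
        # Writing an unknown attribute creates it; no schema change is needed.
        for attr, value in attrs.items():
            self.items[item_name][attr].add(value)

    def attribute_names(self):
        return sorted({a for item in self.items.values() for a in item})

    def select(self, attr, value):
        """Return names of items whose attribute contains the given value."""
        return sorted(n for n, item in self.items.items()
                      if value in item.get(attr, ()))


person = Domain("person")
person.put_attributes("p1", first_name="Ada", last_name="Lovelace")
person.put_attributes("p2", first_name="Alan", last_name="Turing")
print(person.attribute_names())

# Adding a middle name to one item extends the set of attributes in the domain.
person.put_attributes("p2", middle_name="Mathison")
print(person.attribute_names())
```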
DynamoDB
Highly reliable and scalable key/value store. – Stores associative arrays rather than tables. – Keys can have multiple values.
Fast – High throughput (built on SSDs). – Very low latency (<10ms). – Users reserve desired throughput. – DynamoDB reconfigures itself to meet reservations.
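One common way for a key/value store to spread load is to hash each key to a partition. The sketch below illustrates that general technique only; it is an assumption for teaching purposes and does not describe DynamoDB's actual partitioning scheme. The class name and partition count are hypothetical.

```python
import hashlib


class PartitionedStore:
    """Toy key/value store that spreads keys over partitions by hash."""

    def __init__(self, n_partitions=4):
        self.partitions = [{} for _ in range(n_partitions)]

    def _partition_for(self, key):
        # A stable hash, so the same key always maps to the same partition.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key):
        return self.partitions[self._partition_for(key)].get(key)


store = PartitionedStore()
for i in range(1000):
    store.put(f"user-{i}", {"visits": i})
print(store.get("user-42"))
print([len(p) for p in store.partitions])  # keys spread across the partitions
```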
Networking
Virtual Private Cloud (VPC) – Logically isolated segment of AWS cloud. – Complete control over virtual network, including – Subnets, routing tables, IP address ranges, etc. – Security and privacy.
AWS Direct Connect – Dedicated private 1 or 10Gbps connection to AWS. – Available in about a dozen major data centers. – Consistent latency and throughput. – Lower data transfer pricing.
AWS Economics
AWS prices its resources based on – Time: An hour of CPU time – Volume: GB of transferred data – Count: Number of messages queued – Time and Space: GB-month of data storage
Billing is done at the beginning of the month
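The four pricing dimensions combine into a monthly bill as a simple sum. All rates in this sketch are hypothetical placeholders chosen for easy arithmetic; they are not actual AWS prices.

```python
# Hypothetical rates, for illustration only; not actual AWS prices.
RATES = {
    "instance_hour": 0.06,      # time: dollars per instance-hour
    "gb_transferred": 0.12,     # volume: dollars per GB transferred
    "million_messages": 0.50,   # count: dollars per million queued messages
    "gb_month_storage": 0.10,   # time and space: dollars per GB-month
}


def monthly_bill(instance_hours, gb_transferred, messages, gb_months, rates=RATES):
    """Return (line items, total) for one month of usage."""
    items = {
        "compute": instance_hours * rates["instance_hour"],
        "transfer": gb_transferred * rates["gb_transferred"],
        "messages": messages / 1_000_000 * rates["million_messages"],
        "storage": gb_months * rates["gb_month_storage"],
    }
    return items, round(sum(items.values()), 2)


# One instance running all month (720 hours), 50 GB out, 2M messages, 100 GB stored.
items, total = monthly_bill(720, 50, 2_000_000, 100)
print(items)
print(total)
```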
Instance Pricing Options
Reserved Instances – Reservations for 1 to 3 years. – Price discounted by up to 65%. – Instance type can change within instance class. – Instances can be moved between AZs.
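Whether a reservation pays off depends on how many hours the instance actually runs. The sketch below assumes a simple pricing model with an upfront fee plus a reduced hourly rate; that structure and all prices are hypothetical, used only to show the break-even calculation.

```python
HOURS_PER_YEAR = 8760


def on_demand_cost(hourly, hours):
    return hourly * hours


def reserved_cost(upfront, hourly, hours):
    return upfront + hourly * hours


def break_even_hours(on_demand_hourly, upfront, reserved_hourly):
    """Hours of use after which the reservation becomes cheaper than on-demand."""
    if on_demand_hourly <= reserved_hourly:
        raise ValueError("reserved hourly rate must be below the on-demand rate")
    return upfront / (on_demand_hourly - reserved_hourly)


# Hypothetical prices: $0.10/hour on-demand vs. $200 upfront plus $0.04/hour.
hours = break_even_hours(0.10, 200.0, 0.04)
print(f"break-even after {hours:.0f} hours "
      f"({hours / HOURS_PER_YEAR:.0%} of a year)")
print(on_demand_cost(0.10, HOURS_PER_YEAR), reserved_cost(200.0, 0.04, HOURS_PER_YEAR))
```

Under these made-up numbers, an instance that runs less than about 38% of the year is cheaper on-demand.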
Spot Instances – Bid on spare Amazon compute capacity. – If bid exceeds current SpotPrice, you have an instance. – Your instance runs until SpotPrice exceeds your bid. – Useful for large computations whose results are not needed at a specific time.
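The bidding rule can be simulated over a series of hourly prices. This is a simplified model: the price series is invented, and the sketch assumes that a running hour is charged at the spot price rather than at the bid.

```python
def simulate_spot(bid, hourly_spot_prices):
    """Return (hours run, interruptions, cost) for a bid over a price series."""
    running = False
    hours_run = 0
    interruptions = 0
    cost = 0.0
    for price in hourly_spot_prices:
        if price <= bid:
            running = True
            hours_run += 1
            cost += price  # assumption: charged the spot price, not the bid
        else:
            if running:
                interruptions += 1  # spot price rose above the bid
            running = False
    return hours_run, interruptions, cost


# Hypothetical hourly spot prices in dollars.
prices = [0.03, 0.04, 0.08, 0.05, 0.02, 0.09]
print(simulate_spot(0.05, prices))
```

With a bid of $0.05 the job gets four of the six hours and is interrupted twice, which is acceptable only for work that can be paused and resumed.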
Key Points
• AWS architecture – Global infrastructure – Fundamental Services – Application Services – Management and Administration
• Fundamental Services – Compute: EC2, auto-scaling – Storage: EBS, S3 – Database: RDS, SimpleDB, DynamoDB – Networking: VPC, Direct Connect
Key Points
• AMIs are virtual disk images – A single AMI may have many instances
• Instances are running VMs – Run in an AZ located in a region. – Use keypair to access via ssh.
• On instance termination – Local storage is lost except EBS volumes. – DNS name and IP address are lost. – Use elastic IPs or own DNS for permanent addresses.
• EC2 bills for time, data transfer, and storage.