Storage on AWS
© 2017 Amazon Web Services, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services, Inc.
0. Storage Primer
Storage - Characteristics
Some of the ways we look at storage:
• Durability: measure of expected data loss
• Availability: measure of expected downtime
• Security: security measures in place
• Cost: amount per storage unit, e.g. $ / GB
• Scalability: upward flexibility
• Performance: performance metrics
• Integration: ability to interact with
AWS has a variety of storage options
• Amazon EBS (Elastic Block Store)
• Amazon Elastic File System (EFS)
• Amazon EC2 Instance Store (Ephemeral Volumes)
• Amazon S3 (Simple Storage Service)
• Amazon Glacier
• AWS Storage Gateway: File Gateway
• Amazon Snowball & Snowball Edge
• AWS Snowmobile
1. Block Storage
Amazon EBS
• Persistent block-level storage for EC2
• Pay only for what you provision
• Native redundancy and write cache
• Consistent and low-latency performance
• Optimized for random I/O
• Native support for encryption at rest (data volumes)
Amazon EBS
• Network-attached block device
– Independent data lifecycle
– Virtual disks
– Multiple volumes per EC2 instance
– Only one EC2 instance at a time per volume
– Can be detached from an instance and attached to a different one
• Raw block devices
– Unformatted block devices
– Ideal for databases, filesystems
• Available in multiple types
AWS EBS Features
• Durable: designed for five 9's reliability; redundant storage across multiple devices within an AZ
• Secure: identity and access policies; encryption
• Performance: low-latency SSD; consistent I/O performance; stripe multiple volumes for higher I/O performance
• Scalable: unlimited capacity when you need it; easily scale up and down
• Backup: point-in-time snapshots; copy snapshots across AZs and Regions
Amazon EBS
• Highly available block storage for all types of data
• Internet-scale storage: grow without limits
• Benefit from AWS's massive security investments
• Built-in redundancy: designed for 99.999% availability
• Low price per GB per month: no commitment, no up-front cost
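To make the availability figures concrete, here is a minimal sketch that converts an availability percentage into the downtime it implies per year. It assumes a 365-day year and treats the percentage as a simple long-run average; the function name is illustrative and not part of any AWS API.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Downtime per year implied by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for pct in (99.999, 99.99, 99.9):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% available: about {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, a 99.999% design target leaves room for roughly five minutes of downtime per year.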
EBS Volume Types Comparison
Magnetic
• Performance: lowest cost
• Use cases: infrequent data access
• Media: magnetic (HDD)
• Max IOPS: 100 on average, with the ability to burst to hundreds of IOPS
• Price: $.05/GB/month, plus $.05/million I/O
General Purpose (SSD)
• Performance: burstable
• Use cases: boot volumes, small to medium DBs, dev & test
• Media: SSD
• Max IOPS: baseline 3 IOPS/GB, burstable to 3,000 IOPS
• Price: $.10/GB/month, I/O operations free
Provisioned IOPS (SSD)
• Performance: predictable
• Use cases: I/O intensive, relational & NoSQL
• Media: SSD
• Max IOPS: consistently performs at the provisioned level, up to 20,000 IOPS
• Price: $.125/GB/month, plus $.065/provisioned IOPS
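The General Purpose (SSD) numbers can be turned into a small calculation. This is a rough sketch only: it uses the 3 IOPS/GB baseline and 3,000 IOPS burst figure from this comparison, takes the 10,000 IOPS per-volume cap from the volume types table later in the deck, and ignores burst-credit accounting and any minimum baseline. The function names are illustrative.

```python
GP2_IOPS_PER_GB = 3       # baseline rate from the comparison above
GP2_BURST_IOPS = 3_000    # burst ceiling from the comparison above
GP2_MAX_IOPS = 10_000     # per-volume cap, taken from the later volume types table


def gp2_baseline_iops(size_gb: int) -> int:
    """Baseline IOPS for a General Purpose (SSD) volume of the given size."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS)


def gp2_peak_iops(size_gb: int) -> int:
    """Best-case IOPS: small volumes can burst, large ones already exceed the burst level."""
    return max(gp2_baseline_iops(size_gb), GP2_BURST_IOPS)


if __name__ == "__main__":
    for size in (100, 1_000, 2_000, 5_000):
        print(size, "GB:", gp2_baseline_iops(size), "baseline,", gp2_peak_iops(size), "peak")
```

A 100 GB volume gets a 300 IOPS baseline but can burst to 3,000, while a 2,000 GB volume sustains 6,000 IOPS and gains nothing from bursting.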
EBS Volume Types
Solid-State Drives (SSD)
General Purpose SSD (gp2)*
• Description: general purpose SSD volume that balances price and performance for a wide variety of transactional workloads
• Use cases: recommended for most workloads; system boot volumes; virtual desktops; low-latency interactive apps; dev and test environments
• Volume size: 1 GiB - 16 TiB
• Max. IOPS**/volume: 10,000
• Max. throughput/volume†: 160 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: IOPS
Provisioned IOPS SSD (io1)
• Description: highest-performance SSD volume designed for mission-critical applications
• Use cases: critical business applications that require sustained IOPS performance, or more than 10,000 IOPS or 160 MiB/s of throughput per volume; large database workloads
• Volume size: 4 GiB - 16 TiB
• Max. IOPS**/volume: 20,000
• Max. throughput/volume†: 320 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: IOPS
Hard Disk Drives (HDD)
Throughput Optimized HDD (st1)
• Description: low-cost HDD volume designed for frequently accessed, throughput-intensive workloads
• Use cases: streaming workloads requiring consistent, fast throughput at a low price; big data; data warehouses; log processing
• Cannot be a boot volume
• Volume size: 500 GiB - 16 TiB
• Max. IOPS**/volume: 500
• Max. throughput/volume†: 500 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: MiB/s
Cold HDD (sc1)
• Description: lowest-cost HDD volume designed for less frequently accessed workloads
• Use cases: throughput-oriented storage for large volumes of data that is infrequently accessed; scenarios where the lowest storage cost is important
• Cannot be a boot volume
• Volume size: 500 GiB - 16 TiB
• Max. IOPS**/volume: 250
• Max. throughput/volume†: 250 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: MiB/s
* Default volume type
** gp2/io1 based on 16 KiB I/O size, st1/sc1 based on 1 MiB I/O size
† To achieve this throughput, you must have an instance that supports it, such as r3.8xlarge or x1.32xlarge.
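The IOPS and throughput limits are linked by the I/O size given in the footnote: throughput is IOPS multiplied by the size of each I/O. The sketch below applies that arithmetic to the per-volume limits; the function name is illustrative.

```python
KIB_PER_MIB = 1024


def throughput_mib_per_s(iops: int, io_size_kib: int) -> float:
    """Throughput implied by an IOPS rate at a fixed I/O size."""
    return iops * io_size_kib / KIB_PER_MIB


if __name__ == "__main__":
    # (volume type, max IOPS per volume, I/O size in KiB from the footnote)
    limits = [("gp2", 10_000, 16), ("io1", 20_000, 16), ("st1", 500, 1024), ("sc1", 250, 1024)]
    for name, iops, io_kib in limits:
        print(f"{name}: {throughput_mib_per_s(iops, io_kib):.2f} MiB/s")
```

For gp2 and io1 this gives 156.25 and 312.5 MiB/s, close to the listed 160 and 320 MiB/s; for st1 and sc1 it matches the listed 500 and 250 MiB/s exactly. The small gap for the SSD types suggests the published figures are rounded or specified independently.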
EBS Snapshots
[Diagram: within the AWS Cloud, EC2 instances in an Availability Zone have EBS volumes attached. "Create Snapshot" stores EBS snapshots in Amazon S3; "Clone From Snapshot" creates a new EBS volume from a snapshot.]
How Do Snapshots Work?
[Diagram: a timeline of Snapshot 1, Snapshot 2, and Snapshot 3. Volume Blocks 1-4 are stored in S3 as Chunks 1-4.]
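EBS snapshots are incremental: after the first snapshot, only blocks that changed are saved again. The toy model below illustrates that idea with four blocks and a chunk store. It is a sketch for intuition only and not how AWS implements snapshots; `ChunkStore`, `take_snapshot`, and `restore` are hypothetical names, and content hashing is an assumption made to keep the example short.

```python
import hashlib


class ChunkStore:
    """Toy stand-in for the S3-backed chunk storage (hypothetical)."""

    def __init__(self):
        self.chunks = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(key, data)
        return key


def take_snapshot(volume, store, previous=None):
    """Return (snapshot, uploaded): snapshot maps block number -> chunk key."""
    snapshot, uploaded = {}, 0
    for block_no, data in volume.items():
        key = hashlib.sha256(data).hexdigest()
        if previous is None or previous.get(block_no) != key:
            store.put(data)  # new or changed block: store its chunk
            uploaded += 1
        snapshot[block_no] = key  # unchanged blocks just reference the old chunk
    return snapshot, uploaded


def restore(snapshot, store):
    """Rebuild the full volume contents from a snapshot."""
    return {block_no: store.chunks[key] for block_no, key in snapshot.items()}


volume = {1: b"A", 2: b"B", 3: b"C", 4: b"D"}
store = ChunkStore()
snap1, n1 = take_snapshot(volume, store)          # first snapshot copies all 4 blocks
volume[2] = b"B2"
snap2, n2 = take_snapshot(volume, store, snap1)   # second snapshot copies 1 block
```

Each snapshot can still restore the whole volume, even though the second one stored only the block that changed.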
EC2 Instance Store (Ephemeral Volumes)
• Free with your EC2 instance
– SAS and SSD options
– Size/type based on instance type
• Local, direct-attached resource
• Consistent sequential reads and writes
• Use only for non-persistent data
2. Shared file system
Elastic File System (EFS)
• Fully managed file system for EC2 instances
• Provides standard file system semantics
• Works with standard operating system APIs
• Sharable across thousands of instances
• Elastically grows to petabyte scale
• Delivers performance for a wide variety of workloads
• Highly available and durable
• NFS v4-based
• Accessible from on-prem servers (New!)
Amazon EFS is Simple
• Fully managed
– No hardware, network, file layer
– Create a scalable file system in seconds!
• Seamless integration with existing tools and apps
– NFS v4.1: widespread, open
– Standard file system access semantics
– Works with standard OS file system APIs
• Simple pricing = simple forecasting
Amazon EFS is Elastic
• File systems grow and shrink automatically as you add and remove files
• No need to provision storage capacity or performance
• You pay only for the storage space you use, with no minimum fee
Amazon EFS is Scalable
• File systems can grow to petabyte scale
• Throughput and IOPS scale automatically as file systems grow
• Consistent low latencies regardless of file system size
• Support for thousands of concurrent NFS connections
Highly Durable and Highly Available
• Designed to sustain AZ offline conditions
• Resources aggregated across multiple AZs
• Superior to traditional NAS availability models
• Appropriate for Production / Tier 0 applications
Example use cases
• Big Data Analytics
• Media Workflow Processing
• Web Serving
• Content Management
• Home Directories
EFS – Mounting
[Diagram: several EC2 instances mounting a single EFS file system.]
EFS DNS name:
availability-zone.file-system-id.efs.aws-region.amazonaws.com
Mount on machine:
sudo mount -t nfs4 mount-target-DNS:/ ~/efs-mount-point
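The DNS name pattern and the mount command can be assembled programmatically, for example in a provisioning script. This is a minimal sketch that only builds strings and runs nothing; the function names and the example zone, file system ID, and region are hypothetical.

```python
def efs_mount_target_dns(availability_zone: str, file_system_id: str, region: str) -> str:
    """Fill in the DNS name pattern shown on the slide."""
    return f"{availability_zone}.{file_system_id}.efs.{region}.amazonaws.com"


def efs_mount_command(dns_name: str, mount_point: str = "~/efs-mount-point") -> str:
    """Build the mount command from the slide as a string for display or logging."""
    return f"sudo mount -t nfs4 {dns_name}:/ {mount_point}"


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    dns = efs_mount_target_dns("us-east-1a", "fs-12345678", "us-east-1")
    print(efs_mount_command(dns))
```

The mount point directory has to exist on the instance before the command is run.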
3. Object Stores
Amazon S3 (Simple Storage Service)
• Web-accessible object store
• Pay for exactly what you use
• Highly durable (99.999999999% design)
• Limitlessly scalable
• Natively online
• Two flavors:
– Standard Storage: $0.023* per GB / mo
– Standard – Infrequent Access Storage (min size 128KB): $0.0125* per GB / mo + data retrieval cost
* US East (N. Virginia) pricing
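The trade-off between the two flavors depends on how much data is read back. The sketch below compares monthly cost using the per-GB storage prices above; the $0.01/GB retrieval price for Infrequent Access is taken from the tiering slide later in the deck. Request charges, the 128KB minimum object size, and minimum storage durations are deliberately ignored, and the names are illustrative.

```python
# USD per GB, US East (N. Virginia), as quoted in this deck.
PRICES = {
    "STANDARD": {"storage": 0.023, "retrieval": 0.0},
    "STANDARD_IA": {"storage": 0.0125, "retrieval": 0.01},
}


def monthly_cost(tier: str, stored_gb: float, retrieved_gb: float = 0.0) -> float:
    """Storage plus retrieval cost for one month, ignoring request charges."""
    price = PRICES[tier]
    return stored_gb * price["storage"] + retrieved_gb * price["retrieval"]


if __name__ == "__main__":
    for tier in PRICES:
        print(tier, round(monthly_cost(tier, stored_gb=1000, retrieved_gb=100), 2))
```

With 1,000 GB stored and 100 GB read back in a month, Standard comes to about $23.00 and Infrequent Access to about $13.50. Infrequent Access loses its advantage only when monthly retrievals approach the size of the stored data.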
Amazon S3 (Simple Storage Service)
• Parallel I/O for max speed (Multipart Upload, Ranged GETs)
• Resource-level IAM permissions
• Bucket Policies & ACLs
• Direct access through APIs
• Server-Side Encryption
• Static Website Hosting
• Data Lifecycle Rules
• Amazon Athena (New)
– Interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL
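Ranged GETs work by splitting one object into byte ranges that can be fetched in parallel. The sketch below only computes the HTTP Range header values for a given object size and part size; it makes no network calls, and the function name is illustrative.

```python
def byte_ranges(object_size: int, part_size: int) -> list[str]:
    """Split an object of object_size bytes into HTTP Range header values."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1  # Range end is inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


if __name__ == "__main__":
    print(byte_ranges(10, 4))
```

Each value would be sent as the Range header of a separate GET request, and the parts reassembled in order on the client.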
Object Storage Tiering
S3 Standard
• Primary data
• Big Data Analytics
• Small objects
• Temporary scratch space
S3 - IA
• File sync and share
• Active archive
• Enterprise backup
• Media transcoding
• Geo-redundancy/DR
Glacier
• Deep/offline archives
• Tape vaulting replacement
• WORM-compliant data
Data tiering using S3 lifecycle policies
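A lifecycle policy that moves data down the tiers can be written as a small configuration document. The sketch below builds one as a Python dictionary; the shape follows, as far as I know, what boto3's `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument. The rule ID and the 30-day and 90-day thresholds are illustrative assumptions, and nothing here talks to AWS.

```python
def tiering_lifecycle_config(prefix: str = "", ia_after_days: int = 30,
                             glacier_after_days: int = 90) -> dict:
    """Build a lifecycle configuration: Standard -> Standard-IA -> Glacier."""
    if not 0 <= ia_after_days < glacier_after_days:
        raise ValueError("expected 0 <= ia_after_days < glacier_after_days")
    return {
        "Rules": [
            {
                "ID": "tier-to-ia-then-glacier",  # hypothetical rule name
                "Filter": {"Prefix": prefix},     # empty prefix applies to all objects
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = tiering_lifecycle_config(prefix="logs/")
```

Scoping the rule with a prefix keeps hot data, such as small temporary objects, out of the colder tiers.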
Object Storage Use Cases
[Diagram: use cases arranged by tier, with lifecycle policies moving data from S3 to S3-IA to Glacier.]
S3
• Cloud applications
• Big Data analytics
• Content distribution
• Primary data
• Temporary & small objects
S3-IA
• File sync & share
• Active archive
• Enterprise backup
• Media transcoding
• Disaster recovery / geo redundancy
Glacier
• Deep / offline archives
• Tape vaulting replacement
• WORM-compliant data
Storage Tiered To Your Requirements
• Durable: 99.999999999%
• Available: S3 99.99%, S3-IA 99.9%
• Performant: low latency, high throughput
• Scalable: elastic capacity, no preset limits
S3: "Hot" data (active and/or temporary data)
• ≥ 0 days, > 0K
• Starts at $0.023 / GB per month
S3-IA: "Warm" data (infrequently accessed data)
• ≥ 30 days, ≥ 128K
• $0.0125 / GB per month
• $0.01/GB retrieval
Glacier: "Cold" data (archive and compliance data)
• ≥ 90 days, > 0K
• $0.004 / GB per month
• 3 new retrieval options:
– Expedited: 1–5 mins, $0.03 / GB
– Standard: 3–5 hrs, $0.01 / GB
– Bulk: 5–12 hrs, $0.0025 / GB
Amazon Glacier
• Low-cost archival storage
• Secure
– SSL & AES-256
• Durable
– Designed for 99.999999999% durability
• Optimized for data archiving and backup
• Suitable for RTO measured in hours
• Includes storage costs and retrieval costs
• Three retrieval options: Expedited, Standard, Bulk
• As little as $0.004 per GB/month (US East pricing)
• Integrated with S3
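Choosing a retrieval option is a trade-off between price and how long you can wait. The sketch below picks the cheapest option whose worst-case retrieval time fits a deadline, using the times and per-GB prices quoted in this deck (Expedited 1–5 mins, Standard 3–5 hrs, Bulk 5–12 hrs). The function name is illustrative, and request fees are ignored.

```python
# (name, worst-case hours to retrieve, USD per GB retrieved), as quoted in this deck
RETRIEVAL_OPTIONS = [
    ("Bulk", 12.0, 0.0025),
    ("Standard", 5.0, 0.01),
    ("Expedited", 5 / 60, 0.03),
]


def cheapest_option(deadline_hours: float):
    """Return the cheapest (name, hours, price) that meets the deadline, or None."""
    candidates = [opt for opt in RETRIEVAL_OPTIONS if opt[1] <= deadline_hours]
    if not candidates:
        return None
    return min(candidates, key=lambda opt: opt[2])


if __name__ == "__main__":
    for deadline in (24, 6, 1):
        name, _, price = cheapest_option(deadline)
        print(f"deadline {deadline} h: {name} at ${price}/GB")
```

With a day to spare, Bulk is the cheapest fit; a one-hour recovery target leaves only Expedited.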
4. On-Premises Storage Integration
Storage Gateway hybrid storage solutions
Enables using standard storage protocols to access AWS storage services
[Diagram: AWS Storage Gateway presents files, volumes, and tapes on premises, backed by Amazon EBS snapshots, Amazon S3, and Amazon Glacier, and integrated with AWS Identity and Access Management (IAM), AWS Key Management Service (KMS), AWS CloudTrail, and Amazon CloudWatch.]
Storage Gateway – Files, volumes, and tapes
• File gateway: NFS (v3 and v4.1) interface. On-premises file storage backed by Amazon S3 objects
• Tape gateway: iSCSI virtual tape library interface. Virtual tape storage in Amazon S3 and Glacier with VTL management
• Volume gateway: iSCSI block interface. On-premises block storage backed by S3 with EBS snapshots
Storage Gateway – Common capabilities
Standard storage protocols integrate with on-premises applications
Local caching for low-latency access to frequently used data
Efficient data transfer with buffering and bandwidth management
Native data storage in AWS
Stateless virtual appliance for resiliency
Integrated with AWS management and security
File gateway
On-premises file storage maintained as objects in Amazon S3
• Data stored and retrieved from your S3 buckets
• One-to-one mapping from files to objects
• File metadata stored in object metadata
• Bucket access managed by an IAM role you own and manage
• Use S3 Lifecycle Policies, versioning, or CRR to manage data
[Diagram: on customer premises, application servers connect over NFSv3 / v4.1 to the file gateway, which connects over HTTPS to S3 Standard, S3 Standard - Infrequent Access, and Glacier.]
Volume gateway
On-premises volume storage backed by Amazon S3 with EBS snapshots
• Block storage in S3 accessed via the volume gateway
• Data compressed in transit and at rest
• Back up on-premises volumes to EBS snapshots
• Create on-premises volumes from EBS snapshots
• Up to 1PB of total volume storage per gateway
[Diagram: on customer premises, the volume gateway is accessed over iSCSI and connects over HTTPS to a Storage Gateway bucket in Amazon S3 and to Amazon EBS snapshots.]
Tape gateway
Virtual tape storage in Amazon S3 and Glacier with VTL management
• Virtual tape storage in S3 and Glacier accessed via tape gateway
• Data compressed in transit and at rest
• Unlimited virtual tape storage, with up to 1PB of tapes active in library
• Supports leading backup applications
[Diagram: on customer premises, a backup server connects over iSCSI to the tape gateway (media changer and tape drive). The gateway connects over HTTPS to virtual tapes stored in Amazon S3 and archived tapes stored in Amazon Glacier.]
Hybrid storage use cases with Storage Gateway
• Enabling cloud workloads: move data to AWS storage for Big Data, cloud bursting, or migration
• Tiered cloud storage: easily add AWS storage to your on-premises environment
• Backup, archive, and disaster recovery: cost-effective storage in AWS with local or cloud restore
Storage Gateway – Key Benefits
Seamless integration across standard storage protocols
Low-latency access
Durability, cost, and elasticity of AWS Storage services
Efficient data transfer
Data encryption
Integrated with AWS monitoring, management, and security
Amazon Snowball & Snowball Edge
• Petabyte-scale data transport
• Uses secure appliances
• Economical and fast
• Faster than Internet for significant data sets
• Import into S3
• HIPAA compliant (New)
What is Snowball? Petabyte-scale data transport
• E-ink shipping label
• Ruggedized case, "8.5G impact"
• All data encrypted end-to-end
• 80 TB, 10G network
• Rain & dust resistant
• Tamper-resistant case & electronics
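A quick calculation shows why shipping an appliance can beat the Internet for large data sets. The sketch below estimates how long 80 TB (one Snowball's capacity, per the slide) would take over a network link. The 80% link utilization default and the example link speeds are assumptions, decimal units are used (1 TB = 10^12 bytes), and the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, link_mbps: float, utilization: float = 0.8) -> float:
    """Days needed to move data_tb terabytes over a link of link_mbps megabits/s."""
    bits = data_tb * 1e12 * 8
    effective_bits_per_second = link_mbps * 1e6 * utilization
    return bits / effective_bits_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    for mbps in (100, 1_000, 10_000):
        print(f"{mbps} Mbps: {transfer_days(80, mbps):.1f} days for 80 TB")
```

At 100 Mbps and 80% utilization, 80 TB takes about 93 days, while filling the appliance over its local 10G port at full line rate would take under a day, before shipping time.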
How it works
Amazon Snowmobile
• Exabyte-scale data transfer service
• Each Snowmobile can transfer up to 100PB
• Delivered to your site like a container
• Connects to your network via a removable high-speed network switch
• Appears as a network-attached data store
• Once connected, secure, high-speed data transfer begins
• After data transfer, the Snowmobile is driven back to AWS and the data is loaded into the AWS service you select, e.g. S3, Redshift, Glacier
Introducing AWS Snowmobile
• 45-foot long ruggedized shipping container
• Up to 100PB of capacity
• Load data into S3 or Glacier
• Dedicated security personnel, GPS tracking, alarm monitoring, 24/7 video surveillance, and optional escort security while in transit
• Data encrypted with 256-bit encryption keys, managed through KMS
Using Multiple Storage Options Together
• EBS + S3: snapshots
• S3 + EC2 Instance Store: caching
• S3 + CloudFront: edge caching
• S3 + Glacier: data lifecycle archiving
It’s all about choice
Performance-oriented / Cost-oriented
Any Questions?