© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Guy Farber, AWS Business Development
5/19/2015
Getting Started: Storage with
Amazon S3 and Amazon Glacier
Agenda
• AWS Storage Options
• S3 - Scalable object storage
• Glacier - inexpensive archive storage
• Data ingest options
• Main use cases
• Q&A
AWS Global Infrastructure
11 Regions
28 Availability Zones
52 Edge locations
Control your geographic locality
for performance and compliance
AWS Storage Choices
Amazon S3
Durable object
storage for all types
of data
Amazon EBS
Block storage for use
with Amazon EC2
Amazon Glacier
Archival storage
for infrequently
accessed data
Economics
• Pay as you go
• No upfront investment
• No commitment
• No risky capacity planning
Easy to Use
• Self service administration
• SDKs for simple integration
Reduce risk
• Durable and secure
• Avoid risks of physical media handling
Agility, Scale
• Reduce time to market
• Focus on your business, not your infrastructure
Amazon EFS
File storage for use
with Amazon EC2
Amazon S3
Highly durable object storage for all types of data
Internet-scale storage
Grow without limits
Benefit from AWS’s
massive security
investments
Built-in redundancy
Designed for
99.999999999%
durability and 99.99%
availability
Low price per GB
per month
No commitment
No up-front cost
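To make the durability and availability figures concrete, here is a small arithmetic sketch. It assumes the 99.999999999% figure is read as a per-object, per-year design target and uses a 365-day year; the function names are illustrative only.

```python
from decimal import Decimal

# Design targets quoted on the slide.
DURABILITY = Decimal("0.99999999999")   # assumed: per object, per year
AVAILABILITY = Decimal("0.9999")

def expected_annual_losses(object_count: int) -> Decimal:
    """Expected number of objects lost per year at the design durability."""
    return object_count * (1 - DURABILITY)

def downtime_minutes_per_year() -> Decimal:
    """Minutes per 365-day year that fall outside the availability target."""
    return (1 - AVAILABILITY) * 365 * 24 * 60

losses = expected_annual_losses(10_000_000)
print("Expected objects lost per year:", losses)
print("Years per expected single loss:", 1 / losses)
print("Minutes of unavailability per year:", downtime_minutes_per_year())
```

Under these assumptions, storing ten million objects works out to roughly one expected object loss every ten thousand years, and the availability target allows a little under an hour of unavailability per year.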
S3 Key Features
Data Management
• Cost monitoring and controls
• Lifecycle management
Ease of use
• Programmatic access using AWS SDKs
• REST APIs
• Management Console, AWS CLI
Event Notifications
• Delivered using SQS, SNS, or Lambda
• Enable you to trigger workflows, alerts, or other processing
Data protection
• Versioning
• Cross-region replication
Security
• Multi-factor authentication (MFA) delete
• Flexible access control mechanisms
• Time-limited access to objects
• Access logs
• Multiple client-side and server-side encryption options
102% year-over-year increase in
data transfer to and from S3
(Q4 2014 vs Q4 2013, not including Amazon use)
S3 usage
S3 scalability: buckets and objects
S3 website: static content
1 PB raw storage
800 TB usable storage
600 TB allocated storage
400 TB application data
S3 capacity pricing—pay only for what you use!
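The gap between purchased and used capacity in the on-premises figures above can be worked out directly. This is a rough sketch that assumes 1 PB = 1000 TB; the helper names are hypothetical.

```python
# Capacity layers from the slide, in TB (assumption: 1 PB = 1000 TB).
LAYERS = [
    ("raw", 1000),
    ("usable", 800),
    ("allocated", 600),
    ("application data", 400),
]

def utilization(layers):
    """Return each layer's size as a fraction of raw capacity."""
    raw = layers[0][1]
    return {name: size / raw for name, size in layers}

def raw_tb_per_data_tb(layers):
    """TB of raw capacity purchased for every TB of application data."""
    return layers[0][1] / layers[-1][1]

for name, fraction in utilization(LAYERS).items():
    print(f"{name}: {fraction:.0%} of raw capacity")
print("Raw TB purchased per TB of application data:", raw_tb_per_data_tb(LAYERS))
```

With pay-for-what-you-use pricing, the billed amount tracks the last layer (application data) rather than the first (raw capacity).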
Amazon S3
S3 continuous cost reduction
Available through 11 regions globally
Priced at per GB-month rates
8 price reductions since launch
51% average S3 capacity fee
reduction on 4/1/2014
TCO: comparing on-premises to S3
• Can be challenging for some
customers
• We can help!
Reduced redundancy option: 99.99% durability, saves ~20%
Amazon Glacier
Archival storage for infrequently accessed data
Amazon Glacier
is optimized for
infrequent retrieval
Stop managing
physical media
Even lower cost than
Amazon S3;
Same high durability
3-5 hour retrieval latency
5% free tier on retrievals
$0.01 per GB/month
$123 per TB/year
Replace tape libraries, VTLs
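The per-TB yearly figure follows from the per-GB monthly price. A minimal sketch of that arithmetic, assuming a binary terabyte (1 TB = 1024 GB) and storage charges only (no retrieval or request fees):

```python
from decimal import Decimal

PRICE_PER_GB_MONTH = Decimal("0.01")  # USD, from the slide
GB_PER_TB = 1024                      # assumption: binary terabyte
MONTHS_PER_YEAR = 12

def annual_cost_per_tb(price_per_gb_month: Decimal = PRICE_PER_GB_MONTH) -> Decimal:
    """Storage-only cost in USD of keeping 1 TB for one year."""
    return price_per_gb_month * GB_PER_TB * MONTHS_PER_YEAR

def annual_cost(stored_tb: int) -> Decimal:
    """Storage-only cost in USD of keeping `stored_tb` TB for one year."""
    return annual_cost_per_tb() * stored_tb

print("Per TB per year:", annual_cost_per_tb())
print("100 TB per year:", annual_cost(100))
```

The result rounds to the $123 per TB/year quoted above.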
Glacier – 3 ways to ingest data
• Direct Glacier API/SDK
  • Direct access to Glacier for deep archives
• S3 lifecycle integration
  • Move older data to less expensive archive tier
• Third party tools and gateways
  • Integrate existing backup and archive applications using an IT-friendly interface
Optimize your storage spending by tiering on AWS
Use Amazon Glacier
for lowest-cost,
durable cold storage
of archival data
Use Amazon S3
for reliable, durable
primary storage
Use Amazon S3 Reduced
Redundancy Storage
for secondary backups
at a lower cost
RRS
S3 lifecycle policies
Key prefix “logs/”
Transition objects to Glacier 30 days after creation
Delete 365 days after creation date
<LifecycleConfiguration>
<Rule>
<ID>archive-in-30-days</ID>
<Prefix>logs/</Prefix>
<Status>Enabled</Status>
<Transition>
<Days>30</Days>
<StorageClass>GLACIER</StorageClass>
</Transition>
<Expiration>
<Days>365</Days>
</Expiration>
</Rule>
</LifecycleConfiguration>
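To illustrate what the rule above means for an individual object, the sketch below parses the same configuration and reports the state an object would be expected to reach at a given age. The helper names are hypothetical, the XML is a well-formed copy of the rule from the slide, and the timing is idealized: S3 applies lifecycle actions asynchronously, so real transitions may lag.

```python
import xml.etree.ElementTree as ET

# The lifecycle rule from the slide.
LIFECYCLE_XML = """
<LifecycleConfiguration>
  <Rule>
    <ID>archive-in-30-days</ID>
    <Prefix>logs/</Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>30</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>365</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
"""

def parse_rules(xml_text):
    """Extract the fields used on the slide from each <Rule> element."""
    rules = []
    for rule in ET.fromstring(xml_text.strip()).findall("Rule"):
        rules.append({
            "id": rule.findtext("ID"),
            "prefix": rule.findtext("Prefix"),
            "enabled": rule.findtext("Status") == "Enabled",
            "transition_days": int(rule.findtext("Transition/Days")),
            "storage_class": rule.findtext("Transition/StorageClass"),
            "expiration_days": int(rule.findtext("Expiration/Days")),
        })
    return rules

def expected_state(key, age_days, rules):
    """Hypothetical helper: idealized state of an object at a given age."""
    for rule in rules:
        if not rule["enabled"] or not key.startswith(rule["prefix"]):
            continue
        if age_days >= rule["expiration_days"]:
            return "DELETED"
        if age_days >= rule["transition_days"]:
            return rule["storage_class"]
    return "STANDARD"

rules = parse_rules(LIFECYCLE_XML)
for age in (10, 30, 365):
    print(age, expected_state("logs/app.log", age, rules))
```

Objects outside the `logs/` prefix are not matched by the rule and stay in their original storage class.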
Amazon S3 – advanced
features
• Preserve, retrieve, and restore every version
of every object stored in your bucket
• S3 automatically adds new versions and
preserves deleted objects with delete markers
• Easily control the number of versions kept by
using lifecycle expiration policies
• Easy to turn on in the AWS Management
Console
S3 versioning
Key = photo.gif
ID = 121212
Key = photo.gif
ID = 111111
Versioning
Enabled
PUT Key = photo.gif
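The versioning behavior described on this slide can be modeled in a few lines. This is a toy in-memory sketch, not the S3 API: the class and its sequential version IDs are hypothetical (real S3 version IDs are opaque strings).

```python
import itertools

_DELETE_MARKER = object()  # stands in for an S3 delete marker

class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}             # key -> [(version_id, data)], oldest first
        self._ids = itertools.count(1)  # hypothetical sequential version IDs

    def put(self, key, data):
        """Every PUT adds a new version instead of overwriting the old one."""
        version_id = str(next(self._ids))
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        """A plain delete adds a delete marker; older versions are preserved."""
        return self.put(key, _DELETE_MARKER)

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one when version_id is given."""
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is _DELETE_MARKER:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not _DELETE_MARKER:
                return data
        raise KeyError((key, version_id))

bucket = VersionedBucket()
first = bucket.put("photo.gif", b"original")
bucket.put("photo.gif", b"edited")
bucket.delete("photo.gif")
print(bucket.get("photo.gif", first))  # the first version is still retrievable
```

After the delete, a plain `get("photo.gif")` fails in this model, while any earlier version can still be fetched by its ID.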
S3 cross-region replication
Automated, fast, and reliable asynchronous replication of data across AWS regions
Source
(Virginia)
Destination
(Oregon)
• Only replicates new PUTs. Once
S3 is configured, all new uploads
into a source bucket will be
replicated
• Entire bucket or prefix based
• 1:1 replication between any 2
regions
• Versioning required
Use cases:
• Compliance: store data hundreds of miles apart
• Lower latency: distribute data to regional customers
• Security: create remote replicas managed by separate AWS accounts
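Two of the points above (only new PUTs are replicated, and scope is the entire bucket or a prefix) can be captured as a simple predicate. This is a sketch of the rule as stated on the slide, with hypothetical names; it does not call the S3 API.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class ReplicationRule:
    prefix: str           # "" means the entire bucket
    enabled_at: datetime  # when replication was configured on the source bucket

def is_replicated(rule: ReplicationRule, key: str, put_time: datetime) -> bool:
    """True if an upload falls under the rule: matching prefix and uploaded
    at or after the time replication was configured."""
    return key.startswith(rule.prefix) and put_time >= rule.enabled_at

rule = ReplicationRule(prefix="logs/", enabled_at=datetime(2015, 5, 19))
print(is_replicated(rule, "logs/app.log", datetime(2015, 6, 1)))   # new upload under the prefix
print(is_replicated(rule, "logs/old.log", datetime(2015, 1, 1)))   # uploaded before configuration
```

Objects that were already in the source bucket before the rule was configured would need to be copied separately.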
S3 event notifications
Delivers notifications to Amazon SNS, Amazon SQS, or AWS
Lambda when events occur in S3
S3
Events
SNS topic
SQS queue
Lambda function
Notifications
Foo() {…}
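As an illustration of the Lambda target, here is a minimal handler sketch that reads the bucket name and object key from an S3 event notification. The sample event is trimmed to the fields used here, and the bucket and key values are made up for the example.

```python
from urllib.parse import unquote_plus

def handler(event, context):
    """Return (event name, bucket, key) for each record in an S3 notification."""
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification payload.
        key = unquote_plus(s3["object"]["key"])
        results.append((record["eventName"], bucket, key))
    return results

# Trimmed sample event with hypothetical values.
SAMPLE_EVENT = {
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-bucket"},
            "object": {"key": "logs/2015/app+server.log", "size": 1024},
        },
    }]
}

for name, bucket, key in handler(SAMPLE_EVENT, None):
    print(name, bucket, key)
```

A real function would replace the return value with the workflow, alert, or processing step it is meant to trigger.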
S3 virtual private endpoint (VPCE)
Prior to S3 VPCE
• Public IP on EC2 instances and IGW
• Private IP on EC2 instances and NAT
Using S3 VPCE
• Access S3 using the S3 private endpoint (VPCE) without using NAT instances or gateways
• Increased security
S3 data encryption options
Client-side encryption using AWS SDKs
• You manage the encryption keys and never send them to AWS
Server-side encryption (SSE) with Amazon S3 managed keys
• “Check-the-box” to encrypt your data at rest. Keys managed by S3
SSE with customer provided keys
• You manage your encryption keys and provide them for PUTs and GETs
SSE with AWS Key Management Service managed keys
• Keys managed centrally in AWS KMS with permissions and auditing of usage
For more details – watch Encryption and Key Management in AWS:
https://www.youtube.com/watch?v=uhXalpNzPU4
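For SSE with customer provided keys, "provide them for PUTs and GETs" means sending the key with each request. The sketch below builds the three request headers S3 uses for this; the function name is hypothetical, the request itself is not sent, and key storage and rotation are left out.

```python
import base64
import hashlib
import os

def sse_c_headers(key: bytes) -> dict:
    """Build the SSE-C request headers for a 256-bit customer-provided key."""
    if len(key) != 32:
        raise ValueError("SSE-C expects a 256-bit (32-byte) key")
    return {
        "x-amz-server-side-encryption-customer-algorithm": "AES256",
        # The key itself, base64-encoded.
        "x-amz-server-side-encryption-customer-key":
            base64.b64encode(key).decode("ascii"),
        # Base64-encoded MD5 digest of the key, used as an integrity check.
        "x-amz-server-side-encryption-customer-key-MD5":
            base64.b64encode(hashlib.md5(key).digest()).decode("ascii"),
    }

# Example key generated locally; the same key must accompany every later GET.
headers = sse_c_headers(os.urandom(32))
for name in headers:
    print(name)
```

Because the key travels in the request, these headers should only be sent over HTTPS.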
AWS data ingest options
AWS Import/
Export
Internet/VPN
AWS Storage Gateway
Service
AWS Direct
Connect
S3 and Glacier use cases
Cloud Storage for web applications
Origin store for content distribution
Staging area and persistent store for Big Data analytics
Backup and archive target
Druva InSync SaaS: Endpoint Data Protection
Druva relies on Amazon
EC2, S3 and DynamoDB
for inSync Cloud - a fully
automated, secure,
enterprise backup solution
“Building inSync Cloud on
AWS has meant a faster
time to market”
Milind Borate, Druva CTO
http://aws.amazon.com/solutions/case-studies/druva/
• S3 can be used as durable
origin for global content
distribution
• Provides single origin for
multiple CDNs, such as
Amazon CloudFront
• Data transfer out of S3 into
CloudFront is now free!
• Optimal for serving static
web assets such as images,
videos and HTML
Single origin storage for content distribution
[Diagram: an Amazon S3 bucket as the origin for multiple edge locations]
Amazon CloudFront edge locations
AWS provides full-site,
or media asset, delivery
via a worldwide content
delivery network (CDN)
called Amazon CloudFront.
SoundCloud—leveraging S3 and Glacier
for audio transcoding
• World’s leading social sound platform
• Audio files must be transcoded and
stored in multiple formats
S3 via
CloudFront
Glacier
Big Data analytics is a rapidly growing use case…
• Common staging area for Big
Data analytics jobs
• Use distributed cluster solutions
(e.g., MapReduce) to run large-scale
processing and analysis of data
• Scale compute resources
without depending on storage
• Leverage a highly available
object store that can be easily
shared by multiple instances
within a cluster
S3 for staging and persistently storing Big Data
[Diagram: Amazon Simple Storage Service (S3), Amazon EMR job flow, Amazon EC2 instances, Amazon CloudWatch; labels: input data, output results, metrics]
The Amazon EMR job flow runs on a cluster of Amazon EC2 instances
Netflix, a global video delivery provider,
uses S3 as the storage layer for Hadoop-
based Big Data applications
S3 Value:
• 11 9’s of durability
• Use Versioning as protection against accidental
deletes and overwrites
• Grew quickly from a few hundred TB to many PBs
• Access the same data in S3 from multiple Hadoop
clusters
• Tight integration with Amazon EMR
S3 for Big Data: Netflix
• Eliminate over-purchasing
and provisioning with virtually
limitless capacity
• Enable Information Lifecycle
Management with automated
tiering between S3 and
Glacier
• Ideal for regulatory and
compliance cases
[Diagram: objects rawdata1, rawdata2, rawdata3 in "My S3 bucket" are archived to Amazon Glacier after 30 days and deleted after 7 years]
Backup and Archive
Easy cloud backup with AWS Storage Gateway
Customer Datacenter
Amazon S3
AWS Storage
Gateway VM
On-Premises Host
Application Servers
iSCSI
Works with
existing
applications
Direct Attached or
Storage Area Network Disks
AWS Storage
Gateway Service
Replace physical tape with AWS Storage
Gateway-VTL
Customer Datacenter
On-Premises Host
Direct Attached or
Storage Area Network Disks
Backup
Application
SCSI Tape Protocol
over iSCSI
Virtual Tape Library
Software Appliance VM
Amazon S3
AWS Storage
Gateway Service
Amazon Glacier
Virtual Tape
Library
Virtual Tape
Shelf
AWS Technology Partners integrate with S3
and Glacier
Spot Trading implements NetApp SteelStore to
optimize backup process
Spot Trading is a
technology-focused
proprietary trading firm
built on applied
technology, using the
latest in innovation to
solve problems in the
financial markets.
NetApp SteelStore + Amazon S3 value:
• 40 hours/month reclaimed by IT team to
focus on new strategies and systems
• Annual archival cost reduced by 96%
• Two-year ROI for SteelStore appliance,
including cloud storage costs
• $500,000 potential cost avoidance by
eliminating a costly SAN upgrade
• Deduplication reduced dataset by 85%
• Restores in minutes (from cache) or 4–5
hours (from Glacier) vs. days with tape
• Encryption in flight and at rest meets data
security requirements
Glacier
Data Center
AWS
Cloud-integrated storage appliance
NetApp SteelStore
S3
What’s next?
Getting started with S3 and Glacier:
http://aws.amazon.com/s3/getting-started/
http://aws.amazon.com/glacier/getting-started/
Pricing:
http://aws.amazon.com/s3/pricing/
http://aws.amazon.com/glacier/pricing/
AWS YouTube channel:
https://www.youtube.com/user/AmazonWebServices/playlists
AWS Summit – Chicago: An exciting, free cloud conference designed to educate and inform new
customers about the AWS platform, best practices and new cloud services.
Details
• July 1, 2015
• Chicago, Illinois
• @ McCormick Place
Featuring
• New product launches
• 36+ sessions, labs, and bootcamps
• Executive and partner networking
Registration is now open
• Come and see what AWS and the cloud can do for you.
CTA Script
- If you are interested in learning more about how to navigate the cloud to grow
your business - then attend the AWS Summit Chicago, July 1st.
- Register today to learn from technical sessions led by AWS engineers, hear best
practices from AWS customers and partners, and participate in some of the 30+
paid sessions and labs.
- Simply go to
https://aws.amazon.com/summits/chicago/?trkcampaign=summit_chicago_bootcamps&trk=Webinar_slide
to register today.
- Registration is FREE.