Digital Media Ingest and Storage Options on AWS
Henry Zhang, Amazon Web Services
Content has Gravity and is getting heavier…
…it’s easier to move processing to the content
4k/8k Content
Where is the problem?
More Bandwidth $$$$$
More Powerful Compute $$$$$
Way more Storage $$$$$
Some Progress (ABR, HEVC, VP10)
Where is the sliding scale on my Infrastructure?
AWS storage solutions
File: Amazon EFS
Block: Amazon EBS, Amazon EC2 Instance Store
Object: Amazon S3, Amazon Glacier
Data Transfer
AWS Direct Connect
AWS Snowball
ISV Connectors
Amazon Kinesis Firehose
Amazon S3 Transfer Acceleration
AWS Storage Gateway
A Concept - the Content Lake
Inspired by the Data Lake (term coined by James Dixon in 2010)
A single store of all the digital content that you create and acquire, in any form or factor
• Don’t assume any resolutions/formats (for now or future)
• It is up to the consumer (the application consuming the content) to use the appropriate infrastructure for processing
Amazon S3: the Content Lake
• Durable, cost-effective and fast
• Highly scalable front-end
  – Multi-part uploads (parallel writes)
  – Range-gets (parallel reads)
• No need for capacity planning or provisioning
• Use Amazon S3 with on-premises storage in a hybrid model
• Secure
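The multi-part upload and range-get bullets both come down to splitting one large object into byte ranges that are written or read in parallel. The following is a minimal sketch of that split, assuming a hypothetical 64 MiB part size; `plan_parts` and `range_header` are illustrative names and not part of any AWS SDK. The two limits encoded here (5 MiB minimum for every part except the last, at most 10,000 parts per upload) are the documented S3 multipart limits.

```python
import math

MIN_PART = 5 * 1024 * 1024   # S3 minimum size for every part except the last
MAX_PARTS = 10_000           # S3 maximum number of parts per multipart upload


def plan_parts(object_size, part_size=64 * 1024 * 1024):
    """Split an object into (part_number, start, end) tuples.

    start and end are inclusive byte offsets; part numbers start at 1.
    The 64 MiB default is an assumption for illustration only.
    """
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART:
        raise ValueError("part_size is below the 5 MiB S3 minimum")
    if math.ceil(object_size / part_size) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    parts = []
    start = 0
    number = 1
    while start < object_size:
        end = min(start + part_size, object_size) - 1
        parts.append((number, start, end))
        start = end + 1
        number += 1
    return parts


def range_header(start, end):
    """HTTP Range header value for reading one byte range (a range-get)."""
    return f"bytes={start}-{end}"
```

Each tuple can be handed to a separate worker: on ingest it identifies one part of a multipart upload, and on read its offsets become the `Range` header of one parallel GET.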
Object Storage Options
Tier                              Use case                   Access latency   Price
S3 Standard                       Active data/WIP            Milliseconds     $0.03/GB/mo
S3 Standard - Infrequent Access   Active Archive/Mezzanine   Milliseconds     $0.0125/GB/mo
Amazon Glacier                    Deep Archive/Retention     3-5 hours        $0.007/GB/mo
99.999999999% Durability
Durability for long-term preservation
Built-in Fixity Checking
Automatic recovery
1 PB raw storage
800 TB usable storage
600 TB allocated storage
400 TB application data
Storage pricing - pay only for what you use
AWS Cloud Storage
Storage Tiering - Data Lifecycle
- Transition Standard to Standard-IA
- Transition Standard-IA to Amazon Glacier
- Expiration lifecycle policy
- Versioning support
- Prefix support
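The lifecycle features on this slide (tier transitions, expiration, prefix scoping) are expressed in S3 as a lifecycle configuration document. Below is a minimal sketch that builds one such rule as a plain dictionary. The prefix `mezzanine/` and the 30/90/365-day thresholds are assumptions chosen for illustration, and `lifecycle_rule` is a hypothetical helper name. The dictionary follows the shape accepted by the boto3 `put_bucket_lifecycle_configuration` call, but nothing here calls AWS.

```python
import json


def lifecycle_rule(prefix, ia_after_days, glacier_after_days, expire_after_days):
    """Build one rule: Standard -> Standard-IA -> Glacier -> expiration."""
    if not ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("days must increase from IA to Glacier to expiration")
    return {
        "ID": f"tiering-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},          # prefix support
        "Status": "Enabled",
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": expire_after_days},
    }


# Hypothetical policy: objects under mezzanine/ move to Standard-IA after
# 30 days, to Glacier after 90 days, and expire after one year.
config = {"Rules": [lifecycle_rule("mezzanine/", 30, 90, 365)]}
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, the resulting `config` could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; the bucket name would be yours to supply.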
Data access frequency over time
[Chart; time axis marks: T, T + 3, T + 5, T + 15, T + 25, T + 30, T + 60, T + 90, T + 150, T + 250, T + 365 days]
Save money on storage
58% saving over S3 Standard
44% saving over S3 Standard-IA
* Assumes the highest public pricing tier
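The two savings figures appear to follow directly from the per-GB list prices quoted earlier in the deck ($0.03, $0.0125 and $0.007 per GB per month): moving from S3 Standard to Standard-IA, and from Standard-IA to Amazon Glacier. That reading is an inference from the numbers, not something the slide states. A small sketch of the arithmetic, with hypothetical function names and an assumed 1 TB = 1000 GB:

```python
# Prices per GB per month as quoted in this deck (highest public pricing tier).
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.03,
    "S3 Standard-IA": 0.0125,
    "Amazon Glacier": 0.007,
}


def saving_percent(from_tier, to_tier):
    """Whole-percent storage saving when data moves from one tier to another."""
    old = PRICE_PER_GB_MONTH[from_tier]
    new = PRICE_PER_GB_MONTH[to_tier]
    return round((1 - new / old) * 100)


def monthly_cost(tier, terabytes):
    """Monthly storage cost in dollars, assuming 1 TB = 1000 GB."""
    return PRICE_PER_GB_MONTH[tier] * terabytes * 1000


print(saving_percent("S3 Standard", "S3 Standard-IA"))      # 58
print(saving_percent("S3 Standard-IA", "Amazon Glacier"))   # 44
```

Request, retrieval and minimum-duration charges are left out of this sketch, so it only covers the storage component.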
Hydrating the Content Lake
Amazon S3
Amazon S3 (multi-part upload)
Direct Connect
N x 1G | 10G
Massively Scalable Front-end
AWS Snowball
What is Snowball? Petabyte scale data transport
E-ink shipping label
Ruggedized case
“8.5G Impact”
All data encrypted end-to-end
80 TB
10G network
Rain & dust resistant
Tamper-resistant case & electronics
AWS Snowball - Petabyte scale data transport solution
Scale and Speed
• Up to 80TB Capacity per device
• 10Gbps and 1Gbps connectivity
• Parallel data transfer enables PBs transferred in a week
Secure
• Tamper-resistant enclosure
• 256-bit encryption with KMS
• Secure data erasure
Simple
• Manage entire process through AWS Console
• Lightweight data transfer client
• Notifications
How it works
How fast is Snowball?
• Less than 1 day to transfer 300TB with 4x 80TB Snowballs, less than 1 week including shipping
• Number of days to transfer 300TB via the Internet at typical utilizations
Internet connection speed
Utilization   1Gbps   500Mbps   300Mbps   150Mbps
25%           95      190       316       632
50%           47      95        158       316
75%           32      63        105       211
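The device counts used in this deck (4 Snowballs for 300 TB, 13 for 1 PB) follow from dividing the data volume by the 80 TB device capacity and rounding up. A minimal sketch, assuming the full nominal 80 TB of each device is usable and 1 PB = 1000 TB; `snowballs_needed` is a hypothetical helper name:

```python
import math

SNOWBALL_CAPACITY_TB = 80  # nominal capacity per device, as stated on the slides


def snowballs_needed(data_tb, capacity_tb=SNOWBALL_CAPACITY_TB):
    """Number of devices needed to carry data_tb terabytes in one parallel job."""
    if data_tb <= 0:
        raise ValueError("data_tb must be positive")
    return math.ceil(data_tb / capacity_tb)


print(snowballs_needed(300))    # 4
print(snowballs_needed(1000))   # 13
```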
What does it cost?
Dimension                                    Price
Usage Charge per Job                         $250.00
Extra Day Charge (First 10 days* are free)   $15.00
Data Transfer In                             $0.00/GB
Data Transfer Out                            $0.02/GB
Shipping**                                   Varies
Amazon S3 Charges                            Standard storage and request fees apply
* Starts one day after the appliance is delivered to you. The first day the appliance is received at your site and the last day the appliance is shipped out are also free and not included in the 10-day free usage time.
** Shipping charges are based on your shipment destination and the shipping option (e.g., overnight, 2-day) you choose.
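The price list above can be turned into a per-job estimate. The sketch below uses only the figures from this slide; shipping and Amazon S3 charges are excluded because the slide gives no numbers for them. `snowball_job_cost` is a hypothetical helper, and `onsite_days` is assumed to count only chargeable days, that is, excluding the delivery day and the ship-out day described in the footnote.

```python
JOB_CHARGE = 250.00           # usage charge per job
FREE_DAYS = 10                # first 10 on-site days are free
EXTRA_DAY_CHARGE = 15.00      # per day beyond the free days
TRANSFER_OUT_PER_GB = 0.02    # data transfer out of AWS; transfer in is $0.00/GB


def snowball_job_cost(onsite_days, export_gb=0):
    """Estimated job cost in dollars, excluding shipping and S3 charges."""
    if onsite_days < 0 or export_gb < 0:
        raise ValueError("onsite_days and export_gb must be non-negative")
    extra_days = max(0, onsite_days - FREE_DAYS)
    return JOB_CHARGE + extra_days * EXTRA_DAY_CHARGE + export_gb * TRANSFER_OUT_PER_GB


# Import job kept on site for 14 chargeable days: 4 extra days.
print(snowball_job_cost(14))   # 310.0
```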
Transfer 1 PB with 13 devices in parallel in 1 week!
Data ingestion into AWS storage services
AWS Import/Export Snowball
• Accelerate PBs with AWS-provided appliances
• NEW 80 TB model
AWS Storage Gateway
• Instant hybrid cloud
• Up to 120 MB/s cloud upload rate (4x improvement)
Amazon Kinesis Firehose
• Ingest data streams directly into AWS data stores
AWS Direct Connect
• COLO to AWS
ISV Connectors
• CommVault
• VERITAS
• Dalet, Vidispine, etc.
NEW S3 Transfer Acceleration
• Accelerate object transfer up to 300% faster using AWS’s private network
Media Archive and Metadata (cloud transition)
Diagram: corporate data center with Onsite Archive, HSM, Metadata (Asset Manager), Processing Tasks and On-Premise Tape; Offsite Tape Archive
Media Archive (transition to the cloud)
Diagram: corporate data center with Onsite Archive, HSM, Metadata (Asset Manager), Processing Tasks and On-Premise Tape; Offsite Tape Archive; AWS Direct Connect to an AWS Region with Amazon Glacier and Cloud DAM (syncing metadata from on-prem)
Media Archive (transition to the cloud)
Diagram: corporate data center with Onsite Archive, HSM, Metadata (Asset Manager), Processing Tasks and On-Premise Tape; Offsite Tape Archive; AWS Direct Connect to an AWS Region with Amazon Glacier, Amazon S3, Cloud MAM (syncing metadata from on-prem) and Cloud Based Processing Tasks
Media Archive (transition to the cloud)
Diagram: corporate data center with Onsite Archive, Onsite Cache, HSM, Metadata (Asset Manager), Processing Tasks and On-Premise Tape; Offsite Tape Archive; AWS Direct Connect to an AWS Region with Amazon Glacier, Amazon S3, Cloud DAM (syncing metadata from on-prem) and Cloud Based Processing Tasks
Global Content Repository
Map legend: Edge Locations, Availability Zone, Region
Locations shown: Dallas (2), St. Louis, Miami, Jacksonville, Los Angeles (2), Seattle, Ashburn (3), Newark, New York (3), Dublin, London (2), Amsterdam (2), Stockholm, Frankfurt (2), Paris (2), Singapore (2), Hong Kong (2), Tokyo (2), Sao Paulo, South Bend, San Jose, Palo Alto, Hayward, Osaka, Milan, Sydney, Madrid, Seoul, Mumbai, Chennai
Reference Architecture – Content Processing Pipeline (Using Lambda)
Ingest: S3 multi-part API
S3 as backend storage for content files, accessible to other processing tasks
S3 Notification: trigger a Lambda function to start a transcoding job
Amazon Elastic Transcoder
S3 Notification: Lambda function to generate a signed URL to share the file
Update CMS or Metadata
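The two Lambda steps in this pipeline can be sketched as follows. This is an illustration under stated assumptions, not the presenter's implementation: `PIPELINE_ID`, `PRESET_ID`, the `transcoded/` output prefix and all function names are hypothetical. The AWS clients are passed in as arguments so the logic can be exercised without AWS; in Lambda they would be the boto3 `elastictranscoder` and `s3` clients, whose `create_job` and `generate_presigned_url` methods are called with the keyword arguments shown.

```python
from urllib.parse import unquote_plus

PIPELINE_ID = "HYPOTHETICAL-PIPELINE-ID"   # placeholder, replace with a real pipeline
PRESET_ID = "HYPOTHETICAL-PRESET-ID"       # placeholder, replace with a real preset


def uploaded_objects(event):
    """Yield (bucket, key) for every record in an S3 notification event."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def start_transcodes(event, transcoder):
    """Start one Elastic Transcoder job per uploaded object; return output keys."""
    outputs = []
    for _bucket, key in uploaded_objects(event):
        output_key = "transcoded/" + key.rsplit(".", 1)[0] + ".mp4"
        transcoder.create_job(
            PipelineId=PIPELINE_ID,
            Input={"Key": key},
            Outputs=[{"Key": output_key, "PresetId": PRESET_ID}],
        )
        outputs.append(output_key)
    return outputs


def share_urls(event, s3_client, expires_in=3600):
    """Return one time-limited signed GET URL per object in the event."""
    return [
        s3_client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expires_in,
        )
        for bucket, key in uploaded_objects(event)
    ]


def lambda_handler(event, context):
    """Entry point for the first S3 notification (ingest bucket)."""
    import boto3  # provided by the AWS Lambda Python runtime
    return start_transcodes(event, boto3.client("elastictranscoder"))
```

A second function wired to the output bucket's notification would call `share_urls` in the same way and then update the CMS or metadata store, which is left out here because the deck does not describe that system.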
Media Workloads Re-Imagined
Diagram labels:
Process: Amazon EBS/EFS/EC2 Instance Store (EBS, EFS, Instance Store); disposable infrastructure, auto-scaling, workload specific
Archive: Amazon Glacier (LifeCycle Policies)
Amazon S3
Content Access Transfer: Direct Connect
Partner/Affiliate/Service Provider
User Delivery/Consumption
VFX/Production
On-Prem Apps
Q&A
Learn more at:
http://aws.amazon.com/s3/
http://aws.amazon.com/glacier/
http://aws.amazon.com/importexport/
Media Solution: Sony DADC
On-demand cloud-based media supply chain and delivery solution
Problem Statement:
• Challenged by on-prem legacy infrastructure.
• Provide a performant, secure and economic media distribution solution.
• Decrease time to market for their customer’s finished content.
Use of AWS:
• EC2 content processing and SWF, SQS, SNS for media workflow automation
• S3 for storage, Glacier for content archive
• CloudFront for OTT.
Business Benefits:
• Workflow pipelines can be run in a highly parallelized fashion through AWS elastic scalability.
• Significantly shorten their content delivery SLA with a new AWS enabled target of 1-hr.
• Fully migrating away from on-prem infrastructure.
Consuming the Content Lake
Amazon S3
Amazon S3 (range-gets)
Massively Scalable S3 Front-end
Direct Connect
N x 1G | 10G
Massively Scalable Compute on AWS Cloud: EBS, EFS, Instance Store
On-Prem Apps
Preserve, retrieve, and restore every version of every object stored in your bucket
S3 automatically adds new versions and preserves deleted objects with delete markers
Easily control the number of versions kept by using lifecycle expiration policies
Easy to turn on in the AWS Management Console
Diagram: with versioning enabled, a PUT of Key = photo.gif creates a new version (Key = photo.gif, ID = 121212) and the earlier version (Key = photo.gif, ID = 111111) is preserved
S3 versioning
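The versioning behaviour described on this slide can be illustrated with a small in-memory model. This is a toy, not the S3 API: `VersionedBucket` is a hypothetical class, and the numeric version IDs are generated only to echo the slide's example (real S3 version IDs are opaque strings).

```python
import itertools


class VersionedBucket:
    """Toy model of a bucket with versioning enabled."""

    def __init__(self):
        # key -> list of (version_id, body); body None marks a delete marker
        self._versions = {}
        self._ids = itertools.count(111111, 10101)

    def put(self, key, body):
        """Add a new version and return its ID; older versions are kept."""
        version_id = str(next(self._ids))
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        """A delete without a version ID only adds a delete marker."""
        return self.put(key, None)

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one when version_id is given."""
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is None:
                raise KeyError(key)
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not None:
                return body
        raise KeyError((key, version_id))


bucket = VersionedBucket()
first = bucket.put("photo.gif", b"original")
second = bucket.put("photo.gif", b"edited")
bucket.delete("photo.gif")
print(first, second, bucket.get("photo.gif", first))
```

After the delete, a plain `get("photo.gif")` fails because the newest entry is a delete marker, while both earlier versions remain retrievable by ID.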
Amazon S3 event notifications
Delivers notifications to Amazon SNS, Amazon SQS, or AWS Lambda when events occur in Amazon S3
Diagram: S3 events → notifications → SNS topic, SQS queue, or Lambda function (Foo() {…})
Support for notification when objects are created via Put, Post, Copy, or Multipart Upload.
Support for notification when objects are deleted, as well as with filtering on prefixes and suffixes for all types of notifications.
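Prefix and suffix filtering is declared in the bucket's notification configuration. The sketch below builds such a configuration as a dictionary in the shape accepted by the boto3 `put_bucket_notification_configuration` call, plus a local helper that mimics the filter so it can be checked offline. The ARN, the `ingest/` prefix and the `.mxf` suffix are placeholders, and both function names are hypothetical.

```python
def notification_config(lambda_arn, prefix, suffix):
    """Notify a Lambda function about created objects matching prefix and suffix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                # Covers Put, Post, Copy and completed multipart uploads.
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": [
                    {"Name": "prefix", "Value": prefix},
                    {"Name": "suffix", "Value": suffix},
                ]}},
            }
        ]
    }


def matches(entry, key):
    """Local approximation of the prefix/suffix filter for one configuration entry."""
    rules = {r["Name"]: r["Value"] for r in entry["Filter"]["Key"]["FilterRules"]}
    return key.startswith(rules.get("prefix", "")) and key.endswith(rules.get("suffix", ""))


# Placeholder ARN for illustration only.
config = notification_config(
    "arn:aws:lambda:us-east-1:123456789012:function:start-transcode",
    prefix="ingest/",
    suffix=".mxf",
)
entry = config["LambdaFunctionConfigurations"][0]
print(matches(entry, "ingest/reel1.mxf"), matches(entry, "archive/reel1.mxf"))
```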
Elastic File System - Rendering in the Cloud
• Designed to support petabyte scale file systems
• Throughput scales linearly with storage
• Same latency spec across each AZ
• Thousands of concurrent NFS connections
• Works great for large I/O sizes
• Pay for only what you use not what you provision
• Managed with multi-copy durability
Securing your content on AWS
• MPAA alignment – AWS meets the latest content security guidelines (Aug 2015)
• VPC private endpoint for Amazon S3 – enables a true private workflow capability
• Encryption & key management capabilities
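One common way to pair a VPC endpoint with a private workflow is a bucket policy that denies any request not arriving through that endpoint. The sketch below only builds the policy document; the bucket name and endpoint ID are placeholders and `vpce_only_policy` is a hypothetical helper. It uses the `aws:sourceVpce` condition key. A policy like this also blocks access from outside the VPC, including the console, so treat it as an assumption-laden starting point rather than a recommended configuration.

```python
import json


def vpce_only_policy(bucket, vpce_id):
    """Bucket policy that denies S3 access unless it comes via one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpcEndpoint",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:sourceVpce": vpce_id}},
            }
        ],
    }


# Placeholder names for illustration only.
policy = vpce_only_policy("example-content-lake", "vpce-1a2b3c4d")
print(json.dumps(policy, indent=2))
```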