© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Vyom Nagrani, Sr. Product Manager, AWS Lambda
June 16, 2015
Dynamic Data Ingestion with
Amazon S3 and AWS Lambda
Amazon S3 Event Notifications: Integrating
storage and workflows
Delivers notifications to Amazon SNS, Amazon SQS, or AWS
Lambda when events occur in Amazon S3
[Diagram: S3 events are delivered as notifications to an SNS topic, an SQS queue, or a Lambda function]
Benefits of Amazon S3 Notifications for dynamic
data ingestion
Integration – A new surface on the
Amazon S3 “building block” for event-
based computing
Speed – typical time to send
notifications is less than a second
Simplicity – Avoids proxies or polling
to detect changes
[Diagram: notifications replace a proxy or a List/Diff polling loop]
AWS Lambda: A compute service that runs
your code in response to events
Lambda functions: Stateless, event-driven code execution
Triggered by events:
• Put to an Amazon S3 bucket
• Record in an Amazon Kinesis stream
• Direct sync and async invocations
Makes it easy to
• Build back-end services that perform at scale
• Perform data-driven auditing, analysis, and notification
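As a rough illustration of the S3 "Put" trigger listed above, the sketch below shows what a function reacting to an S3 notification might look like. Python is used only for illustration; the function names `extract_s3_objects` and `handler` are hypothetical, and the code assumes the standard S3 notification event layout (a `Records` list with `s3.bucket.name` and a URL-encoded `s3.object.key`).

```python
import urllib.parse


def extract_s3_objects(event):
    """Return (bucket, key, event_name) tuples from an S3 notification event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+'), so decode them.
        key = urllib.parse.unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key, record.get("eventName", "")))
    return objects


def handler(event, context):
    """Hypothetical Lambda entry point: log each object and return a count."""
    objects = extract_s3_objects(event)
    for bucket, key, event_name in objects:
        # Anything printed here ends up in the function's log output.
        print(f"{event_name}: s3://{bucket}/{key}")
    return {"processed": len(objects)}
```

The handler keeps no state between invocations, which matches the stateless, event-driven model described on this slide.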
Benefits of AWS Lambda for building a serverless
data processing engine
“Productivity focused compute platform to build powerful, dynamic,
modular applications in the cloud”
1. No infrastructure to manage
Focus on business logic, not infrastructure. You upload code; AWS Lambda handles everything else.
2. High performance at any scale; cost-effective and efficient
Pay only for what you use: Lambda automatically matches capacity to your request rate. Purchase compute in 100ms increments.
3. Bring your own code
Run code in a choice of standard languages. Use threads, processes, files, and shell scripts normally.
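To make the "100ms increments" point concrete, here is a small arithmetic sketch of how a run time might be rounded up for billing. It assumes compute is priced per GB-second of configured memory; the price is a caller-supplied parameter rather than a real rate, and both function names are hypothetical.

```python
import math


def billed_duration_ms(duration_ms, increment_ms=100):
    """Round a run time up to the next billing increment."""
    return math.ceil(duration_ms / increment_ms) * increment_ms


def invocation_cost(duration_ms, memory_mb, price_per_gb_second):
    """Estimate the compute cost of one invocation.

    price_per_gb_second is an assumption supplied by the caller; check the
    current pricing page for real values.
    """
    gb_seconds = (memory_mb / 1024) * (billed_duration_ms(duration_ms) / 1000)
    return gb_seconds * price_per_gb_second
```

For example, a function that runs for 101 ms is billed for 200 ms under this rounding rule.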
What you can do with S3+Lambda
Customers have told us about powerful applications …
… and we look forward to seeing what you create.
Today’s demo #1: Workflow of a simple video
transcoding application
[Diagram: new video uploaded to Amazon S3 → notification → AWS Lambda → Amazon S3]
Potential further additions to a production
video transcoding application
• Include custom transcoding/watermarking libraries
• Break longer video files into smaller clips, transcode each clip separately
• Transcode to multiple formats by running multiple Lambda functions in parallel
• Send S3 event notification to an SNS topic
• Subscribe multiple Lambda functions to that SNS topic
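The SNS fan-out in the last two bullets can be sketched as follows. When S3 publishes to an SNS topic, each subscribed function receives the S3 event as a JSON string inside the SNS message, so it has to be unwrapped first. The helper names and the `transcoded/<format>/...` output layout are hypothetical choices for illustration, not part of the demo.

```python
import json


def s3_records_from_sns(event):
    """Unwrap the S3 notification records carried inside an SNS event."""
    records = []
    for sns_record in event.get("Records", []):
        # The S3 notification is delivered as a JSON string in Sns.Message.
        message = json.loads(sns_record["Sns"]["Message"])
        records.extend(message.get("Records", []))
    return records


def output_key(source_key, target_format):
    """Hypothetical naming scheme: one output prefix per target format."""
    base = source_key.rsplit(".", 1)[0]
    return f"transcoded/{target_format}/{base}.{target_format}"
```

Each function subscribed to the topic could call `output_key` with its own target format, so the parallel transcodes do not overwrite each other.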
Today’s demo #2: Workflow of infrastructure
monitoring and automation application
[Diagram components: AWS CloudTrail, Amazon S3, notification, AWS Lambda, Amazon SNS, AWS IAM; one step is marked optional]
Potential further additions to a production
infrastructure monitoring and automation application
• In addition to monitoring and alarming, create automated actions in response to policy
violations or suspicious activity
• Create .config file with multiple check points
• Each check can have a different SNS topic to alarm against
• Aggregate CloudTrail log files to be delivered to a single admin S3 bucket across all your
AWS accounts
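One way to picture the "multiple check points, each with its own SNS topic" idea is the sketch below. It assumes a CloudTrail log file is gzip-compressed JSON with a top-level `Records` list whose entries carry an `eventName`. The `CHECKS` list, its field names, and the topic ARNs are made-up placeholders standing in for the .config file.

```python
import gzip
import json

# Hypothetical check list standing in for the .config file; ARNs are placeholders.
CHECKS = [
    {"event_name": "DeleteTrail",
     "topic_arn": "arn:aws:sns:us-east-1:123456789012:trail-alarms"},
    {"event_name": "CreateUser",
     "topic_arn": "arn:aws:sns:us-east-1:123456789012:iam-alarms"},
]


def find_alarms(log_bytes, checks=CHECKS):
    """Return (topic_arn, record) pairs for log records that match a check."""
    log = json.loads(gzip.decompress(log_bytes).decode("utf-8"))
    topic_by_event = {c["event_name"]: c["topic_arn"] for c in checks}
    alarms = []
    for record in log.get("Records", []):
        topic = topic_by_event.get(record.get("eventName"))
        if topic:
            alarms.append((topic, record))
    return alarms
```

A Lambda function could read the log object from S3, pass its bytes to `find_alarms`, and publish each match to the returned topic.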
Today’s demo #3: Workflow of automated file
de-duplication on upload
[Diagram: new file uploaded to Amazon S3 → notification → AWS Lambda → Amazon S3, with Amazon DynamoDB as an optional component]
Potential further additions to a production
automated file de-duplication application
• Create and compare a SHA hash for each file instead of using the S3 ETag, to reduce collisions
• Handle collision situations by calling another Lambda function to do a full file compare
• Index all hashes to a DynamoDB table, check against table instead of reading all files in the
bucket each time a new file is uploaded/edited
• Create Lambda wrapper around deleteObject API call to update index table
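The hash-index idea above can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-256 is one possible choice of SHA hash, a plain dictionary stands in for the DynamoDB table, and `register_upload` / `unregister` are hypothetical helper names.

```python
import hashlib


def content_hash(data):
    """SHA-256 digest of the object's bytes (one possible choice of SHA hash)."""
    return hashlib.sha256(data).hexdigest()


def register_upload(index, key, data):
    """Return the key of an existing identical object, or None if this one is new.

    `index` maps digest -> key; a dict stands in for the DynamoDB table here.
    """
    digest = content_hash(data)
    existing = index.get(digest)
    if existing is not None and existing != key:
        return existing  # duplicate of an already indexed object
    index[digest] = key
    return None


def unregister(index, key):
    """Drop index entries for a deleted object, mirroring the deleteObject wrapper."""
    for digest in [d for d, k in index.items() if k == key]:
        del index[digest]
```

Looking up one digest per upload avoids re-reading every file in the bucket, which is the point of the index bullet above.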
Things to remember about S3 Notifications
• Amazon S3 event notifications are set up at the bucket level
• Highly reliable – designed for nine ‘9’s, with at-least-once delivery
• Currently supports Put, Post, Copy, CompleteMultipartUpload, and ReducedRedundancyLostObject events
• Configuration stored as XML in the notification subresource associated with a bucket
• No additional charge for S3 Notifications
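As a sketch of bucket-level configuration, the snippet below builds a notification configuration in the shape accepted by boto3's `put_bucket_notification_configuration`. The helper names and the choice of which event goes to which destination are illustrative assumptions; `apply_notification_config` needs boto3 and AWS credentials and is not exercised here.

```python
def build_notification_config(function_arn, topic_arn=None):
    """Build a bucket notification configuration (illustrative routing only)."""
    config = {
        "LambdaFunctionConfigurations": [
            # Invoke the function for every object-created event (Put, Post, Copy, ...).
            {"LambdaFunctionArn": function_arn, "Events": ["s3:ObjectCreated:*"]}
        ]
    }
    if topic_arn:
        # Assumption for this sketch: lost-object events go to an SNS topic instead.
        config["TopicConfigurations"] = [
            {"TopicArn": topic_arn, "Events": ["s3:ReducedRedundancyLostObject"]}
        ]
    return config


def apply_notification_config(bucket, config):
    """Attach the configuration to a bucket (requires boto3 and credentials)."""
    import boto3

    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=config
    )
```

Because the configuration is set on the bucket, applying a new one replaces whatever notification configuration the bucket had before.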
Attaching a Lambda function to S3 Notifications
• Automatic Scaling: Both S3 and Lambda scale automatically with higher PUT rates
• Lambda has a default limit of 1000 TPS, which can be raised through the AWS Support Center
• Lambda queues all incoming requests from S3
• Lambda can absorb reasonable bursts of traffic for approximately 15-30 minutes
[Diagram: source S3 bucket → Lambda frontend queue → Lambda functions → destinations 1, 2, …. S3 scales automatically; Lambda will scale with higher PUT rates.]
Best practices for creating Lambda functions
• Memory: CPU proportional to the memory configured
• Increasing memory makes your code execute faster (if CPU bound)
• Timeout: Increasing timeout allows for longer functions, but more wait in case of errors
• Retries: For S3, Lambda retries each failed invocation at least 3 times
• Events rejected by AWS Lambda may be retained and retried by S3 for 24 hours
• Permission model: S3 pushes events to Lambda, so grant S3 invocation permission
through a resource policy, and give the Lambda function an execution role
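The resource-policy half of the permission model might look like the sketch below, which builds the arguments for the Lambda `add_permission` API call in boto3. The statement ID format and helper names are hypothetical; `grant_s3_invoke` needs boto3 and credentials and is shown only for context.

```python
def s3_invoke_permission(function_name, bucket, account_id):
    """Arguments for lambda add_permission that let one bucket invoke a function."""
    return {
        "FunctionName": function_name,
        # Statement IDs may not contain dots, so bucket-name dots are replaced.
        "StatementId": "s3-invoke-" + bucket.replace(".", "-"),
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        "SourceArn": f"arn:aws:s3:::{bucket}",
        "SourceAccount": account_id,
    }


def grant_s3_invoke(function_name, bucket, account_id):
    """Apply the permission (requires boto3 and credentials)."""
    import boto3

    boto3.client("lambda").add_permission(
        **s3_invoke_permission(function_name, bucket, account_id)
    )
```

Restricting the statement to a source bucket and account keeps other buckets from invoking the function.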
Monitoring and Debugging Lambda functions
• Monitoring: available in Amazon CloudWatch Metrics
• Invocation count
• Duration
• Error count
• Throttle count
• Debugging: available in Amazon CloudWatch Logs
• All Metrics
• Custom logs
• RAM consumed
• Search for log events
• Real time feed of log events delivered to an Amazon Kinesis stream
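The four monitoring metrics listed above could be queried along these lines. The sketch assumes the `AWS/Lambda` CloudWatch namespace with the metric names `Invocations`, `Duration`, `Errors`, and `Throttles`, keyed by a `FunctionName` dimension; the helper names and the chosen statistics are illustrative. `fetch_metrics` needs boto3 and credentials.

```python
from datetime import datetime, timedelta, timezone

# Metric name -> statistic to request (the statistics are an illustrative choice).
LAMBDA_METRICS = {
    "Invocations": "Sum",
    "Duration": "Average",
    "Errors": "Sum",
    "Throttles": "Sum",
}


def metric_queries(function_name, end_time, hours=1, period_seconds=300):
    """Build one get_metric_statistics request per Lambda metric."""
    start_time = end_time - timedelta(hours=hours)
    return [
        {
            "Namespace": "AWS/Lambda",
            "MetricName": name,
            "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
            "StartTime": start_time,
            "EndTime": end_time,
            "Period": period_seconds,
            "Statistics": [statistic],
        }
        for name, statistic in LAMBDA_METRICS.items()
    ]


def fetch_metrics(function_name):
    """Run the queries against CloudWatch (requires boto3 and credentials)."""
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    now = datetime.now(timezone.utc)
    return {
        q["MetricName"]: cloudwatch.get_metric_statistics(**q)["Datapoints"]
        for q in metric_queries(function_name, now)
    }
```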
Customers running dynamic data ingestion
and processing using S3+Lambda
“I want to apply custom logic to process content being uploaded to my data store.”
• Watermarking / thumbnailing
• Transcoding
• Indexing and deduplication
• Aggregation and filtering
• Preprocessing
• Content validation
[Diagram: Amazon S3 bucket events → AWS Lambda → transcoded files, indexing tables, or notifications]
Three Next Steps
1. Enable S3 notification feature on your existing S3 buckets. Amazon S3 event notifications can be sent in response to actions taken on objects uploaded or stored in Amazon S3.
2. Create and test your first Lambda function. With AWS Lambda, there are no new languages, tools, or frameworks to learn. You can use any third party library, even native ones.
3. Use AWS Lambda to process Amazon S3 objects … no infrastructure to manage, and set up a dynamic data ingestion pipeline in minutes!
Thank you!
Visit http://aws.amazon.com/s3, the
AWS blog, and the S3 forum to learn
more and get started using S3.
Visit http://aws.amazon.com/lambda,
the AWS Compute blog, and the
Lambda forum to learn more and get
started using Lambda.
AWS Summit – Chicago: An exciting, free cloud conference designed to educate and inform new
customers about the AWS platform, best practices and new cloud services.
Details
• July 1, 2015
• Chicago, Illinois
• @ McCormick Place
Featuring
• New product launches
• 36+ sessions, labs, and bootcamps
• Executive and partner networking
Registration is now open
• Come and see what AWS and the cloud can do for you.
• Click here to register: http://amzn.to/1RooPPL