© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Dr. Tim Wagner, General Manager, AWS Lambda and Amazon API Gateway
April 27, 2017
Serverless Design Patterns with AWS Lambda:
Big Data with Little Effort
CRAFT-Conf 2017
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Agenda
What is serverless?
Build and run applications without thinking about servers
Let’s take a look at the evolution of computing
Physical servers in data centers
Virtual servers in data centers
Virtual servers in the cloud
Each progressive step was better
Physical servers in data centers
Virtual servers in data centers
• Higher utilization
• Faster provisioning speed
• Improved uptime
• Disaster recovery
• Hardware independence
Virtual servers in the cloud
• Trade CAPEX for OPEX
• More scale
• Elastic resources
• Faster speed and agility
• Reduced maintenance
• Better availability and fault tolerance
But there are still limitations
Physical servers in data centers
Virtual servers in data centers
Virtual servers in the cloud
• Trade CAPEX for OPEX
• More scale
• Elastic resources
• Faster speed and agility
• Reduced maintenance
• Better availability and fault tolerance
• Still need to administer virtual servers
• Still need to manage capacity and utilization
• Still need to size workloads
• Still need to manage availability, fault tolerance
• Still expensive to run intermittent jobs
Evolving to serverless
SERVERLESS
Virtual servers in the cloud
Physical servers in data centers
Virtual servers in data centers
No server is easier to manage than no server
All of these responsibilities go away
• Provisioning and utilization
• Availability and fault tolerance
• Scaling
• Operations and management
EVENT DRIVEN
CONTINUOUS SCALING
PAY BY USAGE
Deliver on demand, never pay for idle
There is a spectrum of compute options
[Figure: size of deployable unit (virtual machine, container, app, function) against responsibility (you, shared, cloud provider), with IaaS, Docker, PaaS, and Function as a Service (FaaS) placed along the spectrum]
There is a spectrum of compute options
[Figure: size of deployable unit against responsibility (you, shared, cloud provider), with Amazon EC2 (virtual machine), Amazon ECS (container), AWS Elastic Beanstalk (app), and Function as a Service (FaaS) placed along the spectrum]
FaaS services differ in operational burden
[Figure: size of deployable unit against responsibility (you, shared, cloud provider), showing Amazon EC2, Amazon ECS, AWS Elastic Beanstalk, and several Function as a Service (FaaS) offerings at different levels of responsibility]
Serverless means no management burden
[Figure: size of deployable unit against responsibility (you, shared, cloud provider), showing Amazon EC2, Amazon ECS, AWS Elastic Beanstalk, and the Function as a Service (FaaS) offerings, one of which is labeled Serverless]
AWS Lambda: Run code in response to events
FUNCTION
SERVICES (ANYTHING)
Changes in data state
Requests to endpoints
Changes in resource state
Node.js, Python, Java, C#
EVENT SOURCE
Amazon S3 Amazon DynamoDB
Amazon Kinesis
AWS CloudFormation
AWS CloudTrail
Amazon CloudWatch
Amazon Cognito
Amazon SNS
Amazon SES
Cron events
DATA STORES
ENDPOINTS
CONFIGURATION & MANAGEMENT
EVENT/MESSAGE SERVICES
Example event sources that trigger AWS Lambda
AWS CodeCommit
Amazon API Gateway
Amazon Alexa
AWS IoT
AWS Step Functions
Using AWS Lambda
Bring your own code
• Node.js 4.3, 6.10
• Java 8
• Python 2.7, 3.6
• .NET Core 1.0.1
• Bring your own libraries (even native ones)
Simple resource model
• Select power rating from 128 MB to 1.5 GB
• CPU and network allocated proportionately
Flexible use
• Synchronous or asynchronous
• Integrated with other AWS services
Flexible authorization
• Securely grant access to resources and VPCs
• Fine-grained control for invoking your functions
Using AWS Lambda
Authoring functions
• WYSIWYG editor or upload packaged .zip
• Third-party plugins (Eclipse, Visual Studio)
Monitoring and logging
• Metrics for requests, errors, and throttles
• Built-in logs to Amazon CloudWatch Logs
Programming model
• Use processes, threads, /tmp, sockets normally
• AWS SDK built in (Python and Node.js)
Stateless
• Persist data using external storage
• No affinity or access to underlying infrastructure
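The programming model and the stateless rule above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the handler name, the `numbers` field in the event, and the returned shape are all assumptions made for the example.

```python
import json
import tempfile


def handler(event, context):
    """Hypothetical handler: everything it needs arrives in `event`."""
    numbers = event.get("numbers", [])

    # Scratch files are fine for the duration of one invocation; on Lambda
    # the temp directory is typically /tmp. Nothing here is expected to
    # survive until the next invocation.
    with tempfile.NamedTemporaryFile("w+", suffix=".json") as scratch:
        json.dump(numbers, scratch)
        scratch.seek(0)
        loaded = json.load(scratch)

    # Anything that must outlive the invocation belongs in external storage
    # (S3, DynamoDB, ...); this sketch simply returns the result.
    return {"count": len(loaded), "total": sum(loaded)}
```

Because the function keeps no state of its own, any number of copies can run side by side, which is what lets the platform scale it without coordination.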
How event sources work
ASYNCHRONOUS PUSH MODEL (Amazon S3, Amazon SNS)
• Mapping owned by event source
• Invokes Lambda asynchronously (async invocation)
SYNCHRONOUS PUSH MODEL (Amazon Alexa, AWS IoT)
• Mapping owned by event source
• Calls Lambda synchronously (sync invocation)
STREAM PULL MODEL (Amazon DynamoDB, Amazon Kinesis)
• Mapping owned by Lambda
• Lambda polls the source
• Lambda function called when new records found on stream (sync invocation)
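Whichever model delivers the event, the function itself only sees a JSON document. The sketch below shows, under stated assumptions, how one handler might read an S3 notification (push model) and a Kinesis batch (stream pull model). The helper names are hypothetical; the field paths follow the event shapes these two services send to Lambda.

```python
import base64
from urllib.parse import unquote_plus


def s3_objects(event):
    """Yield (bucket, key) pairs from an S3 notification (push model)."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def kinesis_payloads(event):
    """Yield decoded payloads from a Kinesis batch (stream pull model)."""
    for record in event.get("Records", []):
        # Kinesis record data is base64-encoded inside the Lambda event.
        yield base64.b64decode(record["kinesis"]["data"])


def handler(event, context):
    records = event.get("Records", [])
    if records and "s3" in records[0]:
        return [f"s3://{bucket}/{key}" for bucket, key in s3_objects(event)]
    if records and "kinesis" in records[0]:
        return [payload.decode("utf-8") for payload in kinesis_payloads(event)]
    return []
```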
Building blocks for serverless
Compute: AWS Lambda
Storage: Amazon S3
Database: Amazon DynamoDB
API Proxy: Amazon API Gateway
Messaging and Queues: Amazon SNS, Amazon SQS
Analytics: Amazon Kinesis
Orchestration and State Management: AWS Step Functions
Monitoring and Debugging: AWS X-Ray
Edge Compute: AWS Greengrass, Lambda@Edge
Serverless changes how you deliver
Speeds up time to market
Dedicated time to innovation
Increases developer productivity
Eliminates operational complexity
Serverless use cases
Chatbots
• Powering chatbot logic
• Alexa Skills for Amazon Echo
Common use cases
Web applications
• Static websites
• Dynamic web apps
• Packages for Flask and Express
Backends
• Apps & services
• Mobile
• IoT
Media & Log Processing
• Real-time data
• Streaming data
Big Data
• MapReduce
• Batch
Lambda + S3
Common use cases
• Lambda processes 200-300 images uploaded per minute
• Peak processing of 6,000 images per minute
• Reduced image processing time from hours to just over 10 seconds
Common use cases
http://www.vogue.it/
• Lambda performs image processing for PhotoVogue, which hosts 400,000+ photos online
• User experience up to 90% faster
Lambda + Kinesis + DynamoDB
Common use cases
• Processes 4,000 requests per second
• Built the solution in only 2.5 months
• Handled spikes in traffic of 2x normal load
Lambda + API Gateway + S3 + DynamoDB
Serverless web applications
Chalice
Frameworks for building serverless apps
Serverless Java Container
Serverless app lifecycle
Capabilities of a serverless platform
Application modeling framework
Monolithic application
Microservices
But what happens when you have an entire app made up of many functions?
Composing serverless applications
Meet SAM
AWS Serverless Application Model (SAM)
Standard model for representing serverless applications on AWS
Functions, APIs, event sources, and data stores
Simplifies deployment and management for serverless applications
AWS Serverless Application Model (SAM)
• Natively supported by AWS CloudFormation
• Export any function as a SAM template
• Package and deploy SAM templates using AWS CLI
• Open spec under Apache 2.0 for community extensions
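To make the model concrete, here is a minimal sketch of a SAM template built as a Python dictionary and printed as JSON. The resource names (`UploadBucket`, `ProcessUpload`), the handler path, and the code location are assumptions for illustration; the `Transform` value and the `AWS::Serverless::Function` type come from the SAM specification.

```python
import json


def sam_template(handler="app.handler", runtime="python3.6", code_uri="./src"):
    """Build a minimal SAM template: one function triggered by one S3 bucket."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",
        "Resources": {
            # Hypothetical bucket whose uploads trigger the function.
            "UploadBucket": {"Type": "AWS::S3::Bucket"},
            "ProcessUpload": {
                "Type": "AWS::Serverless::Function",
                "Properties": {
                    "Handler": handler,
                    "Runtime": runtime,
                    "CodeUri": code_uri,
                    "Events": {
                        "ObjectCreated": {
                            "Type": "S3",
                            "Properties": {
                                "Bucket": {"Ref": "UploadBucket"},
                                "Events": "s3:ObjectCreated:*",
                            },
                        }
                    },
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(sam_template(), indent=2))
```

Saved to a file, a template like this is the kind of input that `aws cloudformation package` and `aws cloudformation deploy` accept through `--template-file`.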
CI/CD for serverless applications
AWS CodePipeline + SAM
Commit: GitHub, Amazon S3, AWS CodeCommit
Build: AWS CodeBuild
Test: AWS CodeBuild, third-party tools
Deploy to Prod: AWS CloudFormation
AWS CodeStar New!
Big Data and AWS Lambda!
Big Data
• MapReduce
• Batch
Big data
Inputs → Map Phase → Reduce Phase → Results
Demos
Focus on two techniques
• Amazon S3: persistent data
• Amazon Kinesis Firehose: streaming data
Many more options and topics!
• SQL and NoSQL data triggers
• Sharded streams
• Queues, notifications, working with 3rd party systems, …
Why Lambda + S3?
• Easy: both scale automatically
• Common: S3 is one of the most-used AWS services
• Persistent: data sticks around
• Can be immutable (i.e., keep both original and transformed data)
• Two ways to work:
  • Read/write from S3 in a Lambda function
  • Hook up S3 bucket events to a Lambda function
Why Lambda + Firehose?
• Easy: both scale automatically, no sharding needed
• 24-hour record retention
• Works great for real-time analytics pipelines
• Data can go to Amazon Elasticsearch Service, S3, or Redshift
• Data can come from anywhere
  • …including Lambda!
• Data can be transformed in flight
  • …using a simple Lambda function
Demos
Focus on two techniques
• Amazon S3: persistent data
  1. Parallel data processing: Lambda → Amazon S3
  2. In-place object transforms: S3 → Lambda → S3
• Amazon Kinesis Firehose: streaming data
  3. Streaming data: Lambda → Amazon Kinesis Firehose
  4. Streaming transforms: Firehose → Lambda → Firehose
1. Parallel data processing using Lambda
Amazon S3
Amazon S3
Animal sightings data
{
"squirrel": 33,
"raccoon": 11,
"hummingbird": 1,
…
}
Three levels:
200 parallel functions × 200 S3 objects per function × 500 entries per S3 object = 20 million entries
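A rough sketch of one of those parallel functions follows. It is not the demo code: the species list, the key layout, the event fields, and the `put_object` callable are assumptions. In a deployment `put_object` would wrap an S3 write, and passing it in keeps the sketch runnable without AWS credentials.

```python
import json
import random

ANIMALS = ["squirrel", "raccoon", "hummingbird"]  # illustrative species list


def make_object(entries, rng):
    """Build one object's worth of [animal, count] entries."""
    return [[rng.choice(ANIMALS), rng.randint(1, 50)] for _ in range(entries)]


def worker(event, put_object):
    """One parallel function: writes `objects` objects under its own prefix."""
    rng = random.Random(event["worker_id"])  # deterministic per worker
    for i in range(event["objects"]):
        key = f"sightings/{event['worker_id']:03d}/{i:03d}.json"
        put_object(key, json.dumps(make_object(event["entries"], rng)))
    return event["objects"] * event["entries"]


def total_entries(functions=200, objects=200, entries=500):
    """Three-level fan-out: functions x objects per function x entries per object."""
    return functions * objects * entries
```

With the defaults, `total_entries()` gives 200 × 200 × 500 = 20,000,000, matching the figure on the slide.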
2. In-place S3 data transformation
Amazon S3
Lambda
Object written to S3
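The S3 → Lambda → S3 pattern might look like the sketch below. It assumes each object holds a JSON list of [animal, count] entries and that results go under a hypothetical `aggregated/` prefix; the S3 client is passed in (for example `boto3.client("s3")` in a deployment) so the logic can be exercised with a stand-in.

```python
import json
from collections import Counter
from urllib.parse import unquote_plus


def aggregate(entries):
    """Collapse [animal, count] entries into one total per animal."""
    totals = Counter()
    for animal, count in entries:
        totals[animal] += count
    return dict(totals)


def make_handler(s3, out_prefix="aggregated/"):
    """Return a handler bound to an S3 client with get_object/put_object."""

    def handler(event, context):
        written = []
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = unquote_plus(record["s3"]["object"]["key"])
            if key.startswith(out_prefix):
                continue  # skip our own output so the function cannot retrigger itself
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            result = aggregate(json.loads(body))
            out_key = out_prefix + key
            s3.put_object(Bucket=bucket, Key=out_key,
                          Body=json.dumps(result).encode("utf-8"))
            written.append(out_key)
        return written

    return handler
```

Writing to a separate prefix keeps the original object untouched, so both the original and the transformed data remain available.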
World’s simplest map-reduce!
• Each S3 object is mapped (aggregated) by the event processor
• We run N parallel reducers, one per S3 directory
• Final serial reduction over the directories (could be parallelized, but N is small here)
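The three steps above can be simulated locally to see the shape of the computation. This is a sketch under assumptions: objects are modeled as an in-memory mapping from "directory/name" keys to [animal, count] entries, and the function names are made up for the example. On AWS, each map and each per-directory reduce would be its own Lambda invocation.

```python
from collections import Counter, defaultdict


def map_object(entries):
    """Map step: aggregate one object's [animal, count] entries."""
    totals = Counter()
    for animal, count in entries:
        totals[animal] += count
    return totals


def reduce_counts(partials):
    """Reduce step: merge any number of partial totals."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return merged


def run(objects):
    """objects maps 'directory/name' keys to lists of [animal, count] entries."""
    by_directory = defaultdict(list)
    for key, entries in objects.items():
        directory = key.rsplit("/", 1)[0]
        by_directory[directory].append(map_object(entries))  # one map per object
    # N reducers, one per directory; these are independent and could run in parallel.
    per_directory = {d: reduce_counts(p) for d, p in by_directory.items()}
    # Final serial reduction over the N directory results.
    return dict(reduce_counts(per_directory.values()))
```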
Serverless Map/Reduce with Lambda
https://github.com/awslabs/lambda-refarch-mapreduce
Capture Data Streams (EVENT SOURCE)
• IoT Data
• Financial Data
• Log Data
• Clickstream Data
Process Data Streams (FUNCTION)
Output Data
• DATABASE
• CLOUD SERVICES
Real-time streaming data
3. Streaming data ingestion
Kinesis Firehose
Kinesis Firehose
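Sending data from Lambda into Firehose could be sketched as follows. The stream name and record shape are assumptions; the client is passed in (for example `boto3.client("firehose")` in a deployment), and `put_record_batch` is called in chunks because it accepts at most 500 records per request.

```python
import json


def batches(items, size=500):
    """Split items into lists of at most `size` records."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_sightings(firehose, stream_name, sightings):
    """Send dict records to a delivery stream; return how many records failed."""
    failed = 0
    for batch in batches(sightings):
        response = firehose.put_record_batch(
            DeliveryStreamName=stream_name,
            # Newline-delimit records so consumers can split them apart again.
            Records=[{"Data": (json.dumps(s) + "\n").encode("utf-8")} for s in batch],
        )
        failed += response.get("FailedPutCount", 0)
    return failed
```

A non-zero return value means some records were rejected and would need to be retried by the caller.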
4. Inline streaming transformation
Lambda
Kinesis Firehose
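A transformation function for Firehose receives a batch of base64-encoded records and must return each record ID with a result of `Ok`, `Dropped`, or `ProcessingFailed`. The sketch below follows that contract; the transformation itself (upper-casing a hypothetical `animal` field and dropping records without one) is only an illustration.

```python
import base64
import json


def handler(event, context):
    """Firehose transform: upper-case the animal name, drop records without one."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
        except ValueError:
            # Undecodable input: hand the record back untouched and flag it.
            output.append({"recordId": record_id, "result": "ProcessingFailed",
                           "data": record["data"]})
            continue
        if not isinstance(payload, dict) or "animal" not in payload:
            output.append({"recordId": record_id, "result": "Dropped",
                           "data": record["data"]})
            continue
        payload["animal"] = payload["animal"].upper()
        data = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
        output.append({"recordId": record_id, "result": "Ok",
                       "data": data.decode("utf-8")})
    return {"records": output}
```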
…and the cool part?
That was free.
All the infrastructure on which it ran is gone.
• Buy compute time in 100 ms increments
• Low request charge
• No hourly, daily, or monthly minimums
• No per-device fees
Never pay for idle!
Free Tier: 1 million requests and 400,000 GB-seconds of compute every month, every customer
AWS Lambda pricing
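The billing rules on this slide can be turned into a small estimator. The 100 ms increment and the free tier come from the slide; the two unit prices are assumptions (they are not stated in the deck), so treat the dollar figures as illustrative and check current pricing.

```python
import math

# Assumed unit prices, not taken from the slides.
PRICE_PER_REQUEST = 0.20 / 1_000_000
PRICE_PER_GB_SECOND = 0.00001667

# Free tier as stated on the slide.
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000


def billed_ms(duration_ms):
    """Round a duration up to the next 100 ms increment."""
    return math.ceil(duration_ms / 100) * 100


def monthly_cost(invocations, avg_duration_ms, memory_mb):
    """Estimate one month's charge after the free tier is applied."""
    gb_seconds = invocations * billed_ms(avg_duration_ms) / 1000 * memory_mb / 1024
    compute = max(0.0, gb_seconds - FREE_GB_SECONDS) * PRICE_PER_GB_SECOND
    requests = max(0, invocations - FREE_REQUESTS) * PRICE_PER_REQUEST
    return round(compute + requests, 2)
```

For example, one million 120 ms invocations at 128 MB are billed as 200 ms each, which is 25,000 GB-seconds and stays inside the free tier.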
Serverless is a fundamental component of modern applications
Conclusion
Lambda is a fundamental component of modern application architectures
It has a place in everything from data processing to simple web apps