© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Webinars
Prahlad Rao, AWS Solutions Architect
Balaji Iyer, AWS Professional Services Consultant
Mar 21, 2017
Optimizing the Data Tier for Serverless Web Applications
What to Expect from the Session
• Anatomy of Serverless Apps
  • Web applications
  • Mobile backends
• Hierarchy of choice for data tier options on AWS
  • Data tier for Serverless architectures
  • SQL vs. NoSQL considerations
  • AWS Lambda with Amazon DynamoDB
  • AWS Lambda with Amazon RDS database
  • AWS Lambda with Amazon ElastiCache
• Additional Best Practices
  • Caching and retries
Web application architecture
[Architecture diagram. Components: Amazon Cognito (web-federated identity and Cognito User Pools), Amazon API Gateway, AWS Lambda, Amazon DynamoDB, Amazon ElastiCache, Amazon RDS, Amazon SES (mailers). Flows: view blog posts (GETs); manage/edit blog posts (POSTs); AWS Lambda triggers for sign-ups.]
Mobile backend
[Architecture diagram. Components: Amazon API Gateway (https://api.myapp.com), Amazon Cognito, AWS STS, AWS Lambda functions, Amazon DynamoDB, Amazon RDS, Amazon ElastiCache, Amazon S3. Label: Core Business Logic.]
Data tier options on AWS
• Amazon DynamoDB: Document and Key-Value Store
• Amazon RDS: SQL Database Engines
• Amazon ElastiCache: In-Memory Key-Value Store
• Amazon Redshift: Data Warehouse
NoSQL vs. SQL for a new app: how to choose?
SQL
• Strong schema, complex relationships, transactions and joins
• Single/Cluster system scaling
• Focus on ACID consistency and availability
• SQL tables will have faster query performance when running complex queries
• Structured data sources, large ecosystem of SQL toolsets
NoSQL
• Partial schema, easy reads and writes, simple data model
• Focus on performance and availability at scale
• Varied data sources, dynamic
• High data volume, denormalized
• Horizontal scaling
Amazon DynamoDB use cases
• Ad Tech: ad serving, retargeting, ID lookup, user profile management, session-tracking, RTB
• IoT: tracking state, metadata and readings from millions of devices, real-time notifications
• Gaming: recording game details, leaderboards, session information, usage history, and logs
• Mobile & Web: storing user profiles, session details, personalization settings, entity specific metadata
AWS Lambda with DynamoDB
• Configuration
  • No VPC configuration required
  • IAM roles for access and authentication
  • Leverage FGAC (Fine Grained Access Control) for granular access to DynamoDB tables
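As a rough illustration of FGAC, the sketch below builds an IAM policy document that restricts a caller to items whose partition key equals their own Amazon Cognito identity ID, using the `dynamodb:LeadingKeys` condition key. The table ARN, account number, and the set of allowed actions are hypothetical and chosen only for the example.

```python
import json

# Hypothetical table ARN, used only for illustration.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/BlogPosts"


def build_fgac_policy(table_arn):
    """Return an IAM policy document that limits a caller to items whose
    partition key equals their own Cognito identity ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed action list; trim it to what the app really needs.
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                ],
                "Resource": [table_arn],
                "Condition": {
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [
                            "${cognito-identity.amazonaws.com:sub}"
                        ]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_fgac_policy(TABLE_ARN), indent=2))
```

A policy of this shape would typically be attached to the IAM role that Cognito-authenticated users assume, so each user can only touch their own rows.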
AWS Lambda with DynamoDB
• Performance
  • Simple API model
  • Invoke concurrent connections at scale
  • Query consistency with volume growth
  • Simply dial-up read/write capacity units for scaling
  • Use DynamoDB for storing persistent data, complement with ElastiCache for better read performance
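Dialing capacity up or down can be sketched as a single `update_table` request. The helper below assumes a client that behaves like `boto3.client("dynamodb")`; the `RecordingClient` class is a hypothetical stand-in that only shows the request shape without calling AWS.

```python
def scale_table(dynamodb_client, table_name, read_units, write_units):
    """Request new provisioned throughput for a table.

    `dynamodb_client` is expected to behave like boto3.client("dynamodb").
    """
    if read_units < 1 or write_units < 1:
        raise ValueError("capacity units must be at least 1")
    return dynamodb_client.update_table(
        TableName=table_name,
        ProvisionedThroughput={
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    )


class RecordingClient:
    """Hypothetical stand-in that records requests instead of calling AWS."""

    def __init__(self):
        self.calls = []

    def update_table(self, **kwargs):
        self.calls.append(kwargs)
        return {"TableDescription": {"TableName": kwargs["TableName"]}}
```

In a real deployment the first argument would be a boto3 DynamoDB client, and the table name and unit counts (here left to the caller) would come from your own capacity planning.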
RDS use cases
Applicable wherever you need relational databases
• eCommerce
• Gaming
• Websites
• IT Solutions
• Apps
• Reporting
Amazon Aurora: fast, available, and MySQL-compatible
[Diagram labels: SQL, Transactions, Caching; AZ 1, AZ 2, AZ 3; Amazon S3]
✓ 5x faster than MySQL on same hardware
✓ Sysbench: 100K writes/sec and 500K reads/sec
✓ Designed for 99.99% availability
✓ 6-way replicated storage across 3 AZs
✓ Scale to 64 TB and 15 read replicas
AWS Lambda with RDS
• VPC Configuration
  • Lambda functions by default have access to internet
  • Grant Lambda functions access to resources (RDS, EC2, ElastiCache) in your own VPC by adding:
    § VPC subnet IDs and security group IDs to Lambda configuration
    § Lambda function execution role with the AWSLambdaVPCAccessExecutionRole managed policy
    § Security group inbound rules on VPC resources that allow the appropriate ports for the subnet
  • Allows access to peered VPCs, VPN endpoints, and private S3 endpoints
  • Lambda access to VPC is optional, unless you need to access VPC resources
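The subnet and security group settings above map onto the `VpcConfig` parameter of the Lambda `UpdateFunctionConfiguration` API. A minimal sketch, assuming a client that behaves like `boto3.client("lambda")`; the function name and IDs in the usage are hypothetical, and `RecordingLambdaClient` is a stand-in so the request shape can be inspected offline.

```python
def attach_function_to_vpc(lambda_client, function_name, subnet_ids, security_group_ids):
    """Point an existing function at VPC subnets and security groups.

    `lambda_client` is expected to behave like boto3.client("lambda").
    """
    if not subnet_ids or not security_group_ids:
        raise ValueError("at least one subnet and one security group are required")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    )


class RecordingLambdaClient:
    """Hypothetical stand-in that records requests instead of calling AWS."""

    def __init__(self):
        self.calls = []

    def update_function_configuration(self, **kwargs):
        self.calls.append(kwargs)
        return {"FunctionName": kwargs["FunctionName"], "VpcConfig": kwargs["VpcConfig"]}
```

Supplying subnets in more than one Availability Zone is a common choice, so the function keeps running if one zone has problems.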
AWS Lambda with RDS
• VPC Configuration
  • Functions configured for VPC access lose internet access
    • This holds even with “Auto-assign Public IP” enabled, an Internet gateway attached, and a security group that allows all outbound traffic
  • If functions need access to both the Internet and the VPC, attach them to a private subnet with Internet access through a NAT instance or Amazon VPC NAT gateway
  • Ensure subnets have enough IPs for ENIs
  • Avoid DNS resolution of public hostnames for your VPC when accessing through Lambda function
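To check whether a subnet has enough addresses for ENIs, a back-of-the-envelope calculation can help. The sketch below makes two assumptions that are not stated on the slide: that AWS reserves five addresses in every subnet, and that ENI demand follows the rule of thumb from AWS guidance of this period, peak concurrent executions × (memory in GB / 3 GB). Treat the numbers as estimates only.

```python
import ipaddress
import math

# Assumption: AWS reserves the first four addresses and the last one per subnet.
AWS_RESERVED_IPS_PER_SUBNET = 5


def usable_ips(cidr):
    """Number of addresses in a subnet that ENIs could actually use."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_IPS_PER_SUBNET


def estimated_enis(peak_concurrency, memory_mb):
    """Rule-of-thumb ENI estimate (assumption, see lead-in):
    peak concurrent executions * (memory in GB / 3 GB), rounded up."""
    return math.ceil(peak_concurrency * (memory_mb / 1024) / 3)


def subnets_have_capacity(cidrs, peak_concurrency, memory_mb):
    """True if the subnets together can hold the estimated ENIs."""
    total = sum(usable_ips(cidr) for cidr in cidrs)
    return total >= estimated_enis(peak_concurrency, memory_mb)
```

For example, under these assumptions two /24 subnets offer 502 usable addresses, which just covers the 500 ENIs estimated for 1,000 concurrent executions at 1,536 MB. Other resources in the same subnets also consume addresses, so real headroom is smaller.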
AWS Lambda with RDS
• Performance
  • RDS instance type is important for high Lambda concurrency
  • Concurrency control using a “Kinesis sandwich” (Lambda -> Kinesis -> Lambda -> Storage tier). Allows throttling the backend at a different rate than the frontend (may increase latency)
  • Instantiate database connections outside the scope of the handler for connection re-use; other options are language frameworks (Node.js knex, Sequelize) or open source libraries like Hibernate
  • Faster query performance for complex queries
  • Fine tune max_connections based on DB instance type
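Connection re-use can be sketched by keeping the connection at module scope and creating it lazily. To keep the example runnable anywhere, it uses the standard-library `sqlite3` module as a stand-in for an RDS driver; with a real database the `connect` call would be replaced by your MySQL or PostgreSQL driver. The `posts` table and the event shape are hypothetical.

```python
import sqlite3

# Module scope: survives across warm invocations of the same container.
_connection = None


def get_connection():
    """Create the connection on first use, then hand back the same object."""
    global _connection
    if _connection is None:
        # Stand-in for an RDS driver connect call (assumption for this sketch).
        _connection = sqlite3.connect(":memory:")
        _connection.execute(
            "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, title TEXT)"
        )
    return _connection


def handler(event, context):
    """Insert one post and report how many posts exist."""
    conn = get_connection()
    conn.execute("INSERT INTO posts (title) VALUES (?)", (event["title"],))
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    return {"post_count": count}
```

Because each concurrent container holds its own connection, the number of open connections grows with Lambda concurrency, which is why `max_connections` and instance type matter.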
AWS Lambda with Aurora on Amazon RDS and KMS
Database Authentication
[Diagram: AWS Lambda, RDS Database, AWS KMS (master keys for encrypt/decrypt), VPC NAT Gateway, with numbered steps 1 to 4]
1. Encrypt the DB password file with KMS
2. Package the encrypted DB password file along with the Lambda deployment package and upload to Lambda
3. When the function is invoked, Lambda connects to KMS through the NAT gateway to decrypt the password file
4. Lambda connects to the database using the extracted credentials to read/write records
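Step 3 can be sketched as a single KMS `decrypt` call whose result is cached at module scope, so warm invocations skip the round trip through the NAT gateway. The helper assumes a client that behaves like `boto3.client("kms")`; `FakeKms` is a hypothetical stand-in that merely reverses bytes and performs no real cryptography.

```python
# Cached at module scope so warm invocations do not call KMS again.
_cached_password = None


def get_db_password(kms_client, encrypted_blob):
    """Decrypt the packaged password once and reuse it afterwards.

    `kms_client` is expected to behave like boto3.client("kms").
    """
    global _cached_password
    if _cached_password is None:
        response = kms_client.decrypt(CiphertextBlob=encrypted_blob)
        _cached_password = response["Plaintext"].decode("utf-8")
    return _cached_password


class FakeKms:
    """Hypothetical stand-in: reverses the bytes, no real encryption."""

    def __init__(self):
        self.calls = 0

    def decrypt(self, CiphertextBlob):
        self.calls += 1
        return {"Plaintext": CiphertextBlob[::-1]}
```

In a real function the encrypted blob would be read in binary mode from the file shipped in the deployment package, and the execution role would need permission to call `kms:Decrypt` on the key.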
ElastiCache use cases
Caching layer for performance or cost optimization of an underlying database
Storage of ephemeral key-value data
High-performance application patterns such as leaderboards (for gaming users), session management, event counters, in-memory lists
AWS Lambda with ElastiCache
• Configuration
  • Lambda configuration to access ElastiCache resources inside VPC
  • Use IAM roles for access and authentication
  • Leverage additional libraries (pymemcache, node discovery) within your function
AWS Lambda with ElastiCache
• Performance
  • Invoke concurrent connections at scale
  • Use Redis pipeline to maximize number of operations per second
  • Handle high throughput by scaling instance types
  • ElastiCache offers faster performance with lowest latency
  • Write-through vs. lazy load based on applications
  • Memcached for read heavy workloads
    • Instead of updating both the cache and the persistent database, invalidate the cache and let the readers update it
  • Redis for write heavy workloads
  • Move data structures outside of the web apps to the data stores
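The lazy-load and invalidate-on-write pattern above can be sketched with a cache object that exposes `get`, `set`, and `delete`, which is the shape of the calls offered by common Redis and Memcached clients. Here `DictCache` is an in-memory stand-in and a plain dict plays the persistent database; both are assumptions made to keep the example self-contained.

```python
class DictCache:
    """In-memory stand-in exposing the get/set/delete calls used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def read_post(cache, database, post_id):
    """Lazy load: try the cache first, fall back to the database on a miss."""
    key = "post:%s" % post_id
    value = cache.get(key)
    if value is None:
        value = database[post_id]
        cache.set(key, value)  # the reader repopulates the cache
    return value


def update_post(cache, database, post_id, body):
    """Write to the database, then invalidate so the next reader reloads."""
    database[post_id] = body
    cache.delete("post:%s" % post_id)
```

The trade-off is one extra database read after each write, in exchange for never serving a value the writer forgot to refresh in the cache.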
AWS Lambda with API Gateway and Amazon ElastiCache
[Diagram: Amazon Cognito, Amazon API Gateway, AWS Lambda, Amazon ElastiCache, with numbered steps 1 to 4]
1. Users authenticate via social identity providers or using Cognito
2. Amazon API Gateway receives incoming request with query string parameters
3. Lambda function gets invoked, does a lookup on the Redis cache
4. Lambda returns data based on the supplied criteria
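Steps 2 to 4 can be sketched as a handler that reads the query string and looks the key up in a cache. The sketch assumes the Lambda proxy integration event shape (a `queryStringParameters` field that may be missing or null); the `key` parameter name and the sample data are hypothetical, and a plain dict stands in for the Redis client.

```python
import json

# Stand-in for a Redis client created at module scope (assumption).
CACHE = {"leaderboard:global": ["alice", "bob"]}


def _response(status, payload):
    """Build a response in the shape expected by the proxy integration."""
    return {"statusCode": status, "body": json.dumps(payload)}


def handler(event, context, cache=CACHE):
    """Look up the requested key in the cache and return it as JSON."""
    params = event.get("queryStringParameters") or {}
    key = params.get("key")
    if not key:
        return _response(400, {"error": "missing 'key' query parameter"})
    value = cache.get(key)
    if value is None:
        return _response(404, {"error": "not found"})
    return _response(200, {"key": key, "value": value})
```

With a real Redis client the value comes back as bytes and would need decoding before it is placed in the JSON body.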
Closing out – additional best practices
• Local Caching
  • Instantiate AWS clients and database connections outside the event handler for connection re-use
  • Initialization code is executed once per function container, before the handler is called for the first time
  • Connection re-use on frequent invocations will reduce latency
  • Files stored in /tmp space (512 MB) will exist on container re-use
  • Schedule a function to keep it warm
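Local caching in /tmp can be sketched as "read the file if it is already there, otherwise fetch and store it". The file name and the `fetch` callable are hypothetical; the path is built with `tempfile.gettempdir()`, which normally resolves to /tmp on Linux, so the example also runs outside Lambda.

```python
import os
import tempfile

# Hypothetical file name; the directory is normally /tmp on Linux.
CACHE_PATH = os.path.join(tempfile.gettempdir(), "reference_data.txt")


def load_reference_data(fetch, path=CACHE_PATH):
    """Return data cached on local disk, calling fetch() only on a miss."""
    if os.path.exists(path):
        with open(path, "r") as handle:
            return handle.read()
    data = fetch()
    with open(path, "w") as handle:
        handle.write(data)
    return data
```

The file is only a best-effort cache: a new container starts with an empty /tmp, so the function must always be able to fall back to `fetch`.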
Closing out – additional best practices
• Retries and Event Ordering
  • Lambda function called synchronously
    • Using the AWS SDK? Set retry logic there
    • Direct RESTful call to Lambda? Client controls retries entirely
    • Ordering is up to the caller
  • Amazon S3 or SNS triggers Lambda function, or asynchronous calls
    • 3 tries, total, then event is discarded
    • Loosely ordered
    • Let the function fail; Lambda drops the event and puts it on an SQS queue or SNS topic for retries: Dead Letter Queue (new feature)
  • Lambda polls Amazon Kinesis or Amazon DynamoDB update stream
    • Attempts to process batch of records until data expires from source stream, ordering preserved
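For the synchronous case, where the client controls retries entirely, one common approach is exponential backoff. The sketch below is a generic helper rather than the AWS SDK's own retry mechanism; the attempt count and delays are illustrative assumptions, and `sleep` is injectable so the behaviour can be checked without waiting.

```python
import time


def call_with_retries(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying on any exception with exponential backoff.

    The final failure is re-raised so the caller can decide what to do.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise
            # Delay doubles after each failed attempt: base, 2*base, 4*base...
            sleep(base_delay * 2 ** (attempt - 1))
```

Retried requests may reach the function more than once, so the handler should be safe to run repeatedly for the same input.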