Date post: 23-Jan-2018
Category: Technology
Upload: yan-cui
Serverless in production
an experience report
what you should know before you go to production
Yan Cui
Server Architect
Principal Engineer
Lead Developer
Senior Developer
http://theburningmonk.com
@theburningmonk
hidden complexities and dependencies
low utilisation to leave room for traffic spikes
EC2 scaling is slow, so scale earlier
lots of cost for unused resources
up to 30 mins for deployment
deployment required downtime
WE WANT TO...
minimise cost for unused resources
minimise ops effort
reduce tech mess
deliver visible improvements faster
170 Lambda functions in prod
1.2 GB deployment packages in prod
95% cost saving vs EC2
15x no. of prod releases per month
Legacy Monolith, Amazon Kinesis, AWS Lambda, Google BigQuery
1 developer, 2 days from design to production
(his 1st serverless project)
“nothing ever got done this fast at Skype!”
- Chris Twamley
https://github.com/awslabs/serverless-application-model
“…We find that tests that mock external libraries often need to be complex to get the code into the right state for the functionality we need to exercise.
The mess in such tests is telling us that the design isn’t right but, instead of fixing the problem by improving the code, we have to carry the extra complexity in both code and test…”
Don’t Mock Types You Can’t Change
“…The second risk is that we have to be sure that the behaviour we stub or mock matches what the external library will actually do…
Even if we get it right once, we have to make sure that the tests remain valid when we upgrade the libraries…”
Don’t Mock Types You Can’t Change
Paul Johnston
The serverless approach to testing is different and may
actually be easier.
http://bit.ly/2t5viwK
is our request correct?
is the request mapping set up correctly?
are the API resources configured correctly?
are we assuming the correct schema?
API Gateway, Lambda, DynamoDB
is Lambda proxy configured correctly?
is IAM policy set up correctly?
is the table created?
what unit tests will not tell you…
most Lambda functions are simple and have a single purpose, so the risk of
shipping broken software has largely shifted to how they integrate with
external services
observation
…if a service can’t provide you with a relatively easy
way to test the interface in reality, then you should
consider using another one.
Paul Johnston
“…Wherever possible, an acceptance test should exercise the system end-to-end without directly calling its internal code.
An end-to-end test interacts with the system only from the outside: through its interface…”
Testing End-to-End
Legacy Monolith, Amazon Kinesis, AWS Lambda
Amazon CloudSearch, Amazon API Gateway, AWS Lambda
Test Input
Validate
integration tests differ from acceptance tests only in HOW the
Lambda functions are invoked
observation
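The observation above can be sketched as a single test case with a pluggable invoker. This is a minimal illustration, not the deck's actual test code: the `handler`, the `/hello` path, and the `TEST_MODE` and `TEST_ROOT` environment variables are all hypothetical names. A real handler would also talk to real downstream services in both modes.

```javascript
// Hypothetical Lambda proxy-style handler under test.
const handler = async (event) => {
  const name = (event.queryStringParameters || {}).name || 'world';
  return { statusCode: 200, body: JSON.stringify({ message: `hello ${name}` }) };
};

// Integration test mode: invoke the handler code locally.
const viaHandler = async (query) => {
  const res = await handler({ queryStringParameters: query });
  return { statusCode: res.statusCode, body: JSON.parse(res.body) };
};

// Acceptance test mode: invoke the deployed endpoint over HTTP.
// TEST_ROOT is an assumed variable holding the deployed stage's base URL.
const viaHttp = async (query) => {
  const url = new URL(`${process.env.TEST_ROOT}/hello`);
  Object.entries(query).forEach(([k, v]) => url.searchParams.set(k, v));
  const res = await fetch(url); // global fetch needs Node.js 18+
  return { statusCode: res.status, body: await res.json() };
};

// TEST_MODE is an assumed switch; the test case itself does not change.
const invoke = process.env.TEST_MODE === 'http' ? viaHttp : viaHandler;

const testHello = async () => {
  const res = await invoke({ name: 'yan' });
  if (res.statusCode !== 200 || res.body.message !== 'hello yan') {
    throw new Error(`unexpected response: ${JSON.stringify(res)}`);
  }
  return res;
};
```

Running the same `testHello` with `TEST_MODE=http` after a deployment would turn it into an acceptance test without touching the assertions.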
“…We prefer to have the end-to-end tests exercise both the system and the process by which it’s built and deployed…
This sounds like a lot of effort (it is), but has to be done anyway repeatedly during the software’s lifetime…”
Testing End-to-End
if [ "$1" = "deploy" ] && [ $# -eq 4 ]; then
  STAGE=$2
  REGION=$3
  PROFILE=$4

  npm install
  AWS_PROFILE=$PROFILE 'node_modules/.bin/sls' deploy -s $STAGE -r $REGION
elif [ "$1" = "int-test" ] && [ $# -eq 4 ]; then
  STAGE=$2
  REGION=$3
  PROFILE=$4

  npm install
  AWS_PROFILE=$PROFILE npm run int-$STAGE
elif [ "$1" = "acceptance-test" ] && [ $# -eq 4 ]; then
  STAGE=$2
  REGION=$3
  PROFILE=$4

  npm install
  AWS_PROFILE=$PROFILE npm run acceptance-$STAGE
else
  usage
  exit 1
fi
2016-07-12T12:24:37.571Z 994f18f9-482b-11e6-8668-53e4eab441ae GOT is off air, what do I do now?
fields: UTC Timestamp, API Gateway Request Id, your log message
• invocation count
• error count
• latency
• throttling
• granular to the minute
• support custom metrics

• same metrics as CW
• better dashboard
• support custom metrics
https://www.datadoghq.com/blog/monitoring-lambda-functions-datadog/
logs:
console.log("hydrating yubls from db…");
console.log("fetching user info from user-api");
metrics:
console.log("MONITORING|1489795335|27.4|latency|user-api-latency");
console.log("MONITORING|1489795335|8|count|yubls-served");
fields: MONITORING | timestamp | metric value | metric type | metric name
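As a rough sketch of how such metric lines could be written and later picked out of the log stream, the helpers below follow the field order shown above. `formatMetric` and `parseMetric` are hypothetical names, and the assumption that the timestamp is in epoch seconds is inferred from the sample values.

```javascript
// Build a metric line: MONITORING|timestamp|value|type|name
// The timestamp defaults to the current time in epoch seconds (an assumption).
const formatMetric = (name, type, value, timestamp = Math.floor(Date.now() / 1000)) =>
  `MONITORING|${timestamp}|${value}|${type}|${name}`;

// Parse one log line; returns null for ordinary (non-metric) log messages.
const parseMetric = (line) => {
  const parts = line.split('|');
  if (parts.length !== 5 || parts[0] !== 'MONITORING') return null;
  const [, timestamp, value, type, name] = parts;
  return { timestamp: Number(timestamp), value: Number(value), type, name };
};

// Emitting a metric is just a log statement, so it adds no call to the
// monitoring service inside the function itself.
console.log(formatMetric('user-api-latency', 'latency', 27.4, 1489795335));
```

A separate log-processing function could call `parseMetric` on each line and forward the non-null results to the monitoring system.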
“you really don't want your monitoring
system to fail at the same time as the
system it monitors” - me
install the Serverless framework as a dev dependency at project level
dev dependencies are excluded from the deployment package since Serverless framework 1.16.0
complexity ceiling of a Node.js app
referential transparency
immutability as default
type inference
option types
union types
…
for managing complexity
“if you can limit the complexity of your solution, maybe you
won’t need the tools for managing that complexity.” - me
“AWS Lambda polls your stream and invokes your Lambda function. Therefore, if
a Lambda function fails, AWS Lambda attempts to process the erring batch of
records until the time the data expires…”
http://docs.aws.amazon.com/lambda/latest/dg/retries-on-errors.html
SNS
Kinesis
SQS
after 3 attempts
share processing logic
events are processed in chronological order
failed events are retried out of sequence
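One way to read the points above is as a handler that retries each record a few times and then parks it elsewhere, so a poison record does not block the shard. The sketch below assumes that reading. `processRecord` and `sendToDeadLetter` are hypothetical caller-supplied functions (the latter might publish to SNS or SQS), not part of any AWS SDK.

```javascript
const MAX_ATTEMPTS = 3; // matches "after 3 attempts" above

// Process records in order; give up on a record after MAX_ATTEMPTS and hand it
// to a dead-letter sink so the rest of the batch can continue.
const processBatch = async (records, processRecord, sendToDeadLetter) => {
  const failed = [];
  for (const record of records) { // sequential, so chronological order is kept
    let lastError;
    let done = false;
    for (let attempt = 1; attempt <= MAX_ATTEMPTS && !done; attempt++) {
      try {
        await processRecord(record);
        done = true;
      } catch (err) {
        lastError = err;
      }
    }
    if (!done) {
      // Whatever replays this record later does so out of sequence.
      await sendToDeadLetter(record, lastError);
      failed.push(record);
    }
  }
  return failed;
};
```

Because the function returns normally even when some records fail, the batch is not retried as a whole. The trade-off is the one stated above: retried events lose their place in the sequence.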
“Each shard can support up to 5 transactions per second for reads, up to a maximum total data
read rate of 2 MB per second.”
http://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html
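A back-of-the-envelope reading of the quoted limit: if every consumer reads each shard a fixed number of times per second, the read limit caps how many consumers can share a stream before reads are throttled. The once-per-second default below is an assumption for illustration, and `maxConsumersPerShard` is a hypothetical helper.

```javascript
const READS_PER_SECOND_PER_SHARD = 5; // from the quoted Kinesis limit

// pollsPerSecond: how often each consumer reads a shard (assumed, not measured).
const maxConsumersPerShard = (pollsPerSecond = 1) =>
  Math.floor(READS_PER_SECOND_PER_SHARD / pollsPerSecond);

console.log(maxConsumersPerShard());  // 5
console.log(maxConsumersPerShard(2)); // 2
```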
“If your stream has 100 active shards, there will be 100 Lambda functions running concurrently. Then, each
Lambda function processes events on a shard in the order that they arrive.”
http://docs.aws.amazon.com/lambda/latest/dg/concurrent-executions.html
“for subsystems that don’t have to be realtime, or are task-based
(i.e. order doesn’t matter), consider other triggers such as S3 or SNS.” - me
API Gateway and Kinesis
Authentication & authorisation (IAM, Cognito)
Testing
Running & Debugging functions locally
Log aggregation
Monitoring & Alerting
X-Ray
Correlation IDs
CI/CD
Performance and Cost optimisation
Error Handling
Configuration management
VPC
Security
Leading practices (API Gateway, Kinesis, Lambda)
Step Functions
Serverless design patterns
http://bit.ly/2AA5zzk
get 50% off with: vlcui