Date posted: 21-Jan-2018
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Improving Microservice and
Serverless Observability
CLAY SMITH, NEW RELIC
@SMITHCLAY
What’s Observability?
A measure of how well we can
understand a system from the
work it does.
“I know how long all the methods in
this service take to execute.”
What’s Instrumentation?
“This method took 25ms to execute”
Instrumentation: Measuring events in software using code.
(a type of white-box monitoring)
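A minimal sketch of what “measuring events in software using code” can look like. The `timed` decorator and the `handle_request` function are hypothetical names made up for illustration; they are not taken from New Relic or from the talk.

```python
import functools
import time


def timed(func):
    """Report how long each call to `func` takes, in milliseconds."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised.
            elapsed_ms = (time.perf_counter() - start) * 1000
            wrapper.timings_ms.append(elapsed_ms)
            print(f"{func.__name__} took {elapsed_ms:.1f}ms to execute")

    wrapper.timings_ms = []  # one entry per call, kept for later inspection
    return wrapper


@timed
def handle_request(n):
    # Hypothetical unit of work standing in for a real service method.
    return sum(i * i for i in range(n))


result = handle_request(10_000)
```

Because the measurement lives inside the program, this is white-box monitoring: the code reports on itself rather than being probed from outside.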
Agenda
1. System architectures of past, present and future
2. Collecting the right data to understand modern architectures
3. Observability requirements for modern architectures
4. Case study: AWS Lambda Observability
5. Q&A with New Relic Customer
How Did You Monitor Apps in 1967?
Attribution: Bundesarchiv, B 145 Bild-F038812-0014 / Schaack, Lothar / CC-BY-SA 3.0
1. People in lab coats looking
at blinking lights.
2. ‘Autotest’ (IBM System/360)
• Status print-outs at
different points during
program execution
• Main storage print-out
in the event of failure (!)
• ‘Automatic patch card
inclusion’ (?)
Source: IBM System/360 Programmer’s Basic Operating
System Programmer’s Guide (September 1967)
Good News: We Don’t Have to Wear
Lab Coats Anymore
Attribution: Flickr / Heisenberg Media/8408215473 / CC-BY-SA 3.0
1. People in jeans and hoodies
looking at screens
2. Various types of machine data
from different sources
• Infrastructure
• Backend Apps and Services
• … Mobile, Browser,
IoT, Edge, etc.
Software Architecture Continues to Change
It’s Globally Distributed in Multiple Regions
And Compute Is Getting Physically Closer with
Edge Computing and IoT
The Architecture Is Also Extremely Dynamic
Docker container lifespan in minutes (1-100), New Relic April 2017
More New Relic Customers
Run Complex, Distributed Systems
New Relic Service Map of Reference Telco Architecture
Good Data Can Help with the Technical Shift
to New Systems
• Improved debugging and troubleshooting
• Designs validated with data
• Reduced defects, more issues caught
proactively
• Improved feature velocity
Technical
Good Data Can Help with the Cultural Shift to
New Systems
• Builds transparency across teams
• Shared understanding of complex components
• Decisions not (entirely) driven or explained by
‘gut-feelings’ or guessing
• Freedom to experiment
• Blameless culture
• ‘Context not control’
Cultural
Instrumentation
Increases
Observability
But...
How Do We Make
Microservices and Serverless
Functions Observable?
#1: Observable Systems Should Emit Events:
Metrics, Logs, and Traces
Logs: “The database won’t start after the update.”
Metrics: “Our application is 35% slower than last week after this configuration change.”
Traces: “What are the dependencies for this service?”
New Relic Provides
*via Partner Integrations
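To make the three event types concrete, here is a small sketch in which one hypothetical `emit` helper writes a log, a metric, and a trace span as JSON lines. The field names, the metric value, and the `checkout -> inventory` span are illustrative assumptions, not a New Relic or AWS schema.

```python
import json
import time
import uuid


def emit(kind, **fields):
    """Build one event, print it as a JSON line, and return it."""
    event = {"kind": kind, "timestamp": time.time(), **fields}
    print(json.dumps(event))
    return event


# A trace groups spans from several services under one shared identifier.
trace_id = uuid.uuid4().hex

# Log: a discrete record of something that happened.
log_event = emit("log", level="error",
                 message="The database won't start after the update.")

# Metric: a number that can be compared over time (value is made up).
metric_event = emit("metric", name="response_time_ms", value=135.0)

# Trace span: one hop between services, which is what reveals dependencies.
span_event = emit("trace", trace_id=trace_id,
                  span="checkout -> inventory", duration_ms=42.0)
```

Each event answers a different kind of question: the log says what happened, the metric says how much, and the span says which service called which.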
#2: All Components (Not Just Critical Services!)
Should Be Instrumented
Browser
Mobile
Server (Virtual)
Hardware and Managed Services
Host Operating System and Containers
Application
Amazon EC2 Instance
#3: Instrumentation Should Not Be Opt-in,
Manual, or ‘Hard to Do’
[Diagram labels: Customers, Synthetic customers, Browser, Mobile, Apps, API, Micro Services, Public Cloud, NoSQL Data Store, On-Premises Web Server, On-Premises Relational Data]
AWS Lambda Case Study
Which Monitoring Batteries Are Included?
Amazon CloudWatch Metrics
Amazon CloudWatch Logs
AWS Lambda: Key Metrics
1. Invocations
2. Errors
3. Dead Letter Errors
4. Duration
5. Throttles
6. Iterator Age (stream-based invocations only)
http://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-metrics.html
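As a rough sketch, the metrics listed above can be requested from CloudWatch under the `AWS/Lambda` namespace, keyed by a `FunctionName` dimension. The helper below only builds the request parameters; `metric_query` and the function name `my-function` are hypothetical, and the choice of statistics is an assumption.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch metric names corresponding to the six metrics above.
LAMBDA_METRICS = [
    "Invocations", "Errors", "DeadLetterErrors",
    "Duration", "Throttles", "IteratorAge",
]


def metric_query(function_name, metric_name, hours=24, period_seconds=300):
    """Build request parameters for one Lambda metric over a time window."""
    if metric_name not in LAMBDA_METRICS:
        raise ValueError(f"unknown Lambda metric: {metric_name}")
    end = datetime.now(timezone.utc)
    # Assumption: time-valued metrics are averaged, counters are summed.
    timed_metrics = ("Duration", "IteratorAge")
    statistics = ["Average", "Maximum"] if metric_name in timed_metrics else ["Sum"]
    return {
        "Namespace": "AWS/Lambda",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": statistics,
    }


params = metric_query("my-function", "Duration")
# With boto3 installed and credentials configured, these could be passed as:
#   boto3.client("cloudwatch").get_metric_statistics(**params)
```

Building the parameters separately from the network call keeps the sketch runnable without AWS credentials.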
What Else Provides AWS Lambda Observability?
AWS X-Ray
Request tracing for many AWS-managed services.
AWS X-Ray Trace: Example
A “cold start” trace in AWS X-Ray. Annotations in red.
Warm Start in an X-Ray Trace
Note the function executes almost immediately after the service
receives the request.
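One common way to tell cold starts from warm starts inside the function itself is a module-level flag: module code runs once per container, so only the first invocation sees the flag still set. This is a sketch of that general pattern, not the instrumentation used in the talk; the `handler` response shape is an assumption.

```python
import time

# Module scope runs once, when the container loads the code (the cold start).
_state = {"cold": True, "loaded_at": time.time()}


def handler(event, context):
    """Lambda-style handler that reports whether this invocation was cold."""
    was_cold = _state["cold"]
    _state["cold"] = False  # every later call in this container is warm
    return {
        "cold_start": was_cold,
        "container_age_s": round(time.time() - _state["loaded_at"], 3),
    }


# Simulate two invocations landing on the same container.
first = handler({}, None)
second = handler({}, None)
```

Recording the flag as an attribute on each invocation event makes it possible to split duration data into cold and warm populations later.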
Traces In Aggregate Show Interesting Trends
Serverless Architecture for Aggregating Traces
What Does the Data Show in Insights?
A-ha Moment: It Was Under-provisioned
with Memory!
Memory: 768 MB
Memory: 1152 MB
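The talk reaches this finding through aggregated traces. As a complementary sketch, memory pressure can also be spotted from the `REPORT` line that Lambda writes to CloudWatch Logs after each invocation. The line layout below is assumed from typical Lambda log output, and the sample line and its numbers are invented for illustration.

```python
import re

# Assumed layout of a Lambda REPORT log line; fields are tab-separated.
REPORT_PATTERN = re.compile(
    r"Duration:\s*(?P<duration_ms>[\d.]+)\s*ms.*?"
    r"Memory Size:\s*(?P<memory_size_mb>\d+)\s*MB.*?"
    r"Max Memory Used:\s*(?P<max_memory_used_mb>\d+)\s*MB",
    re.DOTALL,
)


def parse_report(line):
    """Extract duration and memory figures from one REPORT line, or None."""
    match = REPORT_PATTERN.search(line)
    if match is None:
        return None
    return {
        "duration_ms": float(match["duration_ms"]),
        "memory_size_mb": int(match["memory_size_mb"]),
        "max_memory_used_mb": int(match["max_memory_used_mb"]),
    }


# Hypothetical sample line; the values are made up, not from the talk.
sample = (
    "REPORT RequestId: 00000000-0000-0000-0000-000000000000\t"
    "Duration: 812.40 ms\tBilled Duration: 900 ms\t"
    "Memory Size: 768 MB\tMax Memory Used: 741 MB"
)

report = parse_report(sample)
utilization = report["max_memory_used_mb"] / report["memory_size_mb"]
print(f"memory utilization: {utilization:.0%}")
```

A function that regularly runs close to its configured memory is a candidate for a larger setting.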
Lessons Learned
• Instrument for observability: “What are the internal
Lambda service latencies for my function?”
• Find the right balance of metrics, logs, and
traces for a given system: “Over 24 hours, what’s
the distribution of function duration for my function?”
• Use analytics to diagnose: “Are cold starts
significant? What other factors are at play?”
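The last two questions can be sketched as a small analysis over invocation records. The records below are hypothetical values invented for illustration, not data from the talk, and `summarize` is a made-up helper.

```python
import statistics

# Hypothetical invocation records: (duration_ms, was_cold_start).
invocations = [
    (120.0, False), (135.0, False), (110.0, False), (980.0, True),
    (125.0, False), (1020.0, True), (130.0, False), (118.0, False),
]


def summarize(durations):
    """Describe a set of durations with count, median, and maximum."""
    ordered = sorted(durations)
    return {
        "count": len(ordered),
        "median_ms": statistics.median(ordered),
        "max_ms": ordered[-1],
    }


warm_summary = summarize([d for d, is_cold in invocations if not is_cold])
cold_summary = summarize([d for d, is_cold in invocations if is_cold])

# Share of invocations that paid the cold start cost.
cold_share = cold_summary["count"] / len(invocations)

print("warm:", warm_summary)
print("cold:", cold_summary)
print(f"cold start share: {cold_share:.0%}")
```

Splitting the distribution this way shows whether cold starts explain the slow tail or whether another factor, such as memory, deserves a look.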
Q&A with Marcus Irven, Scripps Network
Serverless Architectures in Production
THANK YOU!
CLAY SMITH, NEW RELIC
TWITTER: @SMITHCLAY