Posted on 20-May-2020
1
Massimo Brignoli, Principal Solutions Architect, Elastic
Three Pillars of Observability: Kubernetes with Elastic Stack
2
• Custom on-prem & cloud deployments
• Public cloud fully-managed deployments
– Google Kubernetes Engine (GKE)
– Amazon Elastic Container Service for Kubernetes (EKS)
– Azure Kubernetes Service (AKS)
• Pivotal Container Service (PKS)
• Red Hat OpenShift
Kubernetes is Taking Over the Enterprise
3
Kubernetes is Complicated
Container Runtime
4
Kubernetes Visibility Challenges
5
Observable Kubernetes
Elastic Stack: Three Pillars of Observability in One Platform
● Logging
● Metrics
● APM Tracing
6
It Comes Down to The Three Pillars of Observability
Twitter: https://blog.twitter.com/engineering/en_us/a/2013/observability-at-twitter.html
Peter Bourgon: https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html
7
Elastic at the Center Stage
8
Elastic Stack for logs
64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
64.242.88.10 - - [07/Jan/2019:16:11:58 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352
64.242.88.10 - - [07/Jan/2019:16:20:55 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253
For each event, print out what happened.
Metrics vs Logs
Logs are chronological records of events
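The access-log lines above are plain text, so they need parsing before they can be searched by field. A minimal sketch of that step follows, using one of the sample lines. The regular expression and the field names (`client_ip`, `status`, and so on) are illustrative assumptions, not the pattern or schema that any Elastic module uses.

```python
import re
from datetime import datetime

# Illustrative pattern for the access-log lines shown above.
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def parse_access_log(line: str) -> dict:
    """Turn one access-log line into a dictionary of typed fields."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    event = match.groupdict()
    # Convert the text fields that have a natural type.
    event["timestamp"] = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    event["status"] = int(event["status"])
    event["bytes"] = int(event["bytes"])
    return event


line = ('64.242.88.10 - - [07/Jan/2019:16:11:58 -0800] '
        '"POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352')
event = parse_access_log(line)
print(event["method"], event["path"], event["status"])
```

Once each line is a structured event, questions such as "how many 404 responses per minute" become simple filters and counts.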
•Turnkey experience for specific data types
•Data to dashboard in just one step
•Automated parsing and enrichment
•Default dashboards, alerts, ML jobs
Making logging more turnkey with modules
Logging Metrics Security
Logging Modules
11
System
•Linux / MacOS
•Windows Events
Containers
•Docker
•Kubernetes
Databases
•MySQL
•PostgreSQL
Queues
•Kafka
•Redis
Web servers
•Apache
•Nginx
Audit data
•Filesystem
•System calls
Infrastructure Applications
WINLOGBEAT FILEBEAT AUDITBEAT
Log File Import
12
Automatic Structure Discovery
Ad-hoc log search and visualization
Kibana Discover, Visualize, Dashboard
14
Elastic Stack for metrics
Elasticsearch beginnings
15
Primarily used for application search
2010 Search engine: the inverted index is the primary data structure, and it is great for search
2012 Columnar storage: structured data storage, resulting in compact storage and faster analytics
Elasticsearch evolves to support analytics
https://www.elastic.co/blog/elasticsearch-as-a-column-store
Columnar Store, Built on Lucene "doc values"
2010 Search engine: the inverted index is the primary data structure, and it is great for search
2014 Aggregation Framework: analytics features to slice and dice data along various dimensions
Aggregation Framework
17
Out-of-this-world aggregations
https://www.elastic.co/blog/out-of-this-world-aggregations
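To make "slice and dice data along various dimensions" concrete, here is a small in-memory sketch of the idea: group documents into buckets by one field, then compute a metric per bucket. It is plain Python, not the Elasticsearch aggregation API. The sample documents, the field names, and the `terms_avg` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample documents; field names are illustrative only.
docs = [
    {"host": "web-1", "status": 200, "bytes": 6291},
    {"host": "web-1", "status": 404, "bytes": 7352},
    {"host": "web-2", "status": 200, "bytes": 5253},
]


def terms_avg(documents, bucket_field, metric_field):
    """Group documents by bucket_field and average metric_field per bucket."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for doc in documents:
        key = doc[bucket_field]
        sums[key] += doc[metric_field]
        counts[key] += 1
    return {
        key: {"doc_count": counts[key], "avg": sums[key] / counts[key]}
        for key in sums
    }


# The same data, sliced along two different dimensions.
by_host = terms_avg(docs, "host", "bytes")
by_status = terms_avg(docs, "status", "bytes")
print(by_host)
print(by_status)
```

The point of the sketch is that the dimension is chosen at query time: the same documents answer both the per-host and the per-status question.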
2010 Search engine: the inverted index is the primary data structure, and it is great for search
2012 Columnar storage: structured data storage, resulting in compact storage and faster analytics
2014 Aggregation Framework: analytics features to slice and dice data along various dimensions
2016 BKD trees and sparse fields: data structures optimized for numbers. Faster analytics, lower storage footprint
Elasticsearch storage efficiencies
18
BKD Trees & Sparse Fields
https://www.elastic.co/blog/searching-numb3rs-in-5.0
1-Dimension
2-Dimensions
Sparse Data
2010 Search engine: the inverted index is the primary data structure, and it is great for search
2012 Columnar storage: structured data storage, resulting in compact storage and faster analytics
2018 Rollups: roll up or aggregate older data into bigger time buckets and save on disk space
Rollup support for long-term retention
Added in Elasticsearch 6.3
https://www.elastic.co/blog/data-rollups-in-elasticsearch-you-know-for-saving-space
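The rollup idea, aggregating older data into bigger time buckets, can be sketched in a few lines. This is a conceptual illustration in plain Python, not the Elasticsearch rollup API. The `rollup` helper, the 60-second bucket size, and the sample values are assumptions made for the example.

```python
from collections import defaultdict


def rollup(samples, bucket_seconds):
    """Aggregate (epoch_seconds, value) samples into fixed-size time buckets.

    Each bucket keeps only summary statistics, so the raw samples
    can be discarded afterwards to save space.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        bucket_start = ts - (ts % bucket_seconds)
        buckets[bucket_start].append(value)
    return {
        start: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical CPU readings taken every 10 seconds.
samples = [(0, 10.0), (10, 20.0), (20, 30.0), (60, 40.0), (70, 60.0)]
rolled = rollup(samples, 60)
print(rolled)
```

Five raw samples collapse into two summary records. The trade-off is that queries on rolled-up data can no longer see individual readings, only the statistics kept per bucket.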
2010 Search engine: the inverted index is the primary data structure, and it is great for search
2012 Columnar storage: structured data storage, resulting in compact storage and faster analytics
2014 Aggregation Framework: analytics features to slice and dice data along various dimensions
2016 BKD trees and sparse fields: data structures optimized for numbers. Faster analytics, lower storage footprint
Elasticsearch for search and numerical analytics
20
• Inverted Index for full-text search
• Columnar store for structured data
• BKD Trees for numerical operations
• Rollups save space
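The difference between the first two structures can be shown with a toy example: an inverted index maps each term to the documents containing it, while a columnar store keeps one array of values per field. The sketch below is a simplified illustration in plain Python, not how Lucene stores data. The documents and field names are made up.

```python
from collections import defaultdict

# Hypothetical documents keyed by document id.
docs = {
    0: {"message": "GET request ok", "bytes": 6291},
    1: {"message": "POST request failed", "bytes": 7352},
    2: {"message": "GET request ok", "bytes": 5253},
}

# Inverted index: term -> set of ids of documents that contain the term.
inverted = defaultdict(set)
for doc_id, doc in docs.items():
    for term in doc["message"].lower().split():
        inverted[term].add(doc_id)

# Columnar store: one array per field, positioned by document id.
bytes_column = [docs[doc_id]["bytes"] for doc_id in sorted(docs)]

# Search uses the inverted index, analytics reads the column.
matching = sorted(inverted["get"])
total_bytes = sum(bytes_column[doc_id] for doc_id in matching)
print(matching, total_bytes)
```

The search step never scans document text, and the analytics step never touches the `message` field, which is why the two structures complement each other.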
Metrics Modules
21
Infrastructure
System
•Linux
•MacOS
•Windows
•Perfmon
Cloud
•AWS
•GCP
•Azure
•DigitalOcean
•Alibaba
Containers
•Docker
•Kubernetes
Virtualization
•vSphere
Network
•Netflow
•Packets
•TLS Envelope
Storage
•Ceph
PACKETBEAT METRICBEAT HEARTBEAT
Infrastructure
22
Metrics Modules
Infrastructure
PACKETBEAT METRICBEAT HEARTBEAT
Uptime
•Heartbeat
Custom apps
•JMX/Jolokia
•PHP-FPM
•Golang
Datastores
•MySQL
•PostgreSQL
•MongoDB
•Couchbase
•Aerospike
•Graphite
Queues
•Kafka
•Redis
•RabbitMQ
Caches
•Memcached
Web servers
•Apache
•Nginx
Other
•HAProxy
•Zookeeper
Applications
Heartbeat: Uptime Monitoring
Functionbeat: Serverless data shipper
Cloudwatch Cloudwatch Logs
Visualizing time series data
Time Series Visual Builder
28
Elastic Stack for APM
Example: Slow response or load times
Why APM?
03:43:45 Request "GET cyclops.ESProductDetailView"
03:43:57 Response "cyclops.ESProductDetailView 200 OK"
12 seconds - zZzzZZz
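The 12-second figure is simply the gap between the request and response timestamps in the two log lines. A minimal sketch of that arithmetic follows, assuming both timestamps fall on the same day. The `duration_seconds` helper is a hypothetical name.

```python
from datetime import datetime


def duration_seconds(start: str, end: str) -> float:
    """Return the seconds elapsed between two HH:MM:SS timestamps (same day assumed)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


elapsed = duration_seconds("03:43:45", "03:43:57")
print(elapsed)  # 12.0
```

Logs alone give the total duration but not where the time went, which is the gap that APM tracing is meant to fill.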
Example: Errors & Exceptions
Why APM?
03:43:59 Request "POST /api/checkout"
03:43:59 Response "/api/checkout 500 ERROR"
Agents, API, and APM Server
How APM works
[Diagram: agents in browsers and web servers send data to apm-server (data processor), which stores it in Elasticsearch (data storage); Kibana is the UI.]
APM adds end-user experience and application-level monitoring to the stack
Elastic APM
Language Support
● Python
● Node.js
● Ruby
● RUM (Real User Monitoring)
● Java
● Go
● .NET (in dev)
•Focuses on search experience on top of APM data
•Just another index in Elastic Stack
•Active roadmap to expand programming languages
Great overview and drill-down with industry-standard visualizations
Dedicated APM UI
Distributed Tracing: Single transaction
Transaction 1 (HTTP request to response): Span, Span, Span
Distributed Tracing: Multiple Services
Trace A
Transaction 1: Span, Span
Transaction 2: Span
Transaction 3: Span, Span, Span
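The hierarchy in the tracing slides (a trace groups transactions from several services, and each transaction contains spans) can be modeled with a few small classes. This is a simplified sketch for illustration only. Real Elastic APM documents carry many more fields, and every class, field, and service name below is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    name: str
    duration_ms: float


@dataclass
class Transaction:
    service: str
    name: str
    duration_ms: float
    spans: List[Span] = field(default_factory=list)


@dataclass
class Trace:
    trace_id: str
    transactions: List[Transaction] = field(default_factory=list)

    def services(self) -> List[str]:
        """Names of all services that took part in this trace."""
        return sorted({t.service for t in self.transactions})

    def slowest_span(self) -> Span:
        """The single longest span across every transaction in the trace."""
        all_spans = [s for t in self.transactions for s in t.spans]
        return max(all_spans, key=lambda s: s.duration_ms)


# Hypothetical trace spanning two services.
trace = Trace("trace-a", [
    Transaction("frontend", "GET /product", 120.0,
                [Span("render template", 15.0), Span("call catalog", 95.0)]),
    Transaction("catalog", "GET /api/items", 90.0,
                [Span("SELECT items", 70.0)]),
])
print(trace.services(), trace.slowest_span().name)
```

Because every transaction shares the trace id, a slow request seen in one service can be followed into the services it called.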
Combine a custom workflow with the freedom of search
Ad-hoc search in a curated UI
Need another visualization? Build a dashboard, no need to wait for your vendor
APM is just another index in Elasticsearch
Elastic Common Schema
Correlation between logs, metrics, and APM
Benefits
• Correlate data from different sources
• Ability to re-use analysis content
• Ability to re-use Elastic-provided content
Status
• Version 0.1 published: github.com/elastic/ecs
• Working with internal groups to validate
• Community feedback welcome!
39
Metadata processors
Enrich events with useful metadata to correlate logs, metrics & traces
add_cloud_metadata
• cloud.availability_zone
• cloud.region
• cloud.instance_id
• cloud.machine_type
• cloud.project_id
• cloud.provider
add_docker_metadata
• docker.container.id
• docker.container.image
• docker.container.name
• docker.container.labels
add_kubernetes_metadata
• kubernetes.pod.name
• kubernetes.namespace
• kubernetes.labels
• kubernetes.annotations
• kubernetes.container.name
• kubernetes.container.image
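Shared metadata fields are what make correlation possible: if a log event and a metric event both carry the same kubernetes.pod.name, they can be pulled up together. The sketch below illustrates the idea with flat dictionaries. It does not reproduce how the Beats processors work internally, and the `add_metadata` and `correlate` helpers, the pod names, and the event contents are hypothetical.

```python
def add_metadata(event: dict, metadata: dict) -> dict:
    """Return a copy of the event enriched with the given metadata fields."""
    enriched = dict(event)
    enriched.update(metadata)
    return enriched


def correlate(events, field_name, value):
    """Select every event, of any type, that shares one metadata value."""
    return [e for e in events if e.get(field_name) == value]


# Hypothetical metadata for two pods.
nginx_pod = {"kubernetes.pod.name": "nginx-abc", "kubernetes.namespace": "default"}
redis_pod = {"kubernetes.pod.name": "redis-xyz", "kubernetes.namespace": "default"}

events = [
    add_metadata({"type": "log", "message": "GET / 500"}, nginx_pod),
    add_metadata({"type": "metric", "cpu.pct": 0.93}, nginx_pod),
    add_metadata({"type": "metric", "cpu.pct": 0.10}, redis_pod),
]

related = correlate(events, "kubernetes.pod.name", "nginx-abc")
print([e["type"] for e in related])
```

The log line and the CPU metric come from different shippers, yet one filter on the pod name returns both.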
40
Kubernetes deployment
Node 1
Metricbeat
Filebeat
Node 2
Metricbeat
Filebeat
Node n
Metricbeat
Filebeat
Filebeat DaemonSet
Metricbeat DaemonSet
41
Logging
● Cluster level logging
● Services logging (e.g. nginx, mysql)
● Custom application logging
42
Kubernetes Logging
• Need for a logging solution
– Kubernetes does not have a native solution
– kubectl logs is too hard to use on large clusters
• Cluster-level logging
– Logs have separate storage and lifecycle independent of nodes, pods and containers
– Kubernetes provides no native storage solution for log data
• Application-level logging
– Complicated
– Packaged applications (e.g. nginx)
– Custom applications
43
Two Packaged Solutions
• Fluentd DaemonSet
– Log collection, parsing and distribution
• Fluentd + Stackdriver for GCP
• Fluentd + Elasticsearch
44
Better Log Collection with Filebeat
kubectl create -f filebeat-kubernetes.yaml
45
Filebeat Auto-Discovery
filebeat.autodiscover:
  providers:
    - type: kubernetes
      templates:
        - condition:
            contains:
              kubernetes.container.image: "nginx"
          config:
            - module: nginx
              access:  # For nginx access log
                prospector:
                  type: docker
                  containers.ids:
                    - "${data.kubernetes.container.id}"
• A module contains
– Log file path
– Ingest pipeline
– Fields definitions
– Sample dashboards
46
Filebeat Modules
Simplify collection, parsing and visualization of common log formats
• Apache2 module
• Auditd module
• Icinga module
• IIS module
• Kafka module
• Logstash module
• MongoDB module
• MySQL module
• Nginx module
• Osquery module
• PostgreSQL module
• Redis module
• System module
• Traefik module
47
Metrics
● Metrics data sources
● Popular solutions
● Metricbeat
48
Kubernetes Monitoring
• What to monitor
– Cluster monitoring
– Pod monitoring
– Application monitoring
• Metrics sources
– cAdvisor & Heapster
– Kube-state-metrics
– Prometheus
– APM
• Solutions
– Heapster/InfluxDB/Grafana
– Heapster/Elasticsearch
– Prometheus/Grafana
– APM - Datadog, Dynatrace
– Metricbeat with Autodiscovery
Collect (metrics sources): Metricbeat, Heapster, Prometheus, ...
Store (data model): Elasticsearch, InfluxDB, ...
Analyze (search, dashboard, alerts, ...): Kibana, Grafana, ...
49
Comprehensive Metrics Collection with Metricbeat
• Kubernetes module
• Monitors pods and services
– Cluster, pod & container metrics
– Application metrics through auto-discovery (e.g. Nginx)
• Metrics sources - Cover them ALL
– Kubelet (heapster, cAdvisor)
– kube-state-metrics
– Kubernetes events
– Prometheus module (beta)
• Curated infra UI
• Dedicated Kibana app
50
Out-of-the-box Dashboards
51
Curated UI for Kubernetes
Visualize the cluster and group by nodes or namespaces or pods
52
Monitor Services inside Containers with Auto-Discovery
[Diagram: on Node n, Filebeat collects logs and Metricbeat collects metrics from an Nginx container.]
metricbeat.autodiscover:
  providers:
    - type: kubernetes
      host: ${HOSTNAME}
      templates:
        - condition.contains:
            kubernetes.container.name: nginx
          config:
            - module: nginx
              period: 10s
              metricsets: ["stubstatus"]
              hosts: ["${data.host}:8080"]
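The autodiscover template above does two things: it tests a condition against each discovered container, and on a match it fills ${data.host} into the module configuration. The sketch below mimics that logic in plain Python to show the flow. It is not how Metricbeat implements autodiscover, and the `matches` and `render` helpers, the event dictionaries, and the IP addresses are assumptions.

```python
# The template from the configuration above, expressed as a dictionary.
TEMPLATE = {
    "condition": {"contains": {"kubernetes.container.name": "nginx"}},
    "config": {
        "module": "nginx",
        "period": "10s",
        "metricsets": ["stubstatus"],
        "hosts": ["${data.host}:8080"],
    },
}


def matches(event: dict, condition: dict) -> bool:
    """True when every 'contains' entry is a substring of the event's field."""
    return all(
        needle in str(event.get(field_name, ""))
        for field_name, needle in condition["contains"].items()
    )


def render(event: dict, template: dict):
    """Return the module config for a matching container, or None otherwise."""
    if not matches(event, template["condition"]):
        return None
    config = dict(template["config"])
    # Substitute the discovered container address into the hosts list.
    config["hosts"] = [h.replace("${data.host}", event["host"]) for h in config["hosts"]]
    return config


# Hypothetical discovery events for two containers.
nginx_event = {"kubernetes.container.name": "nginx", "host": "10.0.0.12"}
redis_event = {"kubernetes.container.name": "redis", "host": "10.0.0.13"}

nginx_config = render(nginx_event, TEMPLATE)
redis_config = render(redis_event, TEMPLATE)
print(nginx_config["hosts"], redis_config)
```

Only the nginx container gets a module configuration, and its address is resolved at discovery time, so no per-pod configuration has to be written by hand.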
53
Metricbeat Modules
Simplify collection and visualization of common metrics
● Aerospike module
● Apache module
● Ceph module
● Couchbase module
● Docker module
● Dropwizard module
● Elasticsearch module
● Etcd module
● Golang module
● Graphite module
● HAProxy module
● HTTP module
● Jolokia module
● Kafka module
● Kibana module
● Kubernetes module
● kvm module
● Logstash module
● Memcached module
● MongoDB module
● Munin module
● MySQL module
● Nginx module
● PHP_FPM module
● PostgreSQL module
● Prometheus module
● RabbitMQ module
● Redis module
● System module
● uwsgi module
● vSphere module
● Windows module
● ZooKeeper module
54
Tracing
● Elastic APM
55
Microservices Can Be Complicated
Microservice Architecture of Uber
https://dzone.com/articles/microservice-architecture-learn-build-and-deploy-a
56
First Major Open Source APM Solution
Agents, Server, Dashboards
57
APM Tracing - Transaction Waterfall View
58
You can do MORE ...
• Enforce access policies with X-Pack Security
• Be notified about changes & problems with X-Pack Alerting
• Be smarter with X-Pack Machine Learning
• ...
Be Creative, the Sky is NOT even the Limit with Elastic!
59
Cloud Native Computing Foundation
• https://www.cncf.io/projects/
Resource Monitoring solutions
• https://kubernetes.io/docs/tasks/debug-application-cluster/resource-usage-monitoring/
Log monitoring:
https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/
https://kubernetes.io/docs/tasks/debug-application-cluster/logging-elasticsearch-kibana/
Kubernetes Resources
60
Questions you may ask
• How long does it take you to resolve a performance issue with
your application?
• How easy is it to get, find and combine logs, metrics and APM
data in your current solution?
• How many monitoring systems do you need to maintain?
• Do you keep data in silos?
Questions?