Cloud Monitoring with Prometheus

Date post: 15-Apr-2017
Upload: qaware-gmbh
Transcript

Prometheus

Cloud Monitoring with Prometheus

Julius Volz, August 2016

What is Prometheus?

https://prometheus.io

Monitoring system and TSDB:

● instrumentation
● metrics collection and storage
● querying
● alerting
● dashboarding / graphing / trending

Made for dynamic cloud environments!

What does Prometheus NOT do?

● raw log / event collection
● request tracing
● “magic” anomaly detection
● durable long-term storage
● automatic horizontal scaling
● user / auth management

Origin

● Started in 2012 at SoundCloud by Matt and Julius
● Inspired by Google’s monitoring tools
● Motivation
  ○ needed to monitor dynamic cloud environment
  ○ unsatisfying data models, querying, and efficiency in existing approaches

Architecture

Four main improvements

1. Multi-dimensional data model (like OpenTSDB).
2. Powerful query language (the same for exploring, graphing, alerting).
3. Efficient data collection (yes, it's pull, not push).
4. Operational simplicity (unlike OpenTSDB).

Multi-dimensional data model

api_http_requests_total{method="GET", endpoint="/api/tracks", status="200"} 2034834
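
The sample above is a metric name, a set of key/value labels (the dimensions), and a value. A minimal Python sketch of that structure follows. The `parse_sample` helper is hypothetical and written only for this illustration; it is not part of any Prometheus library and ignores details such as escaped quotes inside label values.

```python
import re

# Hypothetical patterns for illustration only; they cover the simple case shown on the slide.
SAMPLE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(?P<key>[a-zA-Z_][a-zA-Z0-9_]*)="(?P<val>[^"]*)"')


def parse_sample(line):
    """Split one labeled sample line into (metric name, label dict, value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a labeled sample: {line!r}")
    labels = {m.group("key"): m.group("val")
              for m in LABEL_RE.finditer(match.group("labels"))}
    return match.group("name"), labels, float(match.group("value"))


name, labels, value = parse_sample(
    'api_http_requests_total{method="GET", endpoint="/api/tracks", status="200"} 2034834'
)
```

Each distinct combination of label values identifies its own time series, which is what makes filtering and grouping by any dimension possible later.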

Powerful query language

topk(3, sum(rate(bazooka_instance_cpu_time_seconds_total[5m])) by (app, proc))

sort_desc(sum(bazooka_instance_memory_limit_bytes - bazooka_instance_memory_usage_bytes) by (app, proc))
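
To give a feel for what `rate(...[5m])` computes in the first query, here is a simplified Python sketch of a per-second rate over a trailing window. It is an approximation under stated assumptions: the real `rate()` function also handles counter resets and extrapolates to the window edges, which this sketch does not. The `simple_rate` name and the sample data are made up for illustration.

```python
def simple_rate(samples, window_seconds, now):
    """Per-second increase of a counter over the trailing window.

    samples: list of (timestamp_seconds, counter_value), sorted by timestamp.
    Simplification: no counter-reset handling and no extrapolation.
    """
    in_window = [(t, v) for t, v in samples if now - window_seconds <= t <= now]
    if len(in_window) < 2:
        return None  # not enough data points to compute a rate
    (t_first, v_first), (t_last, v_last) = in_window[0], in_window[-1]
    return (v_last - v_first) / (t_last - t_first)


# Made-up CPU-seconds counter, scraped every 60 seconds.
cpu_seconds = [(0, 100.0), (60, 130.0), (120, 160.0),
               (180, 190.0), (240, 220.0), (300, 250.0)]
rate_5m = simple_rate(cpu_seconds, window_seconds=300, now=300)
```

With this data the counter grows by 150 CPU-seconds over 300 seconds, so `rate_5m` is 0.5 (half a CPU core on average).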

Efficient data collection

1000s of targets.
800,000 samples per second.
Millions of time series.
On a single monitoring server.

Running many servers is easy, too…
Pull, not push.
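
"Pull" means the monitoring server fetches current values from each target over HTTP, rather than targets sending data to the server. The sketch below shows the idea with Python's standard library only. The toy target, the `demo_scrapes_total` metric, and the `scrape` helper are assumptions made for this illustration and are not Prometheus code.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MetricsHandler(BaseHTTPRequestHandler):
    """Toy target that exposes one counter as plain text on /metrics."""
    scrapes = 0

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        MetricsHandler.scrapes += 1
        body = f"demo_scrapes_total {MetricsHandler.scrapes}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


# Bypass any configured HTTP proxy, since this demo only talks to localhost.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def scrape(url):
    """Pull the current metric values from a target, as the monitoring side would."""
    with opener.open(url, timeout=5) as response:
        return response.read().decode()


# Port 0 lets the OS pick a free port for this local demo.
server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
target_url = f"http://127.0.0.1:{server.server_port}/metrics"

first = scrape(target_url)
second = scrape(target_url)

server.shutdown()
server.server_close()
```

The target only has to answer HTTP requests; the decision of when and how often to collect stays with the monitoring server.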

Operational simplicity

● written in Go
● static binary
● not clustered

Expression browser

Built-in graphing

Grafana Support

Challenges in Dynamic Environments

● on-demand VMs (EC2, Azure, GCP, ...)
● dynamically scheduled service instances (Kubernetes, Docker Swarm, ...)
● microservices

⇨ many services, dynamic hosts, and ports

How to make sense of this mess?

Monitoring Dynamic Environments

● Use service discovery
  ○ ...to know what should be there
  ○ ...to pull metrics
  ○ ...to add metadata to metrics
● Focus on services, not machines
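
The three uses of service discovery listed above can be sketched in a few lines of Python. Everything here is hypothetical: the `discovered` snapshot, the addresses, the `zone` label, and both helper functions are made up for illustration. Only the `job` and `instance` label names follow the usual Prometheus convention.

```python
# Hypothetical snapshot from a service-discovery source (names and addresses are made up).
discovered = [
    {"service": "api", "host": "10.0.0.5", "port": 8080, "zone": "eu-west-1a"},
    {"service": "api", "host": "10.0.0.6", "port": 8080, "zone": "eu-west-1b"},
    {"service": "worker", "host": "10.0.0.9", "port": 9100, "zone": "eu-west-1a"},
]


def build_targets(instances):
    """Turn discovered instances into scrape targets with metadata labels."""
    targets = []
    for inst in instances:
        address = f"{inst['host']}:{inst['port']}"
        targets.append({
            "url": f"http://{address}/metrics",  # where to pull metrics from
            "labels": {                          # metadata attached to every series
                "job": inst["service"],
                "instance": address,
                "zone": inst["zone"],
            },
        })
    return targets


def missing_instances(targets, reachable_addresses):
    """Instances that discovery says should exist but that did not answer a scrape."""
    return [t["labels"]["instance"] for t in targets
            if t["labels"]["instance"] not in reachable_addresses]


targets = build_targets(discovered)
down = missing_instances(targets, reachable_addresses={"10.0.0.5:8080", "10.0.0.9:9100"})
```

Because discovery states what should be there, an instance that is expected but unreachable (here `10.0.0.6:8080`) can be detected instead of silently disappearing.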

Architecture

...with Prometheus

● configure service in Prometheus
  ○ automatic discovery and scraping
● map host, port, service etc. into dimensions
● query language enables:
  ○ service-level aggregation
  ○ instance-level drill-down
  ○ precise alerting
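
Once host, port, and service are labels, the same data supports all three uses listed above. The Python sketch below mimics them on made-up data: `sum_by` is a hypothetical stand-in for a `sum(...) by (...)` aggregation, and the 50 requests/second threshold is an arbitrary assumption for the alert.

```python
from collections import defaultdict

# Made-up per-instance request rates (requests/second), with labels as dimensions.
series = [
    ({"service": "api", "instance": "10.0.0.5:8080"}, 120.0),
    ({"service": "api", "instance": "10.0.0.6:8080"}, 80.0),
    ({"service": "worker", "instance": "10.0.0.9:9100"}, 15.0),
]


def sum_by(samples, *label_names):
    """Sum values across all series that share the given label values."""
    totals = defaultdict(float)
    for labels, value in samples:
        key = tuple(labels[name] for name in label_names)
        totals[key] += value
    return dict(totals)


# Service-level aggregation: one number per service, regardless of instance count.
per_service = sum_by(series, "service")

# Instance-level drill-down: the same data, filtered to one service.
api_instances = {labels["instance"]: value
                 for labels, value in series if labels["service"] == "api"}

# Alerting on the aggregate: flag services below an assumed 50 requests/second.
alerts = [service for (service,), total in per_service.items() if total < 50.0]
```

The alert fires on the service as a whole, so replacing or rescheduling a single instance does not change the rule.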

Prometheus <3 Kubernetes

● Borg -> Kubernetes
● Borgmon -> Prometheus
● both use labels
● Prometheus supports Kubernetes SD
● Kubernetes has Prometheus metrics

Demo?

Thanks!

Q&A

