Prometheus Overview

Date post: 11-Jan-2017
Upload: brian-brazil

Prometheus Overview: The Promethean Ideal of Monitoring

Why monitor?

● Know when things go wrong
  ○ To call in a human to prevent a business-level issue, or prevent an issue in advance
● Be able to debug and gain insight
● Trending to see changes over time, and drive technical/business decisions
● To feed into other systems/processes (e.g. QA, security, automation)

Common Monitoring Challenges

Themes common among companies I’ve talked to:

● Monitoring tools are limited, both technically and conceptually
● Tools don’t scale well and are unwieldy to manage
● Operational practices don’t align with the business

For example:

Your customers care about increased latency and it’s in your SLAs. You can only alert on individual machine CPU usage.

Result: Engineers continuously woken up for non-issues, get fatigued

Fundamental Challenge is Limited Visibility

Prometheus

Inspired by Google’s Borgmon monitoring system.

Started in 2012 by ex-Googlers working at SoundCloud as an open source project, mainly written in Go. Publicly launched in early 2015, and continues to be independent of any one company.

Over 100 companies have started relying on it since then.

What does Prometheus offer?

● Inclusive Monitoring
● Powerful data model
● Powerful query language
● Manageable and Reliable
● Efficient
● Scalable
● Easy to integrate with
● Dashboards

Services have Internals

Monitor the Internals

Monitor as a Service, not as Machines

Inclusive Monitoring

Don’t monitor just at the edges:

● Instrument client libraries
● Instrument server libraries (e.g. HTTP/RPC)
● Instrument business logic

Library authors get information about usage.

Application developers get monitoring of common components for free.

Dashboards and alerting can be provided out of the box, customised for your organisation!
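
As a rough illustration of instrumenting business logic, the sketch below wraps a function so that every call updates labelled counters. It uses only the Python standard library; the registry, the metric names (`calls_total`, `failures_total`, `seconds_total`) and the `checkout` function are hypothetical stand-ins, not part of any Prometheus client library.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-process registry: (metric name, sorted label pairs) -> value.
REGISTRY = defaultdict(float)


def inc(name, amount=1.0, **labels):
    """Add `amount` to the counter identified by its name and labels."""
    REGISTRY[(name, tuple(sorted(labels.items())))] += amount


def instrumented(operation):
    """Decorator that counts calls, failures, and time spent in a function."""
    def decorate(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            inc("calls_total", operation=operation)
            try:
                return func(*args, **kwargs)
            except Exception:
                inc("failures_total", operation=operation)
                raise
            finally:
                # Runs on both success and failure.
                inc("seconds_total", time.perf_counter() - start,
                    operation=operation)
        return wrapper
    return decorate


@instrumented("checkout")
def checkout(cart_total):
    """Stand-in for business logic: rejects empty carts, adds 20% tax."""
    if cart_total <= 0:
        raise ValueError("empty cart")
    return round(cart_total * 1.2, 2)
```

Because the decorator lives with the library code, every application that calls `checkout` gets the same three metrics without writing any monitoring code of its own.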

Powerful Data Model

All metrics have arbitrary multi-dimensional labels.

No need to force your model into dotted strings.

Can aggregate, cut, and slice along them.

Values can be any double; labels support full Unicode.
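
To make the idea of slicing along labels concrete, here is a minimal sketch in plain Python. The samples, label names and the `sum_by` helper are made up for illustration; this is not how Prometheus stores or queries data internally.

```python
from collections import defaultdict

# Hypothetical samples: each is (metric name, labels, value).
samples = [
    ("http_requests_total", {"job": "api", "dc": "eu", "status": "200"}, 900.0),
    ("http_requests_total", {"job": "api", "dc": "eu", "status": "500"}, 100.0),
    ("http_requests_total", {"job": "api", "dc": "us", "status": "200"}, 450.0),
    ("http_requests_total", {"job": "web", "dc": "us", "status": "200"}, 50.0),
]


def sum_by(samples, name, *keep):
    """Sum one metric's values, grouped by the label names listed in `keep`."""
    totals = defaultdict(float)
    for metric, labels, value in samples:
        if metric == name:
            group = tuple(labels.get(key, "") for key in keep)
            totals[group] += value
    return dict(totals)


per_dc = sum_by(samples, "http_requests_total", "dc")
per_job_status = sum_by(samples, "http_requests_total", "job", "status")
overall = sum_by(samples, "http_requests_total")
```

With a dotted name such as `api.eu.200` the grouping order is fixed when the metric is named; with labels, any combination of dimensions can be chosen at query time.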

Powerful Query Language

Can multiply, add, aggregate, join, predict, take quantiles across many metrics in the same query. Can evaluate right now, and graph back in time.

Answer questions like:

● What’s the 95th percentile latency in the European datacenter?
● How full will the disks be in 4 hours?
● Which services are the top 5 users of CPU?

Can alert based on any query.
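
The disk question above comes down to fitting a trend and extrapolating it. The query language has a `predict_linear` function for this; the Python below only illustrates the underlying idea with an ordinary least-squares line and is not its implementation. The sample data (free bytes shrinking by 5 MB per second) is invented.

```python
def predict_linear(points, seconds_ahead):
    """Fit a least-squares line through (timestamp, value) points and
    extrapolate it `seconds_ahead` past the last timestamp.

    Needs at least two points with distinct timestamps.
    """
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    cov = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = cov / var_t
    intercept = mean_v - slope * mean_t
    return slope * (points[-1][0] + seconds_ahead) + intercept


# Hypothetical history: free disk bytes sampled every 10 minutes for an hour.
history = [(t, 100e9 - 5e6 * t) for t in range(0, 3601, 600)]

# Estimated free bytes 4 hours after the last sample.
predicted = predict_linear(history, 4 * 3600)
disk_will_fill = predicted <= 0
```

For this invented history the estimate comes out at about 10e9 bytes free, so an alert on `disk_will_fill` would stay quiet.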

Manageable and Reliable

Core Prometheus server is a single binary.

Doesn’t depend on Zookeeper, Consul, Cassandra, Hadoop or the Internet.

Only requires local disk (SSD recommended). No potential for cascading failure.

Pull based, so it is easy to run on a workstation for testing, and rogue servers can’t push bad metrics.

Advanced service discovery finds what to monitor.
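
The pull model mentioned above means each target simply serves its current values over HTTP and the server fetches them. A minimal sketch of such a target, using only the Python standard library, might look like the following. The sample data, the port and the helper names are assumptions; real applications would normally use a client library rather than hand-rolling the text format.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical current values: (metric name, labels, value).
SAMPLES = [
    ("http_requests_total", {"job": "api", "dc": "eu"}, 1000.0),
    ("up", {}, 1.0),
]


def escape(value):
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render(samples):
    """Render samples as one `name{label="value",...} value` line each."""
    lines = []
    for name, labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` on a workstation and opening `/metrics` in a browser is enough to see what a scrape would collect, which is part of why pull-based setups are easy to test locally.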

Efficient

Instrumenting everything means a lot of data.

Prometheus is best in class for lossless storage efficiency, 3.5 bytes per datapoint.

A single server can handle:

● millions of metrics
● hundreds of thousands of datapoints per second
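
The figures above can be turned into a back-of-the-envelope disk estimate. The 3.5 bytes per datapoint comes from the slide; the series count, scrape interval and retention period below are assumptions chosen only to show the arithmetic.

```python
BYTES_PER_SAMPLE = 3.5   # figure quoted above
SECONDS_PER_DAY = 86_400


def samples_per_second(series, scrape_interval_seconds):
    """Ingestion rate when every series is scraped once per interval."""
    return series / scrape_interval_seconds


def storage_bytes(rate, retention_days):
    """Approximate disk needed to keep `retention_days` of samples."""
    return rate * BYTES_PER_SAMPLE * SECONDS_PER_DAY * retention_days


# Assumed workload: one million series scraped every 10 seconds.
rate = samples_per_second(1_000_000, 10)   # 100,000 samples per second
per_day = storage_bytes(rate, 1)           # about 30 GB per day
fifteen_days = storage_bytes(rate, 15)     # about 454 GB
```

Under these assumptions a single local disk comfortably holds weeks of data, which is consistent with the claim that only local storage is required.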

Scalable

Prometheus is easy to run, so you can give one to each team in each datacenter.

Federation allows pulling key metrics from other Prometheus servers.

When one job is too big for a single Prometheus server, can use sharding+federation to scale out. Needed with thousands of machines.
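
One common way to shard a large job is to hash each target's address and take the result modulo the number of servers, so every server can decide independently which targets are its own. The sketch below illustrates that idea in plain Python. In Prometheus itself this kind of split is usually configured through relabelling (the `hashmod` action); the hash choice and target names here are assumptions and do not reproduce its exact behaviour.

```python
import hashlib


def shard_for(target, num_shards):
    """Map a target address to a shard index in [0, num_shards)."""
    digest = hashlib.md5(target.encode("utf-8")).digest()
    return int.from_bytes(digest[-8:], "big") % num_shards


def assign(targets, num_shards):
    """Split targets into `num_shards` lists, one per Prometheus server."""
    shards = [[] for _ in range(num_shards)]
    for target in targets:
        shards[shard_for(target, num_shards)].append(target)
    return shards


# Hypothetical fleet of 3,000 machines split across 4 servers.
targets = [f"node{i}.example.com:9100" for i in range(3000)]
shards = assign(targets, 4)
```

Each shard server scrapes only its own list, and a higher-level server can then use federation to pull the aggregated metrics from all of them.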

Easy to integrate with

Many existing integrations: Java, JMX, Python, Go, Ruby, .Net, Machine, Cloudwatch, EC2, MySQL, PostgreSQL, Haskell, Bash, Node.js, SNMP, Consul, HAProxy, Mesos, Bind, CouchDB, Django, Mtail, Heka, Memcached, RabbitMQ, Redis, RethinkDB, Rsyslog, Meteor.js, Minecraft...

Graphite, Statsd, Collectd, Scollector, Munin, Nagios integrations aid transition.

It’s so easy, most of the above were written without the core team even knowing about them!

Dashboards

What do we do?

Robust Perception is an independent provider of Prometheus-related services.

We can help you:

● Decide if Prometheus is for you
● Manage your transition to Prometheus
● Resolve issues that arise
● Use Prometheus to run and scale your production systems efficiently

We are proud to be among the core contributors to the Prometheus project.

Resources

Official Project Website: prometheus.io

Official Mailing List: [email protected]

Demo: demo.robustperception.io

Robust Perception Website: www.robustperception.io

Queries: [email protected]
