Hawkular Metrics
Metric Storage & Alerting
Stefan Negrea
About Me
Co-Creator of Hawkular Metrics
Introduction to Hawkular Metrics
Hawkular Metrics & Alerting
Hawkular Demo
Pre-History
● 2006 JBoss Operations Network 1.0
● 2008 Project RHQ
○ JBoss Operations Network 2.0
○ Metrics stored in Postgres
Pre-History
Pre-History
● 2012 - 2013 RHQ Storage Nodes
○ Cassandra based
○ Store metrics
● 2014 RHQ Metrics
Hawkular
It’s a hawk with a monocular. Hawks are known for their very sharp vision and are very good hunters; they can catch prey at high speed by anticipating its movements.
The goal is to be able to monitor and catch anomalies in fast-paced environments.
All* projects are Apache License 2.0
History
● 2014 Hawkular organization formed
● 2014 Hawkular Alerting started
● 02/2015 RHQ Metrics joins Hawkular org
● 12/2015 Hawkular Metrics integrated in OpenShift Origin v3
● 10/2016 Hawkular Metrics includes Hawkular Alerting
Hawkular Metrics is a storage engine for metric data
metric data = a measurement taken at a specific time
storage engine = store metrics efficiently for their useful lifetime
Hawkular Metrics
● Gauge
○ number
○ varies (not monotonic)
○ rate of change
● Counter
○ integer
○ monotonic (increasing or decreasing)
○ rate of change
○ support for reset
Supported Metrics
Memory usage
(metric1, 4.5, 1493301898245)
(metric1, 5.6, 1493301898246)
(metric1, 1.2, 1493301898247)

Number of visitors
(metric2, 4, 1493301898248)
(metric2, 5, 1493301898249)
(metric2, 9, 1493301898250)
(metric2, 0, 1493301898251)
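Samples like the ones above are sent to the bulk raw-insert endpoint as a JSON array. A minimal sketch of building that body in Python (the `id`/`data`/`timestamp`/`value` field names are assumptions based on the REST API, not verified against a specific release):

```python
import json

# Gauge samples from the slide: (metric id, value, timestamp in ms)
samples = [
    ("metric1", 4.5, 1493301898245),
    ("metric1", 5.6, 1493301898246),
    ("metric1", 1.2, 1493301898247),
]

def to_payload(samples):
    """Group (id, value, timestamp) tuples into a per-metric JSON body."""
    by_id = {}
    for metric_id, value, ts in samples:
        by_id.setdefault(metric_id, []).append({"timestamp": ts, "value": value})
    return [{"id": mid, "data": points} for mid, points in by_id.items()]

payload = to_payload(samples)
print(json.dumps(payload))
```

The resulting array would be POSTed with the tenant header set; here it is only printed.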
● Availability
○ availability of a resource
○ up, down, or unknown
○ can compute interesting stats based on values
● String
○ just that
○ possible uses: logs, events, config
Supported Metrics
Server status
(metric3, UP, 1493301898253)
(metric3, DOWN, 1493301898254)
(metric3, UP, 1493301898255)

Value of configuration key ‘k’
(metric4, “k=v”, 1493301898256)
(metric4, “k=t”, 1493301898257)
(metric4, “k=1”, 1493301898258)
(metric4, “k=4”, 1493301898259)
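The “interesting stats” for availability can be illustrated locally. A sketch in plain Python (not the server-side implementation) computing downtime duration and uptime ratio from the samples above:

```python
# Availability samples from the slide: (timestamp in ms, state)
samples = [(1493301898253, "UP"), (1493301898254, "DOWN"), (1493301898255, "UP")]

# Sum the time spent in DOWN between consecutive samples
down_ms = 0
for (t, state), (t_next, _) in zip(samples, samples[1:]):
    if state == "DOWN":
        down_ms += t_next - t

total_ms = samples[-1][0] - samples[0][0]
uptime_ratio = 1 - down_ms / total_ms
print(down_ms, uptime_ratio)  # 1 ms down out of 2 ms -> ratio 0.5
```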
Management & Support
● Highly available, fault tolerant
● No specialized node roles
● Minimal configuration

Performance & Scalability
● Optimized for writes
● Data compression
● Indexing
Cassandra - Storage
● CQL based
● Partitioning & indexing of data based on usage
● Use built-in compression & TTL
● Use the Datastax driver fully async
● Support for latest C* 3.0.x release
● Keep updating to latest stable
● Use multiple tables for indexing
Cassandra - Storage
● REST API with JSON
● JAX-RS 2.0 (async spec)
● Fully async = JAX-RS 2.0 async + RX Java + async C* driver
● Stateless** server (Metrics, mostly)
● Minimal clustering via Infinispan
● Schema management
● Easy to use
○ packaged distribution with WildFly
○ download and run, only JDK required
App Layer
C* - 4 CPU, 4GB; Hawkular - 4 CPU, 4GB
message sizes:
10 datapoints: 2592 req/sec => 25920 datapoints/sec
100 datapoints: 365 req/sec => 36500 datapoints/sec
5000 datapoints: 7.6 req/sec => 38000 datapoints/sec

C* - 8 CPU, 8GB; Hawkular - 8 CPU, 4GB
message sizes:
10 datapoints: 4655 req/sec => 46550 datapoints/sec
100 datapoints: 604 req/sec => 60400 datapoints/sec
5000 datapoints: 15 req/sec => 75000 datapoints/sec
Performance - Sample
● Multi-tenant○ tenant id required on each request (HAWKULAR-TENANT header)
○ no way to get data from multiple tenants at once
● Can insert data without pre-creating metrics
● Data is compressed using Gorilla compression
○ 2 hour time window
○ further reduces disk footprint
○ LZ4 enabled in Cassandra
○ Load testing:
■ 5000 data points/sec for 5 days = 26GB
■ 83M data points ~ 1GB of disk space
Features
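The load-testing numbers above are self-consistent; a quick arithmetic check in Python:

```python
# 5000 data points/sec sustained for 5 days (load test from the slide)
points = 5000 * 60 * 60 * 24 * 5
gb_on_disk = 26

# ~2.16 billion points in 26GB -> roughly 83M points per GB of disk
points_per_gb = points / gb_on_disk
print(points, round(points_per_gb / 1e6))
```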
● Bulk insertion endpoint for metrics and data
● Tagging support for metrics and single data points
○ key, value; multi-tag support
■ tag1 = d
○ metrics queryable via TQL (tag query language)
■ AND, OR, NOT
■ grouping
■ wildcard matching
■ a1 = 'd' OR ( a1 != 'ab' AND c1 )
Features
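A TQL expression like the one above is passed to the server as a URL query parameter. A minimal sketch, where the endpoint path and the `tags` parameter name are assumptions rather than verified against a specific Hawkular Metrics version:

```python
from urllib.parse import urlencode

# Query gauge metric definitions matching the TQL expression from the slide
base = "http://localhost:8080/hawkular/metrics/gauges"
tql = "a1 = 'd' OR ( a1 != 'ab' AND c1 )"
url = base + "?" + urlencode({"tags": tql})
print(url)
```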
● Endpoint for each metric type
○ /gauges, /availability, /counters, /strings
○ each metric type has almost identical endpoints
● Raw data - /gauges/raw
● Raw data for single metric - /strings/{metric_id}/raw
● Query time aggregation
○ multiple metrics - /availability/stats
○ single metric - /counters/{metric_id}/stats
● Bulk operations - /metrics
** String metrics do not have stats (yet?)
Features - Simple REST API
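A query-time aggregation call is just a GET with a time range and a bucket count. A sketch for a single counter, using a hypothetical local endpoint; the `start`/`end`/`buckets` parameter names are assumptions based on the REST API:

```python
from urllib.parse import urlencode

# Request two buckets of stats for one counter over the sample time range
metric_id = "metric2"
params = urlencode({"start": 1493301898248, "end": 1493301898251, "buckets": 2})
url = "http://localhost:8080/hawkular/metrics/counters/%s/stats?%s" % (metric_id, params)
print(url)
```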
● Query Time Aggregation
○ Combine multiple metrics and get statistical data
○ Gauge and counter: average, median, percentile, sum
○ Availability: ratios for uptime and downtime, downtime duration
○ Time Slicing: first group data, then compute stats
○ Single or multiple metrics
● Rate
○ available for gauges and counters
○ rate of change of the values for the timespan
○ ex: how fast is the number of total requests increasing
Features - Aggregation & Rate
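The same statistics and rate can be sketched locally over the counter samples from earlier. Plain Python for illustration, not the server-side implementation; the per-minute rate scaling is an assumption:

```python
import statistics

# Counter samples: (timestamp in ms, value)
samples = [(1493301898248, 4), (1493301898249, 5), (1493301898250, 9)]
values = [v for _, v in samples]

avg = statistics.mean(values)    # average -> 6
med = statistics.median(values)  # median  -> 5
total = sum(values)              # sum     -> 18

# Rate of change over the timespan: delta value / delta time,
# scaled from per-millisecond to per-minute
(t0, v0), (tn, vn) = samples[0], samples[-1]
rate_per_min = (vn - v0) / (tn - t0) * 60000
print(avg, med, total, rate_per_min)
```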
● Natural fit: collect data and then alert on anomalies
● Two ways to alert on metric data
○ Dedicated API for setting up alerts; incoming data is filtered and processed by the alerting engine
○ Metrics Alerter that queries single or multiple metrics; no need to predefine alert triggers ahead of time
Metrics + Alerting
● Single and group triggers
● Template triggers
● Complex conditions
● Dampening
● Auto-resolve/auto-disable triggers
● Pluggable notifiers
Alerting Features
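As a rough illustration of dampening, here is a toy sketch of one policy (fire only after N consecutive condition matches). This is a hypothetical helper, not the Hawkular Alerting API, which supports richer dampening policies:

```python
def fire_events(values, threshold, consecutive):
    """Yield indexes where an alert would fire: value > threshold
    for `consecutive` evaluations in a row."""
    streak = 0
    for i, v in enumerate(values):
        streak = streak + 1 if v > threshold else 0
        if streak == consecutive:
            yield i
            streak = 0  # reset, so the trigger can fire again later

cpu = [10, 95, 96, 40, 97, 98, 99]
print(list(fire_events(cpu, threshold=90, consecutive=2)))  # fires at index 2 and 5
```

A single spike (index 1 alone, or index 4 alone) never fires; only sustained breaches do, which is the point of dampening.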
● Automatic & persisted aggregation
● Management capabilities for the Cassandra cluster
● Query language
● Performance improvements
○ already have a good baseline, but can do better
○ read/write
Roadmap - 2017
Demo
● Install ccm
○ https://github.com/pcmanus/ccm
● Start a single node C* cluster
○ ccm create -v 3.0.12 -n 1 -s hawkular
● Download, extract and start Hawkular Metrics
○ https://origin-repository.jboss.org/nexus/content/groups/public/org/hawkular/metrics/hawkular-metrics-wildfly-standalone/0.26.1.Final/
○ bin/standalone -b 0.0.0.0
● Download, extract and start Grafana
● Download, install, and configure the Hawkular plugin for Grafana
○ https://grafana.com/plugins/hawkular-datasource/installation
○ https://github.com/hawkular/hawkular-grafana-datasource
○ pick a tenant id of your choice
Demo
● Install the Hawkular Metrics python client via pip
○ pip install hawkular-client
● Install psutil to collect CPU stats
○ pip install psutil
● Create a custom agent (using the python client)
○ make sure you use the same tenant id configured with Grafana
○ pre-create and tag a metric for each CPU
○ collect CPU usage every 10 seconds
○ send the data to Hawkular Metrics
Demo
Demo
#!/usr/bin/env python3
import time

import psutil
from hawkular.metrics import HawkularMetricsClient, MetricType

# Same tenant id as configured in the Grafana datasource
client = HawkularMetricsClient(tenant_id='test')

# Pre-create and tag one gauge metric per CPU
cpu_percent = psutil.cpu_percent(interval=1, percpu=True)
for index, cpu in enumerate(cpu_percent):
    client.create_metric_definition(MetricType.Gauge, 'cpu%s' % index, cpu='cpu%s' % index)

# Collect CPU usage and push it to Hawkular Metrics every 10 seconds
while True:
    cpu_percent = psutil.cpu_percent(interval=1, percpu=True)
    for index, cpu in enumerate(cpu_percent):
        client.push(MetricType.Gauge, 'cpu%s' % index, float(cpu))
    time.sleep(10)
● Web - http://www.hawkular.org/
● Github - https://github.com/hawkular
● Metrics Documentation - http://www.hawkular.org/tags/metrics.html
● Alerting Documentation - http://www.hawkular.org/tags/alerts.html
● Twitter - https://twitter.com/hawkular_org
Resources