
From Ceilometer to Telemetry: not so alarming!

Date post: 24-Dec-2014
Upload: nick-nicolas-barcet
Description:
Presentation of the new Ceilometer (OpenStack Telemetry) features in OpenStack Havana and a look at the features coming in Icehouse. Joint presentation given with Julien Danjou at OpenStack in Action 4 (Dec 5th 2013).
From Ceilometer to Telemetry Not so alarming! A Julien Danjou & Nick Barcet presentation for OpenStack in action! 4 on the 5th December 2013
Transcript
Page 1: From Ceilometer to Telemetry: not so alarming!

From Ceilometer to Telemetry
Not so alarming!

A Julien Danjou & Nick Barcet presentation for OpenStack in action! 4 on the 5th December 2013

Page 2: From Ceilometer to Telemetry: not so alarming!

Speakers

Nick Barcet, VP Products @ eNovance
Co-founded the Ceilometer project at the Folsom summit and led the project through incubation

Julien Danjou, Ceilometer Lead Dev @ eNovance
Has been a core Ceilometer contributor from the outset, taking over the PTL reins for Havana

Page 3: From Ceilometer to Telemetry: not so alarming!

State of the project

● Officially named OpenStack Telemetry
● Havana is the first integrated release
● Community growth
  ○ Grizzly: 30 contributors, 267 commits
  ○ Havana: 57 contributors, 434 commits

Page 4: From Ceilometer to Telemetry: not so alarming!

What was done during the Havana cycle?

Page 5: From Ceilometer to Telemetry: not so alarming!

UDP transport

● Faster, stateless
● Lighter (msgpack encoding)

but…

● No delivery guarantee
● Not signed

▶ Use case: gathering metrics for alarms
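The trade-off on this slide can be sketched in a few lines. This is a minimal illustration, not Ceilometer's publisher: JSON from the standard library stands in for msgpack so the sketch has no dependencies, and the function names are hypothetical.

```python
import json
import socket


def send_sample_udp(sample, host, port):
    """Send one sample as a single UDP datagram and return immediately."""
    # Assumption: JSON replaces msgpack here to stay within the stdlib.
    payload = json.dumps(sample).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # No connection and no acknowledgement: a lost datagram is never retried.
        sock.sendto(payload, (host, port))
    return len(payload)


def receive_sample_udp(sock, bufsize=65535):
    """Read one datagram from an already bound socket and decode it."""
    data, _addr = sock.recvfrom(bufsize)
    return json.loads(data.decode("utf-8"))
```

Nothing in the datagram proves who sent it, which is the "not signed" caveat above; that is acceptable for short-lived alarm metrics but not for billing data.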

Page 6: From Ceilometer to Telemetry: not so alarming!

Improved API

● Group samples by fields when requesting statistics (?groupby[]=user_id)
● Limit the number of items returned (?limit=42)
● Provide links to other resources in the API
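As a rough sketch, the two query parameters above could be assembled like this. The parameter spelling is copied from the slide; the helper name and the example paths are assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_query(path, groupby=(), limit=None):
    """Return path plus a query string using the slide's parameter names."""
    params = [("groupby[]", field) for field in groupby]
    if limit is not None:
        params.append(("limit", str(limit)))
    if not params:
        return path
    # safe="[]" keeps the brackets readable instead of percent-encoding them.
    return path + "?" + urlencode(params, safe="[]")


# Hypothetical paths, used only to show the resulting query strings.
stats_url = build_query("/v2/meters/cpu_util/statistics", groupby=["user_id"])
limited_url = build_query("/v2/resources", limit=42)
```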

Page 7: From Ceilometer to Telemetry: not so alarming!

Send your own samples

Users or operators can send samples

➔ Leverage the statistics

➔ Usable for alarming

POST /v2/meters/mymeter

[
  {
    "counter_type": "gauge",
    "counter_unit": "megabyte",
    "counter_volume": 142.0,
    "user_id": "efd87807-12d2-4b38-9c70-5f5c2ac427ff",
    "project_id": "35b17138-b364-4e6a-a131-8f3099c5be68",
    "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36",
    "resource_metadata": {
      "name1": "value1",
      "name2": "value2"
    },
    "source": "mypaasplatform",
    "timestamp": "2013-09-10T20:34:13.711330"
  }
]
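A client could prepare that request along the following lines. This is only a sketch: the base URL, the token value and the X-Auth-Token header are assumptions about the deployment, and the request is built but deliberately not sent.

```python
import json
import urllib.request


def build_sample_request(base_url, meter, samples, token):
    """Build (but do not send) a POST request carrying a list of samples."""
    body = json.dumps(samples).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v2/meters/{meter}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the API is protected by a Keystone token.
            "X-Auth-Token": token,
        },
    )


sample = {
    "counter_type": "gauge",
    "counter_unit": "megabyte",
    "counter_volume": 142.0,
    "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36",
    "source": "mypaasplatform",
}
# Hypothetical endpoint and token, for illustration only.
request = build_sample_request("http://localhost:8777", "mymeter", [sample], "TOKEN")
# urllib.request.urlopen(request) would actually send it.
```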

Page 8: From Ceilometer to Telemetry: not so alarming!

New storage backends

Page 9: From Ceilometer to Telemetry: not so alarming!

Database TTL

Previously: no way to purge data.

Ceilometer produces a lot of data (gigabytes per day).

Now: ceilometer-expirer will drop data older than the configured time-to-live delay.
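The expiry rule itself is simple to state in code. The sketch below is not the ceilometer-expirer implementation; it assumes samples are dictionaries holding a datetime under "timestamp" and only shows the time-to-live comparison.

```python
from datetime import datetime, timedelta


def expire_samples(samples, ttl_seconds, now):
    """Return only the samples that are not older than the time-to-live."""
    cutoff = now - timedelta(seconds=ttl_seconds)
    return [s for s in samples if s["timestamp"] >= cutoff]


now = datetime(2013, 12, 5, 12, 0, 0)
samples = [
    {"counter_volume": 1.0, "timestamp": now - timedelta(days=40)},
    {"counter_volume": 2.0, "timestamp": now - timedelta(days=1)},
]
# Hypothetical 30-day time-to-live: the 40-day-old sample is dropped.
kept = expire_samples(samples, ttl_seconds=30 * 24 * 3600, now=now)
```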

Page 10: From Ceilometer to Telemetry: not so alarming!

Hyper-V

➔ Disk, network and CPU usage

Page 11: From Ceilometer to Telemetry: not so alarming!

New meters

● API endpoints
  ○ Meters the requests made to API servers (Neutron, Glance, Nova, Swift, etc.)
● Neutron bandwidth
  ○ Meters the bandwidth consumed by each project
  ○ Traffic labeled as configured by the operator (based on source/destination)

Page 12: From Ceilometer to Telemetry: not so alarming!

Neutron Traffic Labels

(Diagram) Nodes: Internet, three Swift nodes, three VMs. Traffic labels: Ext, Object, Compute.
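To make the labeling idea concrete, here is a small sketch that classifies traffic by destination and sums it per project. The label names Ext, Object and Compute come from this slide; the networks, the matching on destination only, and the function names are assumptions, not Neutron's metering API.

```python
import ipaddress

# Hypothetical operator configuration: label -> destination networks.
LABELS = {
    "Object": ["10.0.1.0/24"],
    "Compute": ["10.0.2.0/24"],
}
DEFAULT_LABEL = "Ext"  # anything that matches no configured network


def label_for(destination, labels=LABELS, default=DEFAULT_LABEL):
    """Return the first label whose networks contain the destination."""
    address = ipaddress.ip_address(destination)
    for label, networks in labels.items():
        if any(address in ipaddress.ip_network(net) for net in networks):
            return label
    return default


def bandwidth_by_label(flows):
    """Sum bytes per (project, label) from (project, destination, bytes) tuples."""
    totals = {}
    for project, destination, nbytes in flows:
        key = (project, label_for(destination))
        totals[key] = totals.get(key, 0) + nbytes
    return totals
```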

Page 13: From Ceilometer to Telemetry: not so alarming!

Alarms

Alarms regularly watch meter statistics and trigger actions when a threshold is crossed.

Page 14: From Ceilometer to Telemetry: not so alarming!

Alarms architecture

(Diagram) Components: Ceilometer API, Ceilometer alarm evaluator, several Ceilometer alarm notifiers. Links: HTTP, RPC bus. Triggered notifiers perform actions: Webhook, SMS, e-mail…

Page 15: From Ceilometer to Telemetry: not so alarming!

Alarm types

● Threshold alarms
  Triggered once a value crosses a threshold
  “Call a Webhook as soon as CPU usage goes above 80%”

● Combination alarms
  Triggered once all the alarms they combine are triggered
  “Call a Webhook as soon as alarm “foo” and alarm “bar” are triggered”
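The two alarm types can be sketched as two small functions. This is an illustration under stated assumptions rather than the evaluator's actual code: the state names, the rule that every one of the last periods must cross the threshold, and the operator names are all assumed here.

```python
import operator

# Assumed operator names; only "gt" appears in the slides.
COMPARATORS = {
    "gt": operator.gt, "ge": operator.ge,
    "lt": operator.lt, "le": operator.le,
    "eq": operator.eq, "ne": operator.ne,
}


def evaluate_threshold(values, comparison_operator, threshold, evaluation_periods):
    """Return 'alarm', 'ok' or 'insufficient data' for the latest periods."""
    recent = values[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "insufficient data"
    compare = COMPARATORS[comparison_operator]
    # Assumption: every recent period must cross the threshold to fire.
    crossed = all(compare(v, threshold) for v in recent)
    return "alarm" if crossed else "ok"


def evaluate_combination(states):
    """A combination alarm fires only when every member alarm is firing."""
    return "alarm" if states and all(s == "alarm" for s in states) else "ok"
```

With these definitions, the "CPU usage above 80%" example corresponds to evaluate_threshold(cpu_values, "gt", 80, periods), and the "foo and bar" example to evaluate_combination([foo_state, bar_state]).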

Page 16: From Ceilometer to Telemetry: not so alarming!

Alarms API

POST /v2/alarms

{
  "alarm_actions": ["http://site:8000/alarm"],
  "insufficient_data_actions": ["http://site:8000/nodata"],
  "ok_actions": ["http://site:8000/ok"],
  "comparison_operator": "gt",
  "description": "An alarm",
  "evaluation_periods": 2,
  "matching_metadata": {"key_name": "key_value"},
  "meter_name": "storage.objects",
  "name": "SwiftObjectAlarm",
  "period": 240,
  "statistic": "avg",
  "threshold": 200.0
}

GET /v2/alarms/foobar

PUT /v2/alarms/foobar

DELETE /v2/alarms/foobar
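The three *_actions fields in the payload suggest one list of URLs per alarm state. A minimal sketch of that lookup, assuming the state names "alarm", "ok" and "insufficient data" (the helper is hypothetical and is not part of the API):

```python
# Assumed mapping from alarm state to the payload field listing its actions.
ACTION_KEYS = {
    "alarm": "alarm_actions",
    "ok": "ok_actions",
    "insufficient data": "insufficient_data_actions",
}


def actions_for_state(alarm, state):
    """Return the URLs to call when the alarm enters the given state."""
    return list(alarm.get(ACTION_KEYS[state], []))


alarm = {
    "name": "SwiftObjectAlarm",
    "alarm_actions": ["http://site:8000/alarm"],
    "insufficient_data_actions": ["http://site:8000/nodata"],
    "ok_actions": ["http://site:8000/ok"],
}
urls = actions_for_state(alarm, "alarm")
```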

Page 17: From Ceilometer to Telemetry: not so alarming!

Heat & auto-scaling

(Diagram) Components: Heat Engine, my_stack (one Instance), Ceilometer (API service, Compute Agent, Alarm evaluator). Arrow labels: injects user metadata, creates alarms, monitors instances, triggers alarm.

Page 18: From Ceilometer to Telemetry: not so alarming!

Heat & auto-scaling

(Diagram) Components: Heat Engine, my_stack (three Instances), Ceilometer (API, Compute, Alarms). Arrow labels: injects user metadata, alarming, scales out stack.

Page 19: From Ceilometer to Telemetry: not so alarming!

Heat & auto-scaling

(Diagram) Components: Heat Engine, my_stack (five Instances), Ceilometer (API, Compute, Alarms). Arrow labels: injects user metadata, alarming, scales out stack.

Page 20: From Ceilometer to Telemetry: not so alarming!

Events storage

(Almost) all OpenStack components send notifications on events: let’s store them.
➔ Useful to be able to re-generate samples
➔ Useful to generate new samples we did not think about
➔ Allows double-entry accounting
➔ Auditability
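Re-generating samples from stored events could look roughly like the following. The event layout, field names and helper are all hypothetical; the slides do not describe the storage schema.

```python
def samples_from_events(events, event_type, meter_name, volume_of):
    """Derive samples from stored events of one type."""
    return [
        {
            "counter_name": meter_name,
            "counter_volume": volume_of(event),
            "resource_id": event["resource_id"],
            "timestamp": event["timestamp"],
        }
        for event in events
        if event["event_type"] == event_type
    ]


# Hypothetical stored events, for illustration only.
events = [
    {"event_type": "instance.create", "resource_id": "vm-1",
     "timestamp": "2013-12-05T10:00:00", "memory_mb": 512},
    {"event_type": "volume.create", "resource_id": "vol-1",
     "timestamp": "2013-12-05T10:05:00", "size_gb": 10},
]
# A meter nobody planned for can be derived later from the same events.
memory_samples = samples_from_events(
    events, "instance.create", "memory", lambda e: e["memory_mb"]
)
```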

Not yet complete, to be continued in Icehouse

Page 21: From Ceilometer to Telemetry: not so alarming!

Exciting ideas for Icehouse we’re going to hack on.

Page 22: From Ceilometer to Telemetry: not so alarming!

General improvements

● Split the collector into two logical pieces
● Rely on notifications for samples rather than RPC
● Bring the SQLAlchemy and MongoDB drivers almost to parity
● Support for hardware polling
● Support Ironic

Page 23: From Ceilometer to Telemetry: not so alarming!

API improvements

● Complex filtering and query DSL
  x OR y AND z
● /v2/samples (a.k.a. /v2/meter without the meter)
● Return rate rather than absolute value
● More statistics functions (rate of change, moving-window averages…)
● Bulk requests
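Two of the statistics functions named above, rate of change and moving-window average, are easy to pin down with a small example. This is plain arithmetic, not the planned API; the function names and input shapes are assumptions.

```python
def rates(samples):
    """Turn (timestamp_seconds, cumulative_value) pairs into per-second rates."""
    result = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t1 > t0:  # skip duplicate or out-of-order timestamps
            result.append((t1, (v1 - v0) / (t1 - t0)))
    return result


def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

For a cumulative meter such as bytes transferred, rates() is what turns an ever-growing absolute value into something an alarm threshold can meaningfully compare against.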

Page 24: From Ceilometer to Telemetry: not so alarming!

Alarming

● Exclude low sample counts
● Allow time-constrained alarms
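One possible reading of "exclude low sample counts" is to ignore statistics periods that were computed from too few samples before evaluating an alarm. The sketch below follows that reading; the "count" field, the 50% cut-off and the function name are assumptions, since the slide gives no details.

```python
def drop_sparse_periods(statistics, min_fraction=0.5):
    """Drop periods whose sample count is far below the busiest period."""
    if not statistics:
        return []
    busiest = max(s["count"] for s in statistics)
    return [s for s in statistics if s["count"] >= busiest * min_fraction]


periods = [
    {"avg": 82.0, "count": 10},
    {"avg": 99.0, "count": 1},   # a single outlier sample
    {"avg": 79.0, "count": 9},
]
reliable = drop_sparse_periods(periods)
```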

Page 25: From Ceilometer to Telemetry: not so alarming!

Distributed polling

Leveraging Tooz and Taskflow to distribute tasks among workers (agents).

★ Ability to distribute the polling

★ Replace alarm evaluator custom distributor
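The idea of spreading polling work among agents can be illustrated without either library. The sketch below uses a stable hash to assign each resource to one worker; it does not use the Tooz or Taskflow APIs, and all names in it are hypothetical.

```python
import hashlib


def assign_worker(resource_id, workers):
    """Pick a worker for a resource from a stable hash of its id."""
    digest = hashlib.sha256(resource_id.encode("utf-8")).hexdigest()
    # Sorting makes the choice independent of the order workers are listed in.
    return sorted(workers)[int(digest, 16) % len(workers)]


def partition(resource_ids, workers):
    """Map each worker to the resources it should poll."""
    plan = {worker: [] for worker in workers}
    for resource_id in resource_ids:
        plan[assign_worker(resource_id, workers)].append(resource_id)
    return plan
```

Every agent that knows the same worker list computes the same plan, so no resource is polled twice. What this sketch leaves out is keeping that worker list consistent as agents join and leave.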

Page 26: From Ceilometer to Telemetry: not so alarming!

The end.

OpenStack Telemetry

#openstack-ceilometer @ Freenode

Ceilometer

Page 27: From Ceilometer to Telemetry: not so alarming!

Backup slides

Page 28: From Ceilometer to Telemetry: not so alarming!

Heat & auto-scaling

(Diagram) Components: Heat Engine, my_stack (Instance), Ceilometer (API service, Compute Agent, Alarm evaluator, Meter store). Arrow labels: reports samples, provides alarm rules, queries stats.

