Devoxx 2014 Monitoring

Date post: 27-Jun-2015
Upload: claude-falguiere
Description:
Slide deck of the Devoxx 2014 talk
@cfalguiere #DV14 #Monitoring Monitoring Claude Falguière Valtech Paris
Transcript
Page 1: Devoxx 2014 Monitoring

@cfalguiere#DV14 #Monitoring

Monitoring

Claude Falguière

Valtech Paris

Page 2: Devoxx 2014 Monitoring

Content

• DevOps is more than tooling
• Make you love Data
• Motivations for providing and collecting data
• Monitoring user stories and practices
• Getting started and open source tooling

Individuals and interactions over processes and tools

Page 3: Devoxx 2014 Monitoring

Claude Falguière

• Devoxx4Kids
• Paris JUG, Devoxx France, Duchess

http://cfalguiere.wordpress.com

• DevOps Coach
• Java, Performance

Page 4: Devoxx 2014 Monitoring

Monitoring

What would you do if you knew

• that the database is broken
• that the number of hits doubles every 2 months
• that users struggle to find the order form
• why the app is slow
• what users want to buy

Page 5: Devoxx 2014 Monitoring

Model

Data, Facts
• 42 users
• 138 ms
• 742 orders

Model, Questions, Hypothesis
• Sales increased by 14%
• Estimated orders next month: 934
• Average number of requests is 5 times the number of users

Page 6: Devoxx 2014 Monitoring

Galaxy Rotation Problem

Spiral galaxies spin too fast.

To prevent galaxies from flying apart, the expected mass should be ten times the observed mass (calculated from the visible objects).

Page 7: Devoxx 2014 Monitoring

Discovery of Dark Matter

Assumes readings are wrong

If readings are true, is model wrong ?

• 1932 - 1933, Jan Oort, Fritz Zwicky: Hypothesis of a missing mass
• 1960 - 1970, Vera Rubin: Mass calculated from gravitational effects and evidence of Dark Matter
• 2010 - 2013, Planck Satellite: Dark matter estimated at 84.5% of the total matter in the universe

Page 8: Devoxx 2014 Monitoring

• Hypothesis of a missing mass, 1932 - 1933 (Jan Oort, Fritz Zwicky)
• Data are validated
• Is Universal Gravitation wrong ?

Page 9: Devoxx 2014 Monitoring

• Mass calculated from gravitational effects and evidence of Dark Matter, 1960 - 1970 (Vera Rubin)
• Dark matter estimated at 84.5% of the total matter in the universe, 2010 - 2013 (Planck Satellite)
• Still speculative, but generally accepted by the mainstream scientific community

Page 10: Devoxx 2014 Monitoring

Lean Startup DevOps

Measure everything

Big Data

Make decisions based on facts!

Page 14: Devoxx 2014 Monitoring

What would you do if you knew

• that the database is broken
• that the number of hits doubles every 2 months
• that users struggle to find the order form
• why the app is slow
• what users want to buy

Page 15: Devoxx 2014 Monitoring

Motivations and user stories

Alerting
• SLA observance
• Alerting

Storage, Visualization
• Diagnosis / Post-Mortem
• Capacity Planning
• Improvement

Page 17: Devoxx 2014 Monitoring

Architecture

[Diagram labels: App, Log, System, Network, Probe, Log Parser, Collector, Storage / Aggregation, Alerting, Visualization, Support, Dev, DBA]

Page 20: Devoxx 2014 Monitoring

[Diagram labels: System, App, Log, Collector (x2), MQ (x2), filters, rules, Storage (x2), Alerting]

Page 22: Devoxx 2014 Monitoring

Topology

[Diagram labels: App Platform, Monitoring Platform, App, Log, Log Parser, Collector, Storage / Aggregation, Alerting, Visualization]

Page 23: Devoxx 2014 Monitoring

Resilience

[Diagram labels: App Platform, Monitoring Platform, App, Log, Log Parser, Collector, MQ, MQ Collector, Storage / Aggregation, Alerting, Visualization]

Page 24: Devoxx 2014 Monitoring

What would you do if you knew that the database is broken

Page 25: Devoxx 2014 Monitoring

Error detection and alerting

• Log filtering
• Event firing

• Context
  • is it critical ?
  • which feature does it impact ?
  • how deep is the impact ?
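A minimal sketch of log filtering and event firing with context, assuming the application log format shown later in the deck. The lookup tables FEATURE_BY_MODULE and CRITICAL_FEATURES and the Event class are hypothetical names introduced for illustration; the depth of the impact is not computed here.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup tables: which business feature a module serves,
# and which features are considered critical.
FEATURE_BY_MODULE = {"Order Creation Service": "ordering"}
CRITICAL_FEATURES = {"ordering"}

# Matches "<date> <time> <SEVERITY> [<module>]<rest of line>".
LINE = re.compile(r"^\S+ \S+ (?P<severity>[A-Z]+) \[(?P<module>[^\]]+)\](?P<rest>.*)$")


@dataclass
class Event:
    severity: str
    module: str
    feature: str    # which feature does it impact ?
    critical: bool  # is it critical ?
    detail: str


def to_event(line: str) -> Optional[Event]:
    """Log filtering: keep ERROR lines only, then attach context."""
    match = LINE.match(line)
    if match is None or match.group("severity") != "ERROR":
        return None
    module = match.group("module")
    feature = FEATURE_BY_MODULE.get(module, "unknown")
    return Event(
        severity="ERROR",
        module=module,
        feature=feature,
        critical=feature in CRITICAL_FEATURES,
        detail=match.group("rest").strip(),
    )
```

An alerting component would call to_event on each incoming line and fire a notification only when the result is not None, using the critical flag to choose the channel.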

Page 26: Devoxx 2014 Monitoring

Is this a log ?

Exception in thread "main" com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Access denied for user 'shopapp'@'shprdb1' to database 'shop'
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.Util.getInstance(Util.java:386)
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1054)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4237)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4169)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:928)
    at com.mysql.jdbc.MysqlIO.proceedHandshakeWithPluggableAuthentication(MysqlIO.java:1750)
    at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1290)
    at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2493)
    at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2526)
    at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2311)
    at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:834)
    at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
    at java.lang.reflect.Constructor.newInstance(Unknown Source)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
    at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:416)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:347)
    at java.sql.DriverManager.getConnection(Unknown Source)

Page 27: Devoxx 2014 Monitoring

Log example

2013-12-17 05:53:16,208 ERROR [Order Creation Service](456713) [shpras2](web1234) Could not create order id=456713 - Cause: Can't connect to database 'shop' - MySqlMessage: Access denied for user 'shopapp'@'shprdb1' to database 'shop'

2013-12-17 05:53:16,208
ERROR
[Order Creation Service]
(456713)
[shpras2]
(web1234)
Could not create order id=456713
Cause: Can't connect to database 'shop'
MySqlMessage: Access denied for user 'shopapp'@'shprdb1' to database 'shop'

Page 28: Devoxx 2014 Monitoring

Timestamp: 2013-12-17 05:53:16,208
Severity: ERROR
Context (technical and business): [Order Creation Service] (456713) [shpras2] (web1234)
Meaningful information: Could not create order id=456713 - Cause: Can't connect to database 'shop' - MySqlMessage: Access denied for user 'shopapp'@'shprdb1' to database 'shop'
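A structured line like this can be split back into its parts by a parser. The sketch below assumes exactly the layout of the example line; the field names (module, module_id, instance, thread) are illustrative guesses at what each bracketed value means, not names taken from the talk.

```python
import re
from datetime import datetime

# One named group per part of the line: timestamp, severity,
# context (technical and business), then the free-text message.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<severity>[A-Z]+) "
    r"\[(?P<module>[^\]]+)\]\((?P<module_id>[^)]+)\) "
    r"\[(?P<instance>[^\]]+)\]\((?P<thread>[^)]+)\) "
    r"(?P<message>.*)$"
)


def parse(line):
    """Return a dict of fields, or None when the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # "208" after the comma is milliseconds; %f reads it as a fraction of a second.
    fields["ts"] = datetime.strptime(fields["ts"], "%Y-%m-%d %H:%M:%S,%f")
    return fields
```

A multi-line stack trace would not match and returns None, which is one way to see why it is hard to treat as a log entry.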

Page 29: Devoxx 2014 Monitoring

Log Collectors

• Logstash
• Collectd
• Flume
• Splunk (Commercial)

[Diagram labels: Log, System, Collector, storage, Alerting]

Page 30: Devoxx 2014 Monitoring

Logstash

input {
  file {
    path => "/app/logs/apache/*.log"
    type => "apachelog"
  }
}

filter {
  if [type] == "apachelog" {
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { }
}

Page 31: Devoxx 2014 Monitoring

Logstash

input {
  file {
    path => "/app/logs/appserver/monitor*.log"
    type => "applog"
  }
}

filter {
  if [type] == "applog" {
    grok {
      match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{WORD:severity} …" }
    }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { }
}

Page 32: Devoxx 2014 Monitoring

Rate check

• Frequency of an error increases
• Activity falls (e.g. Frequency of orders)

• Alerting based on threshold
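A rate check can be sketched as a sliding window with an upper and a lower threshold. The RateCheck class, its parameter names and the returned status strings are assumptions made for this example, and events are assumed to be recorded in time order.

```python
from collections import deque


class RateCheck:
    """Counts events in a sliding time window and compares the count to thresholds."""

    def __init__(self, window_seconds, max_count=None, min_count=None):
        self.window = window_seconds
        self.max_count = max_count  # alert when an error becomes too frequent
        self.min_count = min_count  # alert when activity falls (e.g. orders)
        self.events = deque()

    def record(self, timestamp):
        self.events.append(timestamp)

    def check(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0] <= now - self.window:
            self.events.popleft()
        count = len(self.events)
        if self.max_count is not None and count > self.max_count:
            return "too_many"
        if self.min_count is not None and count < self.min_count:
            return "too_few"
        return "ok"
```

The same class covers both bullets: use max_count for error frequency and min_count for business activity such as orders per minute.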

Page 33: Devoxx 2014 Monitoring

Baselining

[Two time-series charts, 10:00 to 11:10, with points labeled A, B, C and D]

Page 34: Devoxx 2014 Monitoring

What would you do if you knew that the number of hits doubles every 2 months

[Chart: monthly values, Jan to Aug]
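Knowing the doubling period turns capacity planning into simple arithmetic: hits(t) = hits(0) × 2^(t / 2) with t in months. The sketch below illustrates this; the function names and the sample figures in the usage note are hypothetical.

```python
import math


def projected_hits(current_hits, months_ahead, doubling_months=2.0):
    """Exponential growth: the value doubles every `doubling_months`."""
    return current_hits * 2 ** (months_ahead / doubling_months)


def months_until(current_hits, capacity, doubling_months=2.0):
    """How long until the projected hits reach a given capacity."""
    return doubling_months * math.log2(capacity / current_hits)
```

For instance, a platform sized for 8 times the current traffic would be saturated after months_until(100, 800), about 6 months.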

Page 35: Devoxx 2014 Monitoring

Graphers

• Cycles
• Correlation
• Foresight
• Distribution

[Four example charts, one per item: two weekly series (Sun to Wed), one yearly series (Jan to Dec), one monthly series (April to July)]

Page 36: Devoxx 2014 Monitoring

Storage / Visualization

Collectors (Collectd / StatsD / Logstash / Flume)

Graphite

docker: lopter/collectd-graphite

[Diagram labels: REST, Plain]
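Assuming "Plain" refers to Graphite's plaintext protocol (one "path value timestamp" line per metric, sent over TCP, by default to port 2003), a collector can push a metric with a few lines of code. The metric path used in the usage note and the host default are hypothetical.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of the Graphite plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Open a TCP connection to the Carbon listener and send a single metric."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))
```

Calling send_metric("shop.orders.count", 42) would record one data point, provided a Carbon listener is reachable at the given host and port.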

Page 37: Devoxx 2014 Monitoring

Collect and Share

Collect Once and Share
• Support
• Ops, Dev
• Business

Up to date
Flexible

Page 38: Devoxx 2014 Monitoring

Storage / Visualization

Collectors (Collectd / StatsD / Logstash / Flume)

• Graphite
• InfluxDB
• Grafana
• Logstash
• ElasticSearch
• Kibana

docker: gsogol/docker-elk

[Diagram labels: REST (x3), Plain]

Page 39: Devoxx 2014 Monitoring

JMX

• MBeans
• Registration
• Servo
• RMI and firewalls
  • -Dcom.sun.management.jmxremote.rmi.port=p
  • -Djava.rmi.server.hostname=n.n.n.n
• Jolokia
• jmxtrans

[Image source: wikipedia]
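Jolokia exposes JMX over HTTP/JSON, which avoids the RMI and firewall issues. The sketch below reads one MBean attribute through Jolokia's read operation; the base URL is an assumption (adjust it to wherever the agent is deployed), and MBean names containing slashes would need Jolokia's own escaping, which is not handled here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def read_url(base_url, mbean, attribute):
    """Build a Jolokia GET URL of the form <base>/read/<mbean>/<attribute>."""
    return f"{base_url.rstrip('/')}/read/{quote(mbean, safe=':=,')}/{quote(attribute)}"


def extract_value(response_text):
    """Return the 'value' field of a Jolokia response, or raise on error status."""
    payload = json.loads(response_text)
    if payload.get("status") != 200:
        raise RuntimeError(payload.get("error", "Jolokia request failed"))
    return payload["value"]


def read_attribute(base_url, mbean, attribute):
    """Fetch one attribute from a running Jolokia agent."""
    with urlopen(read_url(base_url, mbean, attribute), timeout=5) as response:
        return extract_value(response.read().decode("utf-8"))
```

With an agent listening on a hypothetical http://localhost:8778/jolokia, read_attribute(base, "java.lang:type=Memory", "HeapMemoryUsage") would return the heap usage as a dict.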

Page 40: Devoxx 2014 Monitoring

JMX Collectors

• VisualVM
• JConsole
• logstash
• collectd
• JMX Enabled App Performance Monitoring tools

[Diagram labels: JMX beans, Collector, storage]

Page 41: Devoxx 2014 Monitoring

JSON Event over REST

curl -X POST "…" \
  -d '{"ts": "2013-12-17 05:53:16,208",
       "type": "metric",
       "module": "Order Creation Service",
       "module-id": "456713",
       "instance": "shpras2",
       "thread": "web1234",
       "name": "order-creation",
       "duration": "12",
       "unit": "ms"}'

Timestamp: ts
Context (technical and business): module, module-id, instance, thread
Metric: name, duration, unit
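The same event can be emitted from application code. This is a sketch only: the target URL is elided on the slide, so post_event takes it as a parameter, and sending duration as a number rather than a string is a choice made here to ease aggregation, not something the slide prescribes.

```python
import json
from urllib.request import Request, urlopen


def build_event(name, duration_ms, module, module_id, instance, thread, ts):
    """Assemble a metric event with the fields shown on the slide."""
    return {
        "ts": ts,
        "type": "metric",
        "module": module,
        "module-id": module_id,
        "instance": instance,
        "thread": thread,
        "name": name,
        "duration": duration_ms,
        "unit": "ms",
    }


def post_event(url, event):
    """POST the event as JSON; `url` is whatever endpoint the collector exposes."""
    body = json.dumps(event).encode("utf-8")
    request = Request(url, data=body,
                      headers={"Content-Type": "application/json"}, method="POST")
    with urlopen(request, timeout=5) as response:
        return response.status
```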

Page 42: Devoxx 2014 Monitoring

What would you do if you knew why the app is slow

Page 43: Devoxx 2014 Monitoring

Tuning

• Collectd/StatsD plugins
• Metrics
• Commercial: Plumbr, AppDynamics, New Relic

Where does it spend time ? Why ?

Cross-check metrics from various sub-systems

[Diagram labels: Front-End, Back-End, DB, System (x3)]

Page 44: Devoxx 2014 Monitoring

What would you do if you knew that users struggle to find the order form

Page 45: Devoxx 2014 Monitoring

Web Analytics / User tracking

• Web analytics
• Page counters
• Tagging
• Log parser

• Google Analytics
• Piwik (docker: cfalguiere/docker-piwik)
• Reporting APIs
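A log parser can double as a page counter. The sketch below counts successful hits per page from web server access log lines; it assumes the common Apache request format ("METHOD path HTTP/x.y" followed by the status code) and is an illustration, not a replacement for the tools listed above.

```python
import re
from collections import Counter

# Request line and status code as they appear in Apache access logs.
REQUEST = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def page_counts(lines):
    """Count successful (2xx) hits per page, ignoring query strings."""
    counts = Counter()
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("status").startswith("2"):
            page = match.group("path").split("?", 1)[0]
            counts[page] += 1
    return counts
```

Comparing the count for the order form with the counts for the pages leading to it gives a first hint of where users get lost.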

Page 46: Devoxx 2014 Monitoring

What would you do if you knew what users want to buy

Page 47: Devoxx 2014 Monitoring

Model vs Big Data

Model
• Expected information
• Explicit Model
• List of metrics

Big Data
• Classification
• Machine Learning
• Pattern detection

Highlights valuable metrics and relationships

Page 48: Devoxx 2014 Monitoring

Getting started

List user stories and metrics

setup monitoring

get facts

get hypothesis

add metrics

validate hypothesis

get facts

Page 49: Devoxx 2014 Monitoring

What should I monitor ?

Alerting & Post-Mortem:
• Presence check
• Activity (how many users, requests, orders …)
• Resources that are limited in size
  • Physical: CPU, memory, free disk space, network bandwidth ...
  • Logical: pools, queues, caches, …
• Errors
• Others
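As an illustration of monitoring a physical resource that is limited in size, here is a small probe built on the Python standard library only. The function names and the 10% default threshold are assumptions; a real setup would more likely rely on a collector such as Collectd.

```python
import os
import shutil
import time


def sample_resources(path="."):
    """Take one sample of disk space and, where available, system load."""
    usage = shutil.disk_usage(path)
    sample = {
        "ts": int(time.time()),
        "disk_free_ratio": usage.free / usage.total,
    }
    # os.getloadavg is not available on every platform.
    if hasattr(os, "getloadavg"):
        try:
            sample["load_1m"] = os.getloadavg()[0]
        except OSError:
            pass
    return sample


def low_disk(sample, min_free_ratio=0.10):
    """Threshold check: True when free disk space falls below the ratio."""
    return sample["disk_free_ratio"] < min_free_ratio
```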

Page 50: Devoxx 2014 Monitoring

What should I monitor ?

Plan & Improve:
• Any information which is useful to understand the process
• time spent for each major step
• things that are done often or require large datasets
• user navigation
• context

Listen to users and ops

Page 51: Devoxx 2014 Monitoring

Continuous Improvement

Design for Failure

Learn from data

Page 52: Devoxx 2014 Monitoring

Thank You

