Metrics by Coda Hale: to know your app's health

Post on 02-Nov-2014

description

Now that developers are expected to work closely with operations, it is important to have a common understanding of how an application performs in production. I believe only a small share of developers really think about how their app operates. In this talk I describe how easy it is to expose performance information about a production application with Metrics by Coda Hale, and I share practical use cases.

transcript

Metrics by Coda Hale: to know your app's health

Izzet Mustafayev @ EPAM Systems

@webdizz, webdizz, izzetmustafaiev

http://webdizz.name

this is me

● SA at EPAM Systems

● primary skill is Java

● JUG member/speaker

● hands-on coding with Ruby, Groovy and some Scala

● passionate about agile, clean code practices and the devops movement

agenda

● introduction

● why should i care

● features

● reporting

● recipes

● alternatives

● q&a

Finish or Start?

Familiar?

Possible?!.

Your app?!.

Dreams...

One thing...

● does my app work?

● will my app keep working?

● what happened that made my app stop working?

Statistics

Metrics

Features

Registry...

- is the heart of the Metrics framework

MetricRegistry registry = new MetricRegistry();

// or, as a Spring bean
@Bean
public MetricRegistry metricRegistry() {
    return new MetricRegistry();
}

Gauge...

- is the simplest metric type that just returns a value

// name(...) is MetricRegistry.name, statically imported
registry.register(name(Persistence.class, "entities-cached"),
    new Gauge<Integer>() {
        @Override
        public Integer getValue() {
            return cache.getItems();
        }
    });

Counter...

- is a simple incrementing and decrementing integer value

Counter onSiteUsers = registry.counter(name(User.class, "users-logged-in"));

// on login: increase
onSiteUsers.inc();

// on logout/session expiration: decrease
onSiteUsers.dec();

Histogram...

- measures the distribution of values in a stream of data

Histogram requestTimes = registry.histogram(name(Request.class, "request-processed"));

requestTimes.update(request.getTime());
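To make "distribution of values" concrete, here is a minimal JDK-only sketch of the kind of summary a histogram reports (count, mean, percentiles). It is an illustration under assumptions, not how the Metrics library implements Histogram: the library samples values into a reservoir, while this sketch keeps every value in memory. `SimpleHistogram` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: stores all values, whereas the
// Metrics library keeps a bounded sample (a reservoir).
final class SimpleHistogram {
    private final List<Long> values = new ArrayList<>();

    // Record one observed value, e.g. a request time in milliseconds.
    synchronized void update(long value) {
        values.add(value);
    }

    synchronized long getCount() {
        return values.size();
    }

    synchronized double mean() {
        if (values.isEmpty()) {
            return 0.0;
        }
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.size();
    }

    // Nearest-rank percentile for 0 < q <= 1: the smallest recorded value
    // such that at least a fraction q of all values is <= that value.
    synchronized long percentile(double q) {
        if (q <= 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in (0, 1]");
        }
        if (values.isEmpty()) {
            return 0;
        }
        List<Long> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(q * sorted.size());
        return sorted.get(rank - 1);
    }

    public static void main(String[] args) {
        SimpleHistogram requestTimes = new SimpleHistogram();
        for (long ms : new long[] {10, 20, 30, 40, 100}) {
            requestTimes.update(ms);
        }
        System.out.println("mean=" + requestTimes.mean()
            + " median=" + requestTimes.percentile(0.5)
            + " p99=" + requestTimes.percentile(0.99));
    }
}
```

The single slow request (100) barely moves the median but dominates the 99th percentile, which is why a histogram tells you more than an average alone.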

Meter...

- measures the rate at which a set of events occur

Meter timedOutRequests = registry.meter(name(Request.class, "request-timed-out"));

// in finally block on time-out
timedOutRequests.mark();
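As a rough picture of what "rate" means here, the sketch below computes a mean rate as events divided by elapsed time, using only the JDK. It is an assumption-laden illustration, not the library's Meter, which additionally reports 1-, 5- and 15-minute moving averages. `SimpleMeter` and the injectable clock are hypothetical; the clock is only there so the behaviour can be checked deterministically.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical illustration only, not the Metrics library's Meter.
final class SimpleMeter {
    private final AtomicLong count = new AtomicLong();
    private final LongSupplier nanoClock;
    private final long startNanos;

    // The clock is injectable so tests can control elapsed time.
    SimpleMeter(LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.startNanos = nanoClock.getAsLong();
    }

    SimpleMeter() {
        this(System::nanoTime);
    }

    // Record that one event happened.
    void mark() {
        count.incrementAndGet();
    }

    long getCount() {
        return count.get();
    }

    // Events per second since the meter was created.
    double meanRatePerSecond() {
        long elapsedNanos = nanoClock.getAsLong() - startNanos;
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return count.get() / (elapsedNanos / 1_000_000_000.0);
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleMeter timedOutRequests = new SimpleMeter();
        timedOutRequests.mark();
        timedOutRequests.mark();
        Thread.sleep(100);
        System.out.println("events/second: " + timedOutRequests.meanRatePerSecond());
    }
}
```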

Timer...

- is basically a histogram of the duration of a type of event and a meter of the rate of its occurrence

Timer requests = registry.timer(name(Request.class, "request-processed"));
Timer.Context time = requests.time();

// in finally block of request processing
time.stop();
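The reason for stopping the timer in a finally block is that failed requests should be timed too. The JDK-only sketch below shows that pattern in isolation; `SimpleTimer` is a hypothetical stand-in, not the library's Timer, and it records raw durations instead of feeding a histogram and a meter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical illustration only, not the Metrics library's Timer.
final class SimpleTimer {
    private final List<Long> durationsNanos = new ArrayList<>();

    // Runs the task and records how long it took, whether the task
    // returns normally or throws.
    <T> T time(Callable<T> task) throws Exception {
        long start = System.nanoTime();
        try {
            return task.call();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    private synchronized void record(long nanos) {
        durationsNanos.add(nanos);
    }

    // Number of timed events so far (the "meter" side of a timer).
    synchronized int getCount() {
        return durationsNanos.size();
    }

    // Longest recorded duration (one statistic from the "histogram" side).
    synchronized long maxNanos() {
        long max = 0;
        for (long d : durationsNanos) {
            max = Math.max(max, d);
        }
        return max;
    }

    public static void main(String[] args) throws Exception {
        SimpleTimer requests = new SimpleTimer();
        String body = requests.time(() -> "response");
        System.out.println(body + " served, events timed: " + requests.getCount());
    }
}
```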

Health Check...

- is a unified way of performing application health checks

public class BackendHealthCheck extends HealthCheck {

    private Backend backend;

    @Override
    protected Result check() {
        if (backend.ping()) {
            return Result.healthy();
        }
        return Result.unhealthy("Can't ping backend");
    }
}
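"Unified" means every check is registered under a name and all of them can be run in one go, with a failing or crashing check reported instead of breaking the run. A minimal JDK-only sketch of that idea follows; `SimpleHealthChecks` and its methods are hypothetical names, and results are reduced to a boolean where the library returns a richer result with a message.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical illustration only, not the Metrics health check registry.
final class SimpleHealthChecks {
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    void register(String name, BooleanSupplier check) {
        checks.put(name, check);
    }

    // Runs every registered check. A check that throws is reported
    // as unhealthy, so one broken check cannot hide the others.
    Map<String, Boolean> runAll() {
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, BooleanSupplier> entry : checks.entrySet()) {
            boolean healthy;
            try {
                healthy = entry.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                healthy = false;
            }
            results.put(entry.getKey(), healthy);
        }
        return results;
    }

    boolean allHealthy() {
        return !runAll().containsValue(false);
    }

    public static void main(String[] args) {
        SimpleHealthChecks health = new SimpleHealthChecks();
        health.register("backend", () -> true);
        health.register("database", () -> { throw new IllegalStateException("down"); });
        System.out.println(health.runAll() + " allHealthy=" + health.allHealthy());
    }
}
```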

Reporting

JMX

// somewhere in your application

JmxReporter reporter = JmxReporter.forRegistry(registry).build();

reporter.start();

HTTP

● AdminServlet

○ MetricsServlet

○ ThreadDumpServlet

○ PingServlet

○ HealthCheckServlet

Console

ConsoleReporter reporter = ConsoleReporter.forRegistry(registry)
    .convertRatesTo(TimeUnit.SECONDS)
    .convertDurationsTo(TimeUnit.MILLISECONDS)
    .build();
reporter.start(1, TimeUnit.MINUTES);

Slf4jReporter reporter = Slf4jReporter.forRegistry(registry)
    .outputTo(LoggerFactory.getLogger("com.app.metrics"))
    .convertRatesTo(TimeUnit.SECONDS)
    .convertDurationsTo(TimeUnit.MILLISECONDS)
    .build();
reporter.start(1, TimeUnit.MINUTES);

Ganglia & Graphite

Naming by Matt Aimonetti

<namespace>.<instrumented section>.<target (noun)>.<action (past tense verb)>

Example

● customers.registration.validation.failed

Nesting

● customers.registration.validation.failure.similar_email_found

● customers.registration.validation.failure.email_verification_failed
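A small helper can keep names consistent with this scheme. The sketch below is an assumption for illustration: `MetricNames.of` is a hypothetical method that joins the four parts with dots and appends optional nested details, skipping empty parts.

```java
// Hypothetical helper that builds names of the form
// <namespace>.<instrumented section>.<target>.<action>[.<detail>...]
final class MetricNames {
    private MetricNames() {
    }

    static String of(String namespace, String section, String target,
                     String action, String... details) {
        StringBuilder sb = new StringBuilder();
        append(sb, namespace);
        append(sb, section);
        append(sb, target);
        append(sb, action);
        for (String detail : details) {
            append(sb, detail);
        }
        return sb.toString();
    }

    // Adds one dot-separated part, ignoring null or empty parts.
    private static void append(StringBuilder sb, String part) {
        if (part == null || part.isEmpty()) {
            return;
        }
        if (sb.length() > 0) {
            sb.append('.');
        }
        sb.append(part);
    }

    public static void main(String[] args) {
        // prints "customers.registration.validation.failed"
        System.out.println(of("customers", "registration", "validation", "failed"));
        // prints "customers.registration.validation.failure.similar_email_found"
        System.out.println(of("customers", "registration", "validation",
            "failure", "similar_email_found"));
    }
}
```

Because nested names share a prefix, a dashboard query on `customers.registration.validation.failure` can aggregate every failure reason underneath it.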

AOP

There is an easy way to introduce metrics using AOP.

<aop:config proxy-target-class="false">
    <aop:aspect id="controllerProfilerAspect" ref="executionMonitor">
        <aop:pointcut id="controllerMethods"
            expression="execution(* com.app.controller.*Controller.*(..))" />
        <aop:around pointcut-ref="controllerMethods" method="logExecutionTime" />
    </aop:aspect>
</aop:config>

public Object logExecutionTime(ProceedingJoinPoint joinPoint) throws Throwable {
    String namePfx = metricNamePrefix(joinPoint);
    Timer.Context time = registry.timer(name(namePfx, "timed")).time();
    try {
        // execute join point ...
        return joinPoint.proceed();
    } catch (Throwable throwable) {
        // count the failure, then let it propagate to the caller
        registry.counter(name(namePfx, "exception-counted")).inc();
        throw throwable;
    } finally {
        time.stop();
    }
}

metrics-spring

@Configuration
@EnableMetrics
public class ApplicationConfiguration {

    @Bean
    public MetricRegistry metricRegistry() {
        return new MetricRegistry();
    }

    // your other beans here ...
}

// annotations

@Timed, @Metered, @ExceptionMetered, @Counted, @Gauge, @CachedGauge and @Metric

instrumentation

● JVM

● Jetty

● Jersey

● InstrumentedFilter

● InstrumentedEhcache

● InstrumentedAppender (Log4j and Logback)

● InstrumentedHttpClient

Alternatives

https://github.com/twitter/zipkin

Zipkin is a distributed tracing system that helps Twitter gather timing data for all the disparate services.

Zipkin provides three services:

● to collect data: bin/collector

● to extract data: bin/query

● to display data: bin/web

http://www.moskito.org/

MoSKito is:

● Multi-purpose: collects any possible type of performance data, including business-related

● Non-invasive: does not require any code change

● Interval-based: works simultaneously with short & long time intervals, allowing instant comparison

● Data privacy: keeps collected data locally, with no 3rd party resources involved.

● Analysis tools: displays accumulated performance data in charts.

● Live profiling: records user actions as system calls.

https://code.google.com/p/javasimon/

Java Simon is a simple monitoring API that allows you to follow and better understand your application.

Monitors (familiarly called Simons) are placed directly into your code and you can choose whether you want to count something or measure time/duration.

https://github.com/Netflix/servo

Servo's goal is to provide a simple interface for exposing and publishing application metrics in Java.

● JMX: JMX is the standard monitoring interface for Java

● Simple: It is trivial to expose and publish metrics

● Flexible publishing: Once metrics are exposed, it is easy to regularly poll the metrics and make them available for reporting systems, logs, and services like Amazon’s CloudWatch.

Summary

q&a

Thanks

Izzet Mustafayev @ EPAM Systems

@webdizz, webdizz, izzetmustafaiev

http://webdizz.name