Monitoring Spark Applications
Tzach Zohar @ Kenshoo, March 2016
Who am I
System Architect @ Kenshoo
Java backend for 10 years
Working with Scala + Spark for 2 years
https://www.linkedin.com/in/tzachzohar
Who's Kenshoo
A 10-year-old Tel Aviv-based startup
Industry Leader in Digital Marketing
500+ employees
Heavy data shop
http://kenshoo.com/
And who’re you?
Agenda
Why Monitor
Spark UI
Spark REST API
Spark Metric Sinks
Applicative Metrics
The Importance of Being Earnest
Why Monitor
Failures
Performance
Know your data
Correctness of output
Monitoring Distributed Systems
No single log file
No single User Interface
Often - no single framework (e.g. Spark + YARN + HDFS…)
Spark UI
Spark UI
See http://spark.apache.org/docs/latest/monitoring.html#web-interfaces
The first go-to tool for understanding what’s what
Created per SparkContext
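Since the UI is created per SparkContext, each running application serves its own instance. A minimal sketch (spark.ui.port is the real setting; the app name, master, and port value here are just illustrative):

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("monitoring-demo")
    .setMaster("local[*]")
    .set("spark.ui.port", "4040") // the default; a second context on the same host binds 4041, and so on
  val sc = new SparkContext(conf)
  // while sc is alive, the UI is served at http://<driver-host>:4040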
Spark UI
Jobs -> Stages -> Tasks
Spark UI
Use the “DAG Visualization” in Job Details to:
Understand flow
Detect caching opportunities
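For example, if the DAG shows the same lineage being recomputed by several jobs, that's a caching opportunity. A minimal sketch (the input path is hypothetical):

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(new SparkConf().setAppName("cache-demo").setMaster("local[*]"))
  val words = sc.textFile("input.txt").flatMap(_.split("\\s+"))
  words.cache() // without this, both actions below re-read and re-split the file
  println(words.count())
  println(words.distinct().count())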
Spark UI
Jobs -> Stages -> Tasks
Detect unbalanced stages
Detect GC issues
Spark UI
Jobs -> Stages -> Tasks -> “Event Timeline”
Detect stragglers
Detect repartitioning opportunities
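For example, if the timeline shows one long task while the rest of the cluster idles, repartitioning may help. A minimal sketch (the gzipped input path is hypothetical; gzip files are not splittable, so they load as a single partition):

  import org.apache.spark.{SparkConf, SparkContext}

  val sc = new SparkContext(new SparkConf().setAppName("repartition-demo").setMaster("local[4]"))
  val raw = sc.textFile("events.json.gz")
  println(raw.partitions.length)   // 1 - a single straggler task would do all the work
  val spread = raw.repartition(16) // shuffle the data across 16 partitions
  println(spread.partitions.length) // 16 - later stages can now use all cores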
Spark UI Disadvantages
“Ad-Hoc”, no history*
Human readable, but not machine readable
Data points, not data trends
(* the Spark History Server can serve the UI for completed applications, if event logging is enabled)
Spark UI Disadvantages
The UI can quickly become hard to use…
Spark REST API
Spark's REST API
See http://spark.apache.org/docs/latest/monitoring.html#rest-api
Programmatic access to UI’s data (jobs, stages, tasks, executors, storage…)
Useful for aggregations over similar jobs
Spark's REST API
Example: calculate total shuffle statistics (the imports are not on the original slide; this assumes json4s for JSON parsing):

  import scala.io.Source.fromURL
  import org.json4s._
  import org.json4s.jackson.JsonMethods.parse

  object SparkAppStats {
    case class SparkStage(name: String, shuffleWriteBytes: Long, memoryBytesSpilled: Long, diskBytesSpilled: Long)
    implicit val formats = DefaultFormats
    val url = "http://<host>:4040/api/v1/applications/<app-name>/stages"

    def main(args: Array[String]) {
      val json = fromURL(url).mkString
      val stages: List[SparkStage] = parse(json).extract[List[SparkStage]]
      println("stages count: " + stages.size)
      println("shuffleWriteBytes: " + stages.map(_.shuffleWriteBytes).sum)
      println("memoryBytesSpilled: " + stages.map(_.memoryBytesSpilled).sum)
      println("diskBytesSpilled: " + stages.map(_.diskBytesSpilled).sum)
    }
  }
Example output:
stages count: 1435
shuffleWriteBytes: 8488622429
memoryBytesSpilled: 120107947855
diskBytesSpilled: 1505616236
Spark's REST API
Example: calculate total time per job name (same assumed json4s imports as above, plus java.util.Date):

  import java.util.Date

  implicit val formats = DefaultFormats
  val url = "http://<host>:4040/api/v1/applications/<app-name>/jobs"

  case class SparkJob(jobId: Int, name: String, submissionTime: Date, completionTime: Option[Date], stageIds: List[Int]) {
    def getDurationMillis: Option[Long] = completionTime.map(_.getTime - submissionTime.getTime)
  }

  def main(args: Array[String]) {
    val json = fromURL(url).mkString
    parse(json)
      .extract[List[SparkJob]]
      .filter(j => j.getDurationMillis.isDefined) // only completed jobs
      .groupBy(_.name)
      .mapValues(list => (list.map(_.getDurationMillis.get).sum, list.size))
      .foreach { case (name, (time, count)) =>
        println(s"TIME: $time\tAVG: ${time / count}\tNAME: $name")
      }
  }
Example output:
TIME: 182570 AVG: 16597 NAME: count at MyAggregationService.scala:132
TIME: 230973 AVG: 1297 NAME: parquet at MyRepository.scala:99
TIME: 120393 AVG: 2188 NAME: collect at MyCollector.scala:30
TIME: 5645 AVG: 627 NAME: collect at MyCollector.scala:103
But that’s still ad-hoc, right?
Spark Metric Sinks
Metrics
See http://spark.apache.org/docs/latest/monitoring.html#metrics
Spark uses the popular dropwizard.metrics library (renamed from codahale.metrics and yammer.metrics)
Metrics is an easy Java API for creating and updating metrics stored in memory, e.g.:

  // Gauge for executor thread pool's actively executing task counts
  metricRegistry.register(name("threadpool", "activeTasks"), new Gauge[Int] {
    override def getValue: Int = threadPool.getActiveCount()
  })
Metrics
What is metered? Couldn't find any detailed documentation of this.
This trick flushes most of them out: search the Spark sources for “metricRegistry.register”
Where do these metrics go?
Spark Metric Sinks
A “Sink” is an interface for viewing these metrics, at given intervals or ad-hoc
Available sinks: Console, CSV, SLF4J, Servlet, JMX, Graphite, Ganglia*
(* the Ganglia sink requires a separate build of Spark, due to licensing)
We use the Graphite Sink to send all metrics to Graphite
$SPARK_HOME/metrics.properties:

  *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
  *.sink.graphite.host=<your graphite hostname>
  *.sink.graphite.port=2003
  *.sink.graphite.period=30
  *.sink.graphite.unit=seconds
  *.sink.graphite.prefix=<token>.<app-name>.<host-name>
… and it's in Graphite (+ Grafana)
Graphite Sink
Very useful for trend analysis
WARNING: Not suitable for short-running applications (will pollute graphite with new metrics for each application)
Requires some Graphite tricks to get clear readings (wildcards, sums, derivatives, etc.)
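For example, a couple of hypothetical Graphite targets (the exact metric paths depend on the prefix configured above; sumSeries and nonNegativeDerivative are standard Graphite functions):

  sumSeries(<token>.<app-name>.*.executor.threadpool.activeTasks)
  nonNegativeDerivative(sumSeries(<token>.<app-name>.*.executor.filesystem.hdfs.read_bytes))

The first uses a wildcard to sum a gauge across all hosts; the second turns an ever-growing byte counter into a rate.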
Applicative Metrics
The Missing Piece
Spark meters its internals pretty thoroughly, but what about your internals?
Applicative metrics are a great tool for knowing your data and verifying output correctness
We use Dropwizard Metrics + Graphite for this too (everywhere)
Counting RDD Elements
rdd.count() might be costly (it's another action)
Spark Accumulators are a good alternative
Trick: send accumulator results to Graphite, using “Counter-backed Accumulators”. The imports below are not on the original slide; they assume the yammer-era Metrics API that matches these calls:

  import scala.reflect.ClassTag
  import org.apache.spark.Accumulator
  import org.apache.spark.rdd.RDD
  import com.yammer.metrics.Metrics
  import com.yammer.metrics.core.{Counter, MetricName}

  /**
   * Call the returned callback after acting on the returned RDD to get the counter updated
   */
  def countSilently[V: ClassTag](rdd: RDD[V], metricName: String, clazz: Class[_]): (RDD[V], Unit => Unit) = {
    val counter: Counter = Metrics.newCounter(new MetricName(clazz, metricName))
    val accumulator: Accumulator[Long] = rdd.sparkContext.accumulator(0L, metricName) // 0L, so the accumulator is typed Long
    val countedRdd = rdd.map(v => { accumulator += 1; v })
    val callback: Unit => Unit = u => counter.inc(accumulator.value)
    (countedRdd, callback)
  }
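A minimal usage sketch (the input RDD and output path are hypothetical): run an action on the returned RDD first, then invoke the callback so the final accumulator value is pushed into the Counter (and from there to Graphite):

  val (countedRdd, updateCounter) = countSilently(inputRdd, "inputRecords", getClass)
  countedRdd.saveAsTextFile("output") // the action materializes the RDD, driving the accumulator
  updateCounter(())                   // now copy the accumulated count into the Counter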
We Measure...
Input records
Output records
Parsing failures
Average job time
Data “freshness” histogram (see the sketch below)
Much much more...
WARNING: it’s addictive...
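To sketch just one of these: a data-“freshness” histogram can be built with the same Dropwizard Metrics library (this assumes the 3.x com.codahale.metrics API; the metric name and event-time parameter are hypothetical):

  import com.codahale.metrics.MetricRegistry

  val registry = new MetricRegistry
  val freshness = registry.histogram("input.freshness.millis")

  // for each input record, record how old it is at processing time
  def recordFreshness(eventTimeMillis: Long): Unit =
    freshness.update(System.currentTimeMillis() - eventTimeMillis)

A Graphite reporter attached to the registry ships it like any other metric.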
Conclusions
Spark provides a wide variety of monitoring options
Each should be used when appropriate - none is sufficient on its own
Metrics + Graphite + Grafana can give you visibility into any numeric time series
Questions?
Thank you