Data Care, Feeding, and Maintenance

Date post: 24-Jun-2015
Upload: mercedes-coyle
Data Care, Feeding, and Maintenance, by Mercedes Coyle, Data Infrastructure Engineer (@benzobot)
Transcript
Page 1: Data Care, Feeding, and Maintenance

Data Care, Feeding, and Maintenance

Mercedes Coyle Data Infrastructure Engineer at

@benzobot

Page 2: Data Care, Feeding, and Maintenance

• Online Video Syndication platform

• Connect content providers, video publishers, advertising partners

• 2-3 million streams/day

Page 3: Data Care, Feeding, and Maintenance

Where does your data come from?

• One-time use analytics, or continual collection and processing?

• How much control do you have over data content and formatting?

• public datasets (gov, twitter) - little control

• application logging - more control

Page 4: Data Care, Feeding, and Maintenance

How is your data formatted?

• Universal data format, or Normalize All the Data

• Pre- vs Post- processing

• Mapping data to a schema, even if it doesn’t have one
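
As a rough illustration of mapping records to a schema, the sketch below normalizes two differently shaped inputs into one common shape. The field names (`ts`, `publisher`, `player`) and both input formats are hypothetical, not taken from the talk.

```python
import json
from datetime import datetime, timezone

# Hypothetical target schema: every record ends up with exactly these keys.
SCHEMA = ("ts", "publisher", "player")

def from_json_line(line):
    """Map a JSON log line (hypothetical field names) onto SCHEMA."""
    raw = json.loads(line)
    return {
        "ts": datetime.fromtimestamp(raw["time"], tz=timezone.utc).isoformat(),
        "publisher": raw.get("pub", "unknown"),
        "player": raw.get("player_type", "unknown"),
    }

def from_csv_row(row):
    """Map a CSV row 'iso_timestamp,publisher,player' onto SCHEMA."""
    ts, publisher, player = [field.strip() for field in row.split(",")]
    return {
        "ts": ts,
        "publisher": publisher or "unknown",
        "player": player or "unknown",
    }

# Two sources with different layouts, one output shape.
records = [
    from_json_line('{"time": 0, "pub": "acme", "player_type": "flash"}'),
    from_csv_row("1970-01-01T00:00:01+00:00, acme, html5"),
]
```

Doing this mapping at ingestion time is the "pre-processing" choice; running the same functions at query time would be the "post-processing" one.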

Page 5: Data Care, Feeding, and Maintenance

Storage and Analytics tools

• Hadoop - distributed map reduce batch processing for large data sets

• Powerful querying tools (SQL-like Hive, Pig)

• Automated processing tasks for data ingestion and processing

• Slow - analyzing large data takes time, so no realtime results
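
To make the map/reduce model concrete, here is a single-process sketch of the idea: a map step emits key/value pairs, a shuffle groups them by key, and a reduce step folds each group. This is plain Python, not Hadoop's API, and the log format is hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Emit (publisher, 1) for each stream event; the format is hypothetical."""
    publisher, _event = line.split()
    yield publisher, 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Fold one key's values into a single count."""
    return key, sum(values)

log_lines = ["acme play", "globex play", "acme play"]
pairs = (pair for line in log_lines for pair in map_phase(line))
streams_per_publisher = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

On a real cluster the map and reduce steps run in parallel across machines, and the job runs as a batch over the whole dataset, which is part of why results are not available in real time.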

Page 6: Data Care, Feeding, and Maintenance

Storage and Analytics tools

• Realtime infrastructure - instantly available analytics and data storage

• Storm, Spark, MongoDB, Logstash & Elasticsearch

• Can create aggregations and analytics jobs on the fly, and get results in seconds

• Quickly detect issues and make informed decisions

• Not always simple to query backwards over time series

Page 7: Data Care, Feeding, and Maintenance

Storage and Analytics tools

• Small datasets? Reach for some more familiar tools

• CSVs can be handy for quick data analysis on a sample set of your data, especially for biz folks

• Don’t forget about command line tools: grep, awk, sort -u, sum
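
For a sample that fits in memory, the Python standard library covers roughly the same ground as `sort -u` and a column sum. The column names and sample rows below are made up for illustration.

```python
import csv
import io

# Hypothetical sample; in practice this could be open("sample.csv", newline="").
sample = io.StringIO(
    "publisher,browser,streams\n"
    "acme,firefox,120\n"
    "acme,chrome,300\n"
    "globex,firefox,80\n"
)

rows = list(csv.DictReader(sample))

# Roughly what `sort -u` gives for one column: distinct values in sorted order.
browsers = sorted({row["browser"] for row in rows})

# Roughly what an awk one-liner summing a column gives.
total_streams = sum(int(row["streams"]) for row in rows)
```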

Page 8: Data Care, Feeding, and Maintenance

You have data - now what?!

• What do you want to learn from your data?

• How quickly do you need results?

• Is your dataset one time use, or will you add to it over time?

• How accurate do your results need to be?

• Where does your data need to end up?

Page 9: Data Care, Feeding, and Maintenance

Data Infrastructure at

• 75-100 million documents per day

• Lambda Architecture

• Batch processing with Hadoop

• Homegrown Realtime Processing system using RSyslog, Logstash, Elasticsearch and Kibana (currently undergoing rewrite with Storm)
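
In a Lambda Architecture, a query is typically answered by combining a precomputed batch view with a realtime view covering events the batch has not processed yet. The sketch below shows only that merge step. The view contents and hour keys are hypothetical, and no Hadoop or Elasticsearch calls are involved.

```python
from collections import Counter

# Hypothetical views keyed by hour. The batch view is complete up to its
# last run; the speed view holds only events that arrived after that run.
batch_view = Counter({"hour-10": 1000, "hour-11": 1200})
speed_view = Counter({"hour-11": 50, "hour-12": 300})

def query_streams(hour):
    """Serving-layer query: batch result plus any realtime increment."""
    # Counter returns 0 for missing keys, so either layer may lack the hour.
    return batch_view[hour] + speed_view[hour]

def merged_view():
    """Full merged view across both layers."""
    return batch_view + speed_view
```

When the next batch run completes, the speed view entries it now covers would be discarded, so events are not counted twice.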

Page 10: Data Care, Feeding, and Maintenance

Data Infrastructure at

Page 11: Data Care, Feeding, and Maintenance

Log All the Data, but don’t monitor All the Data

• Alert Fatigue

• Vanity Metrics

• Alerts and metrics can only be intelligent and actionable if they are relatable

Page 12: Data Care, Feeding, and Maintenance

Data Investigation: Rapid Stream Decline

Whoops!

Page 13: Data Care, Feeding, and Maintenance

Data Investigation: Rapid Stream Decline

• Our graphs only showed one metric (streams). Why did it decrease so much?

• Two player types, only one was affected.

• System performance metrics and monitoring showed no outages at this time.
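
A single aggregate can hide which segment moved. A minimal sketch of splitting one metric by a dimension and comparing two periods follows. The player type names, counts, and threshold are invented for illustration and are not figures from the incident.

```python
def pct_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100.0

# Hypothetical daily stream counts per player type.
before = {"player_a": 1_000_000, "player_b": 1_500_000}
after = {"player_a": 990_000, "player_b": 600_000}

# The aggregate shows that something dropped, but not where.
total_change = pct_change(sum(before.values()), sum(after.values()))

# The per-segment view shows which player type is responsible.
by_player = {name: pct_change(before[name], after[name]) for name in before}

# Flag segments that dropped by more than a hypothetical 20% threshold.
affected = [name for name, change in by_player.items() if change < -20.0]
```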

Page 14: Data Care, Feeding, and Maintenance

Data Investigation: Digging Deeper

• Publishers provided page load data

• Correlated batch summaries of player loads with page load counts

• Cross-checked data in the Speed Layer to rule out batch processing issues
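
One way to correlate the two sources is to compute, per day, the ratio of player loads to page loads: if pages still load but the ratio falls, the problem is more likely in the player than in traffic. The numbers and the 0.8 threshold below are hypothetical.

```python
# Hypothetical daily counts: page loads reported by publishers,
# player loads taken from batch summaries.
page_loads = {"mon": 10_000, "tue": 10_200, "wed": 9_900}
player_loads = {"mon": 9_000, "tue": 9_180, "wed": 4_950}

def load_ratio(day):
    """Share of page loads that resulted in a player load."""
    return player_loads[day] / page_loads[day]

ratios = {day: load_ratio(day) for day in page_loads}

# Days where the ratio fell well below the first day's baseline.
baseline = ratios["mon"]
suspect_days = [day for day, ratio in ratios.items() if ratio < 0.8 * baseline]
```

Repeating the same calculation on speed-layer counts is one way to cross-check that a drop is not an artifact of batch processing.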

Page 15: Data Care, Feeding, and Maintenance

Data Investigation: Digging Deeper

Page 16: Data Care, Feeding, and Maintenance

Data Investigation: Digging Deeper

• Further data investigation revealed browser compatibility issues with our players

• Our batch reporting layer visualization highlighted the problem

• Ad-hoc queries in the speed layer allowed quick analysis to determine what caused the issue

Page 17: Data Care, Feeding, and Maintenance

Data Investigation: Next Steps

• More intelligent realtime reporting

• Refine our data visualization tools to better represent our metrics

• Better communication with the teams/products we collect data on to inform analytics and dashboards

Page 18: Data Care, Feeding, and Maintenance

Resources

• Hortonworks Hadoop Sandbox - http://hortonworks.com/products/hortonworks-sandbox/

• Storm Starter - https://github.com/nathanmarz/storm-starter and storm-project.net

• MongoDB Aggregation - http://docs.mongodb.org/manual/core/aggregation-introduction/

• Common Event Expression - http://cee.mitre.org/about/

Page 19: Data Care, Feeding, and Maintenance

• Thanks!

• @benzobot

Questions?