Date posted: 26-Jan-2015 · Category: Technology · Uploaded by: datastax
Data Points += 2 billion ... daily
July 24th, 2013
ABOUT ME
Sean Knapp (@seanknapp)
• Co-Founder, EVP & Chief Product Officer (formerly CTO)
• Senior Software Engineer @ Google
  • Built & launched iGoogle
  • Led Google’s Frontend Web Search and Ads UX teams, which drove a $1B increase in revenue for Google in 18 months
• B.S. & M.S. in Computer Science from Stanford University
OOYALA OVERVIEW
• Suite of products and services providing white-label management, hosting, and distribution of video online
• Hundreds of customers including ESPN, Bloomberg, Disney, Miramax, Univision, Dell, Pac-12 Networks, and more
• 100M+ unique users streaming more than 1B videos monthly, generating more than 2B analytics events daily
• 280 employees located in Silicon Valley, NYC, London, Tokyo, Sydney, Singapore, Seoul & Guadalajara
EVOLVING INSIGHTS
• Insights circa ’07
  • How many videos did I show this week?
  • What were my monthly uniques?
• Insights circa ’09
  • How many ad impressions did I receive from users in each Designated Market Area (DMA)?
• Insights circa ’11
  • How many users do I have right now?
• Insights circa ’13
  • How does the revenue from iPad users age 25-34 compare to those on Xbox?
(Slide annotations: Weekly → Instant; Summary → Detailed → Complex)
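The circa-’13 question above (iPad vs. Xbox revenue for one age bracket) boils down to a filter-group-sum over event records. A minimal in-memory sketch in Scala; the event fields and names here are illustrative assumptions, not Ooyala’s actual schema:

```scala
// Hypothetical event shape; Ooyala's real analytics schema is not shown in the deck.
case class PlayEvent(device: String, ageBucket: String, revenueCents: Long)

object GranularQuery {
  // Total revenue per device for one demographic slice.
  def revenueByDevice(events: Seq[PlayEvent], ageBucket: String): Map[String, Long] =
    events
      .filter(_.ageBucket == ageBucket)
      .groupBy(_.device)
      .map { case (device, evs) => device -> evs.map(_.revenueCents).sum }

  def main(args: Array[String]): Unit = {
    val events = Seq(
      PlayEvent("iPad", "25-34", 500),
      PlayEvent("iPad", "25-34", 300),
      PlayEvent("Xbox", "25-34", 200),
      PlayEvent("iPad", "35-44", 900)
    )
    val byDevice = revenueByDevice(events, "25-34")
    println(s"iPad=${byDevice("iPad")} Xbox=${byDevice("Xbox")}") // iPad=800 Xbox=200
  }
}
```

At billions of events per day this shape runs as a distributed job rather than in memory, but the query semantics are the same.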
BIG DATA @ OOYALA
• 1st Gen (circa ’07)
  • Process: Hadoop MapReduce
  • Language: Ruby
  • Store: MySQL
• 2nd Gen (circa ’09)
  • Process: Hadoop MapReduce
  • Language: Ruby
  • Store: Cassandra 0.5+
• 3rd Gen (circa ’11)
  • Process: MapReduce, Storm
  • Language: Ruby, Scala
  • Store: DataStax Enterprise (300TB disk, 1TB RAM)
• 4th Gen (circa ’13)
  • Process: MapReduce, Storm, Spark, Hive
  • Language: Scala
  • Store: DataStax Enterprise (1.5PB disk, 14TB RAM)
(Slide annotations: Batch → Realtime; Summary → Granular → Queryable)
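The batch half of the stack above is classic map/reduce over event logs. A toy sketch of that shape in plain Scala, assuming a hypothetical comma-separated log format; the real pipeline ran on Hadoop, Storm, and Spark and is not shown in the deck:

```scala
// Toy map/reduce rollup of play counts per video; the log field layout is an assumption.
object DailyRollup {
  // Map phase: parse one log line into an optional (videoId, 1) emission.
  def mapper(logLine: String): Option[(String, Long)] =
    logLine.split(',') match {
      case Array(videoId, "play") => Some(videoId -> 1L)
      case _                      => None // ignore malformed lines and non-play events
    }

  // Reduce phase: sum the emitted counts per video id.
  def reducer(emits: Seq[(String, Long)]): Map[String, Long] =
    emits.groupBy(_._1).map { case (id, kvs) => id -> kvs.map(_._2).sum }

  def main(args: Array[String]): Unit = {
    val log = Seq("v1,play", "v2,play", "v1,play", "v1,seek")
    println(reducer(log.flatMap(mapper))) // v1 counted twice, v2 once, seek ignored
  }
}
```

In the real systems the mapper and reducer run on separate machines over sharded input; the Storm/Spark side applies the same increments continuously instead of in daily batches.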
OUR GOALS
• Evolve our Analytics product from a time-delayed, static reporting system to a realtime, granular, and dynamic query engine
• Launch our Content Recommendation engine, an entirely new product offering
• Scale to billions of user events on a daily basis
• Support an ever-expanding set of global customers
• Deliver a 5-9’s platform
OUR CHALLENGES
• Very small ops team supporting global infrastructure
  • Not enough capacity for performance tuning
  • Routinely fell behind the latest releases
  • Didn’t know which releases were stable enough
• Unforeseen product requirements beyond the next 12 months
• Existing solution would have cost nearly $1M to scale to just 100TB
SELECTION PROCESS
• Key Criteria
  • Scalability: PB+, 100k+ operations per second
  • Cost / price-performance
  • Availability: 5-9’s
  • Flexibility: schemaless
• Alternative Technologies
  • Other RDBMS systems
  • HBase
  • Voldemort
WHY CASSANDRA
• First learned about C* in Nov 2008
• First deployed C* in Sep 2009
• Compelling Features
  • Scalability: PB+, high ops/sec, billions of rows and columns
  • Performance: designed specifically for heavy workloads similar to Ooyala’s
  • Cost: could run on commodity hardware
  • Availability: multi-datacenter with no single point of failure
  • Community: strong, unified direction
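The "no single point of failure" claim rests on Cassandra's token-ring replication: every node owns a slice of the hash space, and each key is stored on several nodes walked clockwise from its token. A toy partitioner in that spirit; the node names, hash choice, and placement policy here are simplified assumptions, not Cassandra's actual NetworkTopologyStrategy:

```scala
import scala.util.hashing.MurmurHash3

// Toy token-ring replica placement; illustrative only, with made-up node names.
object TokenRing {
  val nodes = Vector("dc1-n1", "dc1-n2", "dc2-n1", "dc2-n2")

  // Hash the key onto the ring, then walk clockwise collecting rf replicas.
  def replicasFor(key: String, rf: Int): Seq[String] = {
    val n = nodes.size
    val primary = ((MurmurHash3.stringHash(key) % n) + n) % n // non-negative index
    (0 until rf).map(i => nodes((primary + i) % n))
  }
}
```

With a replication factor of 3, any single node can fail while a key's data remains reachable on its other replicas; Cassandra's datacenter-aware placement extends the same idea so a whole datacenter can drop out.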
RESULTS
• Business
  • Launched the next generation of our Analytics in ’09, which solidified Ooyala as the leader in our industry
  • Launched our Content Recommendation engine in ’12, which again separated us from the industry
• Technical
  • 1,000x the scale of just 5 years ago
  • Much higher ROI: 1PB+ for < $500k in hardware
  • No more 3am pager alerts
Q&A
THANK YOU