Date post: 21-Jan-2018
Category: Technology
Upload: anya-bida
About Anya
Sr. Member of Technical Staff (SRE)
Salesforce Production Engineering
Salesforce Einstein Platform
Co-organizer SF Big Analytics
Spark Tuning
• Cheat-sheet
• Talks
Previously at Alpine Data, SRI
PhD Mayo Clinic, BS Johns Hopkins
@anyabida1
What I am going to talk about
What is DevOps
Salesforce Einstein Scales
Our goal
Top 10 tips
What’s next?
What is DevOps?
Software Development
Network & Security
Infrastructure
Build & Release
Data Science
• Awesome library on SparkML
• Spark clusters
• Microservices
• Cluster, Containers
Fastest Growing Top 5 Enterprise Software Company
FY11: $1.7B
FY12: $2.3B
FY13: $3.1B
FY14: $4.1B
FY15: $5.4B
FY16: $6.7B
FY17: $8.4B revenue
FY18 Q2: $2.56B revenue
2009 • 2010 • 2011
2012 • 2013 • 2014
2015 • 2016 • 2017
September 2016
2011 • 2012 • 2013
2014 • 2015 • 2016 • 2017
The world’s most
innovative companies
“Innovator of
the Decade”
Tip 1: Plan for Failure
Take off that Data Scientist hat now.
https://www.slideshare.net/jiboumans/how-to-measure-everything-a-million-metrics-per-second-with-minimal-developer-overhead
Simple Dashboard with KPIs
• Request & error rates
• Longest response times - upper 95th & 99th percentile
• Capacity
• Events
Jos Boumans, Salesforce DMP (slides)
Collect metrics from every machine.
Troubleshoot with all the metrics at your disposal.
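As a rough illustration of the dashboard KPIs in Tip 1, the sketch below computes an error rate and the 95th and 99th percentile response times from raw samples. The function names and the nearest-rank percentile method are assumptions made for this example; they are not taken from the talk or from any particular metrics system.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def kpi_summary(response_times_ms, error_count):
    """Summarise one reporting window into the KPIs shown on the dashboard."""
    total = len(response_times_ms)
    if total == 0:
        raise ValueError("no requests in this window")
    return {
        "requests": total,
        "error_rate": error_count / total,
        "p95_ms": percentile(response_times_ms, 95),
        "p99_ms": percentile(response_times_ms, 99),
    }


# Hypothetical window: 100 requests taking 1..100 ms, 5 of which failed.
summary = kpi_summary(list(range(1, 101)), error_count=5)
```

Tracking the upper percentiles rather than the mean keeps the slowest requests visible, which is where capacity problems tend to show up first.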
Tip 2: Blue Green Deployments
https://docs.mobingi.com/official/guide/bg-deploy
Blue Machine
(old)
Green Machine
(new)
Users
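The blue/green idea in Tip 2 can be sketched as a router that keeps both environments running and only moves users once the new one passes a health check. The class, the addresses, and the health-check callback below are hypothetical; a real deployment would usually flip a load balancer or DNS entry instead.

```python
class BlueGreenRouter:
    """Sends all users to one of two environments and can flip between them."""

    def __init__(self, blue, green):
        self.targets = {"blue": blue, "green": green}
        self.live = "blue"  # users start on the old (blue) machine

    def route(self):
        """Return the address that users should currently reach."""
        return self.targets[self.live]

    def switch(self, health_check):
        """Flip to the idle environment only if it passes health_check.
        Calling switch() again flips back, which doubles as a rollback."""
        candidate = "green" if self.live == "blue" else "blue"
        if not health_check(self.targets[candidate]):
            return False
        self.live = candidate
        return True


# Hypothetical addresses for the old and new machines.
router = BlueGreenRouter(blue="10.0.0.1", green="10.0.0.2")
before = router.route()
rejected = router.switch(lambda host: False)  # unhealthy green: stay on blue
still_blue = router.route()
accepted = router.switch(lambda host: True)   # healthy green: users move over
after = router.route()
```

Because the old machine stays up until the switch succeeds, a failed release costs a rollback rather than an outage.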
Tip 3: Assume people make mistakes
Technical debt
• Every manual change
• Duplicate metrics
Scale down resources
• Terminate unused machines
• Janitor Monkey
• Understand the cost per job
• Jobs should not accumulate files on disk
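A minimal sketch of the "scale down resources" bullets in Tip 3, assuming each machine reports how long it has been idle. This is not Janitor Monkey's API; the `Machine` fields, the 24-hour threshold, and the flat hourly price are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    idle_hours: float        # hours since the machine last ran a job
    protected: bool = False  # opt-out flag for machines that must stay up


def find_unused(machines, idle_threshold_hours=24.0):
    """Return names of unprotected machines idle for at least the threshold."""
    return [
        m.name
        for m in machines
        if not m.protected and m.idle_hours >= idle_threshold_hours
    ]


def cost_per_job(machine_count, runtime_hours, hourly_cost):
    """Cost of one job run, assuming a flat hourly price per machine."""
    return machine_count * runtime_hours * hourly_cost


fleet = [
    Machine("worker-a", idle_hours=48),
    Machine("worker-b", idle_hours=2),
    Machine("worker-c", idle_hours=100, protected=True),
]
candidates = find_unused(fleet)
job_cost = cost_per_job(machine_count=10, runtime_hours=2, hourly_cost=0.5)
```

Reporting candidates before terminating anything keeps the cleanup itself auditable.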
Tip 4: Changes should be auditable
Schaper - the tool to compare schemas
https://www.linkedin.com/in/huqixiu/
Qixiu “Q” Hu
-- Schema before the change
CREATE TABLE myConferences (
    name text,
    city text,
    early_bird timeuuid,
    late_bird timeuuid,
    PRIMARY KEY ((name, city), early_bird)
) WITH CLUSTERING ORDER BY (early_bird DESC);

-- Schema after the change: adds discount_code
CREATE TABLE myConferences (
    name text,
    city text,
    early_bird timeuuid,
    late_bird timeuuid,
    discount_code text,
    PRIMARY KEY ((name, city), early_bird)
) WITH CLUSTERING ORDER BY (early_bird DESC);
Tip 5: Configuration management
Network Connectivity
• 20 parameters
User Access
• 50 parameters
Deploy cluster (eg Mesos)
• 20 non-default parameters
Deploy a microservice
• 50 parameters
Schedule a job
• 3 parameters
SUM × 3 regions × 20 metrics
Approx. 6000
Templates for Automation
Service discovery
Creating dashboards
• Prod, non-prod, …
Log queries
Cost analysis
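One way to picture "templates for automation" in Tip 5 is to write a configuration once as a template and render it for every environment and region, so nobody edits the parameters by hand. The template text, parameter names, and region names below are invented for this sketch.

```python
from string import Template

# Hypothetical one-line service configuration; real templates would be larger.
SERVICE_TEMPLATE = Template(
    "service=$service env=$environment region=$region log_level=$log_level"
)


def render_configs(service, environments, regions, defaults):
    """Render one configuration per (environment, region) pair."""
    configs = {}
    for env in environments:
        for region in regions:
            values = dict(defaults, service=service, environment=env, region=region)
            configs[(env, region)] = SERVICE_TEMPLATE.substitute(values)
    return configs


configs = render_configs(
    service="scoring",
    environments=["prod", "nonprod"],
    regions=["us-east", "us-west", "eu"],
    defaults={"log_level": "INFO"},
)
```

`Template.substitute` raises `KeyError` when a parameter is missing, so an incomplete configuration fails at render time instead of at deploy time.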
Tip 6: Pick a naming convention
<service>.
<environment>.
<region>.
<hostname>.
<metric>
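The naming convention above can be enforced in code so that every metric path is built and parsed the same way. The sketch below assumes the first four fields contain no dots (for example, short hostnames rather than fully qualified ones) while the metric itself may; the example values are hypothetical.

```python
from typing import NamedTuple


class MetricName(NamedTuple):
    service: str
    environment: str
    region: str
    hostname: str
    metric: str


def format_metric(name):
    """Join the five fields into <service>.<environment>.<region>.<hostname>.<metric>."""
    for field in name[:4]:
        if not field or "." in field:
            raise ValueError(f"invalid field in metric name: {field!r}")
    if not name.metric:
        raise ValueError("metric must not be empty")
    return ".".join(name)


def parse_metric(path):
    """Split a metric path back into fields; the metric may itself contain dots."""
    parts = path.split(".", 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated fields, got: {path!r}")
    return MetricName(*parts)


name = MetricName("scoring", "prod", "us-east", "host42", "jvm.heap.used")
path = format_metric(name)
round_trip = parse_metric(path)
```

With a fixed field order, dashboards and log queries can filter by service, environment, or region using a simple prefix.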
Tip 7: Permissions
Every user, service, & job should have specific, auditable permissions.
Cluster Manager
Scheduler
IAM
IAM Roles
• User has an IAM Role
• Job has an IAM Role
• IAM Roles determine read /
write access to data
IAM
Out
Logs
IAM
In
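The role model in Tip 7 can be illustrated with a plain-Python sketch in which every user and job maps to a role, roles determine read / write access to data, and every decision is recorded. This is not a real IAM API; the role names, principals, and path prefixes are hypothetical.

```python
# Hypothetical roles: each lists the data prefixes it may read or write.
ROLES = {
    "data-scientist": {"read": {"features/"}, "write": set()},
    "scoring-job": {"read": {"features/"}, "write": {"predictions/"}},
}

# Every user and every job is assigned exactly one role.
PRINCIPAL_ROLES = {
    "analyst-1": "data-scientist",
    "nightly-scoring": "scoring-job",
}


def is_allowed(principal, action, path, audit_log):
    """Check access for a user or job and record the decision for auditing."""
    role = PRINCIPAL_ROLES.get(principal)
    prefixes = ROLES.get(role, {}).get(action, set())
    allowed = any(path.startswith(prefix) for prefix in prefixes)
    audit_log.append((principal, role, action, path, allowed))
    return allowed


audit_log = []
job_write = is_allowed("nightly-scoring", "write", "predictions/2018-01-21", audit_log)
user_write = is_allowed("analyst-1", "write", "predictions/2018-01-21", audit_log)
unknown_read = is_allowed("mystery-service", "read", "features/v1", audit_log)
```

Denying by default, as with the unknown principal above, means a missing role assignment shows up as a failed request rather than as silent access.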
Understanding Memory Management in Spark For Fun And Profit
Shivnath Babu (Duke University, Unravel Data Systems)
Mayuresh Kunjir (Duke University)
Tip 8: Understand resource allocation
Node Memory contains Container Memory (8 GB)
Cluster: three nodes (Node Memory), 4 GB used, 8 GB total
Can my 8 GB container launch on this cluster?
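The question on this slide can be worked through with a short sketch. It assumes three nodes, each with 8 GB total and 4 GB already used (one reading of the diagram), and that a container must fit entirely on a single node. The `Node` class and `can_launch` function are hypothetical and not part of any cluster manager's API.

```python
from dataclasses import dataclass


@dataclass
class Node:
    total_gb: float
    used_gb: float

    @property
    def free_gb(self):
        return self.total_gb - self.used_gb


def can_launch(container_gb, nodes):
    """A container cannot span nodes, so one node must have enough free memory."""
    return any(node.free_gb >= container_gb for node in nodes)


# Assumed cluster: three nodes, each 8 GB total with 4 GB in use.
cluster = [Node(8, 4), Node(8, 4), Node(8, 4)]
total_free_gb = sum(node.free_gb for node in cluster)  # free memory across the cluster
fits_8gb = can_launch(8, cluster)
fits_4gb = can_launch(4, cluster)
```

Under these assumptions the cluster has 12 GB free in aggregate, yet the 8 GB container cannot launch because no single node has more than 4 GB available.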
Tip 9: Monitor multiple viewpoints
Connectivity Viewer
https://www.linkedin.com/in/vaibhavt/
Vaibhav Tandon
Getting started tips:
1. Plan for failure
2. Blue / Green Deployments
3. Assume people make mistakes
4. Changes should be auditable
5. Configuration management
6. Pick a naming convention
7. Permissions
• user, service, job
8. Understand resource allocation
9. Monitor multiple viewpoints
10. Infrastructure as Code
Did we just automate ourselves out of our jobs?
Nope. Now we have time to take on new projects and grow…
More info:
Jos Boumans, Salesforce DMP (slides)
SRE: How Google Runs Production Systems (book)
James Ward, Engineering & Open Source Ambassador at Salesforce
High Performance Spark (book)
More info:
Real Time ML Pipelines in Multi-Tenant Environments
Director of Engineering Karl Skucha & Lead Engineer Yan Yang
Introduction to Machine Learning
Engineering & Open Source Ambassador James Ward
Fantastic ML apps and how to build them
Principal Engineer, Matthew Tovbin
Fireworks - lighting up the sky with millions of Sparks
Director of Engineering Thomas Gerber
Functional Linear Algebra in Scala
Engineer & Professor Vlad Patryshev
Panel: Functional Programming for Machine Learning
Saturday @ 2:10pm: Complex Machine Learning Pipelines Made Easy
Machine Learning Engineers Till Bergmann & Chris Rupley