Date post: | 15-Jan-2015 |
Category: |
Software |
Upload: | tzach-zohar |
Continuous Delivery In Practice
Lessons from Kenshoo’s RTB project
Who, What, Where
Tzach Zohar:
● System Architect
● [email protected]
Kenshoo:
● Founded 2006
● Online Marketing Technology
● >500 employees
● 12 worldwide locations
Agenda
● Continuous Delivery: What? Why?
● RTB Project
● How: 10 Field Tested Tips
● The Process
● Appendices
Continuous Delivery: Definition(s)
“Continuous Delivery (CD) is a design practice …blah blah blah… Techniques such as
automated testing, continuous integration …blah blah blah... resulting in the ability to rapidly, reliably and
repeatedly push out enhancements ...blah blah blah.”
- Wikipedia
Continuous Delivery: Definition(s)
TL;DR
“Continuous delivery is a set of principles and practices to reduce the cost, time, and
risk of delivering incremental changes to users.”
- Jez Humble
“Continuous Delivery is a software development discipline where you build
software in such a way that the software can be released to production at any time”
- Martin Fowler
Continuous Delivery: Why bother?
“Our highest priority is to satisfy the customer through early and continuous delivery
of valuable software”
- First principle of the Agile Manifesto
● Better suited product
● Responsiveness
● Less waste
● Higher quality
● Simplicity
Recommended Further Reading on ThoughtWorks
Background: RTB Project
● ~1.5 years
● ~3 developers, 1 PM, 0.5 Ops (no QA)
● Dozens of paying clients
● ~50 servers (AWS)
● ~1.5M requests per minute
● ~7ms average response time
● ~99.9% availability
● Frontend Cluster: highly available, high throughput, ~20-node cluster
● Backend: single node, internal APIs
● FBX: Facebook RTB API
● Reporting Cluster: Elastic Map Reduce (EMR), on-demand 16-node cluster
● Cassandra Cluster: highly available, high throughput, ~24-node cluster
● S3: raw traffic logs
~5-10 deployments / week
How?
1. The Obvious
● Single branch (details later)
● Full, Fast, Reliable coverage
● Full deployment automation
● Fast feedback
● ABCD - Always Be Continuously Deploying
2. Four-Layer Test Suite
● Unit: complete functional coverage
● Integration: with external systems - thin!
● Behavioral: we use Cucumber
● Staging: verify actual server upgrade, and verify compatibility of the new build with other components’ production builds
3. Keep Builds Stable
Do not overlook a test that “sometimes fails”: trusting the build status is crucial.
Be suspicious of:
● Random data tests
● Asynchronous tests
● Integration tests
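One common way to keep random-data tests trustworthy is to draw all test data from an explicitly seeded generator, so any failure can be replayed with the same seed. The sketch below is only an illustration of that idea, not the project's actual test code; `make_bid_requests`, its fields, and the seed value are hypothetical.

```python
import random


def make_bid_requests(rng, count):
    """Build hypothetical bid requests using the given random source."""
    return [
        {
            "user_id": rng.randrange(10**6),
            "floor_price": round(rng.uniform(0.1, 5.0), 2),
        }
        for _ in range(count)
    ]


def test_requests_are_reproducible(seed=12345):
    # A dedicated, seeded Random instance (instead of the global one)
    # means the exact same data is generated on every run.
    first = make_bid_requests(random.Random(seed), 100)
    second = make_bid_requests(random.Random(seed), 100)
    # Put the seed in the failure message so a red build can be replayed.
    assert first == second, f"non-deterministic data for seed {seed}"
    return first
```

If a test must use a fresh seed on every run, logging that seed on failure gives the same replay ability.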
4. Master Is Always Shippable
On every commit? Not quite. We follow the “GitHub Flow”:
1. pull: Master → Local Master
2. checkout: Local Master → Local Feature Branch
3. push: Local Feature Branch → Feature Branch
4. merge: Feature Branch → Master
5. Rigorous Code Reviews
● Because “merge” means “deploy”!
● Insist on proper coverage
● Insist on code cleanliness
● Insist on consistent design
● Insist!
https://github.com/tzachz/github-comment-counter
6. Real-Time Feedback
Detect issues immediately and visually
7. Keep Upgrade in Mind (1)
Use the “Parallel Change” pattern when changing cross-node APIs / Data
1. Write: old, Read: both
   (deploy)
2. Write: new, Read: both
   (deploy)
3. Write: new, Read: new
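The three phases above can be sketched roughly as follows. This is a minimal illustration under assumed names, not the RTB project's code: the record layout (`bid` in dollars for the old format, `bid_cents` plus a `version` field for the new one) and the `use_new_format` flag are hypothetical.

```python
import json


def write_record(bid_cents, use_new_format):
    """Serialize a record.

    use_new_format is False in phase 1 and True in phases 2 and 3.
    """
    if use_new_format:
        return json.dumps({"version": 2, "bid_cents": bid_cents})
    # Old (hypothetical) format: the bid as a float amount in dollars.
    return json.dumps({"bid": bid_cents / 100})


def read_record(raw):
    """Reader for phases 1 and 2: accepts both formats, returns cents.

    In phase 3 the fallback branch for the old format can be deleted.
    """
    data = json.loads(raw)
    if data.get("version") == 2:
        return data["bid_cents"]
    return round(data["bid"] * 100)
```

Because every node can read both formats before any node starts writing the new one, nodes running the previous build and the next build can coexist during each deployment.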
8. Keep Upgrade in Mind (2)
Verify backward compatibility in tests
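One way to do this, sketched here under assumptions, is to keep sample payloads produced by earlier builds and assert that the current code still accepts them. The payloads, the `parse_campaign` function, and the later-added `currency` field are all hypothetical.

```python
import json

# Hypothetical samples, as if captured from messages emitted by earlier builds.
OLD_FORMAT_SAMPLES = [
    '{"campaign": "c1", "budget": 10.5}',
    '{"campaign": "c2", "budget": 0}',
]


def parse_campaign(raw):
    """Current parser: "currency" is assumed to be a newer field,
    so it must fall back to a default for old payloads."""
    data = json.loads(raw)
    return {
        "campaign": data["campaign"],
        "budget": float(data["budget"]),
        "currency": data.get("currency", "USD"),
    }


def test_parser_accepts_old_payloads():
    for raw in OLD_FORMAT_SAMPLES:
        parsed = parse_campaign(raw)
        # Old payloads carry no currency, so the default must apply.
        assert parsed["currency"] == "USD"
    return True
```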
9. A/B Testing
● Apply new features to a limited user-group
● Measure business results per-group
(Not by branching)
Splitting into groups correctly is important
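A common approach to stable, unbiased grouping is to hash a user identifier together with an experiment name and bucket the result. The sketch below shows that general technique only; it is not necessarily how the RTB project split its users, and `assign_group`, the experiment name, and the percentage are hypothetical.

```python
import hashlib


def assign_group(user_id, experiment, treatment_percent):
    """Deterministically place a user in "treatment" or "control".

    Hashing the experiment name together with the user id keeps the
    assignment stable across requests and servers, and independent
    between different experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 100
    return "treatment" if bucket < treatment_percent else "control"
```

For example, `assign_group("user-42", "new-bidding-model", 10)` returns the same group on every call, so a user never flips between groups mid-experiment.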
It’s easy to mess up (neglecting biases, wrong grouping, wrong comparison methods)
This excellent talk by LivePerson’s Shlomo Lahav helped us a lot
10. Own It
● Constantly check builds
● Constantly collect feedback
● Constantly check monitors
● Answer the phone at 3am
That’s It.
The Process
● Greenfield? That’s easy:
  ○ Start with deployment and build
  ○ Deploy a Hello World application
  ○ Every new feature is test-covered
The Process (RTB)
1. Foundations
● Increase Unit + Integration coverage
● Create naive deployment automation
● Create monitoring
● Manual Staging tests
2. Automation
● Automated staging
● Downtime eradicated
● Manual (but frequent) deployment trigger
3. Autopilot - deploy upon commit
~ 9 Months
~ 3 Months
Appendix A. Partial Tool List
● Testing: JUnit, Cucumber, Nose
● Build / CI: Jenkins, Gradle, JaCoCo
● Code Review: GitHub
● Provisioning: Puppet
● Deployment: Fabric, boto
● Monitoring: Metrics, Graphite
Appendix B. Are You Ready?
Unit Coverage > 90%?
Good Staging Tests?
Informative Monitors?
Builds Are Kept Green?
No API Breaking Changes?
Rigorous Code Reviews?
Support Has Your Phone Number?
Do You Own it?
Any “No”: Not Ready. “Yes”: continue to the next question.
credit: [email protected]
Thanks. Questions?