The API
And Things I’ve Learned from 4 Years as a Manager at Netflix
Ben Schmaus
FirstMark Capital, June 2015
[ @schmaus ]
20 million subscribers
1 AWS region
Functional but fragile cloud platform
Datacenter to Cloud Migration
Fundamental Mission Unchanged*
Support product innovation
Insulate devices from failure
* modulo pivot from public API
[Diagram: Service Layer]
Endpoints: /tv, /ios, /web, /android, ...
Services: recs, account, search, sims, ...
Components: Service JARs, Endpoint Scripts, Java API
Engineers are feature producers and failure defenders
(From How Complex Systems Fail)
[Diagram: Service Layer, with Hystrix added]
Endpoints: /tv, /ios, /web, /android, ...
Services: recs, account, search, sims, ...
Components: Service JARs, Endpoint Scripts, Java API, Hystrix
Don’t Tweak Knobs
Humans are best at design time
Automatically adapt, degrade gracefully
Clearly report system behavior
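To make "automatically adapt, degrade gracefully" concrete, here is a minimal circuit-breaker sketch in Python. It is not Hystrix's API; the class name, thresholds, and stats dictionary are all assumptions chosen for illustration. It opens after a run of failures, serves a fallback while open, lets a trial call through after a cool-down, and keeps counters so the behavior can be reported.

```python
import time


class CircuitBreaker:
    """Toy breaker (hypothetical, not Hystrix): opens after consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after_s=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.consecutive_failures = 0
        self.opened_at = None  # None means the circuit is closed
        self.stats = {"success": 0, "failure": 0, "short_circuited": 0}

    def call(self, primary, fallback):
        # While open, skip the dependency entirely until the cool-down elapses.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                self.stats["short_circuited"] += 1
                return fallback()
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = primary()
        except Exception:
            self.stats["failure"] += 1
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            return fallback()
        # Success closes the circuit and resets the failure count.
        self.stats["success"] += 1
        self.consecutive_failures = 0
        self.opened_at = None
        return result
```

Nothing here needs a human to turn a knob at runtime: the thresholds are set at design time, and the breaker opens and closes on its own.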
Your Failure Handling Will Fail
...unless you test it regularly
Do your fallbacks work?
Do they trigger before your servers overload?
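One way to check the first question regularly is to inject failures on purpose and confirm that every request still gets an answer. The sketch below is an assumption-laden illustration, not Netflix's tooling: `FaultInjector`, `get_recs`, and the static fallback list are hypothetical names. It does not answer the second question, which needs a load test rather than a unit-level check.

```python
import random


class FaultInjector:
    """Wraps a dependency call and fails a configurable fraction of requests."""

    def __init__(self, failure_rate, rng=None):
        self.failure_rate = failure_rate
        self.rng = rng or random.Random(0)  # seeded so runs are repeatable

    def wrap(self, func):
        def wrapped(*args, **kwargs):
            if self.rng.random() < self.failure_rate:
                raise ConnectionError("injected failure")
            return func(*args, **kwargs)
        return wrapped


def get_recs(user_id, fetch_personalized):
    """Return personalized recs, or a generic list if the dependency fails."""
    try:
        return fetch_personalized(user_id)
    except ConnectionError:
        return ["popular-1", "popular-2"]  # hypothetical static fallback


def exercise_fallback(requests=100):
    """Fail every dependency call and confirm each request gets the fallback."""
    injector = FaultInjector(failure_rate=1.0)
    flaky_fetch = injector.wrap(lambda user_id: [f"rec-for-{user_id}"])
    responses = [get_recs(u, flaky_fetch) for u in range(requests)]
    return all(r == ["popular-1", "popular-2"] for r in responses)
```

Running `exercise_fallback()` on a schedule, rather than once at launch, is what keeps the fallback path from rotting.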
Retries Seem Simple
Effects compound
Pretty easy to DDoS yourself
Wasted server work on timed-out clients
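The compounding is easy to see with arithmetic: if each layer of a call chain makes up to n attempts, the deepest layer can receive n to the power of the number of layers. The sketch below shows that calculation plus one possible mitigation, a retry helper with jittered backoff that stops once the caller's deadline has passed. The function names and defaults are assumptions for illustration, not taken from the talk.

```python
import random
import time


def worst_case_calls(attempts_per_layer, layers):
    """Calls reaching the deepest layer if every layer exhausts all its attempts."""
    return attempts_per_layer ** layers


def call_with_retries(func, deadline, max_attempts=3, base_delay_s=0.05,
                      clock=time.monotonic, sleep=time.sleep, rng=random.random):
    """Retry func with jittered exponential backoff, but never past the deadline."""
    last_error = None
    for attempt in range(max_attempts):
        if clock() >= deadline:
            break  # the caller has already given up; more work would be wasted
        try:
            return func()
        except Exception as exc:
            last_error = exc
        if attempt + 1 < max_attempts:
            # Full jitter: sleep a random amount up to the exponential cap,
            # and never sleep past the deadline.
            delay = rng() * base_delay_s * (2 ** attempt)
            sleep(max(0.0, min(delay, deadline - clock())))
    raise TimeoutError("gave up before succeeding") from last_error
```

For example, `worst_case_calls(3, 3)` is 27: three layers that each try up to three times can turn one user request into 27 calls on the bottom service.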
Primary / Secondary Model
Primary Secondary
2-week rotation = 1 week as secondary, then 1 week as primary
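A rotation like this can be generated mechanically. The sketch below assumes a simple round-robin over a team list, with hypothetical names; the talk does not say how the schedule was actually produced.

```python
def rotation(team, weeks):
    """Return (week, primary, secondary) tuples; each secondary is primary next week."""
    schedule = []
    for week in range(weeks):
        primary = team[week % len(team)]
        secondary = team[(week + 1) % len(team)]
        schedule.append((week + 1, primary, secondary))
    return schedule


# Hypothetical three-person team over three weeks.
for week, primary, secondary in rotation(["ana", "bo", "cy"], 3):
    print(f"week {week}: primary={primary} secondary={secondary}")
```

Shadowing as secondary first means each engineer has a week of context on current issues before taking the primary pager.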
Team Shapes the Roadmap
Listen to the team; they know where work is needed
Priorities
1. Production healthy
2. Company (launch new market)
3. Biz As Usual (A/B tests, chaos testing)
4. Elective (FIT)
“...judgement is the solution for almost every ambiguous problem. Not process.”
- John Ciancutti (former Netflix eng)