Moving out of the Garage
Scaling for Startups
AKA why scaling is fun
Why are you here?
• I want to conquer an incredibly technically challenging problem that no one has solved
  – Others have attempted to solve it and failed (or succeeded but I can do it better)
  – I see something that no one has tried to solve
• I want to build a business, possibly based on technology, that fills a gap in the market
  – Technology is core, product is centric
• I want to leverage technology to build a non-tech-based, revenue-generating business
  – Technology is awesome, and it is going to support my business.
The Path
Day 1: aka the last easy day of your life (or at least for a while)
G2
Day X
myspace.com: a place for moving
0-20M active users in 2 years
0-100M active users in 4 years
Laying Foundations
• Lay groundwork to prepare for 3 critical growth phases
  – Scaling the technology
  – Scaling the team
  – Scaling the revenues/efficiency
• Decisions made in the first 90 days create lasting impressions that can be felt for years
Scaling the Technology
• Scale up vs. scale out is no longer a question
  – Unless you just founded a bank, don’t scale up
• Partition Data
  – Decide early on, and make the right decision
    • Range Based
    • Mod Based
    • Mod’ed Ranges (if I had to do it over)
• Concentrate on write management
Scaling the Technology
• Don’t solve problems you don’t have
  – Reuse as many existing solutions as possible
  – What are your goals?
    • Make $$?
    • Build Cool Technology?
    • Both?
• Fail Fast
  – Admit failure
• Don’t Double Down on bad decisions
  – Walk away from failure
Infrastructure Decisions
Leverage existing publicly available scaling solutions
  – Replication
  – Sharding
  – Memcache
  – Hardware load balancers
  – NAS/SAN
Leverage public solutions when possible; when not, develop proprietary internal scaling solutions
  – Myspace DFS
  – MyCache
  – Transaction Manager
  – Dspace map/reduce
Leverage public and internal solutions
  – Without negatively impacting developer productivity
  – Without wasting time
  – Without wasting money
Easiest
Harder
Most Difficult
Decouple the User from the Authoritative Disks
[Diagram labels: Reading, Writing, Cache, Queue, Authoritative Primary Arrays, Overflow, Overflow, Relational Data Store, Flat Data Store, SAN/NAS Virtualization Layer]
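
One way to read the labels on this slide is that reads are served from the cache and writes are accepted into a queue before they reach the authoritative arrays. That flow is an interpretation; the slide does not spell it out. A minimal Python sketch under that assumption follows. `AuthoritativeStore` and `DecoupledFrontEnd` are hypothetical names, and plain in-memory structures stand in for the real cache, queue, and disks.

```python
from collections import deque


class AuthoritativeStore:
    """Hypothetical stand-in for the authoritative primary arrays."""

    def __init__(self):
        self.rows = {}
        self.reads = 0  # counts how often a read actually hits the "disks"

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


class DecoupledFrontEnd:
    """Serves user reads from a cache and accepts user writes into a queue."""

    def __init__(self, store):
        self.store = store
        self.cache = {}       # assumption: a dict stands in for the cache tier
        self.queue = deque()  # assumption: a deque stands in for the write queue

    def read(self, key):
        # Cache hit: the authoritative store is never touched.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall back to the store and remember the answer.
        value = self.store.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # The user's request completes once the write is queued and cached.
        self.cache[key] = value
        self.queue.append((key, value))

    def drain(self):
        # A background worker would apply queued writes at the disks' own pace.
        applied = 0
        while self.queue:
            key, value = self.queue.popleft()
            self.store.put(key, value)
            applied += 1
        return applied
```

In this sketch a user can write and immediately read back a value without the authoritative store being touched; the store only sees the write when `drain()` runs.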
Range Partitions
Users 0-1 Million, Users 1-2 Million, Users 2-3 Million
New User Pipe
• Infinitely Scalable
• Newest Ranges Create Hot Spots
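
A minimal Python sketch of range partitioning, assuming sequential integer user IDs and the one-million-user ranges shown on the slide. The function names are hypothetical. It also illustrates why the newest range becomes a hot spot: every new signup lands on the same partition.

```python
from collections import Counter

RANGE_SIZE = 1_000_000  # users per partition, as on the slide


def range_partition(user_id, range_size=RANGE_SIZE):
    """Map a user ID to the index of the range that contains it."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return user_id // range_size


def signup_load(new_user_ids, range_size=RANGE_SIZE):
    """Count how many new signups each partition receives."""
    return Counter(range_partition(u, range_size) for u in new_user_ids)


# Ten consecutive signups all hit partition 2: the newest range is the hot spot.
newest = signup_load(range(2_000_000, 2_000_010))
```

Growing the system only means appending another range, which is why the scheme scales indefinitely while the load stays uneven.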
Mod Partitions
Mod 1, Mod 2, Mod 3
New User Pipe
• Eliminates Hot Spots
• Difficult to add new hardware
• Scalable only to a certain point
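
A minimal Python sketch of mod partitioning, assuming integer user IDs. The function names are hypothetical, and partitions are numbered from 0 here although the slide labels them from 1. It shows both sides of the trade-off: new signups spread evenly, but changing the partition count remaps most existing users.

```python
from collections import Counter


def mod_partition(user_id, num_partitions):
    """Map a user ID to a partition by taking it modulo the partition count."""
    return user_id % num_partitions


def moved_fraction(user_ids, old_count, new_count):
    """Fraction of users whose partition changes when the count changes."""
    ids = list(user_ids)
    moved = sum(
        1 for u in ids
        if mod_partition(u, old_count) != mod_partition(u, new_count)
    )
    return moved / len(ids)


# Nine consecutive signups spread evenly over three partitions: no hot spot.
spread = Counter(mod_partition(u, 3) for u in range(2_000_000, 2_000_009))

# Going from 3 to 4 partitions relocates 9 of the first 12 users.
fraction = moved_fraction(range(12), 3, 4)
```

The high `fraction` is what makes adding hardware difficult: growing the pool means migrating most of the existing data.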
Mod/Range Combo Partitioning
New User Pipe
Users 0-1 Million / Mod 1
Users 0-1 Million / Mod 2
Users 0-1 Million / Mod 3
Users 1-2 Million / Mod 1
Users 1-2 Million / Mod 2
Users 1-2 Million / Mod 3
• Eliminates Hot Spots
• Infinitely Scalable
• Adding additional hardware is easy
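
A minimal Python sketch of the combined scheme, assuming integer user IDs, one-million-user ranges, and a fixed number of mod buckets chosen when each range is created. `ModRangeRouter` is a hypothetical name, and indices are 0-based here although the slide labels them from 1.

```python
class ModRangeRouter:
    """Routes a user to (range index, mod bucket within that range)."""

    def __init__(self, range_size=1_000_000):
        self.range_size = range_size
        self.mods_per_range = []  # bucket count for each range, in creation order

    def add_range(self, mod_count):
        """Open a new range backed by `mod_count` buckets (new hardware)."""
        if mod_count < 1:
            raise ValueError("mod_count must be at least 1")
        self.mods_per_range.append(mod_count)
        return len(self.mods_per_range) - 1

    def locate(self, user_id):
        """Return (range_index, mod_index) for a user ID."""
        range_index = user_id // self.range_size
        if not 0 <= range_index < len(self.mods_per_range):
            raise KeyError(f"no range provisioned for user {user_id}")
        return range_index, user_id % self.mods_per_range[range_index]


router = ModRangeRouter()
router.add_range(3)                 # users 0-1 million over 3 buckets
before = router.locate(999_999)
router.add_range(3)                 # new hardware for users 1-2 million
after = router.locate(999_999)      # existing users do not move
newcomer = router.locate(1_000_001)
```

New signups within a range spread across its buckets, and adding a range never changes where an existing user lives, which is how the combination earns all three bullets above.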
Scaling the Organization
• The first 25 people you hire will define the success of your company
  – Don’t hire fast, hire smart
• Manage your burn, not your timeframe
  – Do you have competitors trying to do the same?
  – Are you second to market?
    • Sprint
  – Is it new? Is no one else thinking about this?
    • Marathon
• Be smart about your stealth phase
  – Countless failures from coming out too early
  – Countless failures from coming out too late
We Want To Code
Scaling the Organization
• #1 Priority – Minimize ramp time
  – Counterpart to technology’s “fail fast”
• Abstract core technologies from front end development groups
  – Data Access Layer
  – Cache
  – Queues
  – Etc.
• Create vertical product partitions with horizontal skillset partitions
Scaling Profit
• Technology is a cost center
  – Manage profit by managing expenses
• Bucket Scaling Model
• Calculate yearly cost of user
  – Inverse LUV
• Use commodity gear
  – No SAN/NAS unless absolutely necessary
• Leverage CDN
• The Cloud?
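
A minimal Python sketch of the "yearly cost of user" calculation, assuming it means total yearly technology spend divided by active users. The slide does not give a formula. The function name, the cost categories, and every figure below are hypothetical, and "Inverse LUV" is not defined on the slide, so it is not modeled here.

```python
def yearly_cost_per_user(yearly_costs, active_users):
    """Divide total yearly technology spend by the number of active users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return sum(yearly_costs.values()) / active_users


# Hypothetical yearly spend in dollars, for illustration only.
example_costs = {"servers": 600_000, "bandwidth": 300_000, "staff": 100_000}
cost = yearly_cost_per_user(example_costs, active_users=2_000_000)
```

Tracking this number over time shows whether each new block of users is getting cheaper or more expensive to serve.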
True? False?
The Cloud
• I love a good buzzword
  – Cloud Computing
  – Economies of scale?
  – Little to no SLA. Now you own my data
• Consumers eat the cloud
  – Email (circa…how long ago?)
  – Photos
  – Interests
• The cloud has existed for consumers for the last 15 years
• As a business, unless you are doing something that requires huge volatile processing power, rent your servers.
• Don’t handshake your data
  – Your business is your data
• Build your own cloud
  – GlusterFS
  – MaxiScale
Questions?