Distributed Computing Economics
Jim Gray
Microsoft Research
[email protected]
Presentation to Microsoft Venture Capital Summit
28 April 2004
Distributed Computing Economics
Why is Seti@Home a great idea?
Why is Napster a great deal?
Why is the Computational Grid uneconomic?
When does computing on demand work?
What is the “right” level of abstraction?
Is the Access Grid the real killer app?
Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24
http://research.microsoft.com/research/pubs/view.aspx?tr_id=655
Computing Is Free
Computers cost 1k$ (if you shop right) (yes, there are 1μ$ to 1M$ computers, but..)
So 1 cpu day = 1$ (computers last 3 years)
If you pay the phone bill, internet bandwidth costs 50…500$/Mbps/month (not including routers and management)
So 1 GB costs 1$ to send and 1$ to receive
Caveat: All numbers rounded to nearest factor of 3.
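The two rules of thumb on this slide follow from simple arithmetic. A minimal sketch, assuming 365-day years, 30-day months, decimal units, and a fully used link (these assumptions and the function names are illustrative, not from the talk):

```python
# Back-of-envelope check of the slide's rules of thumb.
# Assumptions (not from the talk): 365-day years, 30-day months,
# 1 GB = 1e9 bytes, and a link that is busy the whole month.

COMPUTER_PRICE_USD = 1_000.0   # "computers cost 1k$"
LIFETIME_YEARS = 3             # "computers last 3 years"


def cpu_day_cost(price_usd=COMPUTER_PRICE_USD, years=LIFETIME_YEARS):
    """Capital cost of one day of a computer, depreciated linearly."""
    return price_usd / (years * 365)


def cost_per_gb(usd_per_mbps_month):
    """Cost of moving 1 GB over a link billed per Mbps per month."""
    seconds_per_month = 30 * 24 * 3600
    bytes_per_month = 1e6 / 8 * seconds_per_month  # one Mbps, fully used
    gb_per_month = bytes_per_month / 1e9
    return usd_per_mbps_month / gb_per_month


if __name__ == "__main__":
    print(f"1 cpu day ~ {cpu_day_cost():.2f} $")
    for rate in (50, 500):
        print(f"{rate:>3} $/Mbps/month -> {cost_per_gb(rate):.2f} $/GB")
```

Under these assumptions a cpu day costs about 0.9$ and a gigabyte costs roughly 0.15$ to 1.5$, which the slide rounds to 1$ each.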
Why Is Seti@Home A Good Deal?
Send 300 KB: Costs 3e-4$
User computes for ½ day: Benefit 5e-1$
ROI: 1500:1
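The ratio can be reproduced from the deck's round prices. A small sketch, assuming 1$ per GB of network and 1$ per cpu day (the function name `roi` is illustrative):

```python
# Assumes the deck's round prices: 1 $/GB of network, 1 $/cpu-day.
USD_PER_GB = 1.0
USD_PER_CPU_DAY = 1.0


def roi(bytes_sent, cpu_days_donated):
    """Value of the donated compute divided by the cost of shipping the work unit."""
    send_cost = bytes_sent / 1e9 * USD_PER_GB
    benefit = cpu_days_donated * USD_PER_CPU_DAY
    return benefit / send_cost


if __name__ == "__main__":
    # 300 KB work unit, half a day of donated cpu time
    print(f"ROI ~ {roi(300e3, 0.5):.0f}:1")
```

The unrounded ratio is about 1,700:1, consistent with the slide's 1500:1 given that all inputs are rounded.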
Seti@Home
The world's most powerful computer
67 TF is sum of top 4 of Top 500
67 TF is 9x the number 2 system
67 TF is more than the sum of systems 2...10
Seti@Home
http://setiathome.ssl.berkeley.edu/totals.html
26 April 2004

                            Total                          Last 24 Hours
Users                       5 M                            1,138
Results received            1.3 B                          1.5 M
Total CPU time              1.5 M years                    1,199 years
Floating Point Operations   5 E+21 flops (5 zetta flops)   6 E+18 FLOPS/day (67 TeraFLOPs)
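The daily totals translate into a sustained rate and an equivalent machine count. A rough sketch (function names are illustrative; the inputs are the rounded figures from the table, so the result only approximates the quoted 67 TeraFLOPs):

```python
SECONDS_PER_DAY = 24 * 3600


def sustained_tflops(flop_per_day):
    """Average floating point rate, in TeraFLOPS, implied by a daily total."""
    return flop_per_day / SECONDS_PER_DAY / 1e12


def equivalent_full_time_cpus(cpu_years_per_day):
    """Number of cpus running flat out needed to log this much cpu time in one day."""
    return cpu_years_per_day * 365


if __name__ == "__main__":
    print(f"{sustained_tflops(6e18):.0f} TFLOPS sustained")
    print(f"{equivalent_full_time_cpus(1199):,.0f} full-time cpus")
```

With these inputs the rate comes out near 69 TFLOPS, and the 1,199 cpu years logged in a day correspond to roughly 440,000 machines working full time.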
Why Was Napster A Good Deal?
Send 5 MB costs 5e-3$: ½ a penny per song
Both sender and receiver can afford it
Same logic powers web sites (Yahoo!...)
1e-3$/page view advertising revenue
1e-5$/page view cost of serving web page
100:1 ROI
Computing Equivalents
1$ buys
1 day of cpu time
4 GB (fast) ram for a day
1 GB of network bandwidth
1 GB of disk storage for 3 years
10 M database accesses
10 TB of disk access (sequential)
10 TB of LAN bandwidth (bulk)
10 KWhrs == 4 days of computer time
Depreciating over 3 years, and there are about 1k days in 3 years.
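A few implied figures can be read off this price list. A minimal sketch, assuming decimal units; the dictionary keys and function names are illustrative:

```python
# What 1 $ buys, copied from the slide (decimal units assumed).
ONE_DOLLAR_BUYS = {
    "wan_bytes": 1e9,               # 1 GB of network bandwidth
    "lan_bytes": 10e12,             # 10 TB of LAN bandwidth (bulk)
    "kwh": 10.0,                    # 10 KWhrs
    "computer_days_of_power": 4.0,  # "== 4 days of computer time"
}


def lan_vs_wan_ratio(prices=ONE_DOLLAR_BUYS):
    """How many times more bytes a dollar moves on the LAN than on the WAN."""
    return prices["lan_bytes"] / prices["wan_bytes"]


def implied_watts(prices=ONE_DOLLAR_BUYS):
    """Average power draw of one computer implied by the electricity line."""
    hours = prices["computer_days_of_power"] * 24
    return prices["kwh"] * 1000 / hours


def implied_usd_per_kwh(prices=ONE_DOLLAR_BUYS):
    """Electricity price implied by '1 $ buys 10 KWhrs'."""
    return 1.0 / prices["kwh"]


if __name__ == "__main__":
    print(f"LAN is {lan_vs_wan_ratio():,.0f}x cheaper than WAN per byte")
    print(f"Implied draw: {implied_watts():.0f} W at {implied_usd_per_kwh():.2f} $/kWh")
```

The list implies a 10,000x gap between LAN and WAN prices, a computer drawing about 100 W, and electricity at 0.10$ per kWh.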
Some Consequences
Beowulf networking is 10,000x cheaper than WAN networking: factors of 10^5 matter
The cheapest and fastest way to move Terabytes cross country is sneakernet
24 hours = 4 MB/s
50$ shipping vs 1,000$ wan cost
Sending 10PB CERN data via network is silly: buy disk bricks in Geneva, fill them, ship them
TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange
Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg
Microsoft Technical Report May 2002, MSR-TR-2002-54
http://research.microsoft.com/research/pubs/view.aspx?tr_id=569
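The WAN figures behind the sneakernet argument can be checked with the deck's 1$/GB price. A rough sketch; the 1 Gbps dedicated link is a hypothetical chosen only for illustration, and the function names are not from the talk:

```python
USD_PER_GB_WAN = 1.0  # the deck's round WAN price


def wan_cost_usd(n_bytes):
    """Cost of sending n_bytes over the WAN at the deck's 1 $/GB."""
    return n_bytes / 1e9 * USD_PER_GB_WAN


def transfer_days(n_bytes, link_bits_per_s):
    """Days needed to push n_bytes through a fully used link."""
    return n_bytes * 8 / link_bits_per_s / 86400


if __name__ == "__main__":
    print(f"1 TB over the WAN:  {wan_cost_usd(1e12):,.0f} $ (vs 50 $ shipping)")
    print(f"10 PB over the WAN: {wan_cost_usd(10e15):,.0f} $")
    # Hypothetical 1 Gbps dedicated link, an assumption for illustration only
    print(f"10 PB at 1 Gbps:    {transfer_days(10e15, 1e9):,.0f} days")
```

At these prices a terabyte costs 1,000$ to send, 10 PB costs about 10 M$, and even a saturated 1 Gbps link would need about two and a half years to carry it.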
Computational Grid Economics
To the extent that the computational grid is like Seti@Home or ZetaNet or Folding@home or… it is a great thing
To the extent that the computational grid is MPI or data analysis, it fails on economic grounds: move the programs to the data, not the data to the programs
The Internet is not the cpu backplane
An alternate reality: Nearly free networking
Telcos go bankrupt and price=cost=0
Taxpayers pay your phone bill so price=0 and telcos receive a BIG government subsidy
When To Export A Task
IF instruction density > 100,000 instructions/byte
AND remote computer is free (costs you nothing)
THEN ROI > 0
ELSE ROI < 0
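The rule above can be written as a small predicate. A minimal sketch: the threshold is the one stated on the slide, while the 1e9 instructions/second cpu speed used to motivate it is a hypothetical round figure, and all names are illustrative:

```python
BREAK_EVEN_INSTRUCTIONS_PER_BYTE = 100_000  # threshold stated on the slide


def break_even_estimate(instr_per_second=1e9):
    """Instructions per byte at which 1 $ of cpu matches 1 $ of WAN traffic.

    Assumes 1 $ = 1 cpu day = 1 GB (the deck's round prices) and a
    hypothetical cpu executing about 1e9 instructions per second.
    """
    instructions_per_dollar = instr_per_second * 86400
    bytes_per_dollar = 1e9
    return instructions_per_dollar / bytes_per_dollar


def should_export(instructions, bytes_moved, remote_is_free):
    """Apply the slide's IF/AND/THEN rule to one task."""
    if not remote_is_free:
        return False
    if bytes_moved <= 0:
        return True  # nothing to ship, so free remote cycles are pure gain
    density = instructions / bytes_moved
    return density > BREAK_EVEN_INSTRUCTIONS_PER_BYTE


if __name__ == "__main__":
    print(f"break-even ~ {break_even_estimate():,.0f} instructions/byte")
    # Seti@Home-like task: half a cpu day on a 300 KB work unit
    print(should_export(0.5 * 86400 * 1e9, 300e3, remote_is_free=True))
    # Data-analysis-like task: hypothetical 1,000 instructions per byte over 1 GB
    print(should_export(1_000 * 1e9, 1e9, remote_is_free=True))
```

Under the stated assumptions the break-even lands near 86,000 instructions/byte, which rounds to the slide's 100,000. The Seti@Home-like task clears it by three orders of magnitude; the data-analysis-like task does not.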
Computing On Demand
Was called outsourcing/service bureaus in my youth. CSC and IBM did it
It is not a new way of doing things: think payroll. Payroll is a standard outsourced service
Now Hotmail, Salesforce.com, Oracle.com,…
Works for standard apps
COD works for commoditized services
Airlines outsource reservations. Banks outsource ATMs
But Amazon, Amex, Wal-Mart, eTrade, eBay... can’t outsource their core competence
What’s The Right Abstraction Level For Internet Scale Distributed Computing?
Disk block? No, too low
File? No, too low
Database? No, too low
Application? Yes, of course
Blast search
Google search
Send/Get eMail
Portals that federate astronomy archives (http://skyQuery.Net/)
Web Services (.NET, EJB, OGSA) give this abstraction level
Access Grid
Q: What comes after the telephone?
A: eMail?
A: Instant messaging?
Both seem retro: text & emoticons
Access Grid could revolutionize human communication
But, it needs a new idea
Q: What comes after the telephone?
Supercomputers You Use
Hotmail, Yahoo!, Google: ~10k servers
Amazon, Barnes&Noble
Expedia, Orbitz
Dell, HP,…
Service-oriented architectures
Not computing on demand, but information on demand!
Distributed Computing Economics
Why is Seti@Home a great idea?
Why is Napster a great deal?
Why is the Computational Grid uneconomic?
When does computing on demand work?
What is the “right” level of abstraction?
Is the Access Grid the real killer app?
Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24
http://research.microsoft.com/research/pubs/view.aspx?tr_id=655
Poll
Is there a market for Supercomputers?
Yes, Google, Expedia, Hotmail,…
Is Computing On Demand a high-margin business?
I think not
Do you know the equivalent high-margin business?
Information on demand
Take Aways
Computing on demand is a service business; probably not high margin; questionable economics; think LoudCloud
Distributed computing is coming, but it is probably via Service Oriented Architecture (SOA)
Web Services is the way to do SOA
Outline
Overview of Microsoft Research
Distributed Computing Economics
Q&A
The Cost Of Computing
Computers are NOT free!
IBM, HP, Dell make $billions
Capital Cost of a TpcC system is mostly storage and storage software (database)
IBM 32 cpu, 512 GB ram
2,500 disks, 43 TB
(680,613 tpmC @ 11.13 $/tpmC available 11/08/03)
http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf
A 7.5M$ super-computer
Total Data Center Cost: 40% capital & facilities, 60% staff (includes app development)
TpcC Cost Components DB2/AIX
http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf
software   10%
storage    61%
cpu/mem    29%
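The 7.5M$ figure and the dollar size of each component follow from the quoted benchmark numbers. A small sketch using only the values on this slide (names are illustrative):

```python
TPMC = 680_613          # throughput quoted on the slide
USD_PER_TPMC = 11.13    # price/performance quoted on the slide
COST_SHARES = {"software": 0.10, "storage": 0.61, "cpu/mem": 0.29}


def system_price():
    """Total system price implied by throughput times price/performance."""
    return TPMC * USD_PER_TPMC


def component_costs():
    """Split the total price by the chart's percentage shares."""
    total = system_price()
    return {name: total * share for name, share in COST_SHARES.items()}


if __name__ == "__main__":
    print(f"system price ~ {system_price() / 1e6:.2f} M$")
    for name, usd in component_costs().items():
        print(f"{name:<9} {usd / 1e6:.2f} M$")
```

Storage alone accounts for roughly 4.6 M$ of the total, which is the slide's point that the capital cost is mostly storage.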