Distributed Computing Economics

Jim Gray
Microsoft Research
Presentation to Microsoft Venture Capital Summit
28 April 2004

Outline

Overview of Microsoft Research

Distributed Computing Economics

Q&A

Microsoft Research

Founded in 1991

Staff of over 700 in over 55 areas

Internationally recognized research teams

Lab locations
  Redmond, Washington, USA: 75%
  Cambridge, United Kingdom: 10%
  Beijing, People’s Republic of China: 10%
  Mountain View, California, USA: 5%
  San Francisco, California, USA: 1%

Approach: Adapt the Academic Model

Organizational goal: Advance state of the art

University organizational model
  Flat structure, critical mass groups

Open research environment
  Aggressive publication in peer-reviewed literature
  Frequent visitors, daily seminars

Strong ties to University Research
  Nearly 15% of basic research budget directly invested in Universities
  Lab grants, research grants, fellowships, etc.
  Hundreds of interns and visitors

My Research Agenda

Scalable Servers
  TerraServer – US map online
  SkyServer – All astronomy data online

Databases
  Advancing databases and data storage

Media Management
  Organizing your digital shoebox

Outline

Overview of Microsoft Research

Distributed Computing Economics

Q&A

Distributed Computing Economics

Why is Seti@Home a great idea?

Why is Napster a great deal?

Why is the Computational Grid uneconomic?

When does computing on demand work?

What is the “right” level of abstraction?

Is the Access Grid the real killer app?

Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24

http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

Poll

Is there a market for Supercomputers?

Is Computing On Demand a high-margin business?

Do you know the equivalent high-margin business?

Computing Is Free

Computers cost 1k$ (if you shop right) (yes, there are 1μ$ to 1M$ computers, but..)

So 1 cpu day = 1$ (computers last 3 years)

If you pay the phone bill, internet bandwidth costs 50…500$/Mbps/month (not including routers and management)

So 1 GB costs 1$ to send and 1$ to receive

Caveat: All numbers rounded to nearest factor of 3.
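
The two headline prices can be re-derived from the slide's inputs. The following is a rough sketch only: the 30-day month, the decimal gigabyte, the fully used link, and the function names are assumptions made for illustration, not figures from the talk.

```python
# Back-of-envelope check of the "Computing Is Free" prices.
# Assumptions (not from the slide): 30-day months, 1 GB = 1e9 bytes,
# and a rented link that is kept completely busy.

SECONDS_PER_MONTH = 30 * 24 * 3600


def cpu_day_cost(computer_price_usd: float = 1000.0, lifetime_years: float = 3.0) -> float:
    """Dollars per CPU-day when the purchase price is spread over the lifetime."""
    return computer_price_usd / (lifetime_years * 365)


def wan_cost_per_gb(usd_per_mbps_month: float) -> float:
    """Dollars per GB moved over a link rented at the given $/Mbps/month."""
    gb_per_month = 1e6 * SECONDS_PER_MONTH / 8 / 1e9  # GB carried by 1 Mbps in a month
    return usd_per_mbps_month / gb_per_month


if __name__ == "__main__":
    print(f"CPU-day: ${cpu_day_cost():.2f}")
    for rate in (50, 500):
        print(f"{rate} $/Mbps/month -> ${wan_cost_per_gb(rate):.2f}/GB")
```

Under these assumptions a CPU-day comes out a little under 1$, and the bandwidth range works out to roughly 0.15$ to 1.5$ per GB, the same order of magnitude as the slide's 1$/GB once the links are not perfectly utilized.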

Why Is Seti@Home A Good Deal?

Send 300 KB: costs 3e-4$

User computes for ½ day: benefit 5e-1$

ROI: 1500:1
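
The ratio follows from the deck's two unit prices (1$ per GB sent, 1$ per CPU-day). A minimal sketch, with `export_roi` as a hypothetical helper name:

```python
# ROI of shipping a work unit to a volunteer, using the deck's unit prices.
WAN_USD_PER_GB = 1.0    # "1 GB costs 1$ to send"
CPU_USD_PER_DAY = 1.0   # "1 cpu day = 1$"


def export_roi(bytes_sent: float, cpu_days_donated: float) -> float:
    """Value of the donated CPU time divided by the network cost of the work unit."""
    network_cost = bytes_sent / 1e9 * WAN_USD_PER_GB
    compute_value = cpu_days_donated * CPU_USD_PER_DAY
    return compute_value / network_cost


if __name__ == "__main__":
    roi = export_roi(bytes_sent=300e3, cpu_days_donated=0.5)
    print(f"ROI about {roi:,.0f}:1")
```

With 300 KB sent and half a CPU-day returned, the ratio lands between 1,600:1 and 1,700:1, consistent with the slide's 1500:1 given that all numbers are rounded to the nearest factor of 3.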

Seti@Home: The world's most powerful computer

67 TF is the sum of the top 4 of the Top 500

67 TF is 9x the number 2 system

67 TF is more than the sum of systems 2...10

Seti@Home
http://setiathome.ssl.berkeley.edu/totals.html

26 April 2004

                             Total                           Last 24 hours
Users                        5 M                             1,138
Results received             1.3 B                           1.5 M
Total CPU time               1.5 M years                     1,199 years
Floating point operations    5 E+21 flops (5 zetta flops)    6 E+18 FLOPS/day (67 TeraFLOPs)
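
The daily totals can be turned into sustained rates with two small conversions. This is a sketch under stated assumptions: a 365-day year, and reported CPU time treated as equal to wall-clock time on one dedicated CPU. The function names are illustrative.

```python
# Convert Seti@Home daily totals into sustained rates.
# Assumptions (not from the slide): 365-day years; one CPU-year of reported
# time corresponds to one CPU busy for a year.

SECONDS_PER_DAY = 24 * 3600


def sustained_teraflops(flop_per_day: float) -> float:
    """Average TeraFLOPS implied by a number of floating point operations per day."""
    return flop_per_day / SECONDS_PER_DAY / 1e12


def average_active_cpus(cpu_years_per_day: float) -> float:
    """CPUs that must be busy all day to log this many CPU-years in 24 hours."""
    return cpu_years_per_day * 365


if __name__ == "__main__":
    print(f"{sustained_teraflops(6e18):.0f} TeraFLOPS sustained")
    print(f"{average_active_cpus(1199):,.0f} CPUs busy on average")
```

The rounded 6 E+18 figure gives a little under 70 TeraFLOPS, in line with the 67 TeraFLOPs on the slide, and 1,199 CPU-years per day corresponds to a few hundred thousand machines working at any moment.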

Why Was Napster A Good Deal?

Send 5 MB: costs 5e-3$ (½ a penny per song)

Both sender and receiver can afford it

Same logic powers web sites (Yahoo!...)
  1e-3$/page view advertising revenue
  1e-5$/page view cost of serving web page
  100:1 ROI

The Cost Of Computing

Computers are NOT free!

IBM, HP, Dell make $billions

Capital cost of a TpcC system is mostly storage and storage software (database)
  IBM 32 cpu, 512 GB ram, 2,500 disks, 43 TB
  (680,613 tpmC @ 11.13 $/tpmC, available 11/08/03)
  http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf

A 7.5M$ super-computer

Total Data Center Cost: 40% capital & facilities, 60% staff (includes app development)

[Chart: TpcC Cost Components, DB2/AIX. Software 10%, storage 61%, cpu/mem 29%. Source: http://www.tpc.org/results/individual_results/IBM/IBMp690es_05092003.pdf]

Computing Equivalents

1$ buys
  1 day of cpu time
  4 GB (fast) ram for a day
  1 GB of network bandwidth
  1 GB of disk storage for 3 years
  10 M database accesses
  10 TB of disk access (sequential)
  10 TB of LAN bandwidth (bulk)
  10 KWhrs == 4 days of computer time

Depreciating over 3 years, and there are about 1k days in 3 years.
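
Two of these equivalents imply figures that are easy to check. The sketch below assumes that "10 KWhrs == 4 days of computer time" means 10 kWh keep one computer running for four days; that reading and the function names are assumptions, not statements from the talk.

```python
# Implied figures behind the "Computing Equivalents" slide.
# Assumption: 10 kWh is the energy one computer uses in 4 days of running.


def days_in_years(years: float) -> float:
    """Calendar days in a number of 365-day years."""
    return years * 365


def implied_power_watts(kwh: float, days: float) -> float:
    """Average draw of a machine that uses `kwh` over `days` of continuous running."""
    return kwh * 1000 / (days * 24)


def implied_usd_per_kwh(usd: float, kwh: float) -> float:
    """Electricity price implied by buying `kwh` for `usd`."""
    return usd / kwh


if __name__ == "__main__":
    print(f"{days_in_years(3):.0f} days in 3 years")
    print(f"{implied_power_watts(10, 4):.0f} W average draw")
    print(f"${implied_usd_per_kwh(1, 10):.2f} per kWh")
```

Three years is a bit over a thousand days, which is why a 1k$ computer rounds to 1$ per day; the energy line implies a machine drawing on the order of 100 W and electricity at about 10 cents per kWh.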

Some Consequences

Beowulf networking is 10,000x cheaper than WAN networking
  factors of 10^5 matter

The cheapest and fastest way to move Terabytes cross country is sneakernet
  24 hours = 4 MB/s
  50$ shipping vs 1,000$ wan cost

Sending 10 PB CERN data via network is silly: buy disk bricks in Geneva, fill them, ship them

TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange
Jim Gray; Wyman Chong; Tom Barclay; Alex Szalay; Jan vandenBerg
Microsoft Technical Report, May 2002, MSR-TR-2002-54

http://research.microsoft.com/research/pubs/view.aspx?tr_id=569
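
The 50$ versus 1,000$ comparison is the 1$/GB WAN price applied to a terabyte. A minimal sketch, assuming a flat shipping fee that does not grow with the amount of data (true only while the disks fit in one package); `wan_cost_usd` and `cheaper_option` are hypothetical helpers:

```python
# Sneakernet versus WAN for bulk data, using the deck's 1$/GB WAN price.
# Assumption: shipping is a flat fee regardless of how much data is on the disks.


def wan_cost_usd(terabytes: float, usd_per_gb: float = 1.0) -> float:
    """Network cost of sending the given number of (decimal) terabytes."""
    return terabytes * 1000 * usd_per_gb


def cheaper_option(terabytes: float, shipping_usd: float = 50.0, usd_per_gb: float = 1.0) -> str:
    """Return 'sneakernet' when shipping disks costs less than the WAN transfer."""
    return "sneakernet" if shipping_usd < wan_cost_usd(terabytes, usd_per_gb) else "wan"


if __name__ == "__main__":
    for tb in (0.01, 1, 10):
        print(f"{tb} TB: WAN ${wan_cost_usd(tb):,.0f}, cheaper: {cheaper_option(tb)}")
```

With these prices the break-even point sits near 50 GB: below it the network wins, above it the courier does.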

How Do You Move A Terabyte?

Context       Speed (Mbps)   Rent ($/month)   $/Mbps   $/TB sent   Time/TB
Home phone    0.04           40               1,000    3,086       6 years
Home DSL      0.6            50               117      360         5 months
T1            1.5            1,200            800      2,469       2 months
T3            43             28,000           651      2,010       2 days
OC3           155            49,000           316      976         14 hours
100 Mbps      100            -                -        -           1 day
Gbps          1000           -                -        -           2.2 hours
OC 192        9600           1,920,000        200      617         14 minutes

Source: TeraScale Sneakernet, Microsoft Research, Jim Gray et al.
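
The Time/TB and $/TB columns can be recomputed from link speed and monthly rent. This is a rough sketch, assuming 1 TB = 1e12 bytes, a 30-day month, and a link that runs flat out for the whole transfer; none of these assumptions is stated on the slide, and the function names are illustrative.

```python
# Time and rental cost to push one terabyte through a leased link.
# Assumptions (not from the slide): 1 TB = 1e12 bytes, 30-day months,
# link fully utilized for the whole transfer.

TB_BITS = 8e12
SECONDS_PER_MONTH = 30 * 24 * 3600


def seconds_per_tb(speed_mbps: float) -> float:
    """Seconds needed to send one terabyte at the given speed."""
    return TB_BITS / (speed_mbps * 1e6)


def usd_per_tb(speed_mbps: float, rent_usd_per_month: float) -> float:
    """Rent paid while the link is busy moving one terabyte."""
    months = seconds_per_tb(speed_mbps) / SECONDS_PER_MONTH
    return rent_usd_per_month * months


if __name__ == "__main__":
    for name, mbps, rent in [("T1", 1.5, 1_200), ("T3", 43, 28_000), ("OC3", 155, 49_000)]:
        hours = seconds_per_tb(mbps) / 3600
        print(f"{name}: {hours:,.0f} hours, ${usd_per_tb(mbps, rent):,.0f} per TB")
```

Any other row can be checked the same way by passing its speed and rent.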

Computational Grid Economics

To the extent that the computational grid is like Seti@Home or ZetaNet or Folding@home or… it is a great thing

To the extent that the computational grid is MPI or data analysis, it fails on economic grounds: move the programs to the data, not the data to the programs

The Internet is not the cpu backplane

An alternate reality: Nearly free networking
  Telcos go bankrupt and price=cost=0
  Taxpayers pay your phone bill so price=0 and telcos receive a BIG government subsidy

When To Export A Task

IF instruction density > 100,000 instructions/byte
AND remote computer is free (costs you nothing)
THEN ROI > 0
ELSE ROI < 0
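
The rule translates directly into a predicate. The sketch below is illustrative only: the `Task` fields, the helper name, and the 1e9 instructions/second CPU used in the example are assumptions; the 100,000 instructions/byte threshold is the slide's.

```python
# The export rule from the slide as a predicate.
# The Task fields and the example workloads are hypothetical.
from dataclasses import dataclass

INSTRUCTIONS_PER_BYTE_THRESHOLD = 100_000  # threshold stated on the slide


@dataclass
class Task:
    instructions: float        # total instructions the task executes
    bytes_moved: float         # input plus output bytes crossing the WAN
    remote_cpu_is_free: bool   # True if the remote computer costs you nothing


def should_export(task: Task) -> bool:
    """True when the slide's rule predicts a positive ROI for exporting the task."""
    if task.bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    density = task.instructions / task.bytes_moved
    return task.remote_cpu_is_free and density > INSTRUCTIONS_PER_BYTE_THRESHOLD


if __name__ == "__main__":
    # Seti-like work unit: 300 KB shipped, half a day on an assumed 1e9 instr/s CPU.
    seti_like = Task(instructions=0.5 * 86400 * 1e9, bytes_moved=300e3, remote_cpu_is_free=True)
    # Data-analysis scan: 1 TB shipped, about 100 instructions per byte.
    scan = Task(instructions=1e14, bytes_moved=1e12, remote_cpu_is_free=True)
    print(should_export(seti_like), should_export(scan))
```

The Seti-like task clears the threshold by about three orders of magnitude, while the scan falls far below it, which is the slide's argument for moving programs to the data.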

Computing On Demand

Was called outsourcing/service bureaus in my youth. CSC and IBM did it

It is not a new way of doing things: think payroll. Payroll is a standard outsourced service

Now Hotmail, Salesforce.com, Oracle.com,…

Works for standard apps

COD works for commoditized services

Airlines outsource reservations. Banks outsource ATMs

But Amazon, Amex, Wal-Mart, eTrade, eBay... can’t outsource their core competence

What Do You Outsource?

Level           Example
Disk blocks?    Ø
Files?          Xdrive
SQL?            SkyServer
RPC?            TerraServer
Application?    AOL, Google, Hotmail, Yahoo!, ….

What’s The Right Abstraction Level For Internet Scale Distributed Computing?

Disk block? No, too low

File? No, too low

Database? No, too low

Application? Yes, of course
  Blast search
  Google search
  Send/Get eMail
  Portals that federate astronomy archives (http://skyQuery.Net/)

Web Services (.NET, EJB, OGSA) give this abstraction level

Access Grid

Q: What comes after the telephone?

A: eMail?

A: Instant messaging?

Both seem retro: text & emoticons

Access Grid could revolutionize human communication

But, it needs a new idea

Supercomputers You Use

Hotmail, Yahoo!, Google: ~10k servers

Amazon, Barnes&Noble

Expedia, Orbitz

Dell, HP,…

Service-oriented architectures

Not computing on demand, but information on demand!

Distributed Computing Economics

Why is Seti@Home a great idea?

Why is Napster a great deal?

Why is the Computational Grid uneconomic?

When does computing on demand work?

What is the “right” level of abstraction?

Is the Access Grid the real killer app?

Based on: Distributed Computing Economics, Jim Gray, Microsoft Tech report, March 2003, MSR-TR-2003-24

http://research.microsoft.com/research/pubs/view.aspx?tr_id=655

Poll

Is there a market for Supercomputers?
  Yes, Google, Expedia, Hotmail,…

Is Computing On Demand a high-margin business?
  I think not

Do you know the equivalent high-margin business?
  Information on demand

Take Aways

Computing on demand is a service business; probably not high margin; questionable economics; think LoudCloud

Distributed computing is coming, but it is probably via Service Oriented Architecture (SOA)

Web Services is the way to do SOA

Outline

Overview of Microsoft Research

Distributed Computing Economics

Q&A

© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.