The Future of Distributed Systems. Jim Gray, Researcher, Microsoft Corp. Gray@Microsoft.com


Post on 26-Mar-2015


Page 1: 1 The Future of Distributed Systems. Jim Gray Researcher Microsoft Corp. Gray@Microsoft.com

The Future of Distributed Systems

Jim Gray
Researcher
Microsoft Corp.
Gray@Microsoft.com

Page 2:

Outline

Global forces
    Moore’s, Metcalfe’s, Bell’s, Bill’s, Andy’s laws
    Micro dollars per transaction
    Cyber-content is key value because distribution costs go to zero
Distributed Systems Concepts and terms
Key software technologies
    objects, transactions

Page 3:

Metcalfe’s Law
Network Utility = Users^2

How many connections can it make?
    1 user: no utility
    100,000 users: a few contacts
    1 million users: many on Net
    1 billion users: everyone on Net
That is why the Internet is so “hot”
    Exponential benefit
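The rule of thumb above (utility = users squared) can be checked with a small calculation. This is a minimal sketch: the function names are illustrative and not from the talk, and counting distinct user pairs as n(n-1)/2 is one common reading of "how many connections can it make?", which grows in proportion to users squared for large n.

```python
def pairwise_connections(users: int) -> int:
    """Number of distinct user-to-user links among `users` participants."""
    return users * (users - 1) // 2


def metcalfe_utility(users: int) -> int:
    """Utility as stated on the slide: users squared."""
    return users ** 2


# The four population sizes listed on the slide.
for n in (1, 100_000, 1_000_000, 1_000_000_000):
    print(n, pairwise_connections(n), metcalfe_utility(n))
```

A single user has zero possible links, which matches the slide's "1 user: no utility".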

Page 4:

Moore’s First Law

XXX doubles every 18 months: 60% increase per year
    Micro processor speeds
    Chip density
    Magnetic disk density
    Communications bandwidth
        WAN bandwidth approaching LAN speeds
Exponential growth:
    The past does not matter
    10x here, 10x there, soon you’re talking REAL change
PC costs decline faster than any other platform
    Volume and learning curves
    PCs will be the building bricks of all future systems

[Figure: 1 chip memory size (2 MB to 32 MB), 1970 to 2000. Chip capacity in bits: 1K, 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 256M. Memory size labels: 8KB, 128KB, 1MB, 8MB, 128MB, 1GB]

Page 5:

Bumps In The Moore’s Law Road

DRAM:
    1988: United States anti-dumping rules
    1993-1995: ?price flat
Magnetic disk:
    1965-1989: 10x/decade
    1989-1996: 4x/3year! 100X/decade

[Figure: $/MB of DRAM, 1970 to 2000, log scale from 1 to 1,000,000]
[Figure: $/MB of DISK, 1970 to 2000, log scale from .01 to 10,000]

Page 6:

Gordon Bell’s Seven Price Tiers

$10: wrist watch computers
$100: pocket/palm computers
$1,000: portable computers
$10,000: personal computers (desktop)
$100,000: departmental computers (closet)
$1,000,000: site computers (glass house)
$10,000,000: regional computers (glass castle)

Super server: costs more than $100,000
“Mainframe”: costs more than $1 million
    Must be an array of processors, disks, tapes, comm ports

Page 7:

Bell’s Evolution Of Computer Classes

Technology enables two evolutionary paths:
1. constant performance, decreasing cost
2. constant price, increasing performance

[Figure: log price versus time for Mainframes (central), Minis (dep’t.), WSs, PCs (personals), ??]

1.26 = 2x/3 yrs -- 10x/decade; 1/1.26 = .8
1.6 = 4x/3 yrs -- 100x/decade; 1/1.6 = .62
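The annual factors quoted above follow from compound growth. The sketch below reproduces them; the function names are illustrative only. It shows that 2x every 3 years is about 1.26x per year and roughly 10x per decade, while 4x every 3 years (the same as doubling every 18 months) is about 1.59x per year, which the slides round to 1.6 and to "60% increase per year", and roughly 100x per decade.

```python
def annual_factor(growth: float, years: float) -> float:
    """Yearly multiplier equivalent to `growth`x over `years` years."""
    return growth ** (1.0 / years)


def decade_factor(growth: float, years: float) -> float:
    """Total multiplier after 10 years at the same compound rate."""
    return growth ** (10.0 / years)


slow = annual_factor(2, 3)  # 2x every 3 years
fast = annual_factor(4, 3)  # 4x every 3 years, i.e. doubling every 18 months

# Annual factor, decade factor, and the reciprocal (yearly price multiplier
# at constant performance).
print(round(slow, 2), round(decade_factor(2, 3), 1), round(1 / slow, 2))
print(round(fast, 2), round(decade_factor(4, 3), 1), round(1 / fast, 2))
```

The reciprocals correspond to the slide's .8 and .62 after rounding.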

Page 8:

Software Economics

An engineer costs about $150,000/year
R&D gets [5%…15%] of budget
Need [$3 million…$1 million] revenue per engineer

[Pie charts: how each company's revenue is divided]

Company     Revenue       R&D   SG&A   Product and Service   Tax   Profit
Microsoft   $9 billion    16%   34%    13%                   13%   24%
Intel       $16 billion   8%    11%    47%                   12%   22%
IBM         $72 billion   8%    22%    59%                   5%    6%
Oracle      $3 billion    9%    43%    26%                   7%    15%
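The revenue-per-engineer range above is the engineer cost divided by the R&D share of revenue. A minimal sketch of that arithmetic, with an illustrative function name; only the $150,000 cost and the 5% to 15% range come from the slide.

```python
ENGINEER_COST = 150_000  # dollars per year, from the slide


def revenue_per_engineer(rd_share: float, cost: float = ENGINEER_COST) -> float:
    """Revenue needed so that `rd_share` of it pays for one engineer."""
    return cost / rd_share


for share in (0.05, 0.15):
    print(f"R&D at {share:.0%}: ${revenue_per_engineer(share):,.0f} per engineer")
```

At 5% of budget the company needs about $3 million of revenue per engineer; at 15% it needs about $1 million.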

Page 9:

Software Economics: Bill’s Law

Bill Joy’s law (Sun): don’t write software for less than 100,000 platforms
    @$10 million engineering expense, $1,000 price
Bill Gates’s law: don’t write software for less than 1,000,000 platforms
    @$10 engineering expense, $100 price
Examples:
    UNIX versus Windows NT: $3,500 versus $500
    Oracle versus SQL-Server: $100,000 versus $6,000
    No spreadsheet or presentation pack on UNIX/VMS/...
Commoditization of base software and hardware

Price = Fixed_Cost / Units + Marginal_Cost
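The pricing formula can be tried with the numbers on the slide. This is a sketch under stated assumptions: the function name is illustrative, marginal cost is taken as zero, and the Gates case assumes the same $10 million engineering expense as the Joy case (the slide text reads "$10").

```python
def unit_price(fixed_cost: float, units: int, marginal_cost: float = 0.0) -> float:
    """Break-even price: Fixed_Cost / Units + Marginal_Cost."""
    return fixed_cost / units + marginal_cost


# Bill Joy's case: $10 million engineering spread over 100,000 platforms.
joy = unit_price(10_000_000, 100_000)

# Bill Gates's case, ASSUMING the same $10 million expense (the slide text
# reads "$10"), spread over 1,000,000 platforms.
gates = unit_price(10_000_000, 1_000_000)

print(joy, gates)
```

Under these assumptions, ten times the volume cuts the engineering cost carried by each copy by a factor of ten, which is the commoditization argument the slide makes.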

Page 10:

Gordon Bell’s Platform Economics

[Figure: Price (K$), Volume (K), and Application price by Computer type (Mainframe, WS, Browser), log scale from 0.01 to 100,000]

Traditional computers: custom or semi-custom, high-tech and high-touch
New computers: high-tech and no-touch

Page 11:

Grove’s Law: The New Computer Industry

Horizontal integration is new structure
Each layer picks best from lower layer
Desktop (C/S) market
    1991: 50%
    1995: 75%

Function          Example
Operation         AT&T
Integration       EDS
Applications      SAP
Middleware        Oracle
Baseware          Microsoft
Systems           Compaq
Silicon & Oxide   Intel & Seagate

Page 12:

Outline

Global forces
    Moore’s, Metcalfe’s, Bell’s, Bill’s, Andy’s laws
    Micro dollars per transaction
    Cyber-content is key value because distribution costs go to zero
Distributed Systems Concepts and terms
Key software technologies
    objects, transactions

Page 13:

1987: 256 tps Benchmark

14 M$ computer (Tandem)
A dozen people
False floor, 2 rooms of machines

Simulate 25,600 clients

A 32 node processor array

A 40 GB disk array (80 drives)

OS expert

Network expert

DB expert

Performance expert

Hardware experts

Admin expert

Auditor

Manager

Page 14:

1997: 10 years later
1 Person and 1 box = 1250 tps

1 Breadbox ~ 5x 1987 machine room
23 GB is hand-held
One person does all the work
Cost/tps is 1,000x less
1 micro dollar per transaction

4 x 200 MHz CPU
1/2 GB DRAM
12 x 4 GB disk
3 x 7 x 4 GB disk arrays

Hardware expert, OS expert, Net expert, DB expert, App expert

Page 15:

What Happened?

Moore’s law: Things get 4x better every 3 years
    (applies to computers, storage, and networks)
New Economics: Commodity

class           price/mips ($/mips)   software (k$/year)
mainframe       10,000                100
minicomputer    100                   10
microcomputer   10                    1

GUI: Human - computer tradeoff
    optimize for people, not computers

[Figure: price versus time for mainframe, mini, micro]

Page 16:

What Happens Next

Last 10 years: 1000x improvement
Next 10 years: ????
Today: text and image servers are free
    1 $/hit cost
    70,000 $/hit advertising revenue
    Advertising pays for them
    Content is only “real” expense
“You ain’t seen nothing yet!”

[Figure: performance versus time, 1985, 1995, 2005]

Page 17:

Kinds Of Information Processing

                            Point-to-point        Broadcast
Immediate (Network)         Conversation, Money   Lecture, Concert
Time-shifted (Database)     Mail                  Book, Newspaper

It’s ALL going electronic
Immediate is being stored for analysis (so ALL database)
Analysis and automatic processing are being added

Page 18:

Why Put Everything In Cyberspace?

Low rent - min $/byte
Shrinks time - now or later
Shrinks space - here or there
Automate processing - knowbots

[Figure: Point-to-point OR broadcast, Immediate OR time-delayed; Network and Database; Locate, Process, Analyze, Summarize]

Page 19:

Billions Of Clients

Every device will be “intelligent”
Doors, rooms, cars…
Computing will be ubiquitous

Page 20:

Billions Of Clients Need Millions Of Servers

All clients networked to servers
    May be nomadic or on-demand
Fast clients want faster servers
Servers provide
    Shared Data
    Control
    Coordination
    Communication

[Figure: Clients (Mobile clients, Fixed clients) connected to Servers (Server, Super server)]

Page 21:

Thesis: Many little beat few big

Smoking, hairy golf ball
How to connect the many little parts?
How to program the many little parts?
Fault tolerance?

[Figure: Mainframe ($1 million), Mini ($100 K), Micro ($10 K), Nano; disk form factors 14", 9", 5.25", 3.5", 2.5", 1.8"]
[Figure: Pico Processor, 1 mm^3: 1 M SPECmarks, 1 TFLOP; 10^6 clocks to bulk ram; Event-horizon on chip; VM reincarnated; Multiprogram cache, On-Chip SMP. Storage levels: 10 pico-second ram, 10 nano-second ram, 10 microsecond ram, 10 millisecond disc, 10 second tape archive. Capacities: 1 MB, 100 MB, 10 GB, 1 TB, 100 TB]

Page 22:

Future Super Server: 4T Machine

Array of 1,000 4B machines
    1 bps processors
    1 BB DRAM
    10 BB disks
    1 Bbps comm lines
    1 TB tape robot
A few megabucks
Challenge:
    Manageability
    Programmability
    Security
    Availability
    Scaleability
    Affordability
As easy as a single system
Future servers are CLUSTERS of processors, discs
Distributed database techniques make clusters work

[Figure: Cyber Brick, a 4B machine: CPU, 5 GB RAM, 50 GB Disc]

Page 23:

The Hardware Is In Place… And then a miracle occurs

?
SNAP: scaleable network and platforms
Commodity-distributed OS built on:
    Commodity platforms
    Commodity network interconnect
Enables parallel applications

Page 24:

Outline

Global forces
    Moore’s, Metcalfe’s, Bell’s, Bill’s, Andy’s laws
    Micro dollars per transaction
    Cyber-content is key value because distribution costs go to zero
Distributed Systems Concepts and terms
Key software technologies
    objects, transactions

Page 25:

Outline: Concepts and Terminology

Why Distributed
Distributed data & objects
Distributed execution
Three tier architectures
Transaction concepts

Page 26:

What’s a Distributed System?

Centralized: everything in one place
    stand-alone PC or Mainframe
Distributed: some parts remote
    distributed users
    distributed execution
    distributed data

Page 27:

Why Distribute?

No best organization
Companies constantly swing between
    Centralized: focus, control, economy
    Decentralized: adaptive, responsive, competitive
Why distribute?
    reflect organization or application structure
    empower users / producers
    improve service (response / availability)
    distributed load
    use PC technology (economics)

Page 28:

What Should Be Distributed?

Users and User Interface
    Thin client
Processing
    Trim client
Data
    Fat client
Will discuss tradeoffs later

[Figure: tiers: Presentation, workflow, Business Objects, Database]

Page 29:

Transparency in Distributed Systems

Make distributed system as easy to use and manage as a centralized system
Give a Single-System Image
Location transparency:
    hide fact that object is remote
    hide fact that object has moved
    hide fact that object is partitioned or replicated
Name doesn’t change if object is replicated, partitioned or moved.
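One way to picture location transparency is a name service that maps a stable name to wherever the object currently lives. The sketch below is illustrative only: `NameService`, its methods, and the node names are hypothetical and not part of the talk or of any real product.

```python
class NameService:
    """Hypothetical registry mapping a stable object name to its current nodes."""

    def __init__(self) -> None:
        self._locations: dict[str, list[str]] = {}

    def register(self, name: str, node: str) -> None:
        """Record that a copy (or partition) of `name` lives on `node`."""
        self._locations.setdefault(name, []).append(node)

    def move(self, name: str, old_node: str, new_node: str) -> None:
        """Relocate one copy; the name that clients use is unchanged."""
        nodes = self._locations[name]
        nodes[nodes.index(old_node)] = new_node

    def resolve(self, name: str) -> str:
        """Return some node currently holding `name` (here simply the first)."""
        return self._locations[name][0]


names = NameService()
names.register("accounts", "node-a")
names.register("accounts", "node-b")        # replicated
names.move("accounts", "node-a", "node-c")  # moved
print(names.resolve("accounts"))
```

Clients keep asking for "accounts" throughout; only the registry knows that the object was replicated and then moved.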

Page 30:

Outline: Concepts and Terminology

Why Distribute
Distributed data & objects
    Partitioned
    Replicated
Distributed execution
    remote procedure call
    queues
Three tier architectures
Transaction concepts

Page 31:

Distributed Execution: Threads and Messages

Thread is Execution unit (software analog of cpu+memory)
Threads execute at a node
Threads communicate via
    Shared memory (local)
    Messages (local and remote)

[Figure: threads, shared memory, messages]
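The two communication styles can be shown side by side on a single node. This is a minimal sketch using Python's standard `threading` and `queue` modules; the names `shared`, `mailbox`, and `worker` are illustrative, and a remote message would travel over a network instead of an in-process queue.

```python
import queue
import threading

shared = {"counter": 0}               # shared memory: local threads only
shared_lock = threading.Lock()        # protects the shared counter
mailbox: queue.Queue = queue.Queue()  # message channel between threads


def worker(worker_id: int) -> None:
    with shared_lock:                 # communicate via shared memory
        shared["counter"] += 1
    mailbox.put(f"worker {worker_id} done")  # communicate via a message


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

messages = sorted(mailbox.get() for _ in range(4))
print(shared["counter"], messages)
```

Shared memory needs a lock and only works within one node; the queue hands over self-contained messages, which is the style that extends to remote nodes.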

Page 32:

Peer-to-Peer or Client-Server

Peer-to-Peer is symmetric:
    Either side can send
Client-server
    client sends requests
    server sends responses
    simple subset of peer-to-peer

[Figure: request, response]

Page 33:

Remote Procedure Call: The key to transparency

Object may be local or remote
Methods on object work wherever it is.
Local invocation

[Figure: the caller executes y = pObj->f(x); the argument x passes to f(), which executes return val; the caller receives y = val;]

Page 34:

Remote Procedure Call: The key to transparency

Remote invocation

[Figure: the caller executes y = pObj->f(x); after the "Obj Local?" check, a proxy marshals x and sends it to a stub, which unmarshals x and calls pObj->f(x); f() executes return val; the stub marshals val, the proxy unmarshals it, and the caller receives y = val;]
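The proxy and stub roles in the figure can be sketched in a few lines. This is an illustration under stated assumptions, not the mechanism of any particular RPC system: `Calculator`, `Proxy`, and `Stub` are hypothetical names, JSON stands in for the marshalling format, and a direct function call stands in for the network.

```python
import json


class Calculator:
    """The real object; lives on the 'server' side."""

    def f(self, x: int) -> int:
        return x * 2


class Stub:
    """Server side: unmarshal the request, call the object, marshal the result."""

    def __init__(self, target: object) -> None:
        self._target = target

    def handle(self, request: bytes) -> bytes:
        call = json.loads(request.decode("utf-8"))
        method = getattr(self._target, call["method"])
        val = method(*call["args"])
        return json.dumps({"val": val}).encode("utf-8")


class Proxy:
    """Client side: looks like the object, but ships each call as a message."""

    def __init__(self, transport) -> None:
        self._transport = transport  # any callable taking bytes, returning bytes

    def f(self, x: int) -> int:
        request = json.dumps({"method": "f", "args": [x]}).encode("utf-8")
        reply = self._transport(request)  # a network hop in a real system
        return json.loads(reply.decode("utf-8"))["val"]


local_obj = Calculator()
remote_obj = Proxy(Stub(Calculator()).handle)

# The caller writes the same line either way: y = obj.f(x)
print(local_obj.f(21), remote_obj.f(21))
```

Because the proxy exposes the same method as the object, the calling code does not change when the object moves to another node.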

Page 35:

Object Request Broker (ORB): Orchestrates RPC

Registers Servers
Manages pools of servers
Connects clients to servers
Does Naming, request-level authorization
Provides transaction coordination (new feature)
Old names: Transaction Processing Monitor, Web server, NetWare

[Figure: Transaction, Object-Request Broker]
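A toy broker makes the listed duties concrete. The sketch below is hypothetical throughout (`Broker`, `register`, and `invoke` are invented for illustration and do not correspond to CORBA or DCOM APIs); it covers registration, naming, server pools, and request-level authorization, and leaves out transaction coordination.

```python
from collections import deque


class Broker:
    """Toy request broker: naming, server pools, request-level authorization."""

    def __init__(self) -> None:
        self._pools: dict[str, deque] = {}
        self._allowed: dict[str, set[str]] = {}

    def register(self, service: str, server, users: set[str]) -> None:
        """Add a server (any callable) to the pool behind a service name."""
        self._pools.setdefault(service, deque()).append(server)
        self._allowed.setdefault(service, set()).update(users)

    def invoke(self, user: str, service: str, *args):
        """Check authority for this request, then dispatch to a pooled server."""
        if user not in self._allowed.get(service, set()):
            raise PermissionError(f"{user} may not call {service}")
        pool = self._pools[service]
        server = pool[0]
        pool.rotate(-1)  # round-robin: the next request goes to the next server
        return server(*args)


orb = Broker()
orb.register("add", lambda a, b: ("server-1", a + b), {"alice"})
orb.register("add", lambda a, b: ("server-2", a + b), {"alice"})
print(orb.invoke("alice", "add", 2, 3))
print(orb.invoke("alice", "add", 2, 3))
```

The client names a service, never a server; the broker decides which pooled server handles each request.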

Page 36:

History and Alphabet Soup

[Figure: timeline 1985, 1990, 1995 showing Solaris, UNIX International, Open software Foundation (OSF), OSF DCE, X/Open, ODBC, XA / TX, NT, Object Management Group (OMG), CORBA, OpenGroup, COM; DCE components: RPC, GUIDs, IDL, DNS, Kerberos]

Microsoft DCOM based on OSF-DCE Technology
DCOM and ActiveX extend it

Page 37:

ActiveX and COM

COM is Microsoft model, engine inside OLE
ALL Microsoft software is based on COM (ActiveX)
CORBA + OpenDoc is equivalent
Heated debate over which is best
Both share same key goals:
    Encapsulation: hide implementation
    Polymorphism: generic operations
        key to GUI and reuse
    Versioning: allow upgrades
    Transparency: local/remote
    Security: invocation can be remote
    Shrink-wrap: minimal inheritance
    Automation: easy
COM now managed by the Open Group

Page 38:

Linking And Embedding

Objects are data modules; transactions are execution modules
Link: pointer to object somewhere else
    Think URL in Internet
Embed: bytes are here
Objects may be active; can callback to subscribers

Page 39:

Bottom Line Re ORBs

Microsoft Promises Cairo
    distributed objects, secure, transparent, fast invocation
Netscape promises the CORBA
Both will deliver
Customers can pick the best one

[Figure: Transaction, Object-Request Broker]

Page 40:

Outline: Concepts and Terminology

Why Distributed
Distributed data & objects
Distributed execution
    remote procedure call
    queues
Three tier architectures
    what
    why
Transaction concepts

Page 41:

Work Distribution Spectrum

Presentation and plug-ins
Workflow manages session & invokes objects
Business objects
Database

[Figure: clients range from Thin to Fat across the tiers Presentation, workflow, Business Objects, Database]

Page 42:

Transaction Processing Evolution to Three Tier
Intelligence migrated to clients

Mainframe Batch processing (centralized)
Dumb terminals & Remote Job Entry
Intelligent terminals, database backends
Workflow Systems, Object Request Brokers, Application Generators

[Figure labels: Mainframe, cards, green screen 3270, Server, TP Monitor, ORB, Active]

Page 43:

Web Evolution to Three Tier
Intelligence migrated to clients (like TP)

Character-mode clients, smart servers
GUI Browsers - Web file servers
GUI Plugins - Web dispatchers - CGI
Smart clients - Web dispatcher (ORB)
    pools of app servers (ISAPI, Viper)
    workflow scripts at client & server

[Figure labels: archie, gopher, green screen, WAIS, Web Server, Mosaic, NS & IE, Active]

Page 44:

PC Evolution to Three Tier
Intelligence migrated to server

Stand-alone PC (centralized)
PC + File & print server: message per I/O
PC + Database server: message per SQL statement
PC + App server: message per transaction
ActiveX Client, ORB, ActiveX server, Xscript

[Figure labels: disk I/O, IO request, reply, SQL Statement, Transaction]
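The three server designs above differ in how many request messages one transaction puts on the network. A small sketch of that count follows; the function name and the workload numbers (10 SQL statements, 5 disk I/Os each) are hypothetical and chosen only to illustrate the ratio.

```python
def messages_per_transaction(sql_statements: int, ios_per_statement: int) -> dict[str, int]:
    """Request messages one transaction sends under each PC + server design."""
    return {
        "file server (message per I/O)": sql_statements * ios_per_statement,
        "database server (message per SQL statement)": sql_statements,
        "app server (message per transaction)": 1,
    }


# Hypothetical workload: 10 SQL statements, each needing 5 disk I/Os.
counts = messages_per_transaction(sql_statements=10, ios_per_statement=5)
for design, n in counts.items():
    print(f"{design}: {n}")
```

Moving the logic to the server shrinks the conversation from one message per I/O to one message per transaction.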

Page 45:

The Pattern: Three Tier Computing

Clients do presentation, gather input
Clients do some workflow (Xscript)
Clients send high-level requests to ORB (Object Request Broker)
ORB dispatches workflows and business objects -- proxies for client, orchestrate flows & queues
Server-side workflow scripts call on distributed business objects to execute task

[Figure: tiers: Presentation, workflow, Business Objects, Database]

Page 46:

The Three Tiers

[Figure: Web Client (HTML, VB or Java Script Engine, VB or Java Virt Machine, VBScript, JavaScript, VB and Java plug-ins) connects over the Internet via HTTP+DCOM to the Middleware tier (ORB, TP Monitor, Web Server..., Object server Pool), which uses DCOM (oleDB, ODBC,...) to reach the Object & Data server and LU6.2 to reach IBM Legacy Gateways]

Page 47:

Why Did Everyone Go To Three-Tier?

Manageability
    Business rules must be with data
    Middleware operations tools
Performance (scaleability)
    Server resources are precious
    ORB dispatches requests to server pools
Technology & Physics
    Put UI processing near user
    Put shared data processing near shared data

[Figure: tiers: Presentation, workflow, Business Objects, Database]

Page 48:

Why Put Business Objects at Server?

DAD’s Raw Data
    Customer comes to store
    Takes what he wants
    Fills out invoice
    Leaves money for goods
    Easy to build
    No clerks

MOM’s Business Objects
    Customer comes to store with list
    Gives list to clerk
    Clerk gets goods, makes invoice
    Customer pays clerk, gets goods
    Easy to manage
    Clerk controls access
    Encapsulation

Page 49:

What Middleware Does
ORB, TP Monitor, Workflow Mgr, Web Server

Registers transaction programs
    workflow and business objects (DLLs)
Pre-allocates server pools
Provides server execution environment
Dynamically checks authority (request-level security)
Does parameter binding
Dispatches requests to servers
    parameter binding
    load balancing
Provides Queues
Operator interface

Page 50:

Server Side Objects: Easy Server-Side Execution

ORB gives simple execution environment
Object gets
    start
    invoke
    shutdown
Everything else is automatic
Drag & Drop Business Objects

[Figure: A Server: Network, Receiver, Queue, Connections, Context, Security, Thread Pool, Synchronization, Shared Data, Service logic, Configuration, Management]
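The start, invoke, shutdown contract can be sketched as a small base class plus a host that drives it. Everything here is hypothetical (`BusinessObject`, `Greeter`, and `host` are illustrative names, not an ORB API); the host stands in for the broker, which in a real system would also supply the queues, thread pool, and security shown in the figure.

```python
class BusinessObject:
    """Hypothetical base class: the only three calls an object has to handle."""

    def start(self) -> None:
        pass

    def invoke(self, request):
        raise NotImplementedError

    def shutdown(self) -> None:
        pass


class Greeter(BusinessObject):
    """Service logic only; no networking, threading, or security code."""

    def start(self) -> None:
        self.calls = 0

    def invoke(self, request):
        self.calls += 1
        return f"hello, {request}"


def host(obj: BusinessObject, requests: list) -> list:
    """Stand-in for the ORB: drives the lifecycle around the service logic."""
    obj.start()
    try:
        return [obj.invoke(r) for r in requests]
    finally:
        obj.shutdown()


print(host(Greeter(), ["alice", "bob"]))
```

The object author writes only `invoke` and optional setup and teardown; the host owns everything else.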

Page 51: 1 The Future of Distributed Systems. Jim Gray Researcher Microsoft Corp. Gray@Microsoft.com

A new programming paradigm

Develop objects on the desktop
Better yet: download them from the Net
Script work flows as method invocations
All on desktop
Then, move work flows and objects to server(s)
Gives
  desktop development
  three-tier deployment
Software Cyberbricks


Why Server Pools?

Server resources are precious.
Clients have 100x more power than server.
Pre-allocate everything on server
  pre-allocate memory
  pre-open files
  pre-allocate threads
  pre-open and authenticate clients
Keep high duty-cycle on objects (re-use them)
Pool threads, not one per client
Classic example: TPC-C benchmark
  2 processes
  everything pre-allocated
N clients x N servers x F files = N x N x F file opens!!!

[Figure: 7,000 clients (IE) connect over HTTP to IIS, which reaches SQL through a pool of ODBC links]
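
The pooling idea can be sketched as follows. This is an illustration under assumptions, not the TPC-C setup itself: `Link` is a hypothetical stand-in for an expensive resource such as an ODBC link, `serve` is an invented name, and the pool sizes are arbitrary. A small fixed set of threads and pre-opened links serves every client, instead of one thread and one link per client.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


class Link:
    """Hypothetical stand-in for an expensive resource such as an ODBC link."""

    opened = 0  # how many links have ever been opened

    def __init__(self):
        Link.opened += 1

    def query(self, client_id):
        return f"reply to client {client_id}"


def serve(n_clients, n_threads=4, n_links=2):
    """Serve n_clients requests with a fixed thread pool and link pool."""
    # Pre-open every link once, before any request arrives.
    links = queue.Queue()
    for _ in range(n_links):
        links.put(Link())

    def handle(client_id):
        link = links.get()       # blocks until a pooled link is free
        try:
            return link.query(client_id)
        finally:
            links.put(link)      # re-use: hand the link back to the pool

    # Pool threads, not one per client.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(handle, range(n_clients)))
```

Calling `serve(7000)` handles 7,000 requests while opening only two links, which is the high duty-cycle re-use the slide argues for.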


Classic Three-Tier Example: TPC-C

Transaction Processing Performance Council (TPC): standard performance benchmarks
5 transaction types
  order entry, payment, status (OLTP)
  delivery (mini-batch)
  restock (mini-DSS)
Metrics: Throughput, Price/Performance
Shows best practices:
  everyone three tier
  2 processes at server
  everything pre-allocated

[Figure: 7,000 Web clients connect over HTTP to IIS (= Web), which reaches SQL through a pool of ODBC links]


Outline

Laws & micro$/transaction
Distributed Systems
  Why Distributed
  Distributed data & objects
  Distributed execution
  Three tier architectures
    why: manageability & performance
    what: server side workflows & objects
  Transaction concepts
    Why transactions?
    Using transactions


Thesis

Transactions are key to structuring distributed applications
ACID properties ease exception handling
  Atomic: all or nothing
  Consistent: state transformation
  Isolated: no concurrency anomalies
  Durable: committed transaction effects persist


What Is A Transaction?

Programmer's view: Bracket a collection of actions
A simple failure model
Only two outcomes:

  Begin(), action, action, action, action, Commit(): Success!
  Begin(), action, action, action, Rollback(): Failure!
  Begin(), action, action, action, Fail!, Rollback(): Failure!
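
The bracket and its two outcomes can be tried with Python's standard `sqlite3` module. This is a minimal sketch, not code from the talk: the `account` table and the helper names `open_bank` and `transfer` are assumptions made for the example. In `sqlite3`'s default mode the first `UPDATE` implicitly begins the transaction, so the explicit `Begin()` of the slide is implied.

```python
import sqlite3


def open_bank():
    """Create a hypothetical in-memory bank with two accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move money inside one transaction; return True on commit, False on rollback."""
    try:
        # action, action: both updates belong to the same transaction.
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.commit()      # Success: both updates become visible and durable
        return True
    except Exception:
        conn.rollback()    # Failure: neither update survives
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM account"))
```

A failed transfer leaves both balances exactly as they were, so the caller handles one failure case rather than one per action.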


Why ACID For Client/Server And Distributed

ACID is important for centralized systems
Failures in centralized systems are simpler
In distributed systems:
  More and more-independent failures
  ACID is harder to implement
That makes it even MORE IMPORTANT
  Simple failure model
  Simple repair model


Outline

Why Distributed
Distributed data & objects
Distributed execution
Three tier architectures
Transaction concepts
  Why transactions?
  Using transactions
    programming
    workflow


