Windows NT Scalability Jim Gray Microsoft Research Gray@Microsoft.com http://www.research.Microsoft.com/~Gray/talks/


Page 1: Windows NT Scalability Jim Gray Microsoft Research Gray@Microsoft.com http/Gray/talks

Windows NT Scalability

Jim Gray

Microsoft Research

Gray@Microsoft.com

http://www.research.Microsoft.com/~Gray/talks/

Page 2:

Outline

• Scalability: What & Why?

• Scale UP: NT SMP scalability

• Scale OUT: NT Cluster scalability

• Key Message:

– NT can do the most demanding apps today.

– Tomorrow will be even better.

[Figure: Scale Out, Scale Up, and Scale Down axes]

Page 3:

What is Scalability?

• Grow without limits
– Capacity
– Throughput

• Do not add complexity
– Design
– Administer
– Operate
– Use

[Figure: scalability axes. Scale Out: Server Cluster. Scale Up: PC, Workstation, Server, Super Server. Scale Down: Portable, Handheld, Win Term, NetPC, TV.]

Page 4:

Scale UP & OUT Focus Here

• Grow without limits
– SMP: 4, 8, 16, 32 CPUs
– 64-bit addressing
– Huge storage

• Cluster Requirements
– Auto manage
– High availability
– Transparency
– Programming tools & apps

[Figure: Scale Out: Server Cluster. Scale Up: Server, Super Server.]

Page 5:

Scalability is Important

• Automation benefits growing
– ROI of 1 month....

• Slice price going to zero
– Cyberbrick costs 5k$

• Design, Implement & Manage cost going down
– DCOM & Viper make it easy!
– NT Clusters are easy!

• Billions of clients imply millions of HUGE servers.

• Thin clients imply huge servers.

[Figure: Server]

Page 6:

Q: Why Does Microsoft Care?

A: Billions of clients need millions of servers.

Expect Microsoft to work hard on Scaleable Windows NT and Scaleable BackOffice.

Key technique: INTEGRATION.

[Chart: Servers Shipped per year, 1994 to 2001, for Unix, Windows NT Server, and NetWare; y-axis 0 to 2,700]

(97-01 are MS estimates)

Page 7:

Outline

• Scalability: What & Why?

• Scale UP: NT SMP scalability

• Scale OUT: NT Cluster scalability

• Key Message:

– NT can do the most demanding apps today.

– Tomorrow will be even better.

[Figure: Scale Out, Scale Up, and Scale Down axes]

Page 8:

How Scaleable is NT??

The Single Node Story

• 64 bit file system in NT 1, 2, 3, 4, 5

• 8 node SMP in NT 4.E, 32 node OEM

• 64 bit addressing in NT 5

• 1 Terabyte SQL Databases (PetaByte capable)

• 10,000 users (TPC-C benchmark)

• 100 Million web hits per day (IIS)

• 50 GB Exchange mail store; next release designed for 16 TB

• 50,000 POP3 users on Exchange (1.8 M messages/day)

• And, more coming…..

Page 9:

Windows NT Server Enterprise Edition

• Scalability
– 8x SMP support (32x in OEM kit)
– Larger process memory (3 GB on Intel)
– Unlimited Virtual Roots in IIS (web)

• Transactions
– DCOM transactions (Viper TP monitor)
– Message Queuing (Falcon)

• Availability
– Clustering (WolfPack)
– Web, File, Print, DB … servers fail over.

Page 10:

What Happens in 10 Years?

1987: 256 tps
– $14 million computer
– A dozen people
– Two rooms of machines

1997: 1,250 tps
– 50 k$ computer
– One person
– 1 micro-dollar per transaction (1,000x cheaper)

Ready for the next 10 years?
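The "1,000x cheaper" claim can be sanity-checked from the hardware prices alone. The sketch below divides each system's price by its throughput. It is a rough check only: it ignores staff, floor space, and the amortization period behind the micro-dollar figure, none of which the slide quantifies, and the variable names are illustrative.

```python
# Hardware price per unit of throughput, using only the figures on the slide.
price_1987_usd = 14_000_000   # "$14 million computer"
tps_1987 = 256
price_1997_usd = 50_000       # "50 k$ computer"
tps_1997 = 1_250

usd_per_tps_1987 = price_1987_usd / tps_1987
usd_per_tps_1997 = price_1997_usd / tps_1997
improvement = usd_per_tps_1987 / usd_per_tps_1997

print(f"1987: ${usd_per_tps_1987:,.0f} per tps")
print(f"1997: ${usd_per_tps_1997:,.0f} per tps")
print(f"improvement: about {improvement:,.0f}x")
```

On hardware price alone the ratio comes out above 1,000, in line with the slide's rounded claim.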

Page 11:

NT vs UNIX SMPs

• NT traditionally ran on 1 to 4 cpus
– Scales near-linear on them

• UNIX boxes: 32-64 way SMPs
– They do 3x more tpmC
– They cost 10x more.

• 10 way NT machines are available
– They cost more
– They are faster

• My view (shared by many)
– Need clusters for availability
– Cluster commodity servers to make huge systems
– a la Tandem, Teradata, VMScluster, IBM Sysplex, IBM SP2
– Clusters reduce need for giant SMPs

[Chart: tpmC vs Time, Jan-95 to Jan-97; y-axis 0 to 35,000 tpmC; series: Unix, NT]

Page 12:

Transaction Throughput TPC-C

• On comparable hardware: NT scales better!

• SQL Server & NT Improving 250% per year

• NT has best Price Performance (2x cheaper)

[Chart: tpmC on Intel CPUs; x-axis 0 to 10 CPUs; y-axis 0 to 14,000 tpmC; series: NT, UNIX]

[Chart: tpmC vs Intel CPUs; x-axis 0 to 10 CPUs; y-axis 0 to 14,000 tpmC; series: NT all, NT Best, Unix best]

Page 13:

NT Scales Better Than Solaris

• Microsoft SQL/NT/Intel scales to 6x

• Beats Sybase/Solaris/UltraSPARC up to 11-way

[Chart: tpmC (0 to 20,000) vs cpus (0 to 20); series: Sybase/Solaris/UltraSPARC, MS SQL/NT/Intel]

Page 14:

Only NT Has Economy of Scale

• NT is 2x less expensive: 40$/tpmC vs 110$/tpmC

• Only NT has economy of scale

• Unix has dis-economy of scale

[Chart: Transactions/k$ by vendor; y-axis tpmC/k$ (0 to 25); x-axis tpmC (0 to 30,000); series: DB2/Unix, Sybase/Unix, Informix/Unix, Microsoft/NT, Oracle/Unix]
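The bullets quote price/performance in $/tpmC, while the chart plots the reciprocal, tpmC per k$. The short sketch below converts between the two using the slide's two figures; the function name is illustrative.

```python
def tpmc_per_kusd(usd_per_tpmc: float) -> float:
    """Convert a $/tpmC price into tpmC bought per 1,000 dollars."""
    return 1_000 / usd_per_tpmc

nt = tpmc_per_kusd(40)      # slide: 40 $/tpmC for NT
unix = tpmc_per_kusd(110)   # slide: 110 $/tpmC for Unix

print(f"NT:   {nt:.1f} tpmC/k$")
print(f"Unix: {unix:.1f} tpmC/k$")
print(f"NT advantage: {nt / unix:.2f}x")
```

The ratio of the two quoted prices is 2.75, so the slide's "2x" is a conservative rounding.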

Page 15:

Scaleup To Big Databases?

• NT 4 and SQL Server 6.5
– DBs up to 1 Billion records,
– 100 GB
– Covers most (80%) data warehouses

• SQL Server 7.0
– Designed for Terabytes

• Hundreds of disks per server.

• SMP parallel search
– Data Mining and Multi-Media

• TerraServer is good MM example

[Figure: database size scale: Excel spreadsheet; Manhattan phone book (15 MB); Human Genome (3 GB); Dayton-Hudson sales records (300 GB); satellite photos of Earth (1 TB)]

Page 16:

Database Scaleup: TerraServer™

• Demo NT and SQL Server scalability

• Stress test SQL Server 7.0

• Requirements
– 1 TB
– Unencumbered (put on www)
– Interesting to everyone everywhere
– And not offensive to anyone anywhere

• Loaded
– 1.1 M place names from Encarta World Atlas
– 1 M Sq Km from USGS (1 meter resolution)
– 2 M Sq Km from Russian Space agency (2 m)

• Will be on web (world’s largest atlas)

• Sell images with commerce server.

• USGS CRDA: 3 TB more coming.

Page 17:

TerraServer System

• DEC Alpha 4100 (4x smp) +

• 324 StorageWorks Drives (1.4 TB)

• RAID 5 Protected

• SQL Server 7.0

• USGS 1-meter data (30% of US)

• Russian Space data: two-meter resolution images (2 M km², 2% of earth)

[Figure: SPIN-2]

Page 18:

Http://t2b2c

Demo

Page 19:

Manageability: Windows NT 5.0 and Windows 98

• Active Directory tracks all objects in net

• Integration with IE 4
– Web-centric user interface

• Management Console
– Component architecture

• Zero Admin Kit and Systems Management Server

• PlugNPlay, Instant On, Remote Boot, ...

• Hydra and Intelli-Mirroring

Page 20:

Thin Client Support

TSO comes to NT

lower per-client costs

[Figure: Windows NT Server with “Hydra” Server; clients: dedicated Windows terminal, Net PC, existing desktop PC, and MS-DOS, UNIX, Mac clients]

Page 21:

Windows NT 5.0 IntelliMirror™

Best of PC and centralized computing advantages

• Extends CMU Coda File System ideas

• Files and settings mirrored on client and server

• Great for disconnected users

• Facilitates roaming

• Easy to replace PCs

• Optimizes network performance

Page 22:

Outline

• Scalability: What & Why?

• Scale UP: NT SMP scalability

• Scale OUT: NT Cluster scalability

• Key Message:

– NT can do the most demanding apps today.

– Tomorrow will be even better.

[Figure: Scale Out, Scale Up, and Scale Down axes]

Page 23:

Scale OUT: Clusters Have Advantages

• Fault tolerance:
– Spare modules mask failures

• Modular growth without limits
– Grow by adding small modules

• Parallel data search
– Use multiple processors and disks

• Clients and servers made from the same stuff
– Inexpensive: built with commodity CyberBricks
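The "parallel data search" bullet can be illustrated with a toy example: split the data into partitions, scan each partition independently, and merge the hits. This is only a sketch of the idea. Python threads stand in for cluster nodes or disks, and the function names and the round-robin partitioning are assumptions, not anything described in the talk.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(records, n_parts):
    """Deal records round-robin into n_parts lists (one per node or disk)."""
    return [records[i::n_parts] for i in range(n_parts)]

def search_partition(part, predicate):
    """Scan a single partition and return the matching records."""
    return [r for r in part if predicate(r)]

def parallel_search(records, predicate, n_parts=4):
    """Scan all partitions concurrently, then merge the per-partition hits."""
    parts = partition(records, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        chunks = list(pool.map(search_partition, parts, [predicate] * n_parts))
    return [r for chunk in chunks for r in chunk]

records = list(range(1_000))
hits = parallel_search(records, lambda r: r % 250 == 0)
print(sorted(hits))  # [0, 250, 500, 750]
```

Because each partition is scanned without reference to the others, adding partitions (and the hardware behind them) shortens each scan, which is the property the slide attributes to clusters.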

Page 24:

How scaleable is NT??

The Cluster Story

• 16-node Tandem Cluster
– 64 cpus
– 2 TB of disk
– Decision support

• 45-node Compaq Cluster
– 140 cpus
– 14 GB DRAM
– 4 TB RAID disk
– OLTP (Debit Credit)

• 1 B tpd (14 k tps)

Page 25:

microsoft.com

• Production
– Windows NT 4 and IIS 3
• 20 HTTP
• 3 download
• 3 FTP
• 5 SQL 6.5
• Index Server + 3 search

• Stagers
– Site Server for content
– DCOM Publishing wizard

• Network
– 6 DS3
– 4 TB/day download capacity

• Replicas in UK and Japan

• 90m hits/day
– 17m page views
– #4 site on Internet

• 900k visitors per day

• Not cheap
– Data Centers
– Bandwidth
– 27 people on content
– 22 people on systems

Page 26:

Tandem 2 Ton

• 2 TB SQL database

• 1.2 TB user data

• 16 node cluster

• 64 cpus, 480 disks

• Decision support parallel data-mining

• Will be WolfPack aware

• Demoed at DB Expo in

• ServerNet™ interconnect

Page 27:

Billion Transactions per Day Project

• Built a 45-node Windows NT Cluster (with help from Intel & Compaq), > 900 disks

• All off-the-shelf parts

• Using SQL Server & DTC distributed transactions, DCOM & ODBC clients on 20 front-end nodes

• DebitCredit Transaction

• Each server node has 1/20th of the DB

• Each server node does 1/20th of the work

• 15% of the transactions are “distributed”
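The partitioning described above (each of 20 server nodes holds 1/20th of the database, with 15% of transactions touching another node) can be sketched as a simple simulation. Only the figures 20 and 15% come from the slide. The range-partitioning scheme, the account count per node, and all function names are assumptions made for illustration, not the project's actual code.

```python
import random

NUM_DB_NODES = 20          # from the slide: each node has 1/20th of the DB
REMOTE_FRACTION = 0.15     # from the slide: 15% of transactions are distributed
ACCOUNTS_PER_NODE = 1_000  # assumed size, for illustration only

def node_for(account_id: int) -> int:
    """Range partitioning: each node owns one contiguous block of accounts."""
    return account_id // ACCOUNTS_PER_NODE

def is_distributed(home_node: int, account_id: int) -> bool:
    """A transaction is distributed when its account lives on another node."""
    return node_for(account_id) != home_node

def pick_account(home_node: int, rng: random.Random) -> int:
    """Pick a local account, or with probability 15% one on another node."""
    if rng.random() < REMOTE_FRACTION:
        other = rng.randrange(NUM_DB_NODES - 1)
        node = other if other < home_node else other + 1  # skip the home node
    else:
        node = home_node
    return node * ACCOUNTS_PER_NODE + rng.randrange(ACCOUNTS_PER_NODE)

def distributed_share(n_txns: int, seed: int = 0) -> float:
    """Fraction of simulated transactions that cross a node boundary."""
    rng = random.Random(seed)
    crossing = 0
    for _ in range(n_txns):
        home = rng.randrange(NUM_DB_NODES)
        crossing += is_distributed(home, pick_account(home, rng))
    return crossing / n_txns

print(f"distributed share: {distributed_share(100_000):.3f}")
```

Local transactions commit on a single database node. Only the distributed share needs coordination through DTC, which is why keeping that share small matters for scaling.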

Page 28:

Billion Transactions Per Day Hardware

• 45 nodes (Compaq Proliant)

• Clustered with 100 Mbps Switched Ethernet

• 140 cpu, 13 GB, 3 TB (RAID 1, 5).

Type                                 Nodes                    CPUs  DRAM    Ctlrs  Disks                    RAID space
Workflow MTS                         20 Compaq Proliant 2500  20x2  20x128  20x1   20x1                     20x2 GB
SQL Server                           20 Compaq Proliant 5000  20x4  20x512  20x4   20x(36x4.2GB + 7x9.1GB)  20x130 GB
Distributed Transaction Coordinator  5 Compaq Proliant 5000   5x4   5x256   5x1    5x3                      5x8 GB
TOTAL                                45                       140   13 GB   105    895                      3 TB
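The TOTAL row of the hardware table can be re-derived from the per-node figures. The sketch below multiplies each per-node value by the node count and sums the results. It assumes the DRAM column is in MB, which the slide does not state, and the tuple layout is illustrative.

```python
# (type, nodes, CPUs/node, DRAM/node, controllers/node, disks/node, RAID GB/node)
ROWS = [
    ("Workflow MTS", 20, 2, 128, 1, 1, 2),
    ("SQL Server", 20, 4, 512, 4, 36 + 7, 130),
    ("Distributed Transaction Coordinator", 5, 4, 256, 1, 3, 8),
]

def total(column: int) -> int:
    """Sum (node count x per-node value) over all rows for one column."""
    return sum(row[1] * row[column] for row in ROWS)

totals = {
    "nodes": sum(row[1] for row in ROWS),
    "cpus": total(2),
    "dram_mb": total(3),   # assumes the DRAM column is in MB
    "ctlrs": total(4),
    "disks": total(5),
    "raid_gb": total(6),
}
print(totals)
```

The node, CPU, controller, and disk totals match the slide exactly (45, 140, 105, 895). Under the MB assumption DRAM sums to 14,080 MB, about 13.75 GB, and RAID space sums to 2,680 GB, which the slide rounds to 13 GB and 3 TB.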

Page 29:

Cluster Architecture

[Diagram: cluster nodes connected through a Switch; group labels: Driver, Database, DTC, Control; node names: VIPDC2 through VIPDC21, VIPDTC1 through VIPDTC5, VIPDC42 through VIPDC51]

Page 30:

Local Debit Credit

[Diagram: numbered message flow, steps 1 to 14, among the DebitCredit Driver, the DebitCredit Component, the Database, and the Driver Thread; labeled steps: Loop, 3 Run, 6 Init, 10 DebitCredit, 11 DebitCredit]

Page 31:

Distributed Debit Credit - Same DTC

[Diagram: numbered message flow, steps 11 to 29, among DebitCredit, Database1, Database2, and a single DTC; step 22 is labeled UpdateAcct]

Page 32:

Distributed Debit Credit - Different DTC

[Diagram: numbered message flow, steps 11 to 35, among DebitCredit, Database1, Database2, DTC1, and DTC2; one step is labeled UpdateAcct]

Page 33:

1.2 B tpd

• 1 B tpd ran for 24 hrs.

• Out-of-the-box software

• Off-the-shelf hardware

• AMAZING!

• Sized for 30 days

• Linear growth

• 5 micro-dollars per transaction

Page 34:

How Much Is 1 Billion Tpd?

• 1 billion tpd = 11,574 tps ~ 700,000 tpm (transactions/minute)

• ATT
– 185 million calls per peak day (worldwide)

• Visa ~20 million tpd
– 400 million customers
– 250K ATMs worldwide
– 7 billion transactions (card+cheque) in 1994

• New York Stock Exchange
– 600,000 tpd

• Bank of America
– 20 million tpd checks cleared (more than any other bank)
– 1.4 million tpd ATM transactions

• Worldwide Airlines Reservations: 250 Mtpd

[Chart: Millions of Transactions Per Day (Mtpd) for 1 Btpd, Visa, ATT, BofA, and NYSE, shown once on a log scale (0.1 to 1,000) and once on a linear scale (0 to 1,000)]
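The conversion from transactions per day to per-second and per-minute rates is simple arithmetic. The sketch below reproduces it for the workloads quoted on this slide; the dictionary labels are illustrative, and the rates are daily averages rather than peaks.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(tpd: float) -> float:
    """Average transactions per second for a given daily volume."""
    return tpd / SECONDS_PER_DAY

# Daily volumes as quoted on the slide.
workloads_tpd = {
    "1 Btpd cluster": 1_000_000_000,
    "Worldwide airline reservations": 250_000_000,
    "ATT (calls, peak day)": 185_000_000,
    "Visa": 20_000_000,
    "Bank of America (checks cleared)": 20_000_000,
    "New York Stock Exchange": 600_000,
}

for name, tpd in workloads_tpd.items():
    tps = per_second(tpd)
    print(f"{name}: {tps:,.0f} tps, {tps * 60:,.0f} tpm")
```

For 1 billion tpd this gives 11,574 tps and about 694,000 tpm, which the slide rounds to 700,000 tpm.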

Page 35:

1 B tpd: So What?

• Shows what is possible, easy to build

– Grows without limits

• Shows scaleup of DTC, MTS, SQL…

• Shows (again) that shared-nothing clusters scale

• Next task: make it easy.

– auto partition data

– auto partition application

– auto manage & operate

Page 36:

Cluster Server: High Availability

• Multiple servers form one system

• Industry standard APIs and hardware

• Server application and tools support
– IIS web server
– File and Print servers
– IP and NetName failover
– Transaction and Queue Server failover
– SQL Server, Enterprise edition

• Tight integration with Windows NT -- it's easy!

• Two-Node clusters now (2 to 20 cpus)

• 16 node soon (2 to 192 cpus).

Page 37:

WolfPack Cluster IIS & SQL Failover Demo

[Diagram: a Browser connected to a two-server cluster (Server 1 and Server 2, named Alice and Betty); each server shows a Web site and a Database, backed by Web site files and Database files]
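The failover behaviour in this demo can be modelled with a toy sketch: each resource runs on one server, and when that server fails its resources move to a survivor. This is a hypothetical model for illustration only and not the Cluster Server API. The class, its methods, and the initial placement of the Web site on Alice and the Database on Betty are all assumptions.

```python
class Cluster:
    """Toy model of resource failover between cluster nodes."""

    def __init__(self, nodes, placement):
        self.alive = set(nodes)
        self.placement = dict(placement)  # resource -> node currently hosting it

    def fail(self, node):
        """Mark a node as failed and move its resources to a surviving node."""
        self.alive.discard(node)
        if not self.alive:
            raise RuntimeError("no surviving node to fail over to")
        survivor = sorted(self.alive)[0]
        for resource, host in self.placement.items():
            if host == node:
                self.placement[resource] = survivor

# Assumed starting placement: one resource per server.
cluster = Cluster(["Alice", "Betty"], {"Web site": "Alice", "Database": "Betty"})
cluster.fail("Alice")
print(cluster.placement)  # {'Web site': 'Betty', 'Database': 'Betty'}
```

A real cluster must also detect the failure, move network names and addresses, and recover on-disk state, which this sketch leaves out.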

Page 38:

Summary

• SMP Scale UP: OK but limited

• Cluster Scale OUT: OK and unlimited

• Manageability:
– fault tolerance OK & easy!
– more needed

• CyberBricks work

• Manual Federation now

• Automatic in future

[Figure: Scale Out, Scale Up, and Scale Down axes]

Page 39:

Scalability Research Problems

• Automatic everything

• Scaleable applications
– Parallel programming with clusters
– Harvesting cluster resources

• Data and process placement
– auto load balance
– dealing with scale (thousands of nodes)

• High-performance DCOM
– active messages meet ORBs?

• Process pairs, other FT concepts?

• Real time: instant failover

• Geographic (WAN) failover