Systems Issues for Scalable, Fault Tolerant Internet Services
Yatin Chawathe
Eric Brewer
To appear in Middleware ’98
http://www.cs.berkeley.edu/~yatin/papers/sns-crc.ps



Motivation
• Proliferation of network-based services
• Two critical issues must be addressed by Internet services:
  – System scalability
    • Incremental and linear scalability
  – Availability and fault tolerance
    • 24x7 operation

A Reusable SNS Framework
• Clusters of workstations are ideal for Internet services [FGC+97]
• But clusters are difficult to manage
  – To ensure linear scalability, the service must distribute load across the cluster
  – The service must grow the cluster with increasing load
  – Partial failures within a cluster complicate fault management
• Isolate common requirements of cluster-based Internet apps into a reusable substrate: the Scalable Network Services (SNS) framework

Architecture
[Diagram. Labels: SNS Manager, Internal Network, Worker and Worker Driver (repeated for each worker), Outside World.]

Workers
• Workers are grouped into classes. Within a class, workers are identical
• Workers can receive tasks from the outside world, or from other workers
• Workers have a simple serial interface for tasks
  – The originator sends a task to the consumer by specifying the class and inputs for the task
  – Tasks are atomic and restartable
  – Worker Drivers present a narrow interface between the SNS substrate and the worker application
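The task interface on this slide can be illustrated with a minimal sketch. The names `Task`, `WorkerDriver`, and `handle` are hypothetical and do not come from the slides; the sketch only assumes that a task names a worker class and carries inputs, and that a failed task may be rerun from the same inputs because tasks are atomic and restartable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Task:
    """Hypothetical task shape: a worker class name plus opaque inputs."""
    worker_class: str
    inputs: bytes


class WorkerDriver:
    """Hypothetical narrow interface between the substrate and worker code."""

    def __init__(self, worker_class: str, handler: Callable[[bytes], bytes]):
        self.worker_class = worker_class
        self._handler = handler  # the application-specific worker function

    def handle(self, task: Task, max_attempts: int = 2) -> bytes:
        """Run one task serially. Because tasks are assumed atomic and
        restartable, a failed attempt is retried from the unchanged inputs."""
        if task.worker_class != self.worker_class:
            raise ValueError("task sent to the wrong worker class")
        last_error = None
        for _ in range(max_attempts):
            try:
                return self._handler(task.inputs)
            except Exception as exc:  # any worker failure triggers a rerun
                last_error = exc
        raise RuntimeError("task failed after retries") from last_error
```

For example, `WorkerDriver("upper", lambda b: b.upper()).handle(Task("upper", b"abc"))` runs the handler once and returns its result; the application code never sees the substrate.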

Centralized SNS Manager
• SNS Manager is intentionally centralized
  – makes it easier to reason about and implement the various policies
  – “all” we need to do is ensure the fault tolerance of the manager, and make sure it is not a performance bottleneck
• Three key functions
  – Resource location
  – Load balancing and scalability
  – Fault tolerance

Resource Location
[Diagram. Labels: Worker and Worker Driver (two instances), SNS Manager, Multicast Beacons, Register, Find, Found, Persistent Connection.]
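Only the diagram labels survive for this slide, so the following is a loose sketch of how the Register and Find/Found exchanges might fit together, not the authors' protocol. It is an in-memory stand-in: no real multicast is performed, and the names `ManagerRegistry`, `beacon`, `register`, `find`, and `on_beacon` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class ManagerRegistry:
    """In-memory stand-in for the manager's table of registered workers."""

    def __init__(self, address: str):
        self.address = address
        self._workers: Dict[str, List[str]] = defaultdict(list)

    def beacon(self) -> dict:
        """Beacon contents; a real system would multicast this periodically."""
        return {"manager": self.address}

    def register(self, worker_class: str, worker_addr: str) -> None:
        """Record a worker; repeated registrations are idempotent."""
        if worker_addr not in self._workers[worker_class]:
            self._workers[worker_class].append(worker_addr)

    def find(self, worker_class: str) -> Optional[str]:
        """Answer a Find request with a Found address, or None if no worker
        of that class is registered."""
        workers = self._workers.get(worker_class)
        return workers[0] if workers else None


def on_beacon(beacon: dict, managers: Dict[str, ManagerRegistry],
              worker_class: str, worker_addr: str) -> None:
    """A worker that hears a beacon learns the manager's address and
    registers with it (the dictionary lookup stands in for a connection)."""
    managers[beacon["manager"]].register(worker_class, worker_addr)
```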

Load Balancing
• Load measurement and reporting
  – Each worker examines incoming requests and estimates the “load” that would be generated
  – Simplest load metric: queue length at workers
  – Workers periodically report their current load to the SNS Manager
  – SNS Manager maintains load history and aggregates load reports from all workers
  – Load reports are piggybacked on manager beacons to the rest of the system
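As a rough illustration of the reporting path above, the sketch below keeps a short history of queue-length reports per worker and produces one aggregated value per worker. The class name `LoadTable`, the history length, and the choice of a plain average are assumptions made for the example; the slides do not say how the history is aggregated.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class LoadTable:
    """Hypothetical manager-side table of recent load reports per worker."""

    def __init__(self, history: int = 4):
        # Each worker gets a bounded history; old reports fall off the end.
        self._reports: Dict[str, Deque[int]] = defaultdict(
            lambda: deque(maxlen=history))

    def report(self, worker: str, queue_length: int) -> None:
        """Record one periodic load report from a worker."""
        self._reports[worker].append(queue_length)

    def snapshot(self) -> Dict[str, float]:
        """Per-worker average over the kept history (averaging is an
        assumption), suitable for attaching to the next manager beacon."""
        return {w: sum(h) / len(h) for w, h in self._reports.items()}
```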

Load Balancing
• Each worker makes local load balancing decisions
• Use lottery scheduling: the number of tickets is inversely proportional to worker load
• Stale load reports can cause oscillations
  – Use a correction factor based on the number of requests that were sent since the last load report
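A minimal sketch of lottery-based selection with a staleness correction, under stated assumptions: tickets are computed as `1 / (1 + estimated load)`, and the correction simply adds the number of requests sent to a worker since its last report. Both formulas and the name `LotteryBalancer` are illustrative choices, not taken from the slides.

```python
import random
from typing import Dict, Optional


class LotteryBalancer:
    """Hypothetical sender-side balancer using weighted random choice."""

    def __init__(self, rng: Optional[random.Random] = None):
        self._rng = rng or random.Random()
        self._reported: Dict[str, float] = {}
        self._sent_since_report: Dict[str, int] = {}

    def update(self, loads: Dict[str, float]) -> None:
        """Install a fresh load report and reset the correction counters."""
        self._reported = dict(loads)
        self._sent_since_report = {w: 0 for w in loads}

    def estimated_load(self, worker: str) -> float:
        """Reported load plus requests sent since that report."""
        return self._reported[worker] + self._sent_since_report[worker]

    def tickets(self, worker: str) -> float:
        # The +1 avoids division by zero for an idle worker (an assumption).
        return 1.0 / (1.0 + self.estimated_load(worker))

    def pick(self) -> str:
        """Hold a lottery: workers with lower estimated load win more often."""
        workers = list(self._reported)
        weights = [self.tickets(w) for w in workers]
        chosen = self._rng.choices(workers, weights=weights, k=1)[0]
        self._sent_since_report[chosen] += 1
        return chosen
```

Without the counter, every sender would keep favoring the worker that looked idle in the last report until the next report arrived; counting its own sends lets each sender discount that worker in the meantime.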

Auto-launch for Scalability
• Worker replication to handle short traffic bursts
  – Multiple workers handle requests in parallel
  – If load on a class of workers gets too high, the SNS Manager launches a new one
• Overflow pool for long bursts
  – non-dedicated set of machines (e.g. users’ desktop machines)
  – when all dedicated nodes are exhausted, harness an overflow node; release it after the burst subsides
  – useful for incremental scalability
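The launch policy on this slide might be sketched as follows. The thresholds, the names `NodePool` and `autoscale_step`, and the string results are all hypothetical; the sketch only shows dedicated nodes being used before overflow nodes, and overflow nodes being released once load drops.

```python
from typing import List, Optional


class NodePool:
    """Hypothetical pool: dedicated nodes are used before overflow nodes."""

    def __init__(self, dedicated: List[str], overflow: List[str]):
        self._free_dedicated = list(dedicated)
        self._free_overflow = list(overflow)
        self._overflow_in_use: List[str] = []

    def acquire(self) -> Optional[str]:
        """Hand out a dedicated node if any is free, else an overflow node."""
        if self._free_dedicated:
            return self._free_dedicated.pop(0)
        if self._free_overflow:
            node = self._free_overflow.pop(0)
            self._overflow_in_use.append(node)
            return node
        return None

    def release_overflow(self) -> Optional[str]:
        """Return the most recently harnessed overflow node to the pool."""
        if not self._overflow_in_use:
            return None
        node = self._overflow_in_use.pop()
        self._free_overflow.append(node)
        return node


def autoscale_step(avg_load: float, pool: NodePool,
                   high: float = 10.0, low: float = 2.0) -> Optional[str]:
    """Return 'launch:<node>', 'release:<node>', or None for no action.
    The thresholds are illustrative, not values from the slides."""
    if avg_load > high:
        node = pool.acquire()
        return f"launch:{node}" if node else None
    if avg_load < low:
        node = pool.release_overflow()
        return f"release:{node}" if node else None
    return None
```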

Fault Tolerance
• Starfish fault tolerance
  – “Peer” monitoring as opposed to primary/secondary fault tolerance
• Two mechanisms:
  – Timeouts and retries
  – Preemptive detection and component restart
• Reliance on soft state simplifies crash recovery
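One way to picture timeout-based peer monitoring over soft state is a table whose entries survive only while they are refreshed. This is a sketch under assumptions: the name `SoftStateTable`, the explicit `now` argument (used instead of a wall clock so the example is deterministic), and the single timeout value are all illustrative.

```python
from typing import Dict, List


class SoftStateTable:
    """Entries live only as long as they keep being refreshed (soft state)."""

    def __init__(self, timeout: float):
        self.timeout = timeout
        self._last_seen: Dict[str, float] = {}

    def refresh(self, peer: str, now: float) -> None:
        """Called whenever a beacon or report from the peer is heard."""
        self._last_seen[peer] = now

    def expire(self, now: float) -> List[str]:
        """Drop and return peers not heard from within the timeout; the
        caller could then restart those components."""
        dead = sorted(p for p, t in self._last_seen.items()
                      if now - t > self.timeout)
        for p in dead:
            del self._last_seen[p]
        return dead

    def alive(self) -> List[str]:
        return sorted(self._last_seen)
```

Because nothing in the table is persistent, a component that crashes and restarts starts with an empty table and rebuilds it from the messages it hears, which is the sense in which soft state keeps recovery simple.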

Fault Tolerance
[Diagram. Labels: Worker and Worker Driver (three instances), SNS Manager (shown repeatedly), AmRestarting, ReRegister.]

Example Applications
• TranSend
  – Web proxy for on-the-fly content distillation
• Wingman
  – The world’s only graphical web browser for the 3COM PalmPilot
• TopGun Mediaboard
  – PDA groupware: shared electronic whiteboard for the 3COM PalmPilot
• MARS
  – MBone archive server

Evaluation
[Plot: load (queue length), 0 to 18, versus time (seconds), 0 to 70, for Worker 1 and Worker 2.]

Evaluation
[Plot: load (queue length), 0 to 18, versus time (seconds), 0 to 70, for Worker 1 and Worker 2.]

Evaluation
[Plot: queue length, 0 to 25, and offered load (requests/second), 0 to 60, versus time (seconds), 0 to 800. Series: Worker 1 through Worker 5 and Offered Load. Annotations: Worker 2 started; Worker 3 started; Workers 4 & 5 started.]

Summary
• Reusable architecture substrate for building Internet service applications
• Application developers program their services to a well-defined narrow interface
• SNS takes care of resource location, spawning, load balancing, fault tolerance
• Number of interesting applications on top of the SNS substrate
• Next step: SNSv2 NINJA