
9/6 2001 Network Computing Lab EECS KAIST


Deployment of cluster system and load balancing technique

Junehwa Song


Contents

Part I : Cluster system
  Issues on Web Server
  Why a cluster is needed
  Methods of increasing performance
  Overview of cluster
  Types of cluster
  Example of cluster

Part II : Load balancing technique
  Mirror
  Client based approach
  DNS based approach
  Dispatcher based approach
    Single packet rewriting
    Double packet rewriting
    Network dispatcher
    LVS connection scheduling
  Server based approach
    HTTP redirection
    Packet redirection

Reference


Issues on Web Server

Connection explosion
  Due to the rapid growth of WWW applications on the Internet, a web server may receive a huge number of connection requests in a very short time.

Research trends on web servers
  Cluster system
  Load balancing
  Distributed Scalable Web Server


Why a cluster is needed

Meet the demand for scalability and availability
  It is not always possible for a Web site to accurately predict peak load and prepare enough computing resources, because client request rates tend to be bursty and fluctuate dramatically.


Traffic increases 100% per year, but server performance does not increase at a comparable rate.

[Figure: network traffic versus server performance]


Methods of increasing performance

Vertical performance increase
  Upgrade CPU, memory, HDD, etc.

Horizontal performance increase
  Use a cluster and add nodes

Cost problem
  A high-performance server is very expensive
  Performance/Cost < 1
  Availability problem caused by faults


Requirements
  Low cost, high availability, high performance, and extensibility

So a cluster is needed


Overview of Cluster

Single point of presence
  Many small machines behave as one large machine
  They share a virtual IP address

Availability
  The service as a whole must be available despite transient partial hardware and software failures


Fail over
  The cluster automatically relocates an application from a failed node to a healthy one
  When a fail over occurs, clients may see a brief interruption of service, but they are not aware that the application has been rehosted on a different physical cluster node
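A minimal sketch of the fail-over behaviour described above, in Python. The class name `Cluster`, its methods, and the node and application names are hypothetical and chosen only for illustration. Real cluster managers detect failures through mechanisms such as heartbeats, which this sketch does not model.

```python
from typing import Dict, List


class Cluster:
    """Toy cluster that relocates applications away from failed nodes."""

    def __init__(self, nodes: List[str]) -> None:
        self.healthy: Dict[str, bool] = {node: True for node in nodes}
        self.placement: Dict[str, str] = {}  # application -> hosting node

    def start(self, app: str, node: str) -> None:
        """Host an application on the given node."""
        self.placement[app] = node

    def _pick_healthy(self) -> str:
        """Return the first node still marked healthy."""
        for node, ok in self.healthy.items():
            if ok:
                return node
        raise RuntimeError("no healthy node left")

    def mark_failed(self, node: str) -> None:
        """Record a node failure and rehost its applications elsewhere."""
        self.healthy[node] = False
        for app, host in list(self.placement.items()):
            if host == node:
                self.placement[app] = self._pick_healthy()


cluster = Cluster(["node1", "node2"])
cluster.start("web", "node1")
cluster.mark_failed("node1")
print(cluster.placement["web"])
```

The client keeps using the same service name throughout; only the internal placement changes.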

Scalability
  When the load offered to the service increases, the system can be scaled to meet the requirement


Availability grade by downtime

Availability   Accumulated downtime per year   Grade
90%            About 36.5 days                 1
99%            Under 4 days                    2
99.9%          Under 9 hours                   3
99.99%         Within 1 hour                   4
99.999%        About 5 minutes                 5
99.9999%       About 30 seconds                6
99.99999%      About 3 seconds                 7
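The downtime figures in the table follow from simple arithmetic: yearly downtime is (1 - availability) multiplied by the length of a year. A short Python sketch, assuming a 365-day year (the function name is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


for pct in (90, 99, 99.9, 99.99, 99.999, 99.9999, 99.99999):
    seconds = downtime_seconds_per_year(pct)
    print(f"{pct}% -> {seconds:,.1f} seconds per year")
```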


Types of cluster

Cluster for scaling and availability
  Loosely coupled
  Horizontally scaling cluster
  Systems are not aware of other systems

Cluster for performance
  For high performance computing (HPC); focus on performance and scalability
  Tightly coupled; availability is not a goal
  Scientific clusters: biology, physics, engineering


Load balancer
  Front end to the service as seen by the outside world
  Directs network connections from clients, who know a single IP address

Server pool
  Cluster of servers that implement the actual services

Backend storage (optional)
  Provides shared storage for the servers


Example of Cluster

LVS (Linux Virtual Server)
  Open source, since 1998
  Connection scheduling (see Part II)

Cluster management
  Piranha by Red Hat
  lvs-gui + heartbeat + ldirectord
  mon + heartbeat

Deployment
  linux.com, sourceforge.net, www.zope.org, wwwcache.ja.net


LVS - Architecture

[Figure: LVS architecture. Clients connect over the Internet to the load balancer (Linux Directors, linked by heartbeat), which forwards requests to the server cluster (real servers 1 to N); a file server provides a fault-tolerant file system.]


Load Balancing Technique

Mirror
Client based approach
DNS based approach
Dispatcher based approach
Server based approach


Mirror

Replicates information across a mirrored server architecture
The user manually selects an alternative URL
Not user transparent
Does not allow the Web-server system to control request distribution


Client Based Approach

Web client
  The Web client selects a node of the cluster and submits the request to the selected node
  The Netscape home page (http://www.netscape.com) uses this technique
    When a user accesses this site, Navigator selects a random number i between 1 and the number of servers and directs the request to the node wwwi.netscape.com
  Limited practical applicability, and not scalable
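The client-side selection described above can be sketched in a few lines of Python. The function name `pick_server` is hypothetical; the only details taken from the slide are the random index i between 1 and the number of servers and the host name pattern wwwi.netscape.com. Navigator's actual code is not shown in the source.

```python
import random
from typing import Optional


def pick_server(num_servers: int, rng: Optional[random.Random] = None) -> str:
    """Pick a random node name of the form www<i>.netscape.com."""
    rng = rng or random.Random()
    i = rng.randint(1, num_servers)  # inclusive on both ends
    return f"www{i}.netscape.com"


print(pick_server(4))
```

The client must know the number of servers in advance, which is one reason the approach scales poorly.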


Smart client
  Migrates server functionality to the client through a Java applet
  Increases network traffic and network delay

Client-side proxies
  From the Web cluster's standpoint, proxy servers are similar to clients


DNS Based Approach

First prototype: NCSA scalable web server
The DNS server maps the domain name to multiple IP addresses
  It returns more than one IP address for the hostname, or returns a different IP address for each DNS request it receives (round robin)
User transparent
Simple and easy to implement
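A minimal sketch of round-robin name resolution, in Python. This is a toy in-memory resolver, not a real DNS server; the class name `RoundRobinDNS`, the host name, and the addresses (taken from the documentation address range) are assumptions made for illustration.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class RoundRobinDNS:
    """Toy resolver that hands out a different address on each request."""

    def __init__(self, records: Dict[str, List[str]]) -> None:
        # One endless iterator per host name, cycling over its addresses.
        self._cycles: Dict[str, Iterator[str]] = {
            name: cycle(addresses) for name, addresses in records.items()
        }

    def resolve(self, hostname: str) -> str:
        """Return the next address for the host name in round-robin order."""
        return next(self._cycles[hostname])


dns = RoundRobinDNS({"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
for _ in range(4):
    print(dns.resolve("www.example.com"))
```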


Drawbacks
  Unable to know the state of the whole system
  Not really fair, because DNS uses a simple round robin
  DNS may encounter a TTL problem in the IP-address cache
    Between the client and the Web server's DNS, many intermediate name servers can cache the logical-name-to-IP-address mapping to reduce network traffic, and every Web browser typically caches some address resolutions
    Because of address caching, each address mapping can cause a burst of future requests to the selected server and quickly make the current load information obsolete
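The caching effect can be illustrated with a small Python sketch. An intermediate name server keeps a mapping until its TTL expires, so every client behind it is sent to the same server during that window, whatever the authoritative DNS would have answered. The class name `CachingNameServer`, the 60-second TTL, and the addresses are assumptions; time is passed in explicitly to keep the example deterministic.

```python
from itertools import cycle
from typing import Callable, Dict, Tuple


class CachingNameServer:
    """Toy intermediate name server that caches answers for `ttl` seconds."""

    def __init__(self, upstream: Callable[[str], str], ttl: float) -> None:
        self.upstream = upstream
        self.ttl = ttl
        self._cache: Dict[str, Tuple[str, float]] = {}  # name -> (address, expiry)

    def resolve(self, hostname: str, now: float) -> str:
        cached = self._cache.get(hostname)
        if cached is not None and now < cached[1]:
            return cached[0]  # served from cache; upstream is not consulted
        address = self.upstream(hostname)
        self._cache[hostname] = (address, now + self.ttl)
        return address


# Authoritative DNS doing plain round robin over three servers.
_addresses = cycle(["192.0.2.1", "192.0.2.2", "192.0.2.3"])


def authoritative(hostname: str) -> str:
    return next(_addresses)


name_server = CachingNameServer(authoritative, ttl=60)
for t in (0, 10, 20, 59, 60):
    print(t, name_server.resolve("www.example.com", now=t))
```

All requests made before second 60 go to the first address; the round robin only advances once the cached entry expires.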

Many DNS-based solutions to this problem
  System-stateless algorithms
  Server-state-based algorithms
  Client-state-based algorithms
  Adaptive TTL algorithms


Dispatcher based approach


Centralizes request scheduling and completely controls client-request routing
Request routing among servers is transparent, unlike in the DNS-based approach
  DNS deals with addresses at the URL level; the dispatcher has a single virtual IP address (IP-SVA)
The dispatcher uniquely identifies each server in the system through a private address
The dispatcher typically uses simple algorithms to select the Web server


Single packet rewriting
Double packet rewriting
Network dispatcher


Single Packet Rewriting


The TCP router acts as an IP address dispatcher
  The router tracks the source IP address of every established TCP connection so that packets belonging to the same connection are routed to the same Web server node

High system availability
  When one of the servers fails, its address can be removed from the router's table

Can be combined with a DNS-based solution
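A rough sketch of the routing table described above, in Python. The names `Packet` and `TcpRouter`, the addresses, and the round-robin choice for new clients are assumptions made for illustration; the slide only states that the router tracks source IP addresses and can remove failed servers from its table. Only the destination address of inbound packets is rewritten here; how the reply reaches the client is outside this sketch.

```python
from dataclasses import dataclass, replace
from typing import Dict, List

VIRTUAL_IP = "203.0.113.10"  # hypothetical virtual address of the cluster


@dataclass(frozen=True)
class Packet:
    src: str  # client IP address
    dst: str  # destination IP address


class TcpRouter:
    """Toy TCP router that pins each client to one server node."""

    def __init__(self, server_ips: List[str]) -> None:
        self._servers = list(server_ips)
        self._table: Dict[str, str] = {}  # client IP -> server IP
        self._counter = 0

    def route(self, packet: Packet) -> Packet:
        """Rewrite the destination from the virtual IP to a server address."""
        if packet.dst != VIRTUAL_IP:
            raise ValueError("packet is not addressed to the virtual IP")
        server = self._table.get(packet.src)
        if server is None:
            # New client: pick the next server in turn and remember it.
            server = self._servers[self._counter % len(self._servers)]
            self._counter += 1
            self._table[packet.src] = server
        return replace(packet, dst=server)

    def remove_server(self, server_ip: str) -> None:
        """Drop a failed server and forget the clients pinned to it."""
        self._servers.remove(server_ip)
        self._table = {c: s for c, s in self._table.items() if s != server_ip}


router = TcpRouter(["10.0.0.1", "10.0.0.2"])
print(router.route(Packet("198.51.100.7", VIRTUAL_IP)))
```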


Double Packet Rewriting


Two solutions use this approach
  Magicrouter
  Cisco Systems' LocalDirector

Because outgoing packets typically outnumber incoming request packets, the dispatcher becomes a bottleneck
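A sketch of a dispatcher that rewrites packets in both directions, as the name of this approach suggests: the destination address of inbound packets and the source address of outbound packets. It is written in Python under stated assumptions; the names `Packet` and `RewritingDispatcher`, the addresses, and the server choice are hypothetical and do not describe how Magicrouter or LocalDirector are implemented. The `packets_handled` counter is there only to show that every packet, in either direction, passes through the dispatcher.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class RewritingDispatcher:
    """Toy dispatcher that rewrites both inbound and outbound packets."""

    def __init__(self, virtual_ip: str, server_ips: List[str]) -> None:
        self.virtual_ip = virtual_ip
        self._servers = list(server_ips)
        self._table: Dict[str, str] = {}  # client IP -> server IP
        self._counter = 0
        self.packets_handled = 0

    def inbound(self, packet: Packet) -> Packet:
        """Client -> cluster: replace the virtual IP with a server address."""
        self.packets_handled += 1
        server = self._table.get(packet.src)
        if server is None:
            server = self._servers[self._counter % len(self._servers)]
            self._counter += 1
            self._table[packet.src] = server
        return replace(packet, dst=server)

    def outbound(self, packet: Packet) -> Packet:
        """Server -> client: replace the server address with the virtual IP."""
        self.packets_handled += 1
        return replace(packet, src=self.virtual_ip)


dispatcher = RewritingDispatcher("203.0.113.10", ["10.0.0.1", "10.0.0.2"])
request = dispatcher.inbound(Packet("198.51.100.7", "203.0.113.10"))
response = dispatcher.outbound(Packet(request.dst, "198.51.100.7"))
print(request, response, dispatcher.packets_handled)
```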


Network Dispatcher

The dispatcher forwards packets to the selected server using its physical address, without IP modification
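A small Python sketch of forwarding by physical address, assuming that "physical address" here means the link-layer (MAC) address. The `Frame` type, the `forward` function, and all addresses are hypothetical. The point is that only the link-layer destination changes, while the IP source and destination are left untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str  # link-layer destination
    src_ip: str   # client IP address
    dst_ip: str   # virtual IP address of the cluster


def forward(frame: Frame, server_mac: str) -> Frame:
    """Readdress the frame to the selected server without touching IP fields."""
    return replace(frame, dst_mac=server_mac)


incoming = Frame("02:00:00:00:00:01", "198.51.100.7", "203.0.113.10")
print(forward(incoming, "02:00:00:00:00:0a"))
```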


LVS connection scheduling

Round-robin scheduling
  Treats all real servers as equals, regardless of number of connections or response time

Weighted round-robin scheduling
  Handles real servers with different processing capacities
  Each server can be assigned a weight
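Both schedulers can be sketched in Python as follows. This is a simplified illustration and not the LVS implementation: the weighted variant simply repeats each server as many times as its weight, so picks for one server come in a run rather than being interleaved. Server names and weights are made up.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in turn, treating all of them as equals."""
    return cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Yield each server `weight` times per round (naive expansion)."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


rr = round_robin(["rs1", "rs2", "rs3"])
print([next(rr) for _ in range(6)])

wrr = weighted_round_robin({"rs1": 3, "rs2": 1})
print([next(wrr) for _ in range(8)])
```

With weights 3 and 1, the first server receives three out of every four new connections.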


Least-connection scheduling
  Directs network connections to the server with the fewest active connections

Weighted least-connection scheduling
  Superset of least-connection scheduling
  A performance weight can be assigned to each server
  A server with a higher weight value receives a larger percentage of active connections
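A minimal Python sketch of the two selection rules. The weighted rule used here, picking the server with the smallest ratio of active connections to weight, is an assumption about how the weight is applied; the slide only says that higher-weight servers receive a larger share. Function names, server names, and counts are illustrative.

```python
from typing import Dict


def least_connection(active: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=lambda server: active[server])


def weighted_least_connection(active: Dict[str, int], weights: Dict[str, int]) -> str:
    """Pick the server with the smallest connections-to-weight ratio."""
    return min(active, key=lambda server: active[server] / weights[server])


active_connections = {"rs1": 30, "rs2": 12}
print(least_connection(active_connections))
print(weighted_least_connection(active_connections, {"rs1": 4, "rs2": 1}))
```

With equal weights the second rule reduces to the first, which is why it is described as a superset.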


Server based approach

Uses a two-level dispatching mechanism
  Integrates the DNS-based approach with redirection techniques executed by the Web servers
Solves most DNS scheduling problems

Two solutions
  HTTP redirection
  Packet redirection


HTTP Redirection


In the figure above, server1 redirects the request to server2.

Not client transparent!
Overhead of intra-cluster communication
  Every server must periodically transmit status information to the cluster DNS
Increases response time on the client side, because of the redirection
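A sketch of the redirection decision in Python. The function `handle_request`, the load threshold, and the peer host names are hypothetical; the slides do not specify when a server decides to redirect. The function returns an HTTP status code and headers rather than running a real server: 200 when the request is served locally, or 302 with a `Location` header pointing at the least-loaded peer.

```python
from typing import Dict, Tuple


def handle_request(
    path: str,
    local_load: int,
    peer_loads: Dict[str, int],
    threshold: int,
) -> Tuple[int, Dict[str, str]]:
    """Serve locally when under the threshold, otherwise redirect to a peer."""
    if local_load < threshold or not peer_loads:
        return 200, {}
    target = min(peer_loads, key=lambda host: peer_loads[host])
    return 302, {"Location": f"http://{target}{path}"}


print(handle_request("/index.html", 12, {"server2.example.com": 3}, threshold=10))
```

The client sees the redirect and issues a second request, which is why this approach is not client transparent and adds to response time.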


Packet Redirection

Uses a round-robin DNS mechanism to schedule requests among the Web servers
The server reached by a request reroutes the connection to another server through packet rewriting
Transparent to the client!
TCP handoff
Packet rewriting overhead


Reference

[1] Cardellini, V.; Colajanni, M.; Yu, P.S. "Dynamic load balancing on Web-server systems", IEEE Internet Computing, vol. 3, no. 3, May-June 1999, pp. 28-39
[2] Wow Linux. "Building a Linux-based high-availability load-balancing Web service" (original title: "Linux 기반의 고가용 로드밸런싱 웹 서비스 구축방안")
[3] Wensong Zhang. "Linux Virtual Server for Scalable Network Service", www.linuxVirtualServer.org
[4] Sun Microsystems. "Sun Cluster 3 architecture", www.sun.com
[5] Alistair A. Croll. "Optimizing web server access for E-business", Intel Devcon
[6] Hong, H.C.; Chen, Y.C. "Design and practice of a dispatch server architecture", Proceedings of the 7th IEEE Workshop on Future Trends of Distributed Computing Systems, 1999, pp. 246-251
[7] Mourad, A.; Huiqun Liu. "Scalable Web server architectures", Proceedings of the Second IEEE Symposium on Computers and Communications, 1997, pp. 12-16
