9/6 2001 Network Computing Lab EECS KAIST
1
Deployment of Cluster Systems and Load Balancing Techniques
Junehwa Song
9/6 2001 2
Contents (1)
Part I: Cluster system
- Issues on Web Server
- Why a cluster is needed
- Methods of performance increase
- Overview of cluster
- Types of cluster
- Examples of cluster
9/6 2001 3
Part II: Load balancing technique
- Mirror
- Client based approach
- DNS based approach
- Dispatcher based approach
  - Single packet rewriting
  - Double packet rewriting
  - Network dispatcher
  - LVS connection scheduling
- Server based approach
  - HTTP redirection
  - Packet redirection
Reference
9/6 2001 4
Issues on Web Server
- Connection explosion
  - Due to the rapid growth of WWW applications on the Internet, a web server may receive a huge number of connection requests in a very short time
- Research trends on web servers
  - Cluster systems
  - Load balancing
  - Distributed scalable web servers
9/6 2001 5
Why a cluster is needed
- Meet the demand for scalability and availability
  - A Web site cannot always predict its peak load accurately and prepare enough computing resources
  - Client request rates tend to be bursty and fluctuate dramatically
9/6 2001 6
- Traffic increases 100% per year
- Server performance does not increase at a rate comparable to the traffic increase
[Figure: Network Traffic vs. Server Performance]
9/6 2001 7
Methods of performance increase
- Vertical performance increase
  - Upgrade CPU, memory, HDD, etc.
- Horizontal performance increase
  - Use a cluster: add nodes
- Cost problem
  - A high-performance server is very expensive
  - Performance/Cost < 1
- Availability problem when a fault occurs
9/6 2001 8
Requirements
- Low cost, high availability, high performance, and extensibility
So a cluster is needed
9/6 2001 9
Overview of Cluster
- Single point of presence
  - Many small machines behave as one large machine
  - They share a virtual IP address
- Availability
  - The service as a whole must remain available despite transient partial hardware and software failures
9/6 2001 10
- Failover
  - The cluster automatically relocates an application from a failed node to a healthy one
  - When a failover occurs, clients may see a brief interruption of service, but they are not aware that the application has been rehosted on a different physical cluster node
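The failover behaviour described here can be sketched as a small heartbeat monitor. This is only an illustration: the class name, the timeout value, and the rule "move to the first healthy node" are assumptions, not how any particular cluster manager works.

```python
class FailoverMonitor:
    """Hypothetical monitor: relocates applications away from silent nodes."""

    def __init__(self, nodes, timeout, now=0.0):
        self.timeout = timeout                       # seconds without a heartbeat
        self.last_seen = {node: now for node in nodes}
        self.placement = {}                          # application -> hosting node

    def heartbeat(self, node, now):
        # A node reports that it is alive at time `now` (seconds).
        self.last_seen[node] = now

    def place(self, app, node):
        self.placement[app] = node

    def healthy_nodes(self, now):
        return [n for n, seen in self.last_seen.items()
                if now - seen <= self.timeout]

    def check(self, now):
        """Move every application on a failed node to a healthy one."""
        healthy = self.healthy_nodes(now)
        moved = {}
        for app, node in list(self.placement.items()):
            if node not in healthy and healthy:
                self.placement[app] = healthy[0]     # assumed policy: first healthy node
                moved[app] = healthy[0]
        return moved


monitor = FailoverMonitor(["node-a", "node-b"], timeout=3.0)
monitor.place("web", "node-a")
monitor.heartbeat("node-b", now=5.0)     # node-a has been silent since t=0
print(monitor.check(now=6.0))            # "web" is relocated to node-b
```

Clients keep using the same service address throughout; only the placement table changes.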
- Scalability
  - When the load offered to the service increases, the system can be scaled to meet the requirement
9/6 2001 11
Availability grade by downtime

Availability   Accumulated downtime per year   Grade
90%            About 1 month                   1
99%            Under 4 days                    2
99.9%          Under 9 hours                   3
99.99%         Within 1 hour                   4
99.999%        About 5 minutes                 5
99.9999%       About 30 seconds                6
99.99999%      About 3 seconds                 7
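The downtime figures in the table are rounded; they follow from multiplying the unavailable fraction by the length of a year. A minimal sketch, assuming a 365-day year (the function name is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a 365-day year


def downtime_seconds_per_year(availability_percent):
    """Accumulated downtime per year implied by an availability percentage."""
    return (1 - availability_percent / 100) * SECONDS_PER_YEAR


for grade, availability in enumerate(
        [90, 99, 99.9, 99.99, 99.999, 99.9999, 99.99999], start=1):
    seconds = downtime_seconds_per_year(availability)
    print(f"grade {grade}: {availability}% -> {seconds:,.1f} s "
          f"({seconds / 3600:,.2f} h)")
```

For example, 99.9% availability leaves 0.1% of the year, which is about 8.76 hours.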
9/6 2001 12
Types of cluster
- Cluster for scaling and availability
  - Loosely coupled
  - Horizontally scaling cluster
  - Systems are not aware of other systems
- Cluster for performance
  - For high performance computing (HPC), with a focus on performance and scalability
  - Tightly coupled, no availability
  - Scientific clusters: biology, physics, engineering
9/6 2001 14
- Load balancer
  - Front end to the service as seen by the outside world
  - Directs network connections from clients, who know a single IP address
- Server pool
  - Cluster of servers that implement the actual services
- Backend storage (optional)
  - Provides shared storage for the servers
9/6 2001 15
Examples of Cluster
- LVS (Linux Virtual Server)
  - Open source, since 1998
  - Connection scheduling (see Part II)
  - Cluster management
    - Piranha by Red Hat
    - lvs-gui + heartbeat + ldirectord
    - mon + heartbeat
  - Deployment
    - Linux.com, sourceforge.net, www.zope.org, wwwcache.ja.net
9/6 2001 16
LVS - Architecture
[Figure: LVS architecture. Labels: Client, Internet, Load Balancer (Linux Director, Linux Director, Heartbeat), Server Cluster (Real server 1, Real server 2, ..., Real server N), File Server (Fault-tolerant file system)]
9/6 2001 17
Load Balancing Technique
- Mirror
- Client based approach
- DNS based approach
- Dispatcher based approach
- Server based approach
9/6 2001 18
Mirror
- Replicate information across a mirrored server architecture
- The user manually selects an alternative URL
- Not user transparent
- Does not allow the Web-server system to control request distribution
9/6 2001 19
Client Based Approach
- Web client
  - The web client selects a node of the cluster and submits the request to the selected node
  - The Netscape home page (http://www.netscape.com) uses this technique
    - When a user accesses this site, Navigator selects a random number i between 1 and the number of servers and directs the request to the node wwwi.netscape.com
  - Limited practical applicability and not scalable
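The random selection performed by the client can be sketched in a few lines. The function name and the server count are illustrative; only the wwwi.netscape.com naming pattern comes from the slide.

```python
import random


def pick_server(num_servers, rng=random):
    """Choose i uniformly in 1..num_servers and build the node's hostname."""
    i = rng.randint(1, num_servers)      # randint is inclusive on both ends
    return f"www{i}.netscape.com"


print(pick_server(5))   # one of www1.netscape.com ... www5.netscape.com
```

The client must know the number of servers in advance, which is one reason the approach scales poorly.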
9/6 2001 20
- Smart client
  - Migrates server functionality to the client through a Java applet
  - Increases network traffic and network delay
- Client-side proxies
  - From the Web cluster standpoint, proxy servers are similar to clients
9/6 2001 21
DNS Based Approach
- First prototype: NCSA scalable web server
- The DNS server maps the domain name to multiple IP addresses
- It returns more than one IP address for the hostname, or returns a different IP address for each DNS request it receives (round robin)
- User transparent
- Simple and easy to implement
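Round-robin resolution can be sketched as a name server that rotates its address list after every answer. The class name, hostname, and addresses are made up for illustration; a real DNS server is considerably more involved.

```python
from collections import deque


class RoundRobinDNS:
    """Toy name server: rotates the address list on every query."""

    def __init__(self, records):
        # records: hostname -> list of server IP addresses
        self._records = {name: deque(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        addrs = self._records[name]
        answer = list(addrs)     # current ordering; clients usually take the first
        addrs.rotate(-1)         # the next query sees the next server first
        return answer


dns = RoundRobinDNS({"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
for _ in range(4):
    print(dns.resolve("www.example.com")[0])
```

Successive queries see 192.0.2.1, 192.0.2.2, 192.0.2.3, and then 192.0.2.1 again.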
9/6 2001 24
- Drawbacks
  - Unable to know the state of the whole system
  - Not really fair, because DNS uses simple round robin
  - DNS may encounter the TTL problem in the IP-address cache
    - Between the client and the web server's DNS, many intermediate name servers can cache the logical-name-to-IP-address mapping to reduce network traffic, and every web browser typically caches some address resolutions
9/6 2001 25
- Because of address caching, each address mapping can cause a burst of future requests to the selected server and quickly make the current load information obsolete
- Many DNS based solutions to this problem
  - System-stateless algorithms
  - Server-state-based algorithms
  - Client-state-based algorithms
  - Adaptive TTL algorithms
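The effect of address caching on round-robin DNS can be seen in a small simulation. Everything here is an assumption made for illustration: one intermediate cache, one request per second, three servers, and the names used in the code.

```python
from collections import Counter
from itertools import cycle


def make_round_robin(addresses):
    """Authoritative DNS stand-in: hands out addresses in rotation."""
    ring = cycle(addresses)
    return lambda name: next(ring)


class CachingResolver:
    """Intermediate name server that reuses a mapping until its TTL expires."""

    def __init__(self, upstream, ttl):
        self._upstream = upstream
        self._ttl = ttl
        self._cache = {}                     # name -> (address, expiry time)

    def resolve(self, name, now):
        entry = self._cache.get(name)
        if entry and now < entry[1]:
            return entry[0]                  # cached: the authoritative DNS is bypassed
        address = self._upstream(name)
        self._cache[name] = (address, now + self._ttl)
        return address


def simulate(num_requests, ttl):
    authoritative = make_round_robin(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    cache = CachingResolver(authoritative, ttl)
    hits = Counter()
    for second in range(num_requests):       # one request per second
        hits[cache.resolve("www.example.com", now=second)] += 1
    return hits


print(simulate(15, ttl=0))    # no caching: 5 requests per server
print(simulate(15, ttl=10))   # caching: 10 requests hit one server, 5 the next
```

With a TTL of 10 seconds the first server absorbs a burst of 10 consecutive requests before the authoritative DNS is consulted again.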
9/6 2001 26
Dispatcher based approach
9/6 2001 27
- Centralizes request scheduling and completely controls client-request routing
- Request routing among servers is transparent, unlike the DNS based approach
  - DNS deals with addresses at the URL level; the dispatcher has a single virtual IP address (IP-SVA)
- The dispatcher uniquely identifies each server in the system through a private address
- The dispatcher typically uses simple algorithms to select the Web server
9/6 2001 28
- Single packet rewriting
- Double packet rewriting
- Network Dispatcher
9/6 2001 29
Packet Single Rewriting
9/6 2001 30
- The TCP router acts as an IP address dispatcher
  - The router tracks the source IP address of every established TCP connection, so that packets belonging to the same connection are routed to the same web server node
- High system availability
  - When one of the servers fails, its address can be removed from the router's table
- Can be combined with a DNS based solution
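The router's connection table can be sketched as follows. This is a simplified model, not the actual TCP router: the class and field names are illustrative, connections are keyed by client address and port (an assumption), new connections are spread round robin, and only the inbound path is modelled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class TcpRouter:
    """Rewrites the destination of inbound packets; replies are not modelled."""

    def __init__(self, virtual_ip, server_ips):
        self.virtual_ip = virtual_ip
        self.servers = list(server_ips)
        self.connections = {}        # (client ip, client port) -> server ip
        self._next = 0

    def remove_server(self, server_ip):
        # A failed server is dropped from the table along with its connections.
        self.servers.remove(server_ip)
        self.connections = {k: v for k, v in self.connections.items()
                            if v != server_ip}

    def route(self, packet):
        key = (packet.src_ip, packet.src_port)
        if key not in self.connections:
            self.connections[key] = self.servers[self._next % len(self.servers)]
            self._next += 1
        # Single rewriting: only the destination address is changed.
        return replace(packet, dst_ip=self.connections[key])


router = TcpRouter("203.0.113.10", ["10.0.0.1", "10.0.0.2"])
syn = Packet("198.51.100.7", 40000, "203.0.113.10", 80)
print(router.route(syn).dst_ip)   # same server for every packet of this connection
print(router.route(syn).dst_ip)
```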
9/6 2001 31
Packet Double Rewriting
9/6 2001 32
- Two solutions using this approach
  - Magicrouter
  - Cisco Systems' LocalDirector
- Because outgoing packets typically outnumber incoming request packets, the dispatcher becomes a bottleneck
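A rough sketch of why the dispatcher becomes a bottleneck: with double rewriting it touches both directions, and the response direction carries most of the bytes. The class, the field names, and the packet sizes are illustrative assumptions, not Magicrouter or LocalDirector internals.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    size: int                    # payload size in bytes


class RewritingDispatcher:
    """Rewrites addresses in both directions and counts the bytes it handles."""

    def __init__(self, virtual_ip, server_ips):
        self.virtual_ip = virtual_ip
        self.server_ips = list(server_ips)
        self.assigned = {}       # client ip -> server ip
        self.bytes_handled = {"inbound": 0, "outbound": 0}

    def inbound(self, packet):
        # First rewrite: virtual IP -> private address of the chosen server.
        choice = self.server_ips[len(self.assigned) % len(self.server_ips)]
        server = self.assigned.setdefault(packet.src_ip, choice)
        self.bytes_handled["inbound"] += packet.size
        return replace(packet, dst_ip=server)

    def outbound(self, packet):
        # Second rewrite: server's private address -> virtual IP.
        self.bytes_handled["outbound"] += packet.size
        return replace(packet, src_ip=self.virtual_ip)


dispatcher = RewritingDispatcher("203.0.113.10", ["10.0.0.1", "10.0.0.2"])
request = Packet("198.51.100.7", "203.0.113.10", size=300)
forwarded = dispatcher.inbound(request)
response = Packet(forwarded.dst_ip, request.src_ip, size=15000)
print(dispatcher.outbound(response))
print(dispatcher.bytes_handled)
```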
9/6 2001 33
Network Dispatcher
- The dispatcher forwards packets to the selected server using its physical (MAC) address, without modifying the IP addresses
9/6 2001 34
LVS connection scheduling
- Round-robin scheduling
  - Treats all real servers as equals, regardless of number of connections or response time
- Weighted round-robin scheduling
  - Handles real servers with different processing capacities
  - Each server can be assigned a weight
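Both policies can be sketched with a cyclic iterator. The weighted version below is a deliberately naive simplification that repeats each server according to its weight; it is not the scheduler that LVS implements, and the function names are illustrative.

```python
from itertools import cycle


def round_robin(servers):
    """Yield servers in turn, ignoring load and response time."""
    return cycle(servers)


def weighted_round_robin(weights):
    """weights: server -> positive integer weight.

    Naive expansion: a server with weight w appears w times per cycle.
    """
    expanded = [server for server, w in weights.items() for _ in range(w)]
    return cycle(expanded)


rr = round_robin(["rs1", "rs2", "rs3"])
print([next(rr) for _ in range(4)])

wrr = weighted_round_robin({"rs1": 3, "rs2": 1})
print([next(wrr) for _ in range(8)])
```

With weights 3 and 1, rs1 receives three of every four connections.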
9/6 2001 35
- Least-connection scheduling
  - Directs network connections to the server with the least number of active connections
- Weighted least-connection scheduling
  - Superset of least-connection scheduling
  - A performance weight can be assigned to each server
  - A server with a higher weight value receives a larger percentage of active connections
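A minimal sketch of the two policies, assuming the weighted variant picks the server with the smallest ratio of active connections to weight. Function names and the sample numbers are illustrative.

```python
def least_connection(active):
    """active: server -> number of active connections."""
    return min(active, key=active.get)


def weighted_least_connection(active, weights):
    """Pick the server with the smallest connections-to-weight ratio."""
    return min(active, key=lambda server: active[server] / weights[server])


active = {"rs1": 10, "rs2": 4, "rs3": 7}
print(least_connection(active))                                        # rs2
print(weighted_least_connection(active, {"rs1": 4, "rs2": 1, "rs3": 2}))  # rs1
```

With all weights equal to 1 the weighted rule reduces to plain least-connection scheduling, which is why it is described as a superset.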
9/6 2001 36
Server based approach
- Uses a two-level dispatching mechanism
  - Integrates the DNS based approach with redirection techniques executed by the Web servers
- Solves most DNS scheduling problems
- Two solutions
  - HTTP redirection
  - Packet redirection
9/6 2001 37
HTTP Redirection
9/6 2001 38
- In the figure above, server 1 redirects the request to server 2
- Not client transparent!
- Overhead of intra-cluster communication
  - Every server must periodically transmit status information to the cluster DNS
- Increases response time on the client side because of the redirection
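The redirection decision can be sketched as a function that answers with HTTP status 302 and a Location header when the local server is overloaded. The load threshold, the load values, and the hostnames are assumptions made for illustration.

```python
def handle_request(path, local_load, peer_loads, threshold=0.8):
    """Return (status, headers) for a request arriving at this server.

    local_load and peer_loads values are utilisation figures in [0, 1];
    peer_loads maps a peer's hostname to its last reported load.
    """
    if local_load <= threshold or not peer_loads:
        return 200, {}                       # serve the request locally
    target = min(peer_loads, key=peer_loads.get)
    if peer_loads[target] >= local_load:
        return 200, {}                       # no peer is better off
    # The client sees the redirect and issues a second request.
    return 302, {"Location": f"http://{target}{path}"}


peers = {"server2.example.com": 0.30, "server3.example.com": 0.60}
print(handle_request("/index.html", 0.95, peers))
print(handle_request("/index.html", 0.20, peers))
```

The extra round trip for the second request is the source of the added response time on the client side.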
9/6 2001 39
Packet Redirection
- Uses a round-robin DNS mechanism to schedule requests among the Web servers
- The server reached by a request reroutes the connection to another server through packet rewriting
- Transparent to the client!
- TCP handoff
- Packet rewriting overhead
9/6 2001 40
Reference
[1] Cardellini, V.; Colajanni, M.; Yu, P.S. "Dynamic load balancing on Web-server systems," IEEE Internet Computing, Volume 3, Issue 3, May-June 1999, pp. 28-39
[2] Wow Linux. "Building a Linux-based high-availability load-balancing web service"
[3] Wensong Zhang. "Linux Virtual Server for Scalable Network Service," www.linuxVirtualServer.org
[4] Sun Microsystems. "Sun Cluster 3 architecture," www.sun.com
[5] Alistair A. Croll. "Optimizing web server access for E-business," Intel Devcon
[6] Hong, H.C.; Chen, Y.C. "Design and practice of a dispatch server architecture," Proceedings of the 7th IEEE Workshop on Future Trends of Distributed Computing Systems, 1999, pp. 246-251
[7] Mourad, A.; Huiqun Liu. "Scalable Web server architectures," Proceedings of the Second IEEE Symposium on Computers and Communications, 1997, pp. 12-16