load balancing ptiik

8/10/2019 Load Balancing ptiik

Scheduling in Web Server Clusters

CS 260

LECTURE 3
From: IBM Technical Report


Reference

The State of the Art in Locally Distributed Web-server Systems

Valeria Cardellini, Emiliano Casalicchio, Michele Colajanni and Philip S. Yu


Concepts

Web server system: provides web services

Trend:
1. Increasing number of clients
2. Growing complexity of web applications

Scalable Web server systems: the ability to support large numbers of accesses and resources while still providing adequate performance


Locally Distributed Web System

Cluster-Based Web System: the server nodes mask their IP addresses from clients, using a Virtual IP address that corresponds to one device (the web switch) in front of the set of servers. The web switch receives all packets and then sends them to the server nodes.

Distributed Web System: the IP addresses of the web server nodes are visible to clients. There is no web switch; just a layer 3 router may be employed to route the requests.


Cluster based Architecture


Distributed Architecture


Layer-4 two-way architecture


Layer-7 two-way architecture


Layer-7 two-way mechanisms

TCP gateway: an application-level proxy running on the web switch mediates the communication between the client and the server. It makes separate TCP connections to the client and to the server.

TCP splicing: reduces the overhead of the TCP gateway. For outbound packets, packet forwarding occurs at the network level by rewriting the client IP address. This will be described in more detail in the next class.


Layer-4 Products


Dispatching Algorithms

Strategies to select the target server of the web cluster.

Static: the fastest solution, which keeps the web switch from becoming a bottleneck, but it does not consider the current state of the servers.

Dynamic: can outperform static algorithms by making intelligent decisions, but collecting and analyzing state information causes expensive overhead.

Requirements:
(1) Low computational complexity
(2) Full compatibility with web standards
(3) State information must be readily available without much overhead


Content blind approach

Static Policies:

Random: distributes the incoming requests uniformly, with equal probability of reaching any server.

Round Robin (RR): uses a circular list and a pointer to the last selected server to make the decision.

Static Weighted RR (for heterogeneous servers): a variation of RR in which each server is assigned a weight Wi depending on its capacity.
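The three static policies above can be sketched in a few lines of Python. This is only an illustration: the server names and the integer weights are hypothetical, and the weighted variant simply repeats each server Wi times per round, which is one of several possible ways to honor the weights.

```python
import itertools
import random


def random_policy(servers, rng=random):
    """Random: every server is equally likely to be chosen."""
    return rng.choice(servers)


def round_robin(servers):
    """Round Robin: walk a circular list of servers, one per request."""
    return itertools.cycle(servers)


def static_weighted_round_robin(weights):
    """Static Weighted RR: server i appears W_i times in each round.

    `weights` maps a (hypothetical) server name to its integer weight W_i.
    """
    expanded = [name for name, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


rr = round_robin(["s1", "s2", "s3"])
first_four = [next(rr) for _ in range(4)]

wrr = static_weighted_round_robin({"big": 2, "small": 1})
one_round = [next(wrr) for _ in range(3)]
```

None of these functions looks at server state, which is why they are cheap enough to run on the web switch for every request.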


Content blind approach (Cont.)

Dynamic Policies:

Client state aware: statically partition the server nodes and assign each group of clients, identified through client information such as the source IP address, to a partition.

Server state aware:

Least Loaded: select the server with the lowest load. Issue: which is the server load index?

Least Connection: select the server with the fewest active connections first.
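A minimal sketch of two of the dynamic policies above, assuming the switch keeps a per-server counter of active connections and that client groups are formed from the numeric source IP address. The class name, the modulo-based grouping and the server names are illustrative assumptions, not part of the slides.

```python
import ipaddress


def client_partition(source_ip, partitions):
    """Client state aware: map a client to a fixed server partition.

    Assumption: clients are grouped by their numeric IP address modulo
    the number of partitions.
    """
    index = int(ipaddress.ip_address(source_ip)) % len(partitions)
    return partitions[index]


class LeastConnectionDispatcher:
    """Server state aware: pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}

    def assign(self):
        # Ties are broken in favor of the server listed first.
        target = min(self.active, key=self.active.get)
        self.active[target] += 1
        return target

    def release(self, server):
        # Called when a connection to `server` closes.
        self.active[server] -= 1


dispatcher = LeastConnectionDispatcher(["s1", "s2"])
picks = [dispatcher.assign() for _ in range(3)]
dispatcher.release("s2")
next_pick = dispatcher.assign()
```

The connection count is a popular load index because the switch already observes connection setup and teardown, so no extra reporting from the servers is needed.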


Content blind approach (Cont.)

Server state aware (Cont.):

Fastest Response: select the server that is responding fastest.

Weighted Round Robin: a variation of static RR that associates each server with a dynamically evaluated weight that is proportional to the server load.

Client and server state aware:

Client affinity: instead of assigning each new connection to a server only on the basis of the server state, regardless of any past assignment, consecutive connections from the same client can be assigned to the same server.
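Client affinity can be layered on top of a server state aware policy. The sketch below assumes a least-connection choice for a client's first connection and an unbounded in-memory table of past assignments; both are illustrative choices, and a real switch would also need to expire old entries.

```python
class AffinityDispatcher:
    """Client and server state aware dispatching with client affinity."""

    def __init__(self, servers):
        self.active = {name: 0 for name in servers}
        self.last_server = {}  # client id -> server chosen previously

    def assign(self, client_id):
        target = self.last_server.get(client_id)
        if target is None:
            # First connection from this client: fall back to server state.
            target = min(self.active, key=self.active.get)
            self.last_server[client_id] = target
        self.active[target] += 1
        return target


affinity = AffinityDispatcher(["s1", "s2"])
first = affinity.assign("client-1")
second = affinity.assign("client-2")
third = affinity.assign("client-1")
```

Here `client-1` returns to `s1` on its second connection even though `s1` and `s2` are equally loaded at that point.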


Considerations of content blind

The static approach is the fastest and is easy to implement, but it may make poor assignment decisions.

The dynamic approach has the potential to make better decisions, but it needs to collect and analyze state information, which may cause high overhead.

Overall, a simple server state aware algorithm is the best choice; the least loaded algorithm is commonly used in commercial products.


Content aware approach

Server state aware:

Cache Affinity: the file space is partitioned among the server nodes.

Load Sharing:

SITA-E (Size Interval Task Assignment with Equal Load): the switch determines the size of the requested file and selects the target server based on this information.

CAP (Client-Aware Policy): web requests are classified based on their impact on system resources, such as I/O bound or CPU bound.
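Size-interval assignment can be illustrated as a lookup of the requested file size in a sorted list of cutoffs. In the sketch below the cutoff values and server names are hypothetical and are taken as given; choosing cutoffs so that every interval carries an equal share of the total load is the part this example does not attempt.

```python
import bisect


def size_interval_dispatch(file_size, cutoffs, servers):
    """Return the server whose size interval contains `file_size`.

    `cutoffs` is a sorted list of inclusive upper bounds (in bytes) for all
    servers except the last, so len(servers) == len(cutoffs) + 1.
    """
    return servers[bisect.bisect_left(cutoffs, file_size)]


CUTOFFS = [10_000, 1_000_000]          # hypothetical interval boundaries
SERVERS = ["small", "medium", "large"]  # one server per size interval

targets = [
    size_interval_dispatch(size, CUTOFFS, SERVERS)
    for size in (500, 10_000, 10_001, 5_000_000)
]
```

Keeping small and large files on different servers prevents short requests from queuing behind long transfers.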


Content aware approach (Cont.)

Client state aware:

Service Partitioning: employ specialized servers for certain types of requests.

Client Affinity: use a session identifier to assign all web transactions from the same client to the same server.
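Service partitioning requires the switch to inspect the request content. As a rough sketch, the function below routes on the file extension of the requested URL; the extension-to-pool table and the pool names are hypothetical, and a real deployment could just as well route on path prefixes or headers.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical mapping from file extension to a specialized server pool.
SERVICE_MAP = {
    ".php": "dynamic-pool",
    ".cgi": "dynamic-pool",
    ".mp4": "media-pool",
}


def partition(url, default="static-pool"):
    """Pick the server pool specialized for this type of request."""
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return SERVICE_MAP.get(extension, default)


pools = [
    partition("http://example.com/cart.php?id=3"),
    partition("/index.html"),
    partition("/video/A.MP4"),
]
```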


Content aware approach (Cont.)

Client and server state aware:

LARD (Locality-Aware Request Distribution): direct all requests for the same web object to the same server node as long as its utilization is below a given threshold.

Cache Manager: a cache manager that is aware of the cache content of all web servers.
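The LARD rule as stated above can be sketched with a single load threshold. This is a simplification for illustration: the load measure (active requests per node), the threshold value, and the choice of the least loaded node as the new target are assumptions, and the published algorithm is more elaborate than this.

```python
class SimpleLard:
    """Locality-aware dispatching with one utilization threshold."""

    def __init__(self, servers, threshold):
        self.load = {name: 0 for name in servers}  # active requests per node
        self.threshold = threshold
        self.owner = {}  # web object (URL) -> node currently serving it

    def assign(self, url):
        target = self.owner.get(url)
        if target is None or self.load[target] >= self.threshold:
            # New object, or its node is too busy: move it to the
            # least loaded node and remember the new mapping.
            target = min(self.load, key=self.load.get)
            self.owner[url] = target
        self.load[target] += 1
        return target

    def done(self, server):
        # Called when a request on `server` completes.
        self.load[server] -= 1


lard = SimpleLard(["s1", "s2"], threshold=2)
lard_picks = [lard.assign("/a") for _ in range(3)]
```

The first two requests for `/a` stay on `s1` to benefit from its cache; the third is redirected because `s1` has reached the threshold.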


Fair Scheduling in Web Servers

CS 213 Lecture 17

L.N. Bhuyan


Objective

Create an arbitrary number of service quality classes and assign a priority weight to each class.

Provide service differentiation for different user classes in terms of the allocation of CPU and disk I/O capacities.
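One common way to turn priority weights into an allocation is proportional sharing, where each class receives a fraction of the resource equal to its weight divided by the sum of all weights. The slides do not state which rule is used, so the sketch below is an assumption, and the class names and weights are hypothetical.

```python
def proportional_shares(weights):
    """Map each service class to its fraction of a resource (CPU or disk I/O)."""
    total = sum(weights.values())
    return {cls: weight / total for cls, weight in weights.items()}


shares = proportional_shares({"gold": 3, "silver": 2, "bronze": 1})
```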


Fair Scheduling in a Web Cluster: Objective

Provide service differentiation (or QoS guarantee) for different user classes in terms of the allocation of CPU and disk I/O capacities => Scheduling

Balance the load among the various nodes in the cluster to ensure maximum utilization and minimum execution time => Load Balancing


Target System


Master/Slave Architecture

Server nodes are divided into two groups:
Slave group: only processes dynamic requests
Master group: can handle both static and dynamic requests
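The grouping restricts which nodes are eligible for a request before any load balancing takes place. A minimal sketch, assuming requests are either static or dynamic and using hypothetical node names:

```python
def eligible_nodes(is_dynamic, masters, slaves):
    """Return the nodes allowed to serve a request.

    Slaves only process dynamic requests; masters can handle either kind
    (assumed here to mean static and dynamic).
    """
    if is_dynamic:
        return list(masters) + list(slaves)
    return list(masters)


MASTERS = ["m1"]
SLAVES = ["w1", "w2"]

static_targets = eligible_nodes(False, MASTERS, SLAVES)
dynamic_targets = eligible_nodes(True, MASTERS, SLAVES)
```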


Performance Guarantees for Internet Services (Gage)

Environment: web hosting services, with multiple logical web servers (service subscribers) on a single physical web server cluster.

Gage: guarantees each logical web server a pre-specified performance level, expressed as a distinct number of URL requests to service per second.


Components

Each service subscriber maintains a queue.

Request classification: determines the queue for each input request.

Request scheduling: determines which queue to serve next to meet the QoS requirement for each subscriber.

Resource usage accounting: captures the detailed resource usage associated with each subscriber's service requests.


The Gage System

QoS guarantee:
QoS is expressed in terms of a fixed number of generic URL requests, where a generic request represents an average web site access.
Currently, a generic request is assumed to take 10 msec of CPU time, 10 msec of disk I/O and 2000 bytes of network bandwidth.
Each subscriber is given a fixed number of generic requests.

Other possible QoS metrics: response time, delay jitter, etc.

Using TCP splicing
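To make the generic request concrete, the sketch below converts the measured resource usage of one real request into generic request units, using the 10 msec CPU, 10 msec disk I/O and 2000 byte figures from the slide. Charging the request by its most heavily used resource (the maximum of the three ratios) is an assumption made for illustration; the slides do not say how the three resources are combined.

```python
GENERIC_CPU_MS = 10.0       # CPU time of one generic request (from the slide)
GENERIC_DISK_MS = 10.0      # disk I/O time of one generic request
GENERIC_NET_BYTES = 2000.0  # network bytes of one generic request


def generic_request_units(cpu_ms, disk_ms, net_bytes):
    """Express measured usage as a number of generic requests.

    Assumption: the request is charged according to its dominant resource.
    """
    return max(
        cpu_ms / GENERIC_CPU_MS,
        disk_ms / GENERIC_DISK_MS,
        net_bytes / GENERIC_NET_BYTES,
    )


average_access = generic_request_units(10, 10, 2000)
cpu_heavy_access = generic_request_units(20, 5, 1000)
```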


Request Scheduling

Two decisions:

Which request should be serviced next (Scheduling): decided according to each subscriber's static resource reservation and dynamic resource usage.

Which RPN should service this request (Load Balancing): decided according to the load information on each RPN (Least Load First), while also exploiting access locality.
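The two decisions can be sketched as two small methods on one scheduler object. This is a hedged illustration rather than Gage's actual algorithm: the rule of serving the backlogged subscriber with the lowest usage-to-reservation ratio, the class and method names, and the subscriber and RPN names are all assumptions, and access locality is ignored for brevity.

```python
from collections import deque


class TwoStepScheduler:
    """Per-subscriber queues plus Least Load First node selection."""

    def __init__(self, reservations, rpns):
        self.reservation = dict(reservations)  # subscriber -> reserved rate
        self.usage = {s: 0.0 for s in reservations}  # accounted usage so far
        self.queues = {s: deque() for s in reservations}
        self.rpn_load = {r: 0 for r in rpns}

    def enqueue(self, subscriber, request):
        self.queues[subscriber].append(request)

    def next_request(self):
        """Decision 1: which request is serviced next (scheduling)."""
        backlog = [s for s, q in self.queues.items() if q]
        if not backlog:
            return None
        # Assumed rule: favor the subscriber furthest below its reservation.
        chosen = min(backlog, key=lambda s: self.usage[s] / self.reservation[s])
        return chosen, self.queues[chosen].popleft()

    def pick_rpn(self):
        """Decision 2: which RPN services the request (Least Load First)."""
        rpn = min(self.rpn_load, key=self.rpn_load.get)
        self.rpn_load[rpn] += 1
        return rpn

    def account(self, subscriber, units):
        """Record resource usage, measured in generic requests."""
        self.usage[subscriber] += units


sched = TwoStepScheduler({"siteA": 100, "siteB": 50}, ["rpn1", "rpn2"])
sched.enqueue("siteA", "a1")
sched.enqueue("siteA", "a2")
sched.enqueue("siteB", "b1")

served, nodes = [], []
for _ in range(3):
    subscriber, request = sched.next_request()
    nodes.append(sched.pick_rpn())
    sched.account(subscriber, 1.0)
    served.append(request)
```

In this run `siteB` is served before the second `siteA` request because, after `a1` is accounted for, `siteA` has used more of its reservation than `siteB`.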
