building scalable internet services wu-chang feng puma technology

Post on 18-Jan-2016

213 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Building Scalable Internet Services

Wu-chang FengPuma Technology

A relatively easy problem….

Why?HTTP: stateless, request-response

protocoldecoupled, independent requests

How?divide and conquerreplicate, partition, distribute, load

balance

Outline

DNSserver per resource partitioningdynamic name resolution

Networking tricksvirtual servers

Case studiesscalable content delivery (Yahoo!)content transformation enginestransparent web cachesscalable secure servers

Server partitioning (static)

Run a new server per resource/service

Examplewww.blah.commail.blah.comimages.blah.comshopping.blah.commy.blah.cometc. etc.

Server partitioning (static)

Advantagesbetter disk utilizationbetter cache performanceprotection against DOS

Disadvantageslower peak capacitycoarse load balancing across

servers/servicesmanagement costs

Server partitioning (dynamic)

Active “forward deployment” of content to explicitly named servers near client

Redirect requests from origin servers byHTTP redirectsdynamic URL rewriting of embedded content

Application-level multicast based on geographic information

Akamai, Digital Island (Sandpiper), SightPath, Xcelera (Mirror-image), CacheFlow

Server partitioning (dynamic)

Dynamically loaded content servers

stanford.edu

espn.go.com

Local, high-speed ISP

Internet

2

4 5

a12.g.akamaitech.neta668.g.akamaitech.neta1284.g.akamaitech.net

a1896.g.akamaitech.net

3

Requested page with links to embedded contentrewritten OR an HTTPredirect

1

Server partitioning (dynamic)

Advantagesbetter network utilizationbetter load distribution

Disadvantagesdistributed management costsstorage costscurrently OK as ($ network bw >> $

storage)

Outline

DNSserver per resource partitioningdynamic name resolution

Networking tricksvirtual servers

Case studiesscalable content delivery (Yahoo!)content transformation enginestransparent web cachesscalable secure servers

DNS load balancing

Popularized by NCSA circa 1993Fully replicated server farmIP address per nodeAdaptively resolve server name

(round-robin, load-based)

DNS load balancing

[a-m].root-servers.net *.ncsa.uiuc.edu is served by ns0.ncsa.uiuc.edu (141.142.2.2) ns1.ncsa.uiuc.edu(141.142.230.144) dns1.cso.uiuc..edu (128.174.5.103) ns.indiana.edu (129.79.1.1)

ns0.ncsa.uiuc.edu www.nsca.uiuc.edu is 141.142.2.28 141.142.2.36 141.142.2.42

1

2

3

4

DNS cache Host: www.ncsa.uiuc.edu ttl=15minDNS ns0.ncsa.uiuc.edu ttl=3days

5

stanford.edu

141.142.2.42

141.142.2.36

141.142.2.42

ncsa.uiuc.edu

6

7

DNS load balancing

Advantagessimple, easy to implementuses existing infrastructure

Disadvantagescoarse load balancinglocal DNS caching affects performancefull server replication

DNS references

RFC 1784“DNS Support for Load Balancing”ftp://ftp.isi.edu/in-notes/rfc1784.txt

DNS RFCs RFCs 882 and 883 (1983) obsoleted

by…RFCs 1034 and 1035 (1987)ftp://ftp.isi.edu/in-notes/

rfc{882,883}.txtftp://ftp.isi.edu/in-notes/

rfc{1034,1035}.txt

Outline

DNSserver per resource partitioningdynamic name resolution

Networking tricksvirtual servers

Case studiesscalable content delivery (Yahoo!)content transformation enginestransparent web cachesscalable secure servers

Virtual servers

Large server farm -> single virtual server

Single front-end for connection routingRouting algorithms

by load (response times, least connections, server load, weighted round-robin)

by layer 3 info (IP addresses)by layer 4 info (ports)by layer 5-7 info (URLs, Cookies, SSL

session IDs, User-Agent, client capabilities, etc. etc.)

Olympic web server (1996)

stanford.edu

Internet

Token Ring

4 x T3

1

4

3

IP=X

IP=X

IP=X

IP=X

IP=XLoad info

2

SYN routingACK forwarding

Olympic web server (1996)

Front-end nodeTCP SYN

route to particular server based on policystore decision (connID, realServer)

TCP ACKforward based on stored decision

TCP FIN or a pre-defined timeoutremove entry

ServersIP address of outgoing interface = IP

address of front-end’s incoming interface

Olympic web server (1996)

Advantagesonly ACK traffic is processedmore reactive to load than DNS

Disadvantagesnon-stickiness between requests

SSLcache performance

software solution (prone to DOS)can’t support L5 switching

must proxy both ways of connectionneed to rewrite ACKs going both ways

Other LB variations (L2-L4)

Hardware switchesLoad balancing algorithms

anything contained within TCP SYN packetsourceIP, sourcePort, destIP, destPort, protocolhash(source, dest, protocol)

server characteristicsleast number of connectionsfastest response timeserver idle time

otherweighted round-robin, random

Virtual servers with L5

Spoof server connection until URL sentSwitch based on content in requestConnections proxied through switch

switch terminates TCP handshakeswitch rewrites sequence numbers going

in both directionsexception

TCP connection migration from Rice Universitymigrate TCP state (sequence no. information)

to real serverIP address of real server = IP address of virtual

server

L5 switchesSYN SN=A

SYN SN=B ACK=A

ACK=B

HTTP requestSYN SN=A

SYN SN=C ACK=A

HTTP request

ACK=C

L5 switchVirtualIP=X

Real serverRealIP=Y

HTTP response

Rewrite Y to X C to B

ACK

Rewrite X to Y B to C

Client

L5 switching

Advantagesincreases effective cache/storage sizesallows for session persistence

(SSL,cookies)support for user-level service

differentiationservice levels based on cookies, user

profile, User-Agent, URL

Disadvantagescontent hot-spotsoverhead

Load balancing switches

Cisco Local DirectorIBM Network DispatcherF5 Labs BIG/ipResonate Central DispatchArrowpoint CS-100/800Foundry ServerIron XLAlteon ACEDirector

Integrated approaches

LB switches coordinate and respond to DNS requestsbased on loadbased on geographic location

VendorsCisco Distributed DirectorF5 Labs BIG/ip with 3DNSAlteon ACEDirector3Resonate Global Dispatch

Integrated example

[a-m].root-servers.net www.blah.com served by A, B, and C

C

B

A

Internet

1. Request for www.blah.com

3. www.blah.com is C

Load C < Load B < Load A orproximity C > proximity B > proximity A

2. www.blah.com?

4. Request to www.blah.com

Complications to LB

Hot-spot URLs L5, URL switching bad

Proxied sources (AOL, SOCKS, etc.)L3, source IP switching bad

Stateful requests (SSL)Load-based/RR bad

IP fragmentationBreaks all algorithms unless switch

defrags

Complications to LB

IPsecmust end IPsec tunnel at switch

doorstepOptimizing cache/disk

non-L5 solutions badOptimizing network bandwidth

non-Akamai-like solutions bad

Outline

DNSserver per resource partitioningdynamic name resolution

Networking tricksvirtual servers

Case studiesscalable content delivery (Yahoo!)content transformation enginestransparent web cachesscalable secure servers

Designing a solution

Examine primary design goalsload balancing performancecache hit ratesCPU utilizationnetwork resources

Apply solutions which fits

Yahoo!

[a-m].root-servers.net *.yahoo.com is served by ns1.yahoo.com (204.71.177.33) ns3.europe.yahoo.com (195.67.49.25) ns2.dca.yahoo.com (209.143.200.34) ns5.dcx.yahoo.com (216.32.74.10)

ns1.yahoo.com www.yahoo.com is 204.71.200.68 204.71.200.67 204.71.200.75 204.71.202.160 204.71.200.74

204.71.200.68

204.71.200.67

1

23

4

DNS cache Host: www.yahoo.comNameServers: yahoo.com

5

stanford.edu

yahoo.com

6

akamaitech.net

7

us.yimg.com

89

Proxinet Example

CPU utilization and cache hit rates are biggest concerns

Ignore network efficiency and other resources

Support user and URL differentiationStraight up L5

killed on hot-spot URLsStraight up load-based algorithm

killed on low cache hit rates unless global cache used

Proxinet Example

Solution: Use a hybrid like LARDload balance with URL to a certain limitload balance with least connections

when load imbalanced provision by URL based on Benjamins

http://www.cs.princeton.edu/~vivek/ASPLOS-98/

Transparent Web Caching

Redirect web requests to cache transparently

Eliminates client management costsHow?

Put web cache/redirector in routing pathRedirector

Pick off cacheable web requestsrewrite destination address and forward to

cacherewrite source address and return to client

http://www.alteonwebsystems.com/products/white_papers/WCR/index.shtml

Transparent Web Caching

Internet

Pick off web requestsrewrite addressesroute to caches

12

3

45

6

Scalable Secure Servers

SSL handshakeServer intensive processing

200 MHz PowerPCClient ~12ms processingServer ~50ms processing

Session reuse avoids overhead of handshake

Subsequent requests must return to initial server

Scalable Secure Servers

Client random + GMT + sessionID=0cipher suites + compression methods

Server random + certificate + sessionIDcipher suites + compression methods

Server Hello

Client Hello

Client Key ExchangeMaster secret encrypted w/ server public key

Finished

Application data

1. Verify cert, extractserver public key, encrypt master secret w/ key

2. Decrypt mastersecret with privatekey, generate keysand randoms

Dog slow

Client Server

3. Generate keys frommaster secret + randoms

Initial SSL Handshake

Scalable Secure Servers

Client random + GMT + sessionIDcipher suites + compression methods

Server random + certificate + sessionIDcipher suites + compression methods

Server Hello + Finished

Client Hello

Finished

Application data

2. Generate keys fromcached master secret andcurrent randoms

1. Generate keysfrom cached mastersecret and currentrandoms

Client Server

SSL session reuse

Scalable Secure Servers

Source IP switching solutionsolves affinity problemload balancing poor

SSL session ID switchingsolves affinity problemload balancing on initial handshake

Links

http://www.eecs.umich.edu/~wuchang/work/cs444i.ppt

E-mail: wuchang@eecs.umich.edu

top related