DotSlash: Handling Web Hotspots at Dynamic Content Web Sites

Weibin Zhao, Henning Schulzrinne ({zwb,hgs}@cs.columbia.edu), Department of Computer Science, Columbia University. Global Internet 2005, March 19, 2005.

Page 1: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

DotSlash: Handling Web Hotspots at Dynamic Content Web Sites

Weibin Zhao, Henning Schulzrinne
{zwb,hgs}@cs.columbia.edu
Department of Computer Science
Columbia University

Global Internet 2005
March 19, 2005

Page 2: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Web Hotspots

- A well-identified problem
  - Flash crowds, the Slashdot effect
  - 15 minutes of fame
- Examples
  - Slashdotting, featured Google search, special events, breaking news, …

[Figure labels: Web Server, Internet]

Page 3: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

The Challenge

- Short-term dramatic surge of request rate
  - Large and quick increase
  - Lasts for a short period
- Existing mechanisms are not sufficient
  - Capacity planning, CDNs
    - Good for long term, not cost-effective for hotspots
  - Caching
    - Not fully controlled by origin server
  - Service degradation, admission control
    - Last resort, not user friendly

Page 4: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Dynamic Content Web Sites

- More vulnerable to hotspots
  - CPU-bound, request rate supported is low
  - Hard to cache dynamic content
- A much harder problem
  - Different bottlenecks
    - Database server: on-line bookstore (Amazon)
    - Web server: auction (eBay), bulletin board (Slashdot)
  - Caching & consistency control

Page 5: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Our Approach

- DotSlash: counteract the Slashdot effect
- Rescue system
  - Triggered automatically when load spikes
  - Mutual-aid model: for different web sites
  - Cost effective: for rare events
- Automated rescue process
  - Self-configuring: build an adaptive distributed web server system on the fly
  - Techniques: service discovery, dynamic virtual hosting, adaptive overload control, dynamic script replication

Page 6: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Rescue Relationship

- Origin servers: S1, S2
- Rescue servers: S3, S4, S5, S6
- Can provide rescue to multiple servers: S3
- Can get rescue from multiple servers: S1
- Cannot provide and get rescue simultaneously

[Figure: servers S1 to S8, with arrows labeled "rescuing"]

Page 7: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Service Discovery

- DotSlash directory services
  - Enable web servers from different sites to learn about each other: register/query
  - Built upon mSLP (Mesh-enhanced Service Location Protocol): replicated Directory Agents (DAs)
- Discover mSLP DAs
  - dot-slash.net DNS domain
  - DNS SRV for dot-slash.net
    - query_name=_slpda._tcp.dot-slash.net, query_type=srv
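
The SRV lookup above can be sketched as follows. This is a minimal illustration, not the DotSlash implementation: it only builds the query name and orders a set of answers. The record values, host names, and the helper names `srv_query_name` and `order_candidates` are assumptions made for the example, and the deterministic ordering is a simplification of the weighted random selection that RFC 2782 describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int
    weight: int
    port: int
    target: str


def srv_query_name(service: str, proto: str, domain: str) -> str:
    """Build an SRV owner name such as _slpda._tcp.dot-slash.net."""
    return f"_{service}._{proto}.{domain}"


def order_candidates(records):
    """Order SRV answers: lowest priority first, then higher weight first.

    Simplification: RFC 2782 asks for a weighted random choice within a
    priority class; a deterministic sort keeps this sketch easy to follow.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight))


# Hypothetical answers; the DA host names are invented for illustration.
answers = [
    SrvRecord(priority=20, weight=0, port=427, target="da3.dot-slash.net"),
    SrvRecord(priority=10, weight=1, port=427, target="da2.dot-slash.net"),
    SrvRecord(priority=10, weight=5, port=427, target="da1.dot-slash.net"),
]

name = srv_query_name("slpda", "tcp", "dot-slash.net")
ordered = [r.target for r in order_candidates(answers)]
```

A real client would send the query with a resolver library and then contact the Directory Agents in the resulting order.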

Page 8: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Workload Monitoring

- Bottlenecks & metrics
  - Network (static content): outbound HTTP traffic
  - CPU (dynamic content): /proc/stat
  - Moving average filter
- Load regions
  - Desired load region
    - Configurable: [40%, 60%]
  - Trigger rescue actions

[Figure: load scale divided into lightly loaded, desired, and heavily loaded regions]
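
One way to read the CPU metric, the moving average filter, and the load regions together is sketched below. It assumes the aggregate `cpu` line of /proc/stat is sampled twice, counts idle plus iowait time as not busy, and uses a window of five samples; those choices, and the names `utilization`, `MovingAverage`, and `load_region`, are illustrative assumptions rather than details given on the slide. Only the [40%, 60%] default comes from the slide.

```python
from collections import deque


def parse_cpu_line(line: str):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # user, nice, system, idle, iowait, irq, softirq, steal
    # (older kernels report fewer columns; guest columns are skipped)
    values = [int(v) for v in fields[1:9]]
    total = sum(values)
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return total - idle, total


def utilization(prev_line: str, curr_line: str) -> float:
    """CPU utilization in percent between two /proc/stat samples."""
    busy0, total0 = parse_cpu_line(prev_line)
    busy1, total1 = parse_cpu_line(curr_line)
    elapsed = total1 - total0
    return 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0


class MovingAverage:
    """Smooth raw samples over a fixed-size window."""

    def __init__(self, window: int = 5):
        self.samples = deque(maxlen=window)

    def update(self, value: float) -> float:
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


def load_region(load: float, low: float = 40.0, high: float = 60.0) -> str:
    """Classify a smoothed load against the desired region [low, high]."""
    if load < low:
        return "light"
    if load > high:
        return "heavy"
    return "desired"
```

A monitor would call `utilization` once per sampling interval, feed the result to `MovingAverage.update`, and trigger rescue actions when `load_region` leaves the desired region.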

Page 9: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

DotSlash Rescue Protocol

- Application-level request/response protocol
- Requests: SOS, RATE, SHUTDOWN
  - SOS: initiate a rescue (origin → rescue)
  - RATE: adjust allowed redirect (data) rate (rescue → origin)
  - SHUTDOWN: end a rescue (origin → rescue)

Page 10: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Rescue Control

[State diagram with three states: Normal, SOS, Rescue]

- Between Normal and SOS: get rescue, release rescue
- Within SOS: increase Pr, decrease Pr, request more rescue
- Between Normal and Rescue: provide rescue, shutdown last rescue
- Within Rescue: increase Rr, decrease Rr, provide more rescue, shutdown some rescue
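
The state labels above suggest a small state machine. The sketch below is one reading of the diagram, under the assumption that a server returns to Normal once its last rescue relationship ends and that the SOS and Rescue states are mutually exclusive (as the Rescue Relationship slide states). The class and method names are hypothetical, and the Pr and Rr adjustments are left out because the slide does not say how they are computed.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    SOS = "sos"        # getting rescue from other servers
    RESCUE = "rescue"  # providing rescue to other servers


class RescueControl:
    """Track rescue relationships and the resulting server state."""

    def __init__(self):
        self.state = State.NORMAL
        self.rescuers = set()  # servers currently rescuing this server
        self.clients = set()   # origin servers this server is rescuing

    def get_rescue(self, server: str) -> None:
        # Covers both "get rescue" and "request more rescue".
        if self.state is State.RESCUE:
            raise RuntimeError("cannot get rescue while providing rescue")
        self.rescuers.add(server)
        self.state = State.SOS

    def release_rescue(self, server: str) -> None:
        self.rescuers.discard(server)
        if self.state is State.SOS and not self.rescuers:
            self.state = State.NORMAL

    def provide_rescue(self, origin: str) -> None:
        # Covers both "provide rescue" and "provide more rescue".
        if self.state is State.SOS:
            raise RuntimeError("cannot provide rescue while getting rescue")
        self.clients.add(origin)
        self.state = State.RESCUE

    def shutdown_rescue(self, origin: str) -> None:
        # "Shutdown some rescue"; the last shutdown returns to Normal.
        self.clients.discard(origin)
        if self.state is State.RESCUE and not self.clients:
            self.state = State.NORMAL
```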

Page 11: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Request Redirection

- Origin server
  - Offload client requests to rescue servers
- Two-level redirection
  - DNS round robin (DNS-RR)
    - Add/remove rescue server IP addresses via dynamic DNS update
  - HTTP redirect
    - Use rescue server aliases
    - Don't redirect requests from rescue servers
- Redirect policies
  - Weighted round robin (WRR) based on rescue server capacity
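
A weighted round robin policy over rescue server aliases could look like the sketch below. The slide only says WRR based on capacity; the smooth variant used here (the one popularized by nginx) is a substitute chosen because it spreads picks evenly, and the weights, alias names, and the `should_redirect` helper are assumptions for illustration.

```python
class SmoothWRR:
    """Smooth weighted round robin over named targets."""

    def __init__(self, weights):
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty map of positive numbers")
        self.weights = dict(weights)
        self.current = {name: 0 for name in weights}
        self.total = sum(weights.values())

    def next(self) -> str:
        # Raise every target by its weight, pick the largest, then
        # push the winner back down by the sum of all weights.
        for name, weight in self.weights.items():
            self.current[name] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


def should_redirect(client_ip: str, rescue_ips) -> bool:
    """Requests coming from rescue servers are served locally, never redirected."""
    return client_ip not in rescue_ips


# Hypothetical aliases and capacities.
picker = SmoothWRR({"www-vh1.rescue-a.com": 3, "www-vh1.rescue-b.com": 1})
first_cycle = [picker.next() for _ in range(4)]
```

Over each cycle of four picks the first alias is chosen three times and the second once, matching the 3:1 weights.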

Page 12: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Dynamic Virtual Hosting

- Rescue server
  - Serve new content (origin server) on the fly
- Alias
  - Generate dynamically, and register via dynamic DNS update
- Mapping: request → itself / origin server
  - Based on the Host header in the request
  - Three cases
    - Its configured name: www.rescue.com → itself
    - An alias: www-vh1.rescue.com (HTTP redirect) → origin
    - An origin server name: www.origin.com (DNS-RR) → origin
  - Handle expired mapping
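
The three Host header cases and the expiry check can be sketched as a lookup table. This is an illustrative model only: the `VirtualHostMap` class, its return values, and the time-to-live mechanism are assumptions, since the slide does not say how mappings are stored or when they expire.

```python
import time


class VirtualHostMap:
    """Map a request's Host header to this server or to an origin server."""

    def __init__(self, configured_name: str, clock=time.monotonic):
        self.configured_name = configured_name.lower()
        self.clock = clock
        self.mappings = {}  # host name -> (origin name, expiry time)

    def add(self, host: str, origin: str, ttl: float) -> None:
        """Register an alias or an origin server name for ttl seconds."""
        self.mappings[host.lower()] = (origin, self.clock() + ttl)

    def resolve(self, host_header: str):
        # Drop an optional ":port" suffix (IPv6 literals are not handled).
        host = host_header.split(":")[0].lower()
        if host == self.configured_name:
            return ("self", None)
        entry = self.mappings.get(host)
        if entry is None:
            return ("unknown", None)
        origin, expires = entry
        if self.clock() >= expires:
            del self.mappings[host]  # expired mapping: forget it
            return ("expired", None)
        return ("origin", origin)
```

For example, a rescue server configured as www.rescue.com would register the alias www-vh1.rescue.com (reached by HTTP redirect) and the name www.origin.com (reached by DNS-RR), both pointing at the same origin server.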

Page 13: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

DotSlash for Dynamic Content

- Remove the web server bottleneck
- Dynamic Script Replication
- LAMP configuration

[Figure: request flow, steps (1) to (8), among the client, the origin server (Apache), the rescue server (Apache, PHP), and the database (MySQL)]

Page 14: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Dynamic Script Replication

- Rescue server
  - Map a redirected URI to a script file
  - Trigger 404 handler if the script file is not found
    - Retrieve the script file
    - Handle file inclusions
    - Set query variables
    - Run the script by invoking native include
- Origin server
  - If a request is from a rescue server and for dynamic content, return the script file
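
The rescue-server side of this flow amounts to fetch-on-miss caching of script files. The sketch below models it in Python rather than in the Apache/PHP setting of the slides; the cache layout, the function names, and the injected `fetch` callback (standing in for the HTTP request to the origin server) are all assumptions.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit


def local_script_path(cache_root: Path, origin: str, uri: str) -> Path:
    """Map a redirected URI to a script file under a per-origin cache directory."""
    path = PurePosixPath(urlsplit(uri).path)
    if ".." in path.parts:
        raise ValueError("refusing a path that climbs out of the cache")
    return cache_root.joinpath(origin, *path.relative_to("/").parts)


def get_script(cache_root: Path, origin: str, uri: str, fetch) -> Path:
    """Return the local script file, retrieving it from the origin on a miss.

    fetch(origin, path) is a caller-supplied function that returns the
    script source as text; it plays the role of the 404 handler's request.
    """
    target = local_script_path(cache_root, origin, uri)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(fetch(origin, urlsplit(uri).path))
    return target
```

The query string is ignored when locating the file, so requests that differ only in their query variables share one replicated script.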

Page 15: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Handle File Inclusions

- The problem
  - A replicated script may include files that are located at the origin server
  - Assume: included files under DocumentRoot
- Approaches
  - Renaming inclusion statements
    - Need to parse scripts
  - Customized error handler
    - Catch inclusion errors
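
The first approach, renaming inclusion statements, can be illustrated with a regular expression that rewrites literal include paths to point into a local cache directory. This is a rough sketch, not a PHP parser: it only handles quoted string literals, would also match text inside comments or strings, and the cache directory and function name are hypothetical.

```python
import re

# Matches include/require (optionally _once), an optional "(", and a quoted path.
INCLUDE_RE = re.compile(
    r"""\b((?:include|require)(?:_once)?\s*\(?\s*)(['"])([^'"]+)\2"""
)


def rename_includes(script: str, cache_dir: str) -> str:
    """Prefix every literal include path with cache_dir."""
    base = cache_dir.rstrip("/")

    def repl(match):
        head, quote, path = match.group(1), match.group(2), match.group(3)
        return f"{head}{quote}{base}/{path.lstrip('/')}{quote}"

    return INCLUDE_RE.sub(repl, script)


source = "<?php include(\"header.php\"); require_once 'lib/db.php'; ?>"
rewritten = rename_includes(source, "/var/cache/dots/www.origin.com")
```

Includes whose argument is a variable or expression are left untouched, which is one reason the error-handler approach is listed as an alternative.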

Page 16: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Implementation

- Apache module, PHP extension
- Dynamic DNS: dot-slash.net
- Service discovery: enhanced SLP

[Figure: component diagram with labels Client, Apache, Mod_dots, Dotsd, Other Dotsd, BIND, mSLP, and links HTTP, SHM, DNS, SLP, DSRP]

Page 17: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Evaluation

- Experimental setup
  - Linux machines: Redhat 9.0
    - HC: 2 GHz CPU, 1 GB memory
    - LC: 1 GHz CPU, 512 MB memory
  - Apache: 2.0.48, DotSlash module
  - PHP: 4.3.6, DotSlash extension
  - MySQL: 4.0.18
- Benchmark
  - RUBBoS (Rice U.) bulletin board
    - 19 scripts: 1 KB to 7 KB
    - 439 MB database

Page 18: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Increasing Max Request Rate: R

- Configuration: Origin (HC), DB (HC), 9 rescue servers (LC)
- No rescue: R=118
  - CPU: Origin=100%, DB=45%
- With rescue: R=245
  - CPU: Origin=55%, DB=100%
- 245/118 > 2

Page 19: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Effectiveness

- Another configuration: Origin (LC), DB (HC), 10 rescue servers (LC)
- No rescue: R=49
- With rescue: R=245
- 245/49 = 5

Comparison:

origin server      HC    LC
improvement        2     5
#rescue servers    9     10

Conclusion: remove web server bottleneck
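
The improvement factors quoted on the two evaluation slides follow directly from the measured maximum request rates. The short check below recomputes them; the function name is an assumption, and the rates are the ones given on the slides.

```python
def improvement(with_rescue: float, without_rescue: float) -> float:
    """Ratio of the maximum request rate with rescue to the rate without."""
    return with_rescue / without_rescue


hc_origin = improvement(245, 118)  # high-capacity origin server
lc_origin = improvement(245, 49)   # low-capacity origin server
```

The high-capacity origin gains slightly more than a factor of 2, and the low-capacity origin gains a factor of 5. Both runs reach R=245, the point at which the database server's CPU reaches 100% in the first configuration.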

Page 20: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

CPU Utilization Control

Page 21: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Workload Migration

Page 22: DotSlash: Handling Web Hotspots  at Dynamic Content Web Sites

Conclusions

- DotSlash framework, prototype, evaluation
  - Fully automated rescue system, transparent to clients
  - Scalable
    - Get 10-fold improvement (static content)
    - Remove web server bottleneck (dynamic content)
- Future work
  - Remove database server bottleneck
- For further information
  - http://www.cs.columbia.edu/~zwb/project/dotslash
  - WCW’04