

Design, Implementation and Evaluation of a Client

Characterization Driven Web Server

Balachander Krishnamurthy – AT&T Research Labs
Craig Wills – WPI
Yin Zhang – AT&T Research Labs
Kashi Vishwanath – Duke University

Motivation

• Users with varying connectivity request a variety of files
• Web site wants no user to lose interest
• Handle requests based on connectivity

[Figure: end users connecting to the Web server]

Outline of our solution

• Figure out connectivity: use web server logs
• Classify clients (IPs): store in shared memory
  • Cluster: group of IPs [KW00]
• When a page is requested, the Web server:
  • Identifies client connectivity
  • Takes action if required: alternate content etc.

Example log entry:

122.3.4.22 - - [12/Apr/2003:01:40:22 -0400] "GET /foo.html HTTP/1.0" 200 147
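The log entry above follows the Common Log Format, so the fields a classifier needs (client IP, timestamp, requested URI) can be pulled out with a regular expression. A minimal sketch, assuming that format; the names `LOG_PATTERN` and `parse_log_line` are illustrative and do not come from the slides:

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the textual fields the classifier would compute with.
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


entry = parse_log_line(
    '122.3.4.22 - - [12/Apr/2003:01:40:22 -0400] "GET /foo.html HTTP/1.0" 200 147'
)
```

Grouping the parsed entries by `ip` and ordering them by `time` would give the per-client request sequences from which page access times can be estimated.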

System Architecture

[Figure: system architecture. Components: User Requests, Apache Processes, Server Log, Classifier, Classification Table (Shared Memory)]

What is a page access?

• User clicks: container page, e.g. /foo.html
• Subsequent requests: embedded objects, e.g. /img1.gif
• At server: how long before the user
  • starts getting the first object? E_first
  • gets the entire page? E_last
• Smoothed means for popular pages
• Classes (poor, rich, normal) [KW02]
  • Poor: E_first > 3 and E_last > 5
  • Rich: E_first <= 1 and E_last <= 2
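The thresholds above can be written as a small classification function. A minimal sketch: the thresholds are those on the slide, but the slides do not say how the means are smoothed, so the exponentially weighted mean and its weight `alpha` are assumptions made only for illustration.

```python
def smooth(previous_mean, sample, alpha=0.3):
    """Exponentially weighted mean; the method and alpha are assumed, not from the slides."""
    if previous_mean is None:
        return sample
    return (1 - alpha) * previous_mean + alpha * sample


def classify(e_first, e_last):
    """Map smoothed E_first and E_last values to a client class."""
    if e_first > 3 and e_last > 5:
        return "poor"
    if e_first <= 1 and e_last <= 2:
        return "rich"
    # Everything that is neither clearly poor nor clearly rich.
    return "normal"


# Hypothetical (E_first, E_last) samples for one client on popular pages.
e_first_mean = e_last_mean = None
for e_first, e_last in [(4.0, 7.0), (5.0, 8.0), (3.5, 6.0)]:
    e_first_mean = smooth(e_first_mean, e_first)
    e_last_mean = smooth(e_last_mean, e_last)

client_class = classify(e_first_mean, e_last_mean)
```

Note that a client must exceed both thresholds to be poor and meet both to be rich; a client that satisfies only one condition of either pair falls into normal.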

Modified Apache Server

• Modification to Apache 1.3.24
  • three files, less than 200 lines
  • http_main.c, http_protocol.c, http_core.c
• One-time overhead
  • Open cluster library
  • Read in configuration file
    /foo.html .gz .p_lc ; .r_mc
  • Initialize shared memory

Modified Apache Server: Per Request Changes

• Is it a popular page (URI)?
• Look up class (in shared memory data)
• Cluster lookup if IP unavailable
• Tailored action if appropriate, and log it
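The per-request steps above can be sketched outside Apache as plain table lookups. This is an illustration under stated assumptions, not the modified server's code: the dictionaries stand in for the shared-memory tables, clusters are modelled as CIDR prefixes, and the class-to-suffix mapping in `ACTION_SUFFIX` is hypothetical.

```python
import ipaddress

POPULAR_PAGES = {"/foo.html"}          # hypothetical set of popular container pages
CLASS_BY_IP = {"122.3.4.22": "poor"}   # stand-in for the shared-memory class table
CLASS_BY_CLUSTER = {                   # hypothetical clusters, modelled as prefixes
    ipaddress.ip_network("10.1.0.0/16"): "rich",
}
ACTION_SUFFIX = {"poor": ".gz"}        # assumed mapping from class to alternate content


def lookup_class(ip):
    """Class of an IP; fall back to its cluster when the IP itself is unknown."""
    if ip in CLASS_BY_IP:
        return CLASS_BY_IP[ip]
    addr = ipaddress.ip_address(ip)
    matches = [net for net in CLASS_BY_CLUSTER if addr in net]
    if not matches:
        return None
    # Prefer the most specific (longest) matching prefix.
    best = max(matches, key=lambda net: net.prefixlen)
    return CLASS_BY_CLUSTER[best]


def handle_request(ip, uri, log):
    """Return the URI to serve, rewriting it when a tailored action applies."""
    if uri not in POPULAR_PAGES:
        return uri
    client_class = lookup_class(ip)
    suffix = ACTION_SUFFIX.get(client_class)
    if suffix is None:
        return uri
    new_uri = uri + suffix
    log.append((ip, uri, new_uri, client_class))  # record the changed request
    return new_uri
```

For example, `handle_request("122.3.4.22", "/foo.html", [])` returns `/foo.html.gz` under these tables, while a request for a page outside `POPULAR_PAGES` is returned unchanged.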

Testing: How to choose test pages

• Proxy logs from a large manufacturing company (Dec 2001)
  • Over 100,000 users
• Select 1000 most popular pages
• Download these: April 2002
• 641 pages successfully reconstructed
• 33rd and 66th percentile values for characteristics
  • Container bytes, # of embedded objects, embedded bytes
• 3 x 3 x 3 = 27 buckets of pages
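The bucketing above can be sketched as follows: each of the three characteristics is split at its 33rd and 66th percentile, and a page's bucket is the triple of small/medium/large labels. The nearest-rank percentile rule, the field names and the sample pages are assumptions for illustration.

```python
import math

KEYS = ("container_bytes", "embedded_objects", "embedded_bytes")


def tercile_cutoffs(values):
    """Values at the 33rd and 66th percentile, using the nearest-rank rule."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[max(0, math.ceil(0.33 * n) - 1)]
    high = ordered[max(0, math.ceil(0.66 * n) - 1)]
    return low, high


def size_label(value, cutoffs):
    """S, M or L depending on where the value falls relative to the cutoffs."""
    low, high = cutoffs
    if value <= low:
        return "S"
    if value <= high:
        return "M"
    return "L"


def bucket(page, cutoffs):
    """Bucket name such as 'S-M-L', in the order given by KEYS."""
    return "-".join(size_label(page[key], cutoffs[key]) for key in KEYS)


# Hypothetical pages; the real input would be the 641 reconstructed pages.
pages = [
    {"container_bytes": 1000 * i, "embedded_objects": i, "embedded_bytes": 5000 * i}
    for i in range(1, 10)
]
cutoffs = {key: tercile_cutoffs([page[key] for page in pages]) for key in KEYS}
buckets = [bucket(page, cutoffs) for page in pages]
```

With three labels per characteristic there are 3 x 3 x 3 = 27 possible bucket names, although a given page set need not populate all of them.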

Testing: Distribution for test pages

Percentage of pages in each bucket

                  Embedded Bytes: Small    Embedded Bytes: Medium   Embedded Bytes: Large
Embedded          Container Bytes          Container Bytes          Container Bytes
Objects           Small  Medium  Large     Small  Medium  Large     Small  Medium  Large
Small             20     6       2         4      1       0         0      0       0
Medium            2      3       1         5      7       8         2      5       3
Large             0      0       0         1      2       4         1      8       14

Server Overhead: Latency increase at client

• Generate IPs (X-IP header)

Overhead (usec)

Step                               Mean   Median   Std. dev.
Is URI a container document?       19.2   2        12.1
Class lookup in shared memory      12.5   9        5.8
Cluster related overhead
  Converting IP address            4.2    3        6.1
  Looking up cluster               8      7        4.8
  Cluster lookup in shared memory  2.5    2        4.9
  Classification based on cluster  0.7    0        4.4
Server Actions
  Modifying URI                    25.5   25       6.1
  Logging changed request          2.8    3        3.6
Total overhead                     75.4   51       18.2
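As a quick arithmetic check, the mean and median in the total row equal the column sums of the individual steps (the standard deviation does not, as expected). A short sketch with the values copied from the table; the variable names are illustrative:

```python
# (mean, median) in microseconds for each step, copied from the table.
steps = {
    "Is URI a container document?": (19.2, 2),
    "Class lookup in shared memory": (12.5, 9),
    "Converting IP address": (4.2, 3),
    "Looking up cluster": (8, 7),
    "Cluster lookup in shared memory": (2.5, 2),
    "Classification based on cluster": (0.7, 0),
    "Modifying URI": (25.5, 25),
    "Logging changed request": (2.8, 3),
}

# Round to one decimal to absorb floating-point error in the sum.
total_mean = round(sum(mean for mean, _ in steps.values()), 1)
total_median = sum(median for _, median in steps.values())
```

Modifying the URI and checking whether the URI is a container document together account for more than half of the mean overhead.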

Stress Test

[Figure: plot of average increase in processing time for the modified server. x-axis: number of concurrent connections (1 to 1000); y-axis: average server overhead % (0 to 14)]

Placing Clients and Modified Servers

• Prototype Apache Server with our test site
  • Linux: att.com in NJ, USA
  • Linux: wpi.edu in MA, USA
  • FreeBSD: icir.org in CA, USA
• Clients
  • att: AT&T Labs-Research, NJ, USA
  • de: Saarbruecken University, Germany
  • cable: cable modem user, NJ, USA
  • modem: 56Kbps dialup modem user, NJ, USA
  • uk: London, U.K., via a dedicated 56Kbps line

Clients: Observed network characteristics

[Figure: scatter plot of round-trip time (ms, 0 to 900) against throughput (KB/sec, 0 to 100) for client-server pairs. Labelled points: uk-icir, uk-att, cable-icir, uk-wpi, de-att, att-icir, att-wpi]

• Spans a wide spectrum

Experiments: Httperf Clients

• Request similar mix of pages as described earlier
• 200 random requests each
• New headers: X-Server-Actions, X-Class
• Baseline measure
  • parallel-1.0: up to 4 parallel HTTP/1.0 requests
• Server actions
  • Manner of delivery: compress, serial-1.1, pipeline-1.1, bundle
  • Amount served: base-only, half-object, half-resolution
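Of the server actions listed, compression is the easiest to illustrate in isolation. The slides do not say how the compressed variants are produced, so the sketch below simply assumes gzip and a made-up page body, and measures how many bytes the action would save on the wire.

```python
import gzip


def compress_body(body: bytes) -> bytes:
    """Gzip-compress a response body (assumed compression method)."""
    return gzip.compress(body)


# Hypothetical container page; real HTML compresses less uniformly than this.
page = b"<html><body>" + b"<p>hello world</p>" * 200 + b"</body></html>"
compressed = compress_body(page)

# Percentage of bytes saved by sending the compressed variant.
saving_percent = 100 * (1 - len(compressed) / len(page))
```

Fewer bytes help most where throughput is the bottleneck, which fits compression being listed as a delivery action for less well-connected clients.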

Classifier Stability and Correctness

• Overall consistent with expectation: 90%

Pair        Poor   Normal   Rich
att-icir    -      11%      89%
att-wpi     -      -        100%
cable-att   -      11%      89%
cable-wpi   -      12%      88%
cable-icir  10%    88%      2%
de-att      73%    27%      -
de-icir     7%     92%      1%
de-wpi      24%    74%      2%
modem-att   99%    1%       -
modem-icir  99%    1%       -
modem-wpi   97%    3%       -
uk-att      100%   -        -
uk-icir     100%   -        -
uk-wpi      97%    3%       -

[Slide annotations beside the table: Rich, Moderate, Poor]

Results: Low throughput client

uk-wpi

Columns: Container Bytes - Embedded Objects - Embedded Bytes

Action      S-S-S   S-M-M   M-M-M   M-M-L    M-L-L    L-M-M    L-L-M    L-L-L
para-1.0    2.05s   5.95s   8.63s   12.70s   17.99s   11.72s   12.64s   21.08s
pipe-1.1    10      19      13      6        7        10       15       6
compress    15      5       28      20       14       41       40       22
bundle      0       9       7       7        18       5        15       15
bundle.gz   0       26      19      20       28       14       22       24
baseonly    68      89      74      82       87       65       67       80
halfobject  48      40      23      33       27       20       24       25
halfres     6       8       8       0        15       6        5        13

Results: Medium throughput client

de-att

Columns: Container Bytes - Embedded Objects - Embedded Bytes

Action      S-S-S   S-M-M   M-M-M   M-M-L   M-L-L    L-M-M   L-L-M    L-L-L
para-1.0    2.47s   8.18s   8.60s   7.49s   11.09s   9.44s   10.19s   11.93s
pipe-1.1    48      66      56      35      39       59      46       42
compress    0       0       0       0       0        10      0        8
bundle      28      75      74      70      76       70      77       73
bundle.gz   32      71      70      70      79       67      80       75
baseonly    82      94      90      88      92       89      90       91
halfobject  48      32      33      48      30       35      13       32
halfres     0       13      11      0       6        19      0        12

Results: Summary

Server Action              Benefit
removing embedded objects  all cases
half resolution            modem client
compression                except well-connected
bundling                   better connected clients, large latency for poor clients
bundle.gz                  except better-connected clients
serial-1.1                 never
pipe-1.1                   high throughput or RTT

Conclusions

• What do we do?
  • Online client classification
  • Deliver modified server actions
  • Measure latency reduction to different clients
  • Compare various actions
• First to do this in a unified framework.

Conclusions: Server and Classifier

• Overhead at server
  • Average: 75 usec. Negligible for end user
  • Turn off classification during overload
  • Serve poor-content during overload
• Classification
  • Close to expectation
  • Stable over the duration of experiment
  • Improve by using select pages and better thresholds

Future

• Other server actions
  • Delta encoding
• Policies regarding cacheability of objects
• Create a test for clustering

Acknowledgement

• Client testing
  • Saarbruecken, Germany
  • ICIR, USA
• Proxy logs: manufacturing company

Thanks!

Questions?

Slides: http://www.cs.duke.edu/~kvv/www.ppt

Related Work

• Mark network packets [NT02]
• Improve performance of all clients
• Adapt content based on
  • server load [AB99]
  • User's expectation [BBK00]
• Alternate admission control and server scheduling policies [CMC01, CR02]
• Dynamically altering multimedia content [MB01]
• Explicit client notification, e.g. SPAND [SSK97]