Post on 21-Dec-2015
Design, Implementation and Evaluation of a Client
Characterization Driven Web Server
Balachander Krishnamurthy – AT&T Research Labs
Craig Wills – WPI
Yin Zhang – AT&T Research Labs
Kashi Vishwanath – Duke University
Motivation
• Users with varying connectivity request a variety of files
• The Web site wants no user to lose interest
• Handle requests based on connectivity
[Figure: end users connecting to the Web server]
Outline of our solution
• Figure out connectivity: use Web server logs
• Classify clients (IPs): store in shared memory
  - Cluster: group of IPs [KW00]
• When a page is requested, the Web server:
  - Identifies client connectivity
  - Takes action if required: alternate content etc.
• Example log entry:
  122.3.4.22 - - [12/Apr/2003:01:40:22 -0400] "GET /foo.html HTTP/1.0" 200 147
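The log entry above follows the Common Log Format, so the first step (extracting the client IP, timestamp, and requested URI) can be sketched as below. This is only an illustration: the function name `parse_log_line` and the field names are assumptions, not part of the presented system.

```python
import re
from datetime import datetime

# Common Log Format: host ident authuser [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Return a dict of fields from one access-log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    # Timestamp format used by Apache access logs, e.g. 12/Apr/2003:01:40:22 -0400
    fields["ts"] = datetime.strptime(fields["ts"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # A "-" in the size column means no body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

entry = parse_log_line(
    '122.3.4.22 - - [12/Apr/2003:01:40:22 -0400] "GET /foo.html HTTP/1.0" 200 147'
)
```

A classifier could group such entries by `ip` and use the timestamps of a container page and its embedded objects to estimate per-client delays.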
System Architecture
[Figure: components are the server log, the classifier, the classification table (shared memory), user requests, and the Apache processes]
What is a page access?
• User clicks: container page, e.g. /foo.html
• Subsequent requests: embedded objects, e.g. /img1.gif
• At server: how long before the user
  - starts getting the first object? E_first
  - gets the entire page? E_last
• Smoothed means for popular pages
• Classes (poor, rich, normal) [KW02]
  - Poor: E_first > 3 and E_last > 5
  - Rich: E_first <= 1 and E_last <= 2
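The thresholds above translate directly into a small classification routine. The sketch below assumes that E_first and E_last are measured in seconds and that the "smoothed mean" is an exponentially weighted average; the slides state neither, and the weight `alpha` and the sample values are hypothetical.

```python
def smooth(previous, sample, alpha=0.3):
    """Exponentially smoothed mean; alpha is a hypothetical weight, not from the slides."""
    if previous is None:
        return sample
    return (1 - alpha) * previous + alpha * sample

def classify(e_first, e_last):
    """Map smoothed E_first / E_last (assumed to be in seconds) to a client class."""
    if e_first > 3 and e_last > 5:
        return "poor"
    if e_first <= 1 and e_last <= 2:
        return "rich"
    return "normal"

# Hypothetical per-access measurements for one client: (E_first, E_last)
samples = [(4.0, 9.0), (3.5, 6.0), (5.0, 8.0)]
mean_first = mean_last = None
for first, last in samples:
    mean_first = smooth(mean_first, first)
    mean_last = smooth(mean_last, last)

client_class = classify(mean_first, mean_last)
```

Any client that satisfies neither the poor nor the rich condition falls into the normal class.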
Modified Apache Server
• Modification to Apache 1.3.24
  - three files, less than 200 lines
  - http_main.c, http_protocol.c, http_core.c
• One-time overhead
  - Open cluster library
  - Read in configuration file
    /foo.html .gz .p_lc ; .r_mc
  - Initialize shared memory
Modified Apache Server: Per-Request Changes
• Is it a popular page (URI)?
• Look up class (in shared memory data)
• Cluster lookup if IP unavailable
• Tailored action if appropriate, and log it
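The per-request steps can be sketched as follows. This is not the Apache modification itself (which is written in C): the dictionaries stand in for the shared-memory tables, the cluster is represented as a network prefix, the default class "normal" for unknown clients is an assumption, and appending ".gz" is only one illustrative action.

```python
import ipaddress

# Hypothetical stand-ins for the shared-memory tables described on the slides.
POPULAR_URIS = {"/foo.html"}
CLASS_BY_IP = {"122.3.4.22": "poor"}
CLASS_BY_CLUSTER = {ipaddress.ip_network("10.1.0.0/16"): "rich"}

def lookup_class(client_ip):
    """Return the class for an IP, falling back to its cluster, else 'normal'."""
    if client_ip in CLASS_BY_IP:
        return CLASS_BY_IP[client_ip]
    addr = ipaddress.ip_address(client_ip)
    # Longest-prefix match over the known clusters.
    matches = [net for net in CLASS_BY_CLUSTER if addr in net]
    if matches:
        best = max(matches, key=lambda net: net.prefixlen)
        return CLASS_BY_CLUSTER[best]
    return "normal"  # assumed default when nothing is known about the client

def handle_request(client_ip, uri, log):
    """Return the URI to serve, rewriting it only for popular pages and poor clients."""
    if uri not in POPULAR_URIS:
        return uri
    client_class = lookup_class(client_ip)
    if client_class == "poor":
        new_uri = uri + ".gz"  # illustrative action: serve a precompressed variant
        log.append((client_ip, uri, new_uri, client_class))
        return new_uri
    return uri
```

Requests for pages outside the popular set return immediately, which keeps the added work off the common path.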
Testing: How to choose test pages
• Proxy logs from a large manufacturing company (Dec 2001)
  - Over 100,000 users
• Select 1000 most popular pages
• Download these: April 2002
• 641 pages successfully reconstructed
• 33rd and 66th percentile values for characteristics
  - Container bytes, number of embedded objects, embedded bytes
• 3 x 3 x 3 = 27 buckets of pages
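The bucketing step can be illustrated with a short sketch: compute the 33rd and 66th percentile of each characteristic, then label every page small, medium, or large on each of the three axes. The nearest-rank percentile, the boundary handling (`<=` for small, `>` for large), and the sample pages are assumptions for illustration only.

```python
def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(values)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceiling division
    return ordered[rank - 1]

def size_label(value, low, high):
    """'S' at or below the 33rd percentile, 'L' above the 66th, else 'M'."""
    if value <= low:
        return "S"
    if value > high:
        return "L"
    return "M"

def bucket_pages(pages):
    """pages: dict name -> (container_bytes, embedded_objects, embedded_bytes)."""
    columns = list(zip(*pages.values()))
    cuts = [(percentile(col, 33), percentile(col, 66)) for col in columns]
    return {
        name: "-".join(size_label(v, lo, hi) for v, (lo, hi) in zip(features, cuts))
        for name, features in pages.items()
    }

# Hypothetical pages, not taken from the study's data set.
pages = {
    "a": (1_000, 2, 5_000),
    "b": (2_000, 4, 9_000),
    "c": (8_000, 10, 40_000),
    "d": (9_000, 12, 45_000),
    "e": (30_000, 40, 200_000),
    "f": (35_000, 45, 250_000),
}
buckets = bucket_pages(pages)
```

The resulting labels use the same order as the results tables: container bytes, embedded objects, embedded bytes.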
Testing: Distribution for test pages
Percentage of pages in each bucket
Embedded   | Embedded Bytes: Small   | Embedded Bytes: Medium  | Embedded Bytes: Large
Objects    | Container Bytes         | Container Bytes         | Container Bytes
           | Small   Medium  Large   | Small   Medium  Large   | Small   Medium  Large
Small      | 20      6       2       | 4       1       0       | 0       0       0
Medium     | 2       3       1       | 5       7       8       | 2       5       3
Large      | 0       0       0       | 1       2       4       | 1       8       14
Server Overhead: Latency increase at client
• Generate IPs (X-IP header)
Overhead (usec)
Step                                 Mean    Median  Std.dev.
Is URI a container document?         19.2    2       12.1
Class lookup in shared memory        12.5    9       5.8
Cluster-related overhead
  Converting IP address              4.2     3       6.1
  Looking up cluster                 8       7       4.8
  Cluster lookup in shared memory    2.5     2       4.9
  Classification based on cluster    0.7     0       4.4
Server actions
  Modifying URI                      25.5    25      6.1
  Logging changed request            2.8     3       3.6
Total overhead                       75.4    51      18.2
Stress Test
[Figure: plot of average increase in processing time for the modified server. x-axis: number of concurrent connections (1 to 1000); y-axis: average server overhead % (0 to 14)]
Placing Clients and Modified Servers
• Prototype Apache server with our test site
  - Linux: att.com in NJ, USA
  - Linux: wpi.com MA, USA
  - FreeBSD: icir.org CA, USA
• Clients
  - att: AT&T Labs-Research, NJ, USA
  - de: Saarbruecken University, Germany
  - cable: cable modem user, NJ, USA
  - modem: 56 Kbps dialup modem user, NJ, USA
  - uk: London, U.K., via a dedicated 56 Kbps line
Clients: Observed network characteristics
[Figure: round-trip time (ms, 0 to 900) plotted against throughput (KB/sec, 0 to 100) for client-server pairs; labeled points include uk-icir, uk-att, cable-icir, uk-wpi, de-att, att-icir, att-wpi]
• Spans a wide spectrum
Experiments: Httperf Clients
• Request similar mix of pages as described earlier
• 200 random requests each
• New headers: X-Server-Actions, X-Class
• Baseline measure
  - parallel-1.0: up to 4 parallel HTTP/1.0 requests
• Server actions
  - Manner of delivery: compress, serial-1.1, pipeline-1.1, bundle
  - Amount served: base-only, half-object, half-resolution
Classifier Stability and Correctness
• Overall consistent with expectation: 90%
Pair         Poor    Normal  Rich
att-icir     -       11%     89%
att-wpi      -       -       100%
cable-att    -       11%     89%
cable-wpi    -       12%     88%
cable-icir   10%     88%     2%
de-att       73%     27%     -
de-icir      7%      92%     1%
de-wpi       24%     74%     2%
modem-att    99%     1%      -
modem-icir   99%     1%      -
modem-wpi    97%     3%      -
uk-att       100%    -       -
uk-icir      100%    -       -
uk-wpi       97%     3%      -
Group labels on the slide: Rich, Moderate, Poor
Results: Low throughput client
uk-wpi
Column labels: Container Bytes - Embedded Objects - Embedded Bytes
Action      S-S-S   S-M-M   M-M-M   M-M-L   M-L-L   L-M-M   L-L-M   L-L-L
para-1.0    2.05s   5.95s   8.63s   12.70s  17.99s  11.72s  12.64s  21.08s
pipe-1.1    10      19      13      6       7       10      15      6
compress    15      5       28      20      14      41      40      22
bundle      0       9       7       7       18      5       15      15
bundle.gz   0       26      19      20      28      14      22      24
baseonly    68      89      74      82      87      65      67      80
halfobject  48      40      23      33      27      20      24      25
halfres     6       8       8       0       15      6       5       13
Results: Medium throughput client
de-att
Column labels: Container Bytes - Embedded Objects - Embedded Bytes
Action      S-S-S   S-M-M   M-M-M   M-M-L   M-L-L   L-M-M   L-L-M   L-L-L
para-1.0    2.47s   8.18s   8.60s   7.49s   11.09s  9.44s   10.19s  11.93s
pipe-1.1    48      66      56      35      39      59      46      42
compress    0       0       0       0       0       10      0       8
bundle      28      75      74      70      76      70      77      73
bundle.gz   32      71      70      70      79      67      80      75
baseonly    82      94      90      88      92      89      90      91
halfobject  48      32      33      48      30      35      13      32
halfres     0       13      11      0       6       19      0       12
Results: Summary
Server Action               Benefit
removing embedded objects   all cases
half resolution             modem client
compression                 except well-connected
bundling                    better connected clients, large latency for poor clients
bundle.gz                   except better-connected clients
serial-1.1                  never
pipe-1.1                    high throughput or RTT
Conclusions
• What do we do?
  - Online client classification
  - Deliver modified server actions
  - Measure latency reduction to different clients
  - Compare various actions
• First to do this in a unified framework.
Conclusions: Server and Classifier
• Overhead at server
  - Average: 75 usec, negligible for the end user
  - Turn off classification during overload
  - Serve poor-content during overload
• Classification
  - Close to expectation
  - Stable over the duration of the experiment
  - Improve by using select pages and better thresholds
Future
• Other server actions
  - Delta encoding
  - Policies regarding cacheability of objects
• Create a test for clustering
Acknowledgement
• Client testing
  - Saarbruecken, Germany
  - ICIR, USA
• Proxy logs: manufacturing company
Related Work
• Mark network packets [NT02]
• Improve performance of all clients
• Adapt content based on server load [AB99]
• User's expectation [BBK00]
• Alternate admission control and server scheduling policies [CMC01, CR02]
• Dynamically altering multimedia content [MB01]
• Explicit client notification, e.g. SPAND [SSK97]