October 15, 2002 MASCOTS 2002
WebTraff: A GUI for Web Proxy Cache
Workload Modeling and Analysis
Nayden Markatchev
Carey Williamson
Department of Computer Science
University of Calgary
Introduction
What is WebTraff?
- An extended and improved version of ProWGen (Proxy Workload Generator), with a GUI front end to a set of tools for Web traffic modeling and analysis
Purpose: To make it easy to generate and analyze controllable, representative workloads for Web caching simulations
Talk Overview
- WebTraff General Information
  - System Requirements, Data Formats, Assumptions, Inputs, Outputs, Usage
- Simple Demo
  - Using WebTraff to generate and analyze a workload, plus Web proxy cache simulation
- Questions and Discussion
System Requirements
- Software Requirements
  - Unix-based environment running X Windows
  - cc, gcc, g++, tcl 8.0 or newer, tk 8.0 or newer, wish, perl 5.0 or newer, gnuplot, gs
- Hardware Requirements
  - 64 MB or more RAM
  - 100 MB hard disk space (for storing long workload traces)
- Future Work: Port to Windows (volunteers?)
Overview of WebTraff
The WebTraff toolkit provides three main functions:
- Web workload trace generation
- Web workload trace analysis
- Web proxy cache simulation
Graphs displayed in PostScript format
Web Workload Generation
This portion of the tool provides a GUI to ProWGen [Busari/Williamson 2001]
ProWGen models four key characteristics of Web proxy workloads:
- Zipf-like document popularity distribution
- High degree of “one-time” referencing
- Heavy-tailed file and transfer size distributions
- Temporal locality property in references
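To make the first and third characteristics concrete, here is a minimal Python sketch of a trace generator with Zipf-like popularity and Pareto-distributed sizes. It is not ProWGen's implementation: the function names, the parameter names, and the choice of a pure Pareto distribution for the whole size range are assumptions made for illustration, and the sketch deliberately leaves out one-timers and temporal locality.

```python
import random

def zipf_weights(num_objects, slope):
    """Relative popularity of the object at rank r is 1 / r**slope."""
    return [1.0 / (rank ** slope) for rank in range(1, num_objects + 1)]

def pareto_size(rng, tail_index, min_size):
    """Draw one size from a Pareto distribution by inverse transform."""
    u = 1.0 - rng.random()  # uniform in (0, 1], so we never divide by zero
    return min_size / (u ** (1.0 / tail_index))

def generate_trace(num_refs, num_objects, zipf_slope, tail_index,
                   min_size=1000, seed=1):
    """Return a list of (object_id, size_in_bytes) references.

    Hypothetical helper: object 0 is the most popular, and each object
    keeps one fixed size for the whole trace.
    """
    rng = random.Random(seed)
    weights = zipf_weights(num_objects, zipf_slope)
    sizes = [int(pareto_size(rng, tail_index, min_size))
             for _ in range(num_objects)]
    ids = rng.choices(range(num_objects), weights=weights, k=num_refs)
    return [(obj, sizes[obj]) for obj in ids]

if __name__ == "__main__":
    for obj, size in generate_trace(10, 50, zipf_slope=0.8, tail_index=1.2):
        print(obj, size)
```

A smaller `zipf_slope` flattens the popularity profile, and a smaller `tail_index` makes very large objects more likely.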
Web Workload Generation (cont’d)
- Name of trace file being generated
- Sliding widgets for:
  - Number of references (lines) in a workload file
  - Number of distinct Web objects in workload
  - Percentage of objects that are “one-timers”
  - Slope of Zipf-like document popularity profile
  - Slope of Pareto tail for document size distribution
  - Degree of statistical correlation (if any) between size and popularity for Web objects
Web Workload Generation (cont’d)
The notion of “temporal locality” refers to temporal correlation in referencing behaviour (e.g., the recent past is a good predictor of the near future)
Four models for referencing behaviour:
- Independent Reference Model (IRM)
- Static LRU Stack Model (SLRU)
- Dynamic LRU Stack Model (DLRU)
- New LRU Stack Model (NLRU)
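The contrast between the IRM and an LRU stack model can be sketched in a few lines of Python. This is the generic textbook form of both models, not the specific static, dynamic, or new variants in ProWGen, whose details are not given here. The function names and the `depth_weights` parameter are assumptions for illustration.

```python
import random

def irm_trace(num_refs, weights, rng):
    """Independent Reference Model: each reference is an independent
    draw from a fixed popularity distribution, so there is no
    temporal correlation between references."""
    return rng.choices(range(len(weights)), weights=weights, k=num_refs)

def lru_stack_trace(num_refs, num_objects, depth_weights, rng):
    """Generic LRU stack model: draw a stack depth, reference the object
    found at that depth, then move it to the top of the stack.

    depth_weights[d] is the relative probability of choosing depth d
    (0 = most recently used). Putting more weight on small depths
    produces stronger temporal locality.
    """
    stack = list(range(num_objects))  # index 0 is the most recently used
    trace = []
    for _ in range(num_refs):
        depth = rng.choices(range(num_objects), weights=depth_weights, k=1)[0]
        obj = stack.pop(depth)
        stack.insert(0, obj)
        trace.append(obj)
    return trace

if __name__ == "__main__":
    rng = random.Random(7)
    weights = [1.0 / (d + 1) for d in range(20)]
    print(irm_trace(10, weights, rng))
    print(lru_stack_trace(10, 20, weights, rng))
```

In the IRM the weights attach to objects, whereas in the stack model they attach to stack positions, which is why the latter can make a recently referenced object likely to be referenced again.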
Web Workload Generation (cont’d)
“Popularity Bias” parameter (hack!)
- This button was added to remedy a problem in an earlier version of ProWGen, which tended to choose one-timers early in the trace and popular documents late in the trace
- Can now control this in workload generation
- Can visually check for stationarity of cache hit ratio during simulations
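WebTraff supports this stationarity check visually through its plots. A numeric counterpart might look like the sketch below, which computes the hit ratio over consecutive windows of references. The function name and the hit/miss list it takes as input are assumptions, not part of the toolkit.

```python
def windowed_hit_ratio(hits, window):
    """Hit ratio over consecutive, non-overlapping windows.

    hits   -- sequence of booleans, one per reference (True = cache hit)
    window -- number of references per window; a trailing partial
              window is dropped
    """
    ratios = []
    for start in range(0, len(hits) - window + 1, window):
        chunk = hits[start:start + window]
        ratios.append(sum(chunk) / window)
    return ratios

if __name__ == "__main__":
    outcome = [True, False, True, True, False, False]
    print(windowed_hit_ratio(outcome, 2))
```

If the per-window ratios drift steadily upward over the trace, that suggests the kind of bias described above, with popular documents concentrated late in the trace.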
Web Workload Analysis
Two main categories of analysis functions:
- Time series analysis (on the left)
- Web workload analysis (on the right)
Radio buttons, slide bars and text boxes available to control plotting characteristics
Web Proxy Cache Simulation
Application-level caching simulation parameters:
- Cache size
- Cache replacement policy
Five replacement policies currently available:
- Random replacement (RAND)
- First-In-First-Out (FIFO)
- Least-Recently-Used (LRU) (default setting)
- Least-Frequently-Used (LFU)
- Greedy-Dual-Size (GDS)
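As an illustration of how cache size and replacement policy interact, here is a small trace-driven simulation in Python covering only two of the five policies, LRU and FIFO. It is a sketch under assumptions, not WebTraff's simulator: the function name, the (object_id, size) trace format, and the choice to skip objects larger than the whole cache are all hypothetical.

```python
from collections import OrderedDict

def simulate_cache(trace, capacity_bytes, policy="LRU"):
    """Return the document hit ratio for a trace of (object_id, size) pairs."""
    if policy not in ("LRU", "FIFO"):
        raise ValueError("this sketch only implements LRU and FIFO")
    cache = OrderedDict()  # object_id -> size, next eviction victim first
    used = 0
    hits = 0
    for obj, size in trace:
        if obj in cache:
            hits += 1
            if policy == "LRU":
                cache.move_to_end(obj)  # a hit refreshes recency under LRU
            continue
        if size > capacity_bytes:
            continue  # assumption: objects larger than the cache are not stored
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict from the front
            used -= evicted_size
        cache[obj] = size
        used += size
    return hits / len(trace) if trace else 0.0

if __name__ == "__main__":
    trace = [(1, 1), (2, 1), (1, 1), (3, 1), (1, 1)]
    for policy in ("LRU", "FIFO"):
        print(policy, simulate_cache(trace, 2, policy))
```

The only difference between the two policies here is whether a hit moves the object to the back of the eviction order. That difference is enough to change the hit ratio on the small example trace.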
For More Information…
WebTraff toolkit: http://www.cpsc.ucalgary.ca/~carey/software.htm
“ProWGen: A Synthetic Workload Generation Tool for the Simulation Evaluation of Web Proxy Caches”, Busari/Williamson, Computer Networks, Vol. 38, No. 6, June 2002: http://www.cpsc.ucalgary.ca/~carey/publications.htm
Contact information: Email {carey,nayden}@cpsc.ucalgary.ca