Microsoft Research 1
Characterizing Alert and Browse Services for Mobile Clients
Atul Adya, Victor Bahl, Lili Qiu
Microsoft Research
USENIX Annual Technical Conference, Monterey, CA, June 2002
Outline
  Motivation
  Related Work
  Overview of Data Logs and Key Results
  Detailed Analysis
    Notification Services
    Browse Services
    Correlation between the Two Services
  Summary and Implications
Motivation
Wireless web services
  Becoming popular
  Crucial to understand usage patterns
  Few existing studies on how they are used
Related Work
Workload of clients on wireline networks
  Server-based studies
    NASA, ClarkNet, MSNBC, WorldCup, …
  Proxy-based studies
    NLANR, Digital, UW, …
  Client-based studies
    Boston Univ., WebTV, …
Workload of wireless clients
  Kunz et al. 2000
    Only 80K requests over seven months
No existing study on notification usage
Overview
A popular commercial Web site for mobile clients
  Content
    news, weather, stock quotes, email, yellow pages, travel reservations, entertainment, etc.
  Services
    Notification
    Browse
  Period studied
    3.25 million notifications in Aug. 20 – 26, 2000
    33 million browse requests in Aug. 15 – 26, 2000
Overview: User Categories
Cellular users
  Browse the Web in real time using cellular technologies
Offline users
  Download content onto their PDAs for later (offline) browsing, e.g. AvantGo
Desktop users
  Sign up for services and specify preferences
Notification log has 200,860 users (99% were wireless users)
Browse log:
User Type   # Users    # Requests
Cellular    58,432     2,210,758
Offline     50,968     20,508,272
Desktop     639,971    7,342,206
Misc.       1,634      2,944,708
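A quick way to read the browse-log table is to turn the counts into shares and requests per user. The sketch below only restates the four rows from the slide; the dictionary name and the derived columns are illustrative and not part of the original deck. The request counts sum to roughly 33 million, which agrees with the period summary.

```python
# Browse-log counts copied from the table on this slide: (users, requests).
browse_log = {
    "Cellular": (58_432, 2_210_758),
    "Offline": (50_968, 20_508_272),
    "Desktop": (639_971, 7_342_206),
    "Misc.": (1_634, 2_944_708),
}

total_users = sum(users for users, _ in browse_log.values())
total_requests = sum(requests for _, requests in browse_log.values())

# Print each user type's share of users and requests, plus requests per user.
for user_type, (users, requests) in browse_log.items():
    user_share = 100.0 * users / total_users
    request_share = 100.0 * requests / total_requests
    per_user = requests / users
    print(f"{user_type:9s} {user_share:5.1f}% of users  "
          f"{request_share:5.1f}% of requests  {per_user:8.1f} requests/user")
```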
Major Findings
Notification Services
  Popularity of notification messages follows a Zipf-like distribution
  Top 1% notification objects account for 54-64% of total messages
  Exhibits geographical locality
Browse Services
  0.1% - 0.5% urls account for 90% of requests
  The set of popular urls remains stable
Correlation between the two services
  Correlation is limited
Notification Log Analysis
Types of Analyses
  Content analysis
  Notification message popularity
  User behavior analysis
    Geographical locality
Content Analysis
[Figure: bar chart of KBytes sent (0 to 10,000) per notification category, weekday vs. weekend. Categories: Calendar, Horoscopes, HotMail, Lottery, News, IM Note, EngineMaxMsg.Warnings, Quotes, Auth.Codes, Sports, Weather.]
Important to content providers and notification service designers
Popular categories: weather, news, stock quotes, email.
Notification Message Popularity
Researchers have found that Web accesses follow a Zipf-like distribution (i.e., # requests for the i-th most popular document ∝ 1/i^α)
[Figure: log-log plot of # transmissions (1 to 10^6) vs. popularity ranking of msg (1 to 1,000,000); series: trace, least-squares line fit.]
Notification message popularity follows a Zipf-like distribution (α ∈ [1.1, 1.3]), which is useful for generating synthetic traces
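The least-squares line fit on the log-log plot can be sketched as follows. This is a minimal illustration, not the authors' tooling: it estimates the Zipf exponent (written `alpha` here) as the negated slope of log(count) against log(rank), and then draws a synthetic trace from a Zipf-like distribution with that exponent. The function names and the use of Python's standard library are assumptions made for the example.

```python
import math
import random

def fit_zipf_alpha(counts):
    """Least-squares fit of log(count) = c - alpha * log(rank); returns alpha."""
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # slope is negative for a Zipf-like curve

def synthetic_trace(num_docs, alpha, num_msgs, seed=0):
    """Draw num_msgs document ranks with P(rank i) proportional to 1 / i**alpha."""
    rng = random.Random(seed)
    weights = [1.0 / (i ** alpha) for i in range(1, num_docs + 1)]
    return rng.choices(range(1, num_docs + 1), weights=weights, k=num_msgs)
```

For example, `fit_zipf_alpha` could be applied to per-message transmission counts from a log, and the resulting exponent passed to `synthetic_trace` to produce a workload with a similar popularity skew.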
Notification Msg Popularity (Cont.)
Notification msgs are highly concentrated on a small number of documents
  Top 1% notification documents account for 54% - 64% of the total messages
[Figure: fraction of total messages (0 to 1) vs. fraction of notification documents (0 to 1).]
Application-level multicast would be an efficient way of delivering popular notifications.
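The concentration figure (share of messages carried by the most popular documents) can be computed from per-document counts. The sketch below is illustrative only; the function name and the rounding rule for the number of top documents are assumptions.

```python
def top_share(counts, fraction):
    """Fraction of all messages that go to the top `fraction` of documents."""
    ranked = sorted(counts, reverse=True)
    # Always keep at least one document, even for very small fractions.
    k = max(1, int(round(fraction * len(ranked))))
    return sum(ranked[:k]) / sum(ranked)
```

Calling `top_share(counts, 0.01)` on the per-document message counts of a trace gives the quantity the slide reports for the top 1%.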
Geographical Locality
Local sharing
  At least 2 users in the same cluster receive the msg
[Figure: fraction of requests locally shared (0 to 0.8) vs. City ID (0 to 3000); series: geographical cluster, random cluster.]
Notification exhibits geographical locality.
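One way to compute the local-sharing metric and its random-cluster baseline is sketched below. It assumes a delivery counts as locally shared when the same message reaches at least two distinct users in the same cluster, and that the baseline is built by shuffling users across clusters while keeping cluster sizes fixed. Both the data layout and the function names are hypothetical.

```python
import random
from collections import defaultdict

def locally_shared_fraction(deliveries, cluster_of):
    """deliveries: iterable of (user, msg); cluster_of: dict user -> cluster.

    Returns dict cluster -> fraction of that cluster's deliveries whose
    message reached at least two distinct users in the cluster.
    """
    users_per_msg = defaultdict(set)   # (cluster, msg) -> users who got it
    per_cluster = defaultdict(list)    # cluster -> msgs delivered there
    for user, msg in deliveries:
        cluster = cluster_of[user]
        users_per_msg[(cluster, msg)].add(user)
        per_cluster[cluster].append(msg)
    return {
        cluster: sum(len(users_per_msg[(cluster, m)]) >= 2 for m in msgs) / len(msgs)
        for cluster, msgs in per_cluster.items()
    }

def random_clusters(cluster_of, seed=0):
    """Reassign users to clusters at random, preserving cluster sizes."""
    rng = random.Random(seed)
    users = list(cluster_of)
    labels = [cluster_of[u] for u in users]
    rng.shuffle(labels)
    return dict(zip(users, labels))
```

Comparing `locally_shared_fraction(deliveries, cluster_of)` with the same call on `random_clusters(cluster_of)` mirrors the geographical vs. random comparison in the figure.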
Browser Log Analysis
Types of Analyses
  Content analysis
  Document popularity
  User behavior analysis
    Temporal stability
    Geographical locality
    Load distribution of different users
Content Analysis
Important to content providers: what content is interesting to users
Top three preferences for different kinds of users
Service        User type   Rank #1        Rank #2   Rank #3
Notification   Wireless    News           Weather   Stock
Browse         Wireless    Stock quotes   News      Yellow Pages
Browse         Offline     Help           News      Stock
Browse         Desktop     Sign-ups       Email     Sports
Document Popularity
Two definitions of document
  Base URLs
  Full URLs: including parameters
[Figure: two log-log plots of # requests (1 to 10^6): vs. popularity ranking of base urls (1 to 1000), and vs. popularity ranking of full urls (1 to 10^5).]
Document popularity does not closely follow a Zipf-like distribution.
Document Popularity (Cont.)
Requests are highly concentrated on a small number of documents
  0.1% - 0.5% full urls (i.e., 112 – 442) account for 90% requests
[Figure: percentage of requests (0 to 1) vs. percentage of full urls (0 to 1).]
Very small amount of memory needed to cache popular query results if content doesn’t change.
Temporal Stability
Methodology
  Consider 2 days’ traces
  Pick the top n documents from each day
  Compute overlap
[Figure: fraction of overlap (0 to 1) vs. # top documents picked (1 to 10^6); series: 8/15 vs. 8/16, 8/15 vs. 8/17, 8/15 vs. 8/18, 8/15 vs. 8/19, 8/15 vs. 8/20.]
Popular urls remain stable, so cache popular query results or optimize performance based on the stable workload
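The three-step methodology on this slide maps directly to a few lines of code. The sketch below assumes each day's trace is a list of requested URLs and that each day has at least n distinct documents, so the overlap is normalized by n; the function name is illustrative.

```python
from collections import Counter

def top_n_overlap(day_a, day_b, n):
    """Fraction of the top-n documents shared between two days' traces."""
    top_a = {url for url, _ in Counter(day_a).most_common(n)}
    top_b = {url for url, _ in Counter(day_b).most_common(n)}
    return len(top_a & top_b) / n
```

Evaluating this for a fixed reference day against each later day, over a range of n, would reproduce the kind of curves shown in the figure.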
Geographical Locality
Compare local sharing in geographical clusters vs. in random clusters
Wireless Users
[Figure: fraction of requests locally shared (0 to 0.4) vs. City ID (0 to 300); series: geographical, random 1, random 2, random 3, random 4.]
Limited geographical locality in users’ browse interest.
Load Distribution of Users
[Figure: percentage of requests (0 to 100) vs. interarrival time in seconds (1 to 10^6, log scale); series: offline users, wireless users.]
Offline users generate more bursty traffic, so such bursts need to be identified and handled properly
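Burstiness can be examined through the distribution of request interarrival times. The sketch below assumes the input is a list of request timestamps in seconds for one class of users; whether the slide's plot is computed per user or over the aggregate stream is not stated, so treat this as one possible reading. Function names are illustrative.

```python
from bisect import bisect_right

def interarrival_times(timestamps):
    """Gaps (in seconds) between consecutive requests, after sorting by time."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def percent_within(gaps, threshold):
    """Percentage of gaps that are less than or equal to `threshold` seconds."""
    ordered = sorted(gaps)
    return 100.0 * bisect_right(ordered, threshold) / len(ordered)
```

A class of users whose `percent_within` value is high at small thresholds sends most of its requests back to back, which is the bursty pattern attributed to offline users.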
Correlation between Notification and Browsing
  Correlation in the amount of usage
  Correlation in popular content categories
Correlation in Amount of Usage
[Figure: two plots, y-axes on a log scale (1 to 1000), x-axes 0 to 800: avg. # notifications to a user vs. # browse requests from a user; avg. # browse requests from a user vs. # notifications to a user.]
Low correlation in usage.
The correlation coefficient is 0.26 for all users, and 0.12 for wireless users.
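The slide does not say which correlation coefficient was used. Assuming the common Pearson coefficient, computed over per-user pairs of (number of notifications, number of browse requests), a minimal sketch looks like this; the function name and input layout are assumptions.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold each user's notification count and `ys` the same user's browse-request count, in matching order.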
Correlation in Content Categories
Approach
  Classify notifications and browsing requests into content categories
  For each individual user, compare his/her top N notification categories with top N browsing categories
Metric
  Average overlap
[Figure: average amount of overlap (%) (0 to 60) vs. N (1 to 9); series: all users, wireless users.]
• Wireless users have moderate correlation in content.
• The correlation is much lower when considering all users.
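The approach and metric on this slide can be sketched as below. The example assumes each log has already been reduced to a mapping from user to a list of content-category labels (one label per notification or browse request), that only users present in both logs are compared, and that overlap is normalized by N. These choices and the function names are illustrative rather than taken from the study.

```python
from collections import Counter

def top_categories(events, n):
    """Set of the n most frequent category labels in a list of events."""
    return {category for category, _ in Counter(events).most_common(n)}

def average_overlap(notif_by_user, browse_by_user, n):
    """Average percentage overlap between each user's top-n notification
    categories and top-n browse categories, over users found in both logs."""
    users = notif_by_user.keys() & browse_by_user.keys()
    total = sum(
        len(top_categories(notif_by_user[u], n) & top_categories(browse_by_user[u], n)) / n
        for u in users
    )
    return 100.0 * total / len(users)
```

Running `average_overlap` for N from 1 to 9, once over all users and once over wireless users only, would give the two series plotted in the figure.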
Summary & Implications
Observation: Top 1% notification objects account for 54-64% of total messages.
  Implication: Delivering notifications via multicast would be effective.
Observation: Notification exhibits geographical locality.
  Implication: Useful to provide localized notification services.
Summary & Implications (Cont.)
Observation: 0.1% - 0.5% full urls (i.e. 121-442) account for 90% of requests.
  Implication: Caching the results of popular queries would be very effective.
Observation: The set of popular urls remains stable.
  Implication: Cache a stable set of popular query results or optimize query performance based on a stable workload.
Observation: Limited correlation between users’ browsing and notification patterns.
  Implication: Service providers cannot rely solely on users’ notification profiles to predict how much and what they will browse.
Comparison
                                                   Notification   Browsing
Zipf-like popularity distribution                  Yes            No
High concentration of msgs/requests to documents   Yes            Yes
Spatial locality                                   Significant    Little
Correlation                                        Limited correlation in both usage and content