Nonstationarities in teletraffic data which may spoil your statistical tests
Piotr Żuraniewski (UvA/TNO/AGH)
Felipe Mata (UAM), Michel Mandjes (UvA), Marco Mellia (POLITO)
Stationarity
• Many models assume stationarity: statistical properties do not change over time
– strong stationarity: all statistical properties remain the same over time
– weak stationarity: statistical properties up to second order (mean, variance, covariance) remain unchanged
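The two definitions above can be written out as follows. The notation ($X_t$ for the observed process, $\mu$, $\sigma^2$ and $\gamma$ for its mean, variance and autocovariance) is assumed here for illustration and does not come from the slides.

```latex
% Strong stationarity: every finite-dimensional distribution is shift-invariant,
% for all n, all time points t_1, ..., t_n and all shifts h.
(X_{t_1}, \dots, X_{t_n}) \overset{d}{=} (X_{t_1+h}, \dots, X_{t_n+h})

% Weak stationarity: only the first two moments are required to be shift-invariant.
\mathbb{E}[X_t] = \mu, \qquad
\operatorname{Var}(X_t) = \sigma^2 < \infty, \qquad
\operatorname{Cov}(X_t, X_{t+h}) = \gamma(h) \quad \text{for all } t, h
```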
Nonstationarity – problems
• Real life: things are changing…
• Bad news: sample stationarity cannot be positively verified
• Best answer we can get: ‘we found no evidence of a given type of nonstationarity’
• Some examples:
– mean shift
– polynomial deterministic trend
– variance change
Example
• Change in the number of users in VoIP system
• Model: load change in M/G/inf queue
• Sample ACF suggests very high correlation
– slow decay?
– long range dependency?
[Figure: number of users vs. time; sample ACF vs. lag]
Example
[Figure: number of users vs. time; sample ACF vs. lag]
• The changepoint detection procedure we developed allows us to separate the parts with different load
• There is no significant correlation in either of these parts
• The sample ACF does not estimate the ACF in the case of nonstationarity
[Figure: sample ACF vs. lag]
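The effect described on this slide can be reproduced with a small simulation. This is only a sketch under stated assumptions: independent Poisson counts stand in for the number of users (the slides use an M/G/inf model), and the load levels, segment lengths and seed are illustrative values, not taken from the talk.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Standard sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()  # centred on one global mean, as the usual estimator does
    denom = np.dot(d, d)
    return np.array([np.dot(d[:len(d) - k], d[k:]) / denom
                     for k in range(max_lag + 1)])


rng = np.random.default_rng(0)
# Assumption: i.i.d. Poisson counts; 300 and 400 users, 100 samples each.
low = rng.poisson(300, size=100)
high = rng.poisson(400, size=100)
series = np.concatenate([low, high])  # one mean shift in the middle

acf_full = sample_acf(series, 20)
acf_low = sample_acf(low, 20)
acf_high = sample_acf(high, 20)

print("lag-1 sample ACF, whole series :", round(acf_full[1], 2))
print("lag-1 sample ACF, low segment  :", round(acf_low[1], 2))
print("lag-1 sample ACF, high segment :", round(acf_high[1], 2))
```

Although every observation is independent by construction, the sample ACF of the whole series stays large over many lags, because the global mean is wrong for both segments. Computed separately on each segment, it is close to zero.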
Changepoint detection
• Window of 50 samples presented to detection procedure
• Add newest observation, drop oldest and repeat detection procedure
• In this example: true change in window number 51
• Changepoint detection works well – see output of 500 experiments
[Figure: detection ratio vs. window no.; number of users vs. time]
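The sliding-window scheme above can be sketched as follows. The slides do not specify the authors' detection procedure, so a generic maximum two-sample mean-difference statistic is used here as a stand-in. The function names, the threshold of 4.0, the minimum segment length and the Poisson test data are all hypothetical choices made for illustration.

```python
import numpy as np


def max_mean_shift_stat(window, min_seg=5):
    """Largest |t|-like statistic over all candidate split points in the window."""
    w = np.asarray(window, dtype=float)
    n = len(w)
    best = 0.0
    for k in range(min_seg, n - min_seg + 1):
        left, right = w[:k], w[k:]
        pooled_var = (left.var(ddof=1) * (k - 1)
                      + right.var(ddof=1) * (n - k - 1)) / (n - 2)
        se = np.sqrt(pooled_var * (1.0 / k + 1.0 / (n - k)))
        if se > 0:
            best = max(best, abs(left.mean() - right.mean()) / se)
    return best


def sliding_alarms(series, window_len=50, threshold=4.0):
    """Slide the window by one sample at a time; True where an alarm is raised."""
    series = np.asarray(series, dtype=float)
    n_windows = len(series) - window_len + 1
    return np.array([max_mean_shift_stat(series[s:s + window_len]) > threshold
                     for s in range(n_windows)])


rng = np.random.default_rng(0)
# Assumption: i.i.d. Poisson counts with a mean shift after sample 100.
series = np.concatenate([rng.poisson(300, size=100), rng.poisson(400, size=100)])
alarms = sliding_alarms(series)

print("alarm rate, windows before the change :", alarms[:40].mean())
print("alarm rate, windows spanning the change:", alarms[60:91].mean())
```

Repeating the run over many independent realisations and averaging `alarms` per window would give a detection ratio per window, which is the quantity plotted on the slide.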
Changepoint detection
• However, if we add a deterministic trend, things go wrong
• Observe the high false alarm ratio after polluting the data with a trend
[Figure: time series with trend added (horizontal axis: time); detection ratio vs. window no.]
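A small experiment shows why a trend produces false alarms in a test designed for mean shifts. This is a sketch, not the procedure from the talk: the half-split statistic, the threshold of 2.0, the slope of one user per sample and the Poisson data are all assumptions chosen for illustration.

```python
import numpy as np


def half_split_stat(window):
    """|t|-like statistic comparing the means of the two halves of a window."""
    w = np.asarray(window, dtype=float)
    half = len(w) // 2
    left, right = w[:half], w[half:]
    se = np.sqrt(left.var(ddof=1) / len(left) + right.var(ddof=1) / len(right))
    return abs(left.mean() - right.mean()) / se


def alarm_rate(series, window_len=50, threshold=2.0):
    """Fraction of sliding windows in which the statistic exceeds the threshold."""
    stats = [half_split_stat(series[s:s + window_len])
             for s in range(len(series) - window_len + 1)]
    return float(np.mean(np.array(stats) > threshold))


rng = np.random.default_rng(1)
flat = rng.poisson(300, size=200).astype(float)  # stationary: no change at all
trended = flat + 1.0 * np.arange(200)            # hypothetical slope: +1 per sample

rate_flat = alarm_rate(flat)
rate_trended = alarm_rate(trended)
print("alarm rate, stationary data:", rate_flat)
print("alarm rate, with trend     :", rate_trended)
```

Neither series contains a changepoint, yet in the trended series the second half of almost every window has a higher mean than the first half, so the mean-shift statistic fires in most windows.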
Work in progress
• Real VoIP data from an Italian service provider and aggregated IP data from a Spanish university backbone network
• Current research: estimate and remove trend from traffic
• Only then apply changepoint detection procedure(s)
[Figure: plot with horizontal axis from 1.2912 to 1.293 (x 10^9) and vertical axis from 0 to 900; axis labels not recoverable]
Work in progress
• Trend estimation methods:
– moving average?
– kernel/wavelet smoothing?
– parametric methods?
– time series regression?
• How to judge if the estimated trend is really significant?
• Models other than M/G/inf?
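Two of the candidate trend estimators listed above, a moving average and a parametric (polynomial) fit, can be sketched as follows. The function names, the window width of 31, the polynomial degree and the synthetic data are hypothetical choices. Nothing here is claimed to be the method the authors eventually adopted.

```python
import numpy as np


def moving_average_trend(x, width):
    """Centred moving average with odd `width`; returns len(x) - width + 1 values."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")


def polynomial_trend(x, degree):
    """Least-squares polynomial in the sample index, evaluated at every sample."""
    t = np.arange(len(x))
    coeffs = np.polyfit(t, np.asarray(x, dtype=float), degree)
    return np.polyval(coeffs, t)


rng = np.random.default_rng(2)
t = np.arange(200)
# Assumption: i.i.d. Poisson counts plus a linear trend of +1 per sample.
series = rng.poisson(300, size=200) + 1.0 * t

width = 31
half = width // 2
# The centred moving average is undefined for the first and last `half` samples.
resid_ma = series[half:-half] - moving_average_trend(series, width)
resid_poly = series - polynomial_trend(series, degree=1)

slope_before = np.polyfit(t, series, 1)[0]
slope_after = np.polyfit(t, resid_poly, 1)[0]
print("fitted slope before detrending:", round(slope_before, 3))
print("fitted slope after detrending :", round(slope_after, 6))
```

A moving average also smears out a genuine abrupt change, so the width trades trend removal against the ability to detect the changepoint afterwards. The parametric fit avoids that but depends on choosing the right functional form.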
Conclusions
• Different types of nonstationarities may severely influence statistical tests or values of estimators
• Even if we try to detect one type of nonstationarity, the other type may ruin our original test
• We always have to pay attention to the assumptions of the theorems used
• Share your experience!