Post on 22-Dec-2015
Automating The Selection of a Simulation Warm-up Period
Stewart Robinson, Katy Hoad, Ruth Davies
Warwick Business School
Cardiff University - October 2008
The AutoSimOA Project
A 3-year, EPSRC-funded project in collaboration with SIMUL8 Corporation.
http://www.wbs.ac.uk/go/autosimoa
Research Aim
• To create an automated system for advising a non-expert user on how to obtain accurate measures of model performance
• i.e. warm-up, run-length and number of replications
• For implementation into simulation software
AutoSimOA = Automated SIMulation Output Analysis
[Figure: AutoSimOA flowchart. Boxes: Simulation model; Output data; Analyser (Warm-up analysis; Use replications or long-run?; Replications analysis; Run-length analysis; Recommendation possible?); Recommendation; Obtain more output data.]
The Initial Bias Problem
• Model may not start in a “typical” state.
• Can cause initial bias in the output.
• Many methods proposed for dealing with initial bias, e.g.:
 – Initial steady-state conditions
 – Run model for ‘long’ time
 – Deletion of initial transient data (‘warm-up period’)
[Figure: The Initial Bias Problem. Daily throughput (0 to 140) against day (1 to 30), showing the initial transient followed by steady state, with f(steady-state) and the steady-state mean marked.]
This project uses: Deletion of the initial transient data by specifying a warm-up period (or truncation point).
The question is:
How do you estimate the length of the warm-up period required?
Methods fall into 5 main types:
1. Graphical Methods.
2. Heuristic Approaches.
3. Statistical Methods.
4. Initialisation Bias Tests.
5. Hybrid Methods.
Literature search – 44 methods
Summary of methods and literature references on project web site:
http://www.wbs.ac.uk/go/autosimoa
Short-listing warm-up methods for automation using literature
Short-listing Criteria
» Accuracy & robustness
» Ease of automation
» Generality
» Computer running time
Short-listing results: reasons for rejection of methods
[Figure: bar chart of number of methods rejected (0 to 12) by warm-up method type (Graphical, Statistical, Heuristic) and reason for rejection (Ease of Automation, Accuracy, Generality, Excessive running time, Other).]
Short-listing results: 6 Methods taken forward to testing
• Statistical methods:
 – Goodness of Fit (GoF) test
 – Algorithm for a Static Data set (ASD)
 – Algorithm for a Dynamic Data set (ADD)
• Heuristics:
 – MSER-5
 – Kimbler’s Double Exponential Smoothing
 – Euclidean Distance Method (ED)
Preliminary testing of shortlisted methods
• Rejected methods:
 – ASD & ADD required a prohibitively large number of replications.
 – GoF & Kimbler’s method consistently and severely underestimated the truncation point.
 – ED failed to give any result on the majority of occasions.
• MSER-5 was the most accurate and robust method.
MSER-5 warm-up method
[Figure: MSER-5 test statistic (0 to 0.02) and output data as batched means values (0 to 6) plotted against truncation point (0 to 400), marking the rejection zone and the estimated truncation point (estimated warm-up period).]
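The slides do not spell out the MSER-5 calculation, so the following is only a minimal sketch, assuming the usual formulation of the rule: average the output into batches of 5, compute for each candidate truncation point d the statistic sum((z - mean)^2) / (b - d)^2 over the remaining batch means, and take the d that minimises it. The function name `mser5` and its return values are illustrative, and the validity check uses the Lsol > n/2 rejection rule quoted in the performance criteria.

```python
import numpy as np

def mser5(y, batch_size=5):
    """Sketch of MSER-5 (assumed standard formulation, not taken from the slides).

    Returns (lsol, stats, valid):
      lsol  - estimated truncation point, in original observations
      stats - MSER statistic for each candidate number of deleted batches
      valid - False if lsol falls in the second half of the data (rejected)
    """
    y = np.asarray(y, dtype=float)
    b = len(y) // batch_size                      # number of complete batches
    if b < 2:
        raise ValueError("need at least two complete batches")
    # Batch means; any incomplete final batch is dropped.
    z = y[: b * batch_size].reshape(b, batch_size).mean(axis=1)
    # var() uses divisor (b - d), so var / (b - d) = sum of squares / (b - d)^2.
    stats = np.array([z[d:].var() / (b - d) for d in range(b - 1)])
    d_star = int(np.argmin(stats))                # batches to delete
    lsol = d_star * batch_size
    valid = lsol <= (b * batch_size) / 2          # reject Lsol > n/2
    return lsol, stats, valid

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(1000)
    # Illustrative data only: noisy steady state with a decaying start-up bias.
    y = rng.normal(10.0, 1.0, size=t.size) + 5.0 * np.exp(-t / 30.0)
    lsol, stats, valid = mser5(y)
    print("Lsol =", lsol, "valid =", valid)
```

Because the statistic is computed on very few batch means near the end of the series, the minimum can occasionally land there by chance, which is one way a rejected Lsol arises.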
Further Testing of MSER-5
1. Artificial data – controllable & comparable: initial bias functions; steady state functions.
2. Full factorial design.
3. Set of performance criteria.
Artificial Data Parameters

Parameters         Levels
Data Type          Single run; Data averaged over 5 reps
Error type         N(1,1), Exp(1)
Auto-correlation   None, AR(1), AR(2), MA(2), AR(4), ARMA(5,5)
Bias Severity      1, 2, 4
Bias Length        0%, 10%, 40%, 100%
Bias direction     Positive, Negative
Bias shape         7 shapes
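As a check on the size of the design, the sketch below enumerates the parameter levels listed above. It assumes (this is an inference, not stated on the slides) that a bias length of 0% collapses severity, direction and shape into a single no-bias case. Under that assumption the enumeration reproduces the 3048 types of artificial data set quoted for the full factorial design, and the 24 no-bias types run 100 times each give the 2400 Lsol values reported for the no-bias test. The shape names are labels only.

```python
from itertools import product

data_types = ["single run", "averaged over 5 reps"]
error_types = ["N(1,1)", "Exp(1)"]
autocorr = ["None", "AR(1)", "AR(2)", "MA(2)", "AR(4)", "ARMA(5,5)"]
severities = [1, 2, 4]
bias_lengths = ["10%", "40%", "100%"]          # non-zero bias lengths
directions = ["positive", "negative"]
shapes = ["mean-shift", "linear", "quadratic", "exponential",
          "osc-linear", "osc-quadratic", "osc-exponential"]

# Combinations that apply whether or not bias is present.
base = list(product(data_types, error_types, autocorr))

# Biased data sets: every base case crossed with every bias setting.
biased = list(product(base, severities, bias_lengths, directions, shapes))

# Assumption: with 0% bias length the bias settings are irrelevant,
# so each base case contributes exactly one no-bias data set.
unbiased = [(b, None, "0%", None, None) for b in base]

design = biased + unbiased
print(len(biased), len(unbiased), len(design))   # 3024 24 3048
```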
• Mean Shift:
• Linear:
• Quadratic:
• Exponential:
• Oscillating (decreasing): Linear, Quadratic, Exponential
Add Initial Bias to Steady state:
Superposition: Bias Fn, a(t), added onto end of steady state function:
e.g. $X_t = \phi X_{t-1} + \varepsilon_t$ etc.
...
$Y_t = X_t + a(t)$
2. Full factorial design: 3048 types of artificial data set
MSER-5 run with each type 100 times
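The superposition idea (a steady-state series with a bias function a(t) added to it) can be sketched as follows. This is an illustration under stated assumptions, not the project's generator: the AR(1) coefficient, the reading of N(1,1) as mean 1 and standard deviation 1, and the form of the linear bias function are all hypothetical, since the bias formulas did not survive in the slides.

```python
import numpy as np

def ar1_series(n, phi, rng):
    """Steady-state AR(1) series X_t = phi * X_{t-1} + e_t.

    Errors e_t are drawn as N(1, 1) (assumed: mean 1, sd 1), so the
    steady-state mean is 1 / (1 - phi); the series starts at that mean.
    """
    x = np.empty(n)
    prev = 1.0 / (1.0 - phi)
    for t in range(n):
        prev = phi * prev + rng.normal(1.0, 1.0)
        x[t] = prev
    return x

def linear_bias(n, length, severity):
    """Hypothetical bias a(t): falls linearly from `severity` to 0 over the
    first `length` observations and is 0 afterwards."""
    if length == 0:
        return np.zeros(n)                 # no initial bias
    t = np.arange(n)
    return np.where(t < length, severity * (1.0 - t / length), 0.0)

rng = np.random.default_rng(0)
n, true_L = 1000, 100                      # bias length 10% of the data
x = ar1_series(n, phi=0.5, rng=rng)        # steady-state part X_t
a = linear_bias(n, true_L, severity=2.0)   # positive initial bias a(t)
y = x + a                                  # biased output Y_t = X_t + a(t)
```

A negative bias is obtained by subtracting a(t) instead, and the other shapes by swapping in a different bias function with the same signature.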
3. Performance Criteria
i. Coverage of true mean.
ii. Closeness of estimated truncation point (Lsol) to true truncation point (L).
iii. Percentage bias removed by truncation.
iv. Analysis of the pattern & frequency of rejections of Lsol (i.e. Lsol > n/2).
MSER-5 Results
i. Coverage of true mean.
Does the true mean fall into the 95% CI for the estimated mean?

Non-truncated data sets   Truncated data sets   % of cases
yes                       yes                   7.7%
yes                       no                    0%
no                        no                    19.8%
no                        yes                   72.5%
ii. Closeness of Lsol to L:
• Wide range of Lsol values.
e.g. [Figure: Lsol - L (-70 to 50) for each of 100 runs, for quadratic bias and mean-shift bias.]
(Positive bias functions, single run data, N(1,1) errors, MA(2) auto-correlation, bias severity value of 2 and true L = 100.)
iii. Percentage bias removed by truncation.
[Figure: histogram of % of total valid runs (0 to 30) against % bias removed (bins 0-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-95, 95-99, 99-100, 100+), for all valid runs.]
Effect of data parameters on bias removal
No significant effect: Error type, Bias direction
Significant effect: Data type, Auto-correlation type, Bias shape, Bias severity, Bias length
[Figure: cumulative % of valid cases (0 to 100) against % of bias removed (bins 0-40 to 100+), for single run and averaged replications.]
More bias removed by using averaged replications rather than a single run.
[Figure: cumulative % of valid cases (0 to 100) against % of bias removed (bins 0-40 to 100+), by auto-correlation type: no a-c, AR(1), AR(2), AR(4), MA(2), ARMA(5,5).]
The stronger the auto-correlation, the less accurate the bias removal.
Effect greatly reduced by using averaged data.
[Figure: cumulative % of valid cases (0 to 100) against % of bias removed (bins 0-40 to 100+), by bias shape: mean-shift, Linear, Quad, Exp, OscL, OscQ, OscE.]
The more sharply the initial bias declines, the more likely MSER-5 is to underestimate the warm-up period, and the less bias it removes.
[Figure: cumulative % of valid cases (0 to 100) against % of bias removed (bins 0-40 to 100+), by bias severity: 1, 2, 4.]
As the bias severity increases, MSER-5 removes a higher percentage of the bias.
[Figure: cumulative % of valid cases (0 to 100) against % of bias removed (bins 0-40 to 100+, plus rejections), by bias length: 10%, 40%.]
Longer bias removed slightly more efficiently than shorter bias.
Shorter bias gives more overestimations, partly because overestimations for longer bias are more likely to be rejected.
iv. Lsol rejections
[Figure: number of cases (0 to 900) against x = no. of Lsol rejections (bins x=0, 0<x≤1, 1<x≤5, 5<x≤10, 10<x≤20, 20<x≤40, 40<x≤60, 60<x≤80, 80<x≤100), by auto-correlation type: ARMA(5,5), MA(2), AR(4), AR(2), AR(1), no auto-correlation.]
Rejections caused by: high auto-correlation, bias close to or over n/2, smooth end to data = ‘end point’ rejection.
Averaged data slightly increases probability of getting ‘end point’ rejection but increases probability of more accurate L estimates.
[Figure: Lsol rejection count (0 to 50) against n (1000 to 1800), for positive bias shapes: mean-shift, linear, quadratic, exp, osc. linear, osc. quad, osc. exp.]
Giving more data to MSER-5 in an iterative fashion produces a valid Lsol value where previously the Lsol value had been rejected.
e.g. ARMA(5,5)
Testing MSER-5 with data that has no initial bias.
Want Lsol = 0

Lsol values   Percentage of cases
Lsol = 0      71%
Lsol ≤ 50     93%

Lsol > 50 mainly due to highest auto-correlated data sets - AR(1) & ARMA(5,5).
Rejected Lsol values: 5.6% of the 2400 Lsol values produced. 93% from the highest auto-correlated data ARMA(5,5).
Testing MSER-5 with data that has 100% bias.
Want 100% rejection rate: Actual rate = 61%
[Figure: percentage of Lsol rejections (0 to 100) by bias shape: Linear, Quad, Exp, Osc Linear, Osc Quad, Osc Exp.]
[Figure: percentage of Lsol rejections (0 to 90) by bias severity (M1, M2, M4), for single data and averaged data.]
Summary
• MSER-5 most promising method for automation
 – Not model or data type specific.
 – No estimation of parameters needed.
 – Can function without user intervention.
 – Shown to perform robustly and effectively for the majority of data sets tested.
 – Quick to run.
 – Fairly simple to understand.
Heuristic framework around MSER-5
[Flowchart. Boxes and decisions:]
• Run k (= 5) replications of length n ≥ 100.
• Create averaged data.
• Batch data into b batches of length m, where number of batches b = ⌊n/m⌋ and n* = b × m ≤ n.
• Input data into MSER-5 algorithm.
• MSER-5 returns Lsol value.
• Is Lsol ≤ (n* - (m × 5))/2? Yes: Lsol valid. No: Lsol invalid.
• Graph of batched data, single reps, or MSER-5 statistic, with valid Lsol value shown.
• Dynamic graph of batched data, single reps, or MSER-5 statistic.
• Have there been 10 invalid Lsol values in a row? (Yes / No)
• Does User wish to keep running with more data? (Yes / No)
• Produce more data to create 10% of the original no. of batches, or a user-specified number.
• Produce more data to create 10% of the original no. of batches.
• END
Iterative procedure for procuring more data when required.
‘Failsafe’ mechanism to deal with the possibility of data not being in steady state, or of insufficient data being provided when the data are highly auto-correlated.
Being implemented in SIMUL8.
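The iterative procedure can be sketched roughly as below. This is a simplified reading of the framework, not the SIMUL8 implementation: the branch order is assumed (a valid Lsol ends the loop at once; an invalid one triggers 10% more data automatically; after 10 invalid values in a row the user is asked whether to continue). The names `warmup_framework`, `get_more_data` and `keep_going` are hypothetical, and `get_more_data(n)` is assumed to return the next n observations already averaged over the k replications.

```python
import numpy as np

def mser_batches(z):
    """Number of batch means to delete under the MSER rule (assumed standard form)."""
    b = len(z)
    stats = [z[d:].var() / (b - d) for d in range(b - 1)]
    return int(np.argmin(stats))

def warmup_framework(get_more_data, n0=100, m=5, max_invalid=10,
                     keep_going=lambda: False):
    """Return (lsol, n_star); lsol is None if no valid value was found."""
    data = np.asarray(get_more_data(n0), dtype=float)
    # Extra observations per iteration: about 10% of the original no. of batches.
    extra = max(m, (n0 // m) // 10 * m)
    invalid_run = 0
    while True:
        b = len(data) // m                        # number of complete batches
        n_star = b * m
        z = data[:n_star].reshape(b, m).mean(axis=1)
        lsol = mser_batches(z) * m
        if lsol <= (n_star - m * 5) / 2:          # validity test from the framework
            return lsol, n_star
        invalid_run += 1
        if invalid_run >= max_invalid:            # failsafe: ask the user
            if not keep_going():
                return None, n_star
            invalid_run = 0
        data = np.concatenate([data, get_more_data(extra)])

def make_source(bias_len=40, shift=10.0):
    """Deterministic toy data source (illustration only)."""
    pos = 0
    def get_more_data(n):
        nonlocal pos
        t = np.arange(pos, pos + n)
        pos += n
        return 1.0 + 0.5 * (-1.0) ** t + np.where(t < bias_len, shift, 0.0)
    return get_more_data

if __name__ == "__main__":
    print(warmup_framework(make_source()))
```

The default `keep_going` declines, so the loop always terminates; passing a callback that prompts the user would reproduce the "keep running with more data?" decision.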