TRANSCRIPT
Round table
Long Series: Connection and Methodological Changes
Craig McLaren (Office for National Statistics, UK)
Rio de Janeiro, 21-25 August 2006
2
Office for National Statistics
Introduction
• Thank you for inviting me here today
• Dealing with methodological change from a time series perspective is a challenging issue
• Users should have continuity of time series
• Common issue across National Statistics Institutes
• Talk about previous experiences and issues
3
Issues covered in slides
• Smoothing in methodological change
• Description of a generalised backcasting tool under development at ABS
• Multivariate approach for short time spans
• Lessons learnt from previous exercises
• Upcoming issues at ONS and ABS
4
Dealing with methodological change?
• Want to remove the effect of methodological change
  • E.g. change to estimation method, classification structures
  • Improves the quality of data by increasing consistency and coherence across time
• Should not remove effect of real world change
  • Appropriate to still reflect this in the time series
• Two approaches
  • Edit unit records: typically used for Population data
  • Alter final estimates: typically used for Economic data
• This talk considers monthly and quarterly time series, altering the final estimates
5
Motivation: Example
• Level-shift caused by methodological change
• Want to raise level of early part of series
6
Motivation: Example of seasonally adjusted before and after backcasting
7
Appropriate backcasting: Knowledge required
• Whether or not to backcast
  • Estimation and significance of impact
• Estimate a suitable backcasting length
  • How far back does the change in series exist?
  • Was the change constant or gradual?
• Determine if change is definitional or not
  • Measure the difference between the old and new series
  • For example: conduct parallel sample, do parallel estimation
• Quality control
  • Assessing the magnitude of revisions
8
Parallel estimation: Different scenarios
• Different scenarios
  • Multiple time point overlap
  • One time point overlap
  • Can even have no overlap: use forecasting
• Recompile historical data under new definition
  • High quality but expensive
  • How far back to go?
• New data points
  • Model historical data and evaluate impact by intervention analysis
  • Smoothing back based on the impact assessment
9
Parallel estimation: Measuring impact
• Impact can have different components
  • Trend break: more parallel estimates, more accuracy
  • Seasonal break: need at least a year + one period of parallel estimates
• Usually assume no seasonal break
  • Multiplicative relationship: $\text{impact} = 100 \times \text{average}(O_{new} / O_{old}) - 100$
  • Additive relationship: $\text{impact} = \text{average}(O_{new} - O_{old})$
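The two impact formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the ABS or ONS implementation: the function names and the overlap data are hypothetical, and "average" is taken to be the arithmetic mean over the overlap periods.

```python
from statistics import mean


def multiplicative_impact(new, old):
    """Percentage impact: 100 * average(O_new / O_old) - 100."""
    ratios = [n / o for n, o in zip(new, old)]
    return 100.0 * mean(ratios) - 100.0


def additive_impact(new, old):
    """Quantity impact: average(O_new - O_old)."""
    return mean([n - o for n, o in zip(new, old)])


# Hypothetical parallel estimates over five overlapping quarters.
o_old = [100.0, 104.0, 98.0, 110.0, 102.0]
o_new = [105.0, 109.2, 102.9, 115.5, 107.1]

mult = multiplicative_impact(o_new, o_old)
add = additive_impact(o_new, o_old)
print(f"multiplicative impact: {mult:.2f}%")
print(f"additive impact: {add:.2f}")
```

In this made-up example every new estimate is 5% above the old one, so the multiplicative impact is constant across periods while the additive differences vary with the level of the series.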
10
Parallel estimation: Significance of impact
• Need to assess quality of impact estimate
  • Is impact significantly different from zero?
  • Need relative standard error for $(O_{new} - O_{old})$
• Should assess if statistically significant difference
• Options for different levels
  • Aggregation level at which impacts are significant determines backcast process
  • Quality assurance given for this level and upwards
  • Lower level series adopt impact of higher level series
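A rough way to ask whether an additive impact differs from zero is to compare the mean difference with its standard error. The sketch below is an assumption-laden illustration: it treats the overlap differences as independent observations and estimates the standard error from them, whereas in practice the standard error would normally come from the survey design. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def impact_significance(new, old):
    """Return (impact, standard error, relative standard error in %, t-like ratio)."""
    diffs = [n - o for n, o in zip(new, old)]
    impact = mean(diffs)
    # Assumption: differences are independent, so SE = sample sd / sqrt(n).
    se = stdev(diffs) / sqrt(len(diffs))
    rse = float("inf") if impact == 0 else 100.0 * se / abs(impact)
    ratio = float("inf") if se == 0 else impact / se
    return impact, se, rse, ratio


# Hypothetical parallel estimates over five overlapping quarters.
o_old = [100.0, 104.0, 98.0, 110.0, 102.0]
o_new = [105.0, 109.2, 102.9, 115.5, 107.1]

impact, se, rse, ratio = impact_significance(o_new, o_old)
print(f"impact = {impact:.2f}, SE = {se:.3f}, RSE = {rse:.1f}%, impact/SE = {ratio:.1f}")
```

A ratio well above 2 in absolute value would suggest the impact is unlikely to be zero under these simplifying assumptions; with only a handful of overlap points the result should be read cautiously.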
11
Aggregation structures
• Directly backcast series
  • Lowest-level directly seasonally adjusted series
• Indirectly backcast series
  • All other series formed by aggregation
[Figure: one-dimensional (1-D) and two-dimensional (2-D) aggregation structures]
12
Backcast objective
• Objective function for ABS backcasting
  • Maintain the historical seasonally adjusted movement
  • Minimise methodological change effect to ensure the continuity of a time series
  • Need to avoid misinterpretation of a measurement change as a real world change
• Aim: bound revisions to movements and maintain stable seasonal factors
• Alternative objective functions equally valid
13
Shape of backcast factors
• Directly backcast series
  • Multiplicative: exponential shape: $O^b_t = O_t \cdot x^{t/N}$
  • Additive: linear shape: $O^b_t = O_t + t x$
  • Index series only: infinite shape: $O^b_t = O_t \cdot x$
• Indirectly backcast series
  • Follow aggregation structure
  • No particular shape
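The shapes on this slide can be sketched as follows. This is an illustrative sketch under stated assumptions: t is taken to run from 1 to N over the backcast window so that the multiplicative factor reaches the full impact x at the end of the window, the additive x is read as a per-period increment exactly as printed on the slide, and all function names and data are hypothetical.

```python
def exponential_factors(x, n):
    """Multiplicative shape: x**(t/n) for t = 1..n, so the last factor equals x."""
    return [x ** (t / n) for t in range(1, n + 1)]


def linear_increments(x, n):
    """Additive shape as printed on the slide: t * x for t = 1..n."""
    return [t * x for t in range(1, n + 1)]


def backcast_multiplicative(series, x, n):
    """Scale the last n points by the exponential factors; earlier points are unchanged."""
    if not 1 <= n <= len(series):
        raise ValueError("n must be between 1 and len(series)")
    split = len(series) - n
    factors = exponential_factors(x, n)
    return series[:split] + [o * f for o, f in zip(series[split:], factors)]


# Hypothetical data: two component series, only the first has an impact (x = 1.10).
comp_a = [100.0] * 6
comp_b = [50.0] * 6
n = 4

comp_a_bc = backcast_multiplicative(comp_a, 1.10, n)
comp_b_bc = backcast_multiplicative(comp_b, 1.00, n)

# Indirectly backcast total: aggregate the backcast components.
total = [a + b for a, b in zip(comp_a, comp_b)]
total_bc = [a + b for a, b in zip(comp_a_bc, comp_b_bc)]
implied = [tb / t for tb, t in zip(total_bc, total)]

print([round(f, 4) for f in exponential_factors(1.10, n)])
print([round(f, 4) for f in implied])
```

The implied factors for the total are whatever the aggregation produces, which is one way to read "no particular shape" for indirectly backcast series.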
14
Shape of backcast factors (continued)
[Figure: shape comparison on real data, exponential shape]
15
Shape of backcast factors: Finite versus infinite length
• Infinite length: multiply whole series by constant
  • For directly backcast series: no revisions
  • For indirectly backcast series: small revisions
  • Good for index series
• Usually bound length for conceptual reasons
  • New definition inappropriate long ago: e.g. new technology
  • Nonsensical to increase an old estimate too much
  • Risk management
16
Quality measure
• Assess absolute change to the period-to-period movement in the seasonally adjusted estimates due to backcasting
• Delta = $|SA^B_t - SA_t|$, where $SA_t$ and $SA^B_t$ are the period-to-period movements in the seasonally adjusted estimates before and after backcasting
• In general
  • Either percentage or quantity change
  • Multiplicatively seasonally adjusted: percentage
  • Additively seasonally adjusted: quantity
17
Quality measure: Delta
[Figure: seasonally adjusted %-movements and the absolute differences, pre- vs post-backcasting ==> quality measure]
18
Quality measure: Delta (continued)
• Maximum percentage change of movements in seasonally adjusted estimates pre and post backcast over a series
• Used as quality measure
• delta maximum (user specified)
  • Effectively a bound on revisions in seasonally adjusted movements
  • Compare maximum delta to delta maximum
  • Can be normalised to allow comparisons between series
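The delta quality measure described above can be sketched for the multiplicative (percentage) case. The seasonally adjusted series and the bound are hypothetical values chosen for illustration, and the function names are not taken from any ABS tool.

```python
def pct_movements(series):
    """Period-to-period percentage movements."""
    return [100.0 * (cur / prev - 1.0) for prev, cur in zip(series, series[1:])]


def deltas(sa_before, sa_after):
    """Absolute change in each movement caused by backcasting."""
    before = pct_movements(sa_before)
    after = pct_movements(sa_after)
    return [abs(a - b) for a, b in zip(after, before)]


# Hypothetical seasonally adjusted estimates before and after backcasting.
sa_before = [100.0, 101.0, 102.0, 103.0]
sa_after = [100.0, 101.5, 103.0, 104.5]
delta_maximum = 0.6  # user-specified bound, in percentage points (assumed value)

maximum_delta = max(deltas(sa_before, sa_after))
acceptable = maximum_delta < delta_maximum
print(f"maximum delta = {maximum_delta:.3f}, acceptable = {acceptable}")
```

For an additively adjusted series the same comparison would use differences in quantities rather than percentage movements.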
19
Quality measure: Length of backcast
• Governed by choice of how much tolerance in the change to movements pre and post backcast
• Smaller tolerance (delta) => longer backcast (typically)
• delta maximum selection
  • Choose this so the backcast doesn't adversely affect published values
  • For example, at most one significant figure in published movements
• In practice one common length for entire group of time series
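The link between tolerance and backcast length can be illustrated with a simple search for the shortest length N whose maximum delta stays under the bound. This is a simplified sketch: the exponential factors are applied directly to a series that is treated as already seasonally adjusted, whereas a real process would re-run seasonal adjustment on the backcast originals. All names and values are hypothetical.

```python
def pct_movements(series):
    """Period-to-period percentage movements."""
    return [100.0 * (cur / prev - 1.0) for prev, cur in zip(series, series[1:])]


def apply_exponential_backcast(series, x, n):
    """Scale the last n points by x**(t/n), t = 1..n; earlier points are unchanged."""
    split = len(series) - n
    tail = [o * x ** (t / n) for t, o in enumerate(series[split:], start=1)]
    return series[:split] + tail


def max_delta(before, after):
    """Largest absolute change in percentage movements."""
    pairs = zip(pct_movements(before), pct_movements(after))
    return max(abs(a - b) for b, a in pairs)


def shortest_length(series, x, delta_maximum):
    """Smallest n with max delta below the bound, or None if no length qualifies."""
    for n in range(1, len(series) + 1):
        after = apply_exponential_backcast(series, x, n)
        if max_delta(series, after) < delta_maximum:
            return n
    return None


# Hypothetical flat series of 40 periods with a 5% impact.
series = [100.0] * 40
print(shortest_length(series, 1.05, 0.5))
print(shortest_length(series, 1.05, 0.25))
```

Halving the tolerance roughly doubles the required length in this toy setting, which matches the "smaller tolerance, longer backcast" rule of thumb on the slide.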
20
Backcasting process
[Diagram: choose backcast shape and length N; want delta < threshold]
21
Generalised backcasting tool: Australian Bureau of Statistics
• Consistent ABS approach to backcasting
  • Standard shapes to smooth in impacts
  • Consistent diagnostics
  • Consistent language
• Client areas can perform backcasts without input from Time Series Analysis experts
• Streamlined process
  • Directly updates stored original estimates
• Currently in development
22
Generalised backcasting tool: Process flow
[Diagram: a collection of series, impacts and settings feed the generalised backcasting facility, which involves diagnostics and final length selection and produces new backcast originals (overwriting old originals) and a clearance report]
23
Example: Generalised backcasting tool
24
Example: Generalised backcasting tool
25
Example: Generalised backcasting tool: Seasonally adjusted estimates
26
Example: Generalised backcasting tool: Change in seasonally adjusted movements
27
Example: Generalised backcasting tool: Clearance report
28
Multivariate approach: Dealing with methodological change
• Title: Estimation of seasonal factors for a short time span using multi-level modelling
• Number of overlap periods is typically short
• This solution was used to assist with transition between two surveys and the results were used within ABS National Accounts
• Joint work with Xichuan (Mark) Zhang, ABS
29
Multivariate approach: Assumptions
• New survey measures the same underlying activity as the old survey
• Trend movement is the same for different surveys but may be at a different trend level
• Seasonal factors are assumed to be different for different surveys
• Can use multilevel models if there is a hierarchical structure
30
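Multivariate approach: Model
• Mixed model
$$y = X\beta + Zb + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I), \quad b \sim N(0, \Sigma_b)$$
where $X$ is the design matrix for fixed effects, $Z$ is the design matrix for random effects, and $\Sigma_b$ denotes the covariance matrix of the random effects
• Indices: $i$: old and new; $j$: industry; $k$: state
[Diagram: hierarchical structure with totals broken down by state 1, ..., k and by industry j]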
31
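Multivariate approach: Final model
Assume a log additive model for the time series decomposition
$$
\begin{aligned}
y_{ijt} &= T_{ijt} \times S_{ijt} \times I_{ijt} \\
\log(y_{ijt}) &= \log(T_{ijt}) + \log(S_{ijt}) + \log(I_{ijt}) \\
&= (a + a_{ij}) + (b + b_j)\,t + (c + c_{ij})\,s_t + \varepsilon_{ijt} \\
&= \underbrace{(a + bt + c\,s_t)}_{\text{fixed}} + \underbrace{(a_{ij} + b_j t + c_{ij}\,s_t)}_{\text{random}} + \underbrace{\varepsilon_{ijt}}_{\text{error}}
\end{aligned}
$$
where $i$ is the survey indicator, $j$ the group indicator, $c$ the vector of seasonal factors, $s_t$ the seasonal matrix indicator, and $t$ time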
32
Multivariate approach: Outline of steps
• 1. Run full model
• 2. Remove "trend" of each series
$$\log(y^*_{ijt}) = \log(y_{ijt}) - (\hat{a} + \hat{a}_{ij} + \hat{b}t + \hat{b}_j t) = c_i s_t + (c_j + c_{ij})\,s_t + \varepsilon_{ijt}$$
• 3. Estimate seasonal factors
• 4. Test if old and new surveys have same seasonal factors: $c_{New} = c_{Old}$ (survey indicator $i = 1$ for New, $i = 0$ for Old)
• 5. Convert from model into X-11 framework
$$ABS_{new} = \exp(Model_{new}), \qquad ABS_{old} = \exp(Model_{old})$$
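Steps 2, 3 and 5 can be illustrated for a single series with a much simpler stand-in than the mixed model. In this sketch the trend coefficients are supplied as inputs (in the talk they come from the fitted multi-level model), seasonal effects are plain averages of the detrended log values by season, and the factors are normalised to a geometric mean of one, which is an assumption made here rather than something stated on the slide. All names and data are hypothetical.

```python
from math import exp, log


def seasonal_factors(y, a_hat, b_hat, period=4):
    """Detrend log(y) with a supplied linear trend, average by season, exponentiate.

    Requires len(y) >= period so that every season has at least one observation.
    """
    # Step 2: remove the "trend" on the log scale.
    detrended = [log(v) - (a_hat + b_hat * t) for t, v in enumerate(y)]
    # Step 3: seasonal effect = mean detrended value for each season.
    effects = []
    for s in range(period):
        values = detrended[s::period]
        effects.append(sum(values) / len(values))
    # Centre the effects so the multiplicative factors have geometric mean 1.
    centre = sum(effects) / period
    # Step 5: exponentiate to move from the log-additive model to multiplicative factors.
    return [exp(e - centre) for e in effects]


# Hypothetical quarterly series: known trend times known quarterly factors.
a_hat, b_hat = log(100.0), 0.01
true_factors = [0.98, 1.00, 0.99, 1.03]
y = [exp(a_hat + b_hat * t) * true_factors[t % 4] for t in range(4)]

factors = seasonal_factors(y, a_hat, b_hat)
print([round(f, 3) for f in factors])
```

With only four quarters, each seasonal effect rests on a single observation and absorbs the irregular component, which is the motivation given in the talk for pooling information across industries and states.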
33
Multivariate approach: Example (real world application)
• Two surveys
• 15 industries
• 8 states and Australia
• Data available: Four parallel quarter estimates over 2001
$$\log(y_{ijkt}) = (a_k + b_k t) + (a_{ijk} + b_{jk} t) + c_k s_{kt} + c_{ijk} s_{kt} + \varepsilon_{ijkt}$$
where $i$ is the survey indicator (old, new), $j$ the group indicator (15 industries), and $k$ indexes the 8 states and Australia
34
Example: New and old original data for 2 different industries
35
Example: Selected state results
* seasonal factors are not significantly different between new and old survey

  Series        Survey  Mar 2001  Jun 2001  Sep 2001  Dec 2001
  NSW           New     0.984     1.004     0.983     1.029
                Old     0.981     0.994     0.995     1.030
* Victoria      New     0.983     0.999     0.984     1.034
                Old     0.985     0.993     0.992     1.031
  Queensland    New     0.924     0.979     1.019     1.081
                Old     0.972     0.992     0.991     1.045
* ACT           New     0.986     1.021     0.967     1.024
                Old     0.962     1.043     0.990     1.005
* Total x Ind   New     0.977     0.999     0.988     1.036
                Old     0.980     0.996     0.993     1.031
36
Multivariate approach: Comments
• A mixed model (random and fixed effects) with multi-level modelling allowed realistic seasonal factors to be estimated for a short time series
• Provided a framework
• More cases like this will occur in practice
• Further work by Carole Birrell and David Steel at University of Wollongong
37
General points: Previous lessons
• Quality assurance of the new estimation methodology
• Prepare users for revisions in time series estimates
• One overlap time point is simply not enough to make a good impact assessment because of rotation effects
• May require a re-backcast once additional information is available
• Variances of impact assessments were not available
38
General points: Previous lessons (continued)
• Managing statistical risk
  • Classification and estimation methodology changing at same time versus consecutive change approach
  • Different measurement methods: parallel estimates and parallel run
  • Rehearsal of backcast environment in relation to regular production schedule
  • Ensure additivity of estimates for shallow aggregation structure
  • Decide whether new estimates should be published, e.g. whether the new estimate is good enough for publication or only for impact measurement purposes
39
Upcoming examples: Office for National Statistics
• Industry classification changes
  • Approximately once every 12 years
  • Strong link to European needs
  • Long agreement process
• Currently in planning stage for upcoming changes
• Broad time frame
  • 2007 Adding new industry codes to business register
  • 2008 Annual surveys selected on new codes
  • 2009 Short-term surveys selected on new codes
  • 2011 National Accounts first moves to new codes
40
Upcoming examples: Office for National Statistics (continued)
• Need to be able to re-construct results from old (2003) and new (2007) classifications
• Options available:
  • Domain estimation
  • Conversion matrices
• Parallel running for a limited period?
  • Problems of compiling and publishing results on two bases simultaneously
  • Constrain results
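As a rough illustration of the conversion-matrix option, the sketch below reallocates estimates on an old classification to a new one using fixed shares. The codes, shares and values are entirely hypothetical and are not ONS figures; how the shares would actually be derived is not covered in the slides.

```python
def convert(old_estimates, conversion):
    """Reallocate old-classification estimates to new codes.

    conversion[old_code][new_code] is the share of the old class assigned
    to the new class; shares for each old code are assumed to sum to 1.
    """
    new_estimates = {}
    for old_code, value in old_estimates.items():
        for new_code, share in conversion[old_code].items():
            new_estimates[new_code] = new_estimates.get(new_code, 0.0) + share * value
    return new_estimates


# Hypothetical old-classification estimates and conversion shares.
old_estimates = {"A": 200.0, "B": 100.0}
conversion = {
    "A": {"X": 0.75, "Y": 0.25},
    "B": {"Y": 0.5, "Z": 0.5},
}

new_estimates = convert(old_estimates, conversion)
print(new_estimates)
```

Because each row of shares sums to one, the total across the new codes equals the total across the old codes.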
41
Upcoming examples: Australian Bureau of Statistics
• Range of issues
  • New industry classification
  • Generalised regression sample and estimation including using ABS Survey Facilities estimation approach
  • Building Activity Survey turnover stratification changes
  • …
• Developing tools to assist
  • Generalised backcasting tool
42
Upcoming examples: Australian Bureau of Statistics (continued)
• Currently: creation of new frames and specification of backcasting facility
• 2006 to 2009: Transition phase with dual frames involving parallel estimation based on top-up samples
• September 2009 onwards: Implementation where sample design and estimation is based on new classification
43
Upcoming examples: Australian Bureau of Statistics (continued)
• Measuring the impact
  • For quarterly subannual surveys: 5 overlap points
  • For monthly subannual surveys: 13 overlap points
  • Methodology Division will provide advice on the significance of the impact, i.e. is there any real impact?
  • If no seasonal change then minimum of 3 overlap points
• To calculate the trend factor for backcasting, all new and old estimates are needed:
  $\text{Trend}_{New} / \text{Trend}_{Old} = \frac{1}{5} \sum_t (\text{Original}_{New} / \text{Original}_{Old})_t$
44
Some further information
• ONS MD contact: [email protected]
• ABS MD contact: [email protected]
• McLaren and Zhang (2003), Estimation of seasonal factors for a short time span using multi-level modelling with mixed effects, Working Paper No. 2003/1, www.abs.gov.au