Business Forecasting ECON2209
Slides 03
Lecturer: Minxian Yang
BF-03 1 my, School of Economics, UNSW
Ch.4 Statistical Graphics
• Lecture Plan
  – Graphs of data: merits and limitations
  – Examples: use graphs to show data features
  – Time series differ from random samples
  – Components in time series
  – Classical decomposition
Statistical Graphics
• Example: Anscombe's quartet
  – Why do the four data sets give an identical regression line?
obs   X1     Y1   X2     Y2   X3     Y3   X4     Y4
  1   10   8.04   10   9.14   10   7.46    8   6.58
  2    8   6.95    8   8.14    8   6.77    8   5.76
  3   13   7.58   13   8.74   13  12.74    8   7.71
  4    9   8.81    9   8.77    9   7.11    8   8.84
  5   11   8.33   11   9.26   11   7.81    8   8.47
  6   14   9.96   14   8.10   14   8.84    8   7.04
  7    6   7.24    6   6.13    6   6.08    8   5.25
  8    4   4.26    4   3.10    4   5.39   19  12.5
  9   12  10.84   12   9.13   12   8.15    8   5.56
 10    7   4.82    7   7.26    7   6.42    8   7.91
 11    5   5.68    5   4.74    5   5.73    8   6.89
Fitted line (standard errors in parentheses):
$$\hat{y} = \underset{(1.12)}{3.00} + \underset{(0.12)}{0.50}\,x, \qquad R^2 = 0.67$$
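The claim that all four data sets share one regression line can be checked directly. The following is a minimal sketch in plain Python (the slides use EViews; the function name `ols` is ours), fitting a least-squares line to each pair of columns of the quartet.

```python
# Anscombe's quartet: X1, X2 and X3 share the same values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89]


def ols(x, y):
    """Return (intercept, slope, R^2) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    b = sxy / sxx
    a = my - b * mx
    r2 = sxy ** 2 / (sxx * syy)  # squared correlation equals R^2 here
    return a, b, r2


fits = [ols(x123, y1), ols(x123, y2), ols(x123, y3), ols(x4, y4)]
for k, (a, b, r2) in enumerate(fits, start=1):
    print(f"set {k}: y = {a:.2f} + {b:.2f} x,  R^2 = {r2:.2f}")
```

The fit depends on the data only through the means, variances and covariance of x and y, and these (nearly) coincide across the four sets. That is why the regression output cannot tell them apart while a scatter plot can.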
• Example – Scatter plots explain it vividly.
[Figure: four scatter plots: Y1 vs. X1, Y2 vs. X2, Y3 vs. X3, Y4 vs. X4]
• Advantages of graphs
  – Graphs represent data visually and help us to see data features/patterns.
    • "A graph is worth a thousand words."
  – Graphs make anomalies/outliers apparent.
  – Graphs are effective in comparing data sets.
  – But it is hard to visualise high-dimensional data.
• Scatter plots
  Useful for revealing the relationship between two variables.
  eg. "xyz.dat"
[Figure: two scatter plots: Y vs. X, Y vs. Z]
ls y c z
======================================================
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C             10.04891      0.272825      36.83280      0.0000
Z             -0.364783     0.242923      -1.501641     0.1400
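Weak relation between y and z? Not necessarily. The model could be
  y = b0 + b1 z + b2 x + u,
in which case the regression of y on z alone leaves out x.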
• Scatter plots
  eg. "xyz.dat". Partial relationship of y and z (after controlling for x):
  – regress y on c x to get resid01;
  – regress z on c x to get resid02;
  – scatter plot resid01 against resid02.
EViews:
  Type in top panel
    ls y c x
  On the result window, click Proc, Make Residual Series, resid01 (in Name for resid series), OK
  Type in top panel
    ls z c x
  On the result window, click Proc, Make Residual Series, resid02 (in Name for resid series), OK
  Type in top panel
    group grp resid02 resid01
    grp.linefit
  Also compare the result of
    ls resid01 resid02
  against the result of
    ls y c x z
Dependent Variable: Y, Sample: 1 48
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C              9.884732     0.190297      51.94359      0.0000
X              1.073140     0.150341       7.138031     0.0000
Z             -0.638011     0.172499      -3.698642     0.0006
[Figure: scatter plot of RESID01 vs. RESID02]
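The three partialling-out steps can be sketched outside EViews as well. The example below is only an illustration under stated assumptions: the data are synthetic (generated with made-up coefficients, not read from xyz.dat), and the helper names `cross` and `residuals` are ours. It compares the slope from regressing resid01 on resid02 with the coefficient on z in the regression of y on c, x, z.

```python
import random

random.seed(0)  # synthetic data for illustration; NOT the xyz.dat file
n = 48
x = [random.gauss(0, 1) for _ in range(n)]
z = [0.6 * xi + random.gauss(0, 1) for xi in x]  # z is correlated with x
y = [10 + 1.0 * xi - 0.6 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]


def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Sum of cross-products of the demeaned series u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def residuals(dep, reg):
    """Residuals from regressing dep on a constant and reg."""
    slope = cross(reg, dep) / cross(reg, reg)
    intercept = mean(dep) - slope * mean(reg)
    return [d - intercept - slope * r for d, r in zip(dep, reg)]


# Steps 1 and 2: remove the part of y and of z that is explained by x.
resid01 = residuals(y, x)
resid02 = residuals(z, x)

# Step 3: slope of resid01 on resid02 (both residual series have mean zero).
b_partial = cross(resid02, resid01) / cross(resid02, resid02)

# Coefficient on z in the regression of y on c, x, z (2x2 normal equations).
sxx, szz, sxz = cross(x, x), cross(z, z), cross(x, z)
sxy, szy = cross(x, y), cross(z, y)
b_full = (sxx * szy - sxz * sxy) / (sxx * szz - sxz ** 2)

# Slope from the simple regression of y on c, z, for comparison.
b_simple = szy / szz

print(f"simple: {b_simple:.4f}  partial: {b_partial:.4f}  full: {b_full:.4f}")
```

The partial slope and the full-regression coefficient agree up to rounding error, which is what comparing `ls resid01 resid02` with `ls y c x z` is meant to show. The simple-regression slope generally differs because it ignores x.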
• Time series plot
  eg. "liquor.dat", monthly sales, 1967.01 – 1994.12
  – trend (increase over time on average)
  – seasonality (recurring pattern: Dec high, Feb low)
[Figure: time series plot of Liquor Sales]
• Time series plot
  eg. US 10-year treasury bond yield (monthly, %), 530 observations
  – persistent (gentle moves with few large jumps)
  – random trend (ups & downs without a clear pattern)
[Figure: time series plot of 10-year T-bond Yield]
• Time series plot
  eg. Change of US 10-year treasury bond yield (monthly)
  – fluctuates about 0 (never moves away for long)
  – volatile (changes direction frequently, with large spikes)
[Figure: time series plot of Change of 10-year T-bond Yield]
$$\Delta y_t = y_t - y_{t-1}$$
EViews:
  Read in bond10y.csv
    File, Open, Foreign Data as Workfile, bond10y.csv (in File name), Open, Finish
  Type in top panel
    plot close
    genr dy=close-close(-1)
    plot dy
    hist dy
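The differencing step `genr dy=close-close(-1)` can be mirrored in a few lines of Python. This is a sketch with made-up yield values (not the contents of bond10y.csv); the function name `first_difference` is ours.

```python
def first_difference(y):
    """Return dy with dy[t] = y[t] - y[t-1]; the first entry is None (no lag)."""
    return [None] + [y[t] - y[t - 1] for t in range(1, len(y))]


# Hypothetical yields in percent, invented for illustration.
close = [4.00, 4.10, 4.05, 4.30, 4.25]
dy = first_difference(close)
print([None if d is None else round(d, 2) for d in dy])
```

The first observation has no predecessor, so a series of T levels gives T - 1 changes. This is why 530 yield observations leave 529 changes.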
• Time series plot
  – Time series plots reveal
    • trends
    • seasonalities
    • volatilities (amount of variation)
    • breaks (pattern changes)
    • outliers (unusual observations)
    in data.
  – First thing in time series analysis: plot the data.
• Histogram
  Describes how data are distributed (frequency distribution).
  eg. Change of US 10-year treasury bond yield (monthly)
  Thick-tailed (caused by a small number of large jumps)
[Figure: histogram of Change of 10-year T-bond Yield]

Sample 1962M01 2006M02
Observations 529

Mean           0.000926
Median         0.010000
Maximum        1.590000
Minimum       -1.880000
Std. Dev.      0.348751
Skewness      -0.280792
Kurtosis       6.565498

Jarque-Bera    287.1622
Probability    0.000000
Normal distribution: skewness = 0, kurtosis = 3. At the 5% level, reject normality if Jarque-Bera > 5.99.
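The Jarque-Bera statistic combines the two departures from normality mentioned above: JB = (n/6)(S^2 + (K - 3)^2/4), where S is the skewness and K the kurtosis. The sketch below (function names are ours) computes the moments with divide-by-n formulas, which is an assumption about the convention, and then plugs in the values reported for the change in the bond yield.

```python
def moments(data):
    """Return (skewness, kurtosis) using divide-by-n central moments."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((d - m) ** 2 for d in data) / n
    m3 = sum((d - m) ** 3 for d in data) / n
    m4 = sum((d - m) ** 4 for d in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def jarque_bera(n, skew, kurt):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4)."""
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


# Statistics reported in the histogram output for the change in yield.
jb = jarque_bera(529, -0.280792, 6.565498)
reject_normality = jb > 5.99  # 5% critical value, chi-square with 2 d.f.
print(round(jb, 2), reject_normality)
```

With these inputs the formula reproduces the reported value of about 287.16, far above 5.99, so normality is rejected. Almost all of the statistic comes from the kurtosis term, in line with the thick tails seen in the histogram.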
• Empirical cumulative distribution function (cdf)
  Another way to look at how data are distributed.
  eg. Change of US 10-year treasury bond yield (monthly)
[Figure: Empirical CDF; horizontal axis: Change, vertical axis: Probability]
80% of observations are below 0.23.
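A statement such as "80% of observations are below some value" reads a quantile off the empirical cdf. The sketch below shows one simple way to compute both, using a short made-up list of changes (not the bond data). The quantile rule used here, the smallest observation whose cumulative share reaches p, is an assumption; statistical packages may interpolate between observations instead.

```python
def ecdf(data, x):
    """Share of observations less than or equal to x."""
    return sum(1 for d in data if d <= x) / len(data)


def empirical_quantile(data, p):
    """Smallest observation whose cumulative share is at least p."""
    s = sorted(data)
    n = len(s)
    for i, v in enumerate(s, start=1):
        if i / n >= p:
            return v
    return s[-1]


# Hypothetical monthly changes, invented for illustration.
changes = [-0.4, -0.2, -0.1, 0.0, 0.0, 0.1, 0.1, 0.2, 0.3, 0.6]
q80 = empirical_quantile(changes, 0.8)
print(q80, ecdf(changes, q80))
```

Plotting `ecdf(changes, x)` against x over a grid of x values gives the step function shown in an empirical cdf graph.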
• QQ-plot
  Checks how well a theoretical distribution fits data.
  For a perfect fit, the QQ-plot is a straight line.
  eg. Change of US 10-year treasury bond yield: it appears non-normal.
data with 529 observations: (1/529)th quantile = -1.88
std normal distribution: (1/529)th quantile = -2.90
[Figure: Theoretical Quantile-Quantile plot; horizontal axis: Change, vertical axis: Normal Quantile; marked point (-1.88, -2.90)]
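The marked point pairs the smallest observation with the standard normal quantile at probability 1/529. That normal quantile can be checked with Python's standard library, as a minimal sketch:

```python
from statistics import NormalDist

n = 529
p = 1 / n

# Standard normal quantile at probability 1/529.
q_normal = NormalDist().inv_cdf(p)

# Smallest observed change, as reported in the slide.
q_data = -1.88

print(round(q_normal, 2), q_data)
```

The normal quantile comes out at roughly -2.90. Each point on the QQ-plot pairs a data quantile with the normal quantile at the same probability; this is the pair for the lowest probability.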
• EViews Commands for the examples
EViews:
  Read in bond10y.csv
    File, Open, Foreign Data as Workfile, bond10y.csv (in File name), Open, Finish
  Type in top panel
    genr dy=close-close(-1)
    plot close dy
    dy.line
    dy.hist
    dy.distplot cdf
    dy.qqplot
    scalar q80=@quantile(dy, .8)
    scalar qn529=@qnorm(1/529)
• Style of graphs
  – Easy to understand
    eg. indicate the meaning of axes
    Highlight the point you are trying to make
  – Informative
    eg. make graphs self-explanatory
  – Attractive
    eg. use proper colours and symbols
  – Avoid "chart junk"
    eg. abuse of colours, shadings, grids,…
Decomposition of a Time Series
• Time series versus random sample
  – A random sample is a set of independent observations on a variable, often collected at one point in time.
    eg. Income of a randomly-selected household
        Med-insurance status of a randomly-selected household
  – A time series is a set of observations on a variable observed over consecutive time intervals.
    eg. T-bond rate, gold price, retail sales, … (daily, annual,...)
• Features of economic time series data
  – Trend, seasonality, fluctuation/cycle
  – Autocorrelation (future is influenced by present)
  eg. Department stores turnover: 1982.04 – 1999.10
    Trend (sales growing),
    Seasonality (peak & trough repeat),
    Cycle (random fluctuations)
[Figure: two time series plots: Retail Turnover ($M), log Retail Turnover]
• Features of economic time series data
  eg. Gold price ($US per fine ounce, London 3pm, 1/80-11/99)
  – sub-samples very different; trends vary (randomly);
  – persistent with few large jumps
[Figure: time series plot of Gold Price (USD/fine ounce, London 3pm)]
• Trend, seasonality and cycle
  Let yt be a time series (observable).
  – Trend, mt, is the smoothly evolving part of yt.
    It represents the long-run movement of yt.
    eg. Trend in log retail turnover
[Figure: log Retail Turnover: Trend]
• Trend, seasonality and cycle
  – Seasonality, st, is the repetitive part of yt.
    It repeats over a fixed number of periods.
    eg. quarterly seasonality repeats over 4 quarters.
    eg. Log retail turnover: raw – trend = detrended = seasonality + cycle
[Figure: log Retail Turnover: Trend & Seasonality; series shown: Cycle, Raw-Trend, Seasonality]
• Trend, seasonality and cycle
  – Cycle, xt, is the random fluctuation in yt, aka irregular component.
    It is a random variable (RV) for each t.
    eg. Interesting to know how xt and xt-1 are associated.
    eg. Log retail turnover: cycle = raw – trend – seasonality
[Figure: Theoretical Quantile-Quantile plot of Cycle (vertical axis: Normal Quantile) and time series plot of Cycle]
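One common way to measure how xt and xt-1 are associated is the sample lag-1 autocorrelation. The sketch below is an illustration only: the two short series are made up (they are not the retail cycle), and the function name `lag1_autocorr` is ours.

```python
def lag1_autocorr(x):
    """Sample lag-1 autocorrelation:
    sum_t (x_t - m)(x_{t-1} - m) / sum_t (x_t - m)^2, with m the sample mean."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den


# Made-up series for illustration.
persistent = [1, 2, 3, 4, 5, 6, 7, 8]        # neighbours are similar
alternating = [1, -1, 1, -1, 1, -1, 1, -1]   # neighbours have opposite signs

r_persistent = lag1_autocorr(persistent)
r_alternating = lag1_autocorr(alternating)
print(r_persistent, r_alternating)
```

A value near +1 means that a high xt-1 tends to be followed by a high xt; a value near -1 means the series tends to flip sign from one period to the next; a value near 0 suggests little linear association between neighbours.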
• Classical decomposition of yt
  – CD is a model that splits the observable time series, yt, into three unobserved components: trend (mt), seasonality (st), cycle (xt)
  – Additive decomposition
    $$y_t = m_t + s_t + x_t, \qquad s_{t+p} = s_t, \qquad \sum_{i=1}^{p} s_{t+i} = 0, \qquad \mathrm{E}(x_t) = 0,$$
    where p is the number of periods in a season. eg. p = 12 for monthly series
    The seasonal sum and the mean of the cycle are normalized to zero so that each component is identified.
  – Multiplicative decomposition: Yt = Mt St Xt
    But log(Yt) has an additive decomposition.
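The link between the two decompositions is one line of algebra, written out here as a worked step. It assumes that Mt, St and Xt are all positive so that the logs exist; the lower-case symbols on the last line are simply names for the logged components.

```latex
\begin{aligned}
Y_t &= M_t \, S_t \, X_t \\
\log Y_t &= \log M_t + \log S_t + \log X_t \\
y_t &= m_t + s_t + x_t,
\qquad y_t = \log Y_t,\; m_t = \log M_t,\; s_t = \log S_t,\; x_t = \log X_t .
\end{aligned}
```

This is why the retail turnover example works with log turnover: after taking logs, the additive model applies.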
• Classical decomposition of yt e.g. Department stores turnover: 1982.04 – 1999.10
[Figure: four time series panels: log Retail Turnover, Cycle, Seasonality, Trend]
EViews:
  Read in data by clicking
    File, New, Workfile, Dated (in Workfile structure), Monthly (in Frequency), 1982:01 (in Start date), 1999:10 (in End date), OK;
    Proc (on Workfile window), Import, Import from file…, departstoresTurnover05.xls (in File name), Open, Finish, OK
  Time series plot of log sales by typing in top panel
    genr y=log(sales)
    y.line
  Generate MA trend
    genr ma12b=(y(-6)+y(-5)+y(-4)+y(-3)+y(-2)+y(-1)+y+y(1)+y(2)+y(3)+y(4)+y(5))/12
    genr ma12f=(y(-5)+y(-4)+y(-3)+y(-2)+y(-1)+y+y(1)+y(2)+y(3)+y(4)+y(5)+y(6))/12
    genr ma12=0.5*(ma12b+ma12f)
  De-trended series
    genr ydt=y-ma12
    plot y ma12
    plot y ma12 ydt
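The moving-average trend and the de-trending step translate directly into Python. The sketch below follows the same centred 12-month average (the mean of the ma12b and ma12f windows). The seasonal step is our addition: averaging the de-trended values month by month and recentring them so they sum to zero is one standard way to estimate st, not something the EViews commands above do. The input series is synthetic, not the department stores data.

```python
import math


def centered_ma12(y):
    """Centred 12-term moving average; None where the window does not fit."""
    n = len(y)
    out = [None] * n
    for t in range(6, n - 6):
        ma_b = sum(y[t - 6:t + 6]) / 12  # y(-6) ... y(5)
        ma_f = sum(y[t - 5:t + 7]) / 12  # y(-5) ... y(6)
        out[t] = 0.5 * (ma_b + ma_f)
    return out


def seasonal_indices(ydt, p=12):
    """Average de-trended values by position in the season, then subtract
    their mean so that the p indices sum to zero."""
    sums, counts = [0.0] * p, [0] * p
    for t, v in enumerate(ydt):
        if v is not None:
            sums[t % p] += v
            counts[t % p] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    adj = sum(raw) / p
    return [r - adj for r in raw]


# Synthetic monthly series: linear trend plus a zero-sum seasonal pattern.
T = 120
true_s = [0.2 * math.sin(2 * math.pi * k / 12) for k in range(12)]
y = [6.0 + 0.01 * t + true_s[t % 12] for t in range(T)]

ma12 = centered_ma12(y)                                        # trend
ydt = [None if m is None else v - m for v, m in zip(y, ma12)]  # raw - trend
s_hat = seasonal_indices(ydt)                                  # seasonality
cycle = [None if d is None else d - s_hat[t % 12] for t, d in enumerate(ydt)]
```

The centred average loses six observations at each end of the sample. Because the synthetic series has no irregular component, the estimated trend and seasonal indices match the ones used to build it and the cycle is numerically zero; with real data the cycle is whatever remains after the trend and seasonality are removed.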
• Data and EViews
• Summary
  – List the types of graphs we used today.
  – Why is plotting data important?
  – What are the major components in a time series?
  – What is the additive decomposition?
  – Why must the seasonal component sum to zero over a season?
  – Why is the mean of the cycle component normalised to zero?