8/11/2019 Part 1 Financial Variables
ETF5930 Financial econometrics
Part 1: Some financial variables, their time series and distributions
As an expert in Finance, you are interested in quantities such as:
The value of an equity (share, unit etc.)
The return on an equity over a particular time period
The dividend paid on a share
Dividend yields
The value of an index: stock market indices such as the Dow Jones or the S&P 500
Interest rates
Exchange rates, etc.
In Financial Econometrics, we think of these quantities as statistical variables, and we study the
distributions of these variables, and test for evidence of relationships between them. We consider
their behaviour over time, and also the cross-sectional distribution.
The distributions
Cross-sectional distributions
We can study the cross-sectional distributions: for example, consider the variable return on a share
and consider all companies included in the S&P 500, and the return on a share in each of these
companies between close on 3 February 2014 and close on 4 February 2014. So the variable return on a
share takes different values for different companies observed at the same time. Here we can graph
a histogram of the returns and identify whether a particular company is, for example, in the top 10%
of returns.
(Data from http://www.asxallordinaries.com/)
Linear regression can be used to study the relationship between two or more cross-sectional
variables. However, in finance we are frequently concerned with variables that vary across time.
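As a minimal sketch of the cross-sectional idea (many companies, one time period), the following ranks a set of one-day returns and picks out the top 10%. The tickers and return values are made up for illustration and are not taken from the All Ordinaries data.

```python
import math
from collections import Counter

# Hypothetical one-day simple returns for ten companies (made-up tickers and values)
returns = {
    "AAA": 0.012, "BBB": -0.004, "CCC": 0.031, "DDD": 0.000, "EEE": -0.018,
    "FFF": 0.007, "GGG": 0.045, "HHH": -0.009, "III": 0.015, "JJJ": 0.002,
}

# Rank companies from highest to lowest return
ranked = sorted(returns, key=returns.get, reverse=True)

# The top 10% is the best-performing tenth of the cross-section (at least one company)
n_top = max(1, math.ceil(0.10 * len(ranked)))
top_decile = ranked[:n_top]

# Histogram counts: number of companies in each bin of width one percentage point.
# Bin k holds returns in the interval [k%, (k+1)%).
counts = Counter(math.floor(r / 0.01) for r in returns.values())

print("Top 10%:", top_decile)
for k in sorted(counts):
    print(f"{k:3d}% to {k + 1:3d}%: {'#' * counts[k]}")
```

Note that the time at which each return occurred plays no role here: every value refers to the same day.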
[Figure: Distribution of proportional change in price from COB 3 Feb 2014 to COB 4 Feb 2014, All Ordinaries. Vertical axis: number of companies.]
Variation across time
We can study for a particular company how the price of a share for example varies across time.
Then the price of a share is regarded as a random variable. It takes different values at different
points in time and might be denoted by $Y_t$, where $t$ runs from 1 to $T$. (Here we may be considering
the values tick by tick, hourly, daily, annually.) We can graph the distribution of prices, throwing
away the information about what time each price occurred. Or we can graph the price of the share
against the time variable. We can also study the relationship between two or more variables that
vary across time.
Dynamical models
Eventually we can consider dynamical models of the variation across time. We will start
discussing this in about Week 6 and much of the second half of the semester will be devoted to
dynamic models.
Index numbers
It is often of interest to compare the behaviour over time of a single equity with the behaviour of the
market as a whole. Index numbers measuring the overall movement of the market are commonly
used and can be tracked and studied in the same way as a single equity.
The following graphs were obtained from data on the yahoo website
(http://au.finance.yahoo.com/q/hp?s=%5EAORD etc.)
Note that the two mining shares have fairly similar behaviour over time.
Question: Can you tell from this graph which of the mining shares is performing better? Why or why
not?
[Figure: Time series for share prices of several Australian companies (NAB, Fairfax, RIO, BHP). Vertical axis: share price ($).]
Next we graph the All Ordinaries index, showing the behaviour of the market overall.
Notice that all the example shares exhibit some of the variation shown by the index, but sometimes
other things are going on as well. For example compare Fairfax with the index:
In weeks 3-5 we will discuss the relationship of individual share prices to the market.
The next graph is of an American index: the time series for the American S&P 500 from 1957 to the
present:
[Figure: Closing value of the ASX All Ordinaries index, 1988 - 2013.]
[Figure: Fairfax share price ($) over time.]
Time series graphs of equities and indexes are often given in terms of log(price) or log(index)
instead of the price itself. For example, if we graph the log of the S&P 500 we get:
An increase in the logarithm by a fixed amount at any point corresponds to the same proportionate
increase in the index. For example an increase of 50% in the index corresponds to an increase of
approximately 0.4 in the logarithm of the index. And a fall of one third corresponds to a decrease of
approximately 0.4. This means that catastrophic falls for example that happened a long time ago are
not made to look negligible relative to recent events just because the base value was lower then.
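The two numbers quoted above can be checked directly. This is a small sketch using natural logarithms; the base levels 100 and 5000 are arbitrary illustrative values.

```python
import math

# A 50% rise and a fall of one third, expressed as changes in ln(index)
rise = math.log(1.5)       # ln(1.5)
fall = math.log(2 / 3)     # ln(2/3), which equals -ln(1.5)

# The same proportionate change gives the same change in the log
# whatever the base level of the index
for base in (100.0, 5000.0):
    change = math.log(base * 1.5) - math.log(base)
    assert math.isclose(change, rise)

print(round(rise, 3), round(fall, 3))  # 0.405 -0.405
```

A rise of 50% followed by a fall of one third returns the index to where it started, which is why the two log changes are equal in size and opposite in sign.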
As another example, we show a time series of the Australian All Ordinaries index, and of the log of
the index.
[Figure: S&P 500 index, 1957 - 2013.]
[Figure: ln(S&P500 index) against date.]
[Figure: Closing value of the ASX All Ordinaries index, 1984 - 2013.]
[Figure: ln(ASX All Ordinaries index), daily closing value, 1984 - 2013.]
Prices, returns and dividends
Returns
It is already clear from the examples above that the actual value of the shares can be widely
different. (Fairfax shares are around a few dollars while BHP shares are around a few tens of dollars.)
These kinds of differences are not of importance to us. We are more interested in the rate at which
they are rising or falling, and again the logarithm will be relevant.
Consider the daily rise and fall of an equity. The simple return over one unit in time is defined to be
$$R_t = \frac{P_t - P_{t-1}}{P_{t-1}}$$
(or, expressed as a percentage, $R_t = \frac{P_t - P_{t-1}}{P_{t-1}} \times 100\%$).
Solving for $P_t$:
$$P_t = (1 + R_t)\,P_{t-1} \quad\text{and}\quad \frac{P_t}{P_{t-1}} = 1 + R_t.$$
We usually consider continuously compounded returns instead of simple returns.
The continuous rate of return can be defined as
$$r_t = \ln\frac{P_t}{P_{t-1}},$$
so it is related to $R_t$ by
$$r_t = \ln(1 + R_t).$$
The continuously compounded return is also referred to as the log return.
(The definition of the continuously compounded rate of return is analogous to compound interest in
the limit as compounding happens at smaller and smaller intervals: details in the following box.)
Continuous compounding:
Recall the compounding of interest: If the annual interest rate is specified as $r$, or $100r\%$, and
it is to be compounded $m$ times per year, then after $t$ years, the principal $P_0$ has grown to
$$P_t = P_0\left(1 + \frac{r}{m}\right)^{mt}$$
and as we approach continuous compounding, that is, let $m$ go to infinity,
$$P_t = \lim_{m\to\infty} P_0\left(1 + \frac{r}{m}\right)^{mt} = P_0 e^{rt}, \quad\text{where}\quad e = \lim_{m\to\infty}\left(1 + \frac{1}{m}\right)^{m} \approx 2.718.$$
And of course,
$$P_{t-1} = \lim_{m\to\infty} P_0\left(1 + \frac{r}{m}\right)^{m(t-1)} = P_0 e^{r(t-1)}$$
so that
$$\frac{P_t}{P_{t-1}} = e^{r}, \quad\text{or}\quad r = \ln\frac{P_t}{P_{t-1}}.$$
So far, we are considering a pre-determined interest rate, but we wish to consider returns that
vary with time. If $P_t$ is the price of an asset at time $t$, we can define the continuously
compounded return at time $t$ to be:
$$r_t = \ln\frac{P_t}{P_{t-1}}.$$
Since the simple return is $R_t = \frac{P_t - P_{t-1}}{P_{t-1}}$, we have $\frac{P_t}{P_{t-1}} = 1 + R_t$,
so that $r_t = \ln(1 + R_t)$.
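The limit in the box can be illustrated numerically. The sketch below uses made-up values (a principal of 100, an annual rate of 5% and a horizon of 3 years) and shows the compounded value approaching $P_0 e^{rt}$ as $m$ grows; it then checks the relation $r_t = \ln(1 + R_t)$ on a hypothetical pair of prices.

```python
import math

def grow(p0, r, m, t):
    """Value of principal p0 after t years at annual rate r, compounded m times per year."""
    return p0 * (1 + r / m) ** (m * t)

p0, r, t = 100.0, 0.05, 3          # illustrative values only
for m in (1, 12, 365, 1_000_000):
    print(f"m = {m:>9}: {grow(p0, r, m, t):.6f}")

continuous = p0 * math.exp(r * t)  # limiting value P_0 * e^(r t)
print(f"continuous : {continuous:.6f}")

def simple_return(p_prev, p_now):
    """R_t = (P_t - P_{t-1}) / P_{t-1}"""
    return (p_now - p_prev) / p_prev

def log_return(p_prev, p_now):
    """r_t = ln(P_t / P_{t-1})"""
    return math.log(p_now / p_prev)

R = simple_return(50.0, 51.0)      # hypothetical prices
r_log = log_return(50.0, 51.0)
print(R, r_log, math.log(1 + R))
```

For small returns the simple and log returns are close, which is why the two are often used interchangeably for daily data.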
Additivity of returns
Continuously compounded returns are additive across time.
Recall that $\ln(x) + \ln(y) = \ln(xy)$, so that
$$\ln\frac{A}{B} + \ln\frac{B}{C} = \ln\left(\frac{A}{B}\cdot\frac{B}{C}\right) = \ln\frac{A}{C}.$$
It follows that weekly continuously compounded returns can be obtained by adding daily returns:
$$\ln\frac{p_1}{p_0} + \ln\frac{p_2}{p_1} + \ln\frac{p_3}{p_2} + \ln\frac{p_4}{p_3} + \ln\frac{p_5}{p_4} = \ln\frac{p_5}{p_0}.$$
Simple returns are additive across a portfolio (this is a weighted sum):
$$R_{pt} = \sum_i w_i R_{it},$$
where $R_{pt}$ is the portfolio return at time $t$, $R_{it}$ is the return of individual share $i$ at time $t$, and $w_i$ is the weight of
the $i$th equity in the portfolio.
Returns and prices
If you know the time series of prices, then you know the time series of (log) returns since
$$r_t = \ln\frac{P_t}{P_{t-1}}.$$
If you know the initial price and the returns at each time point, then you know the time series of
prices: $P_t = e^{r_t} P_{t-1}$.
Since there is a one-to-one relationship between returns and prices, we can say that prices and
returns are different representations of the same data.
Different representations of the same data are common in finance. Each representation has
different statistical properties and reveals different features of the underlying phenomena. The
different representations require different approaches to the statistical analysis.
Consider first what we learn from the log returns. Consider a time series graph of the daily log
returns:
[Figure: Time series of daily log returns, S&P500.]
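The relationships between prices and returns in this section can be sketched in a few lines. The price series, weights and asset returns below are hypothetical values chosen only to illustrate additivity across time, additivity across a portfolio, and the rebuilding of prices from log returns.

```python
import math

prices = [100.0, 101.0, 99.5, 102.0, 103.5, 104.0]   # hypothetical p_0, ..., p_5

# Daily log returns r_t = ln(P_t / P_{t-1})
log_returns = [math.log(p_now / p_prev) for p_prev, p_now in zip(prices, prices[1:])]

# Additivity across time: the five daily log returns sum to the weekly log return
weekly = sum(log_returns)
print(weekly, math.log(prices[-1] / prices[0]))

# Prices can be rebuilt from the initial price and the log returns: P_t = exp(r_t) * P_{t-1}
rebuilt = [prices[0]]
for r in log_returns:
    rebuilt.append(math.exp(r) * rebuilt[-1])

# Additivity across a portfolio: the simple portfolio return is the weighted sum
weights = [0.6, 0.4]               # hypothetical weights, summing to 1
asset_returns = [0.02, -0.01]      # hypothetical simple returns R_it
portfolio_return = sum(w * R for w, R in zip(weights, asset_returns))
print(portfolio_return)
```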
The log returns appear to fluctuate about a mean. That mean value appears to be close to zero.
There are surprisingly many big positive and negative values, for example on Black Monday, 19 October
1987. (In what sense surprising? See below.)
Also volatility clustering occurs: there are periods when large changes are followed by more large
changes (positive or negative), and periods when small changes are followed by small changes. Thus the
volatility is time-varying, and there is clustering of volatility levels.
Returns generally do not trend up or down over time. They form a stationary series. (Stationarity
will be defined more formally later.) So we can investigate the distribution of returns even when the
observations have been made over time.
Thus the returns data can also be graphed in a histogram rather than a time series. Here we just look
at the values, not when they occurred. So the following graph is rather similar in nature to the cross
sectional graph, even though the values did occur at different points in time.
The red line shows the normal distribution with the same mean and standard deviation as this data.
Relative to the normal distribution, the returns data has a sharper central peak (more very small
values) and heavier tails (more extreme values, both positive and negative). This is the sense in
which there are surprisingly many big fluctuations.
Prices vs returns
Log returns have the following advantages.
They are unit free
They show the short-term variation well
They are statistically quite easy to handle.
However,
[Figure: Histogram of daily log returns for S&P500, 1957 - 2013, with the normal density overlaid.]
$$P_t = E_t\left[\frac{D_{t+1}}{1 + R_{t+1}} + \frac{D_{t+2}}{(1 + R_{t+2})^2} + \frac{D_{t+3}}{(1 + R_{t+3})^3} + \cdots\right]$$
where $P_t$ = price at time $t$, $E_t$ = expected value at time $t$, $D_{t+n}$ = dividend at time $t+n$, $R_{t+n}$ = discount
rate at time $t$ for dividend to be paid at time $t+n$.
If the expected dividend is constant and equal to $D_t$, and if the discount rate $R$ is assumed constant,
then (using the formula for the sum of a geometric series)
$$\begin{aligned}
P_t &= \frac{D_t}{1+R} + \frac{D_t}{(1+R)^2} + \frac{D_t}{(1+R)^3} + \cdots \\
    &= \frac{D_t}{1+R}\left[1 + \frac{1}{1+R} + \frac{1}{(1+R)^2} + \cdots\right] \\
    &= \frac{D_t}{1+R}\cdot\frac{1}{1 - \frac{1}{1+R}} \\
    &= \frac{D_t}{R}.
\end{aligned}$$
Rearranging, $R = \frac{D_t}{P_t}$. Here, remember, $P_t$ is the present value, and $D_t$ is the dividend level, which
we have assumed constant for the purposes of the calculation.
Of course over the long term dividends are not constant, and investors are interested in the current
level of dividends relative to price. Therefore dividend yield at time $t$ is defined as
$$YIELD_t = \frac{D_t}{P_t}.$$
(In the newspaper, what is quoted is (dividend for previous year) / (current price).)
Taking logarithms and rearranging, $\ln P_t = \ln D_t - \ln YIELD_t$.
Here is a graph of the dividend yield for the shares of the S&P500, monthly since 1870:
[Figure: S&P Dividend Yield, monthly since 1870.]
Notice that there is considerable variability, some of which can be associated with well-known
events. However, it appears to wander randomly around the level 0.05.
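The constant-dividend result $P_t = D_t / R$ can be checked numerically by summing a long but finite stream of discounted dividends. This is only a sketch: the dividend of 2.0 and discount rate of 5% are made-up values, and the function name `present_value` is introduced here for illustration.

```python
import math

def present_value(dividend, rate, n_terms):
    """Sum of dividend / (1 + rate)^k for k = 1, ..., n_terms (a truncated geometric series)."""
    return sum(dividend / (1 + rate) ** k for k in range(1, n_terms + 1))

D, R = 2.0, 0.05                   # hypothetical constant dividend and discount rate
closed_form = D / R                # value of the infinite sum

for n in (10, 100, 2000):
    print(n, round(present_value(D, R, n), 6))
print("D / R =", closed_form)

# Dividend yield and its log relationship to price and dividend
dividend_yield = D / closed_form
log_identity_gap = math.log(closed_form) - (math.log(D) - math.log(dividend_yield))
```

With a constant dividend and constant discount rate, the dividend yield equals the discount rate, which is the rearrangement $R = D_t / P_t$ given in the text.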
Numerical summary statistics
An analysis always starts with a graphical investigation, because the quickest way to get a feel for
the qualitative properties of a data set is by producing relevant graphical plots. However, numerical
summary statistics are needed in order to make quantitative comparisons between series in a more
systematic analysis.
The central location of a distribution is measured numerically as mean, median or mode.
The variability of a distribution is measured numerically using the variance, standard
deviation or interquartile range.
The asymmetry of a distribution is measured numerically using skewness.
The heaviness of tails is measured numerically as kurtosis.
The relatedness of the movement of two time series over time is measured numerically as
covariance or correlation.
Dependence on past values and periodic tendencies of a single time series are measured
numerically by autocovariance and autocorrelation.
The quantities calculated from the data are sample statistics. They are estimates of the population
parameters.
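Sample statistic | Name | Population parameter
$\bar{X} = \frac{1}{T}\sum_{t=1}^{T} X_t$ | Mean | $\mu_X = E(X)$
$s_X^2 = \frac{1}{T-1}\sum_{t=1}^{T}(X_t - \bar{X})^2$ | Variance | $\sigma_X^2 = E[(X - \mu_X)^2]$
$S = \frac{1}{T-1}\sum_{t=1}^{T}\left(\frac{X_t - \bar{X}}{s_X}\right)^3$ | Skewness | $\frac{E[(X - \mu_X)^3]}{\sigma_X^3}$
$K = \frac{1}{T-1}\sum_{t=1}^{T}\left(\frac{X_t - \bar{X}}{s_X}\right)^4$ | Kurtosis | $\frac{E[(X - \mu_X)^4]}{\sigma_X^4}$
$s_{XY} = \frac{1}{T-1}\sum_{t=1}^{T}(X_t - \bar{X})(Y_t - \bar{Y})$ | Covariance | $\mathrm{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
$\hat{\rho}_{XY} = \frac{s_{XY}}{s_X s_Y}$ | Correlation | $\rho_{XY} = \frac{\mathrm{cov}(X, Y)}{\sigma_X \sigma_Y}$
$\hat{\gamma}(k) = \frac{1}{T-k}\sum_{t=k+1}^{T}(X_t - \bar{X})(X_{t-k} - \bar{X})$ | Autocovariance | $\gamma(k) = E[(x_t - \mu_X)(x_{t-k} - \mu_X)]$
$\hat{\rho}(k) = \frac{\hat{\gamma}(k)}{\hat{\gamma}(0)}$ | Autocorrelation | $\rho(k) = \frac{\gamma(k)}{\gamma(0)}$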
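The sample statistics in the table can be sketched in plain Python as follows. This is an illustrative implementation, not the EViews one: it assumes the $T-1$ divisor for variance, skewness, kurtosis and covariance and the $T-k$ divisor for autocovariance as read from the table, and the function names and the small data set are made up.

```python
import math

def mean(x):
    return sum(x) / len(x)

def variance(x):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - 1)

def std(x):
    return math.sqrt(variance(x))

def standardised_moment(x, power):
    """(1 / (T - 1)) * sum of ((X_t - mean) / s_X) ** power"""
    m, s = mean(x), std(x)
    return sum(((v - m) / s) ** power for v in x) / (len(x) - 1)

def skewness(x):
    return standardised_moment(x, 3)

def kurtosis(x):
    return standardised_moment(x, 4)

def covariance(x, y):
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    return covariance(x, y) / (std(x) * std(y))

def autocovariance(x, k):
    """(1 / (T - k)) * sum over t = k+1..T of (X_t - mean)(X_{t-k} - mean)"""
    m, T = mean(x), len(x)
    return sum((x[t] - m) * (x[t - k] - m) for t in range(k, T)) / (T - k)

def autocorrelation(x, k):
    return autocovariance(x, k) / autocovariance(x, 0)

x = [1.0, 2.0, 3.0, 4.0, 5.0]      # made-up data
y = [2.0, 4.0, 6.0, 8.0, 10.0]     # exactly 2 * x, so the correlation is 1
print(mean(x), variance(x), skewness(x), kurtosis(x))
print(covariance(x, y), correlation(x, y), autocorrelation(x, 1))
```

Other software may use slightly different divisors (for example $T$ instead of $T-1$), so small differences from packaged output are to be expected.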
Mean, variance and covariance
The mean return is calculated from sample data as $\bar{r} = \frac{1}{T}\sum_{t=1}^{T} r_t$.
In the absence of any other information about returns, the expected return on any given day is
estimated to be $\bar{r}$. This is the best estimate of the expected value of the return $\mu_r = E[r_t]$.
However, the returns vary from day to day, and the magnitude of this variation is indicated by
calculating the variance of the returns. The population value of the variance is $\sigma_r^2 = E[(r_t - \mu_r)^2]$.
The best estimate of the variance based on the data from $t = 1$ to $T$ is $s_r^2 = \frac{1}{T-1}\sum_{t=1}^{T}(r_t - \bar{r})^2$.
Note that the mean of a multiple of $r$ is
$$E[w r_t] = w E[r_t],$$
which can also be written $\mu_{wr} = w\mu_r$.
But since the variance involves the square of $r$, we find $\sigma_{wr}^2 = w^2\sigma_r^2$. For example if you double the
whole return series, then you double the expected value, but you multiply the variance by 4.
Since the variance is a measure of the unpredictable variation in the returns, it is a measure of risk.
Now consider two assets with return series $r_1$ and $r_2$. The covariance between the two series is given
by $\mathrm{cov}(r_1, r_2) = E[(r_{1,t} - \mu_1)(r_{2,t} - \mu_2)]$ and so $\mathrm{cov}(w_1 r_1, w_2 r_2) = w_1 w_2\,\mathrm{cov}(r_1, r_2)$.
Also, $\mathrm{cov}(a, r_2) = 0$ where $a$ is a constant.
Portfolio risk management: Mean-variance model of relationship between risk and return
Variance is a measure of risk in the sense that if a stock is purchased because of its expected future
returns, but its variance is also high, this means there is a risk that its returns will differ from their
expected value. It is well-known that this risk can be diminished by holding a diversified portfolio. So
now we consider the variance of a portfolio.
An investor usually holds a portfolio of a variety of assets. We consider how to measure the variance
of the whole portfolio. Consider the simplest example where the portfolio consists of just two assets.
Suppose the portfolio consists of two investments, investments 1 and 2:
Mean: $\mu_1 = E[r_{1,t}]$, $\mu_2 = E[r_{2,t}]$
Variance: $\sigma_1^2 = E[(r_{1,t} - \mu_1)^2]$, $\sigma_2^2 = E[(r_{2,t} - \mu_2)^2]$
Covariance: $\sigma_{1,2} = E[(r_{1,t} - \mu_1)(r_{2,t} - \mu_2)]$
Current weight in value of portfolio: $w_1$, $w_2$
(The weights sum to 1.)
The return $r_p$ on the portfolio is the weighted sum of the returns on each of the two assets:
$r_{p,t} = w_1 r_{1,t} + w_2 r_{2,t}$, and the expected value of the return is $\mu_p = w_1\mu_1 + w_2\mu_2$.
The variance $\sigma_p^2$ of the portfolio is given by:
$$\begin{aligned}
\sigma_p^2 &= \mathrm{var}(w_1 r_{1,t} + w_2 r_{2,t}) \\
           &= \mathrm{var}(w_1 r_{1,t}) + \mathrm{var}(w_2 r_{2,t}) + 2\,\mathrm{cov}(w_1 r_{1,t}, w_2 r_{2,t}) \\
           &= w_1^2\,\mathrm{var}(r_{1,t}) + w_2^2\,\mathrm{var}(r_{2,t}) + 2 w_1 w_2\,\mathrm{cov}(r_{1,t}, r_{2,t}) \\
           &= w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2\sigma_{1,2}
\end{aligned}$$
(Note: $\mathrm{cov}(r_{1,t}, r_{2,t}) = \mathrm{cov}(r_{2,t}, r_{1,t})$, in other words $\sigma_{1,2} = \sigma_{2,1}$.)
We can choose the weights so as to minimise the portfolio variance.
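By straightforward calculus (not required), it can be shown that the values of $w_1$ and $w_2$ that
minimise the portfolio variance $\sigma_p^2$ are:
$$w_1 = \frac{\sigma_2^2 - \sigma_{1,2}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{1,2}}; \qquad w_2 = \frac{\sigma_1^2 - \sigma_{1,2}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{1,2}}$$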
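For readers who want to see the calculus, here is a sketch of the derivation. It uses only the portfolio variance formula and the constraint $w_1 + w_2 = 1$; substituting $w_2 = 1 - w_1$ gives a function of $w_1$ alone.

$$\begin{aligned}
\sigma_p^2(w_1) &= w_1^2\sigma_1^2 + (1 - w_1)^2\sigma_2^2 + 2 w_1 (1 - w_1)\sigma_{1,2} \\
\frac{d\sigma_p^2}{dw_1} &= 2 w_1\sigma_1^2 - 2(1 - w_1)\sigma_2^2 + 2(1 - 2 w_1)\sigma_{1,2} = 0 \\
w_1\left(\sigma_1^2 + \sigma_2^2 - 2\sigma_{1,2}\right) &= \sigma_2^2 - \sigma_{1,2} \\
w_1 &= \frac{\sigma_2^2 - \sigma_{1,2}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{1,2}}
\end{aligned}$$

The second derivative is $2(\sigma_1^2 + \sigma_2^2 - 2\sigma_{1,2}) = 2\,\mathrm{var}(r_{1,t} - r_{2,t}) \ge 0$, so the stationary point is a minimum (provided the two return series are not identical). The expression for $w_2$ follows from $w_2 = 1 - w_1$.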
Example 1
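Suppose the two assets have a variance-covariance matrix:
$$\begin{pmatrix} \sigma_1^2 & \sigma_{1,2} \\ \sigma_{2,1} & \sigma_2^2 \end{pmatrix} = \begin{pmatrix} 0.011332 & 0.002380 \\ 0.002380 & 0.005759 \end{pmatrix}$$
Calculate the variance-minimising weights of the two assets in a portfolio.
$$w_1 = \frac{\sigma_2^2 - \sigma_{1,2}}{\sigma_1^2 + \sigma_2^2 - 2\sigma_{1,2}} = \frac{0.005759 - 0.002380}{0.011332 + 0.005759 - 2 \times 0.002380} = 0.274$$
$$w_2 = 1 - w_1 = 0.726$$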
Note that more weight is given to the asset with the lower variance.
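Example 1 can be reproduced with a short script. The function names are introduced here for illustration; the variances and covariance are the ones given in the example.

```python
def min_variance_weights(var1, var2, cov12):
    """Variance-minimising weights for a two-asset portfolio whose weights sum to 1."""
    denominator = var1 + var2 - 2 * cov12
    w1 = (var2 - cov12) / denominator
    return w1, 1 - w1

def portfolio_variance(w1, w2, var1, var2, cov12):
    """w1^2 * var1 + w2^2 * var2 + 2 * w1 * w2 * cov12"""
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2 * w1 * w2 * cov12

var1, var2, cov12 = 0.011332, 0.005759, 0.002380   # values from Example 1
w1, w2 = min_variance_weights(var1, var2, cov12)
print(round(w1, 3), round(w2, 3))                   # 0.274 0.726

# The minimised variance is lower than the variance of either asset held alone
best = portfolio_variance(w1, w2, var1, var2, cov12)
print(best, var1, var2)
```

Moving the weights slightly away from these values in either direction increases the portfolio variance, which is a quick way to sanity-check the formula.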
Example 2
If the covariance $\sigma_{1,2} = 0$ then
$$w_1 = \frac{\sigma_2^2}{\sigma_1^2 + \sigma_2^2}, \qquad w_2 = \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2},$$
so the weight of one corresponds to
the proportion of the variance of the other.
Skewness and kurtosis and (non-)normality
Approximating a distribution by the normal distribution makes analysis easier. Assumption of normal
returns is widely used in portfolio allocation models and value-at-risk calculations and in pricing
options. But financial series are often non-normal.
Deviation from the normal distribution: Skewness
Sample skewness:
$$S = \frac{1}{T-1}\sum_{t=1}^{T}\left(\frac{X_t - \bar{X}}{s_X}\right)^3$$
If there is a long tail to the left, $S$ will be large and negative and $X$ is said to be negatively skewed, or
skewed to the left.
If there is a long tail to the right, $S$ will be large and positive, and $X$ is said to be positively skewed, or
skewed to the right.
For the distribution of S&P500 daily returns 1950-2011, $S = -1.0479$, so the distribution is
somewhat skewed to the left.
Recall that the median is the value such that half the values are greater than it and half are less.
Note that if the mean is greater than the median, then the distribution tends to be skewed to the
right. The intuition behind this is that very large values in the right tail will increase the mean
strongly, but the median cannot tell the difference between a value that is just above the middle
value, and one that is extremely large. So you can think of the large values as pulling the mean to the
right, but not affecting the median. A similar discussion holds for negative or left skew.
Deviation from the normal distribution: kurtosis
A symmetrical distribution can deviate from the normal distribution by being more or less peaked
than the normal distribution with the same variance, and/or by having heavier or lighter tails. We
have seen the case of more peaked and heavier tails in the distribution of daily log returns.
[Figure: Histogram of daily log returns for S&P500, 1957 - 2013, with the normal density overlaid.]
Sample kurtosis is
$$K = \frac{1}{T-1}\sum_{t=1}^{T}\left(\frac{X_t - \bar{X}}{s_X}\right)^4.$$
If there are many values far from the mean, $K$ will be large.
If $K$ is large (greater than 3, the value for the normal distribution), the distribution is said to be
leptokurtic; if $K$ is small (less than 3) the distribution is said to be platykurtic.
For the normal distribution, Kurtosis = 3; for the distribution of log returns for S&P500, Kurtosis =
32.0597.
Evidence of non-normality: Quantile-quantile graph:
To explain a quantile-quantile plot, consider as an example the point on the plot corresponding to
the 99th percentile. The 99th percentile of the standard normal distribution is 2.326, meaning that
there is a probability of 1% that a standard normal variable is greater than 2.326. The coordinates of
the little blue circle corresponding to the 99th percentile will be $(2.326, x)$ where $x$ is the value such
that 1% of the sample log returns are greater than $x$ and 99% are below $x$.
The line represents the expected quantiles if the distribution were normal. If the daily log returns
had a normal distribution $N(\bar{r}, s_r^2)$ with their actual mean and variance, then a graph of the
percentiles of the returns against the percentiles of the standard normal distribution would be the
straight line shown. Since the quantile points in the left tail are below the line, this means there are
more observations than expected for the normal distribution at low values. And at the upper end,
the fact that the quantile points are above the line means that there are more large observations
than would be expected for a normal distribution (e.g. the 99th percentile occurs at a higher value so
the 1% are further out). We conclude that the log returns have heavy tails relative to a normal
distribution.
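The construction of the points and the reference line can be sketched as follows. The return values are made up (a small sample with a few extreme observations), the empirical quantile rule is one simple choice among several, and the names `empirical_quantile` and `reference_line` are introduced here for illustration.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical daily log returns with a few extreme values (made-up numbers)
returns = [-0.09, -0.03, -0.012, -0.008, -0.005, -0.003, -0.001, 0.0, 0.001,
           0.002, 0.003, 0.004, 0.006, 0.009, 0.013, 0.035, 0.08]

def empirical_quantile(sorted_data, p):
    """Smallest observation with at least a fraction p of the data at or below it."""
    n = len(sorted_data)
    index = min(n - 1, max(0, math.ceil(p * n) - 1))
    return sorted_data[index]

std_normal = NormalDist()                     # mean 0, standard deviation 1
data = sorted(returns)
r_bar, s_r = mean(returns), stdev(returns)    # stdev uses the T - 1 divisor

# Each point pairs a standard normal quantile with the matching sample quantile
probs = [0.01, 0.05, 0.50, 0.95, 0.99]
qq_points = [(std_normal.inv_cdf(p), empirical_quantile(data, p)) for p in probs]

def reference_line(z):
    """Quantile the sample would have at z if it were N(r_bar, s_r^2)."""
    return r_bar + s_r * z

for z, x in qq_points:
    print(f"z = {z:6.3f}  sample quantile = {x:7.3f}  normal line = {reference_line(z):7.3f}")
```

In this made-up sample the lowest point falls below the line and the highest point falls above it, which is the heavy-tailed pattern described in the text.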
The Jarque-Bera test uses the values of S and K to determine whether there is sufficient evidence to
conclude that a distribution is not normal. The Jarque-Bera test statistic is
$$JB = \frac{T}{6}\left(S^2 + \frac{(K - 3)^2}{4}\right)$$
so that if the distribution is normal, JB should be close to 0.
The hypotheses are:
$H_0$: The variable is normally distributed (so $S = 0$, $K = 3$)
$H_1$: The variable is not normally distributed
If the null hypothesis is true, the test statistic JB has (approximately, in large samples) a chi-square
distribution with 2 degrees of freedom, $JB \sim \chi^2_2$. Note that the critical value of $\chi^2_2$ cutting off
an upper tail of area 1% is 9.21.
Therefore if we wish to perform the hypothesis test at the 1% level of significance, we would reject
the null hypothesis if the sample value of JB was greater than 9.21.
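The test can be sketched in a few lines. The values of S, K and T below are hypothetical (they are not the S&P500 values), and the sketch relies on the fact that for 2 degrees of freedom the chi-square upper tail probability is $P(\chi^2_2 > x) = e^{-x/2}$, so no statistical library is needed.

```python
import math

def jarque_bera(S, K, T):
    """JB = (T / 6) * (S^2 + (K - 3)^2 / 4)"""
    return T / 6 * (S ** 2 + (K - 3) ** 2 / 4)

def chi2_2df_upper_tail(x):
    """P(chi-square with 2 degrees of freedom > x) = exp(-x / 2)"""
    return math.exp(-x / 2)

# Critical value cutting off an upper tail of area 1%: solve exp(-x / 2) = 0.01
critical_1pct = -2 * math.log(0.01)
print(round(critical_1pct, 2))                # 9.21

S, K, T = -0.5, 4.0, 1000                     # hypothetical sample values
jb = jarque_bera(S, K, T)
p_value = chi2_2df_upper_tail(jb)
print(jb, p_value, "reject H0" if jb > critical_1pct else "do not reject H0")
```

A p-value that prints as zero to six decimal places means only that the upper tail probability is smaller than 0.0000005, not that it is exactly zero.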
Example: Consider the daily log returns of the S&P500 index. A histogram and descriptive statistics of
this series were obtained using EViews.
Question:
(a) Read off the value of skewness and kurtosis from the EViews output, and hence calculate
the sample value of the statistic JB. Compare your answer with the value given by EViews.
(b) Complete the test.
(c) How do you interpret the line Probability 0.000000?
Covariance and correlation
Investors wish to understand the relationship between the returns on the assets in their portfolio.
For example, if they tend to move in opposite directions then the combination can be less risky than
the individual holdings. Sample covariance between two time series of returns on two assets $r_1$ and
$r_2$ is defined by
$$s_{r_1,r_2} = \frac{1}{T-1}\sum_{t=1}^{T}(r_{1t} - \bar{r}_1)(r_{2t} - \bar{r}_2).$$
A positive sample covariance between returns on two assets suggests that their returns have a
tendency to move in the same direction, while a negative covariance suggests that they have a
tendency to move in opposite directions. If the covariance is approximately zero, then there is no
particular co-movement. The absolute size of covariance is not particularly informative. For this we
need to use instead correlation, which is a normalised unit-free version of the covariance.
The sample correlation coefficient is:
$$\hat{\rho}_{r_1,r_2} = \frac{s_{r_1,r_2}}{\sqrt{s_{r_1}^2\,s_{r_2}^2}} \quad\text{where}\quad s_{r_i}^2 = \frac{1}{T-1}\sum_{t=1}^{T}(r_{it} - \bar{r}_i)^2.$$
Thus correlation is the covariance, divided by the two standard deviations. The correlation is
unit-free and lies between $-1$ and 1.
Some examples of covariances and correlations along with graphs of the time series:
 | RIO and BHP | NAB and BHP | Fairfax and BHP | Fairfax and NAB
Covariance | 320.97 | 54.46 | 3.56 | 0.12
Correlation | 0.97 | 0.71 | 0.33 | 0.03
A company that has negative correlation with the market as a whole, as represented by the market
index, is known as a negative beta asset (the reason will be explained in future lectures) and has
the advantage that it is a hedging instrument against the overall market.
Auto-covariance and autocorrelation will be discussed in the dynamic time series part of the unit
(Week 6 onwards).
[Figure: Time series for share prices of several Australian companies (NAB, Fairfax, RIO, BHP). Vertical axis: share price ($).]