portfolio selection with a bayesian approachlup.lub.lu.se/.../record/5155710/file/5155711.docx ·...

Portfolio Selection with a Bayesian Approach

Bachelor’s thesis, 15 ECTS

Author: Martin de Caprétz

Supervisor: Stepan Mazur

Spring 2015

AbstractThis paper deals with a traditional method for creating portfolios of financial assets known as the

mean-variance portfolio theory and especially a specific case of the theory known as the global

minimum variance portfolio. A disadvantage with many of the models that exist within portfolio

theory is that they do often only consider a few quantified variables in the calculations and this could

cause problems in areas such as finance where a lot of the information regarding investments can be

difficult to quantify into numbers.

In the paper Bayesian statistical methods are used to try to implement the beliefs of an investor and by

testing and evaluate different approaches examine if it is possible to find a method that take both

calculated variables and an investor’s beliefs into account.

Table of Contents

1. Introduction................................................................................................................................3

2. Modern Portfolio Theory............................................................................................................4

2.1 Mean –Variance Portfolio Theory.............................................................................................4

2.2 The portfolio’s characteristics...................................................................................................5

2.3 Global Minimum Variance Portfolio (GMVP)............................................................................6

2.4 Diversification...........................................................................................................................7

2.5 Estimating expected return and standard deviation.................................................................8

3. Data............................................................................................................................................9

4. Bayesian Statistics....................................................................................................................10

4.1 Bayes’ Theorem......................................................................................................................10

4.2 Informative and non-informative priors.................................................................................11

4.3 Prior distributions...................................................................................................................11

4.4 Posterior distributions............................................................................................................13

5. Results......................................................................................................................................16

5.1 Global Minimum Variance Portfolios......................................................................................16

5.2 Distribution of the weights.....................................................................................................18

6. Summary..................................................................................................................................23

References........................................................................................................................................24

Appendix..........................................................................................................................................25

Matrices........................................................................................................................................25

2

1. Introduction

The financial markets have grown extremely fast over the last couple of decades, the liquidity of the

markets has drastically increased and the number of financial instruments has virtually exploded. This

has led to a need and demand for models that try to determine the optimal financial decisions to make

in any given situation. One of the earliest and still most popular methods to use when constructing a

portfolio is the mean-variance theory; it is a fairly simple way of finding the optimal weights for your

assets under relatively simple assumptions.

Most theoretical models generally have one thing in common and that is that they are often very strict

and impersonal in that they doesn’t take all the factors that a human investor would into consideration.

It could be specific personal beliefs in the future of a market, specific conditions that might occur or

rumors about future events that aren’t possible to quantify. The purpose of this paper is to see if a

model can be created for which both the theoretical model and the subjective beliefs of an investor are

taken into consideration while constructing the portfolio.

This paper will use one special case of the mean-variance portfolio theory known as the global

minimum variance portfolio; it is a method to find the combination of the available assets with the

lowest possible risk.

In the paper five different indices from Morgan Stanley Capital International (MSCI) will be used as

assets and the indices themselves will not be of important in any other way other than that they will be

used to demonstrate the results from the calculations.

In the second part we will use a Bayesian approach to try and implement the subjective beliefs into the

models. Four different ways will be examined and analyzed to see if they manage to take the

investor’s beliefs into account and create a portfolio which utilizes both theoretical data and personal

beliefs.

All the formulas in the paper will be expressed in the form of matrices and a short explanation of the

matrix operations used in the paper are in the appendix; matrices are throughout the paper denoted by

bold letters.

3

2. Modern Portfolio Theory

The foundation to modern portfolio theory is considered to be laid with an article by Harry Markowitz

(1952). The idea in the article that made it catch on was that Markowitz introduced variance as a

measurement of risk and that it should complement the expected return as the main criteria for

portfolio construction. The variance of a portfolio of assets depends on the covariance between the

assets and that is what made the method plausible. Markowitz later received the Nobel Prize in

Economy in 1990 for his contributions to the portfolio theory.

The article led to the development of the mean-variance portfolio theory. It is a method were the

calculations of the weights for the assets in the portfolio are only based on two factors, the expected

return and variance of each asset. Due to the strict relationship between the variance and the standard

deviation the latter is often used as a risk measurement as well.

2.1 Mean –Variance Portfolio Theory

Mean-variance is in its simplest form a very easily calculated method and the purpose of the method is

to make more efficient investment decisions. The fundamental task with the method is to find an

efficient relation between the expected return on your capital and the risks that you have to take to get

it.

Figure 1

The basic idea of the theory can be shown in a simple figure with four different assets represented by

their expected return and standard deviation. The optimal scenario would be to have a high expected

return with a low standard deviation thus placing us in the top left corner. Of the four assets in the

figure asset A is the one that would be considered the best one since it has got the same expected

return as B but with lower standard deviation and it has got the same standard deviation as C but a

4

C

A

D

B

Standard deviation

Expected Return

higher expected return. The same arguments could be used to argue for that D is the worst asset of the

four. It is however not possible to make any conclusions on if B or C is to be preferred, that decision

depends on the investor’s personal preferences when it comes to risk. Asset B is the more risky asset

but it is compensated with a higher expected return and if you are not risk averse you would consider

B to be better.

2.2 The portfolio’s characteristics

The characteristics for a portfolio with n assets are calculated through two common statistical

formulas. The expected return for the portfolio is calculated by multiplying the weights invested in

each asset with that assets expected return and sum the factors

r p=∑i=1

n

w ir i ,

where

(1)

∑i=1

n

wi=1.

The variable ri is the expected returns and w i represents the percentage weights, they weights have to

add up to one to represent a full portfolio. It is possible for the w i to be negative or larger than one if

short sales are allowed. Short sales are a negative position in an asset which in reality means that an

investor is borrowing and selling the asset and this will give a profit for the investor if the assets price

falls since eventually the asset has to be bought back to return to the lender.

The variance of the portfolio is calculated by multiplying the squared weights of an asset with the

variance of that asset and this is done for all assets. In a portfolio of multiple assets the correlation

between the assets has to be taken into account and this is done by multiplying the weights of two

assets with their covariance σ ij. These calculations can be expressed in an equation as

σ p2=∑

i=1

n

∑j=1

n

wi w jσ ij .(2)

5

Equation 1 and 2 can also be expressed by matrices as

r p=wT R , (3)

σ p2=wT Σw, (4)

where

w is a vector of the weights;

R is a vector of the expected returns;

Σ is the covariance matrix.

2.3 Global Minimum Variance Portfolio (GMVP)

The GMVP is the specific portfolio of a number of assets where the variance is minimized

GMVP=min ( wT Σw ) such that wT 1=1. (5)

It is easy to calculate by using modern computer software and if short sales are allowed it can also be

found through the expression

wGMV=Σ−11

1T Σ−11.

(6)

The true covariance matrix Σ is in reality never known so an estimate S will be used instead and it is

calculated by the equation

S= 1n−1∑i=1

n

(X i−X)( X i−X )T (7)

with

X= 1n−1∑i=1

n

X i ,

where X i is the independent observations of daily return for the assets.

6

2.4 Diversification

Diversification is the reason to why it is possible to create a GMVP and the essence behind it is

equivalent to the saying “don’t put all your eggs in the same basket”

There are different types of risks associated with an investment in different market and in specific

asset. There are risks that originate from the general economy such as interest rates, exchange rates,

inflation and business cycles. These are all macroeconomic factors and none of them can be predicted

with certainty but they affect all companies and commodities in some way. Besides these risks for the

broader economy there are risks associated with specific assets, a mining company might be exposed

to risks in the price on minerals, a farmer on the other hand might consider the weather the biggest risk

for their business.

Just holding one asset in a portfolio makes you very exposed for the risks associated with that specific

asset; if you instead include two assets in the portfolio from companies with very different businesses

you would be able to reduce the overall risk for your portfolio. There is of course no reason to stop

with just two stocks; it would be possible to continue adding assets to the portfolio and thus spreading

the asset-specific risks. There is however no way to reduce risk all together, even with a very large

number of assets there would still be an exposure to risk of the general market. There is no way to

diversify the portfolio in such a way so that it is made risk neutral to all the risks associated with the

general economy. If there were such an portfolio the expected return of that portfolio would due to

arbitrage theory be equal to the risk free rate in the economy.

When common sources of risk are connected with all assets in the portfolio, risk cannot be reduces

altogether even with extensive diversification. The risk that remains after diversification is called

market risk or systematic risk. The risk that we can diversify away is called asset-specific risk or

nonsystematic risk.

This could be expressed mathematically, if all covariances σ ijare assumed to be positive and an equal

amount1n is invested in the n different assets then

σ p2=∑

i=1

n

(1n)

2

σ i2+∑

i=1

n

∑J=1i ≠ j

n

( 1n )( 1

n )σ i jn−1n−1

=¿ 1n

σ2+ n−1n ∑

i=1

n

∑j=1i ≠ j

n σ ij

n (n−1 )=1

nσ2+ n−1

nσ ij ,¿

(8)

where σ is the mean of the variances and σ ij is the mean of the covariances between the assets.

In the case when n→ ∞ it holds that σ p2→ σ ij.

7

The variance of the portfolio would equal the average of the covariances and this can be illustrated

through figure 2.

Figure 2

2.5 Estimating expected return and standard deviation

The main question that hasn’t been answered so far and the constant problem with predictions for the

future are how the expected returns and standard deviations should be estimated.

It is difficult to predict the expected return on an asset for the future since there are many factors that

could affect it and very often asset prices are considered to be stochastic and thus making it impossible

to predict the expected return. If an investor is well informed and follows the market closely then that

person might have a personal belief for what the future holds for different companies and those beliefs

could be used as estimates. There are also companies that make opinion polls regarding expectations

in the market and they usually asks investors about their beliefs of the future stock prices and later

derives statistics from the polls.

There are theoretical methods for calculating the markets expected return on assets and the most

prominently is probably the Black-Litterman model.

The easiest and often the easiest approach is to base the predictions of the future on the past; it is a

unsophisticated approach since there is no guarantee that a company or an asset that has been

producing a high return in the past will continue to produce a high return. The focus of this paper is to

find a way to combine a traditional portfolio model with personal beliefs of an investor, the expected

returns and variances will therefore just be used to illustrate the calculations. This means that the

method used to estimate the expected returns and variances will not be of any important to the results

in the paper and thus the simplest approach to estimate these variables will be used and that is to

estimate them from the historical data.

8

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 490

0.10.20.30.40.50.60.70.80.9

Diversification

Number of Assets

Portfolio Variance

3. Data

The data that will be used in this paper are from the MSCI (Morgan Stanley Capital International).

They are one of the world’s leading providers of support tools for investment decisions and they

compute indices for different capital markets around the world which are frequently used as

benchmarks or as tracks for fund managers.

The time series for the indices are, as is standard with financial time series, converted into log-returns

according to the formula

rt=ln( S t

S t−1) ,

(9)

where St is the index value for the day t and St-1 is the index value for day t-1.

The indices used in this paper have been chosen because they are major economies in the world or

because they were in another way deemed interesting. The five indices are for Germany, Japan, the

Nordic countries, the United Kingdom and the United States and the observations are from all business

days in the period between January 1, 2010 and December 31, 2013.

The data will be split into two subsets, one set contains data for the years 2010 and 2011 and the

second set is for the years 2012 and 2013. The reason to why the data is split into two samples is that

the first period will serve as a simulation of the investor’s beliefs while the second period will be

considered as the data for which the mean-variance analysis would have been done in January of 2014.

9

4. Bayesian Statistics

The most common approach in traditional statistical theory is that there is a parameter θ that one

wishes to estimate, give confidence intervals or do hypothesis tests on. This parameter is usually

considered to be fixed but unknown, within the Bayesian field of statistics this parameter will instead

be a stochastic variable with some distribution, known as the prior distribution. The prior distribution

can be based on previous estimates of θ or could be a subjective belief about the likelihood of different

values on θ. It is also possible for it to be uniformly distributed and thus show no or very limited

information about the distribution of the parameter θ.

The traditional way of formulating a statistical model for n observations (x1,….,xn) is to treat them as

turn-outs of stochastic variables (X1,….,Xn) with a distribution that depends on the parameter θ. With a

continuous distribution it would be expressed as

f X1 , …, X n(x1 ,…, xn;θ) ( x1 ,…, xn )∈Rn .

The probabilities for getting the data we have got is usually calculated by the likelihood function

L(x1 ,…, xn;θ), in short the Maximum Likelihood method means that an estimate θ̂ is the value that

maximizes L(x1 ,…, xn;θ).

4.1 Bayes’ Theorem

The fundamental part of Bayesian statistics is Bayes’ theorem which can be expressed as

P ( A|B )=P (B|A ) P( A)

P(B),

(10)

where

P(A|B) is the conditional probability and is the beliefs in A when we take B into account;

P(A) is the prior probability, what are the beliefs for that A happens;

P(B|A)/P(B) is a quotient that represents the support B gives to A.

Often in Bayesian inference the event B is fixed and the effects of varying A are what one is interested

in. We can from Bayes theorem show that the posterior probabilities are proportional to the numerator,

this leads to that the posterior is proportional to the prior times the likelihood and it can

mathematically be expressed as

P ( A|B )∝ P( A)∙ P(B∨A)

10

or in words as

posterior∝ prior ∙ likelihood .

4.2 Informative and non-informative priors

One aspect about Bayesian inference that makes it special is that you have to guess of assume the

distribution of the parameters you are estimating and there is many ways in which this can be done.

You could for example have some experience about the area you are investigating and thus have a

fairly good belief of how these parameters vary; it would then be possible for you to choose a specific

distribution to go with your beliefs. If you on the other hand don’t have much information about the

distribution of the parameters you estimate then it would be possible to use a more diffuse distribution

where you keep the options open about the distribution and base your assumption solely on the

gathered observations. The distributions that you choose for the parameters are known as prior

distributions or simply priors and this is because you choose them before you actually have any data to

work with.

It is usually possible to divide the prior distribution into two categories depending on the amount of

information you include in your prior, the priors are either informative or non-informative.

In an informative prior we express specific information about the parameters that we might know or

have a strong belief in. The non-informative prior on the other hand are vaguer and expresses very

limited information about our parameters.

4.3 Prior distributions

Four priors will be considered in this paper and they are all from a paper by Bodnar et al. (2015) and a

summary of them will be issued here.

We will consider the linear transformations of the GMVP weights θ

θ=L wGMV ,

where L is an arbitrary p × k matrix of non-zero constants with p < k.

11

4.3.1 Priors for and μ Σ

The first two priors focus on statistical models for the average returns μ and the covariance matrix Σ.

The first prior is a standard diffuse prior, applied in portfolio theory by Barry(1974), Brown(1976)

and Klein and Bawa(1976). This is a non-informative prior and its densities is given by

pd ( μ , Σ )∝|Σ|−k +1

2 .(11)

The second prior is a conjugate prior proposed by Frost and Savarino(1986), the conjugate is an

informative prior with a normal prior for μ(conditional on Σ) and an inverse Wishart prior for Σ, the

joint prior for these two is

pc (μ , Σ )∝|Σ|−υc+1

2 exp {−κ c

2 ( μ−μc )T Σ−1 ( μ−μc )−12

tr [Sc Σ−1 ]},(12)

where

μcis the prior mean;

κ c is a parameter representing the precision of μc;

υc is a parameter representing the precision of Σ;

Sc is a known prior matrix of Σ.

4.3.2 Priors for the weights

The next two priors make statements directly about the portfolio weights. This can tend to make more

sense from an investors perspective since it is natural for them to have preferences about the

composition of their portfolios rather than about average returns or covariance matrices.

The Jefferys non-informative prior which is given by

pn (θ ,Ψ , Ϛ )∝Ϛp2−1

|Ψ|−p2 −1

.(13)

Both Ψ and Ϛ represents linear transformations of the covariance matrix but they will not be

considered in greater detail since they will not affect the later calculations.

The last prior is an informative prior similar to the one developed by Tunaru (2002):

12

θ N p(wI , 1Ϛ

Ψ−1) ,

Ψ W p (υI , SI ) ,

Ϛ Gamma ( δ1 ,2 δ2 ) ,

where

w Iis the prior mean;

υI is a parameter representing the precision of Ψ;

S I is a known prior matrix of Σ;

δ 1∧δ 2are prior constants.

4.4 Posterior distributions

The posterior distributions are calculated from the priors by multiplying the prior distribution with the

likelihood function for the data and subsequently integrate out unwanted parameters. The theoretical

calculations are done by Bodnar et al. (2015) and the results from those calculations will be used in

this paper.

For all of the posteriors expressed below Xi is considered to be independent and identically distributed

with Xi Nk(μ,Σ)..

4.4.1 Models based on and μ Σ

The posterior for θ under the diffuse prior is a multivariate t-distribution expressed as

θ∨X1 … Xn t p(n−1 ; θ̂ ; 1n−1

L Rd LT

1T S−11 ) ,(14)

where

θ̂= L S−1 11T S−11

,

Rd=S−1−S−111T S−1

1T S−11.

13

The t-distribution is expressed through three parameters, the first is the degrees of freedom, the second

is the mean vector and the third is dispersion matrix which has got a similar function as the covariance

matrix in the normal distribution.

The posterior for θ under the conjugate prior also results in a multivariate t-distribution

θ∨X1 … Xn t p(υc+n−k−1 ;LV c

−111T V c

−11; 1

υc+n−k−1L R c LT

1T V c−11 ), (15)

where

rc=n X+κc μc

n+κc,

V c=( n−1 ) S+Sc+ (n+κc ) rc r cT +n X XT+κc μc μc

T ,

Rc=V c−1−

V c−111T V c

−1

1T V c−11

.

4.4.2 Models based on the weights

The posterior that have been derived under the Jeffreys non-informative prior for the GMVP weights

θ are

θ∨X1 … Xn t p(n−k+ p ; θ̂ ; 1n−k+p

L Rd LT

1T S−11 ) .(16)

The posterior is a multivariate t-distribution just as we have had on previous two posterior

distributions; the only difference to the posterior under the diffuse prior is the degrees of freedom.

The posterior for the informative prior is in contrast to the others not a t-distribution and it is

expressed as

pI (θ∨X1 … Xn )∝ [ (θ−w I )T (S I−1+(n−1 ) (L Rd LT )−1)−1

(θ−w I )](n−k +2 p+2 δ1 )

2 ∗U ( (n−k+2 p+2 δ1 )2

;( p+2 δ 1−υI+1 )

2;g (θ )) ,

(17)

14

where U(∙;∙;∙) is an confluent hypergeometric function expressed by Abramowitz and Stegun(1972)

and

g (θ )=n−12

( (θ−θ̂ )T (L Rd LT )−1 (θ−θ̂ )+( 1T S−11 )−1)+ δ2−1

n−1

(θ−w I )T (S I−1+ (n−1 ) ( L Rd LT )−1 )−1

(θ−w I ).

This expression is difficult to compute so a stochastic representation for θ will be used and it can be

expressed as

θ=rI+Ϛ−1

2 (V I)12 z0 ,

(18)

where

z0 N (0p , I p),

Ϛ Gamma( n−k+2 p+2δ 1

2, 2hI

) ,

τ Gamma( n−k+p+υI−12

,2) ,

with

P1=(S I−1+(n−1 ) ( L Rd LT )−1)−1

,

P2=(n−1 ) (L Rd LT )−1 ,

r=δ2−1+ (n−1 ) (1T S−1 1 )−1 ,

V I=(τ P1+P2)−1 ,

r I=( τ P1+P2 )−1 (τ P1 w I+P2 θ̂ ) ,

15

h I=r+τ w IT P1 w I+¿ θ̂T P2 θ̂−rI

T V I−1 rI .

For the posteriors there are some assumptions that have to be considered and they will be the same as

in the paper by Bodnar et al. (2015)

δ 1=1 ;δ2=0,5 ;υc=κc=υI=n .

L is set equal to the basis vector ei for each dimension which means that for the first dimension it

equals

e1=[ 1 0 0 0 0 ] .

This will allow for the distributions to be plotted individually as independent t-distribution.

5. Results

5.1 Global Minimum Variance Portfolios

The first results are from the mean-variance analysis of the historic data, the data were split into two

different samples. The first one consists of the period from 1st of January 2010 to 31st of December

2011 and the second is from 1st of January 2012 to 31st of December 2013, both periods contain 522

observations each. All calculations and simulations of the graphs in this part have been done in R.

The average daily log-return for the first period is displayed in table 1.

Table 1 German

y

Japan Nordi

c

UK US

Average log-return(×10-4) -1,253 -4,646 0,019 0,491 2,356

The most noticeable aspect of this result is that the results differ a lot for different indices; both

Germany and Japan have had a negative return for the period while the US have been doing a lot

better. All of the average daily returns are very close to zero and it is not uncommon in some financial

calculations to assume that the average daily return are zero for short time periods.

The covariance matrix for the period is summarized in table 2.

Table 2 Germ. Japan Nordic UK US

Germany 2,275 0,499 1,894 1,617 1,416

Japan 0,499 1,564 0,452 0,405 0,238

Nordic 1,894 0,452 1,945 1,479 1,305

16

UK 1,617 0,405 1,479 1,438 1,090

US 1,416 0,238 1,305 1,090 1,675

These two results are enough to calculate the GMVP based on the first periods results. This is done by

finding the portfolio that minimizes the portfolio variance and the weights of that portfolio is

presented in table 3.


GMVP weights -0,394 0,427 -0,030 0,63

9

0,358

The portfolio has a large negative weight in the German index and just a slight negative weight in the

Nordic index. The remaining three indices have solid positive weights which suggest positive

investments in those markets. This result is a bit peculiar in that the result stipulates that a lot of the

investments should be in the asset with the lowest average return for the period. This is due to the fact

that in the definition of the GMVP the only criteria used are that the portfolio variance should be

minimized and no consideration is taken to the expected return. The main reason to the large weight in

the Japanese index is due to the fact that it has got a weaker correlation with the other markets and

including it in the portfolio results in a larger diversification effect.

The fact is that if the expected return for the portfolio with the weights above were to be calculated it

would actually be a negative expected return which might not be what an investor is looking for.

The same calculations are carried out for the second period and they gave following results:


Average log-return(×10-4) 8,227 11,31

2

6,072 3,62

2

7,434

The average returns are presented in table 4 and they are a lot higher for this period and the two

indices with the worst return for the previous period now have the two highest returns. The weak

results for the previous period can largely be explained by the euro crisis in 2011 when a number of

countries within the Euro-zone were in severe financial problems and that caused concerns on the

financial markets.

17


Germany 1,065 0,256 0,806 0,694 0,472

Japan 0,256 1,622 0,280 0,221 0,121

Nordic 0,806 0,280 0,819 0,600 0,393

UK 0,694 0,221 0,600 0,653 0,359

US 0,472 0,121 0,393 0,359 0,547

The covariance matrix is presented in table 5 and it can be noted that it is weaker covariances and

variances in this matrix compared to the one for the previous period and this is also an effect of the

more stable developments on the markets during the second period. It is a well-established fact within

finance that markets tend to have higher correlation in times of trouble.

The calculations of the GMVP for this are presented in table 6.


GMVP weights -0,310 0,167 0,136 0,44

2

0,565

It is still a negative weight for the German index which means that short selling it is the best option

under these conditions. All of the other indices have positive weights and most of the investments

should be placed in the UK and US indices.

There are quite clear differences between the two periods which is interesting for the future

calculations, the first period will now serve as the prior information and that is the beliefs of the

investor while the second period will be the GMVP for the start of 2014 based on the last two years of

data.

5.2 Distribution of the weights

For the diffuse prior the distribution for the weights are multivariate t-distributed with 521 degrees of

freedom, the means will be identical to the weights calculated for the GMVP. A dispersion matrix is

also calculated and it indicates how the distribution varies around its mean.

18

The means are presented in table 7 and the dispersions in table 8.


Means -0,310 0,167 0,136 0,442 0,565


Dispersions (×

10−3 ¿

3,364 0,446 4,244 4,094 1,736

Figure 3

The distribution for the Japanese weight has the lowest dispersion and thus results in a thin and tall

distribution; the distribution for the US weight is also slightly thinner and taller than the remaining

three indices which are all quite similar in shape. This posterior distribution is based on a non-

informative prior and does not take the investor’s beliefs into account; it is instead solely based on the

GMVP results.

The second prior introduced was the conjugate prior which is an informative prior; the posterior

distribution is a multivariate t-distribution just as in the case of the diffuse prior. The results are a bit

different since this posterior doesn’t use the GMVP weights as the means of the posterior distributions,

the degrees of freedom and the dispersions are also different compared to the diffuse prior.

The means for the posterior distributions are presented in table 9 and the dispersions in table 10.


19

Means -0,320 0,165 0,137 0,461 0,557


Dispersions (×

10−3 ¿

1,707 0,227 2,166 2,069 0,882

Figure 4

In a comparision between the diffuse and conjugate the means differs slightly for the distributions but

the main difference is that the conjugate prior produces distributions with a lower dispersion for all

five distributions, the conjugates dispersions are about half of the ones for the diffuse prior. The

similaritites in the results mean that the conjugate have failed to incorporate the prior information in a

satisfying way since the beliefs of the investor is not visible in the results from the posterior

distribution.

The first two distribution were derived from prior assumptions about the distribution of the mean and

covariance matrix, the following two distributions are derived directly from prior assumptions about

the GMVP weights.

This distribution was derived from the Jeffreys non-informative prior and as the previous two it is also

expressed through a multivariate t-distribution. The result for this distribution is very similar to the one

for the diffuse prior, the only difference is the number of degrees of freedom.

20

The means and dispersions are presented in table 11 and 12

Table 11 Germ. Japa

n

Nordic UK US

Means -0,310 0,167 0,136 0,442 0,565


Dispersions (×

10−3 ¿

3,338 0,448 4,269 4,118 1,746

Figure 5

This result is almost identical with the diffuse priors in that the distribution for the Japanese index is

tall and thin, the distribution for the US index is a bit taller than the remaining three are all similar to

each other. This is not surprising since they are both based on non-informative priors and are thus only

based on the distribution for the actual observations.

The last posterior distribution is for the informative prior with respect to the GMVP weights, the

characteristics of the distribution is based on 25.000 simulations from the stochastic representation in

equation 17 and the results from the simulations are presented in table 13 and 14.

Table 13 Germ. Japa

n

Nordic UK US

21

Means -0,380 0,247 -0,005 0,599 0,431

Table 14 Germ. Japan Nordi

c

UK US

Variances (×

10−2 ¿

5,433 2,940 6,421 8,089 5,775

Figure 6

It is clearly noticeable that the distributions here are significantly wider and not as spiky, the prior

information has been taken into account and the center for the distributions are located approximately

halfway between the investor’s beliefs and the weights for the GMVP.

Table 15 shows the comparison between the investor’s beliefs, the weights from the GMVP

calculations and the means of the posterior distribution with the informative prior.

Table 15 Investor’s beliefs GMVP weights Posterior weights Scaled weights

Germany - 0,394 - 0,310 - 0,380 -0,426

Japan 0,427 0,167 0,247 0,277

Nordic - 0,030 0,136 - 0,005 -0,006

UK 0,639 0,442 0,599 0,672

US 0,358 0,565 0,431 0,483

Sum 1 1 0,892 1

22

It seems as the results are in general closer to the investor’s beliefs but there is a problem in that the

weights from the simulation of the posterior distribution do not sum up to one as is necessary to have a

full portfolio. The weights would have to be adjusted in a way so that the sum will be one and the

easiest way of doing so is to divide all the posterior weights with 0,892, this will result in the scaled

weights presented in the table above. It is possible to consider different methods for the scaling of the

weights since the method used here give that the weight in the German index is actually smaller than

both the investor’s belief and the GMVP weight, the opposite is true for the UK index were the scaled

weight is larger than both the investor’s belief and the GMVP weight.

6. Summary

The time series for the indices were divided into two periods and a mean-variance analysis was

conducted for both periods. The result for the first period served as a representation for what could

have been an investor’s beliefs for the future while the second period served as the historic data for

which the weights of a portfolio based on traditional portfolio theory is derived from.

Four different prior distributions were introduced; two priors were based on the μ and Σ with one

informative and one non-informative prior. The posterior distributions for these priors were calculated

23

and they turned out to be very similar which wasn’t in line with the purpose since the informative prior

failed to incorporate the information that represented the investor’s beliefs.

The remaining two priors were instead based directly on the GMVP weights and just as with the

previous two there was one informative and one non-informative prior. The non-informative prior

posed a result very similar to the first two priors while the informative prior instead used the prior

information to create distributions which considered both the investors beliefs and the result from the

mean-variance estimations. There is a flaw with the use of the informative prior and that is that the

weights of the simulated distributions don’t sum up to one, it is easy to scale the weights so that they

do sum up to one but this causes different problems. A better method for the scaling can probably be

used and it could be and interesting continuation of this paper. We can make the conclusion that the

informative prior was the only prior that created a result that was consistent with the goal of the paper

and could be a possible alternative in creating more personalized portfolios.

References

Abramowitz M. and Stegun I. A. (1972). “Handbook of Mathematical Functions With Formulas, Graphs and Mathematical Tables”. New York, Dover.

Barry C.B. (1974). “Portfolio Analysis under uncertain means, variances, and covariances”. Journal of Finance 29, 515-522

Bodnar T., Mazur S. and Okhrin Y. (2015). “ Bayesian Estimation of the Global Minimum Variance Portfolio”, Working Paper.

24

Brown S. J. (1976). Optimal Portfolio Choice under Uncertainty: A Bayesian Approach. Ph.D. Diss., University of Chicago

Frost P.A. and Savarino J.E. (1986). “An empirical Bayes approach to efficient portfolio selection”. Journal of Financial and Quantitative Analysis 21, 293-305.

Klein R.W. and Bawa V.S. (1976). “The effect of estimation risk on optimal portfolio choice”. Journal of Financial Economics 3, 215-231

Markowitz H.M. (1952). “Mean-variance analysis in portfolio choice and capital markets”. Journal of Finance 7, 77-91.

MSCI. ”MSCI Index Performance”. [2014-10-15]. <http://www.msci.com/products/indexes/country_and_regional/dm/performance.html>

Tunaru R. (2002). “Hierarchial Bayesian models for multiple count data”. Austrian Journal of Statistics 31, 221-229

Appendix

Matrices

The majority of formulas used in the paper are expressed in the form of matrices and they generally

simplify the calculations when we use computers to perform our calculations. A few basic definitions

and applications of matrix calculations will be introduced below and matrices will throughout the

paper be defined with a bold letters.

25

A matrix is a rectangular array consisting of a number of rows and columns, a matrix with three rows

and three columns would look like this:

M=[a b cd e fg h i ]

A matrix with the same number of rows as columns is known as a square matrix.

A vector is a matrix that only consists of one column (or row) and thus could look like

V=[abc ]A transpose of a matrix is when it is tipped over so that the first row becomes the first column, the

transpose of the matrix M is popularly noted with a superscript T as MT and it would be

M T=[a d gb e hc f i ]

The transpose of the vector V is

V T=[ a b c ]

Two matrices of the same size can be added together by simply adding the corresponding elements

and the result will be a matrix of the same size as the first two.

A+B=[a bc d ]+[e f

g h]=[ a+e b+ fc+g d+h]

Multiplication of matrices is only possible if the number of columns in the left matrix equals the

number of rows in the right matrix. The values in the matrix product are given by the formula

[ AB ]i , j=∑r=1

n

A i ,r Br , j

26

where A have got n columns and B have n rows, the i and j represents the row and column of the

product matrix. Two 2x2-matrices would give calculations like these:

[a bc d ][ e f

g h ]=[ae+bg af +bhce+dg cf +dh ]

It is worth noticing that matrix multiplication is not commutative which means that AB ≠ BA which is

demonstrated by the result below

[ e fg h] [a b

c d ]=[ ea+ fc eb+fdga+hc gb+hd ]

The identity matrix is a square matrix that consists of ones on the main diagonal (the upper left to the

bottom right) and zeros everywhere else. It is denoted as I n.

I 3=[1 0 00 1 00 0 1]

The inverse of a matrix is a corresponding matrix that if it is multiplied with the original gives the

identity matrix. The inverse of A is denoted with the superscript -1 as A-1.

A A−1=I n

The determinant of a matrix can be defined through the Leibniz formula

det ( A )=|A|=∑x∈ Sn

sgn(x )∏i=1

n

ai , x i

Where the sum is computed over all permutations of the set {1,2,…,n} and the set of all such

permutations is Sn, the x:s are the permutations within Sn. Sgn(x) is a function that either takes the

value +1 or -1, it is +1 whenever the reordering given by x can be achieved by successively

interchanging two entries an even number of times, and −1 whenever it can be achieved by an odd

number of such interchanges

The trace of a matrix is the sum of the elements on the main diagonal.

27

tr ( A )=tr [a b cd e fg h i ]=a+e+ i

28

portfolio selection with a bayesian approachlup.lub.lu.se/.../record/5155710/file/5155711.docx ·...

Documents