ex 2 solution

8/13/2019 Ex 2 Solution

Econ 353 Spring 2006

Exercise 2: Numerical Analysis and Simulation using Matlab

Part I: Time costs of computing (10 marks)

Write a Matlab script that performs the following:

1. Using the rand command, generate a vector of 500 observations uniformly distributed between 0 and 10000; call this vector a.

2. Again using the rand command, generate another vector of 500 observations uniformly distributed between 0 and 1; call this vector b.

3. Repeat the following operations for 10000 iterations:

   i. The sum of a and b
   ii. The difference of a and b
   iii. The scalar (element-by-element) product of a and b
   iv. The scalar (element-by-element) division of a and b
   v. The square root of a
   vi. The exponential function of a
   vii. The sine of a
   viii. The tangent of a

Look up the profile command in the Matlab help files. Use the profile command to compare the computing time of these operations. Express the time cost of each operation as a ratio relative to i.; report these ratios in a table.

Suggested solution

The M-file ops.m executes steps 1 to 3 as follows:

    T = 10000;
    n = 500;
    rand('state', 1);
    a = 10000*rand(n,1);
    b = rand(n,1);

    for i = 1:T
        a+b;
        a-b;
        a.*b;
        a./b;
        sqrt(a);
        exp(a);
        sin(a);
        tan(a);
    end




The Matlab profiler traces the sequence of calls to functions and scripts and analyses how much time is spent executing specific operations. The profile command includes an option for reporting the detail for built-in functions and operators like addition and multiplication. Running the commands

    >> profile on -detail operator
    >> ops
    >> profile report opsreport

produces a report in HTML format. The report states how much time (in seconds, and as a percentage of the total) was spent on the following lines:

                    7: for i = 1:T
    0.12    1%      8:     a+b;
    0.19    2%      9:     a-b;
    0.13    1%     10:     a.*b;
    0.23    3%     11:     a./b;
    1.88   21%     12:     sqrt(a);
    2.54   29%     13:     exp(a);
    1.57   18%     14:     sin(a);
    2.09   24%     15:     tan(a);
    0.04    0%     16: end

Expressing the computing times as ratios relative to a+b:

    Operation    Index
    a+b           1.00
    a-b           1.58
    a.*b          1.08
    a./b          1.92
    sqrt(a)      15.67
    exp(a)       21.17
    sin(a)       13.08
    tan(a)       17.42

Thus we can see that the last four operations are an order of magnitude more costly than the elementary arithmetic operations.
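The index column can be reproduced directly from the profiler times. The following is a minimal sketch; the variable names ops_names, times and index are illustrative and are not part of ops.m.

```matlab
% Profiler times (seconds) for lines 8 to 15, copied from the report above
ops_names = {'a+b', 'a-b', 'a.*b', 'a./b', 'sqrt(a)', 'exp(a)', 'sin(a)', 'tan(a)'};
times = [0.12 0.19 0.13 0.23 1.88 2.54 1.57 2.09];

% Express each time as a ratio relative to the first operation (a+b)
index = times / times(1);

% Print one row per operation
for i = 1:numel(times)
    fprintf('%-8s %6.2f\n', ops_names{i}, index(i));
end
```

Because the times are only reported to two decimal places, the ratios for the cheap arithmetic operations are sensitive to rounding and should be read as rough magnitudes.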




Part II: Stopping rules in iterative methods (15 marks)

In Lab 1, you looked at a Matlab implementation of the simple Walrasian iterative.

Modify the code in walras1.m to consider two different stopping rules:

Rule 1: Stop if $|p_k - p_{k+1}| / (1 + |p_k|) \le \varepsilon$

Rule 2: Stop if $|p_k - p_{k+1}| \le \varepsilon (1 - \beta^*)$, where $\beta^* = \max_{j=1,\dots,k} |p_{k+1-j} - p_{k+1}| / |p_{k-j} - p_k|$

(Here, |.| means "absolute value of".) For each rule, report the final value of the iterative and the number of iterations for ε = 10^-2, 10^-4, 10^-6, 10^-8. Define a suitable accuracy measure and use it to evaluate each rule-ε combination: which is the most accurate?

Based on your results, provide an estimate for the iterative's rate of (linear) convergence.

Suggested solution

See the M-files walras_rule1.m and walras_rule2.m for the suggested Matlab implementations. The main loops of these M-files appear below.

[Marking guide: I would suggest allocating 10 marks to the quality of the Matlab implementation and reported results, 5 marks for the analysis.]

Results for Rule 1:

    Value of ε    Final no. of       Final value of the     Excess demand
                  iterations, k      iterative, p_k         evaluated at final p_k
    10^-2         10                 1.05197659091279       -0.01755008852421
    10^-4         21                 1.00048080455497       -0.00016822440116
    10^-6         32                 1.00000420959831       -0.00000147335502
    10^-8         42                 1.00000005667300       -0.00000001983555
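One way to back out a rate of linear convergence from these results is to treat |p_k - 1| as the error (the excess demand function in the listings is zero at p = 1) and compare the errors at two iteration counts. If the error shrinks by a roughly constant factor per iteration, that factor is approximately (e_k2 / e_k1)^(1/(k2 - k1)). A minimal sketch using the Rule 1 values; the names k, p, err and rate are illustrative.

```matlab
% Final iteration counts and final values of the iterative (Rule 1 table)
k = [10 21 32 42];
p = [1.05197659091279 1.00048080455497 1.00000420959831 1.00000005667300];

% Error relative to the equilibrium price p = 1
err = abs(p - 1);

% Implied per-iteration contraction factor between consecutive table rows
rate = (err(2:end) ./ err(1:end-1)) .^ (1 ./ diff(k));

disp(rate)
```

Each of the three implied factors comes out at roughly 0.65, which is one candidate estimate of the linear convergence rate.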

    [walras_rule1.m]
    for k=1:maxit
        if k>maxit
            maxit_reached = 1;
            break
        end
        E_k = 0.5*p_k^(-0.2) + 0.5*p_k^(-0.5) - 1; % excess demand at p(k)
        p_k1 = p_k + lambda * E_k;
        if abs(p_k-p_k1)




Results for Rule 2:

    Value of ε    Final no. of       Final value of the     Excess demand
                  iterations, k      iterative, p_k         evaluated at final p_k
    10^-2         16                 1.00412743509784       -0.00144039997406
    10^-4         26                 1.00005581309872       -0.00001953381360
    10^-6         37                 1.00000048843772       -0.00000017095314
    10^-8         48                 1.00000000427421       -0.00000000149598

    [walras_rule2.m]
    for k=1:maxit
        if k>maxit
            maxit_reached = 1;
            break
        end
        E_k = 0.5*p(k)^(-0.2) + 0.5*p(k)^(-0.5) - 1; % excess demand at p(k)
        p(k+1) = p(k) + lambda * E_k;
        beta_star = 0;
        beta = zeros(1,k);
        if k>1
            for j=1:k-1
                beta(k+1-j) = abs(p(k+1-j)-p(k+1))/abs(p(k-j)-p(k));
            end
            beta_star = max(beta);
        end
        if abs(p(k)-p(k+1))




[Figure: beta plotted against the iteration number.]
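For readers who want to experiment with the two stopping rules, here is a self-contained sketch of the iteration. It is not the content of walras_rule1.m or walras_rule2.m: the step size lambda = 1, the starting price p0 = 2, the tolerance and the iteration cap are all assumptions chosen for illustration, and reporting the updated price as the final value is a design choice. Only the excess demand function and the two rules are taken from the text.

```matlab
% Excess demand function, as in the listings above
E = @(p) 0.5*p.^(-0.2) + 0.5*p.^(-0.5) - 1;

lambda  = 1;      % ASSUMED step size (not given in the text)
p0      = 2;      % ASSUMED starting price (not given in the text)
epsilon = 1e-6;   % tolerance
maxit   = 1000;   % ASSUMED iteration cap

% Rule 1: relative change in the iterative
p_k = p0;
for k = 1:maxit
    p_k1 = p_k + lambda*E(p_k);
    if abs(p_k - p_k1)/(1 + abs(p_k)) <= epsilon
        break
    end
    p_k = p_k1;
end
p_rule1 = p_k1;
k_rule1 = k;

% Rule 2: absolute change scaled by (1 - beta_star)
p = zeros(1, maxit+1);
p(1) = p0;
beta_star = 0;
for k = 1:maxit
    p(k+1) = p(k) + lambda*E(p(k));
    if k > 1
        j = 1:k-1;   % same index range as the loop in the listing
        beta_star = max(abs(p(k+1-j) - p(k+1)) ./ abs(p(k-j) - p(k)));
    end
    if abs(p(k) - p(k+1)) <= epsilon*(1 - beta_star)
        break
    end
end
p_rule2 = p(k+1);
k_rule2 = k;

fprintf('Rule 1: k = %d, p = %.14f, E(p) = %.3e\n', k_rule1, p_rule1, E(p_rule1));
fprintf('Rule 2: k = %d, p = %.14f, E(p) = %.3e\n', k_rule2, p_rule2, E(p_rule2));
```

The excess demand at the final price serves as the accuracy measure here, as in the results tables. Iteration counts from this sketch need not match the tables, since they depend on the assumed lambda and p0.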




Part III: A Monte Carlo experiment (20 marks)

One of the main tasks of econometrics is to study the sampling distribution of an estimator or test statistic. For instance, we may be interested in the bias (the expected difference from the population mean) of an estimator in repeated samples. There are powerful theorems (e.g. central limit theorems) that characterize the sampling distributions of estimators as the sample size tends to infinity. With finite samples, however, these asymptotic theorems are often uninformative or misleading.

Monte Carlo simulation is a method for studying the finite-sample distribution of an estimator or test statistic. It uses simulated experimental data to evaluate the performance of the estimator or test procedure. The basic steps in a Monte Carlo experiment are the following:

1. Specify a model for the data-generating process (DGP), i.e. make assumptions about functional relationships among the variables, probability distributions, and the true values of the associated parameters.

2. Generate R data sets (samples) by simulating R random draws from the DGP. Typically, the sampling process is based on a computer-generated sequence of pseudo-random numbers.

3. Calculate the statistic of interest for each data set. The R calculated values represent the sampling distribution of the statistic.

4. Calculate the desired sampling measure for the statistic. For example, the bias of the statistic is estimated as the average deviation from the true mean taken over the simulated sampling distribution.

Your task is to write a Matlab program that performs Monte Carlo experiments. In developing your program, observe the following coding guidelines:

- Declare and initialize the main variables before performing operations on them.
- Where possible, use parameters rather than hard-coded numbers.
- Use vectorized code rather than loops when it is efficient to do so.
- Include concise and informative comments throughout.

A. Preparation

Consider the following DGP:

$y_t = \beta + v_t; \quad t = 1, 2, \dots, T$

where β is a constant and v_t is independently and identically distributed (i.i.d.) as Normal with mean 0 and variance σ^2.

1. Write the pseudo-code for a Monte Carlo experiment that estimates the bias of a given estimator of β. Use the following notation:




    R         number of samples generated by the experiment
    T         number of observations in each sample
    y_rt      t-th observation in sample r
    b_r       value of the estimator for sample r
    B(y)      the estimator for β as a function of the input vector y
    randn()   a function that returns a value drawn at random from a Normal distribution with mean 0 and variance 1

The bias estimate is to be computed as the average of b_r - β over the R samples. Assume the estimates b_r are stored in an R-by-1 vector named BDist. You may declare additional scalar, vector, or matrix variables as needed. Use the symbol ← to denote assignment and use = to denote a test for equality. (Refer to the pseudo-code example from Session 3 as a style guide.)

Suggested solution

[Marking guide: I would suggest allocating 4 marks for the pseudo-code, 8 marks for the Matlab implementation, 8 marks for the analysis. The bonus question is worth 5 marks.]

    For r = 1, 2, …, R:
        For t = 1, 2, …, T:
            y_rt ← β + σ·randn()    // need to multiply by σ to get Var(v_rt) = σ^2
        .
        ysample ← (y_r1, y_r2, …, y_rT)    // collect the observations for sample r in a vector
        b_r ← B(ysample)
    .
    bias ← (Σ_{r=1,…,R} b_r)/R - β

2. Implement your pseudo-code in Matlab. Begin by creating two M-files:

    mc.m   A script file that implements the Monte Carlo simulation steps.

    B.m    A function file that implements the estimator B(y). The function should take a vector as input and return a scalar value. (Leave B.m as an empty stub for now; the specific estimation rules will be defined later.)

Your main script should call the function B.m to obtain estimates of β. Use the built-in Matlab function randn() to draw values of v_t. Include the following line of code to initialize the Matlab random number generator:

    randn('state', 1);




Suggested solution:

    % mc.m
    %
    % A simple Monte Carlo simulation for estimating the mean.
    % Assumes a normal distribution for the error term.
    %
    % Ming Kang
    % February 2006

    clear all;

    % Set parameters
    R = 100;
    T = round(10^4.5);
    beta = 2;
    sigma2 = 4;

    % Initialize variables
    BDist = zeros(R,1);
    y = zeros(T,1);

    % Set random-number generator
    randn('state', 1);

    % Simulation step
    for r = 1:R
        y = beta + sqrt(sigma2)*randn(T,1);
        BDist(r) = B(y);
    end

    % Calculate sampling statistics
    mu = mean(BDist);
    bias = mu - beta;
    mse = mean((BDist - beta).^2);

    hist(BDist, 20);

    % B.m
    %
    function b = B(y)
    b = mean(y);
    % b = median(y);




B. Analysis

1. Explain how the Monte Carlo algorithm is affected by each of the three types of numerical error discussed in class. Which do you think is the greatest source of error and why? Describe the effect of R on the accuracy of the algorithm.

2. Estimate β with the sample mean: that is, B.m should compute the average of the values in the input vector. Set β = 2, σ^2 = 4, R = 100, T = 25 and run the Monte Carlo experiment. Use the hist command to plot a histogram of BDist (use bins of size 1) and comment on its shape: is the distribution symmetric, smooth, bell-shaped, etc.? Report the mean and variance of BDist.

3. Repeat Question 2 but this time estimate β with the sample median; that is, B.m should compute the median (i.e. the 50th percentile) of the values in the input vector. Compare your results with Question 2 and comment on any differences.

4. Repeat the experiments in Questions 2 and 3 for T = 10^k; k = 2.0, 2.5, 3.0, …, 5.0. For each value of T, estimate the mean-squared error (MSE) of each estimator: that is, calculate the average value of (b_r - β)^2 over the R estimates. Use the loglog command to plot the MSE of both estimators against T and comment on any trend that may be apparent. Compute the ratio of the MSE for the sample median to the MSE for the sample mean; record the values for each T in a table. What happens to the MSE ratio as T grows large?

5. (Bonus question) Consider the same DGP but now suppose v_t is drawn from the Student's t-distribution instead of the Normal distribution. The Student's t-distribution depends on a degrees-of-freedom parameter, d = 1, 2, …. The procedure for sampling from the Student's t-distribution with d degrees of freedom is the following:

i. Sample d+1 values from the standard Normal distribution (mean 0, variance 1); call these z_n; n = 1, 2, …, d+1

ii. Compute the ratio:

$t = \dfrac{z_{d+1}}{\sqrt{\dfrac{1}{d} \sum_{n=1}^{d} z_n^2}}$

Modify your code for mc.m so that the v_t's are generated using this procedure. Repeat the steps in Question 4 and fill in the table below with values for the MSE ratio. Compare your results for T = 25 with those from Question 4; comment on the effects of d and T.

                       Degrees of freedom (d)
    No. of obs. (T)    3          6          9
    10
    25
    100




Suggested solution:

1. The three sources of numeric error discussed in class were modeling error, approximation error, and round-off error. With Monte Carlo experiments, modeling error is negligible by design: the experimenter chooses the parameters, distributions, etc. associated with the data generating process and thus there is an exact correspondence between the statistical model and the "truth". (Whether the model generates realistic data is not of direct interest in this problem.) Approximation error is present because the simulated sampling distribution is an estimate; we only observe R discrete values of the sample statistic where in fact the statistic has a continuous distribution over an infinity of values. (There is also approximation error in how the normal random variable is generated from a sequence of pseudo-random numbers but this is less significant.) Round-off error is probably small given that the range of computed numbers is not extreme and also averaging tends to smooth out these errors. Therefore, approximation error seems to be the most significant source of numeric error. The parameter R controls the number of replicated draws from the sampling distribution; the higher R is, the better the discrete approximation to the true sampling distribution.
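To make the role of R concrete for the sample mean: under this DGP each estimate b_r has variance σ^2/T, so the average of R independent estimates has standard deviation sqrt(σ^2/(T·R)). The sketch below tabulates this Monte Carlo standard error for σ^2 = 4 and T = 25 (the Question 2 settings); the list of R values and the names R_values and mc_se are illustrative.

```matlab
sigma2 = 4;                   % error variance from Question 2
T = 25;                       % observations per sample from Question 2
R_values = [100 1000 10000];  % illustrative numbers of replications

% Standard error of the Monte Carlo bias estimate for the sample mean
mc_se = sqrt(sigma2 ./ (T * R_values));

for i = 1:numel(R_values)
    fprintf('R = %5d: standard error of the bias estimate = %.4f\n', ...
            R_values(i), mc_se(i));
end
```

With R = 100 the standard error is 0.04, so a simulated mean that differs from β = 2 by a few hundredths is within ordinary sampling noise; the error shrinks only with the square root of R.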

2. Results for the sample mean with T = 25:

[Figure: histogram of BDist for the sample mean, T = 25.]

Here, BDist has mean 1.9514 and variance 0.1644. As illustrated in the histogram, the sampling distribution is rough but it appears to be single-peaked and not noticeably skewed.




3. Results for the sample median with T = 25:

[Figure: histogram of BDist for the sample median, T = 25.]

Now BDist has mean 1.9274 and variance 0.2550. Thus the sample median appears to have greater bias and higher variance. The distribution as illustrated in the histogram is noticeably choppier and more spread out than before.

4. Results:

    log10(T)    MSE for the mean    MSE for the median    MSE ratio (median/mean)
    2.0         0.0442581           0.0694461             1.56912
    2.5         0.0111912           0.0196407             1.75502
    3.0         0.0043804           0.0068140             1.55557
    3.5         0.0013336           0.0022508             1.68768
    4.0         0.0003182           0.0005183             1.62867
    4.5         0.0001036           0.0001838             1.77490
    5.0         0.0000382           0.0000565             1.47814

For both estimators, the loglog plot suggests log10(MSE) is a linear function of log10(T) with negative slope, which implies that MSE declines as a power of T (for the sample mean, theory gives MSE = σ^2/T). The position of the trend is lower for the sample mean; that is, the MSE for the sample median is always greater than for the sample mean. The MSE ratio is greater than one for all T; the ratio fluctuates as T increases, and no increasing or decreasing trend is apparent.
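The trend can be quantified by fitting a straight line to log10(MSE) against log10(T). The following is a rough sketch using the tabulated values and the built-in polyfit function; the variable names are illustrative and the fit is not part of the suggested M-files.

```matlab
% Values copied from the results table above
log10T     = 2:0.5:5;
mse_mean   = [0.0442581 0.0111912 0.0043804 0.0013336 0.0003182 0.0001036 0.0000382];
mse_median = [0.0694461 0.0196407 0.0068140 0.0022508 0.0005183 0.0001838 0.0000565];

% Least-squares line through the log-log points; element 1 is the slope
c_mean   = polyfit(log10T, log10(mse_mean), 1);
c_median = polyfit(log10T, log10(mse_median), 1);

fprintf('Slope for the mean:   %.3f\n', c_mean(1));
fprintf('Slope for the median: %.3f\n', c_median(1));

% MSE ratio (median/mean) for each T, and its average over T
mse_ratio = mse_median ./ mse_mean;
fprintf('Average MSE ratio:    %.3f\n', mean(mse_ratio));
```

Both fitted slopes are close to -1, which is what MSE proportional to 1/T would produce. For normally distributed errors the large-sample value of the median/mean MSE ratio is π/2 (about 1.57), and the simulated ratios scatter around that level.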




[Figure: log-log plot of MSE against T (10^2 to 10^5) for the sample mean (solid) and the sample median (dashed).]

5. Results for R = 100:

                       Degrees of freedom (d)
    No. of obs. (T)    3          6          9
    10                 0.63276    1.07076    0.96472
    25                 0.90520    0.96347    1.22535
    100                0.58819    1.29718    1.19873

Notice that the median has lower MSE than the mean in some cases, which contrasts with the earlier result that the MSE ratio was always greater than one. No clear trends are apparent except for the case of T = 25, in which the MSE ratio increases with d; this makes sense as the t-distribution looks more like the normal distribution as d gets large.

Repeating the experiment (not required) with R = 1000 confirms this result:

    (R = 1000)         Degrees of freedom (d)
    No. of obs. (T)    3          6          9
    10                 0.58435    1.03613    1.20269
    25                 0.68058    1.12398    1.30109
    100                0.58292    1.17798    1.25101




The new code:

    % mc2.m
    %
    % A simple Monte Carlo simulation for estimating the mean.
    % Assumes a t-distribution for the error term.
    %
    % Ming Kang
    % February 2006

    clear all;

    % Set parameters
    R = 1000;
    T = 100;
    beta = 2;
    d = 3;

    % Initialize variables
    BDist1 = zeros(R,1); % sample mean
    BDist2 = zeros(R,1); % sample median
    y = zeros(T,1);

    % Set random-number generator
    randn('state', 1);

    % Simulation step
    for r = 1:R
        numer = randn(T,1);
        % Divide the sum of squares by d, as in step ii of the sampling
        % procedure; this rescaling leaves the MSE ratio unchanged
        denom = sqrt(sum(randn(T,d).^2, 2)/d);
        z = numer./denom;
        y = beta + z;
        BDist1(r) = mean(y);
        BDist2(r) = median(y);
    end

    % Calculate sampling statistics
    mse1 = mean((BDist1 - beta).^2);
    mse2 = mean((BDist2 - beta).^2);
    mse_r = mse2/mse1
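The whole table of MSE ratios could also be produced in one run by looping over d and T. The following is a sketch only, not the author's program: it vectorizes over the R samples by storing them as columns, uses R = 200 to keep the run short, and draws the random numbers in a different order from mc2.m, so its output will not reproduce the tabulated values. The names d_values, T_values and mse_ratio are illustrative.

```matlab
randn('state', 1);        % same legacy seeding call as in mc2.m

R = 200;                  % ASSUMED number of samples, for illustration
beta = 2;
d_values = [3 6 9];
T_values = [10 25 100];
mse_ratio = zeros(numel(T_values), numel(d_values));

for i = 1:numel(T_values)
    T = T_values(i);
    for j = 1:numel(d_values)
        d = d_values(j);
        % Student's t draws: z_(d+1) ./ sqrt((1/d) * sum of d squared normals)
        numer = randn(T, R);
        denom = sqrt(sum(randn(T, R, d).^2, 3) / d);
        y = beta + numer ./ denom;   % each column is one sample of size T

        b_mean   = mean(y, 1);       % sample mean of each column
        b_median = median(y, 1);     % sample median of each column

        mse_ratio(i, j) = mean((b_median - beta).^2) / mean((b_mean - beta).^2);
    end
end

disp(mse_ratio)           % rows: T = 10, 25, 100; columns: d = 3, 6, 9
```

Storing the samples as columns removes the loop over r at the cost of holding a T-by-R-by-d array in memory, which is modest at these sizes.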
