ex 2 solution
8/13/2019 Ex 2 Solution
Econ 353 Spring 2006 Page 1 of 13
Exercise 2: Numerical Analysis and Simulation using Matlab
Part I Time costs of computing (10 marks)
Write a Matlab script that performs the following:
1. Using the rand command, generate a vector of 500 observations uniformly distributed between 0 and 10000; call this vector a.
2. Again using the rand command, generate another vector of 500 observations uniformly distributed between 0 and 1; call this vector b.
3. Repeat the following operations for 10000 iterations:
   i. The sum of a and b
   ii. The difference of a and b
   iii. The scalar product of a and b
   iv. The scalar division of a and b
   v. The square root of a
   vi. The exponential function of a
   vii. The sine of a
   viii. The tangent of a

Look up the profile command in the Matlab help files. Use the profile command to compare the computing time of these operations. Express the time cost of each operation as a ratio relative to i.; report these ratios in a table.
Suggested solution
The M-file ops.m executes steps 1 to 3 as follows:

    T = 10000;
    n = 500;
    rand('state', 1);
    a = 10000*rand(n, 1);
    b = rand(n, 1);

    for i = 1:T
        a+b;
        a-b;
        a.*b;
        a./b;
        sqrt(a);
        exp(a);
        sin(a);
        tan(a);
    end
The Matlab profiler traces the sequence of calls to functions and scripts and analyses how much time is spent executing specific operations. The profile command includes an option for reporting the detail for built-in functions and operators like addition and multiplication. Running the commands

    >> profile on -detail operator
    >> ops
    >> profile report opsreport

produces a report in HTML format. The report states how much time was spent on the following lines:
                   7: for i = 1:T
    0.12    1%     8:     a+b;
    0.19    2%     9:     a-b;
    0.13    1%    10:     a.*b;
    0.23    3%    11:     a./b;
    1.88   21%    12:     sqrt(a);
    2.54   29%    13:     exp(a);
    1.57   18%    14:     sin(a);
    2.09   24%    15:     tan(a);
    0.04    0%    16: end
Expressing the computing times as ratios:

    Operation    Index
    a+b           1.00
    a-b           1.58
    a.*b          1.08
    a./b          1.92
    sqrt(a)      15.67
    exp(a)       21.17
    sin(a)       13.08
    tan(a)       17.42
Thus we can see that the last 4 operations are an order of magnitude more costly relative
to the elementary arithmetic operations.
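The index column can be reproduced directly from the profiler times. The following is a minimal sketch, not part of ops.m: the variable names t and ratio are assumptions, and the times are the rounded values printed in the report, so the last digit of a ratio could differ if unrounded times were used.

```matlab
% Profiler times in seconds, copied from the report (lines 8 to 15 of ops.m)
t = [0.12 0.19 0.13 0.23 1.88 2.54 1.57 2.09];

% Cost of each operation relative to the addition a+b
ratio = t / t(1);

% Display the ratios to two decimals, one per line
fprintf('%6.2f\n', ratio);
```

Dividing by t(1) rather than by the total keeps the addition as the unit of cost, which is what the exercise asks for.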
Part II Stopping rules in iterative methods (15 marks)
In Lab 1, you looked at a Matlab implementation of the simple Walrasian iterative. Modify the code in walras1.m to consider two different stopping rules:

Rule 1: Stop if $|p_k - p_{k+1}| / (1 + |p_k|) \le \epsilon$
Rule 2: Stop if $|p_k - p_{k+1}| \le \epsilon (1 - \beta^*)$, where $\beta^* = \max_{j=1,\dots,k} |p_{k+1-j} - p_{k+1}| / |p_{k-j} - p_k|$

(Here, |.| means absolute value.) For each rule, report the final value of the iterative and the number of iterations for $\epsilon = 10^{-2}, 10^{-4}, 10^{-6}, 10^{-8}$. Define a suitable accuracy measure and use it to evaluate each rule-$\epsilon$ combination: which is the most accurate? Based on your results, provide an estimate for the iterative's rate of (linear) convergence.
Suggested solution
See the M-files walras_rule1.m and walras_rule2.m for the suggested Matlab implementations. The main loops of these M-files appear below.
[Marking guide: I would suggest allocating 10 marks to the quality of the Matlab
implementation and reported results, 5 marks for the analysis.]
Results for Rule 1:

    Value of     Final no. of     Final value of the    Excess demand
    epsilon      iterations, k    iterative, p_k        evaluated at final p_k
    10^-2        10               1.05197659091279      -0.01755008852421
    10^-4        21               1.00048080455497      -0.00016822440116
    10^-6        32               1.00000420959831      -0.00000147335502
    10^-8        42               1.00000005667300      -0.00000001983555
[walras_rule1.m]

    for k = 1:maxit
        if k > maxit
            maxit_reached = 1;
            break
        end
        E_k = 0.5*p_k^(-0.2) + 0.5*p_k^(-0.5) - 1;   % excess demand at p(k)
        p_k1 = p_k + lambda * E_k;
        if abs(p_k-p_k1)
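Because the listing breaks off at the stopping test, a self-contained sketch of the iteration with Rule 1 may help. It is not the code of walras_rule1.m: the step size lambda, the starting price, the tolerance epsilon and the iteration cap are all assumed values chosen for illustration, and only the excess demand function is taken from the listing.

```matlab
% Illustrative values only: lambda, the starting price, epsilon and maxit
% are assumptions, not the settings used in walras1.m.
lambda  = 1;
p_k     = 2;
epsilon = 1e-6;
maxit   = 1000;
maxit_reached = 1;            % reset to 0 once the rule is satisfied

% Excess demand function, as in the listing
E = @(p) 0.5*p^(-0.2) + 0.5*p^(-0.5) - 1;

for k = 1:maxit
    p_k1 = p_k + lambda * E(p_k);                    % price update
    if abs(p_k - p_k1) / (1 + abs(p_k)) <= epsilon   % Rule 1
        maxit_reached = 0;
        break
    end
    p_k = p_k1;                                      % move to next iterate
end
```

After the loop, p_k1 holds the final iterate and k the number of iterations used; maxit_reached stays at 1 only if the rule was never satisfied.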
Results for Rule 2:

    Value of     Final no. of     Final value of the    Excess demand
    epsilon      iterations, k    iterative, p_k        evaluated at final p_k
    10^-2        16               1.00412743509784      -0.00144039997406
    10^-4        26               1.00005581309872      -0.00001953381360
    10^-6        37               1.00000048843772      -0.00000017095314
    10^-8        48               1.00000000427421      -0.00000000149598
[walras_rule2.m]

    for k = 1:maxit
        if k > maxit
            maxit_reached = 1;
            break
        end
        E_k = 0.5*p(k)^(-0.2) + 0.5*p(k)^(-0.5) - 1;   % excess demand at p(k)
        p(k+1) = p(k) + lambda * E_k;
        beta_star = 0;
        beta = zeros(1, k);
        if k > 1
            for j = 1:k-1
                beta(k+1-j) = abs(p(k+1-j) - p(k+1)) / abs(p(k-j) - p(k));
            end
            beta_star = max(beta);
        end
        if abs(p(k)-p(k+1))
[Figure: beta plotted against iteration; horizontal axis "Iteration" from 0 to 50, vertical axis "beta" from 0 to 0.8]
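The rate of linear convergence can also be estimated from the reported results. The sketch below uses the Rule 1 table and assumes the equilibrium price is p* = 1, where the excess demand function is exactly zero; the variable names are illustrative.

```matlab
% Final iteration counts and final values p_k from the Rule 1 table
k = [10 21 32 42];
p = [1.05197659091279 1.00048080455497 1.00000420959831 1.00000005667300];

% Distance from the equilibrium price p* = 1 (excess demand is zero there)
err = abs(p - 1);

% Under linear convergence the error shrinks by a constant factor per
% iteration, so the factor implied by two rows of the table is
% (ratio of errors)^(1 / number of iterations between the rows)
rate = (err(2:end) ./ err(1:end-1)) .^ (1 ./ diff(k));
```

Each element of rate is an estimate of the per-iteration contraction factor; with these table values all three estimates come out at roughly 0.65.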
Part III A Monte Carlo experiment (20 marks)
One of the main tasks of econometrics is to study the sampling distribution of an estimator or test statistic. For instance, we may be interested in the bias (the expected difference from the population mean) of an estimator in repeated samples. There are powerful theorems (e.g. central limit theorems) that characterize the sampling distributions of estimators as the sample size tends to infinity. With finite samples, however, these asymptotic theorems are often uninformative or misleading.

Monte Carlo simulation is a method for studying the finite-sample distribution of an estimator or test statistic. It uses simulated experimental data to evaluate the performance of the estimator or test procedure. The basic steps in a Monte Carlo experiment are the following:

1. Specify a model for the data-generating process (DGP), i.e. make assumptions about functional relationships among the variables, probability distributions, and the true values of the associated parameters.
2. Generate R data sets (samples) by simulating R random draws from the DGP. Typically, the sampling process is based on a computer-generated sequence of pseudo-random numbers.
3. Calculate the statistic of interest for each data set. The R calculated values represent the sampling distribution of the statistic.
4. Calculate the desired sampling measure for the statistic. E.g. the bias of the statistic is estimated as the average deviation from the true mean taken over the simulated sampling distribution.

Your task is to write a Matlab program that performs Monte Carlo experiments. In developing your program, observe the following coding guidelines:

- Declare and initialize the main variables before performing operations on them.
- Where possible, use parameters rather than hard-coded numbers.
- Use vectorized code rather than loops when it is efficient to do so.
- Include concise and informative comments throughout.
A. Preparation
Consider the following DGP:
$y_t = \beta + v_t$; $t = 1, 2, \dots, T$

where $\beta$ is a constant and $v_t$ is independently and identically distributed (i.i.d.) as Normal with mean 0 and variance $\sigma^2$.

1. Write the pseudo-code for a Monte Carlo experiment that estimates the bias of a given estimator of $\beta$. Use the following notation:
    R          number of samples generated by the experiment
    T          number of observations in each sample
    y_rt       t-th observation in sample r
    b_r        value of the estimator for sample r
    B(y)       the estimator for β as a function of the input vector y
    randn()    a function that returns a value drawn at random from a Normal distribution with mean 0 and variance 1

The bias estimate is to be computed as the average of $b_r - \beta$ over the R samples. Assume the estimates $b_r$ are stored in an R-by-1 vector named BDist. You may declare additional scalar, vector, or matrix variables as needed. Use the symbol ← to denote assignment and use = to denote a test for equality. (Refer to the pseudo-code example from Session 3 as a style guide.)
Suggested solution
[Marking guide: I would suggest allocating 4 marks for the pseudo-code, 8 marks for the Matlab implementation, 8 marks for the analysis. The bonus question is worth 5 marks.]

    For r = 1, 2, …, R:
        For t = 1, 2, …, T:
            y_rt ← β + σ·randn()              // need to multiply by σ to get Var(v_rt) = σ²
        .
        ysample ← (y_r1, y_r2, …, y_rT)       // collect the observations for sample r in a vector
        b_r ← B(ysample)
    .
    bias ← (Σ_{r=1,…,R} (b_r − β)) / R
2. Implement your pseudo-code in Matlab. Begin by creating two M-files:

    mc.m    A script file that implements the Monte Carlo simulation steps.
    B.m     A function file that implements the estimator B(y). The function should take a vector as input and return a scalar value. (Leave B.m as an empty stub for now; the specific estimation rules will be defined later.)

Your main script should call the function B.m to obtain estimates of $\beta$. Use the built-in Matlab function randn() to draw values of $v_t$. Include the following line of code to initialize the Matlab random number generator:

    randn('state', 1);
Suggested solution:
    % mc.m
    %
    % A simple Monte Carlo simulation for estimating the mean.
    % Assumes a normal distribution for the error term.
    %
    % Ming Kang
    % February 2006

    clear all;

    % Set parameters
    R = 100;
    T = round(10^4.5);
    beta = 2;
    sigma2 = 4;

    % Initialize variables
    BDist = zeros(R, 1);
    y = zeros(T, 1);

    % Set random-number generator
    randn('state', 1);

    % Simulation step
    for r = 1:R
        y = beta + sqrt(sigma2)*randn(T, 1);
        BDist(r) = B(y);
    end

    % Calculate sampling statistics
    mu = mean(BDist);
    bias = mu - beta;
    mse = mean((BDist - beta).^2);

    hist(BDist, 20);

    % B.m
    %
    function b = B(y)
    b = mean(y);
    % b = median(y);
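The coding guidelines ask for vectorized code where it is efficient. As a hedged alternative to the loop in mc.m, the sketch below draws all R samples at once as the columns of a matrix. It bypasses B.m and applies mean directly, so it is an illustration for column-wise estimators only (median(Y, 1) would work the same way); the matrix name Y and the choice T = 25 are assumptions made here.

```matlab
% Parameters as in the analysis questions (T = 25 is chosen for illustration)
R = 100;
T = 25;
beta = 2;
sigma2 = 4;

% Set random-number generator, as in mc.m
randn('state', 1);

% Each column of Y is one simulated sample of T observations
Y = beta + sqrt(sigma2) * randn(T, R);

% Column means give the sample-mean estimate for every sample at once
BDist = mean(Y, 1)';          % transpose to R-by-1, the layout used in mc.m

% Sampling statistics, computed as in mc.m
bias = mean(BDist) - beta;
mse  = mean((BDist - beta).^2);
```

The trade-off is memory: Y holds T*R values at once, which is fine for these sizes but may not be for the largest T used later in the exercise.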
B. Analysis
1. Explain how the Monte Carlo algorithm is affected by each of the three types of
numerical error discussed in class. Which do you think is the greatest source of error and
why? Describe the effect of R on the accuracy of the algorithm.
2. Estimate $\beta$ with the sample mean: that is, B.m should compute the average of the values in the input vector. Set $\beta = 2$, $\sigma^2 = 4$, R = 100, T = 25 and run the Monte Carlo experiment. Use the hist command to plot a histogram of BDist (use bins of size 1) and comment on its shape: is the distribution symmetric, smooth, bell-shaped, etc.? Report the mean and variance of BDist.

3. Repeat Question 2 but this time estimate $\beta$ with the sample median; that is, B.m should compute the median (i.e. the 50th percentile) of the values in the input vector. Compare your results with Question 2 and comment on any differences.

4. Repeat the experiments in Questions 2 and 3 for $T = 10^k$; k = 2.0, 2.5, 3.0, ..., 5.0. For each value of T, estimate the mean-squared error (MSE) of each estimator: that is, calculate the average value of $(b_r - \beta)^2$ over the R estimates. Use the loglog command to plot the MSE of both estimators against T and comment on any trend that may be apparent. Compute the ratio of the MSE for the sample median to the MSE for the sample mean; record the values for each T in a table. What happens to the MSE ratio as T grows large?
5. (Bonus question) Consider the same DGP but now suppose $v_t$ is drawn from the Student's t-distribution instead of the Normal distribution. The Student's t-distribution depends on a degrees-of-freedom parameter, d = 1, 2, .... The procedure for sampling from the Student's t-distribution with d degrees of freedom is the following:

   i. Sample d+1 values from the standard Normal distribution (mean 0, variance 1); call these $z_n$; n = 1, 2, ..., d+1
   ii. Compute the ratio:

      $$t_d = \frac{z_{d+1}}{\sqrt{\frac{1}{d} \sum_{n=1}^{d} z_n^2}}$$

Modify your code for mc.m so that the $v_t$'s are generated using this procedure. Repeat the steps in Question 4 and fill in the table below with values for the MSE ratio. Compare your results for T = 25 with those from Question 4; comment on the effects of d and T.
                       Degrees of freedom (d)
    No. of obs. (T)    3          6          9
    10
    25
    100
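The sampling procedure in steps i and ii can be sketched on its own before it is built into the simulation. The block below is an illustration, not part of the suggested solution: the number of draws N and the variable names are assumptions, and the comparison with d/(d-2), the variance of the t-distribution for d > 2, is only a plausibility check.

```matlab
d = 9;                  % degrees of freedom
N = 100000;             % number of draws (hypothetical, chosen for the check)

% Set random-number generator, using the same call as mc.m
randn('state', 1);

z     = randn(N, d+1);                       % d+1 standard normals per draw
numer = z(:, d+1);                           % z_{d+1}
denom = sqrt(sum(z(:, 1:d).^2, 2) / d);      % root mean square of z_1, ..., z_d
t     = numer ./ denom;                      % N draws from Student's t, d d.o.f.

% Plausibility check: for d > 2 the variance of the t-distribution is d/(d-2)
v_sample = var(t);
v_theory = d / (d - 2);
```

Drawing the numerator and denominator from separate columns of z keeps them independent, which the ratio construction requires.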
Suggested solution:
1. The three sources of numeric error discussed in class were modeling error, approximation error, and round-off error. With Monte Carlo experiments, modeling error is negligible by design: the experimenter chooses the parameters, distributions, etc. associated with the data generating process and thus there is an exact correspondence between the statistical model and the truth. (Whether the model generates realistic data is not of direct interest in this problem.) Approximation error is present because the simulated sampling distribution is an estimate; we only observe R discrete values of the sample statistic where in fact the statistic has a continuous distribution over an infinity of values. (There is also approximation error in how the normal random variable is generated from a sequence of pseudo-random numbers but this is less significant.) Round-off error is probably small given that the range of computed numbers is not extreme and also averaging tends to smooth out these errors. Therefore, approximation error seems to be the most significant source of numeric error. The parameter R controls the number of replicated draws from the sampling distribution; the higher is R, the better is the discrete approximation to the true sampling distribution.
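One way to quantify the effect of R, offered here as a sketch under the assumption that the R estimates are independent draws: the average of BDist then has standard error sqrt(Var(b_r)/R), so the error in the bias estimate shrinks in proportion to 1/sqrt(R). The block works this out for the sample mean, whose sampling variance under the DGP is $\sigma^2/T$; the parameter values are those of Question 2 and the list of R values is hypothetical.

```matlab
sigma2 = 4;                   % error variance, as in Question 2
T = 25;                       % observations per sample, as in Question 2
R = [100 1000 10000];         % numbers of replications to compare (hypothetical)

var_b = sigma2 / T;           % sampling variance of the sample mean
se    = sqrt(var_b ./ R);     % standard error of mean(BDist) for each R
```

With R = 100 the standard error is 0.04, so a simulated bias of a few hundredths is within the noise of the experiment; each tenfold increase in R reduces the standard error by a factor of sqrt(10).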
2. Results for the sample mean with T = 25:

[Figure: histogram of BDist for the sample mean; horizontal axis from 0.5 to 3, counts from 0 to 15]

Here, BDist has mean 1.9514 and variance 0.1644. As illustrated in the histogram, the sampling distribution is rough but it appears to be single-peaked and not noticeably skewed.
3. Results for the sample median with T = 25:

[Figure: histogram of BDist for the sample median; horizontal axis from 0.5 to 3.5, counts from 0 to 15]

Now BDist has mean 1.9274 and variance 0.2550. Thus the sample median appears to have greater bias and higher variance. The distribution as illustrated in the histogram is noticeably choppier and more spread out than before.
4. Results:

    log10(T)    MSE for the mean    MSE for the median    MSE ratio (median/mean)
    2.0         0.0442581           0.0694461             1.56912
    2.5         0.0111912           0.0196407             1.75502
    3.0         0.0043804           0.0068140             1.55557
    3.5         0.0013336           0.0022508             1.68768
    4.0         0.0003182           0.0005183             1.62867
    4.5         0.0001036           0.0001838             1.77490
    5.0         0.0000382           0.0000565             1.47814

For both estimators, the loglog plot suggests log10(MSE) is a linear function of log10(T) with negative slope, which suggests that MSE declines as a power of T (a straight line on log-log axes indicates a power law rather than exponential decay). The position of the trend is lower for the sample mean; that is, the MSE for the sample median is always greater than for the sample mean. The MSE ratio is greater than one for all T; the ratio fluctuates as T increases, and no increasing or decreasing trend is apparent.
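The slope of the log-log relationship can be estimated from the tabulated values with a least-squares line. The sketch below is illustrative: the variable names are assumptions, and the comparison with pi/2 relies on the standard large-sample result for the variance of the median relative to the mean under normal errors, which the exercise itself does not state.

```matlab
% Values copied from the results table
k          = 2:0.5:5;     % log10(T)
mse_mean   = [0.0442581 0.0111912 0.0043804 0.0013336 0.0003182 0.0001036 0.0000382];
mse_median = [0.0694461 0.0196407 0.0068140 0.0022508 0.0005183 0.0001838 0.0000565];

% Fit log10(MSE) = c(1)*log10(T) + c(2) for each estimator
c_mean   = polyfit(k, log10(mse_mean), 1);
c_median = polyfit(k, log10(mse_median), 1);
slope_mean   = c_mean(1);
slope_median = c_median(1);

% Average MSE ratio across T, for comparison with pi/2
ratio     = mse_median ./ mse_mean;
avg_ratio = mean(ratio);
```

Both fitted slopes are close to -1, which corresponds to MSE falling roughly in proportion to 1/T, and the average ratio lies near pi/2 (about 1.57).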
[Figure: log-log plot of MSE against T, with T from 10^2 to 10^5 and MSE from 10^-5 to 10^-1; MSE for the mean (solid) and the median (dashed)]
5. Results for R = 100:

                       Degrees of freedom (d)
    No. of obs. (T)    3          6          9
    10                 0.63276    1.07076    0.96472
    25                 0.90520    0.96347    1.22535
    100                0.58819    1.29718    1.19873

Notice that the median has lower MSE than the mean in some cases, which contrasts with the earlier result that the MSE ratio was always greater than one. No clear trends are apparent except for the case of T = 25, in which the MSE ratio increases with d; this makes sense as the t-distribution looks more like the normal distribution as d gets large.

Repeating the experiment (not required) with R = 1000 confirms this result:

    (R = 1000)         Degrees of freedom (d)
    No. of obs. (T)    3          6          9
    10                 0.58435    1.03613    1.20269
    25                 0.68058    1.12398    1.30109
    100                0.58292    1.17798    1.25101
The new code:
    % mc2.m
    %
    % A simple Monte Carlo simulation for estimating the mean.
    % Assumes a t-distribution for the error term.
    %
    % Ming Kang
    % February 2006

    clear all;

    % Set parameters
    R = 1000;
    T = 100;
    beta = 2;
    d = 3;

    % Initialize variables
    BDist1 = zeros(R, 1);   % sample mean
    BDist2 = zeros(R, 1);   % sample median
    y = zeros(T, 1);

    % Set random-number generator
    randn('state', 1);

    % Simulation step
    for r = 1:R
        numer = randn(T, 1);
        % Divide the sum of squares by d, as in the ratio defined in Question 5.
        % This rescales z by a constant and leaves the MSE ratio unchanged.
        denom = sqrt(sum(randn(T, d).^2, 2) / d);
        z = numer ./ denom;
        y = beta + z;
        BDist1(r) = mean(y);
        BDist2(r) = median(y);
    end

    % Calculate sampling statistics
    mse1 = mean((BDist1 - beta).^2);
    mse2 = mean((BDist2 - beta).^2);
    mse_r = mse2/mse1