ex 2 solution

Upload: mian-almas

Post on 04-Jun-2018


  • 8/13/2019 Ex 2 Solution


    Econ 353 Spring 2006 Page 1 of 13

    Exercise 2: Numerical Analysis and Simulation using Matlab

    Part I Time costs of computing (10 marks)

    Write a Matlab script that performs the following:

    1. Using the rand command, generate a vector of 500 observations uniformly distributed between 0 and 10000; call this vector a.

    2. Again using the rand command, generate another vector of 500 observations uniformly distributed between 0 and 1; call this vector b.

    3. Repeat the following operations for 10000 iterations:

    i. The sum of a and b
    ii. The difference of a and b
    iii. The element-wise (scalar) product of a and b
    iv. The element-wise (scalar) division of a and b
    v. The square root of a
    vi. The exponential function of a
    vii. The sine of a
    viii. The tangent of a

    Look up the profile command in the Matlab help files. Use the profile command to compare the computing time of these operations. Express the time cost of each operation as a ratio relative to i.; report these ratios in a table.

    Suggested solution

    The M-file ops.m executes steps 1 to 3 as follows:

        T = 10000;
        n = 500;
        rand('state', 1);
        a = 10000*rand(n, 1);
        b = rand(n, 1);

        for i = 1:T
            a+b;
            a-b;
            a.*b;
            a./b;
            sqrt(a);
            exp(a);
            sin(a);
            tan(a);
        end


    The Matlab profiler traces the sequence of calls to functions and scripts and analyses how much time is spent executing specific operations. The profile command includes an option for reporting the detail for built-in functions and operators like addition and multiplication. Running the commands

        >> profile on -detail operator
        >> ops
        >> profile report opsreport

    produces a report in HTML format. The report states how much time was spent on the following lines:

                      7: for i = 1:T
        0.12    1%    8:   a+b;
        0.19    2%    9:   a-b;
        0.13    1%   10:   a.*b;
        0.23    3%   11:   a./b;
        1.88   21%   12:   sqrt(a);
        2.54   29%   13:   exp(a);
        1.57   18%   14:   sin(a);
        2.09   24%   15:   tan(a);
        0.04    0%   16: end

    Expressing the computing times as ratios:

        Operation    Index
        a+b           1.00
        a-b           1.58
        a.*b          1.08
        a./b          1.92
        sqrt(a)      15.67
        exp(a)       21.17
        sin(a)       13.08
        tan(a)       17.42

    Thus we can see that the last 4 operations are an order of magnitude more costly relative

    to the elementary arithmetic operations.
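    As a rough cross-check on the profiler figures, similar ratios can be approximated with tic and toc. The sketch below is an alternative under stated assumptions, not the method used in the suggested solution: the names ops, labels, elapsed and index are hypothetical, no random seed is set, and the timings will differ by machine and Matlab version.

```matlab
% Time each operation with tic/toc and express it relative to a+b.
% Assumption: a rough cross-check only, not the profiler-based method above.
T = 10000;                      % number of repetitions
n = 500;                        % vector length
a = 10000*rand(n, 1);
b = rand(n, 1);

% Hypothetical helper list: one function handle per operation
ops = {@() a+b, @() a-b, @() a.*b, @() a./b, ...
       @() sqrt(a), @() exp(a), @() sin(a), @() tan(a)};
labels = {'a+b', 'a-b', 'a.*b', 'a./b', 'sqrt(a)', 'exp(a)', 'sin(a)', 'tan(a)'};

elapsed = zeros(numel(ops), 1);
for j = 1:numel(ops)
    f = ops{j};
    tic;
    for i = 1:T
        tmp = f();              % assign the result so the call is evaluated
    end
    elapsed(j) = toc;
end

index = elapsed / elapsed(1);   % ratio relative to a+b
for j = 1:numel(ops)
    fprintf('%-8s %8.2f\n', labels{j}, index(j));
end
```

    Calling each operation through a function handle adds a fixed overhead to every call, so the ratios from this sketch are likely to be compressed relative to the profiler's figures.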


    Part II Stopping rules in iterative methods (15 marks)

    In Lab 1, you looked at a Matlab implementation of the simple Walrasian iterative.

    Modify the code in walras1.m to consider two different stopping rules:

    Rule 1: Stop if $|p_k - p_{k+1}| / (1 + |p_k|) \le \varepsilon$

    Rule 2: Stop if $|p_k - p_{k+1}| \le \varepsilon (1 - \beta^*)$, where $\beta^* = \max_{j=1,\dots,k} |p_{k+1-j} - p_{k+1}| / |p_{k-j} - p_k|$

    (Here, |.| means "absolute value of".) For each rule, report the final value of the iterative and the number of iterations for $\varepsilon = 10^{-2}, 10^{-4}, 10^{-6}, 10^{-8}$. Define a suitable accuracy measure and use it to evaluate each rule-$\varepsilon$ combination: which is the most accurate?

    Based on your results, provide an estimate for the iterative's rate of (linear) convergence.

    Suggested solution

    See the M-files walras_rule1.m and walras_rule2.m for the suggested Matlab implementations. The main loops of these M-files appear below.

    [Marking guide: I would suggest allocating 10 marks to the quality of the Matlab

    implementation and reported results, 5 marks for the analysis.]

    Results for Rule 1:

        Value of ε   Final no. of     Final value of the    Excess demand
                     iterations, k    iterative, p_k        evaluated at final p_k
        10^-2        10               1.05197659091279      -0.01755008852421
        10^-4        21               1.00048080455497      -0.00016822440116
        10^-6        32               1.00000420959831      -0.00000147335502
        10^-8        42               1.00000005667300      -0.00000001983555

    [walras_rule1.m]

        for k=1:maxit
            if k>maxit
                maxit_reached = 1;
                break
            end

            E_k = 0.5*p_k^(-0.2) + 0.5*p_k^(-0.5) - 1;   % excess demand at p(k)
            p_k1 = p_k + lambda * E_k;

            if abs(p_k-p_k1) [remainder of the listing is missing from the transcript]
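    Since the listing breaks off at the stopping test, here is a minimal self-contained sketch of how a Rule 1 loop could look. The starting price p0 = 2, the step size lambda = 1 and the iteration cap maxit = 1000 are assumptions (the values used in walras1.m are not shown in the transcript), so the iteration counts it prints need not match the table above.

```matlab
% Walrasian price adjustment with stopping Rule 1 (sketch).
% Assumptions: p0, lambda and maxit are illustrative values, not taken from walras1.m.
E = @(p) 0.5*p.^(-0.2) + 0.5*p.^(-0.5) - 1;   % excess demand, as in the listing
lambda = 1;        % assumed step size
p0     = 2;        % assumed starting price
maxit  = 1000;     % assumed iteration cap
tols   = [1e-2 1e-4 1e-6 1e-8];

for tol = tols
    p_k = p0;
    converged = false;
    for k = 1:maxit
        p_k1 = p_k + lambda * E(p_k);                % price update
        if abs(p_k - p_k1) / (1 + abs(p_k)) <= tol   % Rule 1
            converged = true;
            break
        end
        p_k = p_k1;
    end
    fprintf('eps = %g: k = %d, p = %.14f, E(p) = %.14f\n', ...
            tol, k, p_k1, E(p_k1));
end
```

    The excess demand at the final price serves as the accuracy measure here, in the same spirit as the last column of the results table.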


    Results for Rule 2:

        Value of ε   Final no. of     Final value of the    Excess demand
                     iterations, k    iterative, p_k        evaluated at final p_k
        10^-2        16               1.00412743509784      -0.00144039997406
        10^-4        26               1.00005581309872      -0.00001953381360
        10^-6        37               1.00000048843772      -0.00000017095314
        10^-8        48               1.00000000427421      -0.00000000149598

    [walras_rule2.m]

        for k=1:maxit
            if k>maxit
                maxit_reached = 1;
                break
            end

            E_k = 0.5*p(k)^(-0.2) + 0.5*p(k)^(-0.5) - 1;   % excess demand at p(k)
            p(k+1) = p(k) + lambda * E_k;

            beta_star = 0;
            beta = zeros(1,k);
            if k>1
                for j=1:k-1
                    beta(k+1-j) = abs(p(k+1-j) - p(k+1)) / abs(p(k-j) - p(k));
                end
                beta_star = max(beta);
            end

            if abs(p(k)-p(k+1)) [remainder of the listing is missing from the transcript]
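    A comparable sketch for Rule 2 follows, together with a crude estimate of the linear convergence rate taken from the ratio of the last two step sizes. As before, p0 = 2, lambda = 1, maxit = 1000 and tol = 1e-8 are assumed values, and the inner loop of the listing is written in vectorized form; this is an illustration rather than the contents of walras_rule2.m.

```matlab
% Walrasian price adjustment with stopping Rule 2 (sketch).
% Assumptions: p0, lambda, maxit and tol are illustrative values.
E = @(p) 0.5*p.^(-0.2) + 0.5*p.^(-0.5) - 1;   % excess demand, as in the listing
lambda = 1;  p0 = 2;  maxit = 1000;  tol = 1e-8;

p = zeros(1, maxit+1);
p(1) = p0;
converged = false;
for k = 1:maxit
    p(k+1) = p(k) + lambda * E(p(k));
    beta_star = 0;
    if k > 1
        j = 1:k-1;                                   % vectorized form of the inner loop
        beta = abs(p(k+1-j) - p(k+1)) ./ abs(p(k-j) - p(k));
        beta_star = max(beta);
    end
    if abs(p(k) - p(k+1)) <= tol * (1 - beta_star)   % Rule 2
        converged = true;
        break
    end
end
p = p(1:k+1);                                        % drop unused entries

% Rough estimate of the linear convergence rate from the last two steps
rate = abs(p(end) - p(end-1)) / abs(p(end-1) - p(end-2));
fprintf('k = %d, p = %.14f, beta* = %.4f, rate = %.4f\n', k, p(end), beta_star, rate);
```

    With lambda = 1 the slope of the update map at p = 1 is 1 + lambda*E'(1) = 1 - 0.35 = 0.65, so the estimated rate should settle near 0.65 under these assumptions.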


    [Figure: beta plotted against the iteration number. Horizontal axis: Iteration (0 to 50); vertical axis: beta (0 to 0.8).]


    Part III A Monte Carlo experiment (20 marks)

    One of the main tasks of econometrics is to study the sampling distribution of an estimator or test statistic. For instance, we may be interested in the bias (the expected difference from the population mean) of an estimator in repeated samples. There are powerful theorems (e.g. central limit theorems) that characterize the sampling distributions of estimators as the sample size tends to infinity. With finite samples, however, these asymptotic theorems are often uninformative or misleading.

    Monte Carlo simulation is a method for studying the finite-sample distribution of an estimator or test statistic. It uses simulated experimental data to evaluate the performance of the estimator or test procedure. The basic steps in a Monte Carlo experiment are the following:

    1. Specify a model for the data-generating process (DGP), i.e. make assumptions about functional relationships among the variables, probability distributions, and the true values of the associated parameters.
    2. Generate R data sets (samples) by simulating R random draws from the DGP. Typically, the sampling process is based on a computer-generated sequence of pseudo-random numbers.
    3. Calculate the statistic of interest for each data set. The R calculated values represent the sampling distribution of the statistic.
    4. Calculate the desired sampling measure for the statistic. E.g. the bias of the statistic is estimated as the average deviation from the true mean taken over the simulated sampling distribution.

    Your task is to write a Matlab program that performs Monte Carlo experiments. In developing your program, observe the following coding guidelines:

    - Declare and initialize the main variables before performing operations on them.
    - Where possible, use parameters rather than hard-coded numbers.
    - Use vectorized code rather than loops when it is efficient to do so.
    - Include concise and informative comments throughout.

    A. Preparation

    Consider the following DGP:

    $y_t = \beta + v_t, \quad t = 1, 2, \dots, T$

    where β is a constant and v_t is independently and identically distributed (i.i.d.) as Normal with mean 0 and variance σ².

    1. Write the pseudo-code for a Monte Carlo experiment that estimates the bias of a given estimator of β. Use the following notation:


    R        number of samples generated by the experiment
    T        number of observations in each sample
    y_rt     t-th observation in sample r
    b_r      value of the estimator for sample r
    B(y)     the estimator for β as a function of the input vector y
    randn()  a function that returns a value drawn at random from a Normal distribution with mean 0 and variance 1

    The bias estimate is to be computed as the average of b_r − β over the R samples. Assume the estimates b_r are stored in an R-by-1 vector named BDist. You may declare additional scalar, vector, or matrix variables as needed. Use the symbol ← to denote assignment and use = to denote a test for equality. (Refer to the pseudo-code example from Session 3 as a style guide.)

    Suggested solution

    [Marking guide: I would suggest allocating 4 marks for the pseudo-code, 8 marks for the Matlab implementation, 8 marks for the analysis. The bonus question is worth 5 marks.]

    For r = 1, 2, …, R:
        For t = 1, 2, …, T:
            y_rt ← β + σ·randn()    // need to multiply by σ to get Var(v_rt) = σ²
        .
        ysample ← (y_r1, y_r2, …, y_rT)    // collect the observations for sample r in a vector
        b_r ← B(ysample)
    .
    bias ← (Σ_{r=1,…,R} (b_r − β)) / R

    2. Implement your pseudo-code in Matlab. Begin by creating two M-files:

    mc.m A script file that implements the Monte Carlo simulation steps.

    B.m A function file that implements the estimator B(y). The function should take a vector as input and return a scalar value. (Leave B.m as an empty stub for now; the specific estimation rules will be defined later.)

    Your main script should call the function B.m to obtain estimates of β. Use the built-in Matlab function randn() to draw values of v_t. Include the following line of code to initialize the Matlab random number generator:

        randn('state', 1);


    Suggested solution:

        % mc.m
        %
        % A simple Monte Carlo simulation for estimating the mean.
        % Assumes a normal distribution for the error term.
        %
        % Ming Kang
        % February 2006

        clear all;

        % Set parameters
        R = 100;
        T = round(10^4.5);
        beta = 2;
        sigma2 = 4;

        % Initialize variables
        BDist = zeros(R,1);
        y = zeros(T,1);

        % Set random-number generator
        randn('state', 1);

        % Simulation step
        for r = 1:R
            y = beta + sqrt(sigma2)*randn(T,1);
            BDist(r) = B(y);
        end

        % Calculate sampling statistics
        mu = mean(BDist);
        bias = mu - beta;
        mse = mean((BDist - beta).^2);

        hist(BDist, 20);

        % B.m
        %
        function b = B(y)
        b = mean(y);
        % b = median(y);
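    In line with the guideline on vectorized code, the simulation step can also be written without the loop over r by drawing all R samples at once. The following is a sketch under assumptions, not part of the suggested solution: it uses T = 25 as in the analysis questions, the names Y, BDist_mean and BDist_median are hypothetical, no seed is set, and it assumes a T-by-R matrix fits comfortably in memory.

```matlab
% Vectorized Monte Carlo step (sketch): all R samples are drawn at once.
% Assumption: T*R is small enough for a T-by-R matrix to fit in memory.
R = 100;  T = 25;  beta = 2;  sigma2 = 4;

Y = beta + sqrt(sigma2) * randn(T, R);   % column r holds sample r
BDist_mean   = mean(Y, 1)';              % R-by-1 vector of sample means
BDist_median = median(Y, 1)';            % R-by-1 vector of sample medians

bias_mean   = mean(BDist_mean)   - beta;
bias_median = mean(BDist_median) - beta;
mse_mean    = mean((BDist_mean   - beta).^2);
mse_median  = mean((BDist_median - beta).^2);
fprintf('mean:   bias = %.4f, MSE = %.4f\n', bias_mean, mse_mean);
fprintf('median: bias = %.4f, MSE = %.4f\n', bias_median, mse_median);
```

    For large T (such as T = round(10^4.5) in mc.m) the loop version may be preferable because it holds only one sample in memory at a time.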


    B. Analysis

    1. Explain how the Monte Carlo algorithm is affected by each of the three types of

    numerical error discussed in class. Which do you think is the greatest source of error and

    why? Describe the effect of R on the accuracy of the algorithm.

    2. Estimate β with the sample mean: that is, B.m should compute the average of the values in the input vector. Set β = 2, σ² = 4, R = 100, T = 25 and run the Monte Carlo experiment. Use the hist command to plot a histogram of BDist (use bins of size 1) and comment on its shape: is the distribution symmetric, smooth, bell-shaped, etc.? Report the mean and variance of BDist.

    3. Repeat Question 2 but this time estimate β with the sample median; that is, B.m should compute the median (i.e. the 50th percentile) of the values in the input vector. Compare your results with Question 2 and comment on any differences.

    4. Repeat the experiments in Questions 2 and 3 for T = 10^k; k = 2.0, 2.5, 3.0, …, 5.0. For each value of T, estimate the mean-squared error (MSE) of each estimator: that is, calculate the average value of (b_r − β)² over the R estimates. Use the loglog command to plot the MSE of both estimators against T and comment on any trend that may be apparent. Compute the ratio of the MSE for the sample median to the MSE for the sample mean; record the values for each T in a table. What happens to the MSE ratio as T grows large?

    5. (Bonus question) Consider the same DGP but now suppose v_t is drawn from the Student's t-distribution instead of the Normal distribution. The Student's t-distribution depends on a degrees-of-freedom parameter, d = 1, 2, …. The procedure for sampling from the Student's t-distribution with d degrees of freedom is the following:

    i. Sample d+1 values from the standard Normal distribution (mean 0, variance 1); call these z_n; n = 1, 2, …, d+1

    ii. Compute the ratio:

    $t = \dfrac{z_{d+1}}{\sqrt{\frac{1}{d} \sum_{n=1}^{d} z_n^2}}$

    Modify your code for mc.m so that the v_t's are generated using this procedure. Repeat the steps in Question 4 and fill in the table below with values for the MSE ratio. Compare your results for T = 25 with those from Question 4; comment on the effects of d and T.

                           Degrees of freedom (d)
        No. of obs. (T)    3          6          9
        10
        25
        100


    Suggested solution:

    1. The three sources of numeric error discussed in class were modeling error, approximation error, and round-off error. With Monte Carlo experiments, modeling error is negligible by design: the experimenter chooses the parameters, distributions, etc. associated with the data generating process and thus there is an exact correspondence between the statistical model and the "truth". (Whether the model generates realistic data is not of direct interest in this problem.) Approximation error is present because the simulated sampling distribution is an estimate; we only observe R discrete values of the sample statistic where in fact the statistic has a continuous distribution over an infinity of values. (There is also approximation error in how the normal random variable is generated from a sequence of pseudo-random numbers but this is less significant.) Round-off error is probably small given that the range of computed numbers is not extreme and also averaging tends to smooth out these errors. Therefore, approximation error seems to be the most significant source of numeric error. The parameter R controls the number of replicated draws from the sampling distribution; the higher is R, the better is the discrete approximation to the true sampling distribution.
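    The effect of R can be illustrated numerically. The sketch below is an illustration under assumed settings, not part of the suggested solution: T = 25, the values in Rvals and the number of repetitions nrep are hypothetical choices, and no seed is set. It repeats the whole experiment several times for each R and reports how much the bias estimate for the sample mean varies across repetitions; for this DGP the standard deviation of that estimate is sqrt(σ²/(T·R)), so it should halve each time R is quadrupled.

```matlab
% How the simulation noise in the bias estimate depends on R (sketch).
% Assumptions: T, Rvals and nrep are illustrative choices.
T = 25;  beta = 2;  sigma2 = 4;
Rvals = [100 400 1600];      % each step quadruples R
nrep  = 200;                 % independent repetitions of the whole experiment

sd_bias = zeros(size(Rvals));
for i = 1:numel(Rvals)
    R = Rvals(i);
    bias = zeros(nrep, 1);
    for rep = 1:nrep
        Y = beta + sqrt(sigma2) * randn(T, R);   % R samples of size T
        bias(rep) = mean(mean(Y, 1)) - beta;     % bias estimate for the sample mean
    end
    sd_bias(i) = std(bias);                      % spread across repetitions
    fprintf('R = %5d: std of bias estimate = %.4f (theory %.4f)\n', ...
            R, sd_bias(i), sqrt(sigma2/(T*R)));
end
```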

    2. Results for the sample mean with T = 25:

    [Figure: histogram of BDist for the sample mean. Horizontal axis from 0.5 to 3; frequencies from 0 to 15.]

    Here, BDist has mean 1.9514 and variance 0.1644. As illustrated in the histogram, the sampling distribution is rough but it appears to be single-peaked and not noticeably skewed.


    3. Results for the sample median with T=25:

    [Figure: histogram of BDist for the sample median. Horizontal axis from 0.5 to 3.5; frequencies from 0 to 15.]

    Now BDist has mean 1.9274 and variance 0.2550. Thus the sample median appears to have greater bias and higher variance. The distribution as illustrated in the histogram is noticeably choppier and more spread out than before.

    4. Results:

        log10(T)   MSE for the mean   MSE for the median   MSE ratio (median/mean)
        2.0        0.0442581          0.0694461            1.56912
        2.5        0.0111912          0.0196407            1.75502
        3.0        0.0043804          0.0068140            1.55557
        3.5        0.0013336          0.0022508            1.68768
        4.0        0.0003182          0.0005183            1.62867
        4.5        0.0001036          0.0001838            1.77490
        5.0        0.0000382          0.0000565            1.47814

    For both estimators, the loglog plot suggests log10(MSE) is a linear function of log10(T) with negative slope, which suggests that MSE declines as a power of T (roughly in proportion to 1/T, since the MSE falls by about a factor of 10 for each unit increase in log10(T)). The position of the trend is lower for the sample mean; that is, the MSE for the sample median is always greater than for the sample mean. The MSE ratio is greater than 1 for all T; the ratio fluctuates as T increases, and no increasing or decreasing trend is apparent.
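    The slope of the log-log relationship can be checked with a short script. The sketch below is one possible way to run the Question 4 experiment, not the code used to produce the table: variable names are illustrative and no seed is set, so the numbers will differ from those reported. It fits a straight line to log10(MSE) against log10(T) with polyfit; a slope near -1 corresponds to MSE roughly proportional to 1/T. For reference, with Normal errors the large-sample variance of the median exceeds that of the mean by a factor of π/2 ≈ 1.57, which is the range the tabulated ratios fluctuate around.

```matlab
% MSE of the sample mean and median for T = 10^k (sketch).
% Assumptions: variable names are illustrative; no random seed is set.
R = 100;  beta = 2;  sigma2 = 4;
kvals = 2:0.5:5;
Tvals = round(10.^kvals);

mse = zeros(numel(Tvals), 2);            % column 1: mean, column 2: median
for i = 1:numel(Tvals)
    T = Tvals(i);
    b = zeros(R, 2);
    for r = 1:R
        y = beta + sqrt(sigma2) * randn(T, 1);
        b(r, :) = [mean(y), median(y)];
    end
    mse(i, :) = mean((b - beta).^2, 1);
end
ratio = mse(:, 2) ./ mse(:, 1);          % MSE(median) / MSE(mean)

% Slope of log10(MSE) on log10(T); a slope near -1 means MSE is roughly proportional to 1/T
c_mean   = polyfit(log10(Tvals(:)), log10(mse(:, 1)), 1);
c_median = polyfit(log10(Tvals(:)), log10(mse(:, 2)), 1);
fprintf('slope (mean) = %.3f, slope (median) = %.3f\n', c_mean(1), c_median(1));
disp([kvals(:), mse, ratio]);
```

    Plotting the two columns of mse against Tvals with loglog would give a figure comparable to the one in the suggested solution.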


    [Figure: log-log plot of MSE against T. Horizontal axis: T (10^2 to 10^5); vertical axis: MSE (10^-5 to 10^-1); MSE for the mean (solid), median (dash).]

    5. Results for R = 100:

                           Degrees of freedom (d)
        No. of obs. (T)    3          6          9
        10                 0.63276    1.07076    0.96472
        25                 0.90520    0.96347    1.22535
        100                0.58819    1.29718    1.19873

    Notice that the median has lower MSE than the mean in some cases, which contrasts with the earlier result that the MSE ratio was always greater than 1. No clear trends are apparent except for the case of T = 25, in which the MSE ratio increases with d; this makes sense as the t-distribution looks more like the normal distribution as d gets large.

    Repeating the experiment (not required) with R = 1000 confirms this result:

        (R = 1000)         Degrees of freedom (d)
        No. of obs. (T)    3          6          9
        10                 0.58435    1.03613    1.20269
        25                 0.68058    1.12398    1.30109
        100                0.58292    1.17798    1.25101


    The new code:

        % mc2.m
        %
        % A simple Monte Carlo simulation for estimating the mean.
        % Assumes a t-distribution for the error term.
        %
        % Ming Kang
        % February 2006

        clear all;

        % Set parameters
        R = 1000;
        T = 100;
        beta = 2;
        d = 3;

        % Initialize variables
        BDist1 = zeros(R,1);   % sample mean
        BDist2 = zeros(R,1);   % sample median
        y = zeros(T,1);

        % Set random-number generator
        randn('state', 1);

        % Simulation step
        for r = 1:R
            numer = randn(T,1);
            % The 1/d factor under the square root is omitted here; this rescales z
            % by the constant 1/sqrt(d) and leaves the MSE ratio unchanged.
            denom = sqrt(sum(randn(T,d).^2, 2));
            z = numer./denom;
            y = beta + z;
            BDist1(r) = mean(y);
            BDist2(r) = median(y);
        end

        % Calculate sampling statistics
        mse1 = mean((BDist1 - beta).^2);
        mse2 = mean((BDist2 - beta).^2);
        mse_r = mse2/mse1
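    To fill in the whole table in one run, the script can be wrapped in loops over d and T. The version below is a sketch under assumptions rather than the author's code: the loop structure and the names dvals, Tvals and mse_ratio are hypothetical, all R samples are drawn at once as a T-by-R matrix, the 1/d factor from the stated sampling procedure is included, and no seed is set, so the output will not reproduce the tabulated values exactly.

```matlab
% MSE ratio (median/mean) under Student's t errors for several d and T (sketch).
% Assumptions: loop structure and variable names are illustrative; no seed is set.
R = 1000;  beta = 2;
dvals = [3 6 9];
Tvals = [10 25 100];

mse_ratio = zeros(numel(Tvals), numel(dvals));
for i = 1:numel(Tvals)
    for j = 1:numel(dvals)
        T = Tvals(i);  d = dvals(j);
        % T-by-R matrix of t draws: z_{d+1} / sqrt((1/d) * sum of d squared normals)
        numer = randn(T, R);
        denom = sqrt(sum(randn(T, R, d).^2, 3) / d);
        Y = beta + numer ./ denom;
        mse1 = mean((mean(Y, 1)   - beta).^2);   % sample mean
        mse2 = mean((median(Y, 1) - beta).^2);   % sample median
        mse_ratio(i, j) = mse2 / mse1;
    end
end
disp(mse_ratio);   % rows: T = 10, 25, 100; columns: d = 3, 6, 9
```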