Ch4. Variance Reduction Techniques

Zhang Jin-Ting
Department of Statistics and Applied Probability
July 17, 2012



Outline

- Introduction
- The Basic Problem
- Variance Reduction Techniques

Introduction

- This chapter aims to improve the Monte Carlo integration estimator by reducing its variance using several useful techniques:
  - Stratified Sampling
  - Importance Sampling
  - Control Variates Method
  - Antithetic Variates Method


The Integration Problem

- Suppose we want to estimate an integral over some region, such as

    I_A = \int_S k(x)\,dx,

  where S is a subset of R^d, x denotes a generic point of R^d, and k is a given real-valued function on S; or

    I_B = \int_{R^d} h(x) f(x)\,dx,

  where h is a real-valued function on R^d and f is a given pdf on R^d.

The Transformed Problem: Monte Carlo Integration

- It is clear that I_B can be written as an expectation: I_B = E(h(X)), where X ~ f.
- Also, extend the definition of k to all of R^d by setting k(x) = 0 for every x that is not in S; then

    I_A = \int_{R^d} k(x)\,dx = \int_{R^d} \frac{k(x)}{f(x)} f(x)\,dx = E\left[\frac{k(X)}{f(X)}\right].   (1)


- Notice that k(x)/f(x) is well-defined except where f equals 0, which is a set of probability 0.
- This is a simple trick that will be especially useful in the method known as Importance Sampling.


Simple Sampling

- This leads to a natural Monte Carlo strategy for estimating the value of I_B, say.
- If we can generate iid random variables X_1, X_2, ... whose common pdf is f, then for every n,

    I_n = \frac{1}{n} \sum_{i=1}^{n} h(X_i)

  is an unbiased estimator of I_B.


- Moreover, the strong law of large numbers implies that I_n converges to I_B with probability 1 as n \to \infty.
- This method for estimating I_B is called simple sampling.

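The simple sampling estimator above can be sketched in a few lines of Python. This is a minimal illustration, not from the slides; the particular choices h(x) = e^x and X ~ U[0,1] (so that I_B = e - 1) are assumptions made only for demonstration:

```python
import math
import random

def simple_sampling(h, sample, n):
    """Unbiased Monte Carlo estimate of I_B = E[h(X)], X drawn by sample()."""
    return sum(h(sample()) for _ in range(n)) / n

random.seed(0)
# Illustrative target: I_B = E[h(U)] with U ~ U[0,1] and h(x) = exp(x),
# whose true value is e - 1.
est = simple_sampling(math.exp, random.random, 200_000)
print(est)  # close to e - 1
```

By the strong law of large numbers, increasing n drives the estimate toward the true integral with probability 1.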

The Variance Reduction Problem

- The variance of the simple sampling estimator I_n of I_B is

    var(I_n) = \frac{var(h(X))}{n} = \frac{\int_S h(x)^2 f(x)\,dx - I_B^2}{n}.   (2)

- The variance of the estimator determines the size of the confidence interval.
- The n in the denominator is hard to avoid in Monte Carlo, but there are various ways to reduce the numerator.


- The goal of this chapter is to explore alternative sampling schemes that can achieve smaller variance for the same amount of computational effort.

Stratified Sampling

Step 1: Range Partition

- Stratified sampling is a powerful and commonly used technique in population surveys and is also very useful in Monte Carlo computations.
- To evaluate I_B, stratified sampling partitions S into several disjoint sets S^{(1)}, ..., S^{(M)} (so that S = \cup_{i=1}^{M} S^{(i)}).


- For i = 1, ..., M, let

    a_i = \int_{S^{(i)}} f(x)\,dx = P(X \in S^{(i)}).

- Observe that a_1 + ... + a_M = 1. Fix integers n_1, ..., n_M such that n_1 + ... + n_M = n.


Step 2: Sub-sampling

- For each i, generate n_i samples X^{(i)}_1, ..., X^{(i)}_{n_i} from S^{(i)} having the conditional pdf

    g(x) = f(x)/a_i  if x \in S^{(i)},  and 0 otherwise.

- Let T_i = n_i^{-1} \sum_{j=1}^{n_i} h(X^{(i)}_j). Then

    E(T_i) = \int_{S^{(i)}} h(x) \frac{f(x)}{a_i}\,dx = \frac{1}{a_i} \int_{S^{(i)}} h(x) f(x)\,dx = I_i / a_i,

  defining I_i = \int_{S^{(i)}} h(x) f(x)\,dx.

Step 3: The Stratified Estimator

- Observe that I_1 + ... + I_M = I_B. The stratified estimator is

    T = \sum_{i=1}^{M} a_i T_i.

- It is unbiased because

    E(T) = \sum_{i=1}^{M} a_i E(T_i) = \sum_{i=1}^{M} a_i I_i / a_i = I_B.


- The variance of T is

    var(T) = \sum_{i=1}^{M} a_i^2 var(T_i),

  where, following from (2),

    var(T_i) = \frac{\int_{S^{(i)}} h(x)^2 \frac{f(x)}{a_i}\,dx - (I_i/a_i)^2}{n_i}.

Theorem (The Foundation of Stratified Sampling)
If n_i = n a_i for i = 1, ..., M, then the stratified estimator has smaller variance than the simple estimator I_n. In fact,

    var(I_n) = var(T) + \frac{1}{n} \sum_{i=1}^{M} a_i \left(\frac{I_i}{a_i} - I_B\right)^2.

- The choice n_i = n a_i, called "proportional allocation", gives a stratified estimator which has smaller variance than the simple estimator.

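The three steps (partition, sub-sample, combine with weights a_i) can be sketched in Python. This is an illustrative sketch, not code from the slides; it assumes X ~ U[0,1] split into M equal-probability strata (so a_i = 1/M) with proportional allocation n_i = n a_i, and the target \int_0^1 e^x dx = e - 1 is chosen only for demonstration:

```python
import math
import random

def stratified_estimate(h, n, M):
    """Stratified estimator T = sum_i a_i * T_i with proportional allocation,
    for X ~ U[0,1] partitioned into M equal strata S_i = [i/M, (i+1)/M]."""
    random.seed(1)
    a = 1.0 / M        # a_i = P(X in S_i); equal for uniform f and equal strata
    n_i = n // M       # proportional allocation n_i = n * a_i
    T = 0.0
    for i in range(M):
        lo = i / M
        # Step 2: draw from the conditional pdf on S_i (uniform on [lo, lo+a]).
        Ti = sum(h(lo + a * random.random()) for _ in range(n_i)) / n_i
        T += a * Ti    # Step 3: weight each stratum mean by a_i
    return T

# Illustrative target: I = \int_0^1 e^x dx = e - 1.
T = stratified_estimate(math.exp, 100_000, 10)
print(T)
```

Per the theorem, with proportional allocation the between-strata variation term is removed, so this estimator's variance is never larger than that of simple sampling with the same n.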

Importance Sampling

Properties of Importance Sampling

- Importance sampling is a very powerful method that can improve Monte Carlo efficiency by orders of magnitude in some problems.
- But it requires caution: an inappropriate implementation can reduce efficiency by orders of magnitude!


The Basic Idea

- The method works by sampling from an artificial probability distribution that is chosen by the user, and then reweighting the observations to get an unbiased estimate.
- The idea is based on the identity (1):

    I_A = \int_{R^d} k(x)\,dx = \int_{R^d} \frac{k(x)}{f(x)} f(x)\,dx = E\left[\frac{k(X)}{f(X)}\right].


- It implies that I_A can be estimated by

    J_n = \frac{1}{n} \sum_{i=1}^{n} \frac{k(X_i)}{f(X_i)},

  where the X_i are iid from f.
- We call J_n the importance sampling estimator based on f.
- The identity (1) implies that J_n is unbiased.


The Importance Sampling Procedure

- Suppose now one is interested in evaluating

    I_B = \int_{R^d} h(x) f(x)\,dx.

  The importance sampling procedure is as follows:
  (a) Draw X_1, ..., X_n from a trial density g.
  (b) Calculate the importance weights w_j = f(X_j)/g(X_j), for j = 1, ..., n.
  (c) Approximate I_B by

    J_{g,n} = \frac{\sum_{j=1}^{n} w_j h(X_j)}{\sum_{j=1}^{n} w_j}.   (3)

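Steps (a)-(c) can be sketched directly in Python. This is a hedged illustration, not from the slides; the target f = N(0,1) with h(x) = x^2 (so I_B = E[X^2] = 1) and the wider trial density g = N(0, 2^2) are assumptions chosen only to make the example concrete:

```python
import math
import random

def norm_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def importance_sampling(h, f_pdf, g_pdf, g_sample, n):
    """Self-normalized importance sampling estimate (3) of E_f[h(X)]."""
    xs = [g_sample() for _ in range(n)]           # (a) draw from trial density g
    ws = [f_pdf(x) / g_pdf(x) for x in xs]        # (b) importance weights w_j
    return sum(w * h(x) for w, x in zip(ws, xs)) / sum(ws)  # (c) J_{g,n}

random.seed(2)
# Illustrative target: E[X^2] = 1 for X ~ N(0,1), estimated via trial N(0, 4).
est = importance_sampling(lambda x: x * x,
                          norm_pdf,
                          lambda x: norm_pdf(x, 0.0, 2.0),
                          lambda: random.gauss(0.0, 2.0),
                          100_000)
print(est)
```

Here the trial density has heavier tails than f, so the weights f/g stay bounded; as the next slide notes, the error depends on how closely g matches the shape of h(x)f(x).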

- Thus, in order to make the estimation error small, one wants to choose g as "close" in shape to h(x)f(x) as possible.

An Alternative Importance Sampling Procedure

- A major advantage of using (3) instead of the unbiased estimate

    \hat{I}_B = \frac{1}{n} \sum_{j=1}^{n} w_j h(X_j)

  is that with the former we need only know the ratio f(X)/g(X) up to a multiplicative constant, whereas with the latter the ratio needs to be known exactly.
- Although it introduces a small bias, (3) often has a smaller mean squared error than the unbiased estimate \hat{I}_B.

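The constant-cancellation property of (3) is easy to check numerically. The sketch below (an illustration with assumed ingredients, not from the slides) uses a uniform trial density on [0,1] and an unnormalized target f(x) ∝ x^2, i.e. the Beta(3,1) density 3x^2 with its constant deliberately dropped; the self-normalized estimate of E_f[X] = 3/4 is unchanged whether or not the constant is included:

```python
import random

def self_normalized(h, f_unnorm, g_pdf, xs):
    """J_{g,n} from (3): any constant factor in f cancels between
    the numerator and the denominator."""
    ws = [f_unnorm(x) / g_pdf(x) for x in xs]
    return sum(w * h(x) for w, x in zip(ws, xs)) / sum(ws)

random.seed(3)
g = lambda x: 1.0                    # trial density: U[0,1]
xs = [random.random() for _ in range(50_000)]
f_unnorm = lambda x: x ** 2          # proportional to the Beta(3,1) pdf 3x^2
h = lambda x: x                      # target: E_f[X] = 3/4

est1 = self_normalized(h, f_unnorm, g, xs)             # constant omitted
est2 = self_normalized(h, lambda x: 3 * x ** 2, g, xs) # constant included
print(est1, est2)                                      # identical values
```

The unbiased form \hat{I}_B, by contrast, would return three times the correct answer if the constant 3 were dropped.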

Example 1

- Let h(x) = 4\sqrt{1 - x^2}, x \in [0, 1]. Let us imagine that we do not know how to evaluate I = \int_0^1 h(x)\,dx (which is \pi, of course).

Using Simple Sampling

- The simple sampling estimate is

    I_n = \frac{1}{n} \sum_{i=1}^{n} 4\sqrt{1 - U_i^2},

  where the U_i are iid U[0,1] random variables.
- This is unbiased, with variance

    var(I_n) = \frac{1}{n}\left(\int_0^1 h(x)^2\,dx - I^2\right) = \frac{1}{n}\left(\int_0^1 16(1 - x^2)\,dx - \pi^2\right) = \frac{0.797}{n}.


Using Inappropriate Importance Sampling

- Consider the importance sampling estimate based on the pdf g_b(x) = 2x, x \in [0, 1].
- It is easy to generate Y_i ~ g_b (the cdf is F(t) = t^2, so we can set Y_i = F^{-1}(U_i) = \sqrt{U_i}, where U_i ~ U[0, 1]).
- The importance sampling estimator is

    J_n^{(b)} = \frac{1}{n} \sum_{i=1}^{n} h(Y_i)/g_b(Y_i) = \frac{1}{n} \sum_{i=1}^{n} \frac{4\sqrt{1 - Y_i^2}}{2 Y_i}.


- J_n^{(b)} has mean I and variance

    var(J_n^{(b)}) = \frac{1}{n} var\left(\frac{h(Y)}{g_b(Y)}\right) = \frac{1}{n} \int_0^1 \left(\frac{h(x)}{g_b(x)} - I\right)^2 g_b(x)\,dx = +\infty,

  since the integrand behaves like 8/x near x = 0.
- Hence, the trial density g_b(x) = 2x is very bad, and we need to try a different one.


Using Appropriate Importance Sampling

- Let g_c(x) = (4 - 2x)/3, x \in [0, 1].
- The importance sampling estimator is

    J_n^{(c)} = \frac{1}{n} \sum_{i=1}^{n} \frac{4\sqrt{1 - Y_i^2}}{(4 - 2Y_i)/3},

  whose variance is

    var(J_n^{(c)}) = \frac{1}{n} var\left(\frac{h(Y)}{g_c(Y)}\right) = \frac{1}{n} \int_0^1 \left(\frac{h(x)}{g_c(x)} - I\right)^2 g_c(x)\,dx
                   = \frac{1}{n}\left[\int_0^1 \frac{16(1 - x^2)}{(4 - 2x)/3}\,dx - \pi^2\right] = 0.224/n.


- Thus, the importance sampling estimate of (c) can achieve the same size confidence interval as the simple sampling estimate of (a) while using only about one third as many generated random variables (0.224/0.797 ≈ 0.28).
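Example 1 can be reproduced in a short Python sketch comparing the two estimators (illustrative code, not from the slides; the inverse-cdf formula t = 2 - \sqrt{4 - 3u} for sampling from g_c is derived here from G(t) = (4t - t^2)/3 and is not given in the slides):

```python
import math
import random

random.seed(4)
n = 100_000
h = lambda x: 4.0 * math.sqrt(1.0 - x * x)

# (a) Simple sampling: variance 0.797/n.
I_simple = sum(h(random.random()) for _ in range(n)) / n

# (c) Importance sampling with g_c(x) = (4 - 2x)/3: variance 0.224/n.
# Inverse-cdf sampling: G(t) = (4t - t^2)/3 = u  gives  t = 2 - sqrt(4 - 3u).
g_c = lambda x: (4.0 - 2.0 * x) / 3.0

def draw_gc():
    return 2.0 - math.sqrt(4.0 - 3.0 * random.random())

I_imp = sum(h(y) / g_c(y) for y in (draw_gc() for _ in range(n))) / n

print(I_simple, I_imp)  # both close to pi
```

Note that the badly chosen trial g_b(x) = 2x is deliberately not run here: its estimator has infinite variance, so individual runs can be wildly off no matter how large n is.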

Control Variates Method

The Main Idea

- In this method, one uses a control variate C, which is correlated with the sample X, to produce a better estimate.

The Procedure

- Suppose the estimation of \mu = E(X) is of interest and \mu_C = E(C) is known.
- Then we can construct Monte Carlo samples of the form

    X(b) = X - b(C - \mu_C),

  which have the same mean as X, but a new variance

    var(X(b)) = var(X) - 2b\,Cov(X, C) + b^2 var(C).


- If the computation of Cov(X, C) and var(C) is easy, then we can let b = Cov(X, C)/var(C), in which case

    var(X(b)) = (1 - \rho_{XC}^2)\,var(X) < var(X).

A Special Case

- Another situation is when we know only that E(C) is equal to \mu. Then we can form X(b) = bX + (1 - b)C.
- It is easy to show that if C is correlated with X, we can always choose a proper b so that X(b) has a smaller variance than X.

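A minimal control-variates sketch in Python (an illustration, not from the slides): the target \mu = E[e^U] = e - 1 with U ~ U[0,1] and the control C = U with known \mu_C = 1/2 are assumed choices, and the optimal coefficient b = Cov(X, C)/var(C) is estimated from the sample itself:

```python
import math
import random

random.seed(5)
n = 100_000
us = [random.random() for _ in range(n)]
xs = [math.exp(u) for u in us]   # X = h(U); target mu = E[e^U] = e - 1
cs = us                          # control variate C = U with known mu_C = 1/2

x_bar = sum(xs) / n
c_bar = sum(cs) / n
cov_xc = sum((x - x_bar) * (c - c_bar) for x, c in zip(xs, cs)) / n
var_c = sum((c - c_bar) ** 2 for c in cs) / n
b = cov_xc / var_c               # b = Cov(X, C)/var(C), estimated empirically

# X(b) = X - b(C - mu_C): same mean, variance shrunk to (1 - rho^2) var(X).
est = x_bar - b * (c_bar - 0.5)
print(est)
```

Because e^U and U are very highly correlated, the residual variance (1 - \rho^2) var(X) here is a small fraction of var(X), so the corrected mean is far more accurate than x_bar alone. (Estimating b from the same sample introduces a small bias, negligible for large n.)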

Antithetic Variates Method

The Main Idea

- Suppose U is a random number used to produce a sample X from a distribution with cdf F, that is, X = F^{-1}(U); then X' = F^{-1}(1 - U) also follows distribution F.
- More generally, if g is a monotone function, then

    [g(u_1) - g(u_2)][g(1 - u_1) - g(1 - u_2)] \le 0

  for any u_1, u_2 \in [0, 1].


- For two independent uniform random variables U_1 and U_2, we have

    E\{[g(U_1) - g(U_2)][g(1 - U_1) - g(1 - U_2)]\} = 2\,Cov(X, X') \le 0,

  where X = g(U) and X' = g(1 - U).
- Therefore, var[(X + X')/2] \le var(X)/2, implying that using the pair X and X' is better than using two independent Monte Carlo draws for estimating E(X).


Example 2

- We return once more to the problem of estimating the integral I = \int_0^1 4\sqrt{1 - x^2}\,dx.
- Choose a large even value of n. As usual, our simple estimator and its variance are

    I_n = \frac{1}{n} \sum_{i=1}^{n} h(U_i),   var(I_n) = 0.797/n.


- The corresponding antithetic estimator and its variance are

    I_n^{An} = \frac{1}{n} \sum_{i=1}^{n/2} [h(U_i) + h(1 - U_i)],

    var(I_n^{An}) = \frac{1}{n^2} \cdot \frac{n}{2} [var(h(U_1)) + 2\,Cov(h(U_1), h(1 - U_1)) + var(h(1 - U_1))]
                  = \frac{1}{n} [var(h(U_1)) + Cov(h(U_1), h(1 - U_1))]
                  = 0.219/n.
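Example 2 can be sketched in Python as follows (an illustration, not code from the slides; h is the quarter-circle integrand from Example 1, and since h is monotone on [0,1] the antithetic pairing is guaranteed to reduce variance):

```python
import math
import random

random.seed(6)
n = 100_000  # a large even value of n
h = lambda x: 4.0 * math.sqrt(1.0 - x * x)

# Antithetic estimator: average h(U) and h(1 - U) over n/2 pairs,
# so each uniform draw is used twice with negatively correlated outputs.
total = 0.0
for _ in range(n // 2):
    u = random.random()
    total += h(u) + h(1.0 - u)
I_an = total / n
print(I_an)  # close to pi; variance 0.219/n vs 0.797/n for simple sampling
```

For the same number n of function evaluations, this roughly quarters the variance of the simple estimator while using only half as many generated uniforms.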