18.600: Lecture 30: Weak law of large numbers
Source: math.mit.edu/~sheffield/600/Lecture30.pdf
TRANSCRIPT
18.600: Lecture 30
Weak law of large numbers
Scott Sheffield
MIT
18.600 Lecture 30
Outline
Weak law of large numbers: Markov/Chebyshev approach
Weak law of large numbers: characteristic function approach
Markov’s and Chebyshev’s inequalities

- Markov’s inequality: Let X be a random variable taking only non-negative values. Fix a constant a > 0. Then P{X ≥ a} ≤ E[X]/a.
- Proof: Consider the random variable Y defined by Y = a when X ≥ a and Y = 0 when X < a. Since X ≥ Y with probability one, it follows that E[X] ≥ E[Y] = a P{X ≥ a}. Divide both sides by a to get Markov’s inequality.
- Chebyshev’s inequality: If X has finite mean µ, variance σ², and k > 0, then P{|X − µ| ≥ k} ≤ σ²/k².
- Proof: Note that (X − µ)² is a non-negative random variable and P{|X − µ| ≥ k} = P{(X − µ)² ≥ k²}. Now apply Markov’s inequality with a = k².
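Markov’s inequality is easy to check numerically. The sketch below is an illustration, not part of the lecture: it uses an Exponential(1) variable (an arbitrary choice of non-negative distribution, with true mean 1), so at a = 3 the bound is 1/3 while the true tail probability is e^{-3} ≈ 0.05.

```python
import random

def markov_check(a, trials=100_000):
    """Empirically check Markov's inequality P{X >= a} <= E[X]/a
    for X ~ Exponential(1), whose true mean is 1."""
    xs = [random.expovariate(1.0) for _ in range(trials)]
    mean = sum(xs) / trials                    # sample estimate of E[X]
    tail = sum(x >= a for x in xs) / trials    # empirical P{X >= a}
    return tail, mean / a

tail, bound = markov_check(3.0)
print(tail, bound)  # tail is near e^{-3} ~ 0.05; bound is near 1/3
```

The bound is far from tight here, which is typical: Markov uses only the mean, so it must cover every non-negative distribution with that mean.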
Markov and Chebyshev: rough idea

- Markov’s inequality: Let X be a random variable taking only non-negative values with finite mean. Fix a constant a > 0. Then P{X ≥ a} ≤ E[X]/a.
- Chebyshev’s inequality: If X has finite mean µ, variance σ², and k > 0, then P{|X − µ| ≥ k} ≤ σ²/k².
- These inequalities let us deduce limited information about a distribution when we know only the mean (Markov) or the mean and variance (Chebyshev).
- Markov: if E[X] is small, then it is not too likely that X is large.
- Chebyshev: if σ² = Var[X] is small, then it is not too likely that X is far from its mean.
Statement of weak law of large numbers

- Suppose the Xi are i.i.d. random variables with mean µ.
- Then the value An := (X1 + X2 + ... + Xn)/n is called the empirical average of the first n trials.
- We’d guess that when n is large, An is typically close to µ.
- Indeed, the weak law of large numbers states that for all ε > 0 we have lim_{n→∞} P{|An − µ| > ε} = 0.
- Example: as n tends to infinity, the probability of seeing more than .50001n heads in n fair coin tosses tends to zero.
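The coin-toss statement can be watched happening in simulation. A rough sketch (the tolerance ε = 0.01 and the trial counts are arbitrary choices for illustration):

```python
import random

def deviation_prob(n, eps=0.01, trials=1_000):
    """Estimate P{|A_n - 0.5| > eps}, where A_n is the fraction of
    heads in n fair coin tosses."""
    exceed = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(n))
        if abs(heads / n - 0.5) > eps:
            exceed += 1
    return exceed / trials

# The deviation probability shrinks toward 0 as n grows.
for n in [100, 1_000, 5_000]:
    print(n, deviation_prob(n))
```

For small n the empirical average misses the 0.01-window around 1/2 most of the time; by n = 5000 it rarely does, matching the limit statement.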
Proof of weak law of large numbers in finite variance case

- As above, let Xi be i.i.d. random variables with mean µ, and write An := (X1 + X2 + ... + Xn)/n.
- By additivity of expectation, E[An] = µ.
- Similarly, Var[An] = nσ²/n² = σ²/n.
- By Chebyshev, P{|An − µ| ≥ ε} ≤ Var[An]/ε² = σ²/(nε²).
- No matter how small ε is, the right-hand side tends to zero as n gets large.
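The Chebyshev step above can be checked numerically. A minimal sketch, assuming fair coin tosses (µ = 1/2, σ² = 1/4) and illustrative parameters n = 400, ε = 0.05:

```python
import random

def chebyshev_vs_simulation(n, eps=0.05, trials=5_000):
    """For fair coin tosses (mu = 0.5, sigma^2 = 0.25), compare the simulated
    P{|A_n - mu| >= eps} with the Chebyshev bound sigma^2 / (n eps^2)."""
    sigma2 = 0.25
    exceed = 0
    for _ in range(trials):
        a_n = sum(random.random() < 0.5 for _ in range(n)) / n
        if abs(a_n - 0.5) >= eps:
            exceed += 1
    return exceed / trials, sigma2 / (n * eps ** 2)

prob, bound = chebyshev_vs_simulation(400)
print(prob, bound)  # simulated probability stays below the Chebyshev bound
```

Here the bound is 0.25 while the true probability is closer to 0.05; Chebyshev is loose but, crucially for the proof, it goes to zero as n grows for any fixed ε.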
Outline
Weak law of large numbers: Markov/Chebyshev approach
Weak law of large numbers: characteristic function approach
Extent of weak law

- Question: does the weak law of large numbers apply no matter what the probability distribution for X is?
- Is it always the case that if we define An := (X1 + X2 + ... + Xn)/n, then An is typically close to some fixed value when n is large?
- What if X is Cauchy?
- Recall that in this strange case An actually has the same probability distribution as X.
- In particular, the An are not tightly concentrated around any particular value even when n is very large.
- But in this case E[|X|] was infinite. Does the weak law hold as long as E[|X|] is finite, so that µ is well defined?
- Yes. One can prove this using characteristic functions.
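The Cauchy failure is striking in simulation. A sketch (sampling a standard Cauchy by the inverse-CDF trick tan(π(U − 1/2)); the particular n values are arbitrary):

```python
import math
import random

def cauchy_sample():
    """Standard Cauchy draw via the tangent of a uniform angle."""
    return math.tan(math.pi * (random.random() - 0.5))

def cauchy_average(n):
    """Empirical average A_n of n standard Cauchy samples."""
    return sum(cauchy_sample() for _ in range(n)) / n

# Unlike the coin-toss case, these averages never settle down:
# A_n is again standard Cauchy, no matter how large n is.
for n in [10, 1_000, 100_000]:
    print(n, [round(cauchy_average(n), 2) for _ in range(5)])
```

Even at n = 100,000 the averages remain as spread out as a single sample, exactly as the slide's distributional identity predicts.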
Characteristic functions

- Let X be a random variable.
- The characteristic function of X is defined by φ(t) = φ_X(t) := E[e^{itX}]. Like M(t), except with i thrown in.
- Recall that by definition e^{it} = cos(t) + i sin(t).
- Characteristic functions are similar to moment generating functions in some ways.
- For example, φ_{X+Y} = φ_X φ_Y, just as M_{X+Y} = M_X M_Y, if X and Y are independent.
- And φ_{aX}(t) = φ_X(at), just as M_{aX}(t) = M_X(at).
- And if X has an mth moment, then E[X^m] = i^{-m} φ_X^{(m)}(0).
- But characteristic functions have an advantage: they are well defined at all t for all random variables X.
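The definition φ_X(t) = E[e^{itX}] can be estimated directly by Monte Carlo. A sketch, using a fair coin on {0, 1} because its characteristic function is known exactly, φ(t) = (1 + e^{it})/2 (the sample size and t are illustrative):

```python
import cmath
import random

def char_fn_estimate(samples, t):
    """Monte Carlo estimate of phi_X(t) = E[exp(i t X)]."""
    return sum(cmath.exp(1j * t * x) for x in samples) / len(samples)

# Fair coin on {0, 1}: phi(t) = (1 + e^{it}) / 2 exactly.
samples = [random.randint(0, 1) for _ in range(50_000)]
t = 1.3
exact = (1 + cmath.exp(1j * t)) / 2
estimate = char_fn_estimate(samples, t)
print(abs(estimate - exact))  # small Monte Carlo error
```

Note the estimate exists for any distribution and any t, since |e^{itX}| = 1; this is the "well defined at all t" advantage over moment generating functions.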
Continuity theorems

- Let X be a random variable and Xn a sequence of random variables.
- Say the Xn converge in distribution or converge in law to X if lim_{n→∞} F_{Xn}(x) = F_X(x) at all x ∈ R at which F_X is continuous.
- The weak law of large numbers can be rephrased as the statement that An converges in law to µ (i.e., to the random variable that is equal to µ with probability one).
- Lévy’s continuity theorem (see Wikipedia): if lim_{n→∞} φ_{Xn}(t) = φ_X(t) for all t, then the Xn converge in law to X.
- By this theorem, we can prove the weak law of large numbers by showing lim_{n→∞} φ_{An}(t) = φ_µ(t) = e^{itµ} for all t. In the special case that µ = 0, this amounts to showing lim_{n→∞} φ_{An}(t) = 1 for all t.
18.600 Lecture 30
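One can watch φ_{A_n}(t) → e^{itμ} happen for a concrete distribution where the characteristic function is known in closed form. A sketch for X ~ Uniform[0, 2] (so μ = 1 and φ_X(t) = (e^{2it} − 1)/(2it)); the helper name `phi_uniform02` and the chosen values of t and n are illustrative:

```python
import numpy as np

def phi_uniform02(t):
    """Characteristic function of X ~ Uniform[0, 2]: (e^{2it} - 1) / (2it)."""
    return (np.exp(2j * t) - 1) / (2j * t)

t = 1.3
mu = 1.0                          # E[X] for Uniform[0, 2]
target = np.exp(1j * t * mu)      # phi_mu(t) = e^{it mu}

errs = []
for n in [10, 100, 1000, 10000]:
    phi_An = phi_uniform02(t / n) ** n  # phi_{A_n}(t) = phi_X(t/n)^n
    errs.append(abs(phi_An - target))

assert errs[-1] < 1e-3
assert all(a > b for a, b in zip(errs, errs[1:]))  # error shrinks as n grows
```

The error decays like 1/n here, consistent with the second-order (variance) term in log φ_X(t/n) that the proof below discards.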
Proof of weak law of large numbers in finite mean case

I As above, let the X_i be i.i.d. instances of a random variable X with mean zero. Write A_n := (X_1 + X_2 + ... + X_n)/n. The weak law of large numbers holds for i.i.d. instances of X if and only if it holds for i.i.d. instances of X − μ, so it suffices to prove the weak law in this mean zero case.
I Consider the characteristic function φ_X(t) = E[e^{itX}].
I Since E[X] = 0, we have φ'_X(0) = E[∂/∂t e^{itX}]_{t=0} = iE[X] = 0.
I Write g(t) = log φ_X(t), so that φ_X(t) = e^{g(t)}. Then g(0) = 0 and (by the chain rule) g'(0) = lim_{ε→0} (g(ε) − g(0))/ε = lim_{ε→0} g(ε)/ε = 0.
I Now φ_{A_n}(t) = φ_X(t/n)^n = e^{n g(t/n)}. Since g(0) = g'(0) = 0, we have lim_{n→∞} n g(t/n) = lim_{n→∞} t · g(t/n)/(t/n) = 0 for each fixed t. Thus lim_{n→∞} e^{n g(t/n)} = 1 for all t.
I By Lévy's continuity theorem, the A_n converge in law to 0 (i.e., to the random variable that is 0 with probability one).
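The key step n g(t/n) → 0 can be seen numerically for a mean-zero example, X ~ Uniform[−1, 1], whose characteristic function φ_X(t) = sin(t)/t is real and positive near 0. This distribution and the chosen t are illustrative choices, not from the lecture:

```python
import numpy as np

def g(t):
    """g(t) = log phi_X(t) for X ~ Uniform[-1, 1], where phi_X(t) = sin(t)/t."""
    return np.log(np.sin(t) / t)  # real and positive for the small t/n used below

t = 2.0
vals = [n * g(t / n) for n in [10, 100, 1000, 10000]]
# Since g(0) = g'(0) = 0, n*g(t/n) -> 0, hence phi_{A_n}(t) = e^{n g(t/n)} -> 1.
assert abs(vals[-1]) < 1e-3
assert abs(np.exp(vals[-1]) - 1) < 1e-3
```

Because Uniform[−1, 1] has finite variance, n g(t/n) in fact vanishes like −t² Var(X)/(2n); the proof above only needs g'(0) = 0, which already follows from the finite mean.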