CS 109 Lecture 19, May 9th, 2016 - Stanford University
TRANSCRIPT
CS 109, Lecture 19
May 9th, 2016
Midterm Distribution
Histogram of scores (5 to 120, in bins of 5). E[X] = 87, √Var(X) = 15, median = 90.
Midterm Distribution
The same histogram, annotated with grade cutoffs: A+, A, A-, B+, B, B-, Cs. E[X] = 87, √Var(X) = 15, median = 90.
Midterm Distribution
The same histogram, annotated with Core Understanding and Advanced Understanding regions. E[X] = 87, √Var(X) = 15, median = 90.
Midterm Beta
Beta fit to the score distribution: ɑ = 7.1, β = 2.7. E[X] = 87, √Var(X) = 15, median = 90.
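As a sanity check on the fitted Beta, the closed-form moments of Beta(7.1, 2.7), scaled to the exam's point range, roughly reproduce the reported mean. A minimal sketch; the 120-point maximum is an assumption read off the histogram axis:

```python
# Moments of a Beta(a, b) distribution scaled to a 120-point exam.
# a and b come from the slide's fit; the 120-point maximum is an assumption.
a, b, max_score = 7.1, 2.7, 120

mean = a / (a + b) * max_score
var = a * b / ((a + b) ** 2 * (a + b + 1)) * max_score ** 2

print(round(mean, 1))        # ~86.9, close to the reported E[X] = 87
print(round(var ** 0.5, 1))  # ~16.3, in the neighborhood of the reported 15
```

The fitted standard deviation need not match the empirical one exactly, since the Beta fit smooths the raw histogram.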
Midterm Beta
Left: CS109 Midterm CDF (you can interpret this as a percentile function). Right: CS109 Midterm PDF (the derivative: percentile per point).
Midterm Question Correlations
Score histograms for Problems 1-6 (y-axis 0-100; x-axis: points on each problem, up to 24 for Problems 1 and 6, 20 for Problems 2-4, and 12 for Problem 5).
Midterm Question Correlations
Correlation graph over question parts 1.a-1.d, 2.a-2.c, 3.a-3.c, 4.a-4.c, 5, and 6.a-6.c, with edge weights shown at 0.22, 0.40, and 0.60.
*Correlations below 0.22 are not shown. **Almost all correlations were positive.
Joint PMF for Questions 6.a, 6.c
Scatter of points on 6.a (x-axis, -1 to 8) vs. points on 6.c (y-axis), with regression line y = 0.46x + 2.6.
Joint PMF for Questions 4, 6
Scatter of total points on 6 (x-axis, 0-24) vs. total points on 4 (y-axis, 0-20), with regression line y = 0.2x + 11.
Conditional Expectation
• Let X be your score on problem 4.
• Let Y be your score on problem 6.
• What is E[X | Y]?
Plot: E[X | Y = y] vs. y, with standard error bars ("I use standard error because the expectation is from a sample").
Four Prototypical Trajectories
1. Inequalities
Inequality, Probability and Joviality
• In many cases, we don't know the true form of a probability distribution
  § E.g., midterm scores
  § But, we know the mean
  § May also have other measures/properties
    o Variance
    o Non-negativity
    o Etc.
  § Inequalities and bounds still allow us to say something about the probability distribution in such cases
    o May be imprecise compared to knowing the true distribution!
Markov's Inequality
• Say X is a non-negative random variable. Then:
  P(X ≥ a) ≤ E[X]/a, for all a > 0
• Proof:
  § Let I = 1 if X ≥ a, 0 otherwise
  § Since X ≥ 0 and a > 0: I ≤ X/a
  § Taking expectations: P(X ≥ a) = E[I] ≤ E[X/a] = E[X]/a
• Andrey Andreyevich Markov (1856-1922) was a Russian mathematician
§ Markov’s Inequality is named after him§ He also invented Markov Chains…
o …which are the basis for Google’s PageRank algorithm
Andrey Markov
Markov and the Midterm
• Statistics from a previous quarter's CS109 midterm
  § X = midterm score
  § Using sample mean X̄ = 86.7 ≈ E[X]
  § What is P(X ≥ 100)?
  § Markov bound: P(X ≥ 100) ≤ E[X]/100 ≈ 86.7/100 = 0.867
    o I.e., at most 86.7% of the class scored 100 or greater
  § In fact, 20.1% of the class scored 100 or greater
    o Markov's inequality can be a very loose bound
    o But, it made no assumption at all about the form of the distribution!
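The looseness is easy to see numerically. A minimal sketch comparing Markov's bound against a simulated tail; exponential samples with the same mean are a stand-in assumption, since the real score data isn't available here:

```python
import random

# Markov: P(X >= a) <= E[X]/a for non-negative X -- any distribution.
# Exponential samples with mean 86.7 are a stand-in for the real scores.
random.seed(0)
xs = [random.expovariate(1 / 86.7) for _ in range(100_000)]

a = 100
empirical = sum(x >= a for x in xs) / len(xs)
bound = 86.7 / a  # Markov's bound: E[X]/a

print(empirical, "<=", bound)  # the bound holds, but is loose
```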
Chebyshev's Inequality
• X is a random variable with E[X] = µ, Var(X) = σ². Then:
  P(|X − µ| ≥ k) ≤ σ²/k², for all k > 0
• Proof:
  § Since (X − µ)² is a non-negative random variable, apply Markov's Inequality with a = k²:
    P((X − µ)² ≥ k²) ≤ E[(X − µ)²]/k² = σ²/k²
  § Note that (X − µ)² ≥ k² ⇔ |X − µ| ≥ k, yielding:
    P(|X − µ| ≥ k) ≤ σ²/k²
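A quick numerical illustration of the bound; Uniform(0, 1) is an arbitrary test distribution chosen for this sketch:

```python
import random

# Chebyshev: P(|X - mu| >= k) <= sigma^2 / k^2 for any distribution.
# Uniform(0, 1) is an arbitrary test case: mu = 1/2, sigma^2 = 1/12.
random.seed(1)
mu, var = 0.5, 1 / 12
xs = [random.random() for _ in range(100_000)]

for k in (0.3, 0.4, 0.45):
    tail = sum(abs(x - mu) >= k for x in xs) / len(xs)
    assert tail <= var / k ** 2  # bound holds (here quite loosely)
    print(k, tail, var / k ** 2)
```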
• Pafnuty Lvovich Chebyshev (1821-1894) was also a Russian mathematician
§ Chebyshev’s Inequality is named after himo But actually formulated by his colleague Irénée-Jules Bienaymé
§ He was Markov’s doctoral advisoro And sometimes credited with first deriving Markov’s Inequality
§ There is a crater on the moon named in his honor
Pafnuty Chebyshev
2. Law of Large Numbers
Weak Law of Large Numbers
• Consider I.I.D. random variables X1, X2, ...
  § Xi have distribution F with E[Xi] = µ and Var(Xi) = σ²
  § Let X̄ = (1/n) Σᵢ₌₁ⁿ Xi
  § For any ε > 0: P(|X̄ − µ| ≥ ε) → 0 as n → ∞
• Proof:
  § E[X̄] = E[(X1 + X2 + ... + Xn)/n] = µ
  § Var(X̄) = Var((X1 + X2 + ... + Xn)/n) = σ²/n
  § By Chebyshev's inequality: P(|X̄ − µ| ≥ k) ≤ Var(X̄)/k²
  § So: P(|X̄ − µ| ≥ ε) ≤ σ²/(nε²) → 0 as n → ∞
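The weak law can be watched happening. A sketch, with fair die rolls as an arbitrary choice of Xi, estimating P(|X̄ − µ| ≥ ε) for growing n:

```python
import random

# Weak LLN: P(|sample mean - mu| >= eps) -> 0 as n grows.
# Fair die rolls are an arbitrary choice; mu = 3.5, eps = 0.25.
random.seed(2)
mu, eps, trials = 3.5, 0.25, 2000

def tail_prob(n):
    """Estimate P(|mean of n rolls - mu| >= eps) over many repetitions."""
    hits = 0
    for _ in range(trials):
        mean = sum(random.randint(1, 6) for _ in range(n)) / n
        hits += abs(mean - mu) >= eps
    return hits / trials

probs = [tail_prob(n) for n in (10, 100, 1000)]
print(probs)  # decreasing toward 0
```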
Strong Law of Large Numbers
• Consider I.I.D. random variables X1, X2, ...
  § Xi have distribution F with E[Xi] = µ
  § Let X̄ = (1/n) Σᵢ₌₁ⁿ Xi
  § Strong Law: P( limₙ→∞ (X1 + X2 + ... + Xn)/n = µ ) = 1
  § Strong Law ⇒ Weak Law, but not vice versa
  § The Strong Law implies that for any ε > 0, there are only a finite number of values of n for which the condition of the Weak Law, |X̄ − µ| ≥ ε, holds.
Intuitions and Misconceptions of LLN
• Say we have repeated trials of an experiment
  § Let event E = some outcome of the experiment
  § Let Xi = 1 if E occurs on trial i, 0 otherwise
  § The Strong Law of Large Numbers (Strong LLN) yields:
    (X1 + X2 + ... + Xn)/n → E[Xi] = P(E)
  § Recall from the first week of class: P(E) = limₙ→∞ n(E)/n
  § The Strong LLN justifies the "frequency" notion of probability
  § Misconception arising from the LLN:
    o Gambler's fallacy: "I'm due for a win"
    o Consider being "due for a win" with repeated coin flips...
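The "due for a win" intuition can be checked directly. A sketch: the running frequency of heads converges (the frequency notion), yet the flip right after a run of five tails is still fair:

```python
import random

# Strong LLN: running frequency of heads -> 1/2. Gambler's fallacy check:
# even right after 5 tails in a row, the next flip is still 50/50.
random.seed(3)
flips = [random.randint(0, 1) for _ in range(200_000)]  # 1 = heads

overall = sum(flips) / len(flips)
after_5_tails = [flips[i] for i in range(5, len(flips))
                 if flips[i - 5:i] == [0, 0, 0, 0, 0]]
conditional = sum(after_5_tails) / len(after_5_tails)

print(overall, conditional)  # both near 0.5 -- nobody is "due for a win"
```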
• History of the Law of Large Numbers§ 1713: Weak LLN described by Jacob Bernoulli
§ 1835: Poisson calls it “La Loi des Grands Nombres”o That would be “Law of Large Numbers” in French
§ 1909: Émile Borel develops Strong LLN for Bernoulli random variables
§ 1928: Andrei Nikolaevich Kolmogorov proves Strong LLN in general case
La Loi des Grands Nombres
And now a moment of silence...
...before we present...
...the greatest result of probability theory!
Silence!!
3. Central Limit Theorem
The Central Limit Theorem
• Consider I.I.D. random variables X1, X2, ...
  § Xi have distribution F with E[Xi] = µ and Var(Xi) = σ²
  § Let X̄ = (1/n) Σᵢ₌₁ⁿ Xi
  § Central Limit Theorem: X̄ ~ N(µ, σ²/n) as n → ∞
Interactive demo: http://onlinestatbook.com/stat_sim/sampling_dist/
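The simulation behind the linked demo is easy to reproduce. A sketch, with Uniform(0, 1) draws as an arbitrary parent distribution, checking that sample means have mean µ and variance σ²/n:

```python
import random, statistics

# CLT: the mean of n i.i.d. draws is approximately N(mu, sigma^2 / n).
# Uniform(0, 1) draws (mu = 0.5, sigma^2 = 1/12) are an arbitrary choice.
random.seed(4)
n, reps = 30, 20_000

means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(reps)]

print(statistics.fmean(means))     # near mu = 0.5
print(statistics.variance(means))  # near sigma^2 / n = 1/360 ~ 0.00278
```

Plotting a histogram of `means` would show the bell shape, even though the parent distribution is flat.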
Another form of the Central Limit Theorem
• Consider I.I.D. random variables X1, X2, ...
  § Xi have distribution F with E[Xi] = µ and Var(Xi) = σ²
  § Let X̄ = (1/n) Σᵢ₌₁ⁿ Xi, so X̄ ~ N(µ, σ²/n) as n → ∞
  § Recall that Z ~ N(0, 1), where Z = (X̄ − µ)/√(σ²/n)
  § Equivalently:
    Z = (X̄ − µ)/√(σ²/n) = [(1/n) Σᵢ₌₁ⁿ Xi − µ]·(√n/σ) = (Σᵢ₌₁ⁿ Xi − nµ)/(σ√n)
  § So: (X1 + X2 + ... + Xn − nµ)/(σ√n) → N(0, 1) as n → ∞
Once Upon a Time… (1733): Abraham de Moivre
• History of the Central Limit Theorem§ 1733: CLT for X ~ Ber(1/2) postulated by Abraham de Moivre
§ 1823: Pierre-Simon Laplace extends de Moivre’s work to approximating Bin(n, p) with Normal
§ 1901: Aleksandr Lyapunov provides precise definition and rigorous proof of CLT
§ 2003: Charlie Sheen stars in television series “Two and a Half Men”o By end of the 7th (final) season, there were 161 episodes
o Mean quality of subsamples of episodes is Normally distributed (thanks to the Central Limit Theorem)
Once Upon a Time…
Central Limit Theorem in the Real World
• The CLT is why many things in the "real world" appear Normally distributed
  § Many quantities are sums of independent variables
  § Exam scores
    o Sum of individual problems on the SAT
    o Why does the CLT not apply to our midterm?
  § Election polling
    o Ask 100 people if they will vote for candidate X (p1 = # "yes"/100)
    o Repeat this process with different groups to get p1, ..., pn
    o The pi will have a normal distribution
    o Can produce a "confidence interval"
      • How likely is it that the estimate for the true p is correct?
Binomial Approximation
• Consider I.I.D. Bernoulli variables X1, X2, ... with probability p
  § Xi have E[Xi] = µ = p and Var(Xi) = σ² = p(1 − p)
  § Let X̄ = (1/n) Σᵢ₌₁ⁿ Xi, so X̄ ~ N(µ, σ²/n) as n → ∞
  § Let Y = nX̄, so Y ~ N(nµ, n²·(σ²/n)) = N(nµ, nσ²)
  § That is: Y ~ N(np, np(1 − p))
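A sketch comparing the exact binomial tail with the N(np, np(1 − p)) approximation; n = 100 and p = 0.3 are chosen arbitrarily for illustration:

```python
from math import comb
from statistics import NormalDist

# Bin(n, p) vs its CLT approximation N(np, np(1 - p)).
# n = 100, p = 0.3 are chosen arbitrarily for illustration.
n, p = 100, 0.3
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(40, n + 1))

mu, sigma = n * p, (n * p * (1 - p)) ** 0.5
approx = 1 - NormalDist(mu, sigma).cdf(39.5)  # 0.5 continuity correction

print(exact, approx)  # the tails agree closely
```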
Your Midterm on the Central Limit Theorem
• Start with 370 midterm scores: X1, X2, ..., X370
  § E[Xi] = 87 and Var(Xi) = 225
  § Created 50 samples (Yi) of size n = 10
  § Prediction by CLT: Yi ~ N(87, 22.5)
  § Standardize each sample mean: Zi = (Yi − E[Xi])/√(σ²/n) = (Yi − 87)/√22.5
  § Empirically: Z̄ = (1/50) Σᵢ₌₁⁵⁰ Zi = 4×10⁻¹⁶ and Var(Z) = 0.997
Histogram of the Zi over [−3, 3].
Estimating Clock Running Time
• Have a new algorithm to test for running time
  § Mean (clock) running time: µ = t sec.
  § Variance of running time: σ² = 4 sec².
  § Run the algorithm repeatedly (I.I.D. trials), measure time
    o How many trials n so that estimated time = t ± 0.5 with 95% certainty?
    o Xi = running time of the i-th run (for 1 ≤ i ≤ n)
    o By the Central Limit Theorem, Zn ~ N(0, 1), where:
      Zn = (Σᵢ₌₁ⁿ Xi − nµ)/(σ√n) = (Σᵢ₌₁ⁿ Xi − nt)/(2√n)
• Require:
  0.95 = P(−0.5 ≤ (Σᵢ₌₁ⁿ Xi)/n − t ≤ 0.5)
       = P(−0.5·(√n/2) ≤ [(Σᵢ₌₁ⁿ Xi)/n − t]·(√n/2) ≤ 0.5·(√n/2))
       = P(−0.5·(√n/2) ≤ (Σᵢ₌₁ⁿ Xi − nt)/(2√n) ≤ 0.5·(√n/2))
       = P(−√n/4 ≤ Z ≤ √n/4)
Continuing:
  0.95 = Φ(√n/4) − Φ(−√n/4)
       = Φ(√n/4) − (1 − Φ(√n/4))
       = 2Φ(√n/4) − 1
  ⇒ Φ(√n/4) = 0.975
  ⇒ √n/4 = Φ⁻¹(0.975) = 1.96
  ⇒ n ≈ 61.4
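The final step can be checked with the standard library's normal distribution, which supplies both Φ and Φ⁻¹:

```python
from statistics import NormalDist

# Solve 0.975 = Phi(sqrt(n)/4) for n: sqrt(n)/4 = Phi^{-1}(0.975) ~ 1.96.
z = NormalDist().inv_cdf(0.975)
n = (4 * z) ** 2

print(round(z, 2))  # 1.96
print(n)            # ~61.5, consistent with the slide's n = 61.4 (so run 62 trials)
```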
William Sealy Gosset (a.k.a. "Student")
Estimating Time With Chebyshev
• Have a new algorithm to test for running time
  § Mean (clock) running time: µ = t sec.
  § Variance of running time: σ² = 4 sec².
  § Run the algorithm repeatedly (I.I.D. trials), measure time
    o How many trials n so that estimated time = t ± 0.5 with 95% certainty?
    o Xi = running time of the i-th run (for 1 ≤ i ≤ n), and X̄ = (1/n) Σᵢ₌₁ⁿ Xi
  § What would Chebyshev say? P(|X̄ − µ| ≥ k) ≤ Var(X̄)/k²
    E[X̄] = E[(1/n) Σᵢ₌₁ⁿ Xi] = t
    Var(X̄) = Var((1/n) Σᵢ₌₁ⁿ Xi) = (1/n²) Σᵢ₌₁ⁿ Var(Xi) = σ²/n = 4/n
    P(|X̄ − t| ≥ 0.5) ≤ (4/n)/(0.5)² = 16/n = 0.05 ⇒ n ≥ 320
Thanks for playing, Pafnuty!
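The gap between the two requirements is easy to verify. A sketch; Gaussian run times (σ = 2, an arbitrary t) are a stand-in assumption, since Chebyshev itself assumes nothing about the distribution:

```python
import random

# Chebyshev demands 16/n <= 0.05, i.e. n >= 320 -- versus ~62 via the CLT.
assert 16 / 320 == 0.05

# Sanity check with n = 320: the estimate almost never misses by 0.5.
# Gaussian run times (sigma = 2, arbitrary t = 3.0) are a stand-in assumption.
random.seed(5)
t, n = 3.0, 320
misses = 0
for _ in range(2000):
    xbar = sum(random.gauss(t, 2) for _ in range(n)) / n
    misses += abs(xbar - t) >= 0.5
print(misses / 2000)  # far below the 0.05 that Chebyshev allows
```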
Crashing Your Website
• Number of visitors to a web site per minute: X ~ Poi(100)
  § Server crashes if ≥ 120 requests/minute
  § What is P(crash in next minute)?
  § Exact solution: P(X ≥ 120) = Σᵢ₌₁₂₀^∞ e⁻¹⁰⁰(100)ⁱ/i! ≈ 0.0282
  § Use the CLT, where Y = Σᵢ₌₁ⁿ Poi(100/n) (all I.I.D.)
    o Note: the Normal can be used to approximate the Poisson
    P(X ≥ 120) = P(Y ≥ 119.5) = P((Y − 100)/√100 ≥ (119.5 − 100)/√100) ≈ 1 − Φ(1.95) ≈ 0.0256
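Both numbers are reproducible. A sketch: the exact tail is summed in log space to avoid overflow in 100ⁱ and i!, and `statistics.NormalDist` supplies Φ for the approximation:

```python
from math import exp, log, lgamma
from statistics import NormalDist

lam = 100.0

# Exact Poisson tail P(X >= 120), computed in log space (terms past ~500
# are negligibly small):
exact = sum(exp(i * log(lam) - lam - lgamma(i + 1)) for i in range(120, 500))

# Normal approximation with continuity correction, per the slide:
approx = 1 - NormalDist(100, 10).cdf(119.5)

print(round(exact, 4))   # ~0.0282, as on the slide
print(round(approx, 4))  # ~0.0256, as on the slide
```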
It’s play time!
Sum of Dice
• You will roll 10 six-sided dice (X1, X2, …, X10)
  § X = total value of all 10 dice = X1 + X2 + … + X10
  § Win if: X ≤ 25 or X ≥ 45
  § Roll!
• And now the truth (according to the CLT)…
Sum of Dice
• You will roll 10 six-sided dice (X1, X2, …, X10)
  § X = total value of all 10 dice = X1 + X2 + … + X10
  § Win if: X ≤ 25 or X ≥ 45
• Recall the CLT: (X1 + X2 + ... + Xn − nµ)/(σ√n) → N(0, 1) as n → ∞
  § Here µ = E[Xi] = 3.5 and σ² = Var(Xi) = 35/12
  § Determine P(X ≤ 25 or X ≥ 45) using the CLT (with continuity correction):
    1 − P(25.5 ≤ X ≤ 44.5)
      = 1 − P((25.5 − 10(3.5))/√(10·35/12) ≤ Z ≤ (44.5 − 10(3.5))/√(10·35/12))
      ≈ 1 − (2Φ(1.76) − 1) = 2(1 − 0.9608) = 0.0784
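The CLT estimate can be compared with the exact probability. A sketch that builds the pmf of the sum by convolving a single-die pmf ten times:

```python
# Exact P(win) for X = sum of 10 fair dice; win if X <= 25 or X >= 45.
# Build the pmf of the sum by convolving a single-die pmf ten times.
pmf = {0: 1.0}
for _ in range(10):
    new = {}
    for total, p in pmf.items():
        for face in range(1, 7):
            new[total + face] = new.get(total + face, 0.0) + p / 6
    pmf = new

p_win = sum(p for total, p in pmf.items() if total <= 25 or total >= 45)
print(round(p_win, 4))  # close to the CLT estimate of 0.0784
```

By symmetry of the sum about 35, the two tails contribute equally.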
I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order expressed by the "[Central limit theorem]". The law would have been personified by the Greeks and deified, if they had known of it. It reigns with serenity and in complete self-effacement, amidst the wildest confusion. The huger the mob, and the greater the apparent anarchy, the more perfect is its sway. It is the supreme law of Unreason. Whenever a large sample of chaotic elements are taken in hand and marshalled in the order of their magnitude, an unsuspected and most beautiful form of regularity proves to have been latent all along.
-Sir Francis Galton
Wonderful Form of Cosmic Order
Closing slide: the midterm Beta fit again (ɑ = 7.1, β = 2.7; E[X] = 87, √Var(X) = 15, median = 90), annotated with the grade cutoffs (A+, A, A-, B+, B, B-, Cs) and the Core Understanding / Advanced Understanding regions.