146 17 the_normal_distribution online
MATH& 146
Lesson 17
Sections 2.5 and 2.6
The Normal Distribution
Sampling Distribution of
the Mean
Here's a simple
simulation. Let's start
with one fair die. If we
toss this die many times,
what should the dotplot of
the numbers on the face
of the die look like?
To help you out, consider
the results of 500
simulated tosses.
Sampling Distribution of
the Mean
Now let's toss a pair of
dice and record the
average of the two.
If we repeat this (or at
least simulate it) 500
times, recording the
average of each pair,
what will the dotplot of
these 500 averages look
like?
Sampling Distribution of
the Mean
We're much more likely to
get an average near 3.5
than we are to get one near
1 or 6.
After all, the only way to get
an average of 1 (or 6) is to
roll 1's (or 6's) with both
dice.
An average of 3.5, however,
can happen in many ways:
1 and 6, 2 and 5, or 3 and 4,
in either order.
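
To make the counting concrete, here is a minimal sketch that lists every equally likely outcome for a pair of fair dice and tallies how many give each average. It is an exact enumeration rather than the 500-toss simulation on the slide, and the variable name `counts` is our own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes for a pair of fair dice
# and count how many of them produce each possible average.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))

for avg in sorted(counts):
    print(f"average {float(avg):.1f}: {counts[avg]} of 36 outcomes")
```

Six of the 36 outcomes give an average of 3.5, while only one gives an average of 1 and only one gives an average of 6.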
Sampling Distribution of
the Mean
What if we average 3
dice? We'll simulate
500 tosses of 3 dice
and take their average:
Sampling Distribution of
the Mean
Note that it's getting
harder to have averages
near the ends, since
getting an average of 1 or
6 requires all three to
come up 1 or 6,
respectively.
That's less likely than
both of 2 dice coming up 1
(or both coming up 6).
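
How much less likely? A short sketch, assuming fair dice that are tossed independently, computes the chance that every die shows one particular face (say, all 1's). The function name `prob_all_same_face` is ours.

```python
from fractions import Fraction

def prob_all_same_face(n_dice: int) -> Fraction:
    """Chance that each of n fair, independent dice shows one chosen face (e.g. all 1's)."""
    return Fraction(1, 6) ** n_dice

for n in (2, 3, 5, 20):
    p = prob_all_same_face(n)
    print(f"{n:>2} dice: 1 chance in {p.denominator}")
```

With 2 dice the chance is 1 in 36; with 3 dice it drops to 1 in 216, and it keeps shrinking by a factor of 6 with each added die.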
Sampling Distribution of
the Mean
Let's continue this
simulation to see what
happens with larger
samples.
Here's a dotplot of the
averages for 500 tosses
of 5 dice:
Sampling Distribution of
the Mean
The pattern is becoming
clearer. Three things
continue to happen.
1) The shape is unimodal
and symmetric.
2) The shape remains
centered at 3.5.
3) The shape is
tightening.
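
If you'd like to check the center and the tightening yourself, here is a rough sketch of the simulation. It assumes fair dice and 500 tosses per setting, as on the slides; the function name `simulate_averages` and the seed are our own choices, so your numbers will differ slightly from the dotplots shown.

```python
import random
import statistics

def simulate_averages(n_dice, n_tosses=500, seed=0):
    """Toss n_dice fair dice n_tosses times; return the average of each toss."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n_dice))
        for _ in range(n_tosses)
    ]

results = {}
for n in (1, 2, 3, 5, 20):
    averages = simulate_averages(n)
    center = statistics.fmean(averages)   # where the dotplot is centered
    spread = statistics.stdev(averages)   # how tight the dotplot is
    results[n] = (center, spread)
    print(f"{n:>2} dice: center = {center:.2f}, spread (SD) = {spread:.2f}")
```

The centers should all land close to 3.5, while the spread shrinks as the number of dice grows.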
Sampling Distribution of
the Mean
Not convinced? Let's skip
ahead and try 20 dice.
The dotplot of averages
for 500 throws of 20 dice
looks like this:
Sampling Distribution of
the Mean
At this point, you should
be asking if this only
works for dice throws. In
fact, this shape shows up
amazingly often when we
use sample means or
sample proportions.
We even have a name for
this shape: the normal
distribution.
The Normal Distribution
Among all the distributions we see in practice, the
normal distribution is overwhelmingly the most
common. The symmetric, unimodal, bell-shaped curve
is pervasive throughout statistics.
Variables such as SAT scores and heights of US adult
males and females closely follow the normal
distribution.
The Normal Distribution
Technically, while many variables are nearly normal,
none are exactly normal.
However, the normal distribution, while not perfect for
any single problem, is very useful for a variety of
problems. In fact, we will use it many times for the rest
of the course.
What Is Normal?
The normal distribution model always describes a
symmetric, unimodal, bell-shaped curve. However,
these curves can look different depending on the
details of the model. Specifically, the normal
distribution model can be adjusted using two
parameters: mean and standard deviation.
What Is Normal?
As you might guess, changing the mean shifts the bell
curve to the left or right, while changing the standard
deviation stretches or constricts the curve.
[Figure: two normal curves, one with mean = 0 and standard deviation = 1, the other with mean = 19 and standard deviation = 4]
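
To see the shift and the stretch in numbers, here is a small sketch that evaluates the bell curve for the two models above (mean 0 with standard deviation 1, and mean 19 with standard deviation 4). The density formula used is the standard one for a normal curve and is not given in these slides; the function name `normal_pdf` is ours.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

for mu, sigma in ((0, 1), (19, 4)):
    peak = normal_pdf(mu, mu, sigma)             # the curve is tallest at its mean
    one_sd = normal_pdf(mu + sigma, mu, sigma)   # height one SD above the mean
    print(f"mean {mu}, SD {sigma}: peak at x = {mu}, height {peak:.4f}; "
          f"height one SD above the mean {one_sd:.4f}")
```

Changing the mean moves the peak from 0 to 19. Making the standard deviation 4 times larger makes the curve 4 times wider and, since the total area stays the same, one quarter as tall.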
Example 1
Consider the following set of three distributions,
all of which are drawn to the same scale. Identify
the two distributions that are normal. Of the two
normal distributions, which one has the larger
standard deviation?
Notation
Because the mean and standard deviation describe a
normal distribution exactly, they are called the
distribution's parameters. These are not the same
things as sample statistics.
| | Sample Statistic | Distribution Parameter |
|---|---|---|
| Mean | x̄ (x-bar) | μ (mu) |
| Standard Deviation | s | σ (sigma) |
Notation
If X is a quantity to be measured that has a normal
distribution with mean μ and standard deviation σ,
we write the distribution as N(μ, σ).
Notation
For example, the two distributions below can be
written as
N(μ = 0, σ = 1) and N(μ = 19, σ = 4)
[Figure: two normal curves, one with mean = 0 and standard deviation = 1, the other with mean = 19 and standard deviation = 4]
Example 2
Write down each normal distribution using the
short-hand notation, and sketch its shape.
a) mean 5 and standard deviation 3
b) mean –100 and standard deviation 10
c) mean 2 and variance 9.
z-Scores
The z-score of an observation is the number of
standard deviations it falls above or below the
mean. We compute the z-score for an observation
x that follows a distribution with mean μ and
standard deviation σ using
$$ z = \frac{x - \mu}{\sigma} = \frac{\text{value} - \text{mean}}{\text{standard deviation}} $$
z-Scores
If X is normal, then the z-scores (also called
standard scores) will also be normal, but with a
mean of 0 and standard deviation of 1. That is,
N(μ = 0, σ = 1).
This distribution even has a special name: the
standard normal distribution.
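
Here is a quick way to check this claim by simulation. The data are hypothetical: 10,000 random draws from a normal distribution with mean 19 and standard deviation 4 (any other mean and standard deviation would do), each converted to a z-score.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical data: 10,000 draws from a normal distribution
# with mean 19 and standard deviation 4.
mu, sigma = 19, 4
xs = [rng.gauss(mu, sigma) for _ in range(10_000)]

# z-score of each observation: (value - mean) / standard deviation
zs = [(x - mu) / sigma for x in xs]

print(f"mean of z-scores: {statistics.fmean(zs):.3f}")
print(f"SD of z-scores:   {statistics.stdev(zs):.3f}")
```

The mean of the z-scores should come out close to 0 and their standard deviation close to 1, up to simulation noise.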
Example 3
Suppose a student took 2 exams, scoring 60
on a verbal test and 80 on a numerical reasoning
test. The class scores for each exam are normally
distributed. For the verbal test, the mean is 50 and
the standard deviation is 10; for the numerical test,
the mean is 70 and the standard deviation is 12.
Relative to the rest of the class, which was the
student's better score?
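
One way to approach this example is to put both scores on the same scale with z-scores. The sketch below does just that with the numbers given; the function name `z_score` is ours.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x falls above (+) or below (-) the mean."""
    return (x - mean) / sd

verbal = z_score(60, mean=50, sd=10)
numerical = z_score(80, mean=70, sd=12)

print(f"verbal z = {verbal:.2f}, numerical z = {numerical:.2f}")
```

The verbal score sits 1 standard deviation above the class mean, while the numerical score sits about 0.83 standard deviations above, so the verbal score is the stronger one relative to the class.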
Example 4
Over the last few classes, we have run simulations on
several case studies, including the opportunity cost
study (Lesson 13) and the medical consultant study
(Lesson 16).
Suppose we had run 10,000 simulations on each
study.
Example 4 continued
The two graphs show the
null distributions for these
two case studies.
Using these graphs,
describe the shape of the
distributions and note
anything that you find
interesting.
Central Limit Theorem
It is common for distributions in general to be
skewed or contain outliers.
However, the null distributions we've so far
encountered have all looked somewhat similar
and, for the most part, symmetric. They all
resemble the normal distribution. This is not a
coincidence, but rather, is guaranteed by
mathematical theory.
Central Limit Theorem
If we look at a proportion (or difference in proportions)
and the scenario satisfies certain conditions, then the
sample proportion (or difference in proportions) will
appear to follow a bell-shaped curve called the normal
distribution.
Though the conditions are slightly different, the sample
mean (or difference in means) will also appear to
follow a normal distribution. However, we'll save the
details for later in the course (Lessons 27 – 30).
Conditions for Proportions
Mathematical theory guarantees that a sample
proportion or a difference in sample proportions will
follow something that resembles a normal distribution
when certain conditions are met. These conditions fall
into two categories:
• Observations in the sample are independent.
• The sample is large enough.
Conditions for Proportions
Observations in the sample are independent.
Independence is guaranteed when we take a random
sample of less than 10% of the population. It can also
be guaranteed if we randomly divide individuals into
treatment and control groups.
The sample is large enough. To be reasonably
certain of a unimodal, symmetric distribution, the
sample should be at least a minimum size, though
what qualifies as "minimum" differs from one context to
the next. Suitable guidelines will be given in later
lessons.
Example 5
Suppose the true population proportion were p = 0.95.
The figure shows what the distribution of a sample
proportion looks like when the sample size is n = 20,
n = 100, and n = 500.
[Figure: three panels showing the distribution of the sample proportion for n = 20, n = 100, and n = 500]
Example 5 continued
What does each point (observation) in each of the
samples represent? Describe how the distribution of
the sample proportion, p̂, changes as n becomes
larger.
[Figure: distribution of the sample proportion p̂ for n = 20, n = 100, and n = 500]
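
To explore this example on your own, here is a rough simulation sketch. It assumes p = 0.95 and the three sample sizes from the slide; the 2,000 repetitions per sample size, the seed, and the helper names `simulate_p_hat` and `skewness` are our own choices, not taken from the figure.

```python
import random
import statistics

def simulate_p_hat(n, p=0.95, reps=2000, seed=0):
    """Draw `reps` samples of size n; return the sample proportion from each."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

def skewness(values):
    """Average cubed z-score: near 0 when symmetric, negative when skewed left."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

summary = {}
for n in (20, 100, 500):
    p_hats = simulate_p_hat(n)
    summary[n] = (statistics.fmean(p_hats), statistics.pstdev(p_hats), skewness(p_hats))
    center, spread, skew = summary[n]
    print(f"n = {n:>3}: center = {center:.3f}, spread (SD) = {spread:.4f}, skewness = {skew:.2f}")
```

Each simulated value is the proportion from one sample of size n. As n grows, the centers should stay near 0.95, the spread should shrink, and the left skew seen at n = 20 should fade toward a symmetric, bell-like shape.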
The Normal Distribution
So far we've had no need for the normal distribution.
We've been able to answer our questions somewhat
easily using simulation techniques.
This will soon change, however, since simulating data
can be non-trivial (very, very difficult).
Instead, the normal distribution (and other distributions
like it) offers a general framework that applies to a very
large number of settings.
Opportunity Cost
For one example, the opportunity cost study
determined that students are thriftier if they are
reminded that saving money now means they can
spend the money later. The study's point estimate
of the impact was 20%, meaning 20%
fewer students would move forward with a DVD
purchase in the study scenario. However, as
we've learned, point estimates aren't perfect – they
only provide an approximation of the truth.
Opportunity Cost
It would be useful if we could provide a range of
plausible values for the impact, more formally
known as a confidence interval. In many situations it is
difficult to construct a reliable confidence interval
using simulations. However,
doing so is reasonably straightforward using the
normal distribution (Lesson 21).