Chapter 2. Continuous random variables
Outline
Review of probability: events and probability
Random variable
Probability and Cumulative distribution function
Review of discrete random variable
Introduction to continuous random variable
Expected values
Variance and standard deviation
Variance for 2-dim random variable
Quantiles and Cumulative distribution function
Standard continuous univariate distributions: Uniform, Exponential, Normal
Review of Probability
Probability considers an experiment before it is performed. Probability is a measure of the chance that an event may occur in the experiment. Tossing a coin or conducting an election survey is an example of an experiment. An event is a subset of the sample space, the set of all possible outcomes. Seeing a tail in a coin toss or a positive response from the survey is an event. The legitimate questions are then: what is the probability of seeing a tail in the experiment of tossing a coin, or what is the probability of getting a positive response in a survey?
The Axioms of Probability
Mathematically, probability is a function P which assigns to each event A in the sample space Ω a number P(A) in [0, 1] such that
Axiom 1: P(A) ≥ 0 for all A ⊆ Ω;
Axiom 2: P(Ω) = 1;
Axiom 3: P(A ∪ B) = P(A) + P(B) if A ∩ B = ∅ for any A, B ⊆ Ω.
When an event has an associated probability of occurring, this gives rise to uncertainty. Uncertainty is fundamental to statistical inference.
An example of random variable
A random variable X is a function from the sample space Ω to the real numbers R.
Exercise 2.2.1
(a) Experiment 1: In a presidential election with two candidates M and O, the possible outcomes are Ω = {Candidate M wins, Candidate O wins}. If we define a random variable X that maps from Ω to {0, 1}, then the probability of the event {Candidate O wins} is equivalent to P(X = 1).
(b) Experiment 2: A national air quality monitoring system automatically collects measurements of ozone level at designated sites. The possible outcomes are Ω = {x : x ≥ 0}. If we define a random variable X to be the numerical measurement, the probability that the ozone level falls below a certain level c is given by P(X ≤ c).
Random variable
A random variable X is a function that associates a unique number with each possible outcome of an experiment.
Associated with each random variable X is a probability distribution function that describes the chance of all possible outcomes of X.
Often in scientific investigation, X represents the variable of main interest that can be measured or observed.
All observable events are expressed in terms of a random variable.
Distribution function
In order to describe all possible outcomes of an experiment, we focus on an event of the basic form
{X ≤ x}
for fixed x, where x can take any value. How can one express a general event {a < X ≤ b} using the basic form with set operations?
{a < X ≤ b} = {X ≤ b} ∩ {X ≤ a}^c .
If we have a rule for assigning probability to an event of the basic form, then the probability of any event can be determined.
Cumulative Distribution function
For any univariate random variable X, the cumulative distribution function, c.d.f., F_X : R → [0, 1], is defined by
F_X(x) ≡ F(x) = P(X ≤ x) .
Moreover, F(−∞) = 0 and F(∞) = 1, and F is a non-decreasing function. An example is drawn in Figure 1. Note that for a ≤ b,
P(a < X ≤ b) = F(b) − F(a) .
Why is it so?
Figure: Example of a cumulative distribution function, showing F(a) and F(b) at points a and b.
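One way to see this: for a ≤ b, the event {X ≤ b} is the disjoint union {X ≤ a} ∪ {a < X ≤ b}, so Axiom 3 gives F(b) = F(a) + P(a < X ≤ b), and rearranging yields P(a < X ≤ b) = F(b) − F(a).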
Discrete random variable
Discrete random variables are probability models describing the outcomes of experiments with a countable sample space, often taking values only in the integers or the non-negative integers. Examples include
college membership, exam grades (A, B, C, D, E), number of goals in a match, number of children in a family.
The probability distribution function F_X(x) = P(X ≤ x) can be obtained by
P(X ≤ x) = P(X ≤ int(x)) = ∑_{k=−∞}^{int(x)} P(X = k) ,
where int(x) denotes the largest integer smaller than or equal to x, e.g. int(5.2) = 5, int(3) = 3, int(−2.1) = −3.
Probability mass function
The probability distribution of a discrete random variable X is characterised by the probability mass function, p.m.f., p(x), where
p(x) = P(X = x) .
The probability mass function p(x) satisfies
0 ≤ p(x) ≤ 1 for all x;
∑_{x=−∞}^{∞} p(x) = 1;
For any event A, P(X ∈ A) = ∑_{x∈A} p(x). For example,
P(a < X ≤ b) = P(X = a + 1) + P(X = a + 2) + · · · + P(X = b)
= p(a + 1) + p(a + 2) + · · · + p(b)
Figure: Example of probability mass function. Shaded area represents P(a < X ≤ b).
Example of discrete random variable
Exercise 2.4.1
For a random variable X that takes values 0, 1 with probabilities θ, 1 − θ, obtain P(X ≤ x) for all x and plot the cumulative distribution function.
P(X ≤ x) =
  0  if x < 0
  θ  if 0 ≤ x < 1
  1  if x ≥ 1
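A minimal R sketch of the requested plot (the value θ = 0.3 is an arbitrary choice for illustration):

theta = 0.3
# the c.d.f. jumps from 0 to theta at x = 0 and from theta to 1 at x = 1
F = stepfun(c(0, 1), c(0, theta, 1))
plot(F, verticals = FALSE, main = "c.d.f. of X")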
Continuous random variable
When the outcome of an experiment is a measurement on a continuous scale, such as the ozone level measurements in the earlier example, the random variable is called a continuous random variable. Examples include
height, weight, direction, waiting times in the hospital, price of stock.
Again, the cumulative distribution function is defined by
F (x) = FX (x) = P(X ≤ x) .
However, if X is a continuous random variable,
P(X = x) = 0 for all x .
and hence
P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b) .
Probability density function
The probability density function, p.d.f., f(x), of a continuous random variable X is defined by
f(x) = (d/dx) F_X(x)
so that it satisfies
F_X(x) = ∫_{−∞}^{x} f(u) du .
The probability density function f(x) satisfies
f(x) ≥ 0 for all x;
∫_{−∞}^{∞} f(x) dx = 1;
For any event A, P(X ∈ A) = ∫_{x∈A} f(x) dx .
Interpretation of probability density function
Due to results from calculus,
P(a < X ≤ b) = F_X(b) − F_X(a)
             = ∫_{−∞}^{b} f(x) dx − ∫_{−∞}^{a} f(x) dx
             = ∫_{a}^{b} f(x) dx
Figure: Example of probability density function. P(a < X ≤ b) is the area under the curve between x = a and x = b.
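A quick numerical check of this area interpretation in R, using the standard normal density as an illustrative example (not part of the original slides):

integrate(dnorm, -1, 1)$value   # area under the p.d.f. between -1 and 1: 0.6826895
pnorm(1) - pnorm(-1)            # the same value via the c.d.f.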
Exercise 2.5.1
For a random variable X with cumulative distribution function
F_X(x) =
  0  if x < 0
  x  if 0 ≤ x ≤ 1
  1  if x > 1
(a) Find P(0.3 < X ≤ 0.5).
(b) Find the p.d.f. of X.
(a) P(0.3 < X ≤ 0.5) = F(0.5) − F(0.3) = 0.5 − 0.3 = 0.2
(b) f(x) =
  1  if 0 ≤ x ≤ 1
  0  if x < 0 or x > 1
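Since this c.d.f. is exactly that of a Uniform(0, 1) random variable, part (a) can be checked in R:

punif(0.5) - punif(0.3)   # 0.2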
Expectation
If X is a discrete random variable with probability mass function p(x) on {0, 1, · · · }, then the expected value of X is
µ_X = E[X] = ∑_{x=0}^{∞} x p(x) .
If X is a continuous random variable with probability density function f(x) on (−∞, ∞), then the expected value of X is
µ_X = E[X] = ∫_{−∞}^{∞} x f(x) dx .
Expectations of functions of random variables
Suppose Y = g(X) where g is a fixed function.
If X is a discrete random variable with probability mass function p(x) on {0, 1, · · · }, then the expected value of Y is
µ_Y = E[Y] = ∑_{x=0}^{∞} g(x) p(x) .
If X is a continuous random variable with probability density function f(x) on (−∞, ∞), then the expected value of Y is
µ_Y = E[Y] = ∫_{−∞}^{∞} g(x) f(x) dx .
Exercise 2.6.1
Let f(x) = exp(−x) for all x ≥ 0. Find (i) E[X] (= µ_X), (ii) E[X²] and (iii) E[(X − µ_X)²].
(i) µ_X = E[X] = ∫_0^∞ x exp(−x) dx
        = [−x exp(−x)]_0^∞ + ∫_0^∞ exp(−x) dx
        = 0 + [−exp(−x)]_0^∞ = 0 − (−1) = 1
(ii) E[X²] = ∫_0^∞ x² exp(−x) dx
        = [−x² exp(−x)]_0^∞ + ∫_0^∞ 2x exp(−x) dx
        = 0 + 2 × 1 = 2
(iii) E[(X − µ_X)²] = ∫_0^∞ (x − 1)² exp(−x) dx
        = ∫_0^∞ (x² − 2x + 1) exp(−x) dx
        = 2 − 2 × 1 + 1 = 1
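These integrals can also be verified numerically in R (a sketch using integrate, not part of the original slides):

integrate(function(x) x * exp(-x), 0, Inf)          # E[X] = 1
integrate(function(x) x^2 * exp(-x), 0, Inf)        # E[X^2] = 2
integrate(function(x) (x - 1)^2 * exp(-x), 0, Inf)  # E[(X - 1)^2] = 1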
Properties of Expected values
Theorem
If X has expectation E[X] and Y is a linear function of X as Y = aX + b, then Y has expectation
E[Y] = aE[X] + b .
More generally, the following properties hold:
E[g(X ) + h(X )] = E[g(X )] + E[h(X )] (1)
E[cg(X )] = cE[g(X )] (2)
E[aX + b] = aE[X ] + b (3)
Note that we proved them in MATH 104 for discrete random variables.
Using linear properties of expectation, we may compute E[(X − a)²] by
E[(X − a)²] = E[X² − 2aX + a²]
           = E[X²] − E[2aX] + E[a²]
           = E[X²] − 2aE[X] + a²
Variance and Standard Deviation
If X is a random variable with expected value µ_X = E[X], the variance of X is
σ²_X = Var[X] = E[(X − µ_X)²]
     = ∑_{x=0}^{∞} (x − µ_X)² p(x)        for a discrete r.v.;
     = ∫_{−∞}^{∞} (x − µ_X)² f(x) dx      for a continuous r.v.
The variance of X can be calculated as
σ²_X = E[X²] − µ²_X .
The standard deviation of X is
σ_X = √Var[X] .
Variance and Standard Deviation: example
Exercise 2.7.1
For f(x) = exp(−x) for all x ≥ 0, find σ_X.
From the previous example, σ²_X = 1, so σ_X = 1.
Properties of Variance and Standard Deviation
Theorem
If Var[X] exists and Y = a + bX, then Var[Y] = b²Var[X]. Hence, the standard deviation of Y is σ_Y = |b|σ_X.
Why do you need to take the absolute value in the above expression?
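A small R simulation illustrating the theorem; the choices a = 5, b = −2 and the N(2, 9) distribution are arbitrary:

set.seed(1)
x = rnorm(1e5, mean = 2, sd = 3)
y = 5 - 2 * x
var(y); (-2)^2 * var(x)   # both close to 36
sd(y); abs(-2) * sd(x)    # the absolute value is needed: sd cannot be negative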
Figure: Expectations and standard deviations for a discrete and a continuous random variable (probability mass function with µ = 2.5, σ = 1.1; density with µ = 0.83, σ = 0.83).
Correlation
With multivariate data, we need to characterise dependence. The correlation between X and Y, denoted Corr(X, Y), is defined as:
The correlation between two random variables X and Y is
Corr(X, Y) = E[(X − E[X])(Y − E[Y])] / √(Var(X)Var(Y)) = E[(X − µ_X)(Y − µ_Y)] / (σ_X σ_Y)
Figure: Mean and variance are measures of location and scale (centre and spread); correlation measures linear association between variables. One scatter plot shows correlation near 1 (tight clustering about a line), the other correlation near 0 (loose clustering).
Correlation measures clustering around a straight line and takes values in [−1, 1]. It is a scale-free measure of linear dependence between two variables.
Theorem
If X and Y are independent, then Corr(X, Y) = 0.
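Both situations in the figure can be reproduced with a short R simulation (sample size and noise level are arbitrary choices):

set.seed(1)
x = rnorm(1000)
y = rnorm(1000)                    # independent of x
cor(x, y)                          # near 0
z = 2 * x + rnorm(1000, sd = 0.2)  # strong linear dependence on x
cor(x, z)                          # near 1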
Covariance
Recall that the variance of a random variable X is given by Var(X) = E[(X − µ_X)²]. The covariance between two random variables X and Y is defined in a similar way:
The covariance between random variables X and Y is
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] ,
so that Var(X) = Cov(X, X) and Cov(X, Y) is the expected product of the deviations of each variable from its expected value.
Covariance cont.
Exercise 2.8.1
Show that the covariance is equivalent to
Cov(X, Y) = E[XY] − µ_X µ_Y .
Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
         = E[XY − µ_X Y − µ_Y X + µ_X µ_Y]
         = E[XY] − µ_X E[Y] − µ_Y E[X] + µ_X µ_Y
         = E[XY] − µ_X µ_Y − µ_Y µ_X + µ_X µ_Y
         = E[XY] − µ_X µ_Y
We have the following interpretation:
Cov(X, Y) = ρ σ_X σ_Y ,
where Corr(X, Y) = ρ, and σ_X and σ_Y are the square roots of the variances of X and Y respectively.
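The sample versions in R satisfy the same identity, as a quick sketch with simulated data shows:

set.seed(2)
x = rnorm(500)
y = x + rnorm(500)
cov(x, y)                  # sample covariance
cor(x, y) * sd(x) * sd(y)  # identical: Cov(X,Y) = rho * sigma_X * sigma_Y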
Quantiles
Often interest is in the values of a continuous random variable which are not exceeded with a given probability, e.g. the income of the lowest 10% of income tax payers or the score of the top 5% of students.
Let X be a random variable and p any value such that 0 ≤ p ≤ 1. Then the pth quantile of the distribution of X is the value x_p that satisfies:
P(X ≤ x_p) = p.
When p = 0.5, the quantile x_0.5 is called the median.
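In R, the q-functions act as quantile functions, inverting the corresponding p-functions (c.d.f.s); a quick sketch using the Exponential distribution covered later:

p = 0.5
xp = qexp(p, rate = 1)  # median of Exponential(1): log(2) = 0.6931472
pexp(xp, rate = 1)      # recovers p = 0.5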
pth Quantile
See Figure 6 for visualisation.
Figure: x_p is the pth quantile obtained from the c.d.f.
Quartiles
The quartiles of a distribution are the values at which we can cut the distribution into four equally likely slices: (x_0.25, x_0.5, x_0.75). Figure 7 shows the quartiles on the c.d.f. and p.d.f.
Figure: Quartiles (x_0.25, x_0.5, x_0.75) shown on the c.d.f. and p.d.f. respectively.
Uniform distribution
This distribution is used to model variables that can take any value on a fixed interval, when the probability of occurrence does not vary over the interval.
The p.d.f. of a Uniform random variable X, distributed on the interval (a, b), is given by:
f(x; θ) = 1/(b − a)  if a < x < b;
          0          otherwise,
where θ = (a, b) and Θ is the set of (a, b) such that −∞ < a < b < ∞. This is written as X ∼ Uniform(a, b).
Figure: P.d.f. for a Uniform(a, b) random variable. Shaded area represents P(a < X ≤ x0).
Cumulative distribution function and quantiles
Exercise 2.10.1
For X ∼ Uniform(a, b),
(i) Find the c.d.f. and sketch its graph.
(ii) Find the mean µ_X and variance σ²_X.
(iii) Find the median and compare it to the mean.
(i) F(x) = ∫_a^x 1/(b − a) du = (x − a)/(b − a) for a ≤ x ≤ b, with F(x) = 0 for x < a and F(x) = 1 for x ≥ b.
(ii) µ_X = ∫_a^b x · 1/(b − a) dx = (a + b)/2 and
E[X²] = ∫_a^b x² · 1/(b − a) dx = (a² + ab + b²)/3 ,
so σ²_X = E[X²] − µ²_X = (b − a)²/12 .
(iii) Solving F(x) = (x − a)/(b − a) = 0.5 gives x_0.5 = a + 0.5(b − a) = (a + b)/2, the same as the mean.
Exercise 2.10.2
Numerically evaluate the p.d.f., c.d.f. and the quantile function of a Uniform distribution.
dunif(0.5, -2, 2)    # p.d.f. of Uniform(-2,2) at x=0.5, f(0.5)=0.25
punif(0.3, 0, 1)     # c.d.f. of Uniform(0,1) at x=0.3, P(X<=0.3)=0.3
qunif(0.975, -2, 2)  # quantile function of Uniform(-2,2), x_0.975 = 1.9
x = seq(-1, 1, length=101)
fx = dunif(x, -1, 1)
plot(x, fx)   # p.d.f. of Uniform(-1,1) as points
lines(x, fx)  # connect them with a line
Exponential distribution
This distribution is often used to model variables that are the times until specific events happen, when the events occur at random at a given rate over time.
The p.d.f. of an Exponential random variable X is
f(x; θ) = θ exp(−θx)  for x > 0;
          0           otherwise,
where 0 < θ. This is written as X ∼ Exponential(θ) and θ ∈ Θ = (0, ∞).
Figure: P.d.f. of the Exponential(θ) random variable with θ = 1. The shaded area represents P(X > 1).
Shape of Exponential density function
Exercise 2.10.3
How is the shape of the function related to the parameter θ? Which one has lower tail probability P(X > 10)? Find P(X > 1) when θ = 1.
P(X > 1) = ∫_1^∞ exp(−x) dx
         = [−exp(−x)]_1^∞ = 0 − (−exp(−1))
         = exp(−1) = 0.3679
Cumulative distribution function and quantiles
The c.d.f. of the Exponential(θ) distribution is
F(x) = 0              if x ≤ 0 ,
       1 − exp(−θx)   if x > 0 .
Exercise 2.10.4
For X ∼ Exponential(θ),
(i) Derive the cumulative distribution function from the p.d.f.
(ii) Find the median.
(iii) The mean µ_X of the Exponential(θ) is 1/θ. Compare the median and the mean. Which one is larger? Why does the median differ from the mean?
(i) F(x) = ∫_0^x f(u) du = ∫_0^x θ exp(−θu) du = 1 − exp(−θx)
(ii) Solving F(x_p) = p gives x_p = θ^{−1} log{(1 − p)^{−1}}. For the median, p = 0.5, so x_0.5 = (1/θ) log 2.
(iii) µ_X = 1/θ > (1/θ) log 2 = x_0.5. As the distribution is not symmetric, the mean and the median are not the same. The median is smaller because probability is concentrated on smaller values, while the thin right tail stretches far to the right and pulls the mean up.
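A quick numerical illustration in R (the rate θ = 2 is an arbitrary choice):

theta = 2
qexp(0.5, rate = theta)  # median = log(2)/theta = 0.3465736
1/theta                  # mean = 0.5, larger than the median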
Exercise 2.10.5
Numerically evaluate the p.d.f., c.d.f. and the quantile function of an Exponential distribution.
dexp(3, rate=2)    # p.d.f. of Exponential(2) at x=3, f(3)=0.004957504
pexp(3, 5)         # c.d.f. of Exponential(5) at x=3, P(X<=3)=0.9999997
qexp(0.5, rate=2)  # quantile function of Exponential(2), median = 0.3465736
x = seq(0, 4, length=100)
fx = dexp(x, rate=2)
plot(x, fx)   # p.d.f. of Exponential(2) as points
lines(x, fx)  # connect them with a line
Exercise 2.10.6
Suppose that a goal is scored at random in a fixed time of the cup final and the time until the event can be modelled by an Exponential distribution with rate parameter θ = 2/3 per hour. If the first goal has been scored just now, what is the probability that the waiting time until the next goal is
(i) more than 30 minutes;
(ii) between 30 and 50 minutes?
Let X be the random variable of the waiting time in hours. Then X ∼ Exponential(2/3) and F(x) = 1 − exp(−(2/3)x).
(i) P(X > 1/2) = 1 − F(1/2) = exp(−2/3 · 1/2) = 0.7165
(ii) P(1/2 < X < 5/6) = F(5/6) − F(1/2)
                      = exp(−2/3 · 1/2) − exp(−2/3 · 5/6)
                      = 0.1428
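The same probabilities computed directly in R:

pexp(1/2, rate = 2/3, lower.tail = FALSE)      # (i)  0.7165
pexp(5/6, rate = 2/3) - pexp(1/2, rate = 2/3)  # (ii) 0.1428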
Normal distribution: background
(quoted from weblib/Gauss.html)
The normal distribution was introduced by the French mathematician Abraham De Moivre in 1733. De Moivre used this distribution to approximate probabilities of winning in various games of chance involving coin tossing.
It was later used by the German mathematician Carl Friedrich Gauss to predict the location of astronomical bodies and became known as the Gaussian distribution.
In the late nineteenth century statisticians started to believe that most data sets would have histograms with the Gaussian bell-shaped form, and that all normal data sets would follow this form, and so the curve came to be known as the normal curve.
Normal distribution
The p.d.f. of a Normal random variable X is
f(x; θ) = (1/(√(2π)σ)) exp{−(1/2)((x − µ)/σ)²} ,
where θ = (µ, σ), −∞ < µ < ∞, 0 < σ and −∞ < x < ∞. This is written as X ∼ N(µ, σ²) and θ ∈ Θ = (−∞, ∞) × (0, ∞).
E[X] = µ , Var(X) = σ²
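A quick check in R that dnorm matches this formula; the values µ = 2, σ = 3, x = 1 are arbitrary:

mu = 2; sigma = 3; x = 1
dnorm(x, mean = mu, sd = sigma)                       # built-in p.d.f.
1/(sqrt(2*pi)*sigma) * exp(-0.5*((x - mu)/sigma)^2)   # same value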
Shape of Normal density function
Exercise 2.10.7
How is the shape of the function related to the parameter θ? Which one has higher probability P(|X| > 3)?
Figure: P.d.f.s for Normal(µ, σ²) random variables where µ = 0 and σ = 0.5, 1, 1.5.
The larger σ is, the more spread out the density. So σ = 1.5 has the largest probability P(|X| > 3) and σ = 0.5 has the smallest.
Cumulative distribution function and quantiles
The normal c.d.f. is
F(x) = ∫_{−∞}^{x} f(u) du = ∫_{−∞}^{x} (1/√(2πσ²)) exp{−(1/2)((u − µ)/σ)²} du .
This does not have a closed-form expression, so numerical evaluation is required if we want to obtain probabilities of the form P(X ≤ x) or quantiles.
Exercise 2.10.8
Numerically evaluate the p.d.f., c.d.f. and the quantile function of a Normal distribution.
pnorm(0, mean=2, sd=sqrt(5))  # P(X<0) when X ~ N(2,5)
pnorm(0, 2, sqrt(5))          # the same with positional arguments, 0.1855467
1-pnorm(-2, 0, 2)             # P(X>-2) when X ~ N(0,4), 0.8413447
qnorm(0.975, 0, 1)            # u such that P(X<u)=0.975, 1.959964
Note that the R functions for the Normal distribution use the standard deviation σ, not the variance σ².
Exercise 2.10.9
A normal distribution is proposed to model the variation in height of women, with parameters µ = 160 and σ² = 25 measured in cm. Find the proportion of tall women, defined as over 175 cm tall.
Let X be the random variable of women's height; then X ∼ Normal(160, 5²). So
P(X > 175) = ∫_{175}^{∞} (1/(√(2π) · 5)) exp{−(1/2)((x − 160)/5)²} dx
In the above example we have expressed the proportion in terms of an integral and as the number of deviations from the mean. The integral is impossible to calculate analytically, so numerical evaluation is required to obtain probabilities or quantiles.
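In R the numerical evaluation is immediate; both lines below give the same answer:

1 - pnorm(175, mean = 160, sd = 5)                 # 0.001349898
integrate(function(x) dnorm(x, 160, 5), 175, Inf)  # same, by numerical integration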
Standardization of the random variable
It is useful to express such probabilities in terms of a standardized random variable, with µ = 0 and σ = 1.
If X ∼ N(µ, σ²) then
Z = (X − µ)/σ ∼ N(0, 1) ,
and conversely if Z ∼ N(0, 1), then
X = µ + σZ ∼ N(µ, σ²) .
The formal proof will be given in M230 and here it is sufficient to note that
E[Z] = 0 , Var[Z] = 1 .
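A simulation sketch of the standardization in R (the N(160, 5²) model and the sample size are arbitrary choices):

set.seed(3)
x = rnorm(1e5, mean = 160, sd = 5)
z = (x - 160)/5
mean(z); var(z)   # approximately 0 and 1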
Standard normal distribution
A random variable Z is said to have a standard normal distribution with mean 0 and standard deviation 1 if its p.d.f. is given by
f(z) = (1/√(2π)) exp(−z²/2) ,
where −∞ < z < ∞, and is denoted by Z ∼ N(0, 1).
The cumulative distribution function, i.e. the area under the curve, of the standard normal variable Z is given by
Φ(z) = P(Z ≤ z) = ∫_{−∞}^{z} (1/√(2π)) exp(−x²/2) dx .
Values of Φ(z) are obtained from a table of standard normal probabilities or from computer software such as R: pnorm(z)
z -3.00 -2.33 -1.67 -1.00 -0.33 0.33 1.00
Φ(z) 0.0013 0.0098 0.0478 0.1587 0.3694 0.6306 0.8413
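The tabulated z values appear to be multiples of 1/3 rounded to two decimals; under that assumption, pnorm reproduces the Φ(z) row:

pnorm(c(-3, -7/3, -5/3, -1, -1/3, 1/3, 1))
# rounded to 4 d.p.: 0.0013 0.0098 0.0478 0.1587 0.3694 0.6306 0.8413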
Standard normal distribution: example
Exercise 2.10.10
We repeat the previous example to illustrate the standardization procedure:
P(X > 175) = P((X − 160)/5 > (175 − 160)/5)
           = P(Z > 3) = 1 − P(Z ≤ 3)
           = 1 − Φ(3)
           = 1 − 0.9987 = 0.0013
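The same value directly from R:

1 - pnorm(3)   # 0.001349898, i.e. 0.0013 to 4 d.p.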
Probabilities of Normal distribution
P(µ − σ < X < µ + σ) = 0.683
P(µ − 2σ < X < µ + 2σ) = 0.954
P(µ − 3σ < X < µ + 3σ) = 0.997
Figure: Illustration of the coverage probabilities of the Normal(µ, σ²) distribution.