probability and statistics - cairo university (b) the percentage of healthy adult males who have...

Week 2 Sampling Distributions & Confidence Intervals


Post on 16-Jul-2020


Page 1: Probability and Statistics - Cairo University (b) The percentage of healthy adult males who have hemoglobin level less than 14 is P(X 14) 100% = 0.01320 100% =1.32% Therefore, 1.32%

Week 2: Sampling Distributions & Confidence Intervals


Objectives

By the end of this lesson, you should be able to:

• Explain the important role of the normal distribution as a sampling distribution

• Explain the general concepts of estimating the parameters of a population or a probability distribution

• Understand the central limit theorem

• Construct point and interval estimation of a parameter


Statistics in Action

• It is helpful to put statistics in the context of a general process of investigation:

1. Identify a question or problem.

2. Collect relevant data on the topic.

3. Analyze the data.

4. Form a conclusion.


Population & Sample

• We collect a sample of data to better understand the characteristics of a population.

• A variable is a characteristic we measure for each individual or case.

• The overall quantity of interest may be the mean, median, proportion, or some other summary of a population.

• These population values are called parameters.

• We estimate the value of a parameter by taking a sample and computing a numerical summary called a statistic based on that sample.

• Note that the two p's (population, parameter) go together and the two s's (sample, statistic) go together.


Fundamental Data Descriptions

Random Sampling:

Definition

A population consists of the totality of the observations with which we are concerned.

Definition

A sample is a subset of a population.

• Each observation in a population is a value of a random variable X having some probability distribution f(x).

• To eliminate bias in the sampling procedure, we select a random sample in the sense that the observations are made independently and at random.

• The random sample of size n is: $X_1, X_2, \dots, X_n$. It consists of n observations selected independently and randomly from the population.


Some Important Statistics:

Definition:

Any function of the random sample X1, X2, …, Xn is called a statistic.

Location Measure of a Sample:

Definition

If $X_1, X_2, \dots, X_n$ represents a random sample of size n, then the sample mean is defined to be the statistic:

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum_{i=1}^{n} X_i}{n} \quad \text{(unit)}$$

· $\bar{X}$ is a statistic because it is a function of the random sample $X_1, X_2, \dots, X_n$.

· $\bar{X}$ has the same unit as $X_1, X_2, \dots, X_n$.

· $\bar{X}$ measures the central tendency in the sample (location).


Variability in the Sample:

Definition

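If $X_1, X_2, \dots, X_n$ represents a random sample of size n, then the sample variance is defined to be the statistic:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n-1} \quad \text{(unit)}^2$$

Note:

· $S^2$ is a statistic because it is a function of the random sample $X_1, X_2, \dots, X_n$.

· $S^2$ measures the variability in the sample.

The sample standard deviation is:

$$S = \sqrt{S^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}} \quad \text{(unit)}$$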
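The three statistics defined so far (sample mean, sample variance, sample standard deviation) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the data values are hypothetical and are not taken from the slides.

```python
import math


def sample_mean(xs):
    """Sample mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sample variance S^2 with the n - 1 divisor used in the definition."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)


def sample_std(xs):
    """Sample standard deviation S = sqrt(S^2)."""
    return math.sqrt(sample_variance(xs))


# Hypothetical observations, chosen only for illustration.
data = [2.0, 4.0, 4.0, 6.0, 9.0]
print(sample_mean(data), sample_variance(data), sample_std(data))
```

Note that the mean and the standard deviation carry the unit of the data, while the variance carries the squared unit.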


Normal Distribution


Normal Distribution

The normal distribution is one of the most important continuous distributions.

Many measurable characteristics are normally or approximately normally distributed, such as height and weight.

The graph of the probability density function (pdf) of a normal distribution, called the normal curve, is a bell-shaped curve.


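[Figure: normal curve. About 68% of the values lie within one standard deviation of the mean and about 95% within two (13.5% in each band between one and two standard deviations). The two tails hold 2.5% each; together they form the 5% region of rejection of the null hypothesis for a non-directional (two-tail) test. The curve is symmetrical: half the scores lie above the mean and half below. Examples: body temperature, shoe sizes, diameters of trees, weight, height, IQ.]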


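The pdf of the normal distribution depends on two parameters: mean $= E(X) = \mu$ and variance $= Var(X) = \sigma^2$.

If the random variable X has a normal distribution with mean $\mu$ and variance $\sigma^2$, we write:

$X \sim \text{Normal}(\mu, \sigma)$ or $X \sim N(\mu, \sigma)$

The pdf of $X \sim \text{Normal}(\mu, \sigma)$ is given by:

$$f(x) = n(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}; \quad -\infty < x < \infty, \; \sigma > 0$$

The location of the normal distribution depends on $\mu$ and its shape depends on $\sigma$.

Suppose we have two normal distributions, $N(\mu_1, \sigma_1)$ (solid curve) and $N(\mu_2, \sigma_2)$ (dashed curve).

[Figure: the case $\mu_1 < \mu_2$, $\sigma_1 = \sigma_2$]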


[Figures: the cases $\mu_1 = \mu_2$, $\sigma_1 < \sigma_2$ and $\mu_1 < \mu_2$, $\sigma_1 < \sigma_2$]

Some properties of the normal curve f(x) of $N(\mu, \sigma)$:

1. f(x) is symmetric about the mean $\mu$.

2. f(x) has two points of inflection at $x = \mu \pm \sigma$.

3. The total area under the curve of f(x) = 1.

4. The highest point of the curve of f(x) is at the mean $\mu$.


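Areas Under the Normal Curve of $X \sim N(\mu, \sigma)$

The probabilities of the normal distribution $N(\mu, \sigma)$ depend on $\mu$ and $\sigma$.

$$P(X \le a) = \int_{-\infty}^{a} f(x)\,dx$$

$$P(X \ge b) = \int_{b}^{\infty} f(x)\,dx$$

$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$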


Areas Under the Normal Curve:

Definition

The Standard Normal Distribution:

• The normal distribution with mean $\mu = 0$ and variance $\sigma^2 = 1$ is called the standard normal distribution and is denoted by Normal(0,1) or N(0,1). If the random variable Z has the standard normal distribution, we write Z~Normal(0,1) or Z~N(0,1).

• The pdf of Z~N(0,1) is given by:

$$f(z) = n(z; 0, 1) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^2}$$


• The standard normal distribution, Z~N(0,1), is very important because probabilities of any normal distribution can be calculated from the probabilities of the standard normal distribution.

• Probabilities of the standard normal distribution Z~N(0,1) of the form $P(Z \le a)$ are tabulated.

$$P(Z \le a) = \int_{-\infty}^{a} f(z)\,dz = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} z^2}\,dz = \text{from the table}$$


Probabilities of Z~N(0,1):

Suppose Z ~ N(0,1).

· $P(Z \le a)$ = from Table (A.3)

· $P(Z \ge b) = 1 - P(Z \le b)$

· $P(a \le Z \le b) = P(Z \le b) - P(Z \le a)$

Note: P(Z=a)=0 for every a.

· We can transform any normal distribution $X \sim N(\mu, \sigma)$ to the standard normal distribution, Z~N(0,1), by using the following result.

Result: If $X \sim N(\mu, \sigma)$, then

$$Z = \frac{X - \mu}{\sigma} \sim N(0,1)$$


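Example:

Suppose Z~N(0,1).

(1) $P(Z \le 1.50) = 0.9332$

(Table: row 1.5, column 0.00 gives 0.9332.)

(2) $P(Z \ge 0.98) = 1 - P(Z \le 0.98) = 1 - 0.8365 = 0.1635$

(Table: row 0.9, column 0.08 gives 0.8365.)

(3) $P(-1.33 \le Z \le 2.42) = P(Z \le 2.42) - P(Z \le -1.33) = 0.9922 - (1 - 0.9082) = 0.9004$

(Table: row 1.3, column 0.03 gives 0.9082; row 2.4, column 0.02 gives 0.9922.)

(4) $P(Z \le 0) = P(Z \ge 0) = 0.5$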
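As a cross-check on the table lookups in this example, the standard normal CDF can be written with the error function from Python's standard library. This is a sketch rather than part of the original slides; the helper name `phi` is an assumption made for illustration.

```python
import math


def phi(z):
    """Standard normal CDF, P(Z <= z), expressed through math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


p1 = phi(1.50)                # (1) P(Z <= 1.50)
p2 = 1.0 - phi(0.98)          # (2) P(Z >= 0.98)
p3 = phi(2.42) - phi(-1.33)   # (3) P(-1.33 <= Z <= 2.42)
p4 = phi(0.0)                 # (4) P(Z <= 0)

for label, p in [("(1)", p1), ("(2)", p2), ("(3)", p3), ("(4)", p4)]:
    print(label, round(p, 4))
```

The computed values should agree with the four-decimal table values up to rounding.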


Example:

Suppose Z~N(0,1). Find the value of k such that $P(Z \le k) = 0.0207$.

Solution:

The probability is less than 0.5, so k is negative. Find the Z-value with probability $1 - 0.0207 = 0.9793$: the table (row 2.0, column 0.04) gives 0.9793, so $k = -2.04$.

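Probabilities of $X \sim N(\mu, \sigma)$:

Result: $X \sim N(\mu, \sigma) \iff Z = \dfrac{X - \mu}{\sigma} \sim N(0,1)$

$$X \le a \iff \frac{X - \mu}{\sigma} \le \frac{a - \mu}{\sigma} \iff Z \le \frac{a - \mu}{\sigma}$$

1) $P(X \le a) = P\left(Z \le \dfrac{a - \mu}{\sigma}\right)$

2) $P(X \ge a) = 1 - P(X \le a) = 1 - P\left(Z \le \dfrac{a - \mu}{\sigma}\right)$

3) $P(a \le X \le b) = P(X \le b) - P(X \le a) = P\left(Z \le \dfrac{b - \mu}{\sigma}\right) - P\left(Z \le \dfrac{a - \mu}{\sigma}\right)$

4) P(X=a)=0 for every a.

5) $P(X \le \mu) = P(X \ge \mu) = 0.5$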


Example:

Suppose that the hemoglobin level for healthy adult males has a normal distribution with mean $\mu = 16$ and variance $\sigma^2 = 0.81$ (standard deviation $\sigma = 0.9$).

(a) Find the probability that a randomly chosen healthy adult male has hemoglobin level less than 14.

(b) What is the percentage of healthy adult males who have hemoglobin level less than 14?

Solution:

Let X = the hemoglobin level for a healthy adult male.

$X \sim N(\mu, \sigma) = N(16, 0.9)$.

(a)

$$P(X \le 14) = P\left(Z \le \frac{14 - \mu}{\sigma}\right) = P\left(Z \le \frac{14 - 16}{0.9}\right) = P(Z \le -2.22) = 1 - 0.9868 = 0.0132$$


(b) The percentage of healthy adult males who have hemoglobin level less than 14 is

$$P(X \le 14) \times 100\% = 0.01320 \times 100\% = 1.32\%$$

Therefore, 1.32% of healthy adult males have hemoglobin level less than 14.

Example:

Suppose that the birth weight of babies has a normal distribution with mean $\mu = 3.4$ and standard deviation $\sigma = 0.35$.

(a) Find the probability that a randomly chosen baby has a birth weight between 3.0 and 4.0 kg.

(b) What is the percentage of babies who have a birth weight between 3.0 and 4.0 kg?


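Solution:

X = birth weight of a baby

$\mu = 3.4$, $\sigma = 0.35$ ($\sigma^2 = 0.35^2 = 0.1225$)

$X \sim N(3.4, 0.35)$

(a)

$$P(3.0 < X < 4.0) = P(X < 4.0) - P(X < 3.0) = P\left(Z \le \frac{4.0 - \mu}{\sigma}\right) - P\left(Z \le \frac{3.0 - \mu}{\sigma}\right)$$

$$= P\left(Z \le \frac{4.0 - 3.4}{0.35}\right) - P\left(Z \le \frac{3.0 - 3.4}{0.35}\right) = P(Z \le 1.71) - P(Z \le -1.14) = 0.9564 - 0.1271 = 0.8293$$

(b) The percentage of babies who have a birth weight between 3.0 and 4.0 kg is

$$P(3.0 < X < 4.0) \times 100\% = 0.8293 \times 100\% = 82.93\%$$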
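The hemoglobin and birth-weight examples can be checked numerically with `statistics.NormalDist` from the Python standard library, which takes the mean and the standard deviation (matching the $N(\mu, \sigma)$ convention used here). This is a sketch, not the slides' method; the variable names are illustrative. Because the code does not round z to two decimals before the lookup, its results may differ from the table-based answers in the third or fourth decimal place.

```python
from statistics import NormalDist

# Hemoglobin level: mean 16, standard deviation 0.9.
hemoglobin = NormalDist(mu=16.0, sigma=0.9)
p_below_14 = hemoglobin.cdf(14.0)

# Birth weight in kg: mean 3.4, standard deviation 0.35.
birth_weight = NormalDist(mu=3.4, sigma=0.35)
p_between = birth_weight.cdf(4.0) - birth_weight.cdf(3.0)

print(f"P(X < 14)        = {p_below_14:.4f}  ({p_below_14:.2%})")
print(f"P(3.0 < X < 4.0) = {p_between:.4f}  ({p_between:.2%})")
```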


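Notation:

$P(Z \ge Z_A) = A$

Result:

$Z_A = -Z_{1-A}$

Example:

Z ~ N(0,1)

$P(Z \ge Z_{0.025}) = 0.025$

$P(Z \ge Z_{0.95}) = 0.95$

$P(Z \ge Z_{0.90}) = 0.90$

Example:

Z ~ N(0,1)

$Z_{0.025} = 1.96$

$Z_{0.95} = -1.645$

$Z_{0.90} = -1.285$

(Table: row 1.9, column 0.06 gives 0.975.)

$P(Z \ge Z_{0.025}) = 0.025 \iff P(Z \le Z_{0.025}) = 0.975$

$Z_{0.025} = 1.96$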
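The upper-tail notation $Z_A$ corresponds to an inverse-CDF lookup at $1 - A$. The sketch below uses `NormalDist().inv_cdf` from the standard library; the helper name `z_upper` is hypothetical. Exact quantiles can differ slightly from values interpolated by eye in a printed table (for example, the exact $Z_{0.90}$ is close to -1.2816).

```python
from statistics import NormalDist


def z_upper(area):
    """Return Z_A, the value with P(Z >= Z_A) = area, for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - area)


for a in (0.025, 0.95, 0.90):
    print(f"Z_{a} = {z_upper(a):.4f}")

# Symmetry of the standard normal curve: Z_A = -Z_{1-A}.
print(z_upper(0.3), -z_upper(0.7))
```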


Example

In an industrial process, the diameter of a ball bearing is an important component part. The buyer sets specifications on the diameter to be 3.00 ± 0.01 cm. The implication is that no part falling outside these specifications will be accepted. It is known that, in the process, the diameter of a ball bearing has a normal distribution with mean 3.00 cm and standard deviation 0.005 cm. On the average, how many manufactured ball bearings will be scrapped?


Solution:

$\mu = 3.00$

$\sigma = 0.005$

X = diameter

$X \sim N(3.00, 0.005)$

The specification limits are: 3.00 ± 0.01

$x_1$ = Lower limit = 3.00 − 0.01 = 2.99

$x_2$ = Upper limit = 3.00 + 0.01 = 3.01

$$P(x_1 < X < x_2) = P(2.99 < X < 3.01) = P(X < 3.01) - P(X < 2.99)$$

$$= P\left(Z \le \frac{3.01 - \mu}{\sigma}\right) - P\left(Z \le \frac{2.99 - \mu}{\sigma}\right) = P\left(Z \le \frac{3.01 - 3.00}{0.005}\right) - P\left(Z \le \frac{2.99 - 3.00}{0.005}\right)$$

$$= P(Z \le 2.00) - P(Z \le -2.00) = 0.9772 - 0.0228 = 0.9544$$

Therefore, on the average, 95.44% of manufactured ball bearings will be accepted and 4.56% will be scrapped.
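The ball-bearing calculation can be reproduced as follows. This is a sketch using `statistics.NormalDist`; the variable names are illustrative and not from the slides.

```python
from statistics import NormalDist

# Diameter in cm: mean 3.00, standard deviation 0.005.
diameter = NormalDist(mu=3.00, sigma=0.005)
lower, upper = 3.00 - 0.01, 3.00 + 0.01   # specification limits

accepted = diameter.cdf(upper) - diameter.cdf(lower)
scrapped = 1.0 - accepted

print(f"accepted: {accepted:.2%}, scrapped: {scrapped:.2%}")
```

The result should match the table-based 95.44% and 4.56% up to rounding in the last digit.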

Example

Gauges are used to reject all components where a certain dimension is not within the specifications 1.50 ± d. It is known that this measurement is normally distributed with mean 1.50 and standard deviation 0.20. Determine the value d such that the specifications cover 95% of the measurements.

Solution:

$\mu = 1.5$

$\sigma = 0.20$

X = measurement

$X \sim N(1.5, 0.20)$

The specification limits are: 1.5 ± d

$x_1$ = Lower limit = 1.5 − d

$x_2$ = Upper limit = 1.5 + d

$P(X > 1.5 + d) = 0.025 \iff P(X < 1.5 + d) = 0.975$

$P(X < 1.5 - d) = 0.025$

$$P\left(\frac{X - \mu}{\sigma} \le \frac{(1.5 - d) - \mu}{\sigma}\right) = 0.025$$

$$P\left(Z \le \frac{(1.5 - d) - 1.5}{0.20}\right) = 0.025$$

$$P\left(Z \le \frac{-d}{0.20}\right) = 0.025$$

(Table: row −1.9, column 0.06 gives 0.025.)

$$\frac{-d}{0.20} = -1.96$$

Note: equivalently, $P\left(Z \ge \dfrac{d}{0.20}\right) = 0.025$, so $\dfrac{d}{0.20} = Z_{0.025} = 1.96$.

$$d = (0.20)(1.96) = 0.392$$

The specification limits are:

$x_1$ = Lower limit = 1.5 − d = 1.5 − 0.392 = 1.108

$x_2$ = Upper limit = 1.5 + d = 1.5 + 0.392 = 1.892

Therefore, 95% of the measurements fall within the specifications (1.108, 1.892).
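The half-width d in this example is the standard deviation times the upper 2.5% point of the standard normal distribution. A sketch of that calculation, with illustrative variable names and using the exact quantile instead of the rounded table value 1.96:

```python
from statistics import NormalDist

mu, sigma = 1.5, 0.20
coverage = 0.95

# Upper-tail point Z_{0.025}: leaves (1 - coverage) / 2 in each tail.
z = NormalDist().inv_cdf(1.0 - (1.0 - coverage) / 2.0)
d = sigma * z
limits = (mu - d, mu + d)

# Check: the probability inside the limits should equal the coverage.
measurement = NormalDist(mu, sigma)
inside = measurement.cdf(limits[1]) - measurement.cdf(limits[0])

print(f"d = {d:.3f}, limits = ({limits[0]:.3f}, {limits[1]:.3f}), inside = {inside:.4f}")
```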


Sampling Distributions


Sampling distribution:

Definition

The probability distribution of a statistic is called a sampling distribution.

· Example: If $X_1, X_2, \dots, X_n$ represents a random sample of size n, then the probability distribution of $\bar{X}$ is called the sampling distribution of the sample mean $\bar{X}$.

Sampling Distributions of Means:

If $X_1, X_2, \dots, X_n$ is a random sample of size n taken from a normal distribution with mean $\mu$ and variance $\sigma^2$, i.e. $N(\mu, \sigma)$, then the sample mean $\bar{X}$ has a normal distribution with mean

$$E(\bar{X}) = \mu_{\bar{X}} = \mu$$

and variance

$$Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$$

· If $X_1, X_2, \dots, X_n$ is a random sample of size n from $N(\mu, \sigma)$, then $\bar{X} \sim N(\mu_{\bar{X}}, \sigma_{\bar{X}})$ or $\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$.

· $\bar{X} \sim N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right) \iff Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$


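Central Limit Theorem

If $X_1, X_2, \dots, X_n$ is a random sample of size n from any distribution (population) with mean $\mu$ and finite variance $\sigma^2$, then, if the sample size n is large, the random variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

is approximately a standard normal random variable, i.e.,

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \text{ approximately.}$$

$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \iff Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$

We consider n large when $n \ge 30$.

For large sample size n, $\bar{X}$ has approximately a normal distribution with mean $\mu$ and variance $\dfrac{\sigma^2}{n}$, i.e.,

$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \text{ approximately.}$$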
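A small simulation can make the Central Limit Theorem concrete. The sketch below is an illustration under stated assumptions, not part of the slides: it draws samples of size n = 30 from an Exponential(1) population (a clearly non-normal distribution whose mean and standard deviation are both 1), standardizes each sample mean, and checks how often Z falls within ±1.96. The function name, the choice of population, the number of repetitions, and the seed are all arbitrary.

```python
import math
import random
import statistics


def standardized_means(n, reps, seed=0):
    """Simulate Z = (xbar - mu) / (sigma / sqrt(n)) for Exponential(1) samples."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(rate=1)
    zs = []
    for _ in range(reps):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        zs.append((xbar - mu) / (sigma / math.sqrt(n)))
    return zs


zs = standardized_means(n=30, reps=5000)
within = sum(abs(z) <= 1.96 for z in zs) / len(zs)

print("mean of Z   :", round(statistics.mean(zs), 3))
print("sd of Z     :", round(statistics.stdev(zs), 3))
print("P(|Z|<=1.96):", round(within, 3))
```

If the approximation is reasonable, the simulated Z values should have mean near 0, standard deviation near 1, and roughly 95% of them inside ±1.96.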


Altman, D. G et al. BMJ 1995;310:298

Central Limit Theorem: the larger the sample size, the more closely the distribution of the sample mean approximates the normal distribution; that is, the means of samples taken at random from any distribution with finite variance will tend to form a normal curve.

[Figure labels: jagged, smooth]


The sampling distribution of $\bar{X}$ is used for inferences about the population mean $\mu$.

The standard deviation of the sampling distribution is called the standard error and is equal to $\dfrac{\sigma}{\sqrt{n}}$.

Example

An electric firm manufactures light bulbs that have a length of life that is approximately normally distributed with mean equal to 800 hours and a standard deviation of 40 hours. Find the probability that a random sample of 16 bulbs will have an average life of less than 775 hours.

Solution:

X = the length of life

$\mu = 800$, $\sigma = 40$

$X \sim N(800, 40)$

n = 16

$$\mu_{\bar{X}} = \mu = 800$$

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{16}} = 10$$

$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) = N(800, 10) \iff Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{X} - 800}{10} \sim N(0,1)$$

$$P(\bar{X} < 775) = P\left(\frac{\bar{X} - 800}{10} < \frac{775 - 800}{10}\right) = P\left(Z < \frac{775 - 800}{10}\right) = P(Z < -2.50) = 0.0062$$
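The light-bulb example comes down to evaluating a normal CDF with the standard error in place of the population standard deviation. A sketch, with illustrative variable names:

```python
import math
from statistics import NormalDist

mu, sigma, n = 800.0, 40.0, 16

standard_error = sigma / math.sqrt(n)         # sigma / sqrt(n) = 10
sample_mean_dist = NormalDist(mu, standard_error)
p_mean_below_775 = sample_mean_dist.cdf(775.0)

print(f"standard error = {standard_error}, P(xbar < 775) = {p_mean_below_775:.4f}")
```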


Estimation & Confidence Interval


Estimation Problems

· Suppose we have a population with some unknown parameter(s).

Example: Normal($\mu$, $\sigma$), where $\mu$ and $\sigma$ are parameters.

· We need to draw conclusions (make inferences) about the unknown parameters.

· We select samples, compute some statistics, and make inferences about the unknown parameters based on the sampling distributions of the statistics.

Statistical Inference

(1) Estimation of the parameters

Point Estimation

Interval Estimation (Confidence Interval)

(2) Tests of hypotheses about the parameters


Classical Methods of Estimation:

Point Estimation:

A point estimate of some population parameter $\theta$ is a single value $\hat{\theta}$ of a statistic $\hat{\Theta}$.

For example, the value $\bar{x}$ of the statistic $\bar{X}$ computed from a sample of size n is a point estimate of the population mean $\mu$.

Interval Estimation (Confidence Interval = C.I.):

An interval estimate of some population parameter $\theta$ is an interval of the form $(\hat{\theta}_L, \hat{\theta}_U)$, i.e., $\hat{\theta}_L < \theta < \hat{\theta}_U$. This interval contains the true value of $\theta$ "with probability $1-\alpha$", that is, $P(\hat{\theta}_L < \theta < \hat{\theta}_U) = 1 - \alpha$.


Example of Point Estimation


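Interval Estimation (Confidence Interval) of the Mean ($\mu$):

An interval estimate of some population parameter $\theta$ is an interval of the form $(\hat{\theta}_L, \hat{\theta}_U)$, i.e., $\hat{\theta}_L < \theta < \hat{\theta}_U$. This interval contains the true value of $\theta$ "with probability $1-\alpha$", that is, $P(\hat{\theta}_L < \theta < \hat{\theta}_U) = 1 - \alpha$.

· $(\hat{\theta}_L, \hat{\theta}_U)$ is called a $(1-\alpha)100\%$ confidence interval (C.I.) for $\theta$.

· $1 - \alpha$ is called the confidence coefficient.

· $\hat{\theta}_L$ = lower confidence limit

· $\hat{\theta}_U$ = upper confidence limit

· $\alpha$ = 0.1, 0.05, 0.025, 0.01 ($0 < \alpha < 1$)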


Interval Estimation (Confidence Interval) of the Mean ($\mu$):

If $\bar{X} = \sum_{i=1}^{n} X_i / n$ is the sample mean of a random sample of size n from a population (distribution) with mean $\mu$ and variance $\sigma^2$, then a $(1-\alpha)100\%$ confidence interval for $\mu$ can be calculated as follows, depending on whether the population variance $\sigma^2$ is known or not.


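(i) First Case: $\sigma^2$ is known:

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \iff \bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \iff \mu \in \left(\bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}, \; \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right)$$

where $Z_{\alpha/2}$ is the Z-value leaving an area of $\alpha/2$ to the right; i.e., $P(Z > Z_{\alpha/2}) = \alpha/2$, or equivalently, $P(Z < Z_{\alpha/2}) = 1 - \alpha/2$.

Note:

We are $(1-\alpha)100\%$ confident that $\mu \in \left(\bar{X} - Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}, \; \bar{X} + Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}\right)$.

The Z value is called the Z-score and the test is called the Z-test.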


Example

The average zinc concentration recorded from a sample of zinc measurements in 36 different locations is found to be 2.6 gram/milliliter. Find a 95% and 99% confidence interval (C.I.) for the mean zinc concentration in the river. Assume that the population standard deviation is 0.3.

Solution:

$\mu$ = the mean zinc concentration in the river.

Population: $\mu$ = ??, $\sigma = 0.3$

Sample: n = 36, $\bar{X} = 2.6$

First, a point estimate for $\mu$ is $\bar{X} = 2.6$.

(a) We want to find a 95% C.I. for $\mu$.

$\alpha$ = ??

$95\% = (1-\alpha)100\% \iff 0.95 = (1 - \alpha) \iff \alpha = 0.05 \iff \alpha/2 = 0.025$

$Z_{\alpha/2} = Z_{0.025} = 1.96$

A 95% C.I. for $\mu$ is

$$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \iff \bar{X} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

$$\iff 2.6 - (1.96)\frac{0.3}{\sqrt{36}} < \mu < 2.6 + (1.96)\frac{0.3}{\sqrt{36}}$$

$$\iff 2.6 - 0.098 < \mu < 2.6 + 0.098 \iff 2.502 < \mu < 2.698 \iff \mu \in (2.502, 2.698)$$

We are 95% confident that $\mu \in (2.502, 2.698)$.


(b) Similarly, we can find that a 99% C.I. for $\mu$ is

$$2.471 < \mu < 2.729 \iff \mu \in (2.471, 2.729)$$

We are 99% confident that $\mu \in (2.471, 2.729)$.

Notice that a 99% C.I. is wider than a 95% C.I. This is a tradeoff between accuracy and precision.
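Both zinc intervals follow the same recipe, so they can be sketched with one helper. The function name `z_interval` is hypothetical, and the code uses exact normal quantiles from `statistics.NormalDist` rather than rounded table values, so the limits may differ from the slide values in the fourth decimal place.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # Z_{alpha/2}
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


lo95, hi95 = z_interval(xbar=2.6, sigma=0.3, n=36, confidence=0.95)
lo99, hi99 = z_interval(xbar=2.6, sigma=0.3, n=36, confidence=0.99)

print(f"95% C.I.: ({lo95:.3f}, {hi95:.3f})")
print(f"99% C.I.: ({lo99:.3f}, {hi99:.3f})")
```

Raising the confidence level increases $Z_{\alpha/2}$ and therefore widens the interval, which is the tradeoff noted above.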

Theorem

If $\bar{X}$ is used as an estimate of $\mu$, we can then be $(1-\alpha)100\%$ confident that the error (in estimation) will not exceed $Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$.


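Example:

In the previous example, we are 95% confident that the sample mean $\bar{X} = 2.6$ differs from the true mean $\mu$ by an amount less than

$$Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = (1.96)\frac{0.3}{\sqrt{36}} = 0.098$$

Note:

Let e be the maximum amount of the error, that is $e = Z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$, then:

$$e = Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \iff \sqrt{n} = Z_{\alpha/2} \frac{\sigma}{e} \iff n = \left(Z_{\alpha/2} \frac{\sigma}{e}\right)^2$$

Theorem:

If $\bar{X}$ is used as an estimate of $\mu$, we can then be $(1-\alpha)100\%$ confident that the error (in estimation) will not exceed a specified amount e when the sample size is

$$n = \left(Z_{\alpha/2} \frac{\sigma}{e}\right)^2$$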


Example

How large a sample is required in the previous example if we want to be 95% confident that our estimate of $\mu$ is off by less than 0.05?

Solution:

We have $Z_{\alpha/2} = 1.96$, $\sigma = 0.3$, e = 0.05. Then by the Theorem,

$$n = \left(Z_{\alpha/2} \frac{\sigma}{e}\right)^2 = \left(1.96 \times \frac{0.3}{0.05}\right)^2 = 138.3 \approx 139$$

Therefore, we can be 95% confident that a random sample of size n = 139 will provide an estimate $\bar{X}$ differing from $\mu$ by an amount less than e = 0.05.
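The sample-size calculation in this example can be sketched as a small function. The name `required_sample_size` is hypothetical; the key design choice is rounding up with `math.ceil`, because rounding down would leave the error bound slightly above the target e.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, e, confidence):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= e (sigma known)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((z * sigma / e) ** 2)


n_needed = required_sample_size(sigma=0.3, e=0.05, confidence=0.95)
print(n_needed)
```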


T-Distribution:

· Recall that, if $X_1, X_2, \dots, X_n$ is a random sample of size n from a normal distribution with mean $\mu$ and variance $\sigma^2$, i.e. $N(\mu, \sigma)$, then

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$$

· We can apply this result only when $\sigma^2$ is known and the sample size n is 30 or more!

· If $\sigma^2$ is unknown (or n < 30), we replace the population variance $\sigma^2$ with the sample variance

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$

to have the following statistic:

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$


Result:

If $X_1, X_2, \dots, X_n$ is a random sample of size n from a normal distribution with mean $\mu$ and variance $\sigma^2$, i.e. $N(\mu, \sigma)$, then the statistic

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t-distribution with $\nu = n - 1$ degrees of freedom (df), and we write $T \sim t(\nu)$.

Note:

· The t-distribution is a continuous distribution.

· The shape of the t-distribution is similar to the shape of the standard normal distribution.


Z and T Distributions


$t_\alpha$ = the t-value above which we find an area equal to $\alpha$, that is $P(T > t_\alpha) = \alpha$.

Since the curve of the pdf of $T \sim t(\nu)$ is symmetric about 0, we have

$$t_{1-\alpha} = -t_\alpha$$

Values of $t_\alpha$ are tabulated in Tables.


Example:

Find the t-value with $\nu = 14$ (df) that leaves an area of:

(a) 0.95 to the left.

(b) 0.95 to the right.

Solution:

$\nu = 14$ (df); $T \sim t(14)$

(a) The t-value that leaves an area of 0.95 to the left is

$t_{0.05} = 1.761$

(b) The t-value that leaves an area of 0.95 to the right is

$t_{0.95} = -t_{1-0.95} = -t_{0.05} = -1.761$


Example:

For $\nu = 10$ degrees of freedom (df), find $t_{0.10}$ and $t_{0.85}$.

Solution:

$t_{0.10} = 1.372$

$t_{0.85} = -t_{1-0.85} = -t_{0.15} = -1.093$ ($t_{0.15} = 1.093$)


Interval Estimation (Confidence Interval) of the Mean ($\mu$):

(ii) Second Case: $\sigma^2$ is unknown (or n is small):

Recall:

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n-1)$$

Result:

If $\bar{X} = \sum_{i=1}^{n} X_i / n$ and $S = \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)}$ are the sample mean and the sample standard deviation of a random sample of size n from a normal population (distribution) with unknown variance $\sigma^2$, then a $(1-\alpha)100\%$ confidence interval for $\mu$ is:

$$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} \iff \bar{X} - t_{\alpha/2} \frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2} \frac{S}{\sqrt{n}} \iff \mu \in \left(\bar{X} - t_{\alpha/2} \frac{S}{\sqrt{n}}, \; \bar{X} + t_{\alpha/2} \frac{S}{\sqrt{n}}\right)$$

where $t_{\alpha/2}$ is the t-value with $\nu = n - 1$ degrees of freedom leaving an area of $\alpha/2$ to the right; i.e., $P(T > t_{\alpha/2}) = \alpha/2$, or equivalently, $P(T < t_{\alpha/2}) = 1 - \alpha/2$.

Example

The contents of 7 similar containers of sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0, 10.2, and 9.6 liters. Find a 95% C.I. for the mean of all such containers, assuming an approximate normal distribution.

Solution:

n = 7

$$\bar{X} = \sum_{i=1}^{n} X_i / n = 10.0 \qquad S = \sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)} = 0.283$$

First, a point estimate for $\mu$ is $\bar{X} = \sum_{i=1}^{n} X_i / n = 10.0$.

Now, we need to find a confidence interval for $\mu$.

$\alpha$ = ??

$95\% = (1-\alpha)100\% \iff 0.95 = (1-\alpha) \iff \alpha = 0.05 \iff \alpha/2 = 0.025$

$t_{\alpha/2} = t_{0.025} = 2.447$ (with $\nu = n - 1 = 6$ degrees of freedom)

A 95% C.I. for $\mu$ is

$$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} \iff \bar{X} - t_{\alpha/2} \frac{S}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2} \frac{S}{\sqrt{n}}$$

$$\iff 10.0 - (2.447)\frac{0.283}{\sqrt{7}} < \mu < 10.0 + (2.447)\frac{0.283}{\sqrt{7}}$$

$$\iff 10.0 - 0.262 < \mu < 10.0 + 0.262 \iff 9.74 < \mu < 10.26 \iff \mu \in (9.74, 10.26)$$

We are 95% confident that $\mu \in (9.74, 10.26)$.
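The sulfuric-acid interval can be reproduced with the standard library. One assumption to flag: Python's standard library has no t-distribution quantile function, so the sketch below hard-codes the critical value 2.447 that the example reads from the t-table for 6 degrees of freedom. The variable names are illustrative.

```python
import math
import statistics

contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]   # liters, from the example

n = len(contents)
xbar = statistics.mean(contents)
s = statistics.stdev(contents)        # sample standard deviation (n - 1 divisor)

# Table value t_{0.025} with n - 1 = 6 degrees of freedom, taken from the example.
t_crit = 2.447

margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"xbar = {xbar:.1f}, s = {s:.3f}, 95% C.I. = ({lower:.2f}, {upper:.2f})")
```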


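To summarize: Estimation of the Mean ($\mu$):

Recall:

$$E(\bar{X}) = \mu_{\bar{X}} = \mu$$

$$Var(\bar{X}) = \sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$$

$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \quad (\sigma^2 \text{ is known and } n \ge 30)$$

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n-1) \quad (\sigma^2 \text{ is unknown or } n \text{ is smaller than } 30)$$

We use the sampling distribution of $\bar{X}$ to make inferences about $\mu$.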