Stat 350 Lab Session GSI: Yizao Wang Section 016 Mon 2pm30-4pm MH 444-D Section 043 Wed 2pm30-4pm MH 444-B

Upload: madlyn-horton

Post on 17-Dec-2015


Stat 350 Lab Session

GSI: Yizao Wang

Section 016 Mon 2pm30-4pm MH 444-D

Section 043 Wed 2pm30-4pm MH 444-B

Outline

• Binomial and normal distribution

• Sampling distribution and CLT (Module 4)

• Confidence intervals (Module 5, Actv.1)

• Permission to post forms

• Today’s Qwizdom questions are anonymous. You don’t have to log in with your UMID.

Binomial Distribution

Example of B(n,p): coin flipping

Flip a coin n times. The probability of getting heads each time is p.

The number of heads we get in n flips is a random variable (r.v.) distributed as B(n,p).

Conditions to verify for a binomial r.v., illustrated with the number of heads from flipping a coin n times:

1- n trials, with n fixed in advance. (Flipping n times.)

2- 2 possible outcomes for each trial. (Heads or tails.)

3- Independent outcomes between trials. (The result of any flip won’t change the others.)

4- Probability of success is p, fixed for all trials (identical distribution). (Decided by the same coin.)

Another classical example is giving a survey with one ‘yes/no’ question to n randomly selected people.
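As a minimal sketch of the coin example, the probabilities P(X = k) for X ~ B(n,p) can be computed directly from the binomial formula. The values n = 10 and p = 0.5 are hypothetical, chosen only for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): choose which k of the n trials are successes."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # hypothetical: 10 flips of a fair coin

# Probability of each possible number of heads, 0 through n
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(f"P(X = 5) = {pmf[5]:.4f}")           # 252/1024, about 0.2461
print(f"sum over all k = {sum(pmf):.4f}")   # probabilities add up to 1
```

The probabilities over k = 0, ..., n always sum to 1, which is a quick sanity check on any binomial calculation.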

Normal Approximation of Binomial Distribution

X ~ B(n,p)

• P(X = k) is decided by the parameters…

but

• When n is large, it is very difficult to calculate!

• Approximation by the normal distribution:

Approximately X ~ N( np, sqrt(np(1-p)) )
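To see how close the approximation is, the sketch below compares an exact binomial probability with the normal one using mean np and standard deviation sqrt(np(1-p)). The numbers (n = 100, p = 0.5, k = 55) are hypothetical, and the +0.5 continuity correction is an addition that the slides do not mention.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p), summing the individual probabilities."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p, k = 100, 0.5, 55  # hypothetical: at most 55 heads in 100 fair flips

mu = n * p                     # mean of the approximating normal
sigma = sqrt(n * p * (1 - p))  # its standard deviation

exact = binomial_cdf(k, n, p)
# The +0.5 is a continuity correction (an assumption, not from the slides).
approx = NormalDist(mu, sigma).cdf(k + 0.5)

print(f"exact P(X <= {k})       = {exact:.4f}")
print(f"normal approximation   = {approx:.4f}")
```

For these values the two results agree to about two decimal places; the agreement is usually worse when n is small or p is close to 0 or 1.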

Normal Distribution

• An exactly normal distribution is very rare in the real world, but it is often a very good approximation and has some nice mathematical properties.

• Written as X ~ N(μ,σ)

• Z-score (z-statistic) is the standardized X, given by

Z = (X-μ)/σ

• Z ~ N(0,1) (why do we want to standardize X?)

• What do the normal distributions look like? How does the shape relate to the two parameters?
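One way to think about the standardization question: after converting to a z-score, every normal probability can be read off the single N(0,1) distribution. A minimal sketch, assuming hypothetical heights with μ = 170 and σ = 10:

```python
from statistics import NormalDist

mu, sigma = 170.0, 10.0  # hypothetical population mean and s.d. (heights in cm)
x = 185.0                # hypothetical observed value

z = (x - mu) / sigma     # z-score: how many s.d. the value lies above the mean

p_direct = NormalDist(mu, sigma).cdf(x)   # P(X <= x) from N(mu, sigma)
p_standard = NormalDist(0, 1).cdf(z)      # P(Z <= z) from N(0, 1)

print(f"z = {z}")
print(f"P(X <= {x}) = {p_direct:.4f}, P(Z <= {z}) = {p_standard:.4f}")
```

Both probabilities are the same, so one standard normal table (or function) is enough for any μ and σ.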

Normal Distribution

• 10-minute in-lab review (8 questions): CTools\Lab Info\Lab review: Normal Distribution

Population vs. Sample

                  Population                               Sample
Definition        Collection of items you want to study    Small collection of population items
Size              Too large                                Small
Example           Heights of all UM students               Heights of students in a certain lab
Random or fixed?  Fixed                                    Random (why?)

Parameters vs. Statistics

                      Parameters                    Statistics
Where are they from?  Population                    Sample
Example               Mean height of UM students    Mean height of students in a certain lab
Known or not?         No                            Calculable from sample
Random or fixed?      Fixed                         Random (why?)

Examples of parameters and corresponding statistics

                      Parameter                Statistic
Mean                  Population mean          Sample mean
Standard deviation    Population s.d.          Sample s.d.
Proportion            Population proportion    Sample proportion

Statistics are random variables. Parameters are constants.

Statistical Inference

• Population parameters are unknown constants.

• Statistics are random variables obtained through sampling.

• Statistical inference: using statistics to estimate parameters.

• Statistics are also called estimators (of a parameter). Example: X-bar is the estimator of μ.

• We need to study the distribution of statistics. (Random variables have fixed distributions.)

Sampling Distribution

• The probability distribution of a sample statistic is called its sampling distribution.

The X in the pictures is not a random variable… Consider it as X-bar.

Statistical Inference

What kind of estimators do we prefer?

• Unbiased: the mean of the estimator equals the parameter.

• Small variation: the estimator has a small standard deviation.
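Both properties can be checked by simulation. The sketch below repeatedly draws samples from a hypothetical normal population (μ = 170, σ = 10, values invented for illustration) and looks at how the sample mean X-bar behaves for two sample sizes.

```python
import random
from statistics import mean, stdev

random.seed(350)  # fixed seed so the run is reproducible

MU, SIGMA = 170.0, 10.0  # hypothetical population mean and s.d.

def sample_means(n, repetitions=2000):
    """Draw `repetitions` random samples of size n; return the mean of each sample."""
    return [
        mean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(repetitions)
    ]

# Sampling distribution of x-bar for a small and a larger sample size
results = {n: sample_means(n) for n in (5, 50)}

for n, means in results.items():
    print(f"n = {n:2d}: mean of x-bar = {mean(means):.2f}, "
          f"s.d. of x-bar = {stdev(means):.2f}")
```

In both cases the x-bar values center near μ (unbiased), while their spread is noticeably smaller for the larger sample size (small variation).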

Module 4

• Task 1-3

• Objectives: study the influence of the sample size and the distribution of parent population on the sampling distribution.

• Sampling Distribution Applet (CTools/lab info)

Summary

• The shape of the sampling distribution depends on the distribution of the original parent population as well as on the sample size.

• The sampling distribution is approximately normal when…

4(a) Sampling Dist. of the Sample Mean

If the parent popul. is a normal dist. with a mean μ and a stand. dev. σ, then for any sample size, the sample mean will have a __________ dist. with a mean of _____ and a stand. dev. of _____.

4(b) Central Limit Theorem

If the parent popul. is NOT a normal dist. but with a mean μ and a stand. dev. σ, then for a large sample size, the sample mean will have a __________ dist. with a mean of _____ and a stand. dev. of _____.
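The statement in 4(b) can be explored with a small simulation, similar in spirit to the Sampling Distribution Applet. This sketch assumes a hypothetical right-skewed parent population (exponential with μ = 1 and σ = 1) and a sample size of n = 40; it compares the center, spread, and skewness of the sample means with those of the raw data.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(4)  # fixed seed so the run is reproducible

def skewness(values):
    """Skewness: about 0 for symmetric data, positive for a long right tail."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

MU, SIGMA = 1.0, 1.0          # mean and s.d. of the exponential(1) parent
n, repetitions = 40, 5000     # hypothetical sample size and number of samples

# Raw observations from the skewed parent population
parent = [random.expovariate(1.0) for _ in range(repetitions)]

# Sample means from many independent samples of size n
means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(repetitions)
]

print(f"parent data:  skewness = {skewness(parent):.2f}")
print(f"sample means: skewness = {skewness(means):.2f}")
print(f"mean of x-bar = {mean(means):.3f}  (mu = {MU})")
print(f"s.d. of x-bar = {pstdev(means):.3f}  (sigma/sqrt(n) = {SIGMA / sqrt(n):.3f})")
```

The raw data stay clearly skewed no matter how many observations are drawn, while the distribution of the sample mean is much closer to symmetric, centered near μ with a spread near σ/sqrt(n).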

What is the distinction between 4(a) and 4(b)? Choose all that apply...

A) Shape of parent popul.

B) Shape of dist. of sample mean

C) Standard deviation of sample mean

D) Sample size

True or False

• If n is large, the sample data will always have a normal distribution.

Click in your answer.

Confidence Interval

Recall the parameter-statistic comparison…

• We never know the true population parameter value.

• We use a statistic from one sample (with several observations) to estimate it.

• A sample statistic may not be exactly equal to the corresponding parameter value. (why confidence interval?)

Example: we are 95% confident that the true parameter value lies inside the confidence interval [a, b].

A confidence interval provides a method of stating:

• What the interval tells: how close the value of a statistic is likely to be to the value of a parameter.

• What the confidence tells: the accuracy of it being that close.

Confidence Interval

Basic structure for any confidence interval:

estimate ± multiplier × standard error

• Estimate: a sample statistic such as p-hat or x-bar.

• Multiplier × standard error: the margin of error. The bigger the margin of error, the wider the CI (why?)
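A minimal sketch of this structure for a sample proportion p-hat. The survey counts (56 "yes" out of 100) are invented, and the specific choices here, the normal multiplier of about 1.96 and the standard error sqrt(p-hat(1-p-hat)/n), are the usual large-sample ones and are assumptions rather than something stated on the slide.

```python
from math import sqrt
from statistics import NormalDist

n, yes = 100, 56            # hypothetical survey: 56 "yes" answers out of 100
confidence = 0.95

p_hat = yes / n                                        # estimate
multiplier = NormalDist().inv_cdf((1 + confidence) / 2)  # about 1.96 for 95%
standard_error = sqrt(p_hat * (1 - p_hat) / n)         # assumed large-sample s.e.

margin = multiplier * standard_error                   # margin of error
lower, upper = p_hat - margin, p_hat + margin

print(f"estimate = {p_hat:.2f}, margin of error = {margin:.4f}")
print(f"{confidence:.0%} CI: [{lower:.4f}, {upper:.4f}]")
```

The interval is always twice the margin of error wide, so anything that increases the multiplier (higher confidence) or the standard error (smaller n) widens it.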

Two interpretations:

1. A 95% Confidence Interval: We are 95% confident that the true parameter value lies inside the confidence interval. The interval provides a range of reasonable values for the population parameter.

2. The 95% Confidence Level: If the procedure were repeated many times (that is, if we repeatedly took a random sample of the same size and computed the 95% confidence interval for each sample), we would expect 95% of the resulting confidence intervals to contain the true population parameter.
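The second interpretation can be checked by simulation, much like the Confidence Interval for Mean Applet. This sketch assumes a hypothetical normal population (μ = 170, σ = 10) with σ treated as known, so that the interval x-bar ± 1.96 × σ/sqrt(n) applies; the true mean is available here only because the data are simulated.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(5)  # fixed seed so the run is reproducible

MU, SIGMA = 170.0, 10.0      # hypothetical population mean and (known) s.d.
n, repetitions = 25, 2000    # sample size and number of repeated samples

multiplier = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
margin = multiplier * SIGMA / sqrt(n)      # same margin for every sample

covered = 0
for _ in range(repetitions):
    x_bar = mean([random.gauss(MU, SIGMA) for _ in range(n)])
    # Does this sample's interval contain the true population mean?
    if x_bar - margin <= MU <= x_bar + margin:
        covered += 1

coverage = covered / repetitions
print(f"{covered} of {repetitions} intervals contain mu ({coverage:.1%})")
```

Each individual interval either contains μ or it does not; the 95% describes the long-run fraction of intervals produced by the procedure that do.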

Confidence Interval

Principles for using CIs to guide decision making:

• Principle 1: A value not in a CI can be rejected as a possible value of the population parameter.

A value in a CI is an “acceptable” or “reasonable” possibility for the value of a population parameter.

• Principle 2: When the CIs for parameters for two different populations do not overlap, it is reasonable to conclude that the parameters for the two populations are different.

Confidence Interval

• The probability that the true parameter lies in a particular, already computed, confidence interval is either 0 or 1. The interval is now fixed and the parameter is not random, so the parameter is either in that particular interval or it is not.

Confidence Interval

• Good summary on p26

• Confidence Interval for Mean Applet (CTools/Lab Info)

Module 5 Activity 1

#4: Interpret the (95%) confidence level in terms of a popul. mean.

A) We are 95% confident that the popul. mean will be in the computed confidence interval.

B) The computed confidence interval will contain the popul. mean 95% of the time.

C) 95% of all confidence intervals created with this method are expected to contain the popul. mean.

Before we finish today…

Questions or comments?