Post on 02-Jun-2020


Introduction

Bruce A Craig

Department of Statistics

Purdue University

STAT 514 Topic 1

Outline

Class Website

Class Policies / Schedule

Overview of Course Material

Statistical Software

Overview of Design of Experiments (DOE)

Background Reading


Class Website

www.stat.purdue.edu/~bacraig/stat514.html

Course syllabus / Announcements

Lecture notes

SAS template programs and (possibly) help videos

Homework assignments

Exam and homework schedule

Information about group project

Data sets for lectures and homework


Class Policies - Attendance

On-campus: Not required but strongly recommended

  Lecture notes / homeworks will be posted prior to class
  Class participation encouraged
  Lecture also provides opportunity to ask questions
  If you have to leave early or arrive late, do not be a distraction by walking in front of the desk
  Will provide access to all lecture videos for reference

Off-campus: Responsible for all material discussed in lecture

Group discussion outside of class available on piazza.com


Class Policies - Grade

Your grade (out of 500 points) will be based on

  Three exams, each worth 20% (100 pts) of your grade
  Homework, worth 25% (125 pts) of your grade
  Group Project, worth 15% (75 pts) of your grade

The general policy is 90% for an A, 80% for a B, etc. Cutoffs may be lowered but they are never raised, and +/− grades are implemented when appropriate.


Class Policies - Exams

There will be three exams during the semester

  Each worth 20% of your grade
  Must notify me at least a week prior to exam if there is a scheduling conflict (prefer you to take it earlier)
  Will need a calculator with square root function
  Open book / open notes
  Strongly encourage constructing a summary sheet
  Old exams will be provided as exam date draws near


Class Policies - Homework

Expect “weekly” homework assignments

  Will be due Tues by 11:59 PM using Gradescope
  Format guidelines in syllabus
  Individual vs group effort
  Worst grade will be dropped
  Represents 25% of your grade
  Answer key will be posted on Web page after due date


Class Policies - Project

Group / Team project

  Teams determined after Week 3 or 4
  Goal is to design an experiment and then analyze resulting data
  I will provide a “real” problem or problems
  Represents 15% of your grade
  Check web site for updates


Communication

Office Hours

  Mon 3:00-4:30
  Fri 3:00-4:30
  By appt.

Email - [email protected]

Class email list and Web page for announcements

Piazza for class discussion

Off-campus : TA office hours TBD


Statistical Software

Lecture Software

  Primarily using SAS for Windows 9.4
  Available on ITaP computers / Go Remote
  Can also get own copy via Community Hub download

Free to use any software for homeworks but you are responsible for your own software support. Some software may not perform all procedures

Exams will include SAS output but there will be no SAS programming questions


Getting Started with SAS

Will provide template programs to be “copied”

SAS handout on Web page

Syntax Help / Examples available

  Click 'Help'
  Click 'SAS Help and Documentation'
  Click 'SAS Products'
  Click 'SAS/STAT'
  Click 'SAS/STAT 9.4 User's Guide'

Software Consulting Service Software Desk (MATH G175)

Search the Web


Overview of Experimental Design

Inputs → Process or System (Black Box) → Response

X → f(X) + ε → y

Process is quantified by considering response variable y

There is variability in response y to selected inputs X due to nuisance or unknown factors and inherent noise

Interest is in understanding this process

How do changes in inputs affect average process response?

What levels of inputs maximize the average response?

What input combination results in the lowest uncertainty?


Traditional Approach

Design of experiments (DOE) combines
  1. Strategies of running an experiment
  2. Statistical tools for decision making

Uses linear models to describe/explain the process

Focuses on the plan of the experiment so that we obtain objective conclusions

  Many DOE courses focus too much on the analysis
  Proper design usually results in straightforward analysis


Features to Consider When Planning

Statement of the problem / Goal of the experiment

What response variable should be used?

What is considered a meaningful change in the response?

What inputs should be studied in the experiment?

Which inputs are considered to be the most important?

Are there nuisance factors?

Are these nuisance factors controllable?

How can we block if they are controllable?

How many observations per combination of inputs?

What are the available resources? Experimental costs?

What is the experimental unit for each input factor?


Design of Experiments

Statement of the problem

  What is the experiment intended to address or answer?
  Obvious question but often overlooked or left too vague

A sound goal goes a long way towards adequate planning

Response(s) to be studied
  How accurately can we measure response? On what scale?
  For an input combination
    What is the expected range of response?
    What is the shape of the response distribution?

Input factors to be studied
  What input factors might affect the response?
  What factors are of interest?
  What specific levels of input factor to consider?
  Can any nuisance input factors be held constant?


Design of Experiments

Number of trials/runs in the experiment
  How large a difference in average response is considered meaningful or important?
  How much variation in response is expected?
  What costs and other resources are available?

Order of the experimental trials/runs
  What is the timing of the experiment? Can it be run sequentially?
  Can all inputs be assigned to like-sized units?
  Or are different input factors handled differently?
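
The questions under "Number of trials/runs" (size of a meaningful difference, expected variation) can be tied together with a rough sample-size calculation. The sketch below is not from the lecture notes. It assumes a comparison of two treatment means with a common standard deviation and uses the standard normal approximation n ≈ 2(z_{1−α/2} + z_{1−β})² σ²/δ² per treatment. The function name and the example inputs (α = 0.05, power 0.80, δ equal to one or one half standard deviation) are illustrative assumptions.

```python
import math
from statistics import NormalDist

def runs_per_treatment(delta, sigma, alpha=0.05, power=0.80):
    """Approximate replicates per treatment for a two-sided comparison of two means.

    delta : smallest difference in average response considered meaningful
    sigma : anticipated standard deviation of the response
    Uses the normal approximation, so it slightly understates the n that a
    t-based calculation would give for small experiments.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(power)            # quantile matching the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Hypothetical inputs: meaningful difference of one and of one half standard deviation.
n_one_sd = runs_per_treatment(delta=1.0, sigma=1.0)
n_half_sd = runs_per_treatment(delta=0.5, sigma=1.0)
print(n_one_sd, n_half_sd)
```

Under these assumptions, halving the meaningful difference roughly quadruples the number of runs needed, which is why the "meaningful difference" question has to be answered before resources are committed.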


What is the experimental unit?

Experimental Unit - Material to which a factor is assigned in an experiment

Probably the most important concept in statistical design

Defines unit to be replicated to increase factor precision

EUs may be different for different input factors

Need to know EUs in order to do proper analysis


What is the sampling unit?

Sampling Unit - Material or object that is measured in an experiment

Can be different from the experimental unit

Increasing sampling units does not impact precision as strongly as increasing replicates

Examples:

  Temperature assigned to fish tank, weight gain of each fish measured
  Process applied to manufacture aluminum, number of impurities in square cm region measured
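
The claim that extra sampling units help less than extra replicates can be made concrete with a small variance calculation. This is a sketch under an assumed nested model that is not spelled out in the notes: each treatment has r experimental units (e.g., tanks) with m sampling units each (e.g., fish), EU-to-EU variance σ_E², and unit-within-EU variance σ_S², all errors independent.

```latex
Y_{ij} = \mu + e_i + s_{ij}, \qquad
e_i \sim (0, \sigma_E^2), \quad s_{ij} \sim (0, \sigma_S^2), \qquad
i = 1, \dots, r, \; j = 1, \dots, m

\operatorname{Var}\left(\bar{Y}_{\cdot\cdot}\right)
  = \operatorname{Var}\left(\frac{1}{r}\sum_{i=1}^{r} e_i\right)
  + \operatorname{Var}\left(\frac{1}{rm}\sum_{i=1}^{r}\sum_{j=1}^{m} s_{ij}\right)
  = \frac{\sigma_E^2}{r} + \frac{\sigma_S^2}{rm}
```

Increasing m shrinks only the second term, so the variance can never drop below σ_E²/r; increasing r shrinks both terms.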


Example with different EUs?

3 corn varieties (A, B, C) and 3 fertilizers (R, R, R) on yield

Nine factor combinations assigned to field in two ways:

Completely randomized    Fertilizer randomized to columns
A B C                    A B C
C B A                    C B A
B C A                    B C A

Possible sources of variability in response

  combination of fertilizer and variety → input factors
  location in field / soil characteristics → nuisance factors
  measurement error in determining yield → nuisance factor
  management of each plot / weather → nuisance factors
  application of fertilizer and/or variety to the “plots” ***


Key Tools in Design of Experiments

Replication - decrease uncertainty / increase precision by averaging out experimental variability

If $Y_1, \dots, Y_n$ are independent with mean $\mu$ and variance $\sigma^2$, then

$E(\bar{Y}) = \mu$ and $\operatorname{Var}(\bar{Y}) = \sigma^2/n$

Blocking - decrease uncertainty by adjusting for (removing the effects of) specific nuisance factors
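
For the replication result, the step from independence to the variance of the sample mean is short enough to write out. This derivation uses only the assumptions stated on the slide (independent observations with common mean and variance):

```latex
\operatorname{Var}(\bar{Y})
  = \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(Y_i)
  = \frac{n\sigma^2}{n^2}
  = \frac{\sigma^2}{n},
\qquad
\operatorname{SD}(\bar{Y}) = \frac{\sigma}{\sqrt{n}}
```

The second equality is where independence is used (no covariance terms). Because the standard deviation falls as 1/√n, quadrupling the number of replicates halves the uncertainty in a treatment mean.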


Key Tools in Design of Experiments

Randomization - provides stronger basis for use of coincidence argument

  Provides protection - averages out unknown factors
  Independence of trials / Avoids biases
  Randomization test ←→ ANOVA F-test

Factorial experiments / Orthogonality - More efficient than one factor at a time analysis and allows investigation of interaction


Randomization

Random allocation of EUs to factor levels (or vice versa)

All possible assignments are equally likely

Why not just try to be fair?

Subjective assignment inevitably avoids some assignments

Can often lead to biased results or misrepresentation of inherent uncertainty
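
A minimal sketch of what "all possible assignments are equally likely" looks like in practice, assuming three treatments with three replicates each and nine experimental units. The function name, the seed, and the 3×3 printout are illustrative choices, not part of the course material (which uses SAS).

```python
import random
from collections import Counter

def randomize(treatments, reps, rng):
    """Return a random allocation: one treatment label per experimental unit."""
    labels = [t for t in treatments for _ in range(reps)]
    rng.shuffle(labels)   # every ordering of the labels is equally likely
    return labels

rng = random.Random(514)  # seed chosen arbitrarily, only for reproducibility
allocation = randomize("ABC", reps=3, rng=rng)

# Print the allocation as a 3x3 field, one row per line.
for row in range(3):
    print(" ".join(allocation[3 * row: 3 * row + 3]))
print(Counter(allocation))
```

Letting the shuffle decide, rather than arranging the labels by eye, is what keeps arrangements that merely look unbalanced in the set of possible outcomes.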


Example

Consider assignment of three treatments to a field of nine regions (i.e., EUs). We assume that location in the field is a hidden/unknown nuisance factor, that there are no treatment effects, and that there is no experimental uncertainty.

-2 -2  0
-2  0  2
 0  2  2

Represent unknown location effects in field

Allocations to consider

  Fair: Randomly assign one rep per column/row
  Random: All assignments equally likely


Possible Randomizations

Fair

Square 1    Square 2
A B C       C B A
C A B       B A C
B C A       A C B

Square 1: A = 0/3 B = 0/3 C = 0/3

Square 2: A = 0/3 B = −2/3 C = 2/3

All Possible

Square 1    Square 2
A B C       B A C
B A B       B A C
C C A       A C B

Square 1: A = 0/3 B = −2/3 C = 2/3

Square 2: A = −2/3 B = −2/3 C = 4/3


Distributions of F Statistic

Assuming there is no difference among trts (H0 true), we want to reject the null hypothesis 100α% of the time.

Fair

  Can show F statistic either 0.000 or 0.375 w/prob 50%
  F(0.05; 2, 6) = 5.14, so we’d reject 0% of time
  Lose test sensitivity, lower power

Random

  Can show distribution of F statistic is

  F-stat  0.000  0.375  1.500  2.400  10.500  ∞
  Prob    .1286  .4179  .1929  .1929  .0643   .0036

  Even with hidden trend, more like F(2, 6) distribution
  Type I error rate is 0.0679
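
The two distributions quoted above can be checked by brute force, since there are only 1680 distinct allocations. The sketch below enumerates them, using the location effects from the example and the critical value 5.14 quoted in the notes. It assumes the nine regions are read row by row as a 3×3 field (the location grid is symmetric, so reading by columns gives the same answer) and that "fair" means each treatment appears once in every row and column. All names are illustrative.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import permutations

LOCATION = [-2, -2, 0, -2, 0, 2, 0, 2, 2]   # hidden location effects, one per region
F_CRIT = 5.14                               # F(0.05; 2, 6), as quoted in the notes

def f_statistic(labels, y):
    """One-way ANOVA F statistic, computed exactly with fractions."""
    groups = {}
    for label, value in zip(labels, y):
        groups.setdefault(label, []).append(Fraction(value))
    n, k = len(y), len(groups)
    grand_mean = sum(Fraction(v) for v in y) / n
    means = {g: sum(vals) / len(vals) for g, vals in groups.items()}
    ss_trt = sum(len(vals) * (means[g] - grand_mean) ** 2 for g, vals in groups.items())
    ss_err = sum((v - means[g]) ** 2 for g, vals in groups.items() for v in vals)
    if ss_err == 0:
        return math.inf   # treatment labels explain all of the variation
    return (ss_trt / (k - 1)) / (ss_err / (n - k))

def is_fair(labels):
    """True if every row and every column of the 3x3 field holds A, B and C once."""
    rows = [labels[3 * i: 3 * i + 3] for i in range(3)]
    cols = [labels[i::3] for i in range(3)]
    return all(set(line) == {"A", "B", "C"} for line in rows + cols)

all_allocations = sorted(set(permutations("AAABBBCCC")))
fair_allocations = [a for a in all_allocations if is_fair(a)]

random_dist = Counter(f_statistic(a, LOCATION) for a in all_allocations)
fair_dist = Counter(f_statistic(a, LOCATION) for a in fair_allocations)

def rejection_rate(dist):
    """Share of allocations whose F statistic exceeds the critical value."""
    total = sum(dist.values())
    return sum(count for f, count in dist.items() if f > F_CRIT) / total

print(len(all_allocations), len(fair_allocations))
for f, count in sorted(random_dist.items()):
    print(float(f), round(count / len(all_allocations), 4))
print(rejection_rate(fair_dist), rejection_rate(random_dist))
```

Exact fractions are used so that the F = ∞ case (zero error sum of squares) is detected without floating-point tolerance.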


Example, continued

We can use simulation to study what would happen if there was experimental noise in addition to these location effects.

Assuming ε ∼ N(µ = 0, σ), the following table summarizes the Type I error for the standard F test using different values of σ (10,000 simulations each).

              σ
Allocation    0.5    1.0    2.0    4.0    8.0
Fair          0.000  0.000  0.013  0.037  0.049
Random        0.065  0.061  0.053  0.050  0.049

Results similar when noise dominates location effects.
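
A simulation of this kind can be sketched as follows. This is not the author's program: it assumes the location effects and critical value from the example, takes "fair" to mean a uniformly chosen layout with each treatment once per row and column, and uses fewer simulations (4,000 per cell) and an arbitrary seed, so the estimates will only roughly track the table.

```python
import random
from itertools import permutations, product

LOCATION = [-2, -2, 0, -2, 0, 2, 0, 2, 2]   # hidden location effects from the example
F_CRIT = 5.14                               # F(0.05; 2, 6), as quoted in the notes

def f_statistic(labels, y):
    """One-way ANOVA F statistic (floating point)."""
    groups = {}
    for label, value in zip(labels, y):
        groups.setdefault(label, []).append(value)
    n, k = len(y), len(groups)
    grand_mean = sum(y) / n
    ss_trt = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values())
    ss_err = sum((x - sum(v) / len(v)) ** 2 for v in groups.values() for x in v)
    return (ss_trt / (k - 1)) / (ss_err / (n - k))

# The 12 "fair" layouts: rows are permutations of ABC, columns must be too.
FAIR = [r1 + r2 + r3
        for r1, r2, r3 in product(permutations("ABC"), repeat=3)
        if all(len({r1[j], r2[j], r3[j]}) == 3 for j in range(3))]

def type_one_error(scheme, sigma, n_sims, rng):
    """Share of simulated null experiments in which the F test rejects."""
    rejections = 0
    for _ in range(n_sims):
        if scheme == "fair":
            labels = rng.choice(FAIR)
        else:
            labels = list("AAABBBCCC")
            rng.shuffle(labels)
        # Response = location effect + noise; there is no treatment effect.
        y = [loc + rng.gauss(0, sigma) for loc in LOCATION]
        rejections += f_statistic(labels, y) > F_CRIT
    return rejections / n_sims

rng = random.Random(514)  # arbitrary seed
results = {(scheme, sigma): type_one_error(scheme, sigma, 4000, rng)
           for scheme in ("fair", "random")
           for sigma in (0.5, 1.0, 2.0, 4.0, 8.0)}
for key, rate in results.items():
    print(key, round(rate, 3))
```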


Randomization/Permutation Test

H0: No treatment effects → H0: Response would be the same regardless of what treatment was assigned to it

Randomization    Responses
A B C            5 4 2
C B A            6 3 8
B C A            1 2 9

A = 22/3 B = 8/3 C = 10/3

How unlikely is this result based solely on chance assignment?

$\frac{9!}{3!\,3!\,3!} = 1680$ equally likely orderings

Compute F statistic for each ordering (i.e., generate the reference distribution)

Compare observed result with reference distribution to get P-value

No assumed distribution... Uses observed data to construct reference distribution

Often reference distribution closely represented by usual F distribution
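
The steps above can be sketched in a few lines, using the allocation and the nine responses from this example. The code enumerates all 1680 orderings rather than sampling them. Function and variable names are illustrative, and the P-value is defined here as the share of orderings whose F statistic is at least as large as the observed one.

```python
import math
from fractions import Fraction
from itertools import permutations

observed_labels = tuple("ABCCBABCA")      # randomization from the example
responses = [5, 4, 2, 6, 3, 8, 1, 2, 9]   # responses, same unit order

def f_statistic(labels, y):
    """One-way ANOVA F statistic, computed exactly with fractions."""
    groups = {}
    for label, value in zip(labels, y):
        groups.setdefault(label, []).append(Fraction(value))
    n, k = len(y), len(groups)
    grand_mean = sum(Fraction(v) for v in y) / n
    means = {g: sum(vals) / len(vals) for g, vals in groups.items()}
    ss_trt = sum(len(vals) * (means[g] - grand_mean) ** 2 for g, vals in groups.items())
    ss_err = sum((v - means[g]) ** 2 for g, vals in groups.items() for v in vals)
    if ss_err == 0:
        return math.inf
    return (ss_trt / (k - 1)) / (ss_err / (n - k))

f_obs = f_statistic(observed_labels, responses)

# Reference distribution: F for every way of labelling the nine units.
# Under H0 the responses stay fixed and only the labels change.
reference = [f_statistic(a, responses) for a in set(permutations("AAABBBCCC"))]

p_value = sum(f >= f_obs for f in reference) / len(reference)
print(float(f_obs), len(reference), p_value)
```

With larger experiments full enumeration becomes infeasible, and the reference distribution is instead approximated from a random sample of allocations.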


Permutation test vs F test

Box 1    Box 2
A B C    B B C
C B A    A C A
B C A    B A C

Here are two possible allocations. Let’s compare P-values using the ANOVA F test and the permutation test

Analysis of Variance
  Box 1: P(F(2, 6) > 1.59) = 0.28
  Box 2: P(F(2, 6) > 0.11) = 0.90

Permutation test using 10,000 simulated allocations
  Box 1: Pr(F ≥ 1.59) = 0.27
  Box 2: Pr(F ≥ 0.11) = 0.87


Reference Dist vs F Dist

[Figure: reference distribution compared with the F distribution; horizontal axis F from 0 to 20, vertical axis from 0.0 to 0.6]


Summary

Don’t forget non-statistical knowledge: Statistical techniques are most effective when combined with problem-specific knowledge. Ask questions to discover as much about the problem as possible.

Keep the design simple: Can often answer questions with a sound, straightforward approach. Complex designs are more sensitive to problems.

Keep the analysis simple: Newer computer-intensive statistical methods do not overcome a poorly designed experiment.

Practical vs statistical significance: Need to initially consider what is an “important” difference. Helps determine appropriate sample size. A statistical difference may not be anything of scientific value.


Summary

Experiments often iterative: Often little knowledge of problem and variability a priori. Pilot studies can be done to obtain information and/or used to ensure experiment can be run as planned. Additional experiments may focus on new levels of important factors or include a new factor.

Randomization: Provides justification for usual F test analysis. Helps avoid unintentional subjective biases in assignments.


Examples of Experiments

For each experiment, determine the response, treatment factors, and experimental units. Also describe differences in how the experiment was randomized.

Exp 1. To study the effects of pesticides on birds, an experimenter randomly allocated sixty-five chicks to five diets (a control and four with a different pesticide included). After a month each chick’s calcium content (mg) in one cm length of bone was measured.


Exp 2. A psychologist is interested in studying the IQs of 1st grade children from the low income areas of several major cities. Six grade schools were randomly chosen (from the low income areas) and from each of these schools, five 1st grade children were randomly chosen and had their IQs measured.

Exp 3. Brewer’s malt is produced from germinating barley. The following is an experiment to determine the best conditions to germinate the barley. A total of thirty lots of barley seeds (100 seeds per lot) were equally and randomly assigned to ten germination conditions. The conditions are combinations of the week after harvest (1, 3, 6, 9, or 12 weeks) and the amount of water used in the process (4 ml or 8 ml).


Exp 4. Winter treatments to clear ice and snow can damage roads. An experiment was conducted comparing four treatments, each consisting of different combinations of salt and sand. Because traffic level also damages the roads, four roads were selected for the study and each treatment was randomly assigned to a portion of each road.

Exp 5. A researcher is interested in assessing a new fitness regimen. Thirty subjects were randomly selected to participate with fifteen each assigned to the control or treatment group. Prior to the regimen, a pre-test of the subject’s fitness was performed. Blood measurements were taken 1, 5, 10, 30, and 60 minutes into this fitness test. After the six week treatment program, a similar post-test of fitness was performed with blood measurements again taken at the same five time points.


Observational Study vs Designed Experiment

Designed experiment: Start with set of EUs to which YOU assign treatment factors

Observational study: Start with several populations of EUs (conditions already built in) and you randomly sample from populations

An experiment compares treatments while an observational study compares populations

“It is much easier to isolate the effects of interest if you can assign conditions. In an observational study, the conditions you want to study will almost never be the only thing that makes one population different from another. This makes it hard to identify the effects responsible for observed differences”


Observational Study vs Designed Experiment

Experiments allow us to set up a direct comparison between treatments, minimize any bias in the comparison, and control the error in the comparison

We are in control of experiments, and having that control allows us to make stronger inferences about the nature of differences that we see in the experimental observations. Specifically, we may make inferences about causation.


Confounding

Two influences on a response are confounded if the design makes it impossible to isolate the effects of one from the other. Proper design of an experiment can prevent this.

Example 1: To assess how well a car going 50 MPH can stop on wet and dry pavement, an experiment was done where there were 10 trials on dry pavement using a Mercedes and 10 trials on wet pavement using a minivan.

Example 2: To assess how well a minivan going 50 MPH can stop on wet and dry pavement, an experiment was done where the first 10 trials were done on dry pavement and the second 10 trials on wet pavement.


Selection Bias

Selection bias occurs in observational studies when the process of selecting from the populations to be compared confounds the effects of interest with other effects.

Versions of this bias occur in designed experiments when proper randomization is not achieved and thus the sample of EUs is not a representative sample from the population. Possible reasons for this include

  Volunteer effect
  Noncompliance
  Attrition / Response bias


Montgomery’s Theorems

1. If something can go wrong in conducting an experiment, it will.

2. The probability of successfully completing an experiment is inversely proportional to the number of runs.

3. Never let one person design and conduct an experiment alone, particularly if that person is a subject-matter expert in the field of study.

4. All experiments are designed experiments; some of them are designed well, and some of them are designed really badly. The badly designed ones often tell you nothing.

5. About 80 percent of your success in conducting a designed experiment results directly from how well you do the pre-experimental planning.

6. It is impossible to overestimate the logistical complexities associated with running an experiment in a “complex” setting, such as a factory or plant.


Background Reading

Overview of DOE: Montgomery Sections 1.1, 1.2, and 1.4

Tools of DOE: Montgomery Section 1.3
