AP Statistics Lesson 4 – 2 (Day 1): Cautions About Correlation and Regression


Upload: charlene-richards

Post on 13-Dec-2015


TRANSCRIPT

Page 1: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

AP STATISTICS

LESSON 4 – 2 ( DAY 1 )

Cautions About Correlation and Regression

Page 2: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

ESSENTIAL QUESTION: What is causation and how can it be determined?

OBJECTIVES:

• To examine data for evidence of causation.

• To be careful when using regression models for extrapolation.

• To understand that causation may be hard to determine due to the effects of lurking variables, common response, and confounding.

Page 3: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Correlation and Regression

• Correlation and regression describe only linear relationships.

• The correlation r and the least-squares regression line are not resistant. One influential or incorrectly entered data point can greatly change these measures.

• Always plot data before interpreting regression or correlation.
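The first two cautions can be checked numerically. The sketch below uses small made-up data sets (not from the lesson) and a hand-written `correlation` helper: a perfect curved pattern gives r = 0, and a single mistyped value pulls r for an otherwise perfectly linear data set from 1 down to roughly 0.25.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Illustrative data: a perfect but curved relationship, y = x^2.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [x ** 2 for x in x_curve]
r_curve = correlation(x_curve, y_curve)

# Illustrative data: a perfectly linear relationship, y = 2x.
x_line = [1, 2, 3, 4, 5, 6]
y_line = [2, 4, 6, 8, 10, 12]
r_clean = correlation(x_line, y_line)

# The same data with the last value mistyped (12 entered as 1.2).
y_typo = [2, 4, 6, 8, 10, 1.2]
r_typo = correlation(x_line, y_typo)

print("curved pattern:       r =", round(r_curve, 3))
print("linear, clean data:   r =", round(r_clean, 3))
print("linear, one bad entry: r =", round(r_typo, 3))
```

A scatterplot of either data set would reveal the curve or the stray point immediately, which is why plotting comes first.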

Page 4: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Extrapolation

Extrapolation is the use of a regression line for prediction far outside the domain of values of the explanatory variable x that you used to obtain the line or curve. Such predictions are often not accurate.
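As a rough illustration, the sketch below fits a least-squares line to hypothetical heights of a child at ages 3 to 8 (the numbers are invented for this example) and then uses the line at age 30. The prediction inside the observed ages is reasonable; the extrapolated one is well over 2.5 meters.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: age in years and height in centimeters.
ages = [3, 4, 5, 6, 7, 8]
heights = [95, 102, 109, 115, 121, 127]

slope, intercept = least_squares(ages, heights)

# Prediction inside the range of ages used to fit the line.
predicted_at_7_5 = intercept + slope * 7.5

# Extrapolation far outside that range.
predicted_at_30 = intercept + slope * 30

print("growth per year (cm):", round(slope, 2))
print("predicted height at 7.5 years (cm):", round(predicted_at_7_5, 1))
print("predicted height at 30 years (cm):", round(predicted_at_30, 1))
```

The line describes growth only over the ages that were observed; it has no way of knowing that growth stops.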

Page 5: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Page 226

Example 4.10

DISCRIMINATION IN MEDICAL TREATMENT

Page 6: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Lurking Variables

A lurking variable is a variable that is not among the explanatory or response variables in a study and yet may influence the interpretation of relationships among those variables.

Page 7: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Lurking Variables

• The relationship between two variables can be strongly influenced by lurking variables.

• Many lurking variables change systematically over time.

• One way to detect whether time has an influence is to plot the residuals and the response variable against time, if the time order of the observations is available.
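A minimal sketch of the residuals-over-time check, using invented data: the response depends on x and also on a lurking variable that drifts upward with time. The regression of y on x looks fine by itself, but listing the residuals in time order shows a steady climb. The drift is deliberately noise-free here so the pattern is obvious; real residuals would be messier.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical observations recorded in time order 0, 1, ..., 7.
times = list(range(8))
x = [3, 5, 8, 6, 6, 8, 5, 3]

# Assumed model for the example: y = 10 + 2x + (a drift of 1.5 per time step).
y = [10 + 2 * xi + 1.5 * t for xi, t in zip(x, times)]

slope, intercept = least_squares(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# In place of a plot, print the residuals in time order.
for t, res in zip(times, residuals):
    print("time", t, "residual", round(res, 2))
```

Residuals that move from negative to positive as time passes suggest that something changing over time, not included in the model, is affecting the response.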

Page 8: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Using averaged data

• Many regression or correlation studies work with averages or other measures that combine information from many individuals. Note this carefully and resist the temptation to apply the results of such studies to individuals.

• Correlations based on averages are usually too high when applied to individuals. This is another reminder that it is important to note exactly what variables were measured in a statistical study.
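The effect of averaging can be seen in a small constructed example (the data are invented, with five groups of three individuals each). The group averages fall exactly on a line, so their correlation is 1, while the correlation among the fifteen individuals is only 0.5.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical groups: group g is centered at (g, 2g), and its three
# members scatter around that center.
offsets = [(-1, 2), (0, 0), (1, -2)]
groups = {
    g: [(g + dx, 2 * g + dy) for dx, dy in offsets]
    for g in range(1, 6)
}

# Correlation computed from all individuals.
ind_x = [x for members in groups.values() for x, _ in members]
ind_y = [y for members in groups.values() for _, y in members]
r_individuals = correlation(ind_x, ind_y)

# Correlation computed from the group averages only.
mean_x = [sum(x for x, _ in m) / len(m) for m in groups.values()]
mean_y = [sum(y for _, y in m) / len(m) for m in groups.values()]
r_averages = correlation(mean_x, mean_y)

print("r for individuals:    ", round(r_individuals, 3))
print("r for group averages: ", round(r_averages, 3))
```

Averaging removes the variation among individuals within each group, so the averages look far more tightly related than the individuals are.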

Page 9: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

The question of causation

In many studies of the relationship between two variables, the goal is to establish that changes in the explanatory variable cause changes in the response variable.

Even when a strong association is present, the conclusion that this association is due to a causal link between the variables is often elusive.

Page 10: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Explaining association: causation

Variables x and y show a strong association (dashed line). This association may be the result of any of several causal relationships (solid arrows).

Page 11: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Causal associations

• Causation: Changes in x cause changes in y.

• Common response: Changes in both x and y are caused by changes in a lurking variable z.

• Confounding: The effect (if any) of x on y is confounded with the effect of a lurking variable z.

• Even when direct causation is present, it is rarely a complete explanation of an association between two variables.

Page 12: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Explaining association: Common Response

• Beware of lurking variables when thinking about an association between two variables.

• The observed association between the variables x and y is explained by a lurking variable z. Both x and y change in response to changes in z. This common response creates an association even though there may be no direct causal link between x and y.
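Common response can be imitated with invented numbers. In the sketch below, x and y are each built from z plus a small unrelated wiggle, and neither is built from the other. Their correlation is still about 0.97. Once the part of each variable explained by z is removed (using residuals from regressions on z), the remaining correlation is essentially zero. The names and the construction are assumptions made for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def residuals_after(zs, vs):
    """Residuals of vs after a least-squares regression on zs."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_v = sum(vs) / n
    slope = (sum((z - mean_z) * (v - mean_v) for z, v in zip(zs, vs))
             / sum((z - mean_z) ** 2 for z in zs))
    intercept = mean_v - slope * mean_z
    return [v - (intercept + slope * z) for z, v in zip(zs, vs)]

# Hypothetical lurking variable z and two wiggles unrelated to z and to each other.
z = [1, 2, 3, 4, 5, 6, 7, 8]
wiggle_x = [1, -1, -1, 1, 1, -1, -1, 1]
wiggle_y = [1, 1, -1, -1, -1, -1, 1, 1]

# x and y each respond to z; neither depends on the other.
x = [3 * zi + e for zi, e in zip(z, wiggle_x)]
y = [2 * zi + f for zi, f in zip(z, wiggle_y)]

r_xy = correlation(x, y)
r_after_removing_z = correlation(residuals_after(z, x), residuals_after(z, y))

print("r between x and y:            ", round(r_xy, 3))
print("r after accounting for z:     ", round(r_after_removing_z, 3))
```

In practice the lurking variable is often unmeasured, so this kind of adjustment is not available and the association between x and y can easily be mistaken for a causal link.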

Page 13: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

Confounding

• Two variables are confounded when their effects on a response variable cannot be distinguished from each other. The confounded variables may be either explanatory variables or lurking variables.

• Even a very strong association between two variables is not by itself good evidence that there is a cause-and-effect link between the variables.

Page 14: AP STATISTICS LESSON 4 – 2 ( DAY 1 ) Cautions About Correlation and Regression

What are the criteria for establishing causation when we can’t do an experiment?

• The association is strong.

• The association is consistent.

• Higher doses are associated with stronger responses.

• The alleged cause precedes the effect in time.

• The alleged cause is plausible.