bivariate data as 90425 (3 credits) complete a statistical investigation involving bi-variate data

Post on 14-Jan-2016

217 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Bivariate Data

AS 90425 (3 credits)

Complete a statistical investigation involving bi-variate data

Scatter Plots

The scatter plot is the basic tool used to investigate relationships between two quantitative variables.

Types of Variables

Quantitative(measurements/

counts)

Qualitative(groups)

What do I see in these scatter plots?

There appears to be a linear trend.

There appears to be moderate constant scatter about the trend line.

Negative Association.

No outliers or groupings visible.

454035

20

19

18

17

16

15

14

Latitude (°S)   

Mean January Air Temperatures for 30 New Zealand Locations

Tem

pera

ture

(°C

)

What do I see in these scatter plots?

There appears to be a non-linear trend.

There appears to be non-constant scatter about the trend line.

Positive Association. One possible outlier

(Large GDP, low % Internet Users).

0 10 20 30 40

GDP per capita (thousands of dollars)

0

10

20

30

40

50

60

70

80

Inte

rnet

Use

rs (

%)

% of population who are Internet Users vs GDP per capita for 202 Countries

What do I see in these scatter plots?

Two non-linear trends (Male and Female).

Very little scatter about the trend lines

Negative association until about 1970, then a positive association.

Gap in the data collection (Second World War).Year

1990198019701960195019401930

30

28

26

24

22

20

Ag

e

Average Age New Zealanders are First Married

What do I look for in scatter plots?

Trend Do you see

a linear trend… straight line

OR

a non-linear trend?

What do I look for in scatter plots?

Trend Do you see

a linear trend… straight line

OR

a non-linear trend?

What do I look for in scatter plots?

Trend Do you see

a positive association… as one variable gets bigger, so does the other

OR a negative association?

as one variable gets bigger, the other gets smaller

What do I look for in scatter plots?

Trend Do you see

a positive association… as one variable gets bigger, so does the other

OR a negative association?

as one variable gets bigger, the other gets smaller

What do I look for in scatter plots?

Scatter Do you see

a strong relationship… little scatter

OR

a weak relationship? lots of scatter

What do I look for in scatter plots?

Scatter Do you see

a strong relationship… little scatter

OR

a weak relationship? lots of scatter

What do I look for in scatter plots?

Scatter Do you see

constant scatter… roughly the same amount of scatter as you look across the plot

or non-constant scatter?

the scatter looks like a “fan” or “funnel”

What do I look for in scatter plots?

Scatter Do you see

constant scatter… roughly the same amount of scatter as you look across the plot

or non-constant scatter?

the scatter looks like a “fan” or “funnel”

What do I look for in scatter plots?

Anything unusual Do you see

any outliers? unusually far from the trend

any groupings?

What do I look for in scatter plots?

Anything unusual Do you see

any outliers? unusually far from the trend

any groupings?

Rank these relationships from weakest (1) to strongest (4):

Rank these relationships from weakest (1) to strongest (4):

2

1

4

3

Rank these relationships from weakest (1) to strongest (4):

How did you make your decisions? The less scatter there is about the

trend line, the stronger the relationship is.

Describing scatterplots

Trend – linear or non linear Trend – positive or negative Scatter – strong or weak Scatter – constant or non constant Anything unusual – outliers or

groupings

Correlation

Correlation measures the strength of the linear association between two quantitative variables

Get the correlation coefficient (r) from your calculator or computer

r has a value between -1 and +1: Correlation has no units

Calculating the coefficient of correlation (r)

The coefficient of correlation (r) is given by the following formula:

2 2

( )( )

( ) ( )

x x y yr

x x y y

Fortunately you will have Excel do all the work for you!

What can go wrong?

Use correlation only if you have two quantitative variables (variables that can be measured)

There is an association between gender and weight but there isn’t a correlation between gender and weight! Gender is not quantitative

Use correlation only if the relationship is linear

Beware of outliers!

Deceptive situations

r is a measure of how one variable varies in a linear relation to the other

An obvious pattern does not always indicate a high value of the coefficient of correlation (r)

A horizontal or vertical trend indicates that there is no relationship, and so r is (close to) zero

Always plot the data before looking at the correlation!

r = 0 No linear relationship, but

there is a relationship!

r = 0.9 No linear relationship, but

there is a relationship!

2007

For example: Rata's bean sprouts: r = 0

Minutes spent on measuring the plants10

Height of sprout

mm

10

20

30

2007

For example: Jenny's bean sprouts: r = 0

Soil depth (cm)2 4 6

Height of sprout

mm

10

20

Tick the plots where it would be OK to use a correlation coefficient to describe the strength of the relationship:

9876543210

4000300020001000

0

Position Number

Dis

tan

ce (

mill

ion

mile

s)

Distances of Planets from the SunReaction Times (seconds)

for 30 Year 10 Students

0

0.2

0.4

0.6

0.8

0 0.2 0.4 0.6 0.8 1

Non-dominant Hand

Dom

inan

t H

an

d

454035

20

19

18

17

16

15

14

Latitude (°S)   

Mean January Air Temperatures for 30 New Zealand Locations

Tem

pera

ture

(°C

)

Female ($)

Average Weekly Income for Employed New Zealanders in 2001

Male

($

)

0

200

400

600

800

1000

1200

0 200

400

600

800

Tick the plots where it would be OK to use a correlation coefficient to describe the strength of the relationship:

9876543210

4000300020001000

0

Position Number

Dis

tan

ce (

mill

ion

mile

s)

Distances of Planets from the SunReaction Times (seconds)

for 30 Year 10 Students

0

0.2

0.4

0.6

0.8

0 0.2 0.4 0.6 0.8 1

Non-dominant Hand

Dom

inan

t H

an

d

454035

20

19

18

17

16

15

14

Latitude (°S)   

Mean January Air Temperatures for 30 New Zealand Locations

Tem

pera

ture

(°C

)

Female ($)

Average Weekly Income for Employed New Zealanders in 2001

Male

($

)

0

200

400

600

800

1000

1200

0 200

400

600

800

Not linear

Remove two outliers, nothing happening

What do I see in this scatter plot?

Appears to be a linear trend, with a possible outlier (tall person with a small foot size.)

Appears to be constant scatter.

Positive association.

22 23 24 25 26 27 28 29

150

160

170

180

190

200

Foot size (cm)

Heig

ht

(cm

)

Height and Foot Size for 30 Year 10 Students

What will happen to the correlation coefficient if the tallest Year 10 student is removed?

It will get smaller

It won’t change

It will get bigger22 23 24 25 26 27 28 29

150

160

170

180

190

200

Foot size (cm)

Heig

ht

(cm

)

Height and Foot Size for 30 Year 10 Students

What will happen to the correlation coefficient if the tallest Year 10 student is removed?

It will get bigger22 23 24 25 26 27 28 29

150

160

170

180

190

200

Foot size (cm)

Heig

ht

(cm

)

Height and Foot Size for 30 Year 10 Students

What do I see in this scatter plot?

Appears to be a strong linear trend.

Outlier in X (the elephant).

Appears to be constant scatter.

Positive association.

6005004003002001000

40

30

20

10

Gestation (Days)

Life

Exp

ect

an

cy (

Years

)

Life Expectancies and Gestation Period for a sample of non-human Mammals

Elephant

What will happen to the correlation

coefficient if the elephant is removed?

It will get smaller

It won’t change

It will get bigger6005004003002001000

40

30

20

10

Gestation (Days)

Life

Exp

ect

an

cy (

Years

)

Life Expectancies and Gestation Period for a sample of non-human Mammals

Elephant

What will happen to the correlation

coefficient if the elephant is removed?

It will get smaller

6005004003002001000

40

30

20

10

Gestation (Days)

Life

Exp

ect

an

cy (

Years

)

Life Expectancies and Gestation Period for a sample of non-human Mammals

Elephant

top related