Psyc 235: Introduction to Statistics Lecture Format New Content/Conceptual Info Questions & Work through problems

Upload: sarah-chase

Post on 01-Jan-2016

TRANSCRIPT

Page 1: Psyc 235: Introduction to Statistics Lecture Format New Content/Conceptual Info Questions & Work through problems

Psyc 235: Introduction to Statistics

Lecture Format
• New Content/Conceptual Info
• Questions & Work through problems

Page 2

What you should have accomplished so far…

• ALEKS account set up
• Completed first assessment
• Worked through first section of material
• Spent 5+ hours on ALEKS
• Watched the video “What is statistics?”

Any questions/problems so far?

Page 3

From Last week:

• Definition of Statistics…

C Collecting …

O Organizing …

D Displaying …

I Interpreting …

A Analyzing …

Data

Page 4

What is Data?

• Data is the generic term for numerical information that has been obtained on a set of objects/individuals etc.

• Variable: Some characteristic of the objects/individuals (e.g., height)

• Data: the values of a variable for a certain set of objects/individuals

Page 5

Two branches of statistics:

Descriptive Statistics
Describes a given set of data you have.

Inferential Statistics
Given the data you have about these people, does this say anything about other people?

Page 6

Today: Descriptive Statistics

• Graphical Presentations of Distributions
    Histograms
    Frequency Polygons
    Cumulative Distributions
    Box-and-whisker plots

• Descriptive Measures of Data
    Measures of Central Tendency
    Measures of Dispersion

Page 7

Organizing Data

• Data from last week
• Frequency Table

Time Awake     Number of Students
6:30-7:00      1
7:00-7:30      1
7:30-8:00      3
8:00-8:30      2
8:30-9:00      4
9:00-9:30      5
9:30-10:00     7
10:00-10:30    4
10:30-11:00    3

6:55
7, 7:30, 7:30, 7:45
8, 8:25, 8:30, 8:45, 8:45, 8:50
9, 9, 9, 9:15, 9:25, 9:30, 9:30, 9:30, 9:30, 9:30, 9:45, 9:45
10, 10, 10:15, 10:25, 10:30, 10:45, 10:50
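As a rough sketch, the frequency table could be rebuilt from the listed wake-up times as follows. The helper names `to_minutes` and `bin_label` are made up for illustration, and the rule that each half-hour bin includes its lower edge (so 7:30 counts toward 7:30-8:00) is an assumption that happens to reproduce the counts in the table.

```python
from collections import Counter

# Wake-up times from the slide, written as "H:MM" strings.
WAKE_TIMES = (
    "6:55 7:00 7:30 7:30 7:45 8:00 8:25 8:30 8:45 8:45 8:50 "
    "9:00 9:00 9:00 9:15 9:25 9:30 9:30 9:30 9:30 9:30 9:45 9:45 "
    "10:00 10:00 10:15 10:25 10:30 10:45 10:50"
).split()


def to_minutes(clock):
    """Convert an "H:MM" string to minutes after midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


def bin_label(start):
    """Label a half-hour bin by its start in minutes, e.g. 390 -> "6:30-7:00"."""
    end = start + 30
    return f"{start // 60}:{start % 60:02d}-{end // 60}:{end % 60:02d}"


# Assumption: a bin includes its lower edge and excludes its upper edge.
counts = Counter((to_minutes(t) // 30) * 30 for t in WAKE_TIMES)
frequency_table = [(bin_label(start), counts[start]) for start in sorted(counts)]

for label, count in frequency_table:
    print(f"{label:<12}{count}")
```

Working in minutes after midnight keeps the binning to integer arithmetic, which avoids any rounding questions at the bin edges.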

Page 8

Histograms

Note: Use a histogram to spot patterns in the data (skew, etc.).

[Figure: histogram. x-axis: Wake-Up Time (half-hour bins from 6:30-7:00 to 10:30-11:00); y-axis: Number of Students (0 to 8)]

Page 9

Frequency Polygon

Time Awake     Number of Students   Frequency (proportion)
6:30-7:00      1                    0.0333
7:00-7:30      1                    0.0333
7:30-8:00      3                    0.1
8:00-8:30      2                    0.0667
8:30-9:00      4                    0.1333
9:00-9:30      5                    0.1667
9:30-10:00     7                    0.2333
10:00-10:30    4                    0.1333
10:30-11:00    3                    0.1
Total          30                   1

[Figure: frequency polygon. x-axis: Time Awake (half-hour bins from 6:30-7:00 to 10:30-11:00); y-axis: Proportion of Students (0 to 0.25)]

Page 10

Cumulative Frequency

[Figure: cumulative frequency plot. x-axis: Time Awake (half-hour bins from 6:30-7:00 to 10:30-11:00); y-axis: cumulative proportion (0.0000 to 1.2000)]

Time Awake     Number of Students   Frequency   Cumulative
6:30-7:00      1                    0.0333      0.0333
7:00-7:30      1                    0.0333      0.0667
7:30-8:00      3                    0.1000      0.1667
8:00-8:30      2                    0.0667      0.2333
8:30-9:00      4                    0.1333      0.3667
9:00-9:30      5                    0.1667      0.5333
9:30-10:00     7                    0.2333      0.7667
10:00-10:30    4                    0.1333      0.9000
10:30-11:00    3                    0.1000      1.0000
Total          30                   1
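The Frequency and Cumulative columns can be derived from the counts alone. The following is a minimal sketch, with variable names chosen for illustration: each relative frequency is a count divided by the total, and each cumulative value is the running count divided by the total.

```python
from itertools import accumulate

# Bin counts copied from the frequency table (6:30-7:00 through 10:30-11:00).
counts = [1, 1, 3, 2, 4, 5, 7, 4, 3]
total = sum(counts)

# Relative frequency: the share of all students that falls in each bin.
relative = [c / total for c in counts]

# Cumulative relative frequency: running count divided by the total.
# Accumulating the integer counts first avoids float drift in the running sum.
cumulative = [running / total for running in accumulate(counts)]

for rel, cum in zip(relative, cumulative):
    print(f"{rel:.4f}  {cum:.4f}")
```

The last cumulative value is always 1, which is a quick check that no observation was dropped.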

Page 11

Box and Whisker Plots

• Graphical representation of the quartiles (i.e., the data is split into 4 equally sized groups)

• If there are an even number of observations, let the “top” be the top half, and let the “bottom” be the bottom half.

• If there are an odd number of observations, let the “top” be everything above the median and the “bottom” be everything below the median.

• The first quartile is the “median of the bottom”. The third quartile is the “median of the top”.

Page 12

Box-and-Whisker Example

6:55
7, 7:30, 7:30, 7:45
8, 8:25, 8:30, 8:45, 8:45, 8:50
9, 9, 9, 9:15, 9:25, 9:30, 9:30, 9:30, 9:30, 9:30, 9:45, 9:45
10, 10, 10:15, 10:25, 10:30, 10:45, 10:50

Median: 9:20
1st Quartile: 8:30
3rd Quartile: 9:45

Again, note the information you can obtain by looking at this graphical representation of the data.
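The quartile values for the example can be checked with a short sketch of the "median of the bottom" and "median of the top" rule. The function names are illustrative, and the times are converted to minutes after midnight for convenience. Other quartile conventions exist and can give slightly different answers.

```python
from statistics import median

# Wake-up times from the slide, in minutes after midnight, already sorted.
times = [415, 420, 450, 450, 465, 480, 505, 510, 525, 525, 530,
         540, 540, 540, 555, 565, 570, 570, 570, 570, 570, 585, 585,
         600, 600, 615, 625, 630, 645, 650]


def quartiles(sorted_values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    n = len(sorted_values)
    half = n // 2
    bottom = sorted_values[:half]
    # For odd n, skip the middle observation so "top" is everything above it.
    top = sorted_values[half + 1:] if n % 2 else sorted_values[half:]
    return median(bottom), median(sorted_values), median(top)


def as_clock(minutes):
    """Format minutes after midnight as H:MM (rounded to the nearest minute)."""
    minutes = round(minutes)
    return f"{minutes // 60}:{minutes % 60:02d}"


q1, med, q3 = quartiles(times)
print(as_clock(q1), as_clock(med), as_clock(q3))
```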

Page 13

Graphical Presentations of Data

• Listed Data: All data available

• Frequency Table: Data frequency for each cell is available

• Histograms: Data frequency for each bin is available

• Polygons: Data frequency for each bin is available

• Box-and-whisker plots: Summary info and data range available

• Often: Just summarize key features of the distribution.

(Moving down this list, less and less information is available.)

Page 14

Describing Distributions

Summary Measures

• Measures of Central Tendency: “Average”, “Location”, “Center” of the distribution.

• Measures of Dispersion: “Spread”, “Variability” of the distribution.

Page 15

Measures of Central Tendency

• Mean
• Median
• Mode

• You may already be familiar with these concepts, but I want you to think of them in relation to describing data.

Page 16

Mode

• Most frequent observation or observation class

• There can be several distinct modes

• “Best guess” in single shot guessing game

[Figure: bar chart of four categories A, B, C, D; the bar labels read 12, 35, 5, 19]

Page 17

Mode (example data)

6:55
7, 7:30, 7:30, 7:45
8, 8:25, 8:30, 8:45, 8:45, 8:50
9, 9, 9, 9:15, 9:25, 9:30, 9:30, 9:30, 9:30, 9:30, 9:45, 9:45
10, 10, 10:15, 10:25, 10:30, 10:45, 10:50

Mode? 9:30
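One way to find the mode of the example data is to count how often each value occurs and keep every value that reaches the highest count. This is only an illustrative sketch. It returns a list because, as noted earlier, a data set can have several distinct modes.

```python
from collections import Counter

# Wake-up times from the slide ("H:MM" strings; on-the-hour values as H:00).
WAKE_TIMES = (
    "6:55 7:00 7:30 7:30 7:45 8:00 8:25 8:30 8:45 8:45 8:50 "
    "9:00 9:00 9:00 9:15 9:25 9:30 9:30 9:30 9:30 9:30 9:45 9:45 "
    "10:00 10:00 10:15 10:25 10:30 10:45 10:50"
).split()

counts = Counter(WAKE_TIMES)
highest = max(counts.values())

# Keep every value that reaches the highest count (there may be ties).
modes = [value for value, count in counts.items() if count == highest]
print(modes, highest)
```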

Page 18

Median

• Any value M for which at least 50% of all observations are at or above M and at least 50% are at or below M.

• Resistant measure of central tendency (not heavily influenced by extreme values)

Page 19

Calculating the Median

Order all observations from smallest to largest.

If the number of observations is odd, it is the “middle” object, namely the [(n+1)/2]th observation. For n = 61, it is the 31st observation.

If the number of observations is even then, to get a unique value, take the average of the (n/2)th and the (n/2 + 1)th observation. For n = 60, it is the average of the 30th and the 31st observation.
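The odd/even rule translates almost directly into code. The sketch below uses the hypothetical function name `median_by_rule`. The only subtlety is that the slide counts observations from 1 while Python lists are indexed from 0.

```python
def median_by_rule(values):
    """Median computed with the odd/even rule described above."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the [(n+1)/2]th observation; subtract 1 for 0-based indexing.
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n/2)th and (n/2 + 1)th observations.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


# With the values 1..61 the median is the 31st observation;
# with 1..60 it is the average of the 30th and 31st.
odd_case = median_by_rule(range(1, 62))
even_case = median_by_rule(range(1, 61))
print(odd_case, even_case)
```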

Page 20

Median (example data)

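6:55
7, 7:30, 7:30, 7:45
8, 8:25, 8:30, 8:45, 8:45, 8:50
9, 9, 9, 9:15, 9:25, 9:30, 9:30, 9:30, 9:30, 9:30, 9:45, 9:45
10, 10, 10:15, 10:25, 10:30, 10:45, 10:50

Since there are an even number of data points (n = 30), take the average of the middle two values, the 15th and 16th observations:

(9:15 + 9:25) / 2 = 9:20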

Page 21

Mean

• Sum up all observations (say, n many) and divide the total by n.

• Extreme values strongly influence the mean

• The mean is the balance point of the values in a distribution (its center of gravity)

Page 22

Calculating the mean

• Suppose that we collect n many observations

• Let \(X_1, X_2, X_3, \dots, X_n\) denote the individual observations.

Mean
• Sum up all observations (say, n many) and divide the total by n.

\[ \text{Mean} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{1}{n}\left(X_1 + X_2 + \dots + X_n\right) \]

Page 23

Mathematical Notation

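\[ \text{Mean} = \frac{X_1 + X_2 + \dots + X_n}{n} = \frac{1}{n}\left(X_1 + X_2 + \dots + X_n\right) = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{\sum_{i=1}^{n} X_i}{n} \]

\[ \sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n = \sum X_i \]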

Page 24

Mean (example data)

Time     Decimal hours
6:55     6.92
7        7
7:30     7.5
7:30     7.5
7:45     7.75
8        8
8:25     8.42
8:30     8.5
8:45     8.75
8:45     8.75
8:50     8.83
9        9
9        9
9        9
9:15     9.25
9:25     9.42
9:30     9.5
9:30     9.5
9:30     9.5
9:30     9.5
9:30     9.5
9:45     9.75
9:45     9.75
10       10
10       10
10:15    10.25
10:25    10.42
10:30    10.5
10:45    10.75
10:50    10.83

\[ \sum X = 273.34 \]

\[ \bar{X} = 273.34 / 30 = 9.11 \]

Transform back into time scale: ≈ 9:06
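The same calculation can be sketched in code. As an illustrative choice, the version below works in minutes after midnight rather than decimal hours, so nothing is rounded before summing. This is why its total (16400 minutes, about 273.33 hours) differs slightly from the 273.34 obtained from the two-decimal values, while the mean still comes out at 9.11 hours.

```python
# Wake-up times from the slide, in minutes after midnight.
times = [415, 420, 450, 450, 465, 480, 505, 510, 525, 525, 530,
         540, 540, 540, 555, 565, 570, 570, 570, 570, 570, 585, 585,
         600, 600, 615, 625, 630, 645, 650]

n = len(times)
mean_minutes = sum(times) / n      # mean on the minute scale
mean_hours = mean_minutes / 60     # same mean in decimal hours

# Split the mean back into clock parts: hours, minutes, seconds.
total_seconds = round(mean_minutes * 60)
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"n = {n}, mean = {mean_hours:.2f} hours = {hours}:{minutes:02d}:{seconds:02d}")
```

The clock value lands between 9:06 and 9:07, consistent with the approximate figure on the slide.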

Page 25

A few notes about summation, and implications for calculation of the mean

\[ \underbrace{a + a + \dots + a}_{n} = na \]

\[ \sum_{i=1}^{n} a = na \]

Page 26

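If all data has the same value, a, then the mean value is also a.

\[ \frac{1}{n}\sum_{i=1}^{n} a = \frac{1}{n}\,na = a \]

because:

\[ \sum_{i=1}^{n} a = na \]

[Figure: plot of five observations (x-axis 1 to 5, y-axis 0 to 10) with the mean marked]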

Page 27

Multiplying all values by a constant

\[ \sum_{i=1}^{n} aX_i = a\sum_{i=1}^{n} X_i \]

\[ aX_1 + aX_2 + \dots + aX_n = a\left(X_1 + X_2 + \dots + X_n\right) \]

Page 28

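If we multiply each observation by 2, then we obtain a new distribution that is stretched out: the values end up twice as far apart.

A multiplying constant affects the mean (and the “spread”)

\[ \frac{1}{n}\sum_{i=1}^{n} 2X_i = 2\,\frac{1}{n}\sum_{i=1}^{n} X_i = 2\bar{X} \]

[Figure: two distributions, each plotted on an axis running from 1 to 10]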
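A quick numerical check of the multiplication rule, using made-up observations: doubling every value doubles the mean, and it doubles the standard deviation as well. The population form `pstdev` is used here, though the same holds for the sample form.

```python
from statistics import mean, pstdev

# Hypothetical observations, chosen only to illustrate the rule.
data = [1, 2, 3, 4, 5]
doubled = [2 * x for x in data]

print(mean(data), mean(doubled))        # the mean is multiplied by 2
print(pstdev(data), pstdev(doubled))    # so is the standard deviation
```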

Page 29

Adding a constant to all values

\[ (X_1 + a) + (X_2 + a) + \dots + (X_n + a) = (X_1 + X_2 + \dots + X_n) + na \]

\[ \sum_{i=1}^{n} (X_i + a) = \left(\sum_{i=1}^{n} X_i\right) + na \]

Page 30

If we add the constant 5 to each observation, then we obtain a new distribution that is shifted to the right by 5 units.

A shift affects the mean (but not the “spread”)

\[ \frac{1}{n}\sum_{i=1}^{n} (X_i + 5) = \left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) + \frac{1}{n}\,5n = \bar{X} + 5 \]

[Figure: two distributions, each plotted on an axis running from 1 to 10]
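The shift rule can be checked the same way. In this small sketch with made-up observations, adding 5 to every value raises the mean by 5 and leaves the standard deviation unchanged.

```python
from statistics import mean, pstdev

# Hypothetical observations, chosen only to illustrate the rule.
data = [1, 2, 3, 4, 5]
shifted = [x + 5 for x in data]

print(mean(data), mean(shifted))        # the mean moves up by 5
print(pstdev(data), pstdev(shifted))    # the spread does not change
```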

Page 31

Combining two variables

\[ (X_1 + Y_1) + (X_2 + Y_2) + \dots + (X_n + Y_n) = (X_1 + X_2 + \dots + X_n) + (Y_1 + Y_2 + \dots + Y_n) \]

\[ \sum_{i=1}^{n} (X_i + Y_i) = \left(\sum_{i=1}^{n} X_i\right) + \left(\sum_{i=1}^{n} Y_i\right) \]

Page 32

Adding two variables

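\[ \sum_{i=1}^{n} (X_i + Y_i) = \left(\sum_{i=1}^{n} X_i\right) + \left(\sum_{i=1}^{n} Y_i\right) \]

\[ \frac{1}{n}\sum_{i=1}^{n} (X_i + Y_i) = \left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) + \left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \bar{X} + \bar{Y} \]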

The mean of the sum of two variables is the sum of their means
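As a small illustration with made-up paired observations (each individual contributes one X value and one Y value), the mean of the pairwise sums matches the sum of the two means.

```python
from statistics import mean

# Hypothetical paired observations (X_i, Y_i) for the same n individuals.
x = [2, 4, 6, 8]
y = [1, 3, 5, 11]

# Add the two variables individual by individual.
sums = [xi + yi for xi, yi in zip(x, y)]

print(mean(sums), mean(x) + mean(y))
```

The rule relies on pairing: both lists must describe the same n individuals in the same order.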

Page 33

Measures of Dispersion

• Population Standard Deviation
• Sample Standard Deviation

Page 34

If we want to know how much the values vary around the mean…

We could calculate how much each value varies from the mean…

\[ (X_1 - \bar{X}) + (X_2 - \bar{X}) + \dots + (X_n - \bar{X}) = \sum (X_i - \bar{X}) \]

Because of the way we calculate the mean, this formula gives zero no matter what data you have!
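The reason is a one-line piece of algebra: \(\sum (X_i - \bar{X}) = \sum X_i - n\bar{X} = n\bar{X} - n\bar{X} = 0\). The sketch below checks this on made-up observations, using `Fraction` so that the result is exactly zero rather than a tiny floating-point remainder.

```python
from fractions import Fraction

# Hypothetical observations; Fractions keep the arithmetic exact.
data = [Fraction(v) for v in (3, 7, 8, 12, 15)]
x_bar = sum(data) / len(data)

# Signed deviations from the mean: negative below it, positive above it.
deviations = [x - x_bar for x in data]

print(x_bar, sum(deviations))
```

This cancellation of positive and negative deviations is why the deviations are squared before averaging.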

Page 35

Population Standard Deviation

• Variance

\[ \sigma^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n} \]

• Standard Deviation

\[ \sigma = \sqrt{\frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n}} \]

Page 36

Sample Standard Deviation

• Variance

\[ s^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n - 1} \]

• Standard Deviation

\[ s = \sqrt{\frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n - 1}} \]

There are n-1 “degrees of freedom” (If you know the mean and n-1 observations then you can figure out the n’th observation)
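To make the difference concrete, here is a minimal sketch on made-up data. It assumes the usual convention that the population standard deviation divides the summed squared deviations by n while the sample version divides by n - 1, and it compares both against Python's `statistics.pstdev` and `statistics.stdev`. The last lines illustrate the degrees-of-freedom remark by recovering the final observation from the mean and the other n - 1 values.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# Hypothetical observations, chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
x_bar = mean(data)

# Sum of squared deviations from the mean (shared by both formulas).
squared_deviations = sum((x - x_bar) ** 2 for x in data)

population_sd = sqrt(squared_deviations / n)        # divide by n
sample_sd = sqrt(squared_deviations / (n - 1))      # divide by n - 1

# Degrees of freedom: the mean plus n - 1 observations fix the last one.
recovered_last = n * x_bar - sum(data[:-1])

print(population_sd, sample_sd, recovered_last)
```

Because n - 1 is smaller than n, the sample value is always slightly larger than the population value computed from the same numbers.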

Page 37

Computational Formulas

• Note that there are computational formulas for the standard deviation.

• Look for them in ALEKS and write them down.

• Remember you can bring notes to your assessments

Page 38

For Next Week…

• Keep working on ALEKS
• Finish the descriptive statistics section
• Watch the second video
• If you can, start probability section before Jason’s lecture next week.

• Remember: Office Hours and Lab are always available for you.