A Short Tour of Probability & Statistics Presented by: Nick Bennett, Grass Roots Consulting & GUTS Josh Thorp, Stigmergic Consulting & GUTS Irene Lee, Santa Fe Institute, GUTS November 6, 2010 Santa Fe Alliance for Science Professional Enrichment Activity

Page 1: A Short Tour of Probability & Statistics Presented by: Nick Bennett, Grass Roots Consulting & GUTS Josh Thorp, Stigmergic Consulting & GUTS Irene Lee,

A Short Tour of Probability & Statistics

Presented by:
Nick Bennett, Grass Roots Consulting & GUTS
Josh Thorp, Stigmergic Consulting & GUTS
Irene Lee, Santa Fe Institute, GUTS

November 6, 2010
Santa Fe Alliance for Science
Professional Enrichment Activity

Outline

• Framing the problem (Nick)

• Review of Statistics (Irene)

• Randomness (Nick)

• Dice & Data (Josh)

• Problem of Points (Nick)

• Crosswalk of Common Core standards (Irene)

What is Statistics?

• The science of collecting, organizing, and interpreting data.

What do Statisticians do?

• Data analysis

• Probability

• Statistical inference

What is a Statistical question?

• One that anticipates variability.

• Compare:

• “How old am I?”

• “How old are the students in my school?”

Describing Data

• Data (plural) are the raw material

• Data are the numbers we use to interpret reality

• We will look at a few different ways of describing data:

• Dot plot

• Frequency table

• Stem and Leaf diagram

A Sample Data Set

• 92 Penn State students’ weights

• MALES:

140, 145, 160, 190, 155, 165, 150, 190, 195, 138, 160, 155, 153, 145, 170, 175, 175, 170, 180, 135, 170, 157, 130, 185, 190, 155, 170, 155, 215, 150, 145, 155, 155, 150, 155, 150, 180, 160, 135, 160, 130, 155, 150, 148, 155, 150, 140, 180, 190, 145, 150, 164, 140, 142, 136, 123, 155

• FEMALES:

140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130, 120, 130, 131, 120, 118, 125, 135, 125, 118, 122, 115, 102, 115, 150, 110, 116, 108, 95, 125, 133, 110, 150, 108

Dot Plot

In a dot plot, one dot per student goes over each student’s reported weight.
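
As a rough illustration, a dot plot can be sketched in plain text. The Python snippet below is a minimal sketch and not part of the original slides: it uses only the first 14 female weights from the sample data set, and it draws the dots to the right of each value rather than above it. The helper name `dot_plot` is made up for this example.

```python
from collections import Counter

# First 14 female weights from the sample data set (a subset, chosen for brevity).
weights = [140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130, 120, 130]

def dot_plot(data):
    """Return one text row per distinct value, with one dot per observation."""
    counts = Counter(data)
    return [f"{value:>3} | {'.' * counts[value]}" for value in sorted(counts)]

rows = dot_plot(weights)
print("\n".join(rows))
```

Each row stands for one reported weight, so repeated weights show up as longer runs of dots.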

Frequency table -> Histogram

Divide the number line into intervals and count the number of student weights within each interval.

The “frequency” is the count in any given interval.

The “relative frequency” is the proportion of weights in each interval.

Histograms

• From the frequency table, we can make a bar graph called a histogram.

• Each bar covers an interval and is centered at the midpoint.

• The height of the bar corresponds with the number of data points in the interval
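
The frequency table and histogram described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the presenters' method: the interval width of 10 pounds, the helper name `frequency_table`, and the choice of the first 14 female weights are all made for this example. Intervals with no data points are simply omitted.

```python
from collections import Counter

# First 14 female weights from the sample data set (a subset, chosen for brevity).
weights = [140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130, 120, 130]
BIN_WIDTH = 10  # assumed interval width

def frequency_table(data, width):
    """Return (low, high, frequency, relative frequency) for each non-empty interval."""
    counts = Counter((x // width) * width for x in data)
    n = len(data)
    return [(lo, lo + width, counts[lo], counts[lo] / n) for lo in sorted(counts)]

table = frequency_table(weights, BIN_WIDTH)
for lo, hi, freq, rel in table:
    # A row of '#' characters acts as a sideways histogram bar.
    print(f"{lo}-{hi - 1}: {'#' * freq:<6} frequency={freq} relative={rel:.2f}")
```

The relative frequencies always add up to 1, which is a quick check that no data point was dropped or counted twice.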

Stem-and-Leaf Diagram

Both summarizes data and shows all data points.

• The STEM shows intervals (ranges in tens)

• The LEAVES show data points (ranges in ones)

• Put the leaves in order

• Is there evidence of reporting bias?
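
The steps above can be sketched in Python. This is a minimal sketch, assuming whole-number weights so that the stem is the value divided by ten and the leaf is the ones digit. The helper name `stem_and_leaf` is made up for this example, and the data are the first 14 female weights from the sample data set.

```python
from collections import defaultdict

# First 14 female weights from the sample data set (a subset, chosen for brevity).
weights = [140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130, 120, 130]

def stem_and_leaf(data):
    """Map each stem (tens) to its ordered list of leaves (ones)."""
    stems = defaultdict(list)
    for x in sorted(data):  # sorting first keeps the leaves in order
        stems[x // 10].append(x % 10)
    return dict(stems)

plot = stem_and_leaf(weights)
for stem, leaves in plot.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

When reading the result, one pattern worth looking for is whether leaves of 0 and 5 appear much more often than other digits, which would suggest that people rounded their reported weights.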

Summary Statistics

• Central or typical value

• Spread about that value

Measures of Center

• Mean

• Median

The Mean

• Given $x_1, x_2, x_3, \ldots, x_n$:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
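
As a worked example, the mean of the 92 weights in the sample data set can be computed directly from the definition. The Python below is a small sketch. The variable and function names are chosen for this example only.

```python
# Weights copied from the sample data set slide.
males = [
    140, 145, 160, 190, 155, 165, 150, 190, 195, 138, 160, 155, 153, 145,
    170, 175, 175, 170, 180, 135, 170, 157, 130, 185, 190, 155, 170, 155,
    215, 150, 145, 155, 155, 150, 155, 150, 180, 160, 135, 160, 130, 155,
    150, 148, 155, 150, 140, 180, 190, 145, 150, 164, 140, 142, 136, 123, 155,
]
females = [
    140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130, 120, 130,
    131, 120, 118, 125, 135, 125, 118, 122, 115, 102, 115, 150, 110, 116,
    108, 95, 125, 133, 110, 150, 108,
]
weights = males + females

def mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)

x_bar = mean(weights)
print(len(weights), round(x_bar, 1))  # prints: 92 145.2
```

The result agrees with the mean of 145.2 used in the z-score slide later in the deck.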

The Median

• The midpoint of the ordered data

• If there is an odd number of data points, it is the middle value

• If there is an even number of data points, average the two data points nearest the middle.
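
A short Python sketch of the usual rule: sort the data, take the middle value when the count is odd, and average the two middle values when the count is even. The function name `median_of` is made up for this example, and the inputs are the first five and first six male weights from the sample data set.

```python
import statistics

def median_of(values):
    """Middle value of the sorted data (average of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median_of([140, 145, 160, 190, 155])        # 5 values -> 155
even_case = median_of([140, 145, 160, 190, 155, 165])  # 6 values -> 157.5
print(odd_case, even_case)

# Cross-check against the standard library.
assert odd_case == statistics.median([140, 145, 160, 190, 155])
```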

Why two measures of center?

Measures of Spread

• Interquartile range

• Standard deviation

Interquartile range

• Put the data in numerical order

• Divide the data set into two equal groups with the median as the center point.

• The median of the low group = 1st quartile, $Q_1$

• The median of the high group = 3rd quartile, $Q_3$

• $\mathrm{IQR} = Q_3 - Q_1$
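
The procedure above can be sketched in Python as follows. Two assumptions are worth flagging: when the number of data points is odd, this sketch leaves the median out of both halves, and the helper name `quartiles` is made up for this example. Statistical software often uses other quartile conventions, so results may differ slightly from library functions. The data are the first 14 female weights from the sample data set.

```python
from statistics import median

# First 14 female weights from the sample data set (a subset, chosen for brevity).
weights = [140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130, 120, 130]

def quartiles(data):
    """Return (Q1, Q3) as the medians of the low and high halves of the sorted data."""
    if len(data) < 2:
        raise ValueError("need at least two data points")
    ordered = sorted(data)
    half = len(ordered) // 2
    low = ordered[:half]    # values below the median position
    high = ordered[-half:]  # values above it (the middle value is skipped when n is odd)
    return median(low), median(high)

q1, q3 = quartiles(weights)
iqr = q3 - q1
print(q1, q3, iqr)  # prints: 120 138 18
```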

Box & Whiskers plot

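[Figure: box-and-whiskers plot. The box runs from Q1 to Q3 with the median marked inside, and a length of 1.5 IQR is labeled on each side of the box.]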
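
The figure labels a distance of 1.5 IQR on each side of the box. Assuming the common convention (an assumption here, since the slide text does not spell it out) that whiskers reach no further than 1.5 × IQR beyond the quartiles and that points outside those limits are drawn individually, the limits can be sketched in Python. The quartiles use the median-of-halves method, and the data are the first 14 female weights from the sample data set.

```python
from statistics import median

# First 14 female weights from the sample data set (a subset, chosen for brevity).
weights = [140, 120, 130, 138, 121, 125, 116, 145, 150, 112, 125, 130, 120, 130]

ordered = sorted(weights)
half = len(ordered) // 2
q1 = median(ordered[:half])   # median of the low half
q3 = median(ordered[-half:])  # median of the high half
iqr = q3 - q1

# Assumed convention: whisker limits sit 1.5 IQR beyond each quartile.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in ordered if x < lower_limit or x > upper_limit]
print(lower_limit, upper_limit, outliers)  # prints: 93.0 165.0 []
```

For this subset every weight falls inside the limits, so no point would be drawn on its own.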

Standard deviation

• Average squared distance = $\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$

• Sample variance: $S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$

• Standard deviation = $S = \sqrt{S^2}$
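
To make the three quantities concrete, here is a small Python sketch that computes them from the definitions for the first five male weights in the sample data set. The variable names are chosen for this example. The standard library's `statistics.variance` is used only as a cross-check of the sample variance.

```python
import math
import statistics

# First five male weights from the sample data set (a subset, chosen for brevity).
weights = [140, 145, 160, 190, 155]

n = len(weights)
x_bar = sum(weights) / n                                # mean: 158.0
sum_sq = sum((x - x_bar) ** 2 for x in weights)         # sum of squared distances: 1530.0

avg_squared_distance = sum_sq / n                       # divides by n
sample_variance = sum_sq / (n - 1)                      # divides by n - 1
standard_deviation = math.sqrt(sample_variance)

print(avg_squared_distance, sample_variance, round(standard_deviation, 2))
# prints: 306.0 382.5 19.56

# Cross-check against the standard library's sample variance.
assert math.isclose(sample_variance, statistics.variance(weights))
```

Dividing by n − 1 instead of n makes the sample variance slightly larger than the average squared distance. The gap shrinks as the sample grows.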

Z-scores, Standardized Scores

• A student weighing 175 pounds has a z-score of 1.26

$$z_i = \frac{x_i - \bar{x}}{S}$$

$$\frac{175 - 145.2}{23.7} \approx 1.26$$
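
The same arithmetic in Python, as a quick check. The mean (145.2) and standard deviation (23.7) are the values quoted on the slide. The function name `z_score` is made up for this example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / sd

z = z_score(175, 145.2, 23.7)
print(round(z, 2))  # prints: 1.26
```

A positive z-score means the value is above the mean, and a negative one means it is below.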

Summary:

• Several ways to display data

• Measures of Center

• Measures of Spread

• Standard deviations

Statistical inference

• Use random sampling to draw inferences about a population.

• Generalizations about a population from a sample are valid only if the sample is representative of that population.

Sampling

• With replacement.

• Without replacement.
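
The difference between the two kinds of sampling can be sketched with Python's standard `random` module. Everything specific here is an assumption made for illustration: the population is a hypothetical list of ten student ID numbers, the sample size is 5, and the seed is fixed only so that repeated runs give the same draws.

```python
import random

population = list(range(1, 11))  # hypothetical student IDs 1..10
rng = random.Random(2010)        # fixed seed, chosen arbitrarily, for repeatable runs

# With replacement: each draw is put back, so the same ID can appear more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: each ID can be drawn at most once.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

A sample drawn without replacement can never be larger than the population, while a sample drawn with replacement can be any size.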