Measurement Variables Describing Distributions © 2014 Project Lead The Way, Inc. Computer Science and Software Engineering

Upload: egbert-price

Post on 13-Dec-2015


TRANSCRIPT

Page 1:

Measurement Variables

Describing Distributions

Page 2:

Continuous vs. Discrete

• A nearly perfect analogy:
  continuous : discrete
  analog : digital
  float : int

• Measurements of continuous variables are made discrete by "binning" them.

• How old are you? Time is continuous, but you answer in discrete, binned values.
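
The binning idea on this slide can be sketched in a few lines of Python. This is only an illustration: the function names `age_in_years` and `bin_index` and the sample values are made up for the example, not taken from the slides.

```python
import math

def age_in_years(exact_age: float) -> int:
    """Bin a continuous age into whole years by rounding down."""
    return math.floor(exact_age)

def bin_index(value: float, low: float, width: float) -> int:
    """Index k of the bin [low + k*width, low + (k+1)*width) that holds value."""
    return math.floor((value - low) / width)

# Hypothetical continuous ages (a float), answered as discrete years (an int).
exact_ages = [16.02, 16.5, 16.999, 17.0]
binned_ages = [age_in_years(a) for a in exact_ages]

# A hypothetical height of 171.3 cm, binned into 10 cm intervals starting at 150 cm.
height_bin = bin_index(171.3, low=150.0, width=10.0)
```

Three different continuous ages collapse into the single answer 16, which is exactly the information lost by binning.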

Page 3:

Levels of a Measurement Variable

• Categorical (e.g., zip codes): categories with no meaningful order

• Ordinal (e.g., rank in a race): ordered, but increasing by 1 has no consistent meaning

• Interval (e.g., grade level): ordered, with consistent steps up, but no meaning for "doubling" or "tripling"

• Ratio (e.g., height): ordered, with "2 times" meaning "double"

Page 4:

Sample vs. Population

• Population = infinite pool of measurements, or all measurements possible

• Sample = subset of population

Page 5:

Sample vs. Population

• Population parameters
  μ = population mean
  σ = population standard deviation

• These are inferred from data

• Sample statistics
  x̄ = sample mean
  s = sample standard deviation

• These describe data
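
As a minimal sketch of computing sample statistics, the Python standard library's `statistics` module can be used. The height values below are invented for illustration, and the names `x_bar` and `s` follow the common notation for the sample mean and sample standard deviation.

```python
import statistics

# Made-up sample of six heights, in inches.
sample = [62.0, 65.5, 67.0, 68.5, 70.0, 72.0]

x_bar = statistics.mean(sample)    # sample mean: describes this data
s = statistics.stdev(sample)       # sample standard deviation (divides by n - 1)

# pstdev divides by n instead; it is the right formula only when the data
# are the entire population rather than a sample drawn from it.
sigma_if_whole_population = statistics.pstdev(sample)
```

The sample statistics describe the six measurements themselves; they serve as estimates of the population parameters, which are not observed directly.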

Page 6:

Sample vs. Population

• Infer population distribution from sample histogram

• Sample histogram matches parent distribution better with a large sample visualized with small intervals
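
The effect of sample size can be explored with a small simulation. The sketch below assumes a standard normal parent distribution and a fixed bin width, neither of which comes from the slides; `max_histogram_error` is a hypothetical helper name.

```python
import random
from statistics import NormalDist

def max_histogram_error(n: int, bin_width: float, seed: int = 0) -> float:
    """Largest gap between a sample histogram and its parent distribution.

    Draws n values from a standard normal parent, bins them on [-3, 3),
    and compares each bin's observed fraction with the parent's probability.
    """
    rng = random.Random(seed)
    parent = NormalDist(0.0, 1.0)
    n_bins = round(6.0 / bin_width)
    counts = [0] * n_bins
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        k = int((x + 3.0) // bin_width)
        if 0 <= k < n_bins:            # skip the rare values outside [-3, 3)
            counts[k] += 1
    worst = 0.0
    for k, count in enumerate(counts):
        left = -3.0 + k * bin_width
        expected = parent.cdf(left + bin_width) - parent.cdf(left)
        worst = max(worst, abs(count / n - expected))
    return worst

small_error = max_histogram_error(n=100, bin_width=0.5)
large_error = max_histogram_error(n=100_000, bin_width=0.5)
```

With the larger sample, the worst-case gap between the histogram and the parent distribution is typically much smaller, which is what allows small intervals to be used without the histogram becoming noisy.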

Page 7:

Median

• Half of the area under the distribution is to the left of the median
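
For a list of data rather than a smooth curve, the same idea means that half of the values fall on each side of the median. A minimal sketch with invented values:

```python
import statistics

values = [3, 9, 1, 7, 5, 11, 13]            # made-up measurements
median = statistics.median(values)           # middle value once sorted

below = sum(1 for v in values if v < median)  # count of values left of the median
above = sum(1 for v in values if v > median)  # count of values right of the median
```

Here the counts on each side are equal. With an even number of values, `statistics.median` returns the average of the two middle values.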

Page 8:

Mean, Median, Mode

Page 9:

Box Plot

• y-axis shows values of the data

• Splits data into quartiles

[Figure: box plot; y-axis label: height]

Page 10:

Each box contains 25% of the data

The IQR (Interquartile Range) Contains 50% of the Data
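
The quartiles and the IQR can be computed with `statistics.quantiles` from the Python standard library. The heights below are made up, and the choice of `method="inclusive"` is an assumption for this sketch: different tools use slightly different quartile conventions, so results may differ a little from other software.

```python
import statistics

# Made-up heights in cm.
heights = [150, 155, 160, 162, 165, 168, 170, 175, 180]

# n=4 asks for the three cut points that split the data into quarters.
# method="inclusive" keeps the cut points within the observed min and max.
q1, q2, q3 = statistics.quantiles(heights, n=4, method="inclusive")

iqr = q3 - q1    # width of the box: the middle half of the data
```

The box in a box plot runs from `q1` to `q3`, with a line at the median `q2`.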

Page 11:

Box Plot

Whiskers extend to max and min… usually

Page 12:

Whiskers and Outliers Show max/min
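
The slides do not say how a point gets classified as an outlier. One widely used convention, assumed here, flags any value more than 1.5 × IQR beyond the box; the whiskers then stop at the most extreme values that are not outliers. The helper name `box_plot_summary` and the data are hypothetical.

```python
import statistics

def box_plot_summary(data):
    """Return (low whisker, high whisker, outliers) under the 1.5 * IQR convention."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr       # values below this are outliers
    high_fence = q3 + 1.5 * iqr      # values above this are outliers
    inside = [x for x in data if low_fence <= x <= high_fence]
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return min(inside), max(inside), outliers

# Made-up heights in cm, with one unusually large value.
heights = [150, 155, 160, 162, 165, 168, 170, 175, 210]
whisker_low, whisker_high, outliers = box_plot_summary(heights)
```

In this example the upper whisker stops at 175 and the value 210 is drawn as a separate outlier point, so the whiskers together with the outliers still show the max and min.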

Page 13:

The Range Contains 100% of the Data

Page 14:

Normal Distributions

• A family of distributions with very similar shape

• One normal distribution for each μ and σ

• μ ("mu") = population mean

• σ ("sigma") = population standard deviation
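
The slides do not show a formula, but the standard probability density of a normal distribution makes it clear why μ and σ pick out one member of the family: μ shifts the curve left or right, and σ stretches or narrows it.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}
       \exp\!\left( -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right)
```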

Page 15:

A Normal Distribution

• One normal distribution for any pair μ, σ

• Example: μ = 6 and σ = 2.2

• μ ("mu") = population mean

• σ ("sigma") = population standard deviation
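
The example distribution with μ = 6 and σ = 2.2 can be built with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is just a sketch for exploring the curve; the variable names are chosen for illustration.

```python
import math
from statistics import NormalDist

mu, sigma = 6.0, 2.2                  # the example values from the slide
dist = NormalDist(mu, sigma)

peak_height = dist.pdf(mu)            # the density is highest at the mean
area_left_of_mean = dist.cdf(mu)      # area under the curve left of the mean

# For comparison: the peak height predicted by the standard density formula.
formula_peak = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
```

Because a normal distribution is symmetric, half of the area lies to the left of μ, so the mean and the median coincide.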

Page 16:

The Standard Normal Distribution

• μ = 0 and σ = 1

Page 17:

The Empirical Rule: 68% - 95% - 99.7%

• 68% area: values within μ ± σ

• 95% area: values within μ ± 2σ

• 99.7% area: values within μ ± 3σ
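
The percentages in the empirical rule can be checked numerically against the standard normal distribution. In the sketch below, `area_within` is a hypothetical helper; the areas come out to roughly 68.3%, 95.4%, and 99.7%, which is why the rule is usually quoted as 68-95-99.7.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def area_within(k: float) -> float:
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Areas within 1, 2, and 3 standard deviations of the mean.
areas = {k: area_within(k) for k in (1, 2, 3)}
```

The same fractions hold for any normal distribution, since every member of the family is a shifted and rescaled copy of the standard one.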

Page 18:

Shape, Center, Spread

• These distributions are both positively skewed because they are right-tailed
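
A right tail typically pulls the mean above the median, which gives a quick numeric check for positive skew. The data below are invented to have a long right tail; this is an illustration, not the data behind the slide's figures.

```python
import statistics

# Made-up data: most values are small, with a few large ones in the right tail.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]

mean = statistics.mean(right_tailed)      # pulled upward by the tail
median = statistics.median(right_tailed)  # barely affected by the tail
mode = statistics.mode(right_tailed)      # the most common value

tail_pulls_mean_up = mean > median
```

Comparing the mean and the median is only a rule of thumb; a histogram of the data is still the most direct way to judge the shape.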

Page 19:

Shape, Center, Spread

Page 20:

Shape, Center, Spread