Post on 14-Dec-2015

Statistics and Modelling

2011

Topic 1: Introduction to statistical analysis

Purpose – To revise and advance our understanding of descriptive statistics and how to use them.

LESSON 1 – Central tendency and Variability

• Activity (as students arrive): Measure each other’s heights. – Students then calculate the mean height.

• NOTES: What are statistics? Look at measures of Central Tendency.

• Measures of Variability – what do standard deviation and variance actually tell us?

HW: Worksheet + Old Sigma (2nd edition) – Pg. 148: Ex. 10.1

Statistics

Statistics are numerical values that describe the characteristics of a set of numbers (data-set).

2 types of statistics:

Measures of Central Tendency: Mean, median & mode.

Measures of Variability: Standard deviation, variance, inter-quartile range.

Measures of Central Tendency:

Mean – easy to calculate but sensitive to extreme values.

Sample Mean = Sum of all scores ÷ Sample size: $\bar{x} = \frac{\sum x}{n}$

Population Mean = Sum of all scores ÷ Population size: $\mu = \frac{\sum x}{N}$

Median – rank the data and find the middle entry.

Advantage: More robust than the mean because it doesn’t get dragged up or down by extreme values.

Disadvantages:

1. Takes longer to calculate because you must rank the numbers first.

2. Not a function of all of the numbers.

The Mode – The most frequently occurring value. This is usually somewhere near the middle, but not always. Thus the mode is not always a good measure of central tendency.
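The three measures of central tendency can be checked with a short Python sketch. The heights below are made-up values (not the class data), and the outlier of 250 cm is added purely to illustrate why the median is described as more robust than the mean.

```python
import statistics

# Hypothetical heights in cm (illustrative only, not the class measurements)
heights_cm = [158, 162, 165, 165, 170, 174, 181]

mean_height = statistics.mean(heights_cm)      # sum of all scores / sample size
median_height = statistics.median(heights_cm)  # middle entry once the data is ranked
mode_height = statistics.mode(heights_cm)      # most frequently occurring value

# Replace the tallest value with an extreme one to see which measure moves
with_outlier = heights_cm[:-1] + [250]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print("mean:", mean_height, "-> with outlier:", mean_with_outlier)
print("median:", median_height, "-> with outlier:", median_with_outlier)
print("mode:", mode_height)
```

With this data the mean is dragged upwards by the extreme value, while the median stays where it was.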

Measures of Variability: Standard deviation, variance, inter-quartile range.

The standard deviation is a measure of the average distance from the mean.

The variance is also a measure of variability. It is the standard deviation squared. We actually work out the variance first, then square root it to get the standard deviation.

Variance Var(X)

Var(X) = The Mean of the Squared Deviations from the Mean

(i.e. the average squared distance from the mean)

Var(X) = Sum of Squared Deviations from the Sample Mean ÷ Sample size

Formula: $\mathrm{Var}(X) = \frac{\sum (x - \bar{x})^2}{n}$

Standard Deviation s or σ

A measure of the average distance from the mean.

Standard Deviation = $\sqrt{\text{Variance}}$

Formula: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$
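As a rough illustration of the definition above, the following Python sketch works out the variance first (the mean of the squared deviations, dividing by n) and then takes the square root to get the standard deviation. The scores are made-up values and the function names are chosen only for this example.

```python
import math

def variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / n

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical scores (illustrative only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

var_scores = variance(scores)
sd_scores = standard_deviation(scores)
print("variance:", var_scores)
print("standard deviation:", sd_scores)
```

The same two functions can be applied to the class height data once it has been typed into a list.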

Now look at the class height data, and calculate the variance & SD.

HW: Do Mean & SD worksheet + Sigma (old) Ex. 10.1.

LESSON 2 – Standard Deviation applications:

Points of today: Develop an understanding of what variance and standard deviation tell us.

STARTER: Mark HW worksheet.

1. Alternative formula for variance & standard deviation.

2. Sigma (old – 2nd edition): Pg. 151 – Ex. 10.2.

3. Types of data – discrete & continuous.

4. Sigma (old – 2nd edition): Pg. 157 – Ex. 10.4 (complete for HW).

Variance & standard deviation

The formula: $\mathrm{Var}(X) = \frac{\sum (x - \bar{x})^2}{n}$

can be re-arranged to get:

$\mathrm{Var}(X) = \frac{\sum x^2}{n} - (\bar{x})^2$

So Standard Deviation can be given by:

$s = \sqrt{\frac{\sum x^2}{n} - (\bar{x})^2}$

This version of the formula is quicker to use because you don’t have to calculate the distance from the mean, $x - \bar{x}$, for each individual value.

Caution: When calculating the SD from a freq. table, remember to multiply each x value by the number of times it occurs (its frequency)!
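A quick way to convince yourself that the two variance formulas agree is to compute both on the same data. This is only a sketch: the scores are made up and the function names are hypothetical.

```python
import math

def variance_by_definition(values):
    """Sum of (x - mean)^2, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def variance_by_shortcut(values):
    """Sum of x^2 divided by n, minus the mean squared."""
    n = len(values)
    mean = sum(values) / n
    return sum(x * x for x in values) / n - mean ** 2

# Hypothetical scores (illustrative only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

var_def = variance_by_definition(scores)
var_short = variance_by_shortcut(scores)
sd_short = math.sqrt(var_short)
print(var_def, var_short, sd_short)
```

The shortcut version only needs the running totals of x and x², which is why it is quicker by hand or on a calculator.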

Copy, then do Sigma (old) - pg. 151: Ex. 10.2.

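Variance formula proof: Link between the 2 variance formulas.

Prove that $\frac{\sum (x - \bar{x})^2}{n} = \frac{\sum x^2}{n} - (\bar{x})^2$

Left hand side = $\frac{\sum (x - \bar{x})^2}{n}$

= $\frac{1}{n} \sum (x - \bar{x})^2$

= $\frac{1}{n} \sum \left( x^2 - 2x\bar{x} + \bar{x}^2 \right)$

= $\frac{\sum x^2}{n} - 2\bar{x} \cdot \frac{\sum x}{n} + \frac{\sum (\bar{x})^2}{n}$

= $\frac{\sum x^2}{n} - 2\bar{x} \cdot \bar{x} + \frac{n(\bar{x})^2}{n}$

= $\frac{\sum x^2}{n} - 2(\bar{x})^2 + (\bar{x})^2$

= $\frac{\sum x^2}{n} - (\bar{x})^2$

= Right hand side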

LESSON 3 – Grouped data & frequency:

Points of today: Interpret displays of grouped data based on relative frequency.

1. Difference between discrete and continuous data.

2. How to calculate (estimate) the mean and standard deviation of grouped data displayed on a frequency table.

Sigma (old – 2nd edition): Ex. 10.4

Discrete and Continuous data

| | Discrete | Continuous |
|---|---|---|
| Where does its value come from? | Counting | Measurement |
| What values can it take? | Whole numbers or rounded values | All real numbers (anywhere along the number line – infinite precision) |
| What question is being asked? | ‘How many?’ | ‘How long?’, ‘How heavy?’ |
| Examples | Number of students who gain Excellence in the first test; money (since we count it in dollars and cents) | Height, distance, weight, volume, time |

Copy, then do Sigma (old) - pg. 157: Ex. 10.4, Q1 and 2.

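Calculating the mean & standard deviation from a frequency table

| x (Number of children in your family) | f (Frequency) | x × f | x² × f |
|---|---|---|---|
| 1 | | | |
| 2 | | | |
| 3 | | | |
| 4 | | | |
| 5 | | | |
| 6 | | | |
| 7 | | | |
| 8 | | | |
| 9 | | | |
| 10 | | | |
| Σ | | | |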
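The column totals in a table like this one can be sketched in Python as follows. The frequencies here are invented for illustration (the real ones come from the class survey); the point is that each x value is multiplied by its frequency before summing.

```python
import math

# Hypothetical frequency table: {number of children: frequency}
children_counts = {1: 4, 2: 9, 3: 6, 4: 3, 5: 2, 6: 1}

n = sum(children_counts.values())                            # total frequency
sum_xf = sum(x * f for x, f in children_counts.items())      # total of the x * f column
sum_x2f = sum(x * x * f for x, f in children_counts.items()) # total of the x^2 * f column

mean = sum_xf / n
variance = sum_x2f / n - mean ** 2   # shortcut formula, weighted by frequency
sd = math.sqrt(variance)

print("n =", n, "mean =", mean, "SD =", sd)
```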

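Calculating the mean & standard deviation when data is grouped into intervals

| Height in cm | m (Mid-point of interval) | f (Frequency) | m × f | m² × f |
|---|---|---|---|---|
| 165-->(170) | | 3 | | |
| 170-->(175) | | 4 | | |
| 175-->(180) | | 5 | | |
| 180-->(185) | | 3 | | |
| 185-->(190) | | 2 | | |
| 190-->(195) | | 0 | | |
| 195-->(200) | | 1 | | |
| Σ | | | | |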
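For grouped data the mean and standard deviation can only be estimated, because each value is replaced by the mid-point of its interval. The sketch below uses the intervals and frequencies from the height table; the variable names are assumptions made for this example, and each mid-point is taken as halfway between the lower and upper bound of its interval.

```python
import math

# (lower bound, upper bound, frequency) for each height interval in cm
intervals = [
    (165, 170, 3),
    (170, 175, 4),
    (175, 180, 5),
    (180, 185, 3),
    (185, 190, 2),
    (190, 195, 0),
    (195, 200, 1),
]

n = sum(f for _, _, f in intervals)

# Mid-point m of each interval, paired with its frequency
midpoints = [((low + high) / 2, f) for low, high, f in intervals]

sum_mf = sum(m * f for m, f in midpoints)       # total of the m * f column
sum_m2f = sum(m * m * f for m, f in midpoints)  # total of the m^2 * f column

estimated_mean = sum_mf / n
estimated_variance = sum_m2f / n - estimated_mean ** 2
estimated_sd = math.sqrt(estimated_variance)

print("n =", n)
print("estimated mean =", round(estimated_mean, 2))
print("estimated SD =", round(estimated_sd, 2))
```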

Mean & std. deviation on your calc.

On a Graphics Calculator (GC):

Type:

MENU, STAT, CALC (F2),

SET (F6):

1Var XList :List 1 (F1)

1Var Freq :1 (F1)

EXIT (goes back)

Now enter the data into List 1.

Press 1Var

x̄ : mean

Σx : sum of all values

Σx² : sum of the squares of all values

xσn : standard deviation.

On a Scientific Calculator:

MODE 2 (puts it into STAT mode)

Now enter the first data-value and press M+ (means add to memory)

Repeat, pressing M+ after each:

E.g.: 4 M+ 7 M+ 8 M+ etc.

Every time you’ve entered in a new value it will tell you the number of values now saved. (e.g. n=3).

Once all are entered, type “=”, then:

SHIFT 2: 1: (x̄) for the mean;

2: (xσn) for the SD.

SHIFT Mode: Clears the memory
