wynberg girls high-jade gibson-maths-data analysis statistics

Data Analysis
Chapter 10

Types of Data
• Quantitative data is data recorded with numbers – e.g. a learner’s weight or the number of goals scored
• Qualitative data is data recorded in words – e.g. favourite colours

Types of data (cont.)
• Within these two types of data we can also look at:
– Discrete data – information collected by counting (1, 2, 3 …; no halves, quarters, etc.)
– Continuous data – information collected by measurement (may have decimals and fractions)

Do Ex 10.8 Q1 (Pg 232)

Data Interpretation
• Once data has been collected and sorted, it has to be interpreted and analysed
• Two types of interpretation:
– Pictorial methods: involve drawing graphs
– Arithmetic methods: involve working out:
  • Measures of central tendency – mean, median and mode
  • Measures of dispersion – range, percentiles, quartiles and the interquartile range

Displaying data (Pictorial methods)
• Histograms – no gaps between the bars (quantitative data)
• Bar graphs – bars do not touch
• Compound bar graphs
– Dual bar graph – data displayed next to each other
– Sectional bar graph – data displayed ‘on top of one another’
• Pie charts
• Broken-line graphs

Ex. 10.2 (3)
[Figure: "Average Day Time Temperatures". x-axis: City (Bisho, Bloemfontein, Johannesburg, Pietermaritzburg, Nelspruit, Kimberley, Polokwane, Mafikeng, Cape Town); y-axis: Temperature (Degrees Celsius), 0 to 30; series: January, July]

10.3 (1)
[Figure: "Learners' plans for when they have finished school". Categories: Go to University, Go to Technikon, Get a job, Don't know]

10.4 (2)
[Figure: "Jacob's weight". x-axis: Week (0 to 10); y-axis: Mass in kg (0 to 3.5)]

Misleading graphs
• Ways graphs/charts can be misleading:
– Using 3D in pictograms/bar charts
– Using perspective/shape to exaggerate
– Reversing the direction of an axis (to make a decrease seem like an increase)
– Altering the scale of the y-axis (to make a trend look more or less steep)
– Leaving part of the axis out to exaggerate differences

http://www.coolschool.ca/lor/AMA11/unit1/U01L02.htm#
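The effect of leaving part of the axis out can be put into numbers. The sketch below is only an illustration: the function name `apparent_ratio` and the two sales figures are made up for this example and do not come from the textbook.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of the drawn bar heights for values a and b when the
    vertical axis begins at axis_start instead of zero."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (b - axis_start) / (a - axis_start)

# Hypothetical figures for two months (assumed, for illustration only).
june, july = 50, 55

honest = apparent_ratio(june, july)         # axis starts at 0
truncated = apparent_ratio(june, july, 45)  # axis starts at 45

print(f"Axis from 0:  July's bar is {honest:.1f} times as tall as June's")
print(f"Axis from 45: July's bar is {truncated:.1f} times as tall as June's")
```

The real increase is 10%, but with the axis starting at 45 the second bar is drawn twice as tall as the first.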

Misleading statistics
• Stats are notorious for being made up or misleading
• E.g. during a political debate in the USA, a member of the opposition claimed that employment had gone up during the President’s term of office. It had, but only because the population had increased; the number of unemployed people had also increased.

“86% of statistics are made up on the spot and the remaining 24% are flawed”
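The employment example can be checked with a quick calculation. The head counts below are hypothetical (they are not from the debate or the textbook); they only show how the number of employed people and the unemployment rate can both rise when the population grows.

```python
def unemployment_rate(employed, unemployed):
    """Share of the labour force that is unemployed."""
    return unemployed / (employed + unemployed)

# Hypothetical head counts at the start and end of a term of office.
start = {"employed": 900, "unemployed": 100}
end = {"employed": 990, "unemployed": 210}

rate_start = unemployment_rate(**start)
rate_end = unemployment_rate(**end)

print("Employed went up:", end["employed"] > start["employed"])
print(f"Unemployment rate: {rate_start:.1%} -> {rate_end:.1%}")
```

Quoting only the count of employed people hides the fact that a larger share of people is out of work.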

Measures of central tendency
• Mean, mode and median
• “Averages”

• Mean (x̄) is like the average:
– Mean = sum of values ÷ number of values
– Can be affected by outliers, so it is not a good measure of central tendency if the data contains outliers

• Median is the middle value when the data is placed in numerical order (smallest to biggest)
– If there are outliers, then the median is a better measure of central tendency than the mean
• Mode/modal value is the value that appears most often
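A short sketch of the three measures, using Python's standard `statistics` module. The data set is hypothetical and includes one deliberate outlier to show why the median can be the better "average".

```python
from statistics import mean, median, multimode

# Hypothetical data: weekly pocket money (in rand) for seven learners.
# The last value is a deliberate outlier.
values = [20, 25, 25, 30, 35, 40, 500]

data_mean = mean(values)        # sum of values / number of values
data_median = median(values)    # middle value once the data is sorted
data_modes = multimode(values)  # most frequent value(s), returned as a list

print(f"Mean:   {data_mean:.2f}")
print(f"Median: {data_median}")
print(f"Mode:   {data_modes}")
```

The single outlier pulls the mean far above every other value, while the median stays in the middle of the typical values.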

Things which can help with measures of central tendency
• Frequency tables
– Simple tables
– Or for grouped data
• Stem-and-leaf diagrams – these are especially helpful for data with more than ten items

10.9 (3)
Heights of mealie plants (in cm)

         Jabu | Stem | Robert
              |  10  | 0, 8
              |  11  | 4
              |  12  | 9
              |  13  |
         9, 6 |  14  |
      5, 1, 0 |  15  | 6, 8, 8, 8
            0 |  16  | 2, 4, 5, 5, 7
      9, 6, 3 |  17  | 6, 6, 8, 8
      9, 9, 4 |  18  | 0, 0
      8, 6, 3 |  19  | 5
7, 5, 5, 4, 0 |  20  |

10.10 (1)

Stem | Leaves
   2 | 9
   3 | 2, 6, 9
   4 | 4, 5, 5, 5, 5, 7, 7
   5 | 0, 1, 2, 2, 2, 4, 4, 6, 9, 9
   6 | 4, 5, 9
   7 | 0, 2, 7, 8, 9
   8 | 0, 2, 2, 7
   9 | 0
  10 | 0
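A stem-and-leaf diagram like the one for 10.10 (1) can be built from a plain list of numbers. The sketch below assumes the key is "stem = tens, leaf = units" (the slide does not state the key), and the function name `stem_and_leaf` is made up for this example. The scores are read off the diagram above under that assumption.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by stem (tens) and leaf (units)."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return dict(rows)

# Values read from the 10.10 (1) diagram, assuming stem = tens digit(s).
scores = [
    29, 32, 36, 39, 44, 45, 45, 45, 45, 47, 47,
    50, 51, 52, 52, 52, 54, 54, 56, 59, 59,
    64, 65, 69, 70, 72, 77, 78, 79,
    80, 82, 82, 87, 90, 100,
]

diagram = stem_and_leaf(scores)
for stem, leaves in diagram.items():
    print(f"{stem:>2} | {', '.join(str(leaf) for leaf in leaves)}")
```

Because the leaves are already in order, the median and mode can be read straight off the diagram by counting along the rows.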

Grouped data
• When the data contains many different measurements, it is usually grouped into intervals (classes). Try to have between 8 and 14 classes, and start with a value below the minimum in the data.
• Tally: lines used to count up the frequency of scores
• Frequency is the number of times that score/value appears

Example of a ‘Grouped data table’

Classes | Tally  | Frequency (f) | Midpoint (X)    | fX (frequency × midpoint)
1-5     | ///    | 3             | (1 + 5) ÷ 2 = 3 | 9
6-10    | //// / | 6             | 8               | 48

• Midpoint is the midpoint of that interval, calculated as in the table above
• fX = frequency multiplied by midpoint
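The midpoint and fX columns can be reproduced with a few lines of code. This is a minimal sketch using the two classes from the example table; the variable names are chosen for this illustration.

```python
# (lower limit, upper limit, frequency) for each class in the example table.
classes = [(1, 5, 3), (6, 10, 6)]

rows = []
for lower, upper, frequency in classes:
    midpoint = (lower + upper) / 2  # e.g. (1 + 5) / 2 = 3
    rows.append((f"{lower}-{upper}", frequency, midpoint, frequency * midpoint))

print("Class  f  X    fX")
for label, f, x, fx in rows:
    print(f"{label:<6} {f}  {x:<4} {fx}")
```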

Analysing the grouped data
– We can calculate:
  • Actual mean (x̄) = sum of values ÷ number of values
  • Estimated mean (X̄) = sum of ‘fX’ values ÷ number of values
– We can draw a graph using the data:
  • e.g. a histogram with ‘classes’ on the x-axis and ‘frequency’ on the y-axis
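As a worked example of the estimated mean, the sketch below uses the frequencies and midpoints from the example grouped data table. The "number of values" is taken to be the sum of the frequencies; the function name `estimated_mean` is an assumption for this illustration.

```python
def estimated_mean(frequencies, midpoints):
    """Sum of fX divided by the total frequency."""
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    total_f = sum(frequencies)
    return total_fx / total_f

frequencies = [3, 6]  # f column of the example table
midpoints = [3, 8]    # X column of the example table

result = estimated_mean(frequencies, midpoints)
print(f"Estimated mean = (9 + 48) / (3 + 6) = {result:.2f}")
```

It is only an estimate because every value in a class is treated as if it were equal to the midpoint of that class.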

– We can find both a mode and a modal class:
  • Mode: value that appears most often
  • Modal class: class (interval) with the highest frequency
– We can estimate the median from a histogram:
  • By estimating the value at which the ‘area’ of the histogram is divided into two equal parts
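The "equal areas" idea can be sketched in code. The example assumes that the area of each bar is proportional to its frequency and that values are spread evenly within a class. It reuses the two classes of the example table, with class boundaries placed at 0.5, 5.5 and 10.5; those boundaries are an assumption, as are the function names.

```python
def modal_class(table):
    """Boundaries of the class with the highest frequency."""
    lower, upper, _ = max(table, key=lambda row: row[2])
    return lower, upper

def estimated_median(table):
    """Value that splits the total histogram area into two equal halves."""
    half = sum(f for _, _, f in table) / 2
    cumulative = 0
    for lower, upper, f in table:
        if f > 0 and cumulative + f >= half:
            # Move the required fraction of the way across this bar.
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("table has no observations")

# (lower boundary, upper boundary, frequency); boundaries are assumed.
table = [(0.5, 5.5, 3), (5.5, 10.5, 6)]

print("Modal class boundaries:", modal_class(table))
print("Estimated median:", estimated_median(table))
```

Half of the 9 values is 4.5; the first bar accounts for 3, so the dividing line sits a quarter of the way across the second bar.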

Histograms and frequency polygons
• Histograms and frequency polygons are both ‘frequency graphs’
– The difference between them is that the histogram is made up of bars, whereas the frequency polygon is a line graph
– The ‘polygon’ is made from the lines of the graph and the horizontal axis

Drawing Frequency Polygons (2 methods)
• 1) Using the bars of a histogram:
– Mark the midpoint of the top of each bar
– Join the points, including two points at zero on either side of the histogram
• 2) Without using a histogram:
– Plot the midpoint of each interval against the frequency
– Join the points, and add the two “zero” points on either side as with the histogram
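The second method amounts to computing a list of points to join. The sketch below assumes equal-width classes and places each "zero" point one class width beyond the first and last midpoints, which is a common convention but is not spelled out on the slide. The classes and the function name are hypothetical.

```python
def polygon_points(classes):
    """Points of a frequency polygon for equal-width classes.

    classes: list of (lower, upper, frequency), in increasing order.
    """
    width = classes[0][1] - classes[0][0]
    points = [((lower + upper) / 2, f) for lower, upper, f in classes]
    first_zero = (points[0][0] - width, 0)   # closes the polygon on the left
    last_zero = (points[-1][0] + width, 0)   # closes the polygon on the right
    return [first_zero] + points + [last_zero]

# Hypothetical classes: 10-20, 20-30, 30-40 with frequencies 2, 5, 3.
classes = [(10, 20, 2), (20, 30, 5), (30, 40, 3)]
points = polygon_points(classes)
print(points)
```

Joining these points in order, and then along the horizontal axis, gives the closed shape that the name "polygon" refers to.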


Measures of Dispersion
• Tell us how the data is grouped around the “average”
• Is it closely grouped, or scattered widely?
• Measure of spread, scattering or dispersion of scores

Range
• Range = largest value − smallest value
– It has a few limitations: it cannot be used for ‘grouped data’, and it tells us nothing about how the values are distributed between the largest and smallest
– For this reason we can also look at quartiles, deciles and/or percentiles
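A small illustration of the second limitation: two data sets can have the same range while being spread very differently in between. Both data sets and the function name `value_range` are made up for this example.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

# Hypothetical data sets with the same smallest and largest values.
clustered = [1, 5, 5, 5, 5, 5, 9]   # most values sit at 5
spread_out = [1, 2, 4, 5, 6, 8, 9]  # values evenly scattered

print("Range of clustered data: ", value_range(clustered))
print("Range of spread-out data:", value_range(spread_out))
```

Both ranges are equal, so the range alone cannot tell these two distributions apart.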

Quartiles, Percentiles and Deciles
• Quartiles: points that subdivide the data into quarters
• Deciles: points that subdivide the data into tenths
• Percentiles: points that subdivide the data into hundredths

Quartiles
• First/lower quartile (Q1): is one quarter of the way through the data set when ordered from lowest to highest
• Second quartile (Q2) = median
• Third/upper quartile (Q3): is three quarters of the way through the data set (in order)
• Interquartile range = third quartile − first quartile
• The interquartile range is a better measure of dispersion than the range as it is not affected by ‘extreme’ values
• It indicates how densely the data is spread around the median
• Semi-interquartile range = (Q3 − Q1) ÷ 2
• It is half of the interquartile range
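The quartile measures can be sketched as follows. Textbooks and software use slightly different rules for locating quartiles; this example assumes the "median of each half" rule (leaving out the middle value when the number of items is odd), which may differ from the rule used in the exercises. The data set is hypothetical and includes one extreme value.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Skip the middle value when n is odd.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

# Hypothetical data with one extreme value (40).
data = [2, 4, 5, 7, 8, 10, 12, 13, 40]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1
semi_iqr = iqr / 2
data_range = max(data) - min(data)

print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}")
print(f"Interquartile range = {iqr}, semi-interquartile range = {semi_iqr}")
print(f"Range = {data_range}")
```

The extreme value inflates the range but leaves the interquartile range unchanged, which is why the interquartile range is the more reliable measure here.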
