wynberg girls high-jade gibson-maths-data analysis statistics
DESCRIPTION
PowerPoint slides for data analysis in statistics
TRANSCRIPT
Data Analysis
Chapter 10
Types of Data
• Quantitative data is data recorded with numbers – e.g. a learner’s weight or number of goals
• Qualitative data is data recorded in words – e.g. favourite colours
… types of data cont.
• Within these two types of data we can also look at …
– Discrete data – information collected by counting (1, 2, 3 … no halves/quarters etc.)
– Continuous data – information collected by measurement (may have decimals and fractions)
Do Ex 10.8 Q1 (Pg 232)
Data Interpretation
• Once data has been collected and sorted, it has to be interpreted and analysed
• Two types of interpretation:
– Pictorial methods: involve drawing graphs
– Arithmetic methods: involve working out:
  • Measures of central tendency – mean, median and mode
  • Measures of dispersion – range, percentiles, quartiles and the interquartile range
Displaying data (Pictorial methods)
• Histograms – no gaps (quantitative data)
• Bar graphs – bars do not touch
• Compound bar graphs
– Dual bar graph – data displayed next to each other
– Sectional bar graph – data displayed ‘on top of one another’
• Pie charts
• Broken line graphs
Ex. 10.2 (3)
[Graph: Average Day Time Temperatures. Horizontal axis: City (Bisho, Bloemfontein, Johannesburg, Pietermaritzburg, Nelspruit, Kimberley, Polokwane, Mafikeng, Cape Town). Vertical axis: Temperature (Degrees Celsius), 0 to 30. Series: January, July.]
10.3 (1)
[Chart: Learners' plans for when they have finished school. Categories: Go to University, Go to Technikon, Get a job, Don't know.]
10.4 (2)
[Graph: Jacob's weight. Horizontal axis: Week, 0 to 10. Vertical axis: Mass in kg, 0 to 3.5.]
Misleading graphs
• Ways graphs/charts can be misleading:
– Using 3D in pictograms/bar charts
– Using perspective/shape to exaggerate
– Reversing the direction of an axis (to make a decrease seem like an increase)
– Altering the scale of the y-axis (to make it look more or less steep)
– Leaving part of the axis out to exaggerate differences
http://www.coolschool.ca/lor/AMA11/unit1/U01L02.htm#
Misleading statistics
• Stats are notorious for being made up or misleading
• E.g.: during a political debate in the USA, a member of the opposition claimed that employment had gone up during the President’s term of office; yes, it had … but only because the population had increased, and the number of unemployed people had also increased.
“86% of statistics are made up on the spot and the remaining 24% are flawed”
Measures of central tendency
• Mean, mode and median
• “Averages”
…
• Mean (x̄) is like the average:
– Mean = sum of values ÷ number of values
– Can be affected by outliers, so it is not a good measure of central tendency if there are outliers
…
• Median is the one in the middle when placed in numerical order (smallest to biggest)
– If there are outliers then the median is a better measure of central tendency
• Mode/modal value is the value that appears the most
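As a rough illustration of the three averages, the sketch below uses Python's standard `statistics` module on a small list of marks. The marks are hypothetical, invented only for this example. It shows how a single outlier pulls the mean a long way while the median barely moves.

```python
from statistics import mean, median, mode

# Hypothetical marks, invented for illustration only.
marks = [4, 5, 5, 6, 7]
marks_with_outlier = marks + [45]  # one extreme value added

mean_plain = mean(marks)        # sum of values / number of values
median_plain = median(marks)    # middle value of the sorted list
mode_plain = mode(marks)        # value that appears the most

mean_outlier = mean(marks_with_outlier)
median_outlier = median(marks_with_outlier)

print("Without outlier: mean", mean_plain, "median", median_plain, "mode", mode_plain)
print("With outlier:    mean", mean_outlier, "median", median_outlier)
```

When the list has an even number of values, `median` returns the value halfway between the two middle values.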
Things which can help with measures of central tendency
• Frequency tables
– Simple tables
– Or for grouped data
• Stem and leaf diagrams – these are especially helpful for data with more than ten items
10.9 (3)
Heights of mealie plants (in cm)
         Jabu | Stem | Robert
              |  10  | 0, 8
              |  11  | 4
              |  12  | 9
              |  13  |
         9, 6 |  14  |
      5, 1, 0 |  15  | 6, 8, 8, 8
            0 |  16  | 2, 4, 5, 5, 7
      9, 6, 3 |  17  | 6, 6, 8, 8
      9, 9, 4 |  18  | 0, 0
      8, 6, 3 |  19  | 5
7, 5, 5, 4, 0 |  20  |
10.10 (1)
Stem | Leaves
   2 | 9
   3 | 2, 6, 9
   4 | 4, 5, 5, 5, 5, 7, 7
   5 | 0, 1, 2, 2, 2, 4, 4, 6, 9, 9
   6 | 4, 5, 9
   7 | 0, 2, 7, 8, 9
   8 | 0, 2, 2, 7
   9 | 0
  10 | 0
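A stem and leaf diagram like 10.10 (1) can be built by splitting each value into its tens (the stem) and its units (the leaf). The sketch below is one possible way to do this in Python; the function name `stem_and_leaf` is made up for this example, and the data list is simply the values read back from the 10.10 (1) diagram.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} where stem = tens and leaf = units."""
    plot = defaultdict(list)
    for v in sorted(values):          # sorting keeps the leaves in order
        stem, leaf = divmod(v, 10)    # e.g. 47 -> stem 4, leaf 7
        plot[stem].append(leaf)
    return dict(plot)

# Values read back from the 10.10 (1) diagram.
data = [29, 32, 36, 39, 44, 45, 45, 45, 45, 47, 47,
        50, 51, 52, 52, 52, 54, 54, 56, 59, 59,
        64, 65, 69, 70, 72, 77, 78, 79,
        80, 82, 82, 87, 90, 100]

plot = stem_and_leaf(data)
for stem in sorted(plot):
    leaves = ", ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem:>4} | {leaves}")
```

The row with the most leaves shows at a glance which interval of ten holds the most values.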
Grouped data
• When the data has many different measurements involved in it, the data is usually grouped in intervals (classes). Try to have between 8 and 14 classes, and start with a value below the minimum in the data.
• Tally: lines used to count up the frequency of scores
• Frequency is the number of times that score/value appears
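The tallying step can be sketched in Python as below. The function name `frequency_table` and the list of scores are hypothetical, chosen only for illustration, and the sketch assumes whole-number data with classes of equal width. Only two classes are used here to keep the output short.

```python
def frequency_table(values, start, width, n_classes):
    """Count how many values fall in each class interval.

    Assumes whole-number data: the classes are
    start..start+width-1, start+width..start+2*width-1, and so on.
    """
    table = []
    for i in range(n_classes):
        low = start + i * width
        high = low + width - 1
        freq = sum(1 for v in values if low <= v <= high)
        table.append((low, high, freq))
    return table

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 5, 6, 6, 7, 8, 9, 10]

table = frequency_table(scores, start=1, width=5, n_classes=2)
for low, high, freq in table:
    # One stroke per value stands in for the tally column.
    print(f"{low}-{high}: {'/' * freq} ({freq})")
```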
Example of a ‘Grouped data table’
Classes | Tally  | Frequency (f) | Midpoint (X) | fX (Frequency × midpoint)
1-5     | ///    | 3             | (1+5)÷2 = 3  | 9
6-10    | //// / | 6             | 8            | 48
• Midpoint is the midpoint of that interval; calculated as on the table above
• fX = frequency multiplied by midpoint
Analysing the grouped data
– We can calculate:
  • Actual mean (x̄) = sum of values ÷ number of values
  • Estimated mean (X̄) = sum of ‘fX’ values ÷ number of values (the number of values is the sum of the frequencies)
– We can draw a graph using the data:
  • e.g. a histogram with ‘classes’ on the x-axis and ‘frequency’ on the y-axis
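The estimated mean can be worked out step by step, as in the short Python sketch below. It uses the two classes from the example grouped data table (1-5 with frequency 3, and 6-10 with frequency 6); the variable names are chosen for this example only.

```python
# (lower limit, upper limit, frequency) taken from the example table
classes = [(1, 5, 3), (6, 10, 6)]

total_f = 0    # number of values = sum of the frequencies
total_fx = 0   # sum of the fX column

for low, high, f in classes:
    midpoint = (low + high) / 2   # X, the midpoint of the class
    total_f += f
    total_fx += f * midpoint      # fX = frequency x midpoint

estimated_mean = total_fx / total_f
print("Sum of fX:", total_fx)
print("Number of values:", total_f)
print("Estimated mean:", round(estimated_mean, 2))
```

The result is only an estimate because every value in a class is treated as if it were equal to the class midpoint.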
…
– We can find both a mode and modal class:
  • Mode: value that appears most
  • Modal class: class (interval) with highest frequency
– We can estimate the median from a histogram:
  • By estimating the value at which the ‘area’ of the histogram is divided into two equal parts
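One way to make the ‘equal areas’ idea concrete is sketched below in Python. It assumes the bars all have the same width, so that area is proportional to frequency, and that the values are spread evenly inside each bar. The function name `estimate_median` and the histogram data are hypothetical.

```python
def estimate_median(classes):
    """Estimate the median from (lower boundary, upper boundary, frequency) bars.

    Assumes equal-width bars listed in order, with the values spread
    evenly within each bar.
    """
    total = sum(f for _, _, f in classes)
    half = total / 2              # half of the total 'area'
    cumulative = 0
    for low, high, f in classes:
        if f > 0 and cumulative + f >= half:
            # Fraction of this bar needed to reach the halfway point.
            fraction = (half - cumulative) / f
            return low + fraction * (high - low)
        cumulative += f
    raise ValueError("the histogram contains no data")

# Hypothetical histogram, invented for illustration only.
histogram = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

modal_class = max(histogram, key=lambda c: c[2])   # bar with the highest frequency
median_estimate = estimate_median(histogram)

print("Modal class:", modal_class[0], "to", modal_class[1])
print("Estimated median:", median_estimate)
```

Here half of the 10 values lie below the estimate: 2 in the first bar plus 3 of the 5 in the second bar, which places the median three fifths of the way through the second bar.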
Histograms and frequency polygons
• Histograms and frequency polygons are both ‘frequency graphs’
– The difference between them is that the histogram is made up of bars, whereas the frequency polygon is a line graph
– The ‘polygon’ is made from the lines of the graph and the horizontal axis
Drawing Frequency Polygons (2 methods)
• 1) Using the bars of a histogram
– Mark the midpoint of the top of each bar
– Join the points; including two points at zero on either side of the histogram
…
• 2) Without using a histogram:
– Plot the midpoint of each interval against the frequency
– Join the points; and add the two “zero” points on either side as with the histogram
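The second method can be sketched in Python as a list of points to plot. The function name `polygon_points` and the data are hypothetical. The sketch assumes equal-width classes and places the two “zero” points one class width to either side, at the midpoints of the empty neighbouring classes.

```python
def polygon_points(classes):
    """Return the (x, y) points of a frequency polygon.

    classes: (lower boundary, upper boundary, frequency) tuples,
    equal in width and listed in order.
    """
    width = classes[0][1] - classes[0][0]
    # Midpoint of each interval plotted against its frequency.
    points = [((low + high) / 2, f) for low, high, f in classes]
    first_mid = points[0][0]
    last_mid = points[-1][0]
    # Zero points on either side close the polygon on the horizontal axis.
    return [(first_mid - width, 0)] + points + [(last_mid + width, 0)]

# Hypothetical grouped data, invented for illustration only.
grouped = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]

points = polygon_points(grouped)
for x, y in points:
    print(x, y)
```

Joining these points in order with straight lines gives the frequency polygon.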
[Graph: vertical axis 0 to 8, horizontal axis 1 to 7]
Measures of Dispersion
• Tell us how the data is grouped around the “average”
• Is it closely grouped, or scattered widely?
• Measure of spread, scattering or dispersion of scores
Range
• Range = largest value – smallest value
– Has a few limitations in that it cannot be used for ‘grouped data’; and it doesn’t tell us anything about the distribution of the values between the largest and smallest
– For this reason we can also look at quartiles, deciles and/or percentiles
Quartiles, Percentiles and Deciles
• Quartiles: points that subdivide the data into quarters
• Deciles: points that subdivide the data into tenths
• Percentiles: points that subdivide the data into hundredths
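Python's standard `statistics.quantiles` function returns exactly these kinds of cut points, so it gives a quick way to see the three ideas side by side. The data below (the whole numbers 1 to 100) is hypothetical, and the function's default interpolation method may differ slightly from the method used in a particular textbook.

```python
from statistics import quantiles

# Hypothetical data, invented for illustration only.
data = list(range(1, 101))

quartile_points = quantiles(data, n=4)       # 3 points -> 4 quarters
decile_points = quantiles(data, n=10)        # 9 points -> 10 tenths
percentile_points = quantiles(data, n=100)   # 99 points -> 100 hundredths

print("Quartiles:", quartile_points)
print("Number of deciles:", len(decile_points))
print("Number of percentiles:", len(percentile_points))
```

Note that it takes one fewer cut point than the number of parts: three quartiles split the data into four quarters.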
Quartiles
• First/lower quartile (Q1): is one quarter of the way through the data set when ordered from lowest to highest
• Second quartile (Q2) = median
• Third/upper quartile (Q3): is three quarters of the way through the data set (in order)
• Interquartile range = third quartile – first quartile
• The interquartile range is a better measure of dispersion than the range as it is not affected by ‘extreme’ values
• It indicates how densely the data is spread around the median
• Semi-quartile range = (Q3 – Q1) ÷ 2
• It is half of the interquartile range
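The measures of dispersion above can be compared with a short Python sketch. It assumes one common school convention for quartiles: Q1 is the median of the lower half of the ordered data and Q3 is the median of the upper half, leaving out the middle value when the number of values is odd. Other conventions give slightly different answers. The function name `quartile_summary` and the data are hypothetical.

```python
from statistics import median

def quartile_summary(values):
    """Range, quartiles, interquartile range and half of it."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                               # values below the median position
    upper = data[half + 1:] if n % 2 else data[half:]  # values above it
    q1, q2, q3 = median(lower), median(data), median(upper)
    iqr = q3 - q1
    return {
        "range": data[-1] - data[0],
        "Q1": q1, "Q2": q2, "Q3": q3,
        "IQR": iqr,
        "semi_IQR": iqr / 2,
    }

# Hypothetical data, invented for illustration only.
plain = quartile_summary([1, 3, 5, 7, 9, 11, 13])
extreme = quartile_summary([1, 3, 5, 7, 9, 11, 100])  # largest value replaced by an extreme one

print(plain)
print(extreme)
```

Replacing the largest value with an extreme one changes the range from 12 to 99, while the interquartile range stays at 8.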