stats exam questions!
8/10/2019 STATS Exam Questions!
Answering Exam Questions on Statistics
Examination questions may require the calculation and interpretation of statistical measures and tests. This Factsheet discusses
strategies for approaching such questions and gives guidance on common mistakes to avoid. Factsheets 79 and 85 cover the chi-squared
test and t-test specifically. A later Factsheet will cover diagrams and their interpretation.
Bio Factsheet
April 2003 Number 122
Calculator Tip:- To do this sum on your calculator, you need to put brackets around all of the top and all of the bottom, like this: (60 + 143 + 160 + 152 + 88) ÷ (6 + 11 + 10 + 8 + 4)
What can they ask you?
Exactly what is examinable depends on the specification you are studying,
but there are three main categories:
• basic statistical calculations and their interpretation
• chi-squared test
• t-test
Basic statistical calculations and their interpretation
All specifications require you to calculate the mean; some also require the
standard deviation. You need to remember the formula for the mean, but
will be given it for the standard deviation.
Calculating the mean
a) For a list of numbers, just add them all up and divide by how many there
are.
b) For a table of grouped data, follow this procedure:
Step 1. Find out the midpoint of each class, by adding its endpoints
and dividing by two. Add it to the table. Call this column "x".
Step 2. Add another column, and put in it the values of
x × number of individuals (f).
Step 3. $\text{mean} = \dfrac{\text{total of "x f" column}}{\text{total of "f" column}}$

eg: Find the mean of the following data

Length          Number of          x                       x × f
(nearest cm)    individuals (f)
9 - 11          6                  (9 + 11) ÷ 2 = 10       60
12 - 14         11                 (12 + 14) ÷ 2 = 13      143
15 - 17         10                 (15 + 17) ÷ 2 = 16      160
18 - 20         8                  (18 + 20) ÷ 2 = 19      152
21 - 23         4                  (21 + 23) ÷ 2 = 22      88

$\text{mean} = \dfrac{60 + 143 + 160 + 152 + 88}{6 + 11 + 10 + 8 + 4} = \dfrac{603}{39} = 15.46$
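The grouped-data procedure above can be sketched in Python. This is only an illustration and not part of the Factsheet: the function name `grouped_mean` and the choice to store the table as two lists are assumptions made for the example.

```python
# Class endpoints (nearest cm) and frequencies from the worked example.
classes = [(9, 11), (12, 14), (15, 17), (18, 20), (21, 23)]
freqs = [6, 11, 10, 8, 4]


def grouped_mean(classes, freqs):
    """Mean of grouped data: total of the "x f" column over total of the "f" column."""
    # Step 1: midpoint of each class (column "x").
    midpoints = [(low + high) / 2 for low, high in classes]
    # Step 2: x multiplied by f for each class, then totalled.
    total_xf = sum(x * f for x, f in zip(midpoints, freqs))
    # Step 3: divide by the total number of individuals.
    return total_xf / sum(freqs)


mean = grouped_mean(classes, freqs)
print(round(mean, 2))  # 15.46
```

The unrounded value is 603 ÷ 39 = 15.4615..., which is worth keeping if the mean is going to be used in a later calculation.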
Calculating the standard deviation
The formula for this that you will be given is:
$\text{standard deviation} = \sqrt{\dfrac{\sum x^2}{n} - \text{mean}^2}$
$\sum$ means "sum of", so $\sum x^2$ means "square each value then add them up"
a) For a list of numbers:
i) Square each number and add up the squares (this gives $\sum x^2$)
ii) Divide your answer to i) by how many numbers there are
(this gives $\sum x^2 / n$)
iii) Find the mean and square it.
iv) Take the answer to iii) away from the answer to ii)
(this gives everything inside the square root)
v) Square root the answer to iv) (this gives the standard
deviation)
eg: Find the standard deviation of 2, 5, 6, 7, 8
i) $\sum x^2 = 2^2 + 5^2 + 6^2 + 7^2 + 8^2 = 178$
ii) $\sum x^2 / n = 178 \div 5 = 35.6$
iii) Mean = (2 + 5 + 6 + 7 + 8) ÷ 5 = 5.6, so $\text{mean}^2 = 5.6^2 = 31.36$
iv) $\sum x^2 / n - \text{mean}^2 = 35.6 - 31.36 = 4.24$
v) standard deviation = $\sqrt{4.24} = 2.0591$
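Steps i) to v) can be checked with a short Python sketch. The function name `std_dev` is made up for this illustration; it uses the "divide by n" formula quoted in this Factsheet.

```python
import math


def std_dev(values):
    """Standard deviation as sqrt(sum of x squared / n - mean squared)."""
    n = len(values)
    mean_of_squares = sum(v * v for v in values) / n   # steps i) and ii)
    mean = sum(values) / n                             # step iii)
    inside_root = mean_of_squares - mean ** 2          # step iv)
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(0.0, inside_root))            # step v)


print(round(std_dev([2, 5, 6, 7, 8]), 4))  # 2.0591
```

Python's standard library function `statistics.pstdev` uses the same "divide by n" definition, so it should agree with this result.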
b) For a table of grouped data
i) Complete the columns "x" and "x × f" as for finding the mean
ii) Add another column, which is x² × f
iii) Find the total of the "x² × f" column (this gives $\sum x^2$, with each midpoint counted f times)
iv) Divide your answer to iii) by the total of the "f" column
(this gives $\sum x^2 / n$)
v) Find the mean, as described above, and square it
vi) Take the answer to v) away from the answer to iv)
(this gives everything inside the square root)
vii) Square root the answer to vi) (this gives the standard deviation)
eg. Find the standard deviation of the following data

Length          Number of          x      x × f    x² × f
(nearest cm)    individuals (f)
9 - 11          6                  10     60       600
12 - 14         11                 13     143      1859
15 - 17         10                 16     160      2560
18 - 20         8                  19     152      2888
21 - 23         4                  22     88       1936

iii) $\sum x^2 = 600 + 1859 + 2560 + 2888 + 1936 = 9843$
iv) Total of f column = 6 + 11 + 10 + 8 + 4 = 39
$\sum x^2 / n = 9843 \div 39 = 252.3846$
v) $\text{mean}^2 = 15.46^2 = 239.0116$
vi) $\sum x^2 / n - \text{mean}^2 = 252.3846 - 239.0116 = 13.3730$
vii) Standard deviation = $\sqrt{13.3730} = 3.657$
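The grouped standard deviation can be sketched in Python as follows. The function name `grouped_std_dev` and its optional `mean` argument are assumptions made for this illustration. The sketch runs the calculation twice: once with the unrounded mean, and once with the mean rounded to 15.46 as in the worked example, to show how much early rounding moves the answer.

```python
import math

midpoints = [10, 13, 16, 19, 22]   # column "x"
freqs = [6, 11, 10, 8, 4]          # column "f"


def grouped_std_dev(midpoints, freqs, mean=None):
    """Standard deviation of grouped data using the divide-by-n formula."""
    n = sum(freqs)
    if mean is None:
        # Unrounded mean: total of "x f" column over total of "f" column.
        mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    # Total of the x squared times f column, divided by n.
    mean_of_squares = sum(x * x * f for x, f in zip(midpoints, freqs)) / n
    return math.sqrt(max(0.0, mean_of_squares - mean ** 2))


sd_exact = grouped_std_dev(midpoints, freqs)
sd_rounded = grouped_std_dev(midpoints, freqs, mean=15.46)
print(f"{sd_exact:.3f}")    # 3.650
print(f"{sd_rounded:.3f}")  # 3.657
```

The two results differ in the third significant figure, which is why it helps to keep all the figures in the calculator until the final step.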
Calculator Tip:- Most scientific or graphical calculators will allow you to calculate mean and standard deviation automatically. This can save a lot of time! However, not all calculators do it in the same way, so you need to consult your calculator instruction book and practise well in advance of the exam.
One of the commonest mistakes candidates make when using the calculator is not to clear all the data before starting a new calculation. You can usually do this on a scientific calculator by going into the statistics mode and then pressing SHIFT or 2ND and "AC". To check it works, press the button that you would normally use to get the mean - if it gives you a number, you haven't cleared the data properly!
www.curriculumpress.co.uk
Mean of grouped data = (total of "x f" column) ÷ (total of "f" column)
Interpreting the mean and standard deviation
The mean, of course, is the average - but that does not mean half the values
are below and half above it, or that it is a common value. For example, the
mean of the values 1, 1, 2, 3, 100 is 21.4; this is nowhere near any of the actual
values, and four out of the five values are below it!
The mean also does not distinguish between these two data sets:-
A: 48, 49, 50, 51, 52
B: 35, 40, 50, 62, 63
Both sets of data have mean 50, but they are not very similar.
This is where the standard deviation comes in. This measures how spread
out the data are - the bigger the standard deviation, the greater the spread.
For example, for data set A above, the standard deviation is 1.414, and for
set B, it is 11.296.
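The figures quoted for data sets A and B can be reproduced with Python's standard `statistics` module. This is a minimal check, using `pstdev` because it divides by n, as the formula in this Factsheet does; the variable names are chosen for the example.

```python
import statistics

set_a = [48, 49, 50, 51, 52]
set_b = [35, 40, 50, 62, 63]

for name, data in (("A", set_a), ("B", set_b)):
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # "divide by n" standard deviation
    print(f"{name}: mean = {mean:.1f}, standard deviation = {sd:.3f}")
# A: mean = 50.0, standard deviation = 1.414
# B: mean = 50.0, standard deviation = 11.296
```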
So, for example, if you know the following:
Data set 1: mean = 45.2 standard deviation = 2.13
Data set 2: mean = 43.7 standard deviation = 10.03
We know that data set 2 is more spread out than data set 1. Let's consider
which would be more likely to have a value in it above 50, say.
For data set 1, 50 is more than 2 standard deviations away from the mean
(45.2 + 2 × 2.13 = 49.46).
For data set 2, 50 is less than 1 standard deviation away from the mean
(43.7 + 10.03 = 53.73).
This tells us that 50 is a less "extreme" or "uncommon" value for data set
2 than for data set 1. So data set 2 is more likely to have values above 50.
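The comparison above amounts to asking how many standard deviations the value 50 lies from each mean. A small Python sketch makes that explicit; the helper name `sds_from_mean` is hypothetical.

```python
def sds_from_mean(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd


print(f"{sds_from_mean(50, 45.2, 2.13):.2f}")   # 2.25
print(f"{sds_from_mean(50, 43.7, 10.03):.2f}")  # 0.63
```

A larger result means the value is more extreme for that data set, so 50 is more unusual for data set 1 than for data set 2.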
Statistical tests
In the exam, you will always be told which statistical test to use if you are
being required to do calculations. You will be given any tables you need.
There are various types of questions:-
• understanding statistical terms like degrees of freedom, significance, etc
• interpreting results and drawing conclusions
• doing the calculations according to the test formula
• finding degrees of freedom
• using statistical tables
Some of these are the same for both t-test and chi-squared; others are specific
to the test.
Understanding statistical terms
Hypotheses: the purpose of a statistical test is to decide between the null
hypothesis and the alternative hypothesis. The exact form of these
hypotheses depends on the test. When you are carrying out the test, you
accept the null hypothesis, unless you have convincing evidence otherwise
(in a court of law, the "null hypothesis" is that the person is innocent - he
is only decided to be guilty if there is enough evidence).
Test statistic: this is the value calculated from your data. The formula for
it depends on the test you are doing.
Critical value: this is the value you compare the test statistic to, to decide
whether you are going to accept or reject the null hypothesis.
For both t-test and chi-squared test, you reject the null hypothesis if your
test statistic is greater than the critical value.
Critical values come from statistical tables.
Significance level: It is possible to reject the null hypothesis even if it is
true, because "unusual" results can occur by chance (eg it is possible -
although unlikely - to get 100 heads in succession when tossing a coin).
The significance level is the chance of rejecting the null hypothesis when it
is true. Significance levels may be written as percentages (10%, 5%, 1%) or as decimals
(0.1, 0.05, 0.01).
The normal significance level in science is 5%. Use this unless you
are told otherwise.
Degrees of freedom: you do not need to know the exact meaning, although
you do need to know how to calculate them (see below). The idea is that
the amount of data you have affects the critical value - this is because you
are much more likely to get unusual results by chance if you only have a few
observations, than if you have a lot of observations.
Interpreting results and drawing conclusions
You must remember that if the value you calculate (the test statistic) is
greater than the value from the tables (the critical value), then you reject
the null hypothesis. Otherwise you accept it.
You then need to relate this back to the original hypotheses; this will be
discussed in more detail for each test.
Choose your words carefully - a statistical test does not "prove" a
hypothesis is true - there is always a chance that a wrong decision could be
made. It is normal to say "the result is significant at the 5% level" or "the
alternative hypothesis was accepted at the 5% level".
The remainder of the section is divided between the chi-squared test and the
t-test.
Chi-squared test
There are two main types of chi-squared test you may have to do:
a) Testing to see if there is a difference
b) Testing to see if the theoretical ratios predicted by genetics apply
The hypotheses for the tests are
a) H0: there is no difference between the different conditions
H1: there is a difference between the different conditions
b) H0: the observations are in accordance with the predictions of genetics
H1: the observations are not in accordance with the predictions of
genetics
Calculations for the test formula
In chi-squared, you will need to calculate expected frequencies, and then
the value of chi-squared, using the formula:
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
a) To calculate expected values when you are testing for a difference, you
just add up all the values and divide by the number of them.
b) To calculate expected values for genetics, you have to use the genetic
ratio. The procedure is:
i) Add up all the values from the data you are given
ii) Add up all the numbers in the genetic ratio
(eg for 9:3:3:1, do 9 + 3 + 3 + 1 = 16)
This tells you the number of parts you will be dividing your total
from i) into.
iii) Find out how much one part is, by dividing your total from i) by your
total from ii)
iv) Find out the expected frequencies, by multiplying one part by the
numbers in the ratio (eg by 9, 3, 3 and 1)
Once you have calculated the expected frequencies, you substitute into the
formula above to find the chi-squared value.
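Both ways of finding expected frequencies, and the chi-squared calculation itself, can be sketched in Python. The observed counts below are invented purely for illustration, and the function names are hypothetical. The sketch assumes the usual chi-squared formula, the sum of (O − E)² ÷ E over all categories.

```python
def expected_equal(observed):
    """Testing for a difference: every category gets total / number of categories."""
    total = sum(observed)
    return [total / len(observed)] * len(observed)


def expected_from_ratio(observed, ratio):
    """Genetics: split the total into parts according to the ratio."""
    one_part = sum(observed) / sum(ratio)      # steps i) to iii)
    return [one_part * r for r in ratio]       # step iv)


def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [90, 30, 28, 12]  # hypothetical counts, not taken from the Factsheet
expected = expected_from_ratio(observed, [9, 3, 3, 1])
print(expected)                                    # [90.0, 30.0, 30.0, 10.0]
print(round(chi_squared(observed, expected), 3))   # 0.533
```

For a test of difference on the same counts, `expected_equal(observed)` would give 40.0 for each of the four categories.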
Finding degrees of freedom
You need to learn this formula:
For chi-squared:
degrees of freedom = number of categories - 1
In the chi-squared formula, $\sum \dfrac{(O - E)^2}{E}$:
O is observed values - the data from the question
E is expected values - the ones you calculate
$\sum$ means "sum of"
Acknowledgments: This Factsheet was researched and written by Cath Brown.
Curriculum Press, Unit 305B The Big Peg, 120 Vyse Street, Birmingham B18 6NF
Bio Factsheets may be copied free of charge by teaching staff or students,
provided that their school is a registered subscriber.
No part of these Factsheets may be reproduced, stored in a retrieval system, or
transmitted, in any other form or by any other means, without the prior
permission of the publisher.
ISSN 1351-5136
Using statistical tables
All you have to do is to read down to find the number of degrees of freedom
you have, and across to find the significance level (usually 5% = 0.05).
t-test
There are two types of t-test, paired and unpaired. The exam will always
make it clear which you should do. You will always be given the relevant
formulae.
The hypotheses for both tests are
H0: mean 1 = mean 2
H1: mean 1 ≠ mean 2
(This is a 2-tailed test - you may also come across 1-tailed tests, but in the
exam you will never have to choose between the two)
Calculations for the test formula
The calculations for either type of t-test are similar to those for finding
means and standard deviations. You also need to be able to substitute into
a formula. Provided you can do calculations like the ones on page 1, you will
not have a problem with these. Remember, you will be given any formulae
you require.
The paired t-test first requires you to find the differences between each pair
of values. You then work with these differences only.
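paired t-test: $t = \dfrac{\bar{x}\sqrt{n - 1}}{s}$
In the unpaired t-test, you will need to use these formulae:
$s = \sqrt{\dfrac{\sum x_1^2 - n_1\bar{x}_1^2 + \sum x_2^2 - n_2\bar{x}_2^2}{n_1 + n_2 - 2}}$
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$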
Exam questions will get you to do these calculations bit by bit and "follow
through" marks are likely to be awarded - so if you calculate s wrong, for
example, but use your value correctly to calculate the value of t, then you
will get the rest of the marks.
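Both t-test calculations can be sketched in Python. This illustration rests on two assumptions. First, the formulae are read from this Factsheet's formula boxes as t = x̄√(n − 1) ÷ s for the paired test (with s the "divide by n" standard deviation of the differences), and, for the unpaired test, s = √((Σx₁² − n₁x̄₁² + Σx₂² − n₂x̄₂²) ÷ (n₁ + n₂ − 2)) and t = (x̄₁ − x̄₂) ÷ (s√(1/n₁ + 1/n₂)). Second, all the data values below are invented. The function names are hypothetical too.

```python
import math


def paired_t(first, second):
    """Paired t-test: work only with the differences between each pair."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Standard deviation of the differences, divide-by-n formula.
    s = math.sqrt(sum(d * d for d in diffs) / n - mean_d ** 2)
    return mean_d * math.sqrt(n - 1) / s


def unpaired_t(sample1, sample2):
    """Unpaired t-test using the pooled value s from both samples."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ss1 = sum(x * x for x in sample1) - n1 * mean1 ** 2
    ss2 = sum(x * x for x in sample2) - n2 * mean2 ** 2
    s = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    return (mean1 - mean2) / (s * math.sqrt(1 / n1 + 1 / n2))


# Hypothetical data: the paired differences are 2, 4, 3, 5, 1.
after = [12, 15, 11, 16, 10]
before = [10, 11, 8, 11, 9]
print(f"{paired_t(after, before):.3f}")               # 4.243
print(f"{unpaired_t([5, 7, 9], [2, 4, 6, 8]):.3f}")   # 1.107
```

Splitting the unpaired calculation into s and then t mirrors the "bit by bit" structure of exam questions. Note that `paired_t` would divide by zero if every difference were identical.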
chi-squared tables
df    0.10    0.05    0.025    0.01     0.005
1     2.71    3.84    5.02     6.63     7.88
2     4.61    5.99    7.38     9.21     10.60
3     6.25    7.81    9.35     11.34    12.84
4     7.78    9.49    11.14    13.28    14.86
For a chi-squared test with 1 degree of freedom at a significance level of 5%, the critical (tables)
value is 3.84
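Reading the table and drawing the conclusion can be sketched as a lookup in Python. Only the 5% column of the table above is stored here; the dictionary and function names are assumptions made for the example, and the two test statistics are invented.

```python
# 5% (0.05) column of the chi-squared table, keyed by degrees of freedom.
CHI2_CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def chi_squared_decision(statistic, n_categories):
    """Return (degrees of freedom, critical value, reject null hypothesis?)."""
    df = n_categories - 1
    critical = CHI2_CRITICAL_5_PERCENT[df]
    # Reject the null hypothesis only if the statistic exceeds the tables value.
    return df, critical, statistic > critical


print(chi_squared_decision(0.53, 4))  # (3, 7.81, False)
print(chi_squared_decision(8.20, 4))  # (3, 7.81, True)
```

In the first case the null hypothesis is accepted at the 5% level; in the second the result is significant at the 5% level.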
Common mistakes
These are some of the commonest errors candidates make:-
• Rounding errors, due to rounding too early. If in doubt, use all the figures.
It is useful to keep figures in your calculator, to avoid having to keep
writing down and re-entering data. Learn how to use your calculator
memory.
• Calculator errors - putting the correct figures into the calculator wrongly. See the calculator tips in this Factsheet and practise using
your calculator well before the exam.
• Failure to show working - hence throwing away all the marks if there is even one tiny error in calculation.
• Failure to recall the formulae for degrees of freedom - these have to be learnt. If you get them wrong, they will invalidate your tables
value and your conclusion.
• Not drawing conclusions correctly - you must learn that if your calculated value is larger than the tables value, you reject the null
hypothesis.
• Getting the hypotheses the wrong way round - if your calculated result is greater than the tables value, then:
  - for the t-test, there is a difference between the means
  - for testing for a difference in chi-squared, there is a difference
  - for genetics chi-squared, the results are not as predicted by genetics
In the paired t-test formula:
$\bar{x}$ is the mean of the differences
n is the number of pairs
s is the standard deviation of the differences
In the unpaired t-test formulae:
$\bar{x}_1$ and $\bar{x}_2$ are the means of the two samples
$n_1$ and $n_2$ are the sizes of the two samples
$\sum$ means "sum of"
t-table
For a t-test with 10 degrees of freedom at a significance level of 5%, the critical (tables)
value is 2.228
      Significance level
df    0.1      0.05     0.01
7     1.895    2.365    3.499
8     1.860    2.306    3.355
9     1.833    2.262    3.250
10    1.812    2.228    3.169
11    1.796    2.201    3.106
Calculator Tips:-
• To carry out any calculation that is set out as a fraction, you must put brackets round the top and round the bottom.
• It is probably easier to work out the number inside the square-root first, then take the square root, rather than trying to do it all in one go.
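The effect of leaving out the brackets can be seen with the sum from the grouped mean example. Python follows the same order of operations as a scientific calculator here, so this sketch shows what goes wrong; the variable names are chosen for the illustration.

```python
# With brackets round the top and the bottom: the intended fraction.
with_brackets = (60 + 143 + 160 + 152 + 88) / (6 + 11 + 10 + 8 + 4)

# Without brackets: only 88 is divided by 6, and everything else is added on.
without_brackets = 60 + 143 + 160 + 152 + 88 / 6 + 11 + 10 + 8 + 4

print(f"{with_brackets:.2f}")     # 15.46
print(f"{without_brackets:.2f}")  # 562.67
```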
Finding degrees of freedom
You need to learn these formulae:
For paired t-test:
degrees of freedom = number of pairs - 1
For unpaired t-test:
degrees of freedom = number in 1st sample + number in 2nd sample - 2
Using statistical tables
All you have to do is to read down to find the number of degrees of freedom
you have, and across to find the significance level (usually 5% = 0.05).
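The degrees-of-freedom formulae and the table lookup can be sketched in Python. Only the 5% column of the t-table in this Factsheet is stored, and the function names are hypothetical. Comparing the size of t (ignoring any minus sign) with the tables value is an assumption made here for the 2-tailed test, since the unpaired formula can give a negative t.

```python
# 5% (0.05) column of the t-table, keyed by degrees of freedom.
T_CRITICAL_5_PERCENT = {7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228, 11: 2.201}


def paired_df(n_pairs):
    """Paired t-test: number of pairs - 1."""
    return n_pairs - 1


def unpaired_df(n1, n2):
    """Unpaired t-test: number in 1st sample + number in 2nd sample - 2."""
    return n1 + n2 - 2


def is_significant(t, df):
    """True if the size of t exceeds the 5% tables value (assumed 2-tailed)."""
    return abs(t) > T_CRITICAL_5_PERCENT[df]


print(paired_df(11), T_CRITICAL_5_PERCENT[paired_df(11)])  # 10 2.228
print(unpaired_df(5, 6))                                   # 9
print(is_significant(-2.5, 9))                             # True
print(is_significant(1.9, 10))                             # False
```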
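t-test formulae
paired t-test: $t = \dfrac{\bar{x}\sqrt{n - 1}}{s}$
unpaired t-test: $s = \sqrt{\dfrac{\sum x_1^2 - n_1\bar{x}_1^2 + \sum x_2^2 - n_2\bar{x}_2^2}{n_1 + n_2 - 2}}$ and $t = \dfrac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$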