statisticsforbiologists colstons
![Page 1: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/1.jpg)
BIOLOGY
![Page 2: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/2.jpg)
Introduction
• Biological studies deal with organisms which show variety
• We cannot rely on a single measurement and so we must take a sample
• This sample of data must be summarised and analyzed to find out if it is reliable
![Page 3: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/3.jpg)
Summarising data
• MEAN Sum of samples ÷ sample size: $\bar{x} = \sum x \div n$
• MEDIAN Middle number in a list when arranged in rank order: 2, 5, 7, 7, 8, 23, 31 (median = 7)
• MODE The measurement which occurs most frequently: 2, 5, 7, 7, 8, 23, 31 (mode = 7)
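As a quick check of the three summaries, here is a minimal Python sketch using the standard-library `statistics` module on the example list from the slide. The variable names are illustrative only.

```python
import statistics

# Example list from the slide, already arranged in rank order
data = [2, 5, 7, 7, 8, 23, 31]

mean = statistics.mean(data)      # sum of samples ÷ sample size
median = statistics.median(data)  # middle value of the ranked list
mode = statistics.mode(data)      # value that occurs most frequently

print(mean, median, mode)
```

The mean is pulled well above the median here by the two large values (23 and 31), which is one reason to report more than one summary.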
![Page 4: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/4.jpg)
Distribution Curves
• A visual summary of data
• They can be produced as follows:
1. Collect data
2. Split results into equal size classes
3. Make a tally chart
4. Plot a histogram of frequency against size class
• Data can show normal distribution or skewed distribution
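The tally-chart steps above can be sketched in Python. The leaf lengths and the class width of 5 are hypothetical values chosen for illustration, and `tally_by_class` is a helper name invented for this example.

```python
from collections import Counter

def tally_by_class(values, class_width):
    """Count how many values fall into each equal-size class.

    Each class is labelled by its lower bound, e.g. 10 means 10-14
    when class_width is 5.
    """
    counts = Counter((v // class_width) * class_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical leaf lengths in mm (illustrative data, not from the slides)
lengths = [12, 14, 15, 17, 18, 18, 19, 21, 22, 22, 23, 24, 26, 27, 31]
width = 5

for lower, count in tally_by_class(lengths, width).items():
    # One '#' per tally mark; the row label is the size class
    print(f"{lower}-{lower + width - 1}: {'#' * count}")
```

Printed this way the tally already has the outline of the histogram: frequency against size class.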
![Page 5: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/5.jpg)
Distribution curves
• Normal distribution
• Symmetrical bell-shaped curve around the mean
• Use parametric tests to analyse data
![Page 6: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/6.jpg)
Distribution curves
• Skewed data
• Asymmetrical curve around the mode
• Use non-parametric tests to analyse data
![Page 7: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/7.jpg)
Standard Deviation
• Standard deviation (SD) is a measure of the spread of the data
[Figure: two distribution curves, one labelled "Large SD" and one labelled "Small SD"]
![Page 8: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/8.jpg)
Standard deviation
• A high SD indicates data which shows great variation from the mean
• A low SD indicates data which shows little variation from the mean value
• For normally distributed data, about 68% of all values lie within the range mean ± 1 SD
• About 95% of all values lie within mean ± 2 SD
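The 68% and 95% figures are properties of the normal distribution and can be checked numerically. This is a small sketch using the error function from Python's `math` module; `fraction_within` is a name made up for this example.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within mean ± k SD."""
    return math.erf(k / math.sqrt(2))

print(round(fraction_within(1), 3))  # roughly 0.68
print(round(fraction_within(2), 3))  # roughly 0.95
```

The exact 95% range is closer to mean ± 1.96 SD. Using 2 SD is the usual rounding.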
![Page 9: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/9.jpg)
SD and confidence limits
[Figure: normal distribution curve with the ranges containing 68% and 95% of values marked]
![Page 10: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/10.jpg)
Calculating SD
• Can only be used for normally distributed data
• Calculate as follows:
– Sum the values for x², i.e. $\sum x^2$
– Sum the values for x, then square it, i.e. $(\sum x)^2$
– Divide $(\sum x)^2$ by n
– Take one from the other and divide by n
– Take the square root of this (see hand-out)
![Page 11: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/11.jpg)
Calculating SD
$$s = \sqrt{\dfrac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n}}$$
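The calculation can be followed step by step in Python. This sketch divides by n, as the slides do; many textbooks and calculators divide by n − 1 for a sample instead. The data values are hypothetical.

```python
import math
import statistics

def standard_deviation(values):
    """SD using the slide's steps: sqrt((Σx² - (Σx)²/n) / n)."""
    n = len(values)
    sum_of_squares = sum(x * x for x in values)  # Σx²
    square_of_sum = sum(values) ** 2             # (Σx)²
    return math.sqrt((sum_of_squares - square_of_sum / n) / n)

# Hypothetical measurements (illustrative only)
data = [4, 5, 6, 6, 7, 8]

sd = standard_deviation(data)
print(sd)
print(statistics.pstdev(data))  # library equivalent that also divides by n
```

`statistics.pstdev` gives the same result as the hand calculation. `statistics.stdev` is the n − 1 version and returns a slightly larger value.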
![Page 12: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/12.jpg)
Confidence limits
• 95% of all values lie within 2SD of the mean
• Any value which lies outside this range is said to be significantly different from the others
• We say that we are working to 95% confidence limits or to a 5% significance level.
![Page 13: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/13.jpg)
Comparison tests
• To compare two samples of data we look at the overlap between the two distribution curves.
• This depends on:
– The distance between the two mean values
– The spread of each sample (standard deviation)
• The greater the overlap, the more similar the two samples are.
![Page 14: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/14.jpg)
Comparison tests
[Figure: two overlapping distribution curves labelled Sample 1 and Sample 2, with the mean of each and the region of overlap marked]
![Page 15: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/15.jpg)
Comparison tests
When the SD is small, the overlap is less:
[Figure: two narrower distribution curves labelled Sample 1 and Sample 2, with a smaller region of overlap]
![Page 16: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/16.jpg)
The null hypothesis
• In order to compare two sets of data we must first assume that there is no difference between them.
• This is called the null hypothesis
• We must also produce an alternative hypothesis which states that there is a difference.
![Page 17: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/17.jpg)
The t-test
• Used to compare the overlap of two sets of data
• Samples must show normal distribution
• Sample size (n) should be greater than 30
• This tests for differences between two sets of data
![Page 18: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/18.jpg)
The t-test
• To calculate t:
– Check data is normally distributed by drawing a tally chart
– Work out the difference in means $|\bar{x}_1 - \bar{x}_2|$
– Calculate variance for each set of data (this is $s^2 \div n$)
– Put these into the equation for t:
![Page 19: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/19.jpg)
The t-test
$$t = \dfrac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
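A minimal Python sketch of this equation follows. It assumes that s² is computed by dividing by n, matching the SD steps given earlier in these slides; swapping in `statistics.variance` would give the n − 1 version used by many textbooks. The two samples are hypothetical.

```python
import math
import statistics

def t_statistic(sample1, sample2):
    """t = |mean1 - mean2| / sqrt(s1²/n1 + s2²/n2)."""
    difference = abs(statistics.mean(sample1) - statistics.mean(sample2))
    # pvariance divides by n (assumption: same convention as the SD slide)
    term1 = statistics.pvariance(sample1) / len(sample1)
    term2 = statistics.pvariance(sample2) / len(sample2)
    return difference / math.sqrt(term1 + term2)

# Hypothetical samples, kept tiny so the arithmetic is easy to follow
sample1 = [1, 2, 3, 4, 5]
sample2 = [3, 4, 5, 6, 7]

t = t_statistic(sample1, sample2)
print(t)
```

These samples are far smaller than the slides recommend for a t-test; they are only there to make the steps visible.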
![Page 20: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/20.jpg)
The t-test
• Compare the value of t with the critical value at n1 + n2 – 2 degrees of freedom
• Use a probability value of 5%
• If t is greater than the critical value we can reject the null hypothesis…
• … there is a significant difference between the two sets of data
• … there is less than a 5% probability that a difference this large is due to chance
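The comparison step might look like this in Python. The critical value is passed in by hand, as it would be when reading from a printed table; 2.306 is the usual two-tailed 5% entry for 8 degrees of freedom. The function names and the two t values are illustrative.

```python
def degrees_of_freedom(n1, n2):
    """Degrees of freedom for the two-sample t-test: n1 + n2 - 2."""
    return n1 + n2 - 2

def reject_null(t, critical_value):
    """Reject the null hypothesis only if t exceeds the critical value."""
    return t > critical_value

df = degrees_of_freedom(5, 5)
critical_value = 2.306  # looked up in a t table for df = 8 at the 5% level

print(df)
print(reject_null(2.236, critical_value))  # hypothetical t below the critical value
print(reject_null(3.0, critical_value))    # hypothetical t above the critical value
```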
![Page 21: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/21.jpg)
Mann-Whitney U test
• Compares two sets of data
• Data can be skewed
• Sample size can be small; 5<n<30
• For details refer to stats book
![Page 22: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/22.jpg)
Chi squared
• Some data is categoric
• This means that it belongs to one or more categories
• Examples include
– eye colour
– presence or absence data
– texture of seeds
• For these we use a chi squared test ($\chi^2$)
• This tests for an association between two or more variables
![Page 23: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/23.jpg)
Chi squared
• Draw a contingency table
• These are the observed values
| | Blue eyes | Green eyes | Row totals |
|---|---|---|---|
| Fair hair | a | b | a+b |
| Ginger hair | c | d | c+d |
| Column totals | a+c | b+d | a+b+c+d |
![Page 24: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/24.jpg)
Chi squared
• Now work out the expected values:
• Where,
$$E = \dfrac{(\text{Row total}) \times (\text{Column total})}{\text{Grand total}}$$
![Page 25: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/25.jpg)
Chi squared
| | Blue eyes | Green eyes | Row totals |
|---|---|---|---|
| Fair hair | (a+b)(a+c) ÷ (a+b+c+d) | (a+b)(b+d) ÷ (a+b+c+d) | a+b |
| Ginger hair | (c+d)(a+c) ÷ (a+b+c+d) | (c+d)(b+d) ÷ (a+b+c+d) | c+d |
| Column totals | a+c | b+d | a+b+c+d |
![Page 26: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/26.jpg)
Chi squared
• For each box work out $(O-E)^2 \div E$
• Find the sum of these to get $\chi^2$
$$\chi^2 = \sum \dfrac{(O-E)^2}{E}$$
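Putting the expected-value and summing steps together, a short Python sketch for a contingency table of any size could look like this. The observed counts are hypothetical and `chi_squared` is a name chosen for this example.

```python
def chi_squared(observed):
    """Sum of (O - E)² / E over every box of a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected value: (row total × column total) ÷ grand total
            e = row_totals[i] * column_totals[j] / grand_total
            total += (o - e) ** 2 / e
    return total

# Hypothetical counts: rows are hair colour, columns are eye colour
observed = [[10, 20],
            [30, 40]]

print(chi_squared(observed))
```

For this table the expected values are 12, 18, 28 and 42, and each box differs from its expected value by 2.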
![Page 27: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/27.jpg)
Chi squared
• Compare $\chi^2$ with the critical value at the 5% significance level
• There will be (no. rows – 1) × (no. columns – 1) degrees of freedom
• If $\chi^2$ is greater than the critical value we can say that the variables are associated with one another in some way
• We reject the null hypothesis
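As a small illustration of the degrees-of-freedom rule and the comparison, here is a Python sketch. The value 3.841 is the usual table entry for 1 degree of freedom at the 5% level; the χ² value is hypothetical.

```python
def chi_squared_df(n_rows, n_columns):
    """Degrees of freedom: (rows - 1) × (columns - 1)."""
    return (n_rows - 1) * (n_columns - 1)

df = chi_squared_df(2, 2)   # a 2 × 2 table such as hair colour by eye colour
critical_value = 3.841      # looked up in a chi squared table for df = 1 at 5%
chi2 = 0.794                # hypothetical calculated value

# Reject the null hypothesis only if chi2 exceeds the critical value
associated = chi2 > critical_value
print(df, associated)
```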
![Page 28: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/28.jpg)
Spearman Rank
• Two sets of data may show a correlation
• The data can be plotted on a scatter graph:
[Figure: three scatter graphs showing positive correlation, no correlation and negative correlation]
![Page 29: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/29.jpg)
Spearman Rank
• We calculate the correlation by assigning a rank to the values:
| Data 1 | Rank |
|---|---|
| 12 | 1 |
| 14 | 2 |
| 18 | 3.5 |
| 18 | 3.5 |

| Data 2 | Rank |
|---|---|
| 24 | 1 |
| 29 | 2.5 |
| 29 | 2.5 |
| 38 | 4 |

• In Data 1, 12 is the lowest value, so we call it rank 1
• 14 is the 2nd lowest value, so we call it rank 2
• The two values of 18 should be rank 3 and 4, but they are the same. We find the average of 3 and 4 and give them both this rank: (3+4)/2 = 3.5
• Similarly for Data 2: 24 is rank 1, the two values of 29 take the average of ranks 2 and 3 (2.5), and 38 is rank 4
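The ranking rule, including the averaging of tied values, can be written as a short Python function. `average_ranks` is a hypothetical helper name; the two lists are the ones from the slides.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1), giving tied values their average rank."""
    ordered = sorted(values)
    rank_of = {}
    for value in set(ordered):
        first_rank = ordered.index(value) + 1  # rank of the first occurrence
        tie_count = ordered.count(value)       # how many times the value occurs
        # Average of the consecutive ranks the tied values would occupy
        rank_of[value] = first_rank + (tie_count - 1) / 2
    return [rank_of[value] for value in values]

print(average_ranks([12, 14, 18, 18]))
print(average_ranks([24, 29, 29, 38]))
```

Ranks are returned in the same order as the input, so each rank stays next to its original measurement.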
![Page 38: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/38.jpg)
Spearman Rank
• Find the difference D between each pair of ranks
• Square this difference
• Sum the $D^2$ values
• Calculate the Spearman Rank Correlation Coefficient $r_s$
$$r_s = 1 - \dfrac{6 \sum D^2}{n(n^2 - 1)}$$
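Once each pair of measurements has its two ranks, the coefficient is a few lines of Python. The paired ranks below are hypothetical, and the function name is illustrative. Note that this shortcut formula is exact only when there are no tied ranks; with ties it is an approximation.

```python
def spearman_from_ranks(ranks1, ranks2):
    """r_s = 1 - 6ΣD² / (n(n² - 1)) for two equal-length lists of paired ranks."""
    n = len(ranks1)
    # D is the difference between the two ranks of each pair
    sum_d_squared = sum((r1 - r2) ** 2 for r1, r2 in zip(ranks1, ranks2))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))

# Hypothetical paired ranks for five individuals
ranks1 = [1, 2, 3, 4, 5]
ranks2 = [2, 1, 4, 3, 5]

rs = spearman_from_ranks(ranks1, ranks2)
print(rs)
```

Here ΣD² is 4, so the coefficient comes out positive and fairly close to 1.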
![Page 39: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/39.jpg)
Spearman Rank
• Compare rs with the critical value at the 5% level
• If it is greater than the critical value (ignoring the sign) then we reject the null hypothesis
• … there is a significant correlation between the two sets of data
• If the value is positive there is a positive correlation
• If it is negative then there is a negative correlation
![Page 40: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/40.jpg)
Quick guide
Is your data interval data or is it categoric data (it can only be placed in a number of categories)?
Interval / Categoric
![Page 41: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/41.jpg)
Quick guide
Are you looking for a correlation between two sets of data, e.g. the rate of photosynthesis and light intensity?
Yes / No
![Page 42: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/42.jpg)
Quick guide
Use the Chi squared test
![Page 43: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/43.jpg)
Quick guide
Use the Spearman Rank test
![Page 44: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/44.jpg)
Quick guide
Are you comparing data from two populations?
Yes / No
![Page 45: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/45.jpg)
Quick guide
Is your data normally distributed?
Yes / No
![Page 46: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/46.jpg)
Quick guide
Use a t-test
![Page 47: Statisticsforbiologists colstons](https://reader036.vdocument.in/reader036/viewer/2022081404/558520e9d8b42a4c128b4add/html5/thumbnails/47.jpg)
Quick guide
Use a Mann-Whitney U test
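The quick guide can be read as a small decision tree. The sketch below is one reading of it: the links between answers and slides are inferred from the order of the slides rather than stated in them, and the guide gives no recommendation when the answer to "comparing data from two populations?" is No. The function and parameter names are made up for this example.

```python
def choose_test(categoric, looking_for_correlation=False,
                comparing_two_populations=False, normally_distributed=False):
    """Suggest a test by following the quick-guide questions in order."""
    if categoric:
        return "Chi squared test"
    if looking_for_correlation:
        return "Spearman Rank test"
    if not comparing_two_populations:
        return None  # the guide does not cover this branch
    if normally_distributed:
        return "t-test"
    return "Mann-Whitney U test"

# Interval data, comparing two populations, skewed distribution
print(choose_test(categoric=False, comparing_two_populations=True,
                  normally_distributed=False))
```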