Analysing and Presenting Quantitative Data: Inferential Statistics
![Page 1: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/1.jpg)
Analysing and Presenting Quantitative Data:
Inferential Statistics
![Page 2: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/2.jpg)
Objectives
After this session you will be able to:
• Choose and apply the most appropriate statistical techniques for exploring relationships and trends in data (correlation and inferential statistics).
![Page 3: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/3.jpg)
Stages in hypothesis testing
• Hypothesis formulation.
• Specification of significance level (to see how safe it is to accept or reject the hypothesis).
• Identification of the probability distribution and definition of the region of rejection.
• Selection of appropriate statistical tests.
• Calculation of the test statistic and acceptance or rejection of the hypothesis.
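The five stages can be sketched in a few lines of Python. This is only an illustration, assuming a two-tailed one-sample z-test with a known population standard deviation; the figures (`mu0`, `sigma`, `n`, `sample_mean`) are hypothetical and do not come from the slides.

```python
from statistics import NormalDist

# Hypothetical figures, chosen only for illustration.
mu0 = 100.0          # population mean under the null hypothesis
sigma = 15.0         # population standard deviation, assumed known
n = 36               # sample size
sample_mean = 106.0  # observed sample mean

# Stage 1: H0 says the population mean equals mu0; H1 says it differs.
# Stage 2: choose the significance level.
alpha = 0.05

# Stage 3: standard normal distribution, two-tailed region of rejection.
critical = NormalDist().inv_cdf(1 - alpha / 2)

# Stages 4 and 5: compute the z statistic, its p-value and the decision.
z = (sample_mean - mu0) / (sigma / n ** 0.5)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = abs(z) > critical

print(f"z = {z:.2f}, critical = {critical:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Here z is 2.4, which lies beyond the critical value of about 1.96, so the null hypothesis would be rejected at the 0.05 level.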
![Page 4: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/4.jpg)
Hypothesis formulation
Hypotheses come in essentially three forms. Those that:
• Examine the characteristics of a single population (and may involve calculating the mean, median and standard deviation and the shape of the distribution).
• Explore contrasts and comparisons between groups.
• Examine associations and relationships between groups.
![Page 5: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/5.jpg)
Specification of significance level – potential errors
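• Significance level is not about importance – it is the probability, fixed in advance, of rejecting the null hypothesis when the result has in fact arisen by chance alone.
• Typical significance levels:
– p = 0.05 (a 5% chance of obtaining a result at least this extreme by chance alone if the null hypothesis is true)
– p = 0.01 (a 1% chance of obtaining a result at least this extreme by chance alone if the null hypothesis is true)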
![Page 6: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/6.jpg)
Identification of the probability distribution
![Page 7: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/7.jpg)
Selection of statistical tests – examples

| Research question | Independent variable | Dependent variable | Statistical test |
|---|---|---|---|
| Is stress counselling effective in reducing stress levels? | Nominal groups (experimental and control) | Attitude scores (stress levels) | Paired t-test |
| Do women prefer skin care products more than men? | Nominal (gender) | Attitude scores (product preference levels) | Mann-Whitney U (data not normally distributed) |
| Does gender influence choice of coach? | Nominal (gender) | Nominal (choice of coach) | Chi-square |
| Do two interviewers judge candidates the same? | Nominal | Rank order scores | Spearman’s rho (data not normally distributed) |
| Is there an association between rainfall and sales of face creams? | Rainfall (ratio data) | Ratio data (sales) | Pearson Product Moment (data normally distributed) |

![Page 8: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/8.jpg)
Nominal groups and quantifiable data (normally distributed)
To compare the performance/attitudes of two groups, or to compare the performance/attitudes of one group over a period of time using quantifiable variables such as scores.
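Use a paired t-test, which compares the means of the two sets of scores (for example, the same people measured at Time 1 and Time 2) to see whether the difference between them is significant.
Assumption: data are normally distributed (strictly, the paired differences).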
![Page 9: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/9.jpg)
Paired t-test data set
![Page 10: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/10.jpg)
Data outputs: test for normality

Case Processing Summary

| | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| StressTime1 | 92 | 98.9% | 1 | 1.1% | 93 | 100.0% |
| StressTime2 | 92 | 98.9% | 1 | 1.1% | 93 | 100.0% |

Tests of Normality

| | Kolmogorov-Smirnov(a) Statistic | df | Sig. | Shapiro-Wilk Statistic | df | Sig. |
|---|---|---|---|---|---|---|
| StressTime1 | .095 | 92 | .041 | .983 | 92 | .289 |
| StressTime2 | .096 | 92 | .034 | .985 | 92 | .363 |

a Lilliefors Significance Correction

![Page 11: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/11.jpg)
Data outputs: visual test for normality
![Page 12: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/12.jpg)
Statistical output

Paired Samples Test (Paired Differences)

| | Mean | Std. Deviation | Std. Error Mean | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| Pair 1: StressTime1 - StressTime2 | 1.60870 | 2.12239 | .22127 | 1.16916 | 2.04823 | 7.270 | 91 | .000 |

Paired Samples Statistics

| | Mean | N | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| Pair 1: StressTime1 | 10.3587 | 92 | 3.48807 | .36366 |
| Pair 1: StressTime2 | 8.7500 | 92 | 3.19555 | .33316 |

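As a check on how the paired t statistic is built, the short sketch below recomputes the standard error and t from the mean and standard deviation of the paired differences reported in the output. It uses only the summary figures, not the raw data set, and the variable names are illustrative.

```python
import math

# Values copied from the Paired Samples Test output.
mean_diff = 1.60870   # mean of (StressTime1 - StressTime2)
sd_diff = 2.12239     # standard deviation of the paired differences
n = 92                # number of pairs

se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se_diff       # paired t statistic
df = n - 1                         # degrees of freedom

print(f"SE = {se_diff:.5f}, t = {t_stat:.3f}, df = {df}")
# prints: SE = 0.22127, t = 7.270, df = 91
```

The recomputed values agree with the Std. Error Mean, t and df columns of the output.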
![Page 13: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/13.jpg)
Nominal groups and quantifiable data (not normally distributed)
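To compare the performance/attitudes of two independent groups using quantifiable variables such as scores.
Use Mann-Whitney U, which compares the ranks of the scores in the two groups. (For one group measured over a period of time, the Wilcoxon signed-rank test is the non-parametric counterpart of the paired t-test.)
Assumption: data are not normally distributed.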
![Page 14: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/14.jpg)
Example of data gathering instrument
![Page 15: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/15.jpg)
Mann-Whitney U data set
![Page 16: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/16.jpg)
Statistical output

Tests of Normality

| | Sex | Kolmogorov-Smirnov(a) Statistic | df | Sig. | Shapiro-Wilk Statistic | df | Sig. |
|---|---|---|---|---|---|---|---|
| Attitude | 1 | .298 | 32 | .000 | .815 | 32 | .000 |
| | 2 | .167 | 68 | .000 | .909 | 68 | .000 |

a Lilliefors Significance Correction

Ranks

| | Sex | N | Mean Rank | Sum of Ranks |
|---|---|---|---|---|
| Attitude | 1 | 32 | 31.89 | 1020.50 |
| | 2 | 68 | 59.26 | 4029.50 |
| | Total | 100 | | |

Test Statistics(a)

| | Attitude |
|---|---|
| Mann-Whitney U | 492.500 |
| Wilcoxon W | 1020.500 |
| Z | -4.419 |
| Asymp. Sig. (2-tailed) | .000 |

a Grouping Variable: Sex

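The link between the Ranks table and the Mann-Whitney U value can be sketched as follows. The group sizes and rank sums are taken from the output; the z value uses the plain normal approximation without a correction for tied ranks, which is an assumption of this sketch and is probably why it comes out slightly smaller in magnitude than the reported -4.419.

```python
import math

# Values copied from the Ranks table.
n1, n2 = 32, 68          # group sizes for Sex = 1 and Sex = 2
rank_sum_1 = 1020.50     # sum of ranks for Sex = 1
rank_sum_2 = 4029.50     # sum of ranks for Sex = 2

# U for each group is its rank sum minus the smallest rank sum it could have.
u1 = rank_sum_1 - n1 * (n1 + 1) / 2
u2 = rank_sum_2 - n2 * (n2 + 1) / 2
u = min(u1, u2)          # the reported statistic is the smaller of the two

# Normal approximation, ignoring ties (assumption of this sketch).
mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (u - mean_u) / sd_u

print(f"U = {u}, z = {z:.2f}")
```

U comes out at 492.5, matching the output, and the smaller rank sum (1020.50) is the value reported as Wilcoxon W.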
![Page 17: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/17.jpg)
Association between two nominal variables
We may want to investigate relationships between two nominal variables – for example:
• Educational attainment and choice of career.
• Type of recruit (graduate/non-graduate) and level of responsibility in an organization.
Use chi-square when you have two or more variables, each of which contains at least two categories.
![Page 18: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/18.jpg)
Chi-square data set
![Page 19: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/19.jpg)
Statistical output

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) | Exact Sig. (2-sided) | Exact Sig. (1-sided) |
|---|---|---|---|---|---|
| Pearson Chi-Square | .382(b) | 1 | .536 | | |
| Continuity Correction(a) | .221 | 1 | .638 | | |
| Likelihood Ratio | .383 | 1 | .536 | | |
| Fisher's Exact Test | | | | .556 | .320 |
| Linear-by-Linear Association | .380 | 1 | .537 | | |
| N of Valid Cases | 201 | | | | |

a Computed only for a 2x2 table
b 0 cells (.0%) have expected count less than 5. The minimum expected count is 33.08.

Symmetric Measures

| | | Value | Approx. Sig. |
|---|---|---|---|
| Nominal by Nominal | Phi | .044 | .536 |
| | Cramer's V | .044 | .536 |
| N of Valid Cases | | 201 | |

a Not assuming the null hypothesis.
b Using the asymptotic standard error assuming the null hypothesis.

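The contingency table itself is not reproduced in the transcript, so the sketch below works only from the reported Pearson chi-square value and sample size. It shows how the significance figure and Phi relate to that value for a 2x2 table (1 degree of freedom); the variable names are illustrative.

```python
import math

# Values copied from the Chi-Square Tests output.
chi_square = 0.382   # Pearson Chi-Square
n_cases = 201        # N of Valid Cases

# With 1 degree of freedom, chi-square is a squared standard normal value,
# so its upper-tail probability equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

# Phi for a 2x2 table is the square root of chi-square divided by N.
phi = math.sqrt(chi_square / n_cases)

print(f"p = {p_value:.3f}, phi = {phi:.3f}")
```

Both figures land on the reported values to within rounding (the input .382 is itself rounded), and a p-value well above 0.05 means the null hypothesis of no association is not rejected.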
![Page 20: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/20.jpg)
Correlation analysis
Correlation analysis is concerned with associations between variables, for example:
• Does the introduction of performance management techniques to specific groups of workers improve morale compared to other groups? (Relationship: performance management/morale.)
• Is there a relationship between size of company (measured by size of workforce) and efficiency (measured by output per worker)? (Relationship: company size/efficiency.)
• Do measures to improve health and safety inevitably reduce output? (Relationship: health and safety procedures/output.)
![Page 21: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/21.jpg)
Perfect positive and perfect negative correlations
![Page 22: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/22.jpg)
Highly positive correlation
![Page 23: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/23.jpg)
Strength of association based upon the value of a coefficient

| Correlation figure | Description |
|---|---|
| 0.00 | None |
| 0.01-0.09 | Negligible |
| 0.10-0.29 | Weak |
| 0.30-0.59 | Moderate |
| 0.60-0.74 | Strong |
| 0.75-0.99 | Very strong |
| 1.00 | Perfect |

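The bands above can be turned into a small lookup, sketched here. The function name `describe_correlation` is hypothetical, and two choices are assumptions rather than part of the slide: the sign is ignored (the bands describe strength, not direction), and values that fall between two printed bands, such as 0.095, are assigned to the lower band.

```python
def describe_correlation(r: float) -> str:
    """Return the verbal strength label for a correlation coefficient."""
    size = abs(r)  # strength does not depend on direction
    if size > 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    if size == 0:
        return "None"
    if size < 0.10:
        return "Negligible"
    if size < 0.30:
        return "Weak"
    if size < 0.60:
        return "Moderate"
    if size < 0.75:
        return "Strong"
    if size < 1:
        return "Very strong"
    return "Perfect"


print(describe_correlation(0.779))   # Very strong
print(describe_correlation(-0.813))  # Very strong
print(describe_correlation(0.044))   # Negligible
```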
![Page 24: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/24.jpg)
Calculating a correlation for a set of data
We may wish to explore a relationship when:
• The subjects are independent and not chosen from the same group.
• The values for X and Y are measured independently.
• X and Y values are sampled from populations that are normally distributed.
• Neither X nor Y is controlled by the researcher (if one of them is, linear regression, not correlation, should be calculated).
![Page 25: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/25.jpg)
Associations between two ordinal variables
For data that are ranked, or in circumstances where relationships are non-linear, Spearman’s rank-order correlation (Spearman’s rho) can be used.
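A minimal sketch of Spearman's rho follows, assuming the two sets of scores are already ranks with no ties, in which case the shortcut formula 1 - 6Σd²/(n(n² - 1)) applies. The two rankings are made up for illustration and are not the interviewers' data from the slides.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho for two rankings of the same items, with no tied ranks."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of equal length, at least 2 items")
    # Sum of squared differences between the paired ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of five candidates by two interviewers.
jones = [1, 2, 3, 4, 5]
smith = [2, 1, 4, 3, 5]

rho = spearman_rho(jones, smith)
print(f"rho = {rho:.2f}")  # rho = 0.80
```

When ranks are tied, statistical packages instead compute a Pearson correlation on the ranks, so this shortcut should only be read as an illustration of the idea.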
![Page 26: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/26.jpg)
Spearman’s rho data set
![Page 27: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/27.jpg)
Statistical output

Correlations

| | | | MrJones | MrsSmith |
|---|---|---|---|---|
| Spearman's rho | MrJones | Correlation Coefficient | 1.000 | .779(**) |
| | | Sig. (2-tailed) | . | .000 |
| | | N | 30 | 30 |
| | MrsSmith | Correlation Coefficient | .779(**) | 1.000 |
| | | Sig. (2-tailed) | .000 | . |
| | | N | 30 | 30 |

** Correlation is significant at the 0.01 level (2-tailed).
![Page 28: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/28.jpg)
Association between numerical variables
We may wish to explore a relationship when there are potential associations between, for example:
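• Income and age.
• Spending patterns and happiness.
• Motivation and job performance.
Use Pearson Product-Moment (if the relationships between variables are linear).
If the relationship is not linear (for example, U-shaped or inverted-U-shaped), use Spearman’s rho, bearing in mind that Spearman’s rho itself only detects relationships that consistently rise or fall.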
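To illustrate why linearity matters for Pearson Product-Moment, the sketch below computes r by hand for two small made-up data sets: one perfectly linear and one U-shaped. The function name and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares about the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: y rises by 2 for every unit of x.
linear = pearson_r([1, 2, 3, 4, 5], [12, 14, 16, 18, 20])

# Hypothetical data: y = x squared, a symmetric U shape.
u_shaped = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(f"linear r = {linear:.2f}, U-shaped r = {u_shaped:.2f}")
```

The linear data give r = 1, while the U-shaped data give r = 0 even though y is completely determined by x, which is why a scatter plot should be inspected before choosing the coefficient.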
![Page 29: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/29.jpg)
Pearson Product-Moment data set
![Page 30: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/30.jpg)
Relationship between variables
[Scatter plot: Sales (vertical axis, 80.00 to 180.00) against Rainfall (horizontal axis, 20.00 to 70.00)]
![Page 31: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/31.jpg)
Statistical output

Descriptive Statistics

| | Mean | Std. Deviation | N |
|---|---|---|---|
| Rainfall | 48.17 | 11.228 | 30 |
| Sales | 132.47 | 28.311 | 30 |

Correlations

| | | Rainfall | Sales |
|---|---|---|---|
| Rainfall | Pearson Correlation | 1 | -.813(**) |
| | Sig. (2-tailed) | | .000 |
| | N | 30 | 30 |
| Sales | Pearson Correlation | -.813(**) | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 30 | 30 |

** Correlation is significant at the 0.01 level (2-tailed).
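The significance of a Pearson correlation is usually assessed by converting r to a t statistic with n - 2 degrees of freedom. The sketch below applies that standard conversion to the reported r and N; it is a check on the output, not a description of how the software was run.

```python
import math

# Values copied from the Correlations table.
r = -0.813   # Pearson correlation between Rainfall and Sales
n = 30       # number of paired observations

df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
r_squared = r ** 2   # proportion of variance shared by the two variables

print(f"t = {t_stat:.2f} on {df} df, r squared = {r_squared:.3f}")
```

The t value is about -7.39, far beyond the two-tailed 0.01 critical value of roughly 2.76 for 28 degrees of freedom, which is consistent with the ** flag. The negative sign indicates that sales fall as rainfall rises.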
![Page 32: Analysing and Presenting Quantitative Data: Inferential Statistics](https://reader034.vdocument.in/reader034/viewer/2022052618/55141bc0550346d8488b55a9/html5/thumbnails/32.jpg)
Summary
• Inferential statistics are used to draw conclusions from the data and involve the specification of a hypothesis and the selection of appropriate statistical tests.
• The inherent dangers in hypothesis testing are Type I errors (rejecting the null hypothesis when it is, in fact, true) and Type II errors (accepting the null hypothesis when it is false).
• For categorical data, non-parametric statistical tests can be used; for quantifiable data, more powerful parametric tests can be applied, provided their assumptions hold. Parametric tests usually require that the data are normally distributed.
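The meaning of a Type I error can be illustrated with a small simulation, sketched below under stated assumptions: every sample is drawn from a population where the null hypothesis really is true (mean 0, standard deviation 1), and a two-tailed z-test at the 0.05 level is applied each time. The sample size, number of trials and seed are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)            # fixed seed so the run is repeatable
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)
n, trials = 25, 2000

rejections = 0
for _ in range(trials):
    # The null hypothesis (mean 0, sd 1) is true for every simulated sample.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = fmean(sample) * n ** 0.5   # z statistic with known sd = 1
    if abs(z) > critical:
        rejections += 1            # a Type I error: true H0 rejected

type_i_rate = rejections / trials
print(f"Proportion of true null hypotheses rejected: {type_i_rate:.3f}")
```

The proportion of rejections should come out close to 0.05, which is what the significance level promises: about one true null hypothesis in twenty is rejected by chance alone.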