Faculty of Social Sciences Induction Block: Maths & Statistics

Lecture 6: Sample size, SPSS and Hypothesis Testing

Dr Gwilym Pryce

1. Summary of L5
2. Statistical Significance
3. Type I and Type II errors
4. Four steps of Hypothesis Testing
5. Overview of the Course

1. Summary of L5:

Social research is usually based on samples.

We usually want to use our sample to say something about the population, i.e. we want to be able to generalise.

How precisely we can estimate the population mean or proportion depends on our sample size and the variation within the sample.

Using the CLT, statistical inference offers a systematic way of establishing:
– the range of values in which the population mean or proportion is likely to lie (‘a confidence interval’);
– whether a hypothesis about a mean or a proportion is likely to hold in the population.

2. Statistical Significance

“Significance” does not refer to “importance”
– but to whether there is a real difference, rather than one produced by sampling chance, between our observed sample mean and our assumption about the population mean.

P = significance level = the probability of obtaining a sample mean at least as extreme as the one we observed, given that our assumption about the population (denoted by “H0”) is true.
– So if we find that this probability is small, it might lead us to question our assumption about the population mean.

I.e. if our sample mean is a long way from our assumed population mean then:
– either it is a freak sample
– or our assumption about the population mean is wrong.

If we conclude that it is our assumption that is wrong and reject H0, then we have to bear in mind that there is a chance that H0 was in fact true.
– I.e. if H0 is true and we reject it whenever P ≤ 0.05, then in the long run we will wrongly reject it in about one sample in twenty.

Obviously, as the sample mean moves further away from our assumption (H0) about the population mean, we have stronger evidence that H0 is false.

If P is very small, say 0.001, then there is only 1 chance in a thousand of a sample mean at least as extreme as ours occurring if H0 is true.

– This also means that if we reject H0 when P = 0.001, we are running only a very small risk of rejecting a true H0 (i.e. of being guilty of a “Type I error”): a result this extreme would arise in only one sample in a thousand if H0 were true.

There is a tradition (initiated by the English scientist R. A. Fisher, 1890-1962) of rejecting H0 if the probability of incorrectly rejecting it is no more than 0.05.
– If P ≤ 0.05 then we say that H0 can be rejected at the 5% significance level.
– If P > 0.05, then, argued Fisher, the chances of incorrectly rejecting H0 are too high to allow us to do so.

Sig level = P = the probability of a sample mean at least as extreme as our observed value occurring, given our assumption about the population mean.

3. Type I and Type II errors:

P = significance level = chances of incorrectly rejecting H0 when it is in fact true.
– This is called a “Type I error”.

If we accept H0 when in fact the alternative hypothesis is true:
– this is called a “Type II error”.

On this course we shall be concerned only with Type I errors.
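The link between the significance level and the Type I error rate can be checked by simulation. The sketch below is illustrative only: it assumes a normal population for which H0 is true (mean 30 and standard deviation 20, borrowed from the lecture's age example), and the sample size of 100, the 2000 repetitions and the function names are arbitrary choices rather than part of the course material. It counts how often a two-tailed large-sample z test would reject the true H0 at the 5% level.

```python
import random
from statistics import NormalDist, mean, stdev


def two_tailed_p(sample, mu0):
    """Two-tailed P for a large-sample z test of H0: population mean = mu0."""
    n = len(sample)
    z = (mean(sample) - mu0) / (stdev(sample) / n ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_one_error_rate(trials=2000, n=100, mu0=30.0, sigma=20.0,
                        alpha=0.05, seed=1):
    """Share of samples, drawn while H0 is true, for which H0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if two_tailed_p(sample, mu0) <= alpha:
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(f"Share of true H0s rejected at the 5% level: {rate:.3f}")
```

The printed share should come out close to 0.05, i.e. roughly one wrongful rejection in every twenty samples drawn from a population where H0 holds.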

4. The four steps of hypothesis testing

Last lecture we looked at confidence intervals:
– establish the range of values of the population mean for a given level of confidence
• e.g. we are 90% confident that the population mean age of heads of household (HoHs) in repossessed dwellings in the Great Depression lay between 32.17 and 36.83 years (s = 20).
• Based on a sample of 200 with mean = 34.5 yrs.

– But what if we want to use our sample to test a specific hypothesis we may have about the population mean?
• E.g. does $\mu = 30$ years?
– If $\mu$ does equal 30 years, then how likely are we to select a sample with a mean as extreme as 34.5 years?
» I.e. 4.5 years more or 4.5 years less than the population mean


One tailed test: P = how likely are we to select a sample with mean age at least as great as 34.5?

Finding the value of P:

Because all sampling distributions for the mean (assuming large n) are normal, we can convert points on them to the standard normal curve
– e.g. for 34.5: $z = (34.5 - 30)/(20/\sqrt{200}) = 4.5/1.41 = 3.18$.
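The conversion to a z score, and the upper-tail probability that goes with it, can be reproduced with Python's standard library. This is a minimal sketch using the figures from the age example; the variable names are illustrative, and `NormalDist` stands in for looking the value up in a standard normal table.

```python
from math import sqrt
from statistics import NormalDist

# Figures from the lecture example
x_bar = 34.5   # sample mean age
mu_0 = 30.0    # population mean assumed under H0
s = 20.0       # sample standard deviation
n = 200        # sample size

standard_error = s / sqrt(n)          # about 1.41
z = (x_bar - mu_0) / standard_error   # about 3.18

# One-tailed (upper) P: chance of a z at least this large if H0 is true
p_upper = 1 - NormalDist().cdf(z)

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}")
print(f"upper-tail P = {p_upper:.5f}")
```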

Upper tailed test:

Two tailed test:

4 Steps to Hypothesis tests:

1. Specify null and alternative hypotheses
2. Specify threshold significance level $\alpha$ and appropriate test statistic formula
3. Specify decision rule (reject H0 if $P < \alpha$)
4. Compute P and state conclusion.

P values for one and two tailed tests:

Upper Tail Test:
H1: $\mu > \mu_0$ then $P = \mathrm{Prob}(z > z_i)$

Lower Tail Test:
H1: $\mu < \mu_0$ then $P = \mathrm{Prob}(z < z_i)$

Two Tail Test:
H1: $\mu \neq \mu_0$ then $P = 2 \times \mathrm{Prob}(z > |z_i|)$
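The three rules above can be collected into one small helper. This is a sketch only: the function name `p_value` and the labels "upper", "lower" and "two-sided" are illustrative choices, not names used by the course macros.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """P for a z statistic under an upper, lower or two-sided alternative."""
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "upper":        # H1: mu > mu_0
        return 1 - std_normal.cdf(z)
    if alternative == "lower":        # H1: mu < mu_0
        return std_normal.cdf(z)
    if alternative == "two-sided":    # H1: mu != mu_0
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'upper', 'lower' or 'two-sided'")


for alt in ("upper", "lower", "two-sided"):
    print(alt, round(p_value(3.18, alt), 5))
```

For a positive z the two-sided P is exactly twice the upper-tail P, which is why a two tailed test needs a more extreme sample mean to reach the same significance level.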

Confidence Interval

Find the 90% confidence interval of the population mean age.

1. Choose the appropriate test statistic:
$z_i = (\bar{x} - \mu)/\sigma_{\bar{x}}$, which rearranges to $\mu = \bar{x} \pm z_i \, s_{\bar{x}}$, where $s_{\bar{x}} = s/\sqrt{n}$

2. Establish the value of z*:
Prob(-z* < z < z*) = 0.9
Area of tails = (0.1)/2 = 0.05
z* = 1.65

3. Calculate the confidence interval:
$34.5 \pm 1.65 \times 20/\sqrt{200} = 34.5 \pm 2.33$

Hypothesis Tests

Test the hypothesis that the population mean age = 30 using a significance level of 0.1.

1. Specify null and alternative hypothesis:
H0: $\mu = 30$
H1: $\mu \neq 30$

2. Specify the level of significance and the test statistic:
Significance level: $\alpha$ = likelihood of Type I error that you are prepared to tolerate = Prob(Reject H0 when it is true) = 0.1
Test statistic: n > 30, therefore we can use z:
$z_i = (\bar{x} - \mu)/\sigma_{\bar{x}}$ becomes $z_c = (\bar{x} - \mu_0)/s_{\bar{x}}$
i.e. we write the $z_c$ formula assuming that H0 is correct

3. Specify the decision rule:
Reject H0 iff P (the calculated level of Type I error) is no greater than the tolerated level:
i.e. Reject H0 iff $P \leq \alpha$ (the smaller P is, the less risk involved in rejecting H0)

4. Compute P and state your conclusion:
$z_c = 3.18$; $P = 2 \times \mathrm{Prob}(z > 3.18)$
Since $P < \alpha$ (i.e. P < 0.1), it is safe to reject H0
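The two procedures can be run side by side on the same summary figures. The sketch below is illustrative rather than a reproduction of the course macros: it computes z* from the normal distribution (about 1.645, which the slide rounds to 1.65) and then checks that rejecting H0 at the 0.1 level goes together with 30 lying outside the 90% confidence interval.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures from the lecture example
x_bar, s, n = 34.5, 20.0, 200
mu_0, alpha = 30.0, 0.10

std_normal = NormalDist()
standard_error = s / sqrt(n)

# Confidence interval: z* leaves alpha/2 in each tail
z_star = std_normal.inv_cdf(1 - alpha / 2)
ci_low = x_bar - z_star * standard_error
ci_high = x_bar + z_star * standard_error

# Hypothesis test: z computed as if H0 were correct, two-tailed P
z_c = (x_bar - mu_0) / standard_error
p = 2 * (1 - std_normal.cdf(abs(z_c)))
reject_h0 = p <= alpha

# Is the hypothesised mean outside the interval?
mu_0_outside_ci = not (ci_low <= mu_0 <= ci_high)

print(f"90% CI: {ci_low:.2f} to {ci_high:.2f}")
print(f"z_c = {z_c:.2f}, P = {p:.4f}, reject H0: {reject_h0}")
print(f"30 lies outside the interval: {mu_0_outside_ci}")
```

With these figures both routes point the same way: the interval excludes 30 and the two-tailed P falls well below 0.1.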

5. Overview of the Course (Quants I):

L1: Density Functions & CLT
L2: Calculating z-scores
L3: Introduction to Confidence Intervals
L4: Confidence Intervals for All Occasions
L5: Introduction to Hypothesis Tests
L6: Hypothesis Tests for All Occasions
L7: Relationships between Categorical Variables
L8: Regression

24/09/2005 - v23

Nature of the Course:

This is a course in applied statistics
– Applied: we do not teach theoretical proofs
• You can prove anything with maths (e.g. Teletubbies are evil)
• What counts is understanding the concepts
– Statistics: we also teach you SPSS
• But there are lots of different stats packages out there
– You are likely to use different ones over the course of your research career
– But statistical concepts remain unchanged

The course should enable you to critique other people’s work

It is also part of a wider research methods training programme:
– The broader remit is to teach you good practice in research techniques
• Essential to learn syntax…

Why learn syntax? Most texts & courses avoid it!

A succinct and secure record
Transparency and reproducibility
Efficiency
Paste and Learn
Avoiding obsolescence
– SPSS point-n-click routines change with each new version of SPSS, which appears about once a year
– Syntax has remained virtually unchanged for 15 years
Accessing Extra Resources & Expanding


Why the macros? 4 reasons:

(a) Get the statistical procedure right, then choose the program/calculator
– SPSS doesn’t know what sort of data you have
– The SPSS canned routine may not be the right one for your data
– You could compute the procedure by hand, & indeed it is important to know how to do this
– but this can be long-winded in repeated applications & it is easy to make mistakes
– Macro commands speed up the process & are a useful way to check your calculations.

(b) Critiquing/Analysing Published Work
– SPSS routines can only be used if you have the original data
– Not much use if you want to critique or analyse someone else’s published research
• E.g. Newspaper examples in M&S tutorial
• E.g. United Nations crime survey
• E.g. MPPI paper by Pryce & Keoghan
– If all you can do is the point-n-click stuff in SPSS you are going to be severely hampered in what you can do.
– The macro commands written specifically for the course only need summary info (n, xbar, sd, prop.)
• Publicly available via the downloads page of www.geebeejey.co.uk

(c) Working with standard texts
– The exercises and examples in standard statistical texts (such as Moore and McCabe) usually only provide summary information, not the original data.
– You can’t use SPSS’s standard routines to do these examples or to check your results

(d) Encourages awareness & development of Macros
– SPSS’s greatest strength:
• Customisability/expandability
– You don’t actually need to be good at statistics to use macros
• You can use macros to do anything:
– Manipulate data
– Automate repetitive tasks
– Formalise and automate complex calculations
– Writing SPSS macros is actually a good way to acquire basic programming skills
– In real-life applied research, most of your time is taken up with non-statistical manipulation of data
• Learning how to write your own macros or use other people’s will greatly increase your productivity & employability!

SPSS macros

Confidence Intervals (CI)
Macro command: Definition
CI_L1M: Large sample CI for one mean
CI_S1M: Small sample CI for one mean
CI_S2MP: Small independent samples CI for difference between 2 means (pooled variance)
CI_S2MD: Small independent samples CI for difference between 2 means (different variances)
CI_L1P: Large sample CI for one proportion (presents output for both Traditional and Wilson methods of calculation)
CI_L2P: Large sample CI for comparing two proportions (presents output for both Traditional and Wilson methods of calculation)

Hypothesis tests
Macro command: Definition
H_L1M: Large sample significance test on one mean
H_S1M: Small sample significance test on one mean
H_S2MP: Small independent samples significance test for equality of 2 means (pooled variance)
H_S2MD: Small independent samples significance test for equality of 2 means (different variances)
H_L1P: Large sample significance test on one proportion
H_L2P: Large samples significance test on two proportions
H_S2VF: Simple small sample F-test on equality of two variances (see also Levene’s test in the SPSS help menu for a more sophisticated test of homogeneous variances)

Sample size
N_L1M: Sample size for desired margin of error for the mean
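For readers who want to see the kind of calculation a sample size macro such as N_L1M is described as performing, here is a sketch in Python. It is not the macro's own code: it assumes the standard large-sample formula $n = (z^* s / m)^2$, where m is the desired margin of error, and the function name and arguments are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sd, margin, confidence=0.95):
    """Smallest n giving the desired margin of error for a mean (large-sample z)."""
    # z* leaves (1 - confidence)/2 in each tail of the standard normal curve
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up: a fraction of an observation cannot be sampled
    return ceil((z_star * sd / margin) ** 2)


# Lecture example: s = 20 and a 90% margin of error of 2.33 years
print(sample_size_for_mean(sd=20, margin=2.33, confidence=0.90))

# A tighter requirement: margin of 2 years at 95% confidence
print(sample_size_for_mean(sd=20, margin=2, confidence=0.95))
```

The first call recovers the sample of 200 used in the age example; like the macros, the function needs only summary information rather than the original data.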

Guide to Reading:

Essential reading (recommended for purchase):
– Pryce, G. Inference and Statistics in SPSS
• Lab exercises are drawn from this book.

The usual recommendation is a book on statistics & a book on SPSS:
– E.g. Moore & McCabe (£40) -- stats
– E.g. Field (£25+) -- SPSS

M&M and Field = 2 great books but 4 major problems:
– 1. Cost (to buy both comes to approx £65)
• many students have tried to make do without one or the other & struggled.
– 2. Length
• 600 pages (M&M) + 832 pages (Field)
– 3. Content: neither is geared to business & soc. sci.
• Field: too shallow/applied:
– Covers a huge spectrum of topics (useful for Quants II)
– Does not cover some of the basic material we need to do
» Tends to cover what can be achieved in SPSS
» Does not use macros
» Does not teach syntax
• M&M: too deep/theoretical
– The Rolls Royce of introductory texts but does not teach SPSS
– But would take 2 semesters to cover material in this depth & learn
– 4. Integration
• Leaves you the student with the task of combining the two

Advantages of Pryce I&S:
– 1. Cost
• Pryce = £22 + P&P (special price of £20 this week)
– M&M + Field = £65
– 2. Length
• Pryce = 200 pages + supplement with further reading
– 600 pages (M&M) + 832 pages (Field)
– 3. Content:
• Pryce:
– Tries to strike the right balance between theory & application
– Based in SPSS
– Teaches syntax
– Uses the macros
– Geared to business and social science
– Based on worked examples & exercises
– 4. Integration
• Pryce tries to integrate learning inference with learning SPSS
• But the macros will also allow you to do the Moore & McCabe type of exercise should you want to get more practice

Disadvantages of Pryce I&S:

1. First edition:
– A few glitches here & there…
– But a rare edition, because there was only a small print run
• Valuable as a collector’s item if you keep it for 20 years.
• Glitches add value: ask a stamp collector
• Even more valuable if I sign it.
• Makes a great Xmas gift for friends & family.

2. Wire comb binding
– But actually better for working next to a PC

3. I’m biased in my recommendation!
– But correct, of course.

Feedback forms…