Scot Exec Course Nov/Dec 04
Survey design overview
Gillian Raab
Professor of Applied Statistics
Napier University
Summary
• Overview of government surveys
• Types of survey
• Household surveys, design aspects
Reasons for doing government surveys
• Evaluate success of policies
– e.g. smoking reduction
• Determine what the effect of policy changes might be
– e.g. who might claim a proposed new benefit
• Measure public concern in policy areas
– e.g. environmental attitudes
Fit for purpose
• Do you really need a survey?
• Could administrative data help?
• Are there items in existing surveys that could give satisfactory information?
• UK results (especially from the north of England) may be relevant to Scotland in many areas
Who / what is to be surveyed
• Is the question relevant to
– Organisations?
– Businesses?
– Patients in hospitals?
– General public?
• These will then form the POPULATION OF INTEREST
How to select a sample from the population?
• Convenience samples
– A poor choice except for piloting
• Quota samples
– OK for market research and short-term questions (e.g. election forecasting)
– But not for major policy questions, especially trends
• Probability samples
– The method used in most government surveys
Sampling frame
• Is a list that allows you to attempt to make contact with every member of the population of interest
– List of patients admitted to hospital
– Community health index
– Business directory
– A list of households (e.g. PAF)
Survey design
• Method by which a sample is selected from the sampling frame
• We will discuss this in detail later
• Choice of design will depend on how respondents are to be contacted
• And on what questions the survey is designed to answer
How to contact respondents?
• Postal survey
– Cheap, but response rates are increasingly a problem
– Incentives (not prizes) may help
• Telephone survey
• Internet survey (with email address list)
• Household survey (with interviewers)
– Most expensive but most reliable
Features of household surveys
• Most large UK government sponsored surveys follow this pattern
• Interest in people – but people accessed via their addresses
• Usually carried out by ONS or by survey organisations with large field forces of interviewers
– Interviewer contacts address (often several times over)
– Gets details of occupants
– Selects one or more people to interview at the address
Features of household surveys (2)
They generally incorporate some of the following features
– Clustering
– Stratification
– Weighting
– Big surveys are usually complicated
The design is intended to enable the survey to get accurate and precise results
Clustering
• Used for the convenience of organising the survey
– A sampling frame may only be available within larger units (e.g. employees within workplaces)
– Fieldwork costs are reduced if households are close together
• The unit from the original list used to select the sample is called the Primary Sampling Unit (PSU)
Clustering leads to two stage sampling
• First a random sample of clusters is selected
• And then a random sample of the individuals within each cluster is selected
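The two stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the frame, the cluster names and the function `two_stage_sample` are hypothetical, and clusters are chosen here with equal probability rather than with probability proportional to size.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Pick clusters at random, then pick individuals at random within each."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of cluster identifiers (the PSUs)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: simple random sample of individuals within each chosen cluster
    return {c: rng.sample(clusters[c], n_per_cluster) for c in chosen}

# Hypothetical frame: 20 clusters of 50 people each
clusters = {f"PSU{i}": [f"PSU{i}-person{j}" for j in range(50)] for i in range(20)}
sample = two_stage_sample(clusters, n_clusters=4, n_per_cluster=10, seed=42)
for psu, people in sample.items():
    print(psu, len(people))
```

Keeping the cluster identifier alongside each selected person matters later, because the analysis needs to know which PSU each respondent came from.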
Proportionate or disproportionate samples
• In proportionate samples, every individual has the same chance of being selected into the sample
• In disproportionate samples some members of the population have a greater chance of being selected than others.
• Both of these types of sample can be probability samples where only a random process determines if a particular individual will be in the sample.
Selecting a proportionate random sample: unclustered data
• We want a sample in which every individual has the same chance of being in the sample. This chance is the sampling fraction (f), e.g. f = 0.001, or 1 in 1000.
• Simple random sampling, no clustering
– Get the sampling frame
– Order by a random number
– For f = 0.001, select every 1000th record
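The three steps for an unclustered frame can be sketched as follows. The frame of 100,000 record IDs and the function name `proportionate_sample` are assumptions made for illustration; a real frame would be a list of addresses or people.

```python
import random

def proportionate_sample(frame, step, seed=None):
    """Put the frame in random order, then keep every `step`-th record."""
    rng = random.Random(seed)
    shuffled = list(frame)
    rng.shuffle(shuffled)  # equivalent to ordering by a random number
    # After random ordering, every record has the same chance (1/step) of selection
    return shuffled[step - 1::step]

frame = list(range(1, 100_001))  # hypothetical frame of 100,000 record IDs
sample = proportionate_sample(frame, step=1000, seed=1)
print(len(sample))  # 100 records, i.e. f = 0.001
```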
Selecting a proportionate random sample: clustered data
• Select k clusters with probability proportional to size. A cluster of size m is selected with probability k m / Σm, where Σm is the sum of all cluster sizes (the population size).
• Then a fixed number of individuals (p, say 10 or 15) is selected randomly from each cluster.
• The sampling fraction is the product of the selection probabilities at each stage
• f = (k m / Σm) × (p / m) = k p / Σm
• This is the same for every member of the population
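The arithmetic can be checked numerically. The sketch below does not draw a sample; it only multiplies the two stage probabilities for each cluster, using exact fractions. The cluster sizes and the values of k and p are hypothetical, and it assumes that k m never exceeds the total of the cluster sizes and that p is no larger than any cluster.

```python
from fractions import Fraction

def overall_sampling_fractions(cluster_sizes, k, p):
    """Return the overall selection probability for a member of each cluster."""
    total = sum(cluster_sizes)  # sum of all cluster sizes
    result = []
    for m in cluster_sizes:
        stage1 = Fraction(k * m, total)  # chance that the cluster is selected
        stage2 = Fraction(p, m)          # chance of selection within the cluster
        result.append(stage1 * stage2)
    return result

sizes = [200, 500, 1000, 300]  # hypothetical cluster sizes, total 2000
f = overall_sampling_fractions(sizes, k=2, p=10)
print(f)  # every entry is 1/100, whatever the cluster size
```

The cluster size m cancels, which is why a fixed take per cluster combined with size-proportional cluster selection gives a proportionate sample.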
Terminology
• Biased estimate – lack of accuracy
• Estimate with high variability - imprecision
Impact of design features - clustering
• Clustering doesn’t introduce any inaccuracy in estimates, but it does increase imprecision
• Degree of increase depends on cluster size and cluster homogeneity
• It reduces the effective sample size
• To account for clustering, you need to identify the primary sampling unit (PSU) when analysing a dataset.
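A widely used approximation, not given in the slides, expresses the loss of precision as a design effect of 1 + (m − 1) × ρ, where m is the number of respondents per cluster and ρ is the intra-cluster correlation (a measure of cluster homogeneity). The sketch below applies it to hypothetical values; the function names and numbers are illustrative assumptions only.

```python
def design_effect(cluster_size, icc):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * icc

def effective_sample_size(n, cluster_size, icc):
    """Size of a simple random sample that would give similar precision."""
    return n / design_effect(cluster_size, icc)

# Hypothetical survey: 3000 respondents, 10 per PSU, intra-cluster correlation 0.05
deff = design_effect(cluster_size=10, icc=0.05)
n_eff = effective_sample_size(n=3000, cluster_size=10, icc=0.05)
print(round(deff, 2), round(n_eff))
```

Larger clusters or more homogeneous clusters both push the design effect up and the effective sample size down.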
Examples of clustered designs
• The Scottish Health Survey is clustered by postcode sector
• The Scottish Household Survey is clustered by census enumeration district in rural areas, but not clustered in urban areas
• Household surveys that select more than one person per household have another level of clustering
Stratified sampling
• The population is divided into groups called strata
• A separate sample is selected within each stratum
• Proportionate stratification
– The sampling fraction (f) is the same in each stratum
• Disproportionate stratification
– Different sampling fractions by stratum
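The difference between the two kinds of stratification comes down to which sampling fraction is applied in each stratum. A minimal sketch, with hypothetical strata, population counts and fractions:

```python
def allocate(stratum_sizes, fractions):
    """Sample size per stratum = stratum population x its sampling fraction."""
    return {name: round(size * fractions[name]) for name, size in stratum_sizes.items()}

# Hypothetical population split into two strata
stratum_sizes = {"urban": 800_000, "rural": 200_000}

# Proportionate: the same f in every stratum
proportionate = allocate(stratum_sizes, {"urban": 0.001, "rural": 0.001})

# Disproportionate: the smaller stratum is deliberately over-sampled
disproportionate = allocate(stratum_sizes, {"urban": 0.0005, "rural": 0.003})

print(proportionate)      # {'urban': 800, 'rural': 200}
print(disproportionate)   # {'urban': 400, 'rural': 600}
```

Both designs give 1000 interviews in total, but only the first keeps the sample shares equal to the population shares.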
Proportionate stratification
• Many household surveys use proportionate stratification (either overall or within regions)
• Does not affect estimates and tends to improve precision.
– Degree depends on choice of stratifiers.
– Best improvement when results vary by stratum
Disproportionate stratification
• In household surveys this may be done to get better estimates for some small areas or sub-groups (e.g. local authorities, ethnic groups)
– This tends to make results for the whole country less precise
– But it improves estimates for small areas or groups
• Some surveys take larger sampling fractions where the results are known to be more variable
– E.g. types of farm in an agricultural survey or size of workplace in a survey of employees
– This should improve precision for the whole survey
Disproportionate sampling – examples
• The Scottish Household Survey is stratified by local authority with bigger sampling fractions in small and rural local authorities
• Detailed questions are asked of one ‘random adult’. So the random-adult data set has disproportionate sampling by household size.
Features of disproportionate samples
• If analysed without any adjustment they can give biased results.
• To overcome this a weighting procedure needs to be used.
• Weighted results should give unbiased estimates, but weighting will affect the precision of results (it can be better or worse)
Examples of disproportionate samples
• As part of the design
– Disproportionate stratification
– In a household survey only one adult is selected per household
• At the analysis stage
– Differential non-response: response rates differ between types of respondent
– Details of this will be covered tomorrow
Weights
• Weights are calculated as the inverse of the probability of selection.
• This makes the survey results a better match to the population
• Usually weights are calculated by the survey contractors and are supplied as part of the data set
Example 1: WERS98 (workplaces)
No. of employees   Population   Sample   Sampling fraction (1 in ..)   Weight
10-24              197358       362      545                           545
25-49              76087        603      126                           126
50-99              36004        566      64                            64
100-199            18701        562      33                            33
200-499            9832         626      16                            16
500+               3249         473      7                             7
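The weights in the table follow directly from the rule that a weight is the inverse of the probability of selection. The sketch below recomputes them from the population and sample counts shown above; only the variable names are introduced here.

```python
# Size band -> (workplaces in population, workplaces in sample), from the WERS98 table
strata = {
    "10-24": (197358, 362),
    "25-49": (76087, 603),
    "50-99": (36004, 566),
    "100-199": (18701, 562),
    "200-499": (9832, 626),
    "500+": (3249, 473),
}

weights = {}
for band, (population, sample) in strata.items():
    selection_probability = sample / population
    weights[band] = 1 / selection_probability  # inverse probability of selection

for band, weight in weights.items():
    print(band, round(weight))
```

Rounded to whole numbers, the results reproduce the Weight column: each sampled workplace with 10-24 employees stands for about 545 workplaces in the population, while each sampled workplace with 500+ employees stands for about 7.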
Effect of selecting one adult per household
H’hld size   H’hlds (per 100)   Adults   Adults selected   Weight
1            38                 38       38                1
2            51                 102      51                2
3            9                  27       9                 3
4+           2                  8        2                 4
Total        100                175      100
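The table can be used to show why the unweighted random-adult sample is biased. The sketch below takes the household counts from the table, treats the 4+ group as exactly 4 adults (as the table's Adults column does), and compares the share of adults living alone before and after weighting. The choice of "lives alone" as the outcome is an illustrative assumption.

```python
# Household size -> households per 100 (4 stands for the 4+ group, as in the table)
households = {1: 38, 2: 51, 3: 9, 4: 2}

# One adult is selected per household, so the weight equals the number of adults
adults_selected = sum(households.values())
weighted_total = sum(size * n for size, n in households.items())

# Share of adults who live alone
unweighted_share = households[1] / adults_selected
weighted_share = households[1] * 1 / weighted_total

print(adults_selected, weighted_total)  # 100 selected adults represent 175 adults
print(round(unweighted_share, 3), round(weighted_share, 3))
```

Without weights, adults living alone make up 38% of the sample; with weights they make up about 22%, which matches their share of the 175 adults in the population.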
Effect of weights on estimates
• Weighting changes almost all survey estimates (means, percentages, odds ratios, correlation coefficients, regression coefficients etc.)
• Both accuracy and precision are usually affected
• The weighted estimate should be more accurate (if weights are correct)
• It may be more or less precise
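One rough guide to the precision cost of unequal weights, not taken from the slides, is Kish's approximate effective sample size, (Σw)² / Σw². It captures only the loss due to variation in the weights and cannot show the gains that some disproportionate designs achieve, so it is a sketch of one side of the picture. The weights below are hypothetical, chosen to mirror a sample with one adult selected per household.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# Hypothetical sample of 100 respondents with weights 1 to 4
weights = [1] * 38 + [2] * 51 + [3] * 9 + [4] * 2
n_eff = kish_effective_sample_size(weights)
print(round(n_eff, 1))  # 86.3
```

With equal weights the function returns the actual sample size; the more the weights vary, the further the effective sample size falls below it.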
Summary – design features for household surveys
• Proportionate stratification improves survey precision
• Clustering makes it worse
• Weighting for disproportionate sampling should improve accuracy, but its effect on precision may go either way
Overall summary
• Reasons for doing survey
• Type of survey
• Method of contacting respondents
• Design features for surveys – focussing mainly on household surveys