How good is good enough?
Jayne Van Souwe and David Bednall
Synopsis
The notion that any information is better than none explains why DIY research, straw polls and
meta-data analysis of partial data are becoming so popular, and why their results are being treated
as reliable and accurate by their users. As former scientific researchers we were concerned
that “near enough” is not “good enough”.
This paper outlines results from two pieces of original research designed to find out who responds
to surveys by contact and response mode and who doesn’t. It sheds new light on our thinking and
calls into question the representativeness of even the best-designed and executed surveys.
We shared our preliminary work at the 2014 AMSRS Conference in Melbourne.¹ We now have
even more evidence from every mode of recruitment and most completion modes.
We will give our views on the best uses of different sample frames for different applications from a
Census which must be taken and used by government for the purposes of providing services and
future planning, down to online panel and street intercept which can be very handy in specific
circumstances.
Background
Having come from scientific backgrounds we have always wanted data to be accurate and robust.
When one of the authors was talking with a client about the shortcomings of capturing some
information needed to make a decision she said, “Look I have to make this decision. At the
moment I have no information so it’s the toss of a coin, ANY information you can give me that helps
to make it, however rubbery, will be appreciated” - so much for being accurate to within ±2% at
the 95% confidence level!
The large number of studies that are now carried out on metadata, river samples and convenience
samples demonstrate that there are many users of research for whom data accuracy and
robustness are not valued as highly as getting results quickly and/or cheaply.
It is interesting to note that the ESOMAR Market Research Handbook starts by defining “marketing
intelligence”.
“The purpose of marketing intelligence is to provide management with the facts, information and
insights it needs to rapidly make the best, most efficient business decisions” – Fredrik Nauckhoff
¹ Bednall, D. et al., “Access all people: the 3M approach”, AMSRS Conference, Melbourne, 2014
This is depicted in the following wordle:
The definition of market research as stated in the AMSRS Code of Professional Behaviour is:
“The systematic gathering and interpretation of information about individuals or organisations using
the statistical and analytical methods and techniques of the applied social sciences to gain insight
or support decision making. This differs from other forms of information gathering in that the
identity of participants will not be revealed to the user of the information without explicit consent and
no sales approach will be made to them as a direct result of their having provided information.”
The first thing to note is that the definition of marketing intelligence includes value judgements such
as “best”, “most efficient” and it includes an adverb “rapidly”. The short and punchy definition of
marketing intelligence sounds fast paced and there is a sense of movement.
Market research sounds altogether ponderous in comparison!
We market researchers are pragmatists, and we know that our endeavours only form a part of the
broader intelligence framework. We also know that the days of the dedicated market research
buyer have gone and that research is now purchased by people with broader titles like Insights
Managers, Knowledge Gatherers and so on, as well as the more traditional product or service
marketing and communications personnel.
We face the issue of remaining relevant if we’re plodding along systematically while the world is
making decisions on the fly, however that isn’t the topic of this paper. This paper concerns itself
with the nature of the information that we do provide on which decisions will be made, and the
accuracy and precision that are necessary in order to be “good enough”.
Sampling – the mainstay of market research – which sample is good enough?
If we’re going to be systematic in gathering information of any type, we must first define the people
we need to talk with and then work out where to find them. It’s obviously easier to talk with people
in a known population, for example, customers of an organisation who use a particular product or
service and whose contact details are known to the organisation. It’s harder when we want to talk
to people and we don’t yet know exactly who they are, as would be the case in new product
development or where we really need to speak to everyone in order to be sure that we have a clear
idea of their views.
Having defined who we need to speak with, the next issue is the bane of all researchers’ lives, but
especially junior researchers, that is, to source a contact list or suitable sample of them.
When we are trying to get a fix on a population we generally take a random sample of that
population. This is especially important if we don’t understand its characteristics.
Simple random sampling "is a probability sampling procedure that ensures that every sampling unit
making up the target population has a known, equal, non-zero chance of being selected." (Hair &
Lukas, 2014, p.252). As the most fundamental probability sampling method, we use it as the basis
for projecting our sample results to the population with a known degree of confidence.
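As a concrete illustration of that definition (using a hypothetical population of numbered IDs, purely for the sketch), Python's `random.sample` performs exactly this kind of equal-probability draw without replacement:

```python
import random
from collections import Counter

# Hypothetical population of 1,000 sampling units (IDs stand in for people).
population = list(range(1_000))
n = 100  # sample size

# random.sample draws without replacement: every unit has the same known,
# non-zero inclusion probability, here n/N = 100/1000 = 0.1.
sample = random.sample(population, n)

# Empirical check: over many repeated draws, any given unit is selected
# about 10% of the time - the "known, equal, non-zero chance" in action.
draws = 5_000
hits = Counter()
for _ in range(draws):
    hits.update(random.sample(population, n))
print(hits[0] / draws)  # close to 0.1
```

It is this known inclusion probability, not the act of surveying itself, that licenses projection from sample to population with a stated degree of confidence.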
A year ago, we compared respondents from a reputable online panel, with telephone (fixed and
mobile respondents) and gave everyone the chance to complete the survey by phone or online
(Bednall et al, 2014). In fact we also used mail, but timing conspired against us and the response
was so low as to make the results unusable. The matched samples in that study attempted to gain
co-operation from exactly the same type of respondents demographically from each of fixed line
telephone, mobile and online panels. We achieved these for mobile and online panels but in the
interests of good practice did not fill all quotas with younger people via fixed line phone because of
the enormous number of phone calls we were making to do this. We could not justify contacting
several hundred members of the public in order to find one in-scope respondent.
We found that people who responded in the different modes had different attitudes and behaviours
to each other even if they seemed to be the same demographically.
In our most recent study, we added face-to-face interviewing into the mix in order to contact young
people and were very surprised at the excellent response rate achieved by this method. All
respondents were invited to participate in an online survey and were recruited face to face or by
dual frame (fixed line and mobile) telephone interviewing. People who did not want to complete the
survey online (or were recalcitrant) were allowed to complete the survey on the spot, if recruited
face-to-face or by telephone, if recruited by that means. Responses to questions relating to method
of contact are shown in Table 1.
Table 1: Methods of accessing the general public
Base: All respondents (Excludes Don't Knows)
QD2. Which of the following do you have access to?
Source: Wallis Multi Frame Multi Mode Omnibus Dec 2014
What is immediately clear is that amongst the individual means of making contact, letter boxes have
the highest penetration into the community followed by home internet. In aggregate, it is possible to
contact the entire population by phone (fixed or mobile), and most of the population by mail and the
internet. These figures, though obtained from a multi-mode survey, are very consistent with ACMA
data, which is sourced from another comprehensive survey.²
So what’s the problem – we can access everyone, can’t we? Let’s leave the practical matters of
access in the various contact modes aside for the moment and think about the fact that everyone
can be contacted somehow.
² ACMA – Communications Report 2013-2014, see http://www.acma.gov.au/theACMA/Library/Corporate-library/Corporate-publications/communications-report
                                                            AGE
                                              TOTAL    18-29    30-49    50+
                                             (n=643)  (n=222)  (n=148)  (n=273)
                                                %        %        %       %
A phone (net)                                   99       99       99      100
A fixed phone (land line) in your home          74       44       68       96
A mobile phone (net)                            94       99       98       89
A smartphone that connects to the internet      68       95       87       41
A mobile phone (not a smartphone)               33        8       20       54
A mailbox (net)                                 94       87       96       95
A letterbox                                     91       85       94       94
A post office box                               14       16       15       12
An internet connection (excl. phone)            89       90       95       86
An internet connection at home                  84       82       92       81
An internet connection at work or in a
  public place (e.g. library)                   56       61       78       43
A tablet that connects to the internet
  (by Wi-Fi / 3G / 4G / LTE etc.)               46       46       61       38
Research has shown that response rates are low and dropping in many modes (Bednall et al,
2013), so even when we can contact someone by a particular method, we have to worry about
non-response – that is, are the people who don’t respond the same as, or different from, those who do?
Knowing that people generally respond in the mode they’re contacted in and looking at Table 1
suggests that different people respond in different modes as well.
There is also a matter of preference. Just because people have a means of contact, does not
mean that they will respond in it. Our research both last year and this showed people don’t expect
organisations that they don’t know to contact them by some methods as shown in Chart A.
Chart A: Expected means of contact from organisations and people you know
Base: 859
Q1: How do you prefer people you know to contact you?
Q2: Overall, what is the main way that you prefer organisations to contact you?
Source: Deakin/Wallis Multi Mode Multi frame Omnibus, 2013
People expect communications by mail (electronic or hardcopy) from organisations they don’t know
as well as from people they know. However, when answering their mobile device or landline phone
or viewing an SMS, they are largely expecting it to be a communication from someone they know.
This should be particularly the case for those whose landline is on the “Do Not Call Register”,
though the exemption for charities has meant a large proportion of calls are likely to come from
these sources. Call screening is a likely response, reinforcing the point that people prefer to use
this medium for personal, not business, communication (see Chart E).
Chart B shows these data broken down further by age group. This chart shows the proportion of
people in different age groups who are accessible by a medium and who expect to be contacted by
organisations in this medium.
Chart B: Proportion of people by age group who are accessible and expect
organisations to contact them in this medium
Base: 859
Q1: Which of the following do you have access to?
Q2: Overall, what is the main way that you prefer organisations to contact you?
Source: Deakin/Wallis Multi Mode Multi frame Omnibus, 2013
People of all ages expect organisations to contact them by e-mail above any other means. On the
surface this is good news for people launching surveys online. However, as we know too well,
people are not logical and fail to behave in the way they believe they will.
Further, while organisations with e-mail addresses of their customers or targets can, and do, survey
their populations by this means, there are no good lists available for the whole population. Even
when using those lists that do exist, researchers must take great care that their approach
complies with the relevant legislation, not least of which is the Spam Act 2003.
People’s preference for the online medium has undoubtedly fuelled the growth and use of online
panels as fast, efficient and, hopefully, pleasant means of reaching Australians. Unfortunately,
while access to the internet is now practically universal in Australia, our previous study estimated
that only about one in five Australians had ever registered for an online panel and about one in six
is still on one – fewer again are active. We also demonstrated that people on panels are invited to
complete surveys at very much greater frequency than by other modes and to complete many more
surveys. Nonetheless, the pool of willing respondents in total is quite large.
Chart C Estimated access to the Australian public by all major electronic modes
Source: Deakin/Wallis Multi Mode Multi Frame Omnibus, 2013
There is known to be considerable overlap between panellists – that is, panellists tend to be on
more than one panel. Chart D shows the age profiles of people on the major panels – which are
disguised. It also shows the extent of overlap between the largest eight panels operating at the end
of 2013.
Chart D is interesting in that it shows quite clearly why some online panels are becoming
increasingly protective of younger respondents – there are relatively few of them in
comparison to other age groups – while the over 60s form the largest single age group.
Chart D: Age breakdown of panel members
Base: Respondents on a panel (398)
Q5: Which panel(s) do you belong to?
Source: Deakin/Wallis Multi-Frame Multi-Mode Omnibus, 2013.
Finding young people who are willing to participate in survey research has always been a challenge
and it remains so with panels as well.
It is particularly important in the context of Road Safety to interview this group since it remains
overrepresented in road accidents. The Transport Accident Commission in Victoria has the dual
mandates of reducing road trauma on Victoria’s roads whilst providing financial support to those
who are injured (or killed) on them. To this end, it is important for the TAC to understand the actual
prevalence of attitudes, beliefs and behaviours so that it can deploy its resources appropriately.
Being a public body, the evidence upon which it acts must be transparent and credible.
The TAC has kindly given permission for data from two of its flagship surveys to be shared to
demonstrate how people behave in practice. Some of this data is published, but other data is as
yet unpublished.
The annual Road Safety Monitor (RSM) and the ongoing Public Education Evaluation Programme
(PEEP) are considered by the Victorian government to be of sufficient importance that the TAC has
been given access to the complete database of licensed drivers in order to conduct them. Both
studies recruit participants initially via a letter of invitation to randomly selected Victorian licence
holders. The RSM includes a self-completion questionnaire, and also gives people who do not
respond in that mode the option of completing it online, or waiting for a telephone call and
completing it that way. This survey runs periodically, with sufficient time allowed for data capture
to enable multiple follow-ups that increase the proportion of the initial sample participating (TAC, 2014).
PEEP similarly invites people to participate via a letter, but because it is time sensitive, it asks
Victorian motorists to go online and complete the survey, or wait for a phone call. Interviewing must
be completed within a week.
Address information is accurate, but not all addresses have telephone numbers and both studies
use the Sensis telephone matching service to find numbers where they do not exist.
Table 2: Response rate to TAC RSM and PEEP by age and mode

                  RSM (n=928)                 PEEP (n=5,176)
           Hard Copy   Online   Phone         Online   Phone
               %         %        %              %       %
18 - 25       33        48       19             32      68
26 - 39       37        50       13             41      59
40 - 59       48        36       10             39      61
60+           78        18        4             34      66

Source: Transport Accident Commission, by kind permission.

The response rate amongst the youngest age group is the lowest in both studies. We have not
shown the rates since they are not directly comparable. Table 2 shows that young people will go
online of their own volition, but both studies enjoy success when respondents are phoned. PEEP
does particularly well in this medium because of its timeframe and the fact that the survey is
considerably shorter than the RSM. These data debunk the idea that, in practice, young people
have a preference (and perhaps a greater willingness) to complete surveys online, or indeed in
any one mode.
The information presented so far has demonstrated that there is no single sample frame that
gives equal access to all people.
Is the Dual Frame telephone sample the answer?
Dual frame samples or mobile-only surveys appear to give the highest penetration into the
community and now approximate the “good old days” when everyone had a fixed line phone and
most were listed in the telephone directory. In practice this is not the case.
Just because people have a phone does not mean that they will answer it. Our research both last
year and this showed that many people use blocking tactics to screen contacts from people they
don’t know.
As the left-hand pie in Chart E shows, a slim majority of mobile phone owners try to answer calls
immediately, but the remainder have other strategies. The people who try to answer immediately
also have compelling reasons for doing so – a high proportion of them are tradespeople and small
businesspeople who clearly need to answer their phones to generate or operate their business.
Chart E: Actions taken by people when their mobile phone rings
Base: All respondents with mobile phone or fixed line phone
QD2b / Q1f: When your mobile / landline rings what do you usually do?
Source: Wallis Multi Frame Multi Mode Omnibus Dec 2014, Deakin / Wallis Multi Frame Multi Mode Omnibus 2013
The pie chart on the right hand side demonstrates graphically what happens with fixed line
telephones. Again, while the majority of people try to answer the phone immediately, a third of
people use screening tactics.
Taken together this means that many people will not answer a phone call if they do not recognise
the number calling them or if the organisation or person calling them does not leave a compelling
message. Once again, it is far from guaranteed that everyone does have an equal chance of being
included in this frame, given a large proportion of the public disqualifies itself from answering
unsolicited phone calls.
There are other problems with incorporating mobile phones:
Ethical - it is imperative to make sure that any respondent is physically (and otherwise) safe
to answer any call, but with mobiles the challenges are greater. Most of the samples
available do not tell us the location of the phone we are calling, so we must
be careful not to call people outside the hours permitted by law.
Respondent Goodwill - our industry has done itself no favours with the public by foisting
long, often boring and often irrelevant studies on it. In mobile mode it is incumbent on us
to do the right thing: keep to interviewing length guidelines and make the experience
good. As we have seen, the public is not expecting to be called via this medium to
participate in market research.
Dual frame interviewing is certainly an advance on many of the other frames available and is
becoming more widely used in Australia and overseas.
Should we go back in the future?
Referring back to Table 1 shows that virtually all Australians have access to a mail box. We have
shown that this technique can be very useful – particularly if the survey is short, simple, pleasant
or extremely relevant to the individual – and for making initial contact. It can be very effective
on its own where the contact details of the population of interest are available, such as a customer
listing. It does not work so well if you’re simply writing “to the householder” with a long and tedious
questionnaire on something that is not a high priority for the recipient. Nonetheless, there are many
personalised mailing lists available, and for some surveys mail remains a viable technique. However,
its known flaws of low response rates and slowness, quite apart from recent increases in mailing,
paper and administration costs, have relegated it from the mainstream.
Face to face interviewing is an old technique which is enjoying something of a revival. We used it
to excellent effect to interview young people in our latest multi-mode, multi-frame omnibus.
Interviewers can be located where respondents are likely to congregate, making it an efficient
means of finding respondents of certain types. Tablet and notebook computers have replaced pen
and paper, bringing the benefits of computerised scripts to the streets, and people, particularly
young people, are so taken aback at being approached that a surprising number agree to an
interview. Nonetheless, the costs involved in using only face to face interviewing as a means of
gaining information from the entire Australian public mean that this is not viable in most cases. Like
self-completion mailed surveys, it is a useful addition to the arsenal rather than a magic pudding.
The future – Take me to the river?
New technology allows access to a wide array of electronic communications that can be fished,
skimmed or accessed. DIY survey packages encourage researchers to answer their questions in a
variety of ways. The more reputable operators mention the need for good sampling practices, and
some supply respondents to researchers who have no suitable sample available to them. Some
encourage putting survey links in places where they are likely to be seen by potential respondents –
or river sampling.
River sampling is a tricky business. Just as a fast flowing river can fill a receptacle placed into it
very quickly, so can a well-placed link to a survey be filled quickly by responses. The issue is that
the type of respondent will differ depending on where the survey link is placed – and there is
generally little way of ensuring that people responding have an equal chance of being included in
the first place. The name river sampling is apt – any hydrologist will tell you that the composition of
the river is highly variable and what you capture in your net or bucket will vary greatly depending on
which part of the river it is placed in. The same is true in survey research. As with other means of
accessing the public, using river sampling as a means of finding a specific group of people can be
valid, but as a proxy for the entire population it is clearly skewed towards the stream that it is placed
in. This may well be good enough to give a sense of general sentiment (and it is usually bad
sentiment that permeates the ether), but as a basis of systematic research, it is limited.
Using pop-ups and placements within advertorials to capture the views of in scope respondents can
also be problematic. Our most recent piece of research has shown that a high proportion of people
using the internet employ a range of strategies to block what they see. Over a quarter of
Australians, for example, block ads and pop ups regularly and this rises to four in ten people aged
under 30.
The current gold standard
Multi-frame interviewing offers the best way to access all members of the Australian public. One of
the key challenges for surveys that use multiple sampling frames is the possibility that an individual
might be in-scope in more than one of the sampling frames. This means that not all people in the
starting sample have an equal probability of being selected, which causes problems for
statisticians both in working out how accurate estimates are and in how to weight the data to correct
for sampling biases (Ansolabehere & Shaffner, 2014; Hu et al., 2014; Maia, 2011; Pfefferman &
Rao, 2009).
Much work has gone into appropriate ways of weighting the data, using a range of methods to assess
the probability of contact and normalise for it (Barr et al., 2014; Berzofsky et al., 2009; Callegaro et
al., 2011; Lavrakas, 2013; Lohr, 2000–2011; Ridenhour et al., 2013; Yeager et al., 2011).
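One standard adjustment from that literature can be sketched in a few lines. This is a minimal illustration of frame-overlap (composite) weighting, not the method used in the authors' studies; the sampling fractions are illustrative values, and a real study would derive them from frame and sample sizes, typically via screening questions asking which frames a respondent belongs to:

```python
def dual_frame_weight(on_landline: bool, on_mobile: bool,
                      p_landline: float, p_mobile: float) -> float:
    """Inverse of a unit's overall chance of selection across two frames.

    p_landline / p_mobile are per-frame sampling fractions
    (illustrative values only, assumed known from the design).
    """
    if on_landline and on_mobile:
        # In both frames: chance of being reached by either draw.
        pi = p_landline + p_mobile - p_landline * p_mobile
    elif on_landline:
        pi = p_landline
    elif on_mobile:
        pi = p_mobile
    else:
        raise ValueError("unit not covered by either frame")
    return 1.0 / pi

# A dual-frame user is down-weighted relative to a mobile-only respondent,
# because they had roughly twice the chance of ending up in the sample.
print(dual_frame_weight(True, True, 0.001, 0.001))   # about 500.25
print(dual_frame_weight(False, True, 0.001, 0.001))  # 1000.0
```

The design choice being illustrated: weighting by the inverse of the overall inclusion probability restores the "equal chance" logic of simple random sampling that frame overlap breaks.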
In both of our multi-mode surveys we found that the closest approximation to ABS published
statistics (and other published data that we had captured as a sanity check) was achieved simply by
adding data from the different sample frames together. The reason for this seems to be that while
people may have many means of being contacted, they have a mode preference both for contact
and completion. Talking with respondents in our latest survey also suggested that most people
would be available and willing to answer in one mode only.
Having said this, we’re not suggesting that the answer is to use so many different starting samples
that it is impossible for members of the public to escape the net, or that we don’t take the possibility
of overlap seriously. We are continuing our work on best practice in handling multi-frame (and
multi-mode) data for community based surveys, and commend the references we’ve provided to
readers attempting what is still a new and very complex means of surveying. However, in our view,
multi-framing is the best means currently available to give thorough access to the entire Australian
public.
The following Chart demonstrates our best estimate of the way in which the population can be
contacted by mode:
Chart F: Means of contacting the Australian Public in Practice by Mode
Source: Deakin/ Wallis Multi Frame Multi Mode Omnibus 2013/2014
How many people should we speak with to be “good enough?”
We’ve demonstrated that there is no one sample or mode of interviewing that gives access to the
entire public. However, clearly this is not always necessary.
For specific audiences, different samples work well – for example, it is possible to gain the
co-operation of older females through fixed line telephones, and tradespeople by calling mobile phone
numbers during the working day. Of course, where research is to be conducted with a known
population, listings of customers, service users or potential customers give the best starting sample
of all, although we suggest that to gain the opinions of the widest range of people it is necessary to
have multiple contact points – address, phone numbers and e-mail address.
In the context of making estimates of the wider population, though, we return to statistics 101 and
pose the question, “how many people do we need to speak with to give reliable estimates?” In a
true random sample the answer is usually 300, because this gives estimates accurate to within
about ±4-6% at the 95% confidence level, or, if you really want to analyse sub-groups within the
population, 1,500, because overall error is then reduced to about ±2-3% at the 95% confidence
level and you can analyse up to five evenly sized sub-groups with reasonable precision. Or can we?
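The arithmetic behind those rules of thumb can be checked directly. The sketch below uses the textbook margin-of-error formula with the worst-case proportion p = 0.5 and z = 1.96 for the 95% confidence level:

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    # Half-width of the 95% confidence interval for a proportion
    # under simple random sampling; p = 0.5 is the worst case.
    return z * math.sqrt(p * (1 - p) / n)

print(f"n=300:  +/-{margin_of_error(300):.1%}")   # about 5.7%
print(f"n=1500: +/-{margin_of_error(1500):.1%}")  # about 2.5%
```

Note the formula assumes a true simple random sample, which is precisely the assumption the next section calls into question.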
The more I find out the less I know?
Where multiple sample frames are used, there is highly likely to be considerable overlap as most
people will appear in more than one frame. Thus our ideal simple random sampling paradigm
cannot apply as people will not have an equal chance of being selected. This raises two questions.
Firstly, as we build multi-frame samples, how do we determine accuracy? Secondly, if the sample
frame is skewed, will interviewing more people give a more accurate result or support more detailed
analysis?
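One common answer to the first question is Kish's design-effect approximation, which converts the variability of the final weights into an "effective" sample size. The sketch below uses purely illustrative weights, not figures from our studies:

```python
def effective_sample_size(weights: list[float]) -> float:
    # Kish's approximation: deff = 1 + cv^2 of the weights,
    # effective n = actual n / deff.
    n = len(weights)
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / n
    deff = 1 + var / mean ** 2
    return n / deff

# Equal weights: no precision is lost.
print(effective_sample_size([1.0] * 1000))            # 1000.0

# Highly variable weights: 1,000 interviews behave like far fewer.
print(effective_sample_size([1.0] * 500 + [4.0] * 500))  # about 735.3
```

This also answers the second question: if the extra interviews come from the same skewed frame, they inflate n but not the effective sample size by nearly as much.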
To assist with both, it is clearly important to have a means of sense-checking any information, so
surveys should be designed to gather data that supports such checks. Unfortunately, as we’ve seen,
with most starting samples that do not have total coverage (and none really do in terms of likely
response) this cannot be demographic information. It is possible to make four people,
if chosen carefully, represent the whole of Australia – but no-one is seriously thinking that four
people can be relied upon to give a robust result. However, they may be relied upon to act as
expert witnesses.
Table 3 shows a comparison of answers to the question, “What are young Australians’ favourite
foods?”
Table 3 Australia’s favourite foods amongst people aged 18 – 29
The results are remarkably similar. Two of these (Jamie Oliver and Matt Preston) are based on
personal opinions formulated over many years of visiting and living in Australia as well as their
views on popular culture here. The Herald Sun Poll is keen to tell readers the number of
respondents to the survey as a means of demonstrating statistical rigour.
| Jamie Oliver (N=1) | Matt Preston (N=1) | Herald Sun Poll, May 2015 (PureProfile, N=1,000) |
|---|---|---|
| Pavlova | Seafood platter | Vegemite |
| Fish and chips | Burger | Meat pie |
| Potato cakes | Mixed grill | Pavlova |
| Dim sims | Fish and chips | Steak |
| Steak | Pumpkin soup | Macadamia nuts |
| Sausage rolls | Salads | Lamingtons |
| Burgers | Avo and vegemite on toast | Kangaroo |
| Lamingtons | Sticky date pudding | Chiko Rolls |
| Pie | Pavlova | Dagwood Dogs |
| BBQ Shrimp | Tim Tam slam | Iced Vovos |
Ever since the Oracle at Delphi the opinions of experts and elder statesmen have been used to
provide guidance. The Delphi technique is still very much used and is a useful tool – but it depends
upon drawing on past knowledge and wisdom.
While the past is often the best predictor of the future, “think tanks” don’t always get it right and nor
do the experts. They will be closer where the answers are generally known. It’s unlikely that the
results presented in Table 3 were a surprise to the reader. However, right or wrong, the data are
unlikely to be of great importance to anyone except dietitians or anyone wanting to set up a
fast-food chain. Happily, according to the ABS3, what we like and what we eat are very different.
However, on the opposite side we see how very wrong statistically “robust” opinion polls can be on
such matters as voting intentions, as was witnessed in the UK earlier this year. The polls were so
wrong that the UK has initiated a public inquiry, with results expected by the middle of 2016.
Something was clearly amiss with the sampling, the mode or the invitation to participate – or
perhaps the population really had no idea what it was going to do at the time of asking and was too
polite to refuse to answer.
Random Sampling or Sampling at Random?
We have demonstrated that most sampling frames available for conducting community surveys do
not give each person in the frame an equal chance of participation, yet the statistics we apply to
determine their accuracy rely on this. The problem is compounded across frames (Lohr, 2011).
Weighting may help if we have a reasonable method for estimating frame overlap (Barr et al.,
2014), but that is not always available.
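One classical repair, where sampling fractions and frame membership are known, is Hartley-style compositing: overlap respondents have their weights scaled by a mixing factor so the overlap domain is not counted twice. A simplified dual-frame sketch (equal-probability sampling within each frame is assumed; real designs such as Barr et al.'s are more involved):

```python
def composite_weight(frame: str, in_overlap: bool,
                     f_a: float, f_b: float, theta: float = 0.5) -> float:
    """Design weight for a respondent sampled from frame 'A' or 'B' at
    sampling fractions f_a / f_b. Overlap respondents (listed in both
    frames) are down-weighted by theta (frame A) or 1 - theta (frame B)
    so the overlap domain is counted once in total."""
    base = 1 / f_a if frame == "A" else 1 / f_b
    if not in_overlap:
        return base
    return theta * base if frame == "A" else (1 - theta) * base

# Landline-only respondent vs. someone reachable in both frames
print(composite_weight("A", in_overlap=False, f_a=0.01, f_b=0.05))
print(composite_weight("A", in_overlap=True, f_a=0.01, f_b=0.05))
```

Choosing theta well (for example, in proportion to each frame's effective sample size) is where the real design effort lies.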
In this brave new world of non-probability sampling, we believe it is time to consider alternative
ways of stating how accurate information is.
Does it really matter whether we state that data are accurate to within +/- x% at a particular
confidence level, or is it just as useful to give some other means of determining the reliability or
variability of the data? Where some existing data are available, it is possible to use Bayesian
estimation to give credibility intervals, and this approach is growing in popularity (Roshwalb et al, 2012).
However, it is not possible where there are no existing data to use as the basis for estimation.
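Before turning to the no-prior-data case, the Bayesian route just described can be sketched with a conjugate Beta prior. Everything below (the flat prior, the 285-of-859 split) is illustrative, not taken from Roshwalb et al.:

```python
import random

random.seed(1)

def credible_interval(k: int, n: int, prior_a: float = 1.0,
                      prior_b: float = 1.0, level: float = 0.95,
                      draws: int = 100_000):
    """Credible interval for a proportion: a Beta(prior_a, prior_b) prior
    with k successes in n trials gives a Beta(prior_a + k, prior_b + n - k)
    posterior, summarised here by Monte Carlo percentiles rather than an
    exact quantile function."""
    post = sorted(random.betavariate(prior_a + k, prior_b + n - k)
                  for _ in range(draws))
    lo = post[int((1 - level) / 2 * draws)]
    hi = post[int((1 + level) / 2 * draws) - 1]
    return lo, hi

lo, hi = credible_interval(285, 859)  # e.g. 285 of 859 report brown eyes
print(f"95% credible interval: {lo:.3f} to {hi:.3f}")
```

An informative prior built from earlier waves would simply replace the flat Beta(1, 1), narrowing the interval as the prior evidence grows.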
In this case, two possible techniques are bootstrapping and jackknifing. The jackknife was
originally suggested in 1958 as a means of comparing results from very small sample sizes
(Tukey, 1958), and the idea was later generalised as the bootstrap (Efron, 1982). These
techniques work by looking at the variability within the data. They used to be arduous to apply, but
most statistical packages now have them built in and they are enjoying something of a resurgence
as a result (Lohr, 2010). Here is an example of how bootstrapping can be applied using data on
eye colour from the 2013 multi-mode, multi-frame study. People were
3 ABS Cat No: 4364.0.55.007 - Australian Health Survey: Nutrition First Results - Foods and Nutrients, 2011-12
asked to describe their eye colour. The table excludes those people who would or could not say.
The figures relate to the percentage of the whole sample that gave that response.
Table 4 Bootstrapped estimates of eye colour
Base: 859
Qi1: What colour are your eyes?
Surprisingly, we can find no Australian data, so we can say ours are the definitive estimates! Some
US estimates are provided for interest4. The bootstrapping procedure takes account of the gender
and age strata in our samples. Because it uses a resampling with replacement technique, it models
a range of possibilities, including possible samples drawn largely from a single mode. The table
shows our best estimates from simply combining the samples, alongside our bootstrapped 95%
confidence intervals.
Moving to this approach allows us to talk about ranges within which results should be considered
“accurate” rather than putting a range around a number per se. This can work in the same way in
practice, but requires a mind shift away from displaying results as statements of fact or absolute
truths towards results that tell a story with a general margin of reliability.
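A minimal, unstratified sketch of the percentile bootstrap behind Table 4 (our actual procedure also respects the gender and age strata; the 0/1 data below are illustrative, chosen to match the combined brown-eye estimate):

```python
import random

random.seed(2)

def bootstrap_ci(sample, reps: int = 2000, level: float = 0.95):
    """Percentile bootstrap for a proportion coded as 0/1 values:
    resample with replacement, recompute the share each time, and
    read the interval off the sorted resampled estimates."""
    n = len(sample)
    shares = sorted(sum(random.choices(sample, k=n)) / n
                    for _ in range(reps))
    lo = shares[int((1 - level) / 2 * reps)]
    hi = shares[int((1 + level) / 2 * reps) - 1]
    return lo, hi

data = [1] * 285 + [0] * 574   # illustrative: 285 of 859 "brown" answers
lo, hi = bootstrap_ci(data)
print(f"bootstrapped 95% interval: {lo:.3f} to {hi:.3f}")
```

Because each replicate is a full resample of the observed data, the interval reflects the variability actually present in the sample rather than an assumed simple random sampling model.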
Interpretation is the key?
Based on the statistically sound Ipsos MORI Poll the following headline appeared in the UK
recently:
Today’s key fact: you are probably wrong about almost everything
Most people around the world are pretty bad when it comes to knowing the numbers behind the
news. But how issues such as immigration are perceived can shape political opinion and
promote misconceptions.
4 see http://brandongaille.com/eye-color-percentages-and-statistics/
| Colour | Fixed | Mobile | Panel | Total | Bootstrapped 95% CI (lower) | Bootstrapped 95% CI (upper) | Data from US |
|---|---|---|---|---|---|---|---|
| Brown | 29.9% | 35.8% | 33.2% | 33.2% | 30.3% | 36.1% | 41.0% (Brown) |
| Blue | 32.4% | 28.7% | 33.5% | 31.5% | 28.6% | 34.5% | 47.0% (Blue/Grey) |
| Green | 14.3% | 13.2% | 11.6% | 12.9% | 10.7% | 15.1% | 12.0% (Green) |
| Hazel | 17.6% | 12.5% | 13.5% | 14.3% | 12.1% | 16.6% | |
| Grey | 0.8% | 2.7% | 4.7% | 2.9% | 1.9% | 4.1% | |
| Other | 1.6% | 4.1% | 2.2% | 2.7% | 1.6% | 3.8% | |
A number of questions were asked of people around the world. Chart G shows the actual
proportion of migrants in each country as the burgundy bar, with the estimated proportion marked
at the end of the bar. The gap between the two, shown in orange, is the difference between the
actual and estimated figures. The figures represented in this chart are almost certainly accurate -
however accuracy is measured.
Australia has the highest proportion of migrants in its population, and its respondents made the
most accurate estimate of this proportion. Canada has a lower actual proportion of migrants, yet
Canadians estimated a proportion similar to the Australians’ estimate. The US, Italy, Belgium and
France all made estimates of around 30% - but the reality is quite different. In every country, the
estimate is higher than the actual proportion.
Chart G Difference between actual and estimated migration rates
Q: Out of 100 people how many do you think are migrants to this country?
Source: Ipsos MORI
Bobby Duffy, Managing Director of Ipsos Mori Social Research Institute said about this data:
“The real peril of these misperceptions is how politicians and policymakers react. Do they try to
challenge people and correct their view of reality or do they take them as a signal of concern, the
result of a more emotional reaction and design policy around them?
Clearly the ideal is to do a bit of both – politicians shouldn’t misread these misperceptions as people
simply needing to be re-educated and then their views will change – but they also need to avoid
policy responses that just reinforce unfounded fears.”
As Bobby rightly points out, it is how the data are used that is the issue, not the facts themselves.
So size doesn’t really matter as much as careful design and reporting. The actual numbers in this
example are less important than the fact that there are gaps at all and what underpins them.
Which brings us back to PURPOSE!
So how good is good enough?
Why is the information needed?
How will it be used?
These are the two most important questions we need to ask ourselves as researchers and indeed
this is absolutely common to both marketing intelligence and market and social research.
They can also guide us in deciding how to design studies fit for purpose.
Clearly at the top end are data-gathering exercises that must produce unimpeachable results. The
most obvious examples are the various surveys conducted by the Australian Bureau of Statistics,
with the mother of them all being the Census of Population and Housing.
A census is a very expensive exercise. The 2011 census is reported to have cost about $440
million. However, if we didn’t have it and just one new major piece of infrastructure was built in the
wrong place as a result, it would cover the savings! On a household basis this is just under $60 per
household to give accurate information for governments at all levels and business to use in their
planning – and for market and social researchers to use as a sanity check or weighting base!
Could it be less accurate? Australia is acknowledged as having census information of extremely
high quality, justified on the basis of a rapidly growing population with rapidly changing needs.
Earlier this year, when the government floated the idea of scrapping the census, there was
a large amount of negative comment. Those plans appear to have been shelved.
The ABS has grasped the need to get everyone involved and is able, with force of law, to compel
households to respond. Nonetheless, households are not the only sampling frame used: the census
goes beyond them to capture homeless people. The census can also be completed in hard copy
or online. It is, therefore, a multi-frame, multi-mode study.
Next in order of importance are studies where it is imperative to make absolute estimates of the
prevalence of behaviour, attitudes or beliefs. Studies such as those mentioned earlier for the TAC
fall into this category, as do studies on the state of public health and actual rates of crime. Most of
these studies are now carried out in Australia by census-like approaches or, where sampling is
involved, by the use of multiple starting sample frames.
At the other extreme, when making a small extension to a product line it may be less expensive to
use an experimental design and simply place the product with some potential users or in store,
rather than going to the expense of a full blown survey to gauge uptake. For example one of the
authors was asked the following question some years ago while working as a Market Research
Manager for a major confectionery manufacturer…
Should Freddo frog wear a bow tie?
The use of market research funds to answer such a question was ridiculous. All that needs to be
done is to watch a number of children eat a Freddo. The answer is simple, children (and some
adults) eat Freddo head first or feet first – most would have no idea what he wears (or if he wears
anything). A more relevant question is whether the cost of changing the mould to make Freddo
with a bow tie could be justified. Clearly it could not be.
The prevalence and ease with which many surveys can be completed has led some organisations
to collect data, sometimes enormous amounts of it, because they can. More is not better if the
information is not captured appropriately, or systematically.
Collecting information quickly is not necessarily a good thing. For example, in assessing
community attitudes towards such things as new developments in local areas, the most vocal
community members are those who are strongly opposed or strongly in support of the development.
If surveys deployed by any means are run overnight or very quickly, they will pick up answers from
the people who have vested interests, not the balanced view that comes out in time, when everyone
is given a chance to participate and give their opinions.
So it comes down to the risk involved in the decision to be made. In general terms, the bigger the
risk (and that is not only financial) that may be involved in the decision, the more accurate a study
should be. We’re defining accurate as:
“The ability to match the population of interest as closely as possible with those questioned, and to
ask sufficient people to make their answers reproducible.”
Obviously, the more that is known about the characteristics of the population the easier it is to
match them. Conversely, the less that is known about that population, the harder this is to do and
the more compelling the argument for using all methods at the researcher’s disposal.
In our view we should always strive for the best and in doing this recognise that “good enough” in
the context of survey research is a function of:
1. How much we know already
2. How risky the decision is that needs to be made and therefore how much tolerance we have for
considering possible decisions on the basis of confidence intervals rather than a single estimate
3. How quickly we need to make the decision
4. How much money we are prepared to spend to make the decision
There is nothing new in this and researchers have been trading these factors off officially in
Australia for 60 years. What has changed is that we generally have some information available to
us as a starting point, the speed with which decisions are being made is increasing and the relative
size of the budget available is reducing.
Conclusions and Recommendations
In this paper we have attempted to point out the shortcomings in all survey techniques but also to
offer some solutions.
The information presented here causes us to conclude that:
1. No one sample frame gives total coverage of the Australian community, and even those that
appear to do so fall short because of response-mode preferences.
We recommend that researchers use multiple sample frames wherever possible when
attempting to make estimates for the entire community and understand the limitations
where this is not possible. We recommend challenging the use of probabilistic statistics in
stating the error margins on single frame or single mode samples and using alternative
means of understanding variability in the data.
2. Samples do not need to be big to be reliable - they should reflect the population of interest as
closely as possible.
Where the characteristics of the population are not known, we highly recommend including
questions that have known answers available from other sources that can be used for the
purposes of cross referencing to gauge the extent of the fit – in other words as a sanity
check.
3. The bigger the decision, the more accurate the evidence needs to be.
We recommend matching the sample to the size of the decision in both its coverage and
accuracy. The accuracy of the estimates can be calculated in a number of ways - we do not
need to rely on probability sampling to give an idea of the reproducibility and variability of
the data.
4. Research based on known users or potential users of a product or service provides the best
starting sample but is still not immune to biases – including those introduced if only a single
mode of completion is allowed.
We recommend that respondents should be allowed to respond in the way or ways that
suit them best.
5. The new gold standard is a single listing of the population of interest together with accurate and
up-to-date contact details for multiple ways of contacting that individual.
In practice we seldom have such lists, so if budget and time permit:
- Build in multi-mode contact
- Build in multiple frames
If using a single frame or a single mode, be careful how the findings are reported. In most
cases it is no longer possible to estimate error limits based on random sampling principles;
however, we can provide guidance on the level of variability of the data.
6. It is good enough if the research design gives results that are within the risk level for the
decision to be made.
We recommend an open debate between people commissioning and using research to
ensure that the uses of the information and its potential limitations are clearly identified.
We conclude with the following table and diagram, which offer a quick ready-reckoner for the
optimum approach, to help researchers design research that is good enough and fit for purpose
every time.
We have this within our power – it is up to all of us to be passionate about good practice - and make
it happen now and into the next 60 years!
Table 5: Samples and their uses and abuses
Chart H: Guide to How Good is Good Enough?
References
Ansolabehere S & Schaffner BF (2014) Does Survey Mode Still Matter? Findings from a 2010
Multi-Mode Comparison, Political Analysis, 22(3), 285-303.
Barr ML, Ferguson RA, Hughes PJ & Steel DG (2014). Developing a weighting strategy to include
mobile phone numbers into an ongoing population health survey using overlapping dual-frame
design with limited benchmark information. BMC Medical Research Methodology, 14(102), 1-9.
doi:10.1186/1471-2288-14-102
Bednall D, Van Souwe J, Fine B & Bishop B (2014) Access all people: The 3M Project AMSRS
Conference, Melbourne, Victoria, Australia
Bednall D, Spiers M, Ringer A & Vocino A (2013), Response Rates in Australian Market Research,
Deakin University, School of Management and Marketing, Melbourne, Vic.
Berzofsky M, Williams R & Biemer P (2009) Combining Probability and Non-Probability Sampling
Methods: Model-Aided Sampling and the O*NET Data Collection Program Survey Practice 2 (6)
Callegaro M, Ayhan O, Gabler S, Haeder S & Villar A (2011) Combining landline and mobile
phone samples – A dual frame approach. GESIS – Leibniz-Institut für Sozialwissenschaften, 13.
Efron B (1982) The Jackknife, the Bootstrap and Other Resampling Plans, Philadelphia:
Society for Industrial and Applied Mathematics
Hair JF & Lukas B (2014) Marketing Research. Fourth Edition. McGraw-Hill Education, Sydney,
Australia.
Hu SS, Balluz L, Battaglia MP & Frankel MR (2011). Improving public health surveillance using a
dual-frame survey of landline and cell phone numbers. American Journal of Epidemiology,
173(6), 703- 711. doi: 10.1093/aje/kwq442
Lavrakas PJ (2013) Recent Developments in Dual Frame RDD Surveys. Presentation to the
Australian Market and Social Research Society, Melbourne, Victoria, Australia.
Lohr SL (2011). Alternative Survey Sample Designs: Sampling with Multiple Overlapping Frames.
Survey Methodology, 37, 197-213.
Lohr SL (2010) Sampling: Design and Analysis Second Edition. Arizona State University,
Tempe, Arizona, USA.
Lohr SL & Rao JNK (2006). Estimation in Multiple-Frame Surveys. Journal of the American
Statistical Association, 101, 1019-1030.
Lohr SL & Rao JNK (2000). Inference in Dual Frame Surveys. Journal of the American Statistical
Association, 95, 271-280.
Maia M (2009) Indirect Sampling in Context of Multiple Frames JSM 1769-1777
Pfeffermann D & Rao CR. (Eds.) (2009) Sample Surveys: Design, Methods and Applications,
Vol. 29A, (pp. 71-88) Elsevier, The Netherlands: North-Holland.
Ridenhour J, Berzofsky M, Couzens G L, Blanton C, Lu B, Sahr TR & Ferketich A (2013) Most
efficient weighting approach in dual frame phone survey with multiple domains of interest AAPOR
Annual Conference¸ Boston, Massachusetts
Roshwalb A, El-Dash N & Young C (2012) Towards the Use of Bayesian Credibility Intervals in
Online Survey Results IPSOS Public Relations
Simon JL (1997). Resampling: the New Statistics. Second Edition Resampling Stats.
Tukey JW (1958) Bias and Confidence in Not Quite Large Samples, Annals of Mathematical
Statistics, 29, 614.
Yeager DS, Krosnick JA, Chang L, Javitz HS, Levendusky MS, Simpser A & Wang R (2011)
Comparing the Accuracy of RDD Telephone surveys and Internet Surveys conducted with
probability and non-probability Samples, Public Opinion Quarterly, 75 (4), 709-747.