
Online Opinions – A Pilot to Extend Social Data Collection Capabilities within the Office for National Statistics

Kathryn Ashton and Ed Dunn

1. Summary

In the face of falling response rates, increases in traditional face-to-face social data collection costs and challenging public sector efficiency targets, there is increasing pressure to review and improve methods of social data collection. This has led to the exploration of the internet as a tool for social data collection.

In November 2008 the Office for National Statistics’ (ONS) Social Survey Division ran a pilot on the Opinions survey (OPN - previously known as the Omnibus survey) providing respondents with the option to complete on-line rather than via a traditional face-to-face interview. Key objectives of the pilot were to determine the unit and item response rates that might be obtained, gain valuable experience of designing and implementing a web survey, and to investigate the characteristics of respondents who would respond on-line.

The pilot demonstrated that it is possible to get a substantial, if minority, response from a web survey. The pilot obtained responses from around 20 per cent of issued addresses. However, the respondent profile of those completing on-line does appear to be substantially different to a standard face-to-face respondent. The results suggest that internet-only surveys risk substantial bias in estimators but that the internet could possibly be used to supplement traditional methods of data collection. However, further work would be required to establish if this would cause mode effects in the quality of the data collected.

2. Introduction

There is a combination of pressures which have led ONS to explore the use of web surveys as part of a mixed mode approach to social data collection. Challenging public sector efficiency targets and the increasing costs and difficulty of face-to-face data collection have both made web-based data collection an increasingly attractive mode for government social research surveys. In addition, over the past few years, there has been a steady decline in the response rates of our major social surveys. Together these pressures have enhanced the appeal of web-based data collection, both as part of a mixed mode approach and as a potential route into online-only surveys.

While face-to-face, telephone and paper-based interviewing are well established activities, the development of web-based data collection within ONS has not, so far, extended beyond consideration of the significant and challenging issues it poses. As Flatley (2001) indicated, much development work is needed, and key methodological issues and concerns have to some extent prevented further exploration of web-based data collection.

This paper will highlight the methodological issues with web surveys, outline the key objectives of the pilot, describe the methodology used and present the main findings.



3. Methodological Issues with Web Surveys

There are a number of methodological issues and concerns surrounding the use of web surveys as a method of data collection within ONS national household surveys. Household surveys have several key characteristics:

- the use of random probability sampling to gain a representative sample

- the questionnaire is often very long and questions require complex routing

- they require a high response rate in order to produce results with the required accuracy and precision

- questions frequently require responses from all members of the household

All these characteristics are difficult to achieve in a web survey. Past research (Fricker and Schonlau 2002) indicates that only a certain sub-set of the population, with similar characteristics, will respond to a web survey. This can bias the responses obtained: for example, higher-income households may be more likely to complete on-line than lower-income households, or those from certain geographical areas more likely than others.

Random probability sampling is difficult to achieve in a web survey. To date, there is no sampling frame of those within the UK who use, or are able to use, the internet. This means that different sampling methods have to be employed when designing a wholly internet-based survey.

There is a wide choice of software for designing a web-based survey, ranging from a simple HTML approach to more complex programming.

As in all methods of data collection, security is a key concern with internet surveys. Due to recent data security issues within government, respondents may be more sensitive about disclosing personal details over the internet, and this is an aspect which the design of an internet survey must attempt to overcome. It is important that the software used complies with relevant data security and confidentiality regulations, not only when storing the data, but also during transmission.

Inevitably, the development work required can be both costly and time-consuming.

4. Pilot Objectives

Key objectives of the web pilot were to:

- determine the unit and item response rates that might be obtained using a web-based survey as the preferred approach

- investigate the characteristics of people responding to the on-line survey and, where possible, compare to the face-to-face mode of the OPN survey

- evaluate the on-line software used and determine our current hardware and software capabilities in this area, including consideration of data security issues for web-based surveys

- examine the effect of offering an incentive to half of the sample on unit and item response rates


- gain valuable experience in translating current computer assisted personal interview (CAPI) programmes to a computer assisted web interview (CAWI) approach

- gain experience in designing and launching a CAWI survey and the logistical requirements involved

- where the scale of the pilot allows, ensure that the needs of OPN customers, in terms of data quality, quantity and timeliness of delivery, can continue to be met

- deliver the pilot to a short timescale (by December 2008) in order to feed into a review of social data collection within ONS and provide recommendations for further work in this area

Given the limited nature and short timescale for the pilot, the objectives did not seek to:

- investigate the multitude of mode effects and question/response ordering effects that may occur

- make robust comparisons between the web pilot and the 'normal' face-to-face OPN survey being run in parallel

- test the impact of different incentives

- test the impact of alternative advance letter design or format

- test the impact of alternative question wording

- investigate alternative designs and formats of the web survey

5. Sample Design and Respondent Selection

5.1 Sample Design and Respondent Selection of the Web Pilot

The web pilot took a simple random sample of 2,000 addresses from the Postcode Address File (PAF). By using a simple random sample, all cases within the sampling frame had an equal chance of selection, giving the highest level of confidence in the unit response rate obtained. As previously noted, there is no sampling frame available of those who have access to, or are able to use, the internet. Only English households were selected, since an existing sampling program was available to use and there was no capacity to amend it to include Scottish, Welsh or Northern Irish households.

Half of the sample (1,000 addresses) were selected at random and offered an incentive to take part on-line. Respondents in the other half of the sample were offered no incentive to take part. The incentive offered was a £10 Amazon 'electronic' gift certificate which was emailed to the respondent on receipt of their fully completed response.
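
As an illustration of this design, the sketch below draws a simple random sample and allocates half of it at random to the incentive group. It is a minimal sketch under stated assumptions: PAF_ADDRESSES is a hypothetical stand-in for the address frame, which is not publicly available.

```python
import random

# Hypothetical stand-in for the PAF sampling frame.
PAF_ADDRESSES = [f"address_{i}" for i in range(1_000_000)]

random.seed(2008)  # reproducible draw

# Simple random sample of 2,000 addresses: every entry on the frame
# has an equal probability of selection.
sample = random.sample(PAF_ADDRESSES, 2_000)

# Randomly allocate half of the sample to the incentive group.
random.shuffle(sample)
incentive_group, no_incentive_group = sample[:1_000], sample[1_000:]
```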

Each household within the sample was sent a modified version of the Integrated Household Survey (IHS) advance letter, inviting them to take part in the 'Opinions Survey' and giving some basic details about the survey and its uses. The on-line aspect was heavily emphasised, but the letter also stated that if they did not complete on-line by 1 December, an interviewer might call to conduct the survey in person. The letter contained the website address 'www.ons.gov.uk/takepart', a unique 12 character 'userid' (in the format CPS08123456x) and a five character password (in the format x12yz) for each respondent. The website 'www.ons.gov.uk/takepart' contained some basic completion information for respondents and a direct link to the on-line survey.

The letter asked the adult with the most recent birthday to complete the survey on-line. This is a method of random selection of the respondent within the household (Dillman 2007). It is important to note that no attempt was made to evaluate the effectiveness of this response selection method within this pilot.

Half way through the field period, a follow-up letter was sent to non-responders, although it was worded as if addressed to all sampled households. At the survey close, a further letter was sent to non-responding households informing them that their participation was no longer required and that an interviewer would not be calling.


5.2 Sample Design and Respondent Selection of the Opinions Survey (OPN)

The OPN face-to-face survey differs from the web pilot in that it uses multistage cluster sampling, as opposed to simple random sampling. The November OPN, which was in the field at the same time as the pilot, had a sample size of 2,010 households.

The face-to-face OPN survey uses a Kish grid to select a respondent. This involves constructing a list of eligible household members, based on their age, then selecting the respondent using the address’ serial number.
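
The sketch below illustrates the general logic of this kind of within-household selection: eligible members are listed in a fixed order and one is picked deterministically from the serial number. It is a simplification; the actual selection tables used on the OPN are not reproduced here.

```python
def select_respondent(household_members, serial_number):
    """Kish-style within-household selection (simplified sketch).

    Eligible adults are listed in a consistent order (here, oldest
    first) and one is selected deterministically from the address's
    serial number. The real OPN grid uses pre-assigned selection
    tables rather than a simple modulus.
    """
    eligible = sorted(household_members, key=lambda m: m["age"], reverse=True)
    return eligible[serial_number % len(eligible)]

# Hypothetical example household of three eligible adults.
members = [{"name": "A", "age": 52}, {"name": "B", "age": 47}, {"name": "C", "age": 19}]
print(select_respondent(members, serial_number=123456))
```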

6. Questionnaire Design

The pilot used a set of questions and question blocks designed to be broadly equivalent (in terms of questionnaire length and the variety of questions included) to the November OPN survey. However, the full Integrated Household Survey (IHS) core, which asks each member of the household to complete a section of the questions, was not included in the web survey. This was due to concerns over the impact on response (e.g. it would require all members of the household to log on and complete the section) and because of the extra complexity in programming.

The questionnaire consisted of:

- IHS core questions asked of the individual selected to respond (e.g. socio-demographics, economic activity questions)

- smoking OPN module

- healthy eating OPN module

- charities OPN module

- tax OPN module

- disability OPN module

- follow-up question for consent to further research

Questions were reviewed to take web-based self-completion into consideration. Most questions and responses were incorporated in their existing format. However, some modifications were necessary; e.g. replacing 'Code all that apply' with 'Please select all that apply'. Dillman (2007) found that respondents are more likely to drop out of a questionnaire if they encounter problems so, as response was the main quality measure, it was important that the pilot questionnaire was as intuitive and easy to complete as possible.

The literature in this area indicates that the primary tasks of the respondent are to read, comprehend and respond, and the pilot's web page design and layout reflected this. ONS logos and colours were used throughout the questionnaire; no multimedia effects were used, as they may have affected the respondent's interaction with the instrument and influenced their answers. The design also respected the Western visual flow of reading from left to right. In considering the page layout, an attempt was made to minimise the amount of screen scrolling required by the respondent while also attempting to maximise the number of questions per page. This is a standard approach to web surveys, as suggested by Dillman (2007). A progress bar was included within the web-based questionnaire to motivate respondents to proceed through the questionnaire and not drop out.

7. Findings


7.1 Unit response

Table 1 provides headline response figures to the pilot web survey with highlights from the November OPN figures for comparison.

The headline response figures were:

- 18% of the total issued sample of the web pilot responded, compared with a response rate of 53% on the November OPN

- of the incentivised group on the web pilot, 18% responded and the non-incentivised group had a response rate of 19%

- 69% of web pilot respondents answered the income question, compared with 84% of respondents on the November OPN.

Table 1
Headline response figures

                                                 Web pilot        Nov OPN
                                                   n      %        n      %
Total issued sample                            2,000            2,010
Full response ~                                  364     18      1,083    53 ^
Partial response #                                32      2
Full response based on assumptions on
eligibility and internet access *                364     33      1,083    59 ¬

Issued sample offered incentive                1,000
Full response ~ (incentivised group)             179     18
Partial response # (incentivised group)           14      1

Issued sample not offered incentive            1,000
Full response ~ (non-incentivised group)         185     19
Partial response # (non-incentivised group)       18      2

Response to income question (full or banded)     251     69        910    84
  within incentivised group                      124     69
  within non-incentivised group                  127     69

~ Full response is defined as the respondent having answered all questions, including a refusal to provide income details; refusal was a valid response category.
# Partial response is defined as the respondent not having finished completing the survey.
^ This is a crude response rate based on issued addresses, as no information on eligibility was available for the web pilot. More detailed response rates are available for the November OPN.
* Uses an assumption of 10% ineligibility and a 61% household internet access rate applied to the 2,000 issued sample, giving 1,098 eligible households with internet access.
¬ This is the published OPN response rate for November, calculated as the number of achieved interviews as a percentage of the eligible sample.
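
To make the arithmetic behind footnote * explicit, the short check below reproduces the adjusted response base and the corresponding 33 per cent full response figure shown in Table 1.

```python
# Quick check of the eligibility/internet-access adjustment in Table 1.
issued = 2000
eligible_rate = 0.90          # assumes 10% of issued addresses are ineligible
internet_access_rate = 0.61   # assumed household internet access rate

eligible_with_access = issued * eligible_rate * internet_access_rate
print(eligible_with_access)   # 1098.0 eligible households with internet access

full_responses = 364
print(round(100 * full_responses / eligible_with_access))  # 33 (per cent)
```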

These response rates compare favourably with those of other recent internet surveys. For example, the Scottish Census test obtained a response rate of 17 per cent and the 2006 Canadian Census achieved a response rate of 19 per cent. Research, including work by Fricker and Schonlau (2002), indicates that web surveys typically achieve only fairly modest response rates, which is why they are best used as part of a mixed mode approach.


Solomon (2001) states that partial responders are most likely to stop completing a questionnaire when: 1) encountering the very first question; 2) encountering complex household grid questions; or 3) asked to supply their email address. Our pilot supported this argument: partial responders stopped either at the first question, at the household grid question (which asked for information on each household member), or at the question requesting their email address at the end of the interview. Methods to overcome these drop-out points could be derived from further investigation into partial response.

The pilot results showed little difference in response between the incentive and non-incentive groups. However, this may have been an effect of the approach taken, which offered the incentive only on receipt of a fully completed survey. Literature in the US suggests that pre-paid incentives may have more of a substantive effect (Dillman 2007, Göritz 2006, Singer, Van Hoewyk and Maher 2000).

7.2 Item Response

When compared with the November OPN, response rates were likely to be lower on the internet pilot for the name variable, personal income questions and employment status questions. In particular, the face-to-face interview gained an item response rate of 84 per cent on the personal income questions, whereas the internet pilot received a response rate of 69 per cent. All of these results were statistically significant.

The results also indicated that response rates were likely to be higher on the internet pilot than the face-to-face interview for questions on health and religion.

7.3 Survey metrics

7.3.1 Timing

Overall, 80 per cent of respondents appear to have completed the survey in less than 25 minutes. Table 2 provides further details.

Table 2
Length of survey in minutes

Length in minutes    Frequency    Per cent
0-5                          5           1
6-10                        20           6
11-15                      107          29
16-20                      100          28
21-25                       58          16
26-30                       25           7
31-35                       18           5
36-40                       11           3
41-45                        7           2
46-50                        5           1
51-55                        1           0
56-60                        0           0
61-70                        4           1
Over 70                      3           1
Total                      364         100


This compares very favourably with the time taken to complete a face-to-face OPN interview, which is on average 45 minutes, excluding interview administration time.

7.3.2 Time and day of completion

Overall, 75 per cent of respondents completed the survey between 8am and 6pm, and 11 per cent between 7pm and 8pm. However, the most popular single hour in which to begin the survey was 7pm-8pm, with 4pm-5pm a close second. Just under a quarter of responses were completed at the weekend, and Wednesdays and Fridays proved to be the most popular days; this reflects the sending of the advance and follow-up letters. Tables 3 and 4 provide further details.

Table 3

Day of completion

Day of completion    Frequency    Per cent
Monday                      36          10
Tuesday                     36          10
Wednesday                   76          21
Thursday                    56          15
Friday                      70          19
Saturday                    43          12
Sunday                      47          13
Total                      364         100

Table 4

Starting Hours of On-line Completion

Starting hour    Frequency    Per cent
7.00                     3           1
8.00                     5           1
9.00                    21           6
10.00                   25           7
11.00                   32           9
12.00                   21           6
13.00                   21           6
14.00                   28           8
15.00                   25           7
16.00                   35          10
17.00                   28           8
18.00                   26           7
19.00                   39          11
20.00                   19           5
21.00                   20           6
22.00                   11           3
23.00                    2           1
24.00                    3           1
Total                  364         100

In hindsight, a useful addition to the survey would have been a question asking where the respondent completed it; for example, at home, at work or at a friend's house. This variable would have supported further analysis of the factors affecting response, and would be useful for the development of web-based surveys in the future.

8. Results - Profile and Key question responses

This section provides a basic summary of the responses obtained from the web-based pilot and the characteristics of the on-line respondents; it also highlights some key question responses within the OPN modules. Readers are reminded of the pilot objectives, as stated in section 4, when looking at the response analysis.

The OPN survey is routinely weighted to an overall population estimate using the person level weights provided with the data, which are calibrated to a set of population constraints. It was not desirable to use the same set of calibration totals on the internet data because the sample size was too small to produce a robust set of results. Consequently, a reduced set of calibration constraints was constructed for the internet data. The same set of constraints was then applied to the OPN design weight in order to produce a new calibration weight for the OPN data that was in accord with that used for the internet data. This ensured that any differences arising between estimates produced by the two datasets were not caused by spurious differences in weighting procedures.
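
Calibration can be implemented in several ways; purely as an illustration, the sketch below uses raking (iterative proportional fitting), one common calibration method, to adjust a set of design weights to hypothetical population margins. It is not the ONS production weighting system, and the constraint totals shown are invented for the example.

```python
import numpy as np

def rake(design_weights, categories, targets, n_iter=50):
    """Minimal raking (iterative proportional fitting) sketch.

    design_weights : starting weight per respondent
    categories     : {dimension: array of category codes per respondent}
    targets        : {dimension: {category: population total}}

    Repeatedly rescales the weights so that, within each dimension,
    the weighted category totals match the population constraints.
    """
    w = np.asarray(design_weights, dtype=float).copy()
    for _ in range(n_iter):
        for dim, codes in categories.items():
            for cat, total in targets[dim].items():
                mask = (codes == cat)
                current = w[mask].sum()
                if current > 0:
                    w[mask] *= total / current
    return w

# Hypothetical toy example: four respondents calibrated to assumed
# sex and age-band population totals.
weights = rake(
    design_weights=[1, 1, 1, 1],
    categories={"sex": np.array([0, 0, 1, 1]), "age": np.array([0, 1, 0, 1])},
    targets={"sex": {0: 100, 1: 120}, "age": {0: 90, 1: 130}},
)
print(weights.round(1))
```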

The sample designs of the internet and OPN surveys are independent; hence, the standard error of the difference between estimators based on data from the respective surveys can be obtained by combining the standard errors of the individual estimators: the standard errors are squared and summed, and the square root of the sum yields the standard error of the difference. Significance tests were carried out using, as the test statistic, the ratio of the difference between the two estimates to its estimated standard error. We assumed that the distribution of this ratio is approximated by a t-distribution and hence used the t-test (two-sided, with significance level 0.05).
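
As a worked illustration of this test, the sketch below combines the two standard errors and refers the ratio to a t-distribution. The standard errors are approximated here under simple random sampling from the published income-question proportions (69 per cent of 364 and 84 per cent of 1,083), and the degrees of freedom are an assumption; the OPN's clustered design would in practice inflate its standard error.

```python
from math import sqrt
from scipy import stats

def difference_test(est1, se1, est2, se2, df=1000):
    """Two-sided t-test for the difference of two independent estimates.

    The standard errors are squared and summed; the square root of the
    sum is the standard error of the difference. The test statistic is
    the difference divided by that standard error.
    """
    se_diff = sqrt(se1 ** 2 + se2 ** 2)
    t = (est1 - est2) / se_diff
    p = 2 * stats.t.sf(abs(t), df)
    return t, p

# Income question: approximate SRS standard errors for the two proportions.
se_web = sqrt(0.69 * 0.31 / 364)   # ~0.024
se_opn = sqrt(0.84 * 0.16 / 1083)  # ~0.011
t, p = difference_test(0.69, se_web, 0.84, se_opn)
print(f"t = {t:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```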

8.1 Profile of Responders of the Web Pilot compared with the OPN face-to-face

Some clear (and statistically significant) differences were found in the profile of internet pilot respondents compared with the face-to-face November OPN and key question responses.

When compared with the face-to-face OPN, internet pilot respondents were more likely to be:

- aged 25-44 or 55-64

- married and living with a partner

- white

- better educated (with degree level qualifications)

- managers or supervisors

- in good health

A response bias can therefore be seen emerging among the individuals who responded on-line.

While it might be thought that younger age groups are more likely to complete on-line, the web pilot does not support this: the youngest age group (aged 16-24) had one of the lowest response rates. This may have been due to the approach taken in contacting respondents, for example the advance letter to the household. Overall, however, the profile of the web pilot respondent is, as might be expected, slightly younger than the OPN face-to-face respondent profile.


Chart 1: Age profile of the respondents (chart not reproduced here)

In terms of key question responses, when compared with the November OPN face-to-face survey, the pilot respondents were less likely to:

- smoke

- have a disability

- think charities played an important role in society

- think HM Revenue and Customs treated them fairly

Also, respondents to the pilot were more likely to:

- eat healthily (5 portions of fruit or vegetables a day)

Again, all of these results were statistically significant.

The opinion question modules produced some interesting results. As the pilot respondents were less likely to think charities played an important role in society, or to think HM Revenue and Customs treated them fairly, individuals may feel more comfortable disclosing such opinions on-line than in a face-to-face setting with a government field interviewer. Further research in this area could confirm whether the differences in these results are due to mode effects.

9. Next Steps for the Future of Internet Surveys within ONS

This pilot has demonstrated the viability of conducting a major ONS survey on-line and achieving a significant, if minority, response. The limitations of the pilot are fully acknowledged, and much more time and resource within ONS would be required to fully investigate the use of an internet survey.

The web-based pilot has highlighted logistical and technological issues of designing and administering a web survey and also reinforces the methodological issues highlighted in section 3. For example, the translation of a questionnaire to on-line software may be difficult due to its complexity, and issues with the sampling frame used in a web survey would need to be investigated further.

The pilot has also highlighted the risk of significant bias in estimates that are produced using data collected via an internet survey. Although the web pilot achieved a positive response rate, there are differences in the characteristics of the respondents and the survey responses that require much more investigation. Further pilot work within ONS will take forward the findings of this initial pilot to examine, more robustly, the modal differences and respondent selection issues that the initial pilot did not attempt to explore. The pilot has provided a solid foundation for further work.

References

Dillman, D. A. (2007). Mail and Internet Surveys: The Tailored Design Method (2nd ed.). John Wiley and Sons: New Jersey.

Flatley, J. (2001). The Internet as a Mode of Data Collection in Government Social Surveys: Issues and Investigation. Survey Methodology Bulletin, ONS, 49(7).

Fricker, R. and Schonlau, M. (2002). Advantages and Disadvantages of Internet Research Surveys: Evidence from the Literature. Field Methods, 14(4).

Göritz, A. (2006). Incentives in Web Studies: Methodological Issues and a Review. International Journal of Internet Science, 1(1), pp. 58-70.

Singer, E., Van Hoewyk, J. and Maher, M. P. (2000). Experiments with Incentives in Telephone Surveys. Public Opinion Quarterly, 64, pp. 171-188.

Solomon, D. (2001). Conducting Web-Based Surveys. Practical Assessment, Research and Evaluation, 7(19).