Research Methods in Education, Sixth Edition


15 Questionnaires

Introduction

The field of questionnaire design is vast. This chapter provides a straightforward introduction to its key elements, indicating the main issues to be addressed, some important problematical considerations and how they can be resolved. The chapter follows a sequence in designing a questionnaire that, it is hoped, will be useful for researchers. The serial order is:

ethical issues
approaching the planning of a questionnaire
operationalizing the questionnaire
structured, semi-structured and unstructured questionnaires
types of questionnaire items
closed and open questions compared
scales of data
the dangers of assuming knowledge or viewpoints
dichotomous questions
multiple choice questions
rank ordering
rating scales
constant sum questions
ratio data questions
open-ended questions
matrix questions
contingency questions, filters and branches
asking sensitive questions
avoiding pitfalls in question writing
sequencing the questions
questionnaires containing few verbal items
the layout of the questionnaire
covering letters or sheets and follow-up letters
piloting the questionnaire
practical considerations in questionnaire design
administering questionnaires
self-administered questionnaires
postal questionnaires
processing questionnaire data.

It is suggested that researchers may find it useful to work through these issues in sequence, though, clearly, a degree of recursion is desirable.

The questionnaire is a widely used and useful instrument for collecting survey information, providing structured, often numerical data, being able to be administered without the presence of the researcher, and often being comparatively straightforward to analyse (Wilson and McLean 1994). These attractions have to be counterbalanced by the time taken to develop, pilot and refine the questionnaire, by the possible unsophistication and limited scope of the data that are collected, and by the likely limited flexibility of response (though, as Wilson and McLean (1994: 3) observe, this can frequently be an attraction). The researcher will have to judge the appropriateness of using a questionnaire for data collection, and, if so, what kind of questionnaire it should be.

Ethical issues

The questionnaire will always be an intrusion into the life of the respondent, be it in terms of the time taken to complete the instrument, the level of threat or sensitivity of the questions, or the possible invasion of privacy. Questionnaire respondents are not passive data providers for researchers; they are subjects, not objects, of research. There are several sequiturs that flow from this.

Respondents cannot be coerced into completing a questionnaire. They might be strongly encouraged, but the decision whether to become involved

318 QUESTIONNAIRES

and when to withdraw from the research is entirely theirs. Their involvement in the research is likely to be a function of the following factors:

Their informed consent (see Chapter 2 on the ethics of educational research).
Their rights to withdraw at any stage or not to complete particular items in the questionnaire.
The potential of the research to improve their situation (the issue of beneficence).
The guarantees that the research will not harm them (the issue of non-maleficence).
The guarantees of confidentiality, anonymity and non-traceability in the research.
The degree of threat or sensitivity of the questions, which may lead to respondents' over-reporting or under-reporting (Sudman and Bradburn 1982: 32 and Chapter 3).
Factors in the questionnaire itself (e.g. its coverage of issues, its ability to catch what respondents want to say rather than to promote the researcher's agenda), i.e. the avoidance of bias and the assurance of validity and reliability in the questionnaire – the issues of methodological rigour and fairness. Methodological rigour is an ethical not simply a technical matter (Morrison 1996c), and respondents have a right to expect reliability and validity.
The reactions of the respondent; for example, respondents will react if they consider an item to be offensive, intrusive, misleading, biased, misguided, irritating, inconsiderate, impertinent or abstruse.

These factors impact on every stage of the use of a questionnaire, suggesting that attention has to be given to the questionnaire itself, the approaches that are made to the respondents, the explanations that are given to the respondents, the data analysis and the data reporting.

Approaching the planning of a questionnaire

At this preliminary stage of design, it can sometimes be helpful to use a flow chart technique to plan the sequencing of questions. In this way, researchers are able to anticipate the type and range of responses that their questions are likely to elicit. In Box 15.1 we illustrate a flow chart employed in a commercial survey based upon an interview schedule, although the application of the method to a self-completion questionnaire is self-evident.

On a more positive note, Sellitz and her associates (1976) have provided a fairly exhaustive guide to researchers in constructing their questionnaires, which we summarize in Box 15.2 (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.1.ppt).

These are introductory issues, and the remainder of this chapter takes each of these and unpacks them in greater detail. Additionally, one can set out a staged sequence for planning a questionnaire, thus:

1 Decide the purposes/objectives of the questionnaire.
2 Decide the population and the sample (as questions about their characteristics will need to be included on the questionnaire under 'personal details').
3 Generate the topics/constructs/concepts/issues to be addressed and data required in order to meet the objectives of the research (this can be done from literature, or a pre-pilot, for example, focus groups and semi-structured interviews).
4 Decide the kinds of measures/scales/questions/responses required.
5 Write the questionnaire items.
6 Check that each issue from (3) has been addressed, using several items for each issue.
7 Pilot the questionnaire and refine items as a consequence.
8 Administer the final questionnaire.

Within these stages there are several sub-components, and this chapter addresses these.

Operationalizing the questionnaire

The process of operationalizing a questionnaire is to take a general purpose or set of purposes and turn these into concrete, researchable fields about which actual data can be gathered. First, a questionnaire's general purposes must be clarified and then translated into a specific, concrete aim or set of aims. Thus, 'to explore teachers' views about in-service work' is somewhat nebulous, whereas 'to obtain a detailed description of primary and secondary teachers' priorities in the provision of in-service education courses' is reasonably specific.

Box 15.1
A flow chart technique for question planning

Do you have double-glazing on any window in your house?
  Yes → Did you have it fitted or was it here beforehand?
    Fitted by present occupant → What were the reasons for you getting it installed? → What are its advantages? What are its disadvantages? etc.
    Fitted beforehand → Do you think you would have moved in here if it was not installed? → What are its advantages? What are its disadvantages? etc.
  No → Do you have any plans to get it installed or not?
    Yes → What were the reasons for you getting it installed? → What do you think are its advantages? And its disadvantages? etc.
    No → If you were given a grant to complete the work, would that make any difference or not?

Source: Social and Community Planning Research 1972

Having decided upon and specified the primary objective of the questionnaire, the second phase of the planning involves the identification and itemizing of subsidiary topics that relate to its central purpose. In our example, subsidiary issues might well include the types of courses required, the content of courses, the location of courses, the timing of courses, the design of courses, and the financing of courses.

The third phase follows the identification and itemization of subsidiary topics and involves formulating specific information requirements relating to each of these issues. For example, with respect to the type of courses required, detailed information would be needed about the duration of courses (one meeting, several meetings, a week, a month, a term or a year), the status of courses (non-award bearing, award bearing, with certificate, diploma, degree granted by college or university), the orientation of courses (theoretically oriented involving lectures, readings, etc., or practically oriented involving workshops and the production of curriculum materials).

What we have in the example, then, is a move from a generalized area of interest or purpose to a very specific set of features about which direct data can be gathered. Wilson and McLean (1994: 8–9) suggest an alternative approach, which is to identify the research problem, then to clarify the relevant concepts or constructs, then to identify what kinds of measures (if appropriate) or empirical indicators there are of these, i.e. the kinds of data required to give the researcher relevant evidence about the concepts or constructs, e.g. their presence, their intensity, their main features and dimensions, their key elements, etc.

Box 15.2
A guide for questionnaire construction

A Decisions about question content
1 Is the question necessary? Just how will it be useful?
2 Are several questions needed on the subject matter of this question?
3 Do respondents have the information necessary to answer the question?
4 Does the question need to be more concrete, specific and closely related to the respondent's personal experience?
5 Is the question content sufficiently general and free from spurious concreteness and specificity?
6 Do the replies express general attitudes and only seem to be as specific as they sound?
7 Is the question content biased or loaded in one direction, without accompanying questions to balance the emphasis?
8 Will the respondents give the information that is asked for?

B Decisions about question wording
1 Can the question be misunderstood? Does it contain difficult or unclear phraseology?
2 Does the question adequately express the alternative with respect to the point?
3 Is the question misleading because of unstated assumptions or unseen implications?
4 Is the wording biased? Is it emotionally loaded or slanted towards a particular kind of answer?
5 Is the question wording likely to be objectionable to the respondent in any way?
6 Would a more personalized wording of the question produce better results?
7 Can the question be better asked in a more direct or a more indirect form?

C Decisions about form of response to the question
1 Can the question best be asked in a form calling for check answer (or short answer of a word or two, or a number), free answer or check answer with follow-up answer?
2 If a check answer is used, which is the best type for this question – dichotomous, multiple-choice ('cafeteria' question), or scale?
3 If a checklist is used, does it cover adequately all the significant alternatives without overlapping and in a defensible order? Is it of reasonable length? Is the wording of items impartial and balanced?
4 Is the form of response easy, definite, uniform and adequate for the purpose?

D Decisions about the place of the question in the sequence
1 Is the answer to the question likely to be influenced by the content of preceding questions?
2 Is the question led up to in a natural way? Is it in correct psychological order?
3 Does the question come too early or too late from the point of view of arousing interest and receiving sufficient attention, avoiding resistance, and so on?

Source: Sellitz et al. 1976

What unites these two approaches is their recognition of the need to ensure that the questionnaire:

is clear on its purposes
is clear on what needs to be included or covered in the questionnaire in order to meet the purposes
is exhaustive in its coverage of the elements of inclusion
asks the most appropriate kinds of question (discussed below)
elicits the most appropriate kinds of data to answer the research purposes and sub-questions
asks for empirical data.

Structured, semi-structured and unstructured questionnaires

Although there is a large range of types of questionnaire, there is a simple rule of thumb: the larger the size of the sample, the more structured, closed and numerical the questionnaire may have to be, and the smaller the size of the sample, the less structured, more open and word-based the questionnaire may be.

The researcher can select several types of questionnaire, from highly structured to unstructured.


If a closed and structured questionnaire is used, enabling patterns to be observed and comparisons to be made, then the questionnaire will need to be piloted and refined so that the final version contains as full a range of possible responses as can be reasonably foreseen. Such a questionnaire is heavy on time early in the research; however, once the questionnaire has been set up, then the mode of analysis might be comparatively rapid. For example, it may take two or three months to devise a survey questionnaire, pilot it, refine it and set it out in a format that will enable the data to be processed and statistics to be calculated. However, the trade-off from this is that the data analysis can be undertaken fairly rapidly. We already know the response categories, the nature of the data and the statistics to be used; it is simply a matter of processing the data – often using computer analysis.

It is perhaps misleading to describe a questionnaire as being 'unstructured', as the whole devising of a questionnaire requires respondents to adhere to some form of given structure. That said, between a completely open questionnaire that is akin to an open invitation to 'write what one wants' and a completely closed, completely structured questionnaire, there is the powerful tool of the semi-structured questionnaire. Here a series of questions, statements or items is presented and the respondents are asked to answer, respond to or comment on them in the way that they think best. There is a clear structure, sequence and focus, but the format is open-ended, enabling respondents to reply in their own terms. The semi-structured questionnaire sets the agenda but does not presuppose the nature of the response.

Types of questionnaire items

Closed and open questions compared

There are several kinds of question and response modes in questionnaires, including, for example, dichotomous questions, multiple choice questions, rating scales, constant sum questions, ratio data and open-ended questions. These are considered below (see also Wilson 1996). Closed questions prescribe the range of responses from which the respondent may choose. Highly structured, closed questions are useful in that they can generate frequencies of response amenable to statistical treatment and analysis. They also enable comparisons to be made across groups in the sample (Oppenheim 1992: 115). They are quicker to code up and analyse than word-based data (Bailey 1994: 118) and, often, they are directly to the point and deliberately more focused than open-ended questions. Indeed it would be almost impossible, as well as unnecessary, to try to process vast quantities of word-based data in a short time frame.

If a site-specific case study is required, then qualitative, less structured, word-based and open-ended questionnaires may be more appropriate as they can capture the specificity of a particular situation. Where measurement is sought then a quantitative approach is required; where rich and personal data are sought, then a word-based qualitative approach might be more suitable. Open-ended questions are useful if the possible answers are unknown or the questionnaire is exploratory (Bailey 1994: 120), or if there are so many possible categories of response that a closed question would contain an extremely long list of options. They also enable respondents to answer as much as they wish, and are particularly suitable for investigating complex issues, to which simple answers cannot be provided. Open questions may be useful for generating items that will subsequently become the stuff of closed questions in a subsequent questionnaire (i.e. a pre-pilot).

In general closed questions (dichotomous, multiple choice, constant sum and rating scales) are quick to complete and straightforward to code (e.g. for computer analysis), and do not discriminate unduly on the basis of how articulate respondents are (Wilson and McLean 1994: 21). On the other hand, they do not enable respondents to add any remarks, qualifications and explanations to the categories, and there is a risk that the categories might not be exhaustive and that there might be bias in them (Oppenheim 1992: 115).

Open questions enable participants to write a free account in their own terms, to explain and qualify their responses and avoid the limitations of pre-set categories of response. On the other hand, open questions can lead to irrelevant and redundant information; they may be too open-ended for the respondent to know what kind of information is being sought; they may require much more time from the respondent to enter a response (thereby leading to refusal to complete the item); and they may make the questionnaire appear long and discouraging. With regard to analysis, the data are not easily compared across participants, and the responses are difficult to code and to classify (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.2.ppt).

We consider in more detail below the different kinds of closed and open questions.

Scales of data

The questionnaire designer will need to choose the metric – the scale of data – to be adopted. This concerns numerical data, and we advise readers to turn to Part Five for an analysis of the different scales of data that can be gathered (nominal, ordinal, interval and ratio) and the different statistics that can be used for analysis. Nominal data indicate categories; ordinal data indicate order ('high' to 'low', 'first' to 'last', 'smallest' to 'largest', 'strongly disagree' to 'strongly agree', 'not at all' to 'a very great deal'); ratio data indicate continuous values and a true zero (e.g. marks in a test, number of attendances per year) (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.3.ppt). These are presented thus:

Question type                 Level of data
Dichotomous questions         Nominal
Multiple choice questions     Nominal
Rank ordering                 Ordinal
Rating scales                 Ordinal
Constant sum questions        Ordinal
Ratio data questions          Ratio
Open-ended questions          Word-based data
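As a practical sketch of what this mapping implies for analysis (the data and variable names below are invented for illustration, not drawn from any real survey), each scale of data admits different summary statistics: frequencies for nominal items, the median for ordinal ratings, and the arithmetic mean only once ratio-level data are reached.

```python
from collections import Counter
from statistics import median, mean

# Measurement level for each question type, following the table above
LEVEL = {
    "dichotomous": "nominal",
    "multiple_choice": "nominal",
    "rank_ordering": "ordinal",
    "rating_scale": "ordinal",
    "constant_sum": "ordinal",
    "ratio": "ratio",
}

# Hypothetical responses to three items
yes_no = ["yes", "no", "yes", "yes"]     # dichotomous -> nominal
likert = [1, 2, 4, 5, 4]                 # 1 = strongly disagree ... 5 = strongly agree
attendances = [3, 0, 7, 2]               # attendances per year: true zero -> ratio

def summarize(responses, level):
    """Return an admissible summary statistic for the given scale of data."""
    if level == "nominal":
        return Counter(responses)        # categories: frequencies only
    if level == "ordinal":
        return median(responses)         # order is meaningful; distances are not
    return mean(responses)               # ratio: the mean is admissible

print(summarize(yes_no, LEVEL["dichotomous"]))   # Counter({'yes': 3, 'no': 1})
print(summarize(likert, LEVEL["rating_scale"]))  # 4
print(summarize(attendances, LEVEL["ratio"]))    # 3
```

The point of the sketch is simply that the choice of statistic is constrained by the scale of data, which is why the designer must settle the metric before writing items.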

The dangers of assuming knowledge or viewpoints

There is often an assumption that respondents will have the information or have an opinion about the matters in which researchers are interested. This is a dangerous assumption. It is particularly a problem when administering questionnaires to children, who may write anything rather than nothing. This means that the opportunity should be provided for respondents to indicate that they have no opinion, that they don't know the answer to a particular question, or that they feel the question does not apply to them. This is frequently a matter in surveys of customer satisfaction in social science, where respondents are asked, for example, to answer a host of questions about the services provided by utility companies (electricity, gas, water, telephone) about which they have no strong feelings, when, in fact, they are only interested in whether the service is uninterrupted, reliable, cheap and easy to pay for, and whether their complaints are resolved.

There is also the issue of choice of vocabulary and the concepts and information behind them. It is essential that, regardless of the type of question asked, the language and the concepts behind the language should be within the grasp of the respondents. Simply because the researcher is interested in, and has a background in, a particular topic is no guarantee that the respondents will be like-minded. The effect of the questionnaire on the respondent has to be considered carefully.

Dichotomous questions

A highly structured questionnaire will ask closed questions. These can take several forms. Dichotomous questions require a 'yes'/'no' response, e.g. 'Have you ever had to appear in court?', 'Do you prefer didactic methods to child-centred methods?' (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.4.ppt). The layout of a dichotomous question can be thus:

Sex (please tick): Male [ ]  Female [ ]

The dichotomous question is useful, for it compels respondents to come off the fence on an issue. It provides a clear, unequivocal response. Further, it is possible to code responses quickly, there being only two categories of response. A dichotomous question is also useful as a funnelling or sorting


device for subsequent questions, for example: 'If you answered ''yes'' to question X, please go to question Y; if you answered ''no'' to question X, please go to question Z' (see the section below on contingency questions). Sudman and Bradburn (1982: 89) suggest that if dichotomous questions are being used, then it is desirable to use several to gain data on the same topic, in order to reduce the problems of respondents' 'guessing' answers.
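The routing rule just quoted ('if yes, go to Y; if no, go to Z') can be sketched as a simple lookup table, which is essentially how electronic questionnaires implement funnelling. The question identifiers below are placeholders of ours, not drawn from any real instrument.

```python
# Skip-logic table: (current question, answer) -> next question to ask.
# Question identifiers are illustrative placeholders.
ROUTES = {
    ("Q1", "yes"): "Q2",   # 'If you answered yes to question X, go to question Y'
    ("Q1", "no"): "Q5",    # '... if no, go to question Z'
    ("Q2", "yes"): "Q3",
    ("Q2", "no"): "Q4",
}

def next_question(current, answer, default="END"):
    """Return the next question id implied by the routing table."""
    return ROUTES.get((current, answer.lower()), default)

print(next_question("Q1", "yes"))  # Q2
print(next_question("Q1", "no"))   # Q5
print(next_question("Q5", "yes"))  # END (no rule for Q5: questionnaire finishes)
```

A table of this kind also makes the branching auditable: every (question, answer) pair either routes somewhere explicit or falls through to the default, so no respondent can be left without a next step.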

On the other hand, the researcher must ask, for instance, whether a 'yes'/'no' response actually provides any useful information. Requiring respondents to make a 'yes'/'no' decision may be inappropriate; it might be more appropriate to have a range of responses, for example in a rating scale. There may be comparatively few complex or subtle questions which can be answered with a simple 'yes' or 'no'. A 'yes' or a 'no' may be inappropriate for a situation whose complexity is better served by a series of questions which catch that complexity. Further, Youngman (1984: 163) suggests that it is a natural human tendency to agree with a statement rather than to disagree with it; this suggests that a simple dichotomous question might build in respondent bias. Indeed people may be more reluctant to agree with a negative statement than to disagree with a positive question (Weems et al. 2003).

In addition to dichotomous questions ('yes'/'no' questions) a piece of research might ask for information about dichotomous variables, for example gender (male/female), type of school (elementary/secondary), type of course (vocational/non-vocational). In these cases only one of two responses can be selected. This enables nominal data to be gathered, which can then be processed using the chi-square statistic, the binomial test, the G-test and cross-tabulations (see Cohen and Holliday (1996) for examples). Dichotomous questions are treated as nominal data (see Part Five).
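As a minimal sketch of the kind of processing mentioned here (our own illustration, not from the book), the chi-square statistic for a 2 × 2 cross-tabulation of two dichotomous variables can be computed directly from the four cell counts; the counts below are invented.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (no continuity correction) for a 2x2 table:
               col 1  col 2
        row 1    a      b
        row 2    c      d
    Uses the shortcut chi2 = n(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom

# Invented counts: e.g. gender (rows) cross-tabulated with school type (columns)
stat = chi_square_2x2(30, 20, 10, 40)
print(round(stat, 2))  # 16.67
```

In practice a statistics package would be used, and for small samples a continuity correction or exact test may be preferred; the sketch only shows how the cross-tabulated nominal counts feed the statistic.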

Multiple choice questions

To try to gain some purchase on complexity, the researcher can move towards multiple choice questions, where the range of choices is designed to capture the likely range of responses to given statements (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.5.ppt). For example, the researcher might ask a series of questions about a new chemistry scheme in the school; a statement precedes a set of responses thus:

The New Intermediate Chemistry Education (NICE) is:

(a) a waste of time
(b) an extra burden on teachers
(c) not appropriate to our school
(d) a useful complementary scheme
(e) a useful core scheme throughout the school
(f) well-presented and practicable.

The categories would have to be discrete (i.e. having no overlap and being mutually exclusive) and would have to exhaust the possible range of responses. Guidance would have to be given on the completion of the multiple choice item, clarifying, for example, whether respondents are able to tick only one response (a single answer mode) or several responses (multiple answer mode) from the list. Like dichotomous questions, multiple choice questions can be quickly coded and quickly aggregated to give frequencies of response. If that is appropriate for the research, then this might be a useful instrument.
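The coding step just described can be sketched as follows (a hedged illustration with invented responses, using the option letters (a)–(f) from the NICE example above): single answer mode yields one code per respondent, multiple answer mode yields a set of codes per respondent, and both reduce to frequencies of response.

```python
from collections import Counter

# Single answer mode: each respondent ticks exactly one of options (a)-(f)
single = ["d", "b", "d", "f", "d"]
single_freq = Counter(single)

# Multiple answer mode: each respondent may tick several options
multiple = [{"d", "f"}, {"b"}, {"d"}, {"b", "d", "f"}]
multiple_freq = Counter(opt for ticks in multiple for opt in ticks)

print(single_freq.most_common(1))  # [('d', 3)]
print(multiple_freq["d"])          # 3: three respondents ticked (d)
```

Note that in multiple answer mode the frequencies count respondents per option, so they can sum to more than the number of respondents; reporting needs to make clear which base is being used.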

The layout of a multiple choice question can be thus:

Number of years in teaching: 1–5 [ ]  6–14 [ ]  15–24 [ ]  25+ [ ]

Which age group do you teach at present (you may tick more than one)?

Infant [ ]
Primary [ ]
Secondary (excluding sixth form) [ ]
Sixth form only [ ]

Just as dichotomous questions have their parallel in dichotomous variables, so multiple choice questions have their parallel in multiple elements of a variable. For example, the researcher may be asking to which form a student belongs – there being up to, say, forty forms in a large school – or the researcher may be asking which post-16 course a student is following (e.g. academic, vocational, manual, non-manual). In these cases only one response may be selected. As with the dichotomous variable, the listing of several categories or elements of a variable (e.g. form membership and course followed) enables nominal data to be collected and processed using the chi-square statistic, the G-test and cross-tabulations (Cohen and Holliday 1996). Multiple choice questions are treated as nominal data (see Part Five).

It may be important to include in the multiple choices those that will enable respondents to select the response that most closely represents their view; hence a pilot is needed to ensure that the categories are comprehensive, exhaustive and representative. On the other hand, the researcher may be interested in certain features only, and it is these alone that would figure in the response categories.

The multiple choice questionnaire seldom gives more than a crude statistic, for words are inherently ambiguous. In the example above of chemistry, the notion of 'useful' is unclear, as are 'appropriate', 'practicable' and 'burden'. Respondents could interpret these words differently in their own contexts, thereby rendering the data ambiguous. One respondent might see the utility of the chemistry scheme in one area and thereby say that it is useful – ticking (d). Another respondent might see the same utility in that same one area but, because it is only useful in that single area, may see this as a flaw and therefore not tick category (d). With an anonymous questionnaire this difference would be impossible to detect.

This is the heart of the problem of questionnaires – that different respondents interpret the same words differently. 'Anchor statements' can be provided to allow a degree of discrimination in response (e.g. 'strongly agree', 'agree' etc.) but there is no guarantee that respondents will always interpret them in the way that is intended. In the example above this might not be a problem, as the researcher might only be seeking an index of utility – without wishing to know the areas of utility or the reasons for that utility. The evaluator might be wishing only for a crude statistic (which might be very useful statistically in making a decisive judgement about a programme). In this case this rough and ready statistic might be perfectly acceptable.

One can see in the example of chemistry above not only ambiguity in the wording but also a very incomplete set of response categories which is hardly capable of representing all aspects of the chemistry scheme. That this might be politically expedient cannot be overlooked, for if the choice of responses is limited, then those responses might enable bias to be built into the research. For example, if the responses were limited to statements about the utility of the chemistry scheme, then the evaluator would have little difficulty in establishing that the scheme was useful. By avoiding the inclusion of negative statements or the opportunity to record a negative response the research will surely be biased. The issue of the wording of questions has been discussed earlier.

Multiple choice items are also prone to problems of word order and statement order. For example, Dillman et al. (2003: 6) report a study of German students who were asked to compare their high school teachers in terms of whether male or female teachers were more empathetic. They found that respondents rated their female teachers more highly when asked to compare female teachers to male teachers than when they were asked to compare their male teachers to their female teachers. Similarly they report a study in which tennis was found to be less exciting than football when the tennis option was presented before the football option, and more exciting when the football option was placed before the tennis option. These studies suggest that respondents tend to judge later items in terms of the earlier items, rather than vice versa, and that they overlook features specific to later items if these are not contained in the earlier items. This is an instance of the 'primacy effect' or 'order effect', wherein items earlier in a list are given greater weight than items lower in the list. Order effects are resilient to efforts to minimize them, and primacy effects are particularly strong in Internet questionnaires (Dillman et al. 2003: 22).

TYPES OF QUESTIONNAIRE ITEMS 325

Rank ordering

The rank order question is akin to the multiple choice question in that it identifies options from which respondents can choose, yet it moves beyond multiple choice items in that it asks respondents to identify priorities. This enables a relative degree of preference, priority, intensity etc. to be charted (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.6.ppt). In the rank ordering exercise a list of factors is set out and the respondent is required to place them in a rank order, for example:

Please indicate your priorities by placing numbers in the boxes to indicate the ordering of your views, 1 = the highest priority, 2 = the second highest, and so on.

The proposed amendments to the mathematics scheme might be successful if the following factors are addressed:

the appropriate material resources are in school [ ]
the amendments are made clear to all teachers [ ]
the amendments are supported by the mathematics team [ ]
the necessary staff development is assured [ ]
there are subsequent improvements to student achievement [ ]
the proposals have the agreement of all teachers [ ]
they improve student motivation [ ]
parents approve of the amendments [ ]
they will raise the achievements of the brighter students [ ]
the work becomes more geared to problem-solving [ ]

In this example ten items are listed. While this might be enticing for the researcher, enabling fine distinctions possibly to be made in priorities, it might be asking too much of the respondents to make such distinctions. They genuinely might not be able to differentiate their responses, or they simply might not feel strongly enough to make such distinctions. The inclusion of too long a list might be overwhelming. Indeed, Wilson and McLean (1994: 26) suggest that it is unrealistic to ask respondents to arrange priorities where more than five ranks have been requested. In the case of the list of ten points above, the researcher might approach this problem in one of two ways. The list in the questionnaire item can be reduced to five items only, in which case the range and comprehensiveness of responses that fairly catches what the respondent feels is significantly reduced. Alternatively, the list of ten items can be retained, but the request can be made to the respondents only to rank their first five priorities, in which case the range is retained and the task is not overwhelming (though the problem of sorting the data for analysis is increased).
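Where respondents rank only their first five priorities out of ten, the analysis has to cope with missing ranks, which is part of the sorting problem just mentioned. A sketch of one way to tabulate such data (the item names are illustrative, not from a real study, and the mean rank is descriptive only, since ranks are ordinal data):

```python
from collections import defaultdict

def summarise_partial_ranks(responses):
    """Summarise rank-order data in which each respondent ranked only
    their top few priorities; items a respondent left unranked are
    simply absent from that response. Returns a mapping of
    item -> (times_ranked, mean_rank_when_ranked)."""
    totals = defaultdict(lambda: [0, 0])   # item -> [count, sum of ranks]
    for response in responses:
        for item, rank in response.items():
            totals[item][0] += 1
            totals[item][1] += rank
    return {item: (count, rank_sum / count)
            for item, (count, rank_sum) in totals.items()}

responses = [
    {"material resources": 1, "staff development": 2, "teacher agreement": 3},
    {"staff development": 1, "material resources": 2},
]
summary = summarise_partial_ranks(responses)
```

Reporting how often an item was ranked at all, alongside its mean rank when ranked, stops a rarely chosen item with one favourable rank from looking like a top priority.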

An example of a shorter list might be:

Please place these in rank order of the most to the least important, by putting the position (1–5) against each of the following statements, number 1 being the most important and number 5 being the least important:

Students should enjoy school [ ]
Teachers should set less homework [ ]
Students should have more choice of subjects in school [ ]
Teachers should use more collaborative methods [ ]
Students should be tested more, so that they work harder [ ]

Rankings are useful in indicating degrees of response. In this respect they are like rating scales, discussed below. Ranking questions are treated as ordinal data (see Part Five for a discussion of ordinal data).

Rating scales

One way in which degrees of response, intensity of response, and the move away from dichotomous questions have been managed can be seen in the notion of rating scales – Likert scales, semantic differential scales, Thurstone scales and Guttman scaling. These are very useful devices for the researcher, as they build in a degree of sensitivity and differentiation of response while still generating numbers. This chapter will focus on the first two of these, though readers will find the others discussed in Oppenheim (1992) (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.7.ppt). A Likert scale (named after its deviser, Rensis Likert 1932) provides a range of responses to a given question or statement, for example:

How important do you consider work placements to be for secondary school students?

1 = not at all
2 = very little
3 = a little
4 = quite a lot
5 = a very great deal

All students should have access to free higher education.

1 = strongly disagree
2 = disagree
3 = neither agree nor disagree
4 = agree
5 = strongly agree

Such a scale could be set out thus:

Please complete the following by placing a tick in one space only, as follows:

1 = strongly disagree; 2 = disagree;

3 = neither agree nor disagree;

4 = agree; 5 = strongly agree

                                        1    2    3    4    5
Senior school staff should teach more  [ ]  [ ]  [ ]  [ ]  [ ]

In these examples the categories need to be discrete and to exhaust the range of possible responses which respondents may wish to give. Notwithstanding the problems of interpretation which arise as in the previous example – one respondent's 'agree' may be another's 'strongly agree', one respondent's 'very little' might be another's 'a little' – the greater subtlety of response which is built into a rating scale renders this a very attractive and widely used instrument in research.

These two examples both indicate an important feature of an attitude scaling instrument, namely the assumption of unidimensionality in the scale; the scale should be measuring only one thing at a time (Oppenheim 1992: 187–8). Indeed this is a cornerstone of Likert's (1932) own thinking.
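Because Likert items generate numbers, a first-pass analysis is usually a frequency count per category, with the modal category reported as the average. A minimal sketch in Python (illustrative only; as noted later in the chapter, such data are ordinal, so the mode rather than the mean is the appropriate summary):

```python
from collections import Counter

def likert_summary(responses, points=5):
    """Frequency table and modal category for one Likert item.
    Responses are integers 1..points. Returns the full table (so that
    empty categories still show as zero) and the modal category."""
    freq = Counter(responses)
    table = {category: freq.get(category, 0) for category in range(1, points + 1)}
    modal = max(table, key=table.get)
    return table, modal

table, modal = likert_summary([4, 5, 4, 3, 4, 2, 5])
```

Keeping zero-count categories in the table matters: a category that nobody chose is itself informative when checking that the scale's categories are exhaustive.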

It is a very straightforward matter to convert a dichotomous question into a multiple choice question. For example, instead of asking the 'do you?', 'have you?', 'are you?', 'can you?' type questions in a dichotomous format, a simple addition to wording will convert it into a much more subtle rating scale, by substituting the words 'to what extent?', 'how far?', 'how much?', 'how often?' etc.

A semantic differential is a variation of a rating scale which operates by putting an adjective at one end of a scale and its opposite at the other, for example:

How informative do you consider the new set of history textbooks to be?

Useful   1    2    3    4    5    6    7   Useless

Respondents indicate their opinion by circling or putting a mark on that position on the scale which most represents what they feel. Researchers devise their own terms and their polar opposites, for example:

Approachable . . . . Unapproachable
Generous . . . . Mean
Friendly . . . . Hostile
Caring . . . . Uncaring
Attentive . . . . Inattentive
Hard-working . . . . Lazy

Osgood et al. (1957), the pioneers of this technique, suggest that semantic differential scales are useful in three contexts: evaluative (e.g. valuable-valueless, useful-useless, good-bad); potency (e.g. large-small, weak-strong, light-heavy); and activity (e.g. quick-slow, active-passive, dynamic-lethargic).

There are several commonly used categories in rating scales, for example:

Strongly disagree/disagree/neither agree nor disagree/agree/strongly agree
Very seldom/occasionally/quite often/very often
Very little/a little/somewhat/a lot/a very great deal
Never/almost never/sometimes/often/very often
Not at all important/unimportant/neither important nor unimportant/important/very important
Very true of me/a little bit true of me/don't know/not really true of me/very untrue of me
Strongly agree/agree/uncertain/disagree/strongly disagree

To these could be added the category 'don't know' or 'have no opinion'. Rating scales are widely used in research, and rightly so, for they combine the opportunity for a flexible response with the ability to determine frequencies, correlations and other forms of quantitative analysis. They afford the researcher the freedom to fuse measurement with opinion, quantity and quality.

Though rating scales are powerful and useful in research, the investigator nevertheless needs to be aware of their limitations. For example, the researcher may infer a degree of sensitivity and subtlety from the data that they cannot bear. There are other cautionary factors about rating scales, be they Likert scales or semantic differential scales:

There is no assumption of equal intervals between the categories, hence a rating of 4 indicates neither that it is twice as powerful as 2 nor that it is twice as strongly felt; one cannot infer that the intensity of feeling in the Likert scale between 'strongly agree' and 'disagree' somehow matches the intensity of feeling between 'strongly disagree' and 'agree'. These are illegitimate inferences. The problem of equal intervals has been addressed in Thurstone scales (Thurstone and Chave 1929; Oppenheim 1992: 190–5).

We have no check on whether respondents are telling the truth. Some may be deliberately falsifying their replies.

We have no way of knowing if the respondent might have wished to add any other comments about the issue under investigation. It might have been the case that there was something far more pressing about the issue than the rating scale included, but which was condemned to silence for want of a category. A straightforward way to circumvent this issue is to run a pilot and also to include a category entitled 'other (please state)'.

Most of us would not wish to be called extremists; we often prefer to appear like each other in many respects. For rating scales this means that we might wish to avoid the two extreme poles at each end of the continuum of the rating scales, reducing the number of positions in the scales to a choice of three (in a 5-point scale). That means that in fact there could be very little choice for us. The way round this is to create a larger scale than a 5-point scale, for example a 7-point scale. To go beyond a 7-point scale is to invite a degree of detail and precision which might be inappropriate for the item in question, particularly if the argument set out above is accepted, that one respondent's scale point 3 might be another's scale point 4.

There is a tendency for participants to opt for the mid-point of a 5-point or 7-point scale (the central tendency). This is notably an issue with East Asian respondents, where the 'doctrine of the mean' is advocated in Confucian culture. One option to overcome this is to use an even-number scaling system, as there is no mid-point. On the other hand, it could be argued that if respondents wish to sit on the fence and choose a mid-point, then they should be given the option to do so.

On the scales so far there have been mid-points; on the 5-point scale it is category 3, and on the 7-point scale it is category 4. The use of an odd number of points on a scale enables this to occur. However, choosing an even number of scale points, for example a 6-point scale, might require a decision on rating to be indicated.

For example, suppose a new staffing structure has been introduced into a school and the headteacher is seeking some guidance on its effectiveness. A 6-point rating scale might ask respondents to indicate their response to the statement:

The new staffing structure in the school has enabled teamwork to be managed within a clear model of line management.


(Circle one number)

Strongly agree   1    2    3    4    5    6   Strongly disagree

Let us say that one member of staff circled 1, eight staff circled 2, twelve staff circled 3, nine staff circled 4, two staff circled 5, and seven staff circled 6. There being no mid-point on this continuum, the researcher could infer that those respondents who circled 1, 2 or 3 were in some measure of agreement, while those respondents who circled 4, 5 or 6 were in some measure of disagreement. That would be very useful for, say, a headteacher, in publicly displaying agreement, there being twenty-one staff (1 + 8 + 12) agreeing with the statement and eighteen (9 + 2 + 7) displaying a measure of disagreement. However, one could point out that the measure of 'strongly disagree' attracted seven staff – a very strong feeling – which was not true for the 'strongly agree' category, which attracted only one member of staff. The extremity of the voting has been lost in a crude aggregation.

Further, if the researcher were to aggregate the scoring around the two mid-point categories (3 and 4) there would be twenty-one members of staff represented, leaving nine (1 + 8) from categories 1 and 2 and nine (2 + 7) from categories 5 and 6; adding together categories 1, 2, 5 and 6, a total of eighteen is reached, which is less than the twenty-one total of the two categories 3 and 4. It seems on this scenario that it is far from clear that there was agreement with the statement from the staff; indeed, taking the high incidence of 'strongly disagree', it could be argued that those staff who were perhaps ambivalent (categories 3 and 4), coupled with those who registered a 'strongly disagree', indicate not agreement but disagreement with the statement.
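The two competing aggregations of the same thirty-nine responses can be checked directly:

```python
# Tally from the worked example above: scale category -> number of
# staff who circled it on the 6-point scale
tally = {1: 1, 2: 8, 3: 12, 4: 9, 5: 2, 6: 7}

# First aggregation: split at the (absent) mid-point of the scale
agreement = tally[1] + tally[2] + tally[3]       # some measure of agreement
disagreement = tally[4] + tally[5] + tally[6]    # some measure of disagreement

# Second aggregation: the two centre categories versus the rest
centre = tally[3] + tally[4]
outer = tally[1] + tally[2] + tally[5] + tally[6]
```

The same figures yield 21 versus 18 either way, but with the labels swapped: the first reading suggests majority agreement, the second suggests a largely ambivalent staff room, which is precisely the interpretive hazard being described.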

The interpretation of data has to be handled very carefully; ordering data to suit a researcher's own purposes might be very alluring but quite illegitimate. The golden rule here is that crude data can yield only crude interpretation; subtle statistics require subtle data. The interpretation of data must not distort the data unfairly. Rating scale questions are treated as ordinal data (see Part Five), using modal scores and non-parametric data analysis, though one can find very many examples where this rule has been violated, and non-parametric data have been treated as parametric data. This is unacceptable.

It has been suggested that the attraction of rating scales is that they provide more opportunity than dichotomous questions for rendering data more sensitive and responsive to respondents. This makes rating scales particularly useful for tapping attitudes, perceptions and opinions. The need for a pilot study to devise and refine categories, making them exhaustive and discrete, has been suggested as a necessary part of this type of data collection.

Questionnaires that are going to yield numerical or word-based data can be analysed using computer programs (for example SPSS or Ethnograph respectively). If the researcher intends to process the data using a computer package, it is essential that the layout and coding system of the questionnaire are appropriate for that particular computer package. Instructions for layout in order to facilitate data entry are contained in manuals that accompany such packages.

Rating scales are more sensitive instruments than dichotomous scales. Nevertheless, they are limited in their usefulness to researchers by their fixity of response caused by the need to select from a given choice. A questionnaire might be tailored even more to respondents by including open-ended questions to which they can reply in their own terms and with their own opinions. We consider these later.

Constant sum questions

In this type of question respondents are asked to distribute a given number of marks (points) between a range of items (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.8.ppt). For example:

Please distribute a total of 10 points among the sentences that you think most closely describe your behaviour. You may distribute these freely: they may be spread out, or awarded to only a few statements, or all allocated to a single sentence if you wish.


I can take advantage of new opportunities [ ]
I can work effectively with all kinds of people [ ]
Generating new ideas is one of my strengths [ ]
I can usually tell what is likely to work in practice [ ]
I am able to see tasks through to the very end [ ]
I am prepared to be unpopular for the good of the school [ ]

This enables priorities to be identified, comparing highs and lows, and for equality of choices to be indicated, and, importantly, for this to be done in the respondents' own terms. It requires respondents to make comparative judgements and choices across a range of items. For example, we may wish to distribute 10 points for aspects of an individual's personality:

Talkative [ ]
Cooperative [ ]
Hard-working [ ]
Lazy [ ]
Motivated [ ]
Attentive [ ]

This means that the respondent has to consider the relative weight of each of the given aspects before coming to a decision about how to award the marks. To accomplish this means that the all-round nature of the person, in the terms provided, has to be considered, to see, on balance, which aspect is stronger when compared to another.1

The difficulty with this approach is to decide how many marks can be distributed (a round number, for example 10, makes subsequent calculation easily comprehensible) and how many statements/items to include, e.g. whether to have the same number of statements as there are marks, or more or fewer statements than the total of marks. Having too few statements/items does not do justice to the complexity of the issue, and having too many statements/items may mean that it is difficult for respondents to decide how to distribute their marks. Having too few marks available may be unhelpful, but, by contrast, having too many marks and too many statements/items can lead to simple computational errors by respondents. Our advice is to keep the number of marks to ten and the number of statements to around six to eight.
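Because such computational errors by respondents are common, constant sum answers are worth validating at data entry. A sketch (the statement labels are abbreviated from the example above; the function name is ours):

```python
def check_constant_sum(allocation, total=10):
    """Validate one respondent's constant sum answer: every mark must
    be non-negative and the marks must sum exactly to the required
    total. Returns (ok, actual_sum) so that faulty returns can be
    queried or followed up."""
    marks = list(allocation.values())
    actual = sum(marks)
    ok = actual == total and all(mark >= 0 for mark in marks)
    return ok, actual

good = check_constant_sum({"new ideas": 5, "sees tasks through": 3, "works with people": 2})
bad = check_constant_sum({"new ideas": 6, "sees tasks through": 3, "works with people": 2})
```

Returning the actual sum, not just a pass/fail flag, lets the researcher distinguish a slip of one mark from a respondent who has misunderstood the instruction entirely.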

Constant sum data are ordinal, and this means that non-parametric analysis can be performed on the data (see Part Five).

Ratio data questions

We discuss ratio data in Part Five and we refer the reader to the discussion and definition there (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.9.ppt). For our purposes here we suggest that ratio data questions deal with continuous variables where there is a true zero, for example:

How much money do you have in the bank? ____
How many times have you been late for school? ____
How many marks did you score in the mathematics test? ____
How old are you (in years)? ____

Here no fixed answer or category is provided, and the respondent puts in the numerical answer that fits his/her exact figure, i.e. the accuracy is higher, much higher than in categories of data. This enables averages (means), standard deviations, range and high-level statistics to be calculated, e.g. regression, factor analysis, structural equation modelling (see Part Five).
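The difference in what may legitimately be computed is easy to illustrate: with ratio data such as age in years, means and standard deviations (and hence the higher-level statistics mentioned) are defensible, unlike for the ordinal rating scales above. A minimal illustration with invented figures:

```python
from statistics import mean, stdev

# Ages in years are ratio data (there is a true zero), so means and
# standard deviations are legitimate summaries
ages = [11, 12, 12, 13, 14, 12, 11]
average_age = mean(ages)
spread = stdev(ages)
```

The same two lines applied to Likert category numbers would run without complaint, which is exactly why the scale-of-data distinction has to be enforced by the researcher rather than the software.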

An alternative form of ratio scaling is where the respondent has to award marks out of, say, ten, for a particular item. This is a device that has been used in business and commerce for measuring service quality and customer satisfaction, and is being used in education by Kgaile and Morrison (2006); see for example Box 15.3.

This kind of scaling is often used in telephone interviews, as it is easy for respondents to understand. The argument could be advanced that this is a sophisticated form of rating scale, but the terminology used in the instruction clearly suggests that it asks for ratio scale data.

Open-ended questions

The open-ended question is a very attractive device for smaller scale research or for those


Box 15.3
A 10-point marking scale in a questionnaire

Please give a mark from 1 to 10 for the following statements, with 10 being excellent and 1 being very poor. Please circle the appropriate number for each statement.

Teaching and learning (1 = very poor, 10 = excellent)
1 The attention given to teaching and learning at the school            1 2 3 4 5 6 7 8 9 10
2 The quality of the lesson preparation                                 1 2 3 4 5 6 7 8 9 10
3 How well learners are cared for, guided and supported                 1 2 3 4 5 6 7 8 9 10
4 How effectively teachers challenge and engage learners                1 2 3 4 5 6 7 8 9 10
5 The educators' use of assessment for maximizing learners' learning    1 2 3 4 5 6 7 8 9 10
6 How well students apply themselves to learning                        1 2 3 4 5 6 7 8 9 10
7 Discussion and review by educators of the quality of teaching and learning   1 2 3 4 5 6 7 8 9 10

sections of a questionnaire that invite an honest, personal comment from respondents in addition to ticking numbers and boxes (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.10.ppt). The questionnaire simply puts the open-ended questions and leaves a space (or draws lines) for a free response. It is the open-ended responses that might contain the 'gems' of information that otherwise might not be caught in the questionnaire. Further, it puts the responsibility for and ownership of the data much more firmly into respondents' hands.

It is useful for the researcher to provide some support for respondents, so that they know the kind of reply being sought. For example, an open question that includes a prompt could be:

Please indicate the most important factors that reduce staff participation in decision-making.
Please comment on the strengths and weaknesses of the mathematics course.
Please indicate areas for improvement in the teaching of foreign languages in the school.

This is not to say that the open-ended question might well not frame the answer, just as the stem of a rating scale question might frame the response given. However, an open-ended question can catch the authenticity, richness, depth of response, honesty and candour which, as is argued elsewhere in this book, are the hallmarks of qualitative data.

Oppenheim (1992: 56–7) suggests that a sentence-completion item is a useful adjunct to an open-ended question, for example:

Please complete the following sentence in your ownwords:

An effective teacher. . .

or

The main things that I find annoying with disruptivestudents are . . .

Open-endedness also carries problems of data handling. For example, if one tries to convert opinions into numbers (e.g. so many people indicated some degree of satisfaction with the new principal's management plan), then it could be argued that the questionnaire should have used rating scales in the first place. Further, it might well be that the researcher is in danger of violating one principle of word-based data, which is that they are not validly susceptible to aggregation, i.e. that it is trying to bring to word-based data the principles of numerical data, borrowing from one paradigm (quantitative, positivist methodology) to inform another paradigm (qualitative, interpretive methodology).


Further, if a genuinely open-ended question is being asked, it is perhaps unlikely that responses will bear such a degree of similarity to each other so as to enable them to be aggregated too tightly. Open-ended questions make it difficult for the researcher to make comparisons between respondents, as there may be little in common to compare. Moreover, to complete an open-ended questionnaire takes much longer than placing a tick in a rating scale response box; not only will time be a constraint here, but there is an assumption that respondents will be sufficiently or equally capable of articulating their thoughts and committing them to paper.

In practical terms, Redline et al. (2002) report that using open-ended questions can lead to respondents overlooking instructions, as they are occupied with the more demanding task of writing in their own words rather than reading instructions.

Despite these cautions, the space provided for an open-ended response is a window of opportunity for the respondent to shed light on an issue or course. Thus, an open-ended questionnaire has much to recommend it.

Matrix questions

Matrix questions are not types of questions but concern the layout of questions. Matrix questions enable the same kind of response to be given to several questions, for example 'strongly disagree' to 'strongly agree'. The matrix layout helps to save space, for example:

Please complete the following by placing a tick in one space only, as follows:

1 = not at all; 2 = very little; 3 = a moderate amount; 4 = quite a lot; 5 = a very great deal

How much do you use the following for assessment purposes?

                                      1    2    3    4    5
(a) commercially published tests     [ ]  [ ]  [ ]  [ ]  [ ]
(b) your own made-up tests           [ ]  [ ]  [ ]  [ ]  [ ]
(c) students' projects               [ ]  [ ]  [ ]  [ ]  [ ]
(d) essays                           [ ]  [ ]  [ ]  [ ]  [ ]
(e) samples of students' work        [ ]  [ ]  [ ]  [ ]  [ ]

Here five questions have been asked in only five lines, excluding, of course, the instructions and explanations of the anchor statements. Such a layout is economical of space.

A second example indicates how a matrix design can save a considerable amount of space in a questionnaire. Here the size of potential problems in conducting a piece of research is asked for, and data on how much these problems were soluble are requested. For the first issue (the size of the problem), 1 = no problem, 2 = a small problem, 3 = a moderate problem, 4 = a large problem, 5 = a very large problem. For the second issue (how much the problem was solved), 1 = not solved at all, 2 = solved only a very little, 3 = solved a moderate amount, 4 = solved a lot, 5 = completely solved (see Box 15.4).

Here thirty questions (15 × 2) have been covered in just a short amount of space.

Laying out the questionnaire like this enables the respondent to fill in the questionnaire rapidly. On the other hand, it risks creating a mind set in the respondent (a 'response set': Baker 1994: 181), in that the respondent may simply go down the questionnaire columns and write the same number each time (e.g. all number 3) or, in a rating scale, tick all number 3. Such response sets can be detected by looking at patterns of replies and eliminating response sets from subsequent analysis.
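Looking at patterns of replies can be partly automated: the crudest pattern is a respondent ticking the same column for every item in the matrix. A sketch (the respondent identifiers are invented; real screening would also consider near-identical patterns):

```python
def straight_liners(data):
    """Flag respondents who gave the identical answer to every item in
    a matrix question, a possible 'response set'. `data` maps a
    respondent identifier to that respondent's list of answers."""
    return [respondent for respondent, answers in data.items()
            if len(answers) > 1 and len(set(answers)) == 1]

answers = {
    "r1": [3, 3, 3, 3, 3],   # same column ticked throughout
    "r2": [2, 4, 3, 5, 1],
    "r3": [4, 4, 4, 4, 4],   # same column ticked throughout
}
flagged = straight_liners(answers)
```

Flagged cases should be inspected rather than deleted automatically: a respondent who genuinely holds a uniform view will produce the same pattern as one who is not reading the items.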

The conventional way of minimizing response sets has been by reversing the meaning of some of the questions so that the respondents will need to read them carefully. However, Weems et al. (2003) argue that using positively and negatively worded items within a scale is not measuring the same underlying traits. They report that some respondents will tend to disagree with a negatively worded item, that the reliability levels of negatively worded items are lower than for positively worded items, and that negatively worded items receive greater non-response than positively worded items. Indeed Weems et al. (2003) argue against mixed-item formats, and supplement this by reporting that inappropriately worded items can induce an artificially extreme response which, in turn, compromises the reliability of the data. Mixing


Box 15.4
Potential problems in conducting research

For each item, respondents give two marks: the size of the problem (1–5) and how much the problem was solved (1–5).

1 Gaining access to schools and teachers
2 Gaining permission to conduct the research (e.g. from principals)
3 Resentment by principals
4 People vetting what could be used
5 Finding enough willing participants for your sample
6 Schools suffering from 'too much research' by outsiders and insiders
7 Schools or people not wishing to divulge information about themselves
8 Schools not wishing to be identifiable, even with protections guaranteed
9 Local political factors that impinge on the school
10 Teachers' fear of being identified/traceable, even with protections guaranteed
11 Fear of participation by teachers (e.g. if they say critical matters about the school or others they could lose their contracts)
12 Unwillingness of teachers to be involved because of their workload
13 The principal deciding on whether to involve the staff, without consultation with the staff
14 Schools' or institutions' fear of criticism or loss of face
15 The sensitivity of the research: the issues being investigated

negatively and positively worded items in the same scale, they argue, compromises both validity and reliability. Indeed they suggest that respondents may not read negatively worded items as carefully as positively worded items.
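If a researcher nevertheless retains some reversed items, the negatively worded items must be reverse-scored before any scale total is computed, so that a high score carries the same meaning throughout. The standard recoding on a 1-to-points scale is:

```python
def reverse_code(score, points=5):
    """Reverse-score a negatively worded item on a 1..points scale:
    on a 5-point scale 1 <-> 5 and 2 <-> 4, while the mid-point 3
    stays where it is."""
    return points + 1 - score

recoded = [reverse_code(score) for score in [1, 2, 3, 4, 5]]
```

The same formula works for any scale length, e.g. `reverse_code(2, points=7)` maps 2 to 6 on a 7-point scale.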

Contingency questions, filters and branches

Contingency questions depend on responses to earlier questions, for example: 'if your answer to question (1) was ''yes'' please go to question (4)'. The earlier question acts as a filter for the later question, and the later question is contingent on the earlier, and is a branch of the earlier question. Some questionnaires will write in words the number of the question to which to go (e.g. 'please go to question 6'); others will place an arrow to indicate the next question to be answered if your answer to the first question was such-and-such.

Contingency and filter questions may be useful for the researcher, but they can be confusing for the respondent, as it is not always clear how to proceed through the sequence of questions and where to go once a particular branch has been completed. Redline et al. (2002) found that respondents tend to ignore, misread and incorrectly follow branching instructions, such that item non-response occurs for follow-up questions that are applicable only to certain subsamples, and respondents skip over, and therefore fail to follow up on, those questions that they should have completed. Redline et al. (2002) found that the increased complexity of the questionnaire brought about by branching instructions negatively influenced its correct completion.

Redline et al. (2002: 7) report that the number of words in the question affected the respondents' ability to follow branching instructions – the greater the number of words in the question, the greater was the likelihood of the respondents overlooking the branching instructions. Redline et al. (2002: 19) report that up to seven items, and no more, could be retained in the short-term memory. This has implications for the number of items in a list in telephone interviews, where there is no visual recall or checking possible. Similarly, the greater the number of answer categories, the greater was the likelihood of making errors, e.g. overlooking branching instructions. They report that respondents tend to see branching instructions when they are placed by the last category, particularly if they have chosen that last category.

Further, Redline et al. (2002: 8) note that sandwiching branching instructions between items that do not branch is likely to lead to errors of omission and commission being made: omitting to answer all the questions and answering the wrong questions. Further, locating the instructions for branching some distance away from the preceding answer box can also lead to errors in following the instructions. Redline et al. (2002: 17) report that 'altering the visual and verbal design of branching instructions had a substantial impact on how well respondents read, comprehend, and act upon the branching instructions'. It follows from this that the clear location and visual impact of instructions are important for successful completion of branching instructions. Most respondents, they acknowledge, did not deliberately ignore branching instructions; they simply were unaware of them.

The implication of the findings from Redline et al. (2002) is that instructions should be placed where they are to be used and where they can be seen.

We would advise judicious and limited use of filtering and branching devices. It is particularly important to avoid having participants turning pages forwards and backwards in a questionnaire in order to follow the sequence of questions that have had filters and branches following from them. It is a particular problem in Internet surveys, where the screen size is much smaller than the length of a printed page. One way of overcoming the problem of branches is to sectionalize the questionnaire, keeping together conceptually close items and keeping the branches within that section.
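In a computer-administered survey much of the branching problem disappears, because the routing can be encoded once and applied automatically rather than left to the respondent. A minimal sketch of such skip logic (the mapping shown encodes the 'question 1, yes, go to question 4' example from the text; the function name is ours):

```python
def next_question(current, answer, branches):
    """Resolve a filter question: `branches` maps (question_number,
    answer) pairs to the question to jump to; any combination not
    listed simply proceeds to the next question in sequence."""
    return branches.get((current, answer), current + 1)

# 'If your answer to question (1) was yes, please go to question (4)'
branches = {(1, "yes"): 4}
after_yes = next_question(1, "yes", branches)
after_no = next_question(1, "no", branches)
```

Keeping the routing in one data structure also makes it easy to audit: every branch can be checked for unreachable questions before the instrument goes live.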

Asking sensitive questions

Sudman and Bradburn (1982: ch. 3) draw attention to the important issue of including sensitive items in a questionnaire. While the anonymity of a questionnaire and, frequently, the lack of face-to-face contact between the researcher and the respondents in a questionnaire might facilitate responses to sensitive material, the issues of sensitivity and threat cannot be avoided, as they might lead to under-reporting (non-disclosure and withholding data) or over-reporting (exaggeration) by participants. Some respondents may be unwilling to disclose sensitive information, particularly if it could harm themselves or others. Why should they share private matters (e.g. about family life and opinions of school managers and colleagues) with a complete stranger (Cooper and Schindler 2001: 341)? Even details of age, income, educational background, qualifications and opinions can be regarded as private and/or sensitive matters.

Sudman and Bradburn (1982: 55–6) identify several important considerations in addressing potentially threatening or sensitive issues, for example socially undesirable behaviour (e.g. drug abuse, sexual offences, violent behaviour, criminality, illnesses, employment and unemployment, physical features, sexual activity, behaviour and sexuality, gambling, drinking, family details, political beliefs, social taboos). They suggest the following strategies:

- Open rather than closed questions might be more suitable to elicit information about socially undesirable behaviour, particularly frequencies.
- Long rather than short questions might be more suitable for eliciting information about socially undesirable behaviour, particularly frequencies.
- Using familiar words might increase the number of reported frequencies of socially undesirable behaviour.
- Using data gathered from informants, where possible, can enhance the likelihood of obtaining reports of threatening behaviour.


- Deliberately loading the question so that overstatements of socially desirable behaviour and understatements of socially undesirable behaviour are reduced might be a useful means of eliciting information.
- With regard to socially undesirable behaviour, it might be advisable first to ask whether the respondent has engaged in that behaviour previously, and then move to asking about his or her current behaviour. By contrast, when asking about socially acceptable behaviour the reverse might be true, i.e. asking about current behaviour before asking about everyday behaviour.
- In order to defuse threat, it might be useful to locate the sensitive topic within a discussion of other more or less sensitive matters, in order to suggest to respondents that this issue might not be too important.
- Use alternative ways of asking standard questions, for example sorting cards, or putting questions in sealed envelopes, or repeating questions over time (this has to be handled sensitively, so that respondents do not feel that they are being ‘checked’), in order to increase reliability.
- Ask respondents to keep diaries in order to increase validity and reliability.
- At the end of an interview ask respondents their views on the sensitivity of the topics that have been discussed.
- If possible, find ways of validating the data.

Indeed, Sudman and Bradburn (1982: 86) suggest that, as the questions become more threatening and sensitive, it is wise to expect greater bias and unreliability. They draw attention to the fact that several nominal, demographic details might be considered threatening by respondents (Sudman and Bradburn 1982: 208). This has implications for their location within the questionnaire (discussed below). The issue here is that sensitivity and threat are to be viewed through the eyes of respondents rather than the questionnaire designer; what might appear innocuous to the researcher might be highly sensitive or offensive to participants. We refer readers to Chapter 5 on sensitive educational research.

Avoiding pitfalls in question writing

Although there are several kinds of questions that can be used, there are some caveats about the framing of questions in a questionnaire (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.11.ppt):

- Avoid leading questions, that is, questions that are worded (or their response categories presented) in such a way as to suggest to respondents that there is only one acceptable answer, and that other responses might or might not gain approval or disapproval respectively. For example:

Do you prefer abstract, academic-type courses, or down-to-earth, practical courses that have some pay-off in your day-to-day teaching?

The guidance here is to check the ‘loadedness’ or possible pejorative overtones of terms or verbs.

- Avoid highbrow questions even with sophisticated respondents. For example:

What particular aspects of the current positivistic/interpretive debate would you like to see reflected in a course of developmental psychology aimed at a teacher audience?

Where the sample being surveyed is representative of the whole adult population, misunderstandings of what researchers take to be clear, unambiguous language are commonplace. Therefore it is important to use clear and simple language.

- Avoid complex questions. For example:

Would you prefer a short, non-award-bearing course (3, 4 or 5 sessions) with part-day release (e.g. Wednesday afternoons) and one evening per week attendance with financial reimbursement for travel, or a longer, non-award-bearing course (6, 7 or 8 sessions) with full-day release, or the whole course designed on part-day release without evening attendance?



- Avoid irritating questions or instructions. For example:

Have you ever attended an in-service course of any kind during your entire teaching career?

If you are over forty, and have never attended an in-service course, put one tick in the box marked NEVER and another in the box marked OLD.

- Avoid questions that use negatives and double negatives (Oppenheim 1992: 128). For example:

How strongly do you feel that no teacher should enrol on the in-service, award-bearing course who has not completed at least two years’ full-time teaching?

Or:

Do you feel that without a parent/teacher association teachers are unable to express their views to parents clearly?

In this case, if you feel that a parent/teacher association is essential for teachers to express their views, do you vote ‘yes’ or ‘no’? The hesitancy involved in reaching such a decision, and the possible required re-reading of the question, could cause the respondent simply to leave it blank and move on to the next question. The problem is the double negative, ‘without’ and ‘unable’, which creates confusion.

- Avoid too many open-ended questions on self-completion questionnaires. Because self-completion questionnaires cannot probe respondents to find out just what they mean by particular responses, open-ended questions are a less satisfactory way of eliciting information. (This caution does not hold in the interview situation, however.) Open-ended questions, moreover, are too demanding of most respondents’ time. Nothing can be more off-putting than the following format:

Use pages 5, 6 and 7 respectively to respond to each of the questions about your attitudes to in-service courses in general and your beliefs about their value in the professional life of the serving teacher.

- Avoid extremes in rating scales, e.g. ‘never’, ‘always’, ‘totally’, ‘not at all’, unless there is a good reason to include them. Most respondents are reluctant to use such extreme categories (Anderson and Arsenault 2001: 174).
- Avoid pressuring/biasing by association, for example: ‘Do you agree with your headteacher that boys are more troublesome than girls?’. In this case the reference to the headteacher should simply be excised.
- Avoid statements with which people tend either to disagree or agree, i.e. that have built-in skewedness (the ‘base-rate’ problem, in which natural biases in the population affect the sample results).

Finally, avoid ambiguous questions or questions that could be interpreted differently from the way that is intended. The problem of ambiguity in words is intractable; at best it can be minimized rather than eliminated altogether. The most innocent of questions is replete with ambiguity (Youngman 1984: 158–9; Morrison 1993: 71–2). Take the following examples:

Does your child regularly do homework?

What does ‘regularly’ mean – once a day; once a year; once a term; once a week?

How many students are there in the school?

What does this mean: on roll; on roll but absent; marked as present but out of school on a field trip; at this precise moment or this week (there being a difference in attendance between a Monday and a Friday), or between the first term of an academic year and the last term of the academic year for secondary school students, as some of them will have left school to go into employment and others will be at home revising for examinations or have completed them?

How many computers do you have in school?

What does this mean: present but broken; including those out of school being repaired; the property of the school or staff’s and students’ own computers; on average or exactly in school today?

Have you had a French lesson this week?

What constitutes a ‘week’: the start of the school week (i.e. from Monday to Friday), since last Sunday (or Saturday, depending on one’s religion) or, if the question were put on a Wednesday, since last Wednesday; how representative of all weeks is this week – there being public examinations in the school for some of the week?

How old are you?

15–20
20–30
30–40
40–50
50–60

The categories are not discrete; will an old-looking 40-year-old flatter himself and put himself in the 30–40 category, or will an immature 20-year-old seek the maturity of being put into the 20–30 category? The rule in questionnaire design is to avoid any overlap of categories.
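The overlap rule can even be checked mechanically when response categories are numeric bands. The following sketch is purely illustrative (the `bands_overlap` helper and the corrected band list are our own, not from the text):

```python
def bands_overlap(bands):
    """Return True if any adjacent bands share a boundary value,
    i.e. if the categories are not mutually exclusive."""
    ordered = sorted(bands)
    return any(prev_hi >= nxt_lo
               for (_, prev_hi), (nxt_lo, _) in zip(ordered, ordered[1:]))

# The age bands from the example above: each boundary age appears twice.
assert bands_overlap([(15, 20), (20, 30), (30, 40), (40, 50), (50, 60)])

# A discrete, non-overlapping alternative that a designer might use instead.
assert not bands_overlap([(15, 20), (21, 30), (31, 40), (41, 50), (51, 60)])
```

The same check applies to any ordered category scheme (income bands, years in post, class sizes).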

Vocational education is available only to the lower ability students but it should be open to every student.

This is, in fact, a double question. What does the respondent do who agrees with the first part of the sentence – ‘vocational education is available only to the lower ability students’ – but disagrees with the latter part of the sentence, or vice versa? The rule in questionnaire design is to ask only one question at a time.

Although it is impossible to legislate for the respondents’ interpretation of wording, the researcher, of course, has to adopt a common-sense approach to this, recognizing the inherent ambiguity but nevertheless still feeling that it is possible to live with this indeterminacy.

An ideal questionnaire possesses the same properties as a good law, being clear, unambiguous and practicable, reducing potential errors in participants and data analysts, being motivating for participants and ensuring as far as possible that respondents are telling the truth (Davidson 1970).

The golden rule is to keep questions as short and as simple as possible.

Sequencing the questions

To some extent the order of questions in a schedule is a function of the target sample (e.g. how they will react to certain questions), the purposes of the questionnaire (e.g. to gather facts or opinions), the sensitivity of the research (e.g. how personal and potentially disturbing the issues are that will be addressed), and the overall balance of the questionnaire (e.g. where best to place sensitive questions in relation to less threatening questions, and how many of each to include).

The ordering of the questionnaire is important, for early questions may set the tone or the mind-set of the respondent to later questions. For example, a questionnaire that makes a respondent irritated or angry early on is unlikely to have enabled that respondent’s irritation or anger to subside by the end of the questionnaire. As Oppenheim (1992: 121) remarks, one covert purpose of each question is to ensure that the respondent will continue to cooperate.

Further, a respondent might ‘read the signs’ in the questionnaire, seeking similarities and resonances between statements, so that responses to early statements will affect responses to later statements and vice versa. While multiple items may act as a cross-check, this very process might be irritating for some respondents.

Krosnick and Alwin (1987) found a ‘primacy effect’ (discussed earlier), i.e. respondents tend to choose items that appear earlier in a list rather than items that appear later in a list. This is particularly important for branching instructions, where the instruction, because it appears at the bottom of the list, could easily be overlooked. Krosnick (1999) also found that the more difficult a question is, the greater is the likelihood of ‘satisficing’, i.e. choosing the first reasonable response option in a list, rather than working through a list methodically to find the most appropriate response category.

The key principle, perhaps, is to avoid creating a mood-set or a mind-set early on in the questionnaire. For this reason it is important to commence the questionnaire with non-threatening questions that respondents can readily answer. After that it might be possible to move towards more personalized questions.

Completing a questionnaire can be seen as a learning process in which respondents become more at home with the task as they proceed. Initial questions should therefore be simple, have high interest value, and encourage participation. This will build up the confidence and motivation of the respondent. The middle section of the questionnaire should contain the difficult questions; the last few questions should be of high interest in order to encourage respondents to return the completed schedule.

A common sequence of a questionnaire is as follows:

- Commence with unthreatening factual questions (that, perhaps, will give the researcher some nominal data about the sample, e.g. age group, sex, occupation, years in post, qualifications etc.).
- Move to closed questions (e.g. dichotomous, multiple choice, rating scales, constant sum questions) about given statements or questions, eliciting responses that require opinions, attitudes, perceptions, views.
- Then move to more open-ended questions (or, maybe, intersperse these with more closed questions) that seek responses on opinions, attitudes, perceptions and views, together with reasons for the responses given. These responses and reasons might include sensitive or more personal data.

The move is from objective facts to subjective attitudes and opinions, through justifications, and on to sensitive, personalized data. Clearly the ordering is neither as discrete nor as straightforward as this. For example, an apparently innocuous question about age might be offensive to some respondents, a question about income is unlikely to go down well with somebody who has just become unemployed, and a question about religious belief might be seen as an unwarranted intrusion into private matters.

Indeed, many questionnaires keep questions about personal details until the very end.

The issue here is that the questionnaire designer has to anticipate the sensitivity of the topics in terms of the respondents, and this has a large sociocultural dimension. What is being argued here is that the logical ordering of a questionnaire has to be mediated by its psychological ordering. The instrument has to be viewed through the eyes of the respondent as well as the designer.

In addition to the overall sequencing of the questionnaire, Oppenheim (1992: ch. 7) suggests that the sequence within sections of the questionnaire is important. He indicates that the questionnaire designer can use funnels and filters within the questions. A funnelling process moves from the general to the specific, asking questions about the general context or issues and then moving toward specific points within that. A filter is used to include and exclude certain respondents, i.e. to decide if certain questions are relevant or irrelevant to them, and to instruct respondents about how to proceed (e.g. which items to jump to or proceed to). For example, if respondents indicate a ‘yes’ or a ‘no’ to a certain question, then this might exempt them from certain other questions in that section or subsequently.

Questionnaires containing few verbal items

The discussion so far has assumed that questionnaires are entirely word-based. This might be off-putting for many respondents, particularly children. In these circumstances a questionnaire might include visual information and ask participants to respond to this (e.g. pictures, cartoons, diagrams) or might include some projective visual techniques (e.g. to draw a picture or diagram, to join two related pictures with a line, to write the words or what someone is saying or thinking in a ‘bubble’ picture, to tell the story of a sequence of pictures together with personal reactions to it). The issue here is that in tailoring the format of the questionnaire to the characteristics of the sample, a very wide embrace might be necessary to take in non-word-based techniques. This is not only a matter of appeal to respondents, but, perhaps more significantly, a matter of accessibility of the questionnaire to the respondents, i.e. a matter of reliability and validity.

The layout of the questionnaire

The appearance of the questionnaire is vitally important. It must look easy, attractive and interesting rather than complicated, unclear, forbidding and boring. A compressed layout is uninviting and clutters everything together; a larger questionnaire with plenty of space for questions and answers is more encouraging to respondents. Verma and Mallick (1999: 120) suggest the use of high quality paper if funding permits.

Dillman et al. (1999) found that respondents tend to expect less of a form-filling task than is actually required. They expect to read a question, read the response, make a mark, and move on to the next question, but in many questionnaires it is more complicated than this. The rule is simple: keep it as uncomplicated as possible.

It is important, perhaps, for respondents to be introduced to the purposes of each section of a questionnaire, so that they can become involved in it and maybe identify with it. If space permits, it is useful to tell the respondent the purposes and focuses of the sections of the questionnaire, and the reasons for the inclusion of the items.

Clarity of wording and simplicity of design are essential. Clear instructions should guide respondents: ‘Put a tick’, for example, invites participation, whereas complicated instructions and complex procedures intimidate respondents. Putting ticks in boxes by way of answering a questionnaire is familiar to most respondents, whereas requests to circle precoded numbers at the right-hand side of the questionnaire can be a source of confusion and error. In some cases it might also be useful to include an example of how to fill in the questionnaire (e.g. ticking a box, circling a statement), though, clearly, care must be exercised to avoid leading the respondents to answer questions in a particular way by dint of the example provided (e.g. by suggesting what might be a desired answer to the subsequent questions). Verma and Mallick (1999: 121) suggest the use of emboldening to draw the respondent’s attention to significant features.

Ensure that short, clear instructions accompany each section of the questionnaire. Repeating instructions as often as necessary is good practice in a postal questionnaire. Since everything hinges on respondents knowing exactly what is required of them, clear, unambiguous instructions, boldly and attractively displayed, are essential.

Clarity and presentation also impact on the numbering of the questions. For example, a four-page questionnaire might contain sixty questions, broken down into four sections. It might be off-putting to respondents to number each question (1–60), as the list will seem interminably long, whereas numbering each section 1–4 makes the questionnaire look manageable. Hence it is useful, in the interests of clarity and logic, to break down the questionnaire into subsections with section headings. This will also indicate the overall logic and coherence of the questionnaire to the respondents, enabling them to ‘find their way’ through the questionnaire. It might be useful to preface each subsection with a brief introduction that tells them the purpose of that section.

The practice of sectionalizing and sublettering questions (e.g. Q9 (a) (b) (c) . . .) is a useful technique for grouping together questions about a specific issue. It is also a way of making the questionnaire look smaller than it actually is!

This previous point also requires the questionnaire designer to make it clear if respondents are exempted from completing certain questions or sections of the questionnaire (discussed earlier in the section on filters). If so, then it is vital that the sections or questions are numbered so that the respondent knows exactly where to move to next. Here the instruction might be, for example: ‘If you have answered ‘‘yes’’ to question 10 please go to question 15, otherwise continue with question 11’, or, for example: ‘If you are the school principal please answer this section, otherwise proceed to section 3’.
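Skip-logic of this kind can be represented as a simple routing table. The sketch below is a hypothetical illustration only: the `routes` table mirrors the wording of the first instruction above, and `next_question` is an invented helper, not part of any survey tool.

```python
# Hypothetical skip-logic table: a "yes" to question 10 routes the
# respondent past questions 11-14, straight to question 15.
routes = {(10, "yes"): 15, (10, "no"): 11}

def next_question(current: int, answer: str) -> int:
    """Return the next question number, defaulting to the one that
    immediately follows when no explicit route is defined."""
    return routes.get((current, answer), current + 1)

print(next_question(10, "yes"))   # routed past the exempted questions
print(next_question(10, "no"))    # continues with question 11
print(next_question(3, "maybe"))  # no route defined: falls through to 4
```

Making every branch explicit in one table like this is also a cheap way to audit a draft questionnaire for dead ends before piloting.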

Arrange the contents of the questionnaire in such a way as to maximize cooperation. For example, include questions that are likely to be of general interest. Make sure that questions that appear early in the format do not suggest to respondents that the enquiry is not intended for them. Intersperse attitude questions throughout the schedule to allow respondents to air their views rather than merely describe their behaviour. Such questions relieve boredom and frustration as well as providing valuable information in the process.

Coloured pages can help to clarify the overall structure of the questionnaire, and the use of different colours for instructions can assist respondents.

It is important to include in the questionnaire, perhaps at the beginning, assurances of confidentiality, anonymity and non-traceability, for example by indicating that respondents need not give their name, that the data will be aggregated, and that individuals will not be able to be identified through the use of categories or details of their location etc. (i.e. that it will not be possible to put together a traceable picture of the respondents through the compiling of nominal, descriptive data about them). In some cases, however, the questionnaire might ask respondents to put their names so that they can be traced for follow-up interviews in the research (Verma and Mallick 1999: 121); here the guarantee of eventual anonymity and non-traceability will still need to be given.

Redline et al. (2002) indicate that placing the response categories to the immediate right of the text increases the chance of an item being answered (the visual location issue), and that making the material more salient (e.g. through emboldening and capitalization) can increase the chances of its being addressed (the visibility issue). This is particularly important for branching questions and instructions.

Redline et al. (2002) also note that questions placed at the bottom of a page tend to receive more non-response than questions placed further up the page. Indeed, they found that putting instructions at the bottom of the page, particularly if they apply to items on the next page, can easily lead to those instructions being overlooked. It is important, then, to consider what should go at the bottom of the page, perhaps placing less important items at that point. Redline et al. (2002) suggest that questions with branching instructions should not be placed at the bottom of a page.

Finally, a brief note at the very end of the questionnaire can: ask respondents to check that no answer has been inadvertently missed out; solicit an early return of the completed schedule; thank respondents for their participation and cooperation; and offer to send a short abstract of the major findings when the analysis is completed.

Covering letters or sheets and follow-up letters

The purpose of the covering letter or sheet is to indicate the aim of the research, to convey to respondents its importance, to assure them of confidentiality, and to encourage their replies. The covering letter or sheet should:

- provide a title to the research
- introduce the researcher, her/his name, address, organization, contact telephone/fax/email address, together with an invitation to feel free to contact the researcher for further clarification or details
- indicate the purposes of the research
- indicate the importance and benefits of the research
- indicate why the respondent has been selected for receipt of the questionnaire
- indicate any professional backing, endorsement or sponsorship of, or permission for, the research (e.g. university, professional associations, government departments: the use of a logo can be helpful here)
- set out how to return the questionnaire (e.g. in the accompanying stamped, addressed envelope, in a collection box in a particular institution, to a named person; whether the questionnaire will be collected – and when, where and by whom)
- indicate the address to which to return the questionnaire
- indicate what to do if questions or uncertainties arise
- indicate a return-by date
- indicate any incentives for completing the questionnaire
- provide assurances of confidentiality, anonymity and non-traceability
- indicate how the results will and will not be disseminated, and to whom
- thank respondents in advance for their cooperation.

Verma and Mallick (1999: 122) suggest that, where possible, it is useful to personalize the letter, avoiding ‘Dear colleague’, ‘Dear Madam/Ms/Sir’ etc., and replacing these with exact names.

With these intentions in mind, the following practices are to be recommended:

- The appeal in the covering letter must be tailored to suit the particular audience. Thus, a survey of teachers might stress the importance of the study to the profession as a whole.
- Neither the use of prestigious signatories, nor appeals to altruism, nor the addition of handwritten postscripts affects response levels to postal questionnaires.
- The name of the sponsor or the organization conducting the survey should appear on the letterhead as well as in the body of the covering letter.
- A direct reference should be made to the confidentiality of respondents’ answers, and the purposes of any serial numbers and codings should be explained.
- A pre-survey letter advising respondents of the forthcoming questionnaire has been shown to have a substantial effect on response rates.

A short covering letter is most effective; aim at no more than one page. An example of a covering letter for teachers and senior staff might be as follows:

Dear Colleague,

IMPROVING SCHOOL EFFECTIVENESS

We are asking you to take part in a project to improve school effectiveness, by completing this short research questionnaire. The project is part of your school development, support management and monitoring of school effectiveness, and the project will facilitate a change management programme that will be tailor-made for the school. This questionnaire is seeking to identify the nature, strengths and weaknesses of different aspects of your school, particularly in respect of those aspects of the school over which the school itself has some control. It would be greatly appreciated if you would be involved in this process by completing the sheets attached, and returning them to me. Please be as truthful as possible in completing the questionnaire.

You do not need to write your name, and no individuals will be identified or traced from this, i.e. confidentiality and anonymity are assured. If you wish to discuss any aspects of the review or this document please do not hesitate to contact me. I hope that you will feel able to take part in this project.

Thank you.

Signed

Contact details (address, fax, telephone, email)

Another example might be:

Dear Colleague,

PROJECT ON CONDUCTING EDUCATIONAL RESEARCH

I am conducting a small-scale piece of research into issues facing researchers undertaking investigations in education. The topic is very much under-researched in education, and that is why I intend to explore the area.

I am asking you to be involved as you yourself have conducted empirical work as part of a Master’s or doctorate degree. No one knows the practical problems facing the educational researcher better than you.

The enclosed questionnaire forms part of my investigation. May I invite you to spend a short time in its completion?

If you are willing to be involved, please complete the questionnaire and return it to XXX by the end of November. You may either place it in the collection box at the General Office at my institution or send it by post (stamped addressed envelope enclosed), or by fax or email attachment.

The questionnaire will take around fifteen minutes to complete. It employs rating scales and asks for your comments and a few personal details. You do not need to write your name, and you will not be able to be identified or traced. ANONYMITY AND NON-TRACEABILITY ARE ASSURED. When completed, I intend to publish my results in an education journal.

If you wish to discuss any aspects of the study then please do not hesitate to contact me.

I very much hope that you will feel able to participate. May I thank you, in advance, for your valuable cooperation.

Yours sincerely,

Signed

Contact details (address, fax, telephone, email)

For a further example of a questionnaire see the accompanying web site (http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.1.doc).

Piloting the questionnaire

It bears repeating that the wording of questionnaires is of paramount importance and that pretesting is crucial to their success (see http://www.routledge.com/textbooks/9780415368780 – Chapter 15, file 15.12.ppt). A pilot has several functions, principally to increase the reliability, validity and practicability of the questionnaire (Oppenheim 1992; Morrison 1993; Wilson and McLean 1994: 47):

- to check the clarity of the questionnaire items, instructions and layout
- to gain feedback on the validity of the questionnaire items, the operationalization of the constructs and the purposes of the research
- to eliminate ambiguities or difficulties in wording
- to check readability levels for the target audience
- to gain feedback on the type of question and its format (e.g. rating scale, multiple choice, open, closed etc.)
- to gain feedback on response categories for closed questions and multiple choice items, and on the appropriateness of specific questions or stems of questions
- to identify omissions, redundant and irrelevant items
- to gain feedback on leading questions
- to gain feedback on the attractiveness and appearance of the questionnaire
- to gain feedback on the layout, sectionalizing, numbering and itemization of the questionnaire
- to check the time taken to complete the questionnaire
- to check whether the questionnaire is too long or too short, too easy or too difficult
- to generate categories from open-ended responses to use as categories for closed response-modes (e.g. rating scale items)
- to identify how motivating/non-motivating/sensitive/threatening/intrusive/offensive items might be
- to identify redundant questions (e.g. those questions which consistently gain a total ‘yes’ or ‘no’ response: Youngman 1984: 172), i.e. those questions with little discriminability
- to identify which items are too easy, too difficult, too complex or too remote from the respondents’ experience
- to identify commonly misunderstood or non-completed items (e.g. by studying common patterns of unexpected response and non-response (Verma and Mallick 1999: 120))
- to try out the coding/classification system for data analysis.

In short, as Oppenheim (1992: 48) remarks, everything about the questionnaire should be piloted; nothing should be excluded, not even the typeface or the quality of the paper.


The above outline describes a particular kind of pilot: one that does not focus on data, but on matters of coverage and format, gaining feedback from a limited number of respondents and experts on the items set out above.

There is a second type of pilot. This is one which starts with a long list of items and, through statistical analysis and feedback, reduces those items (Kgaile and Morrison 2006). For example, a researcher may generate an initial list of, say, 120 items to be included in a questionnaire, and wish to know which items to excise. A pilot is conducted on a sizeable and representative number of respondents (e.g. 50–100) and this generates real data – numerical responses. These data can be analysed for the following factors:

- Reliability: those items with low reliability (Cronbach’s alpha for internal consistency: see Part Five) can be removed.
- Collinearity: if items correlate very strongly with others then a decision can be taken to remove one or more of them, provided, of course, that this does not result in the loss of important areas of the research (i.e. human judgement would have to prevail over statistical analysis).
- Multiple regression: those items with low betas (see Part Five) can be removed, provided, of course, that this does not result in the loss of important areas of the research (i.e. human judgement would have to prevail over statistical analysis).
- Factor analysis: to identify clusters of key variables and to identify redundant items (see Part Five).

As a result of such analysis, the items for removal can be identified, and this can result in a questionnaire of manageable proportions. It is important to have a good-sized and representative sample here in order to generate reliable data for statistical analysis; with too few respondents to this type of pilot, important items may be excluded from the final questionnaire.
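The statistical screening described above can be sketched in code. The following is a minimal illustration, not taken from the chapter: the item scores are invented, Cronbach’s alpha is computed from the item and total-score variances, and strongly correlated item pairs are flagged as candidates for removal (the final decision, as noted above, rests with human judgement rather than with the statistics).

```python
# Illustrative sketch: screening pilot questionnaire items by internal
# consistency (Cronbach's alpha) and collinearity. All data are invented.
from statistics import pvariance
from itertools import combinations

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_var = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def collinear_pairs(items, threshold=0.9):
    """Flag item pairs correlating above the threshold as removal candidates."""
    return [(i, j) for i, j in combinations(range(len(items)), 2)
            if abs(pearson_r(items[i], items[j])) > threshold]

# Five respondents' ratings on three pilot items (invented):
pilot = [[4, 5, 3, 4, 2],
         [4, 5, 3, 5, 2],   # near-duplicate of the first item
         [5, 4, 3, 3, 2]]
print(round(cronbach_alpha(pilot), 2))
print(collinear_pairs(pilot))
```

Here the second item correlates very strongly with the first, so the pair is flagged; whether to drop one of them would depend on whether the content area remains covered.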

Practical considerations in questionnaire design

Taking the issues discussed so far in questionnaire design, a range of practical implications for designing a questionnaire can be highlighted:

- Operationalize the purposes of the questionnaire carefully.
- Be prepared to have a pre-pilot to generate items for a pilot questionnaire, and then be ready to modify the pilot questionnaire for the final version.
- If the pilot includes many items, and the intention is to reduce the number of items through statistical analysis or feedback, then be prepared to have a second round of piloting, after the first pilot has been modified.
- Decide on the most appropriate type of question – dichotomous, multiple choice, rank orderings, rating scales, constant sum, ratio, closed, open.
- Ensure that every issue has been explored exhaustively and comprehensively; decide on the content and explore it in depth and breadth.
- Use several items to measure a specific attribute, concept or issue.
- Ensure that the data acquired will answer the research questions.
- Ask more closed than open questions for ease of analysis (particularly in a large sample).
- Balance comprehensiveness and exhaustive coverage of issues with the demotivating factor of having respondents complete several pages of a questionnaire.
- Ask only one thing at a time in a question. Use single sentences per item wherever possible.
- Keep response categories simple.
- Avoid jargon.
- Keep statements in the present tense wherever possible.
- Strive to be unambiguous and clear in the wording.
- Be simple, clear and brief wherever possible.
- Clarify the kinds of responses required in open questions.


- Balance brevity with politeness (Oppenheim 1992: 122). It might be advantageous to replace a blunt phrase like ‘marital status’ with a gentler ‘Please indicate whether you are married, living with a partner, or single . . .’ or ‘I would be grateful if you would tell me whether you are married, living with a partner, or single’.
- Ensure a balance of questions that ask for facts and opinions (this is especially true if statistical correlations and cross-tabulations are required).
- Avoid leading questions.
- Try to avoid threatening questions.
- Do not assume that respondents know the answers, or have information to answer the questions, or will always tell the truth (wittingly or not). Therefore include ‘don’t know’, ‘not applicable’, ‘unsure’, ‘neither agree nor disagree’ and ‘not relevant’ categories.
- Avoid making the questions too hard.
- Balance the number of negative questions with the number of positive questions (Black 1999: 229).
- Consider the readability levels of the questionnaire and the reading and writing abilities of the respondents (which may lead the researcher to conduct the questionnaire as a structured interview).
- Put sensitive questions later in the questionnaire in order to avoid creating a mental set in the mind of respondents, but not so late in the questionnaire that boredom and lack of concentration have set in.
- Intersperse sensitive questions with non-sensitive questions.
- Be very clear on the layout of the questionnaire so that it is unambiguous and attractive (this is particularly the case if a computer program is going to be used for data analysis).
- Avoid, where possible, splitting an item over more than one page, as the respondent may think that the item from the previous page is finished.
- Ensure that the respondent knows how to enter a reply to each question, e.g. by underlining, circling, ticking, writing; provide the instructions for introducing, completing and returning (or collection of) the questionnaire (provide a stamped addressed envelope if it is to be a postal questionnaire).
- Pilot the questionnaire, using a group of respondents who are drawn from the possible sample but who will not receive the final, refined version.
- With the data analysis in mind, plan so that the appropriate scales and kinds of data (e.g. nominal, ordinal, interval and ratio) are used.
- Decide how to avoid falsification of responses (e.g. introduce a checking mechanism into the questionnaire: responses to another question on the same topic or issue).
- Be satisfied if you receive a 50 per cent response to the questionnaire; decide what you will do with missing data and what the significance of the missing data is (that might have implications for the strata of a stratified sample targeted in the questionnaire), and why the questionnaires have not been completed and returned. For example, were the questions too threatening or was the questionnaire too long? (This might have been signalled in the pilot.)
- Include a covering explanation, thanking the potential respondent for anticipated cooperation, indicating the purposes of the research, how anonymity and confidentiality will be addressed, who you are and what position you hold, and who will be party to the final report.
- If the questionnaire is going to be administered by someone other than the researcher, ensure that instructions for administration are provided and that they are clear.

A key issue that permeates this lengthy list is for the researcher to pay considerable attention to respondents: to see the questionnaire through their eyes, and to envisage how they will regard it (e.g. from hostility to suspicion to apathy to grudging compliance to welcome; from easy to difficult, from motivating to boring, from straightforward to complex etc.).

Administering questionnaires

Questionnaires can be administered in several ways, including:

- self-administration
- post
- face-to-face interview
- telephone
- Internet.

Here we discuss only self-administered and postal questionnaires. Chapter 16 covers administration by face-to-face interview and by telephone, and Chapter 10 covers administration by the Internet. We also refer readers to Chapter 9 on surveys, to the section on conducting surveys by interview.

Self-administered questionnaires

There are two types of self-administered questionnaire: those that are completed in the presence of the researcher and those that are filled in when the researcher is absent (e.g. at home, in the workplace).

Self-administered questionnaires in the presence of the researcher

The presence of the researcher is helpful in that it enables any queries or uncertainties to be addressed immediately with the questionnaire designer. Further, it typically ensures a good response rate (e.g. when undertaken with teachers at a staff meeting or with students in one or more classes). It also ensures that all the questions are completed (the researcher can check these before finally receiving the questionnaire) and filled in correctly (e.g. no rating scale items that have more than one entry per item, and no missed items). It means that the questionnaires are completed rapidly and on one occasion, i.e. it can gather data from many respondents simultaneously.

On the other hand, having the researcher present may be threatening and exert a sense of compulsion: respondents may feel uncomfortable about completing the questionnaire, and may not want to complete it or even start it. Respondents may also want extra time to think about and complete the questionnaire, maybe at home, and they are denied the opportunity to do this.

Having the researcher present also places pressure on the researcher to attend at an agreed time and in an agreed place, and this may be time-consuming and require the researcher to travel extensively, thereby extending the time frame for data collection. Travel costs for conducting the research with dispersed samples could also be high.

Self-administered questionnaires without the presence of the researcher

The absence of the researcher is helpful in that it enables respondents to complete the questionnaire in private, to devote as much time as they wish to its completion, to be in familiar surroundings, and to avoid the potential threat or pressure to participate caused by the researcher’s presence. It can be inexpensive to operate, and is more anonymous than having the researcher present. This latter point, in turn, can render the data more or less honest: it is perhaps harder to tell lies or not to tell the whole truth in the presence of the researcher, and it is also easier to be very honest and revealing about sensitive matters without the presence of the researcher.

The down side, however, is that the researcher is not there to address any queries or problems that respondents may have, and they may omit items or give up rather than try to contact the researcher. Respondents may also wrongly interpret and, consequently, answer questions inaccurately. They may present an untrue picture to the researcher, for example answering what they would like a situation to be rather than what the actual situation is, or painting a falsely negative or positive picture of the situation or themselves. Indeed, the researcher has no control over the environment in which the questionnaire is completed, e.g. time of day, noise distractions, presence of others with whom to discuss the questions and responses, seriousness given to the completion of the questionnaire, or even whether it is completed by the intended person.

Postal questionnaires

Frequently, the postal questionnaire is the best form of survey in an educational inquiry. Take, for example, the researcher intent on investigating the adoption and use made of a new curriculum series in secondary schools. An interview survey based upon some sampling of the population of schools would be both expensive and time-consuming. A postal questionnaire, on the other hand, would have several distinct advantages. Moreover, given the usual constraints over finance and resources, it might well prove the only viable way of carrying through such an inquiry.

What evidence we have about the advantages and disadvantages of postal surveys derives from settings other than educational ones. Many of the findings, however, have relevance to the educational researcher. Here, we focus upon some of the ways in which educational researchers can maximize the response level that they obtain when using postal surveys.

A number of myths about postal questionnaires are not borne out by the evidence (see Hoinville and Jowell 1978). Response levels to postal surveys are not invariably lower than those obtained by interview procedures; frequently they equal, and in some cases surpass, those achieved in interviews. Nor does the questionnaire necessarily have to be short in order to obtain a satisfactory response level. With sophisticated respondents, for example, a short questionnaire might appear to trivialize complex issues with which they are familiar. Hoinville and Jowell (1978) identify a number of factors in securing a good response rate to a postal questionnaire.

Initial mailing

- Use good-quality envelopes, typed and addressed to a named person wherever possible.
- Use first-class – rapid – postage services, with stamped rather than franked envelopes wherever possible.
- Enclose a first-class stamped envelope for the respondent’s reply.
- In surveys of the general population, Thursday is the best day for mailing out; in surveys of organizations, Monday or Tuesday is recommended.
- Avoid at all costs a December survey (questionnaires will be lost in the welter of Christmas postings in the western world).

Follow-up letter

Of the four factors that Hoinville and Jowell (1978) discuss in connection with maximizing response levels, the follow-up letter has been shown to be the most productive. The following points should be borne in mind in preparing reminder letters:

- All of the rules that apply to the covering letter apply even more strongly to the follow-up letter.
- The follow-up should re-emphasize the importance of the study and the value of the respondent’s participation.
- The use of the second person singular, the conveying of an air of disappointment at non-response, and some surprise at non-cooperation have been shown to be effective ploys.
- Nowhere should the follow-up give the impression that non-response is normal or that numerous non-responses have occurred in the particular study.
- The follow-up letter must be accompanied by a further copy of the questionnaire together with a first-class stamped addressed envelope for its return.

Second and third reminder letters suffer from the law of diminishing returns, so how many follow-ups are recommended and what success rates do they achieve? It is difficult to generalize, but the following points are worth bearing in mind. A well-planned postal survey should obtain at least a 40 per cent response rate and, with the judicious use of reminders, a 70 per cent to 80 per cent response level should be possible. A preliminary pilot survey is invaluable in that it can indicate the general level of response to be expected. The main survey should generally achieve at least as high a level of return as, and normally a higher one than, the pilot inquiry. The Office of Population Censuses and Surveys recommends the use of three reminders which, they say, can increase the original return by as much as 30 per cent in surveys of the general public. A typical pattern of responses to the three follow-ups is as follows:

Original dispatch: 40 per cent
First follow-up: +20 per cent
Second follow-up: +10 per cent
Third follow-up: +5 per cent
Total: 75 per cent
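The cumulative arithmetic of this pattern can be sketched briefly; the wave names and percentages below are taken from the table above, and the code itself is purely illustrative:

```python
# Cumulative response rate across the original dispatch and three
# follow-ups, using the figures reported in the text.
waves = {"original dispatch": 40, "first follow-up": 20,
         "second follow-up": 10, "third follow-up": 5}

total = 0
for wave, gain in waves.items():
    total += gain
    print(f"{wave}: +{gain} per cent (cumulative {total} per cent)")
```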

Bailey (1994: 163–9) shows that follow-ups can be both by mail and by telephone. If a follow-up letter is sent, then this should be around three weeks after the initial mailing. A second follow-up is also advisable, and this should take place one week after the first follow-up. Bailey (1994: 165) reports research that indicates that a second follow-up can elicit up to a 95.6 per cent response rate, compared to a 74.8 per cent response with no follow-up. A telephone call in advance of the questionnaire can also help in boosting response rates (by up to 8 per cent).

Incentives

An important factor in maximizing response rates is the use of incentives. Although such usage is comparatively rare in British surveys, it can substantially reduce non-response rates, particularly when the chosen incentives accompany the initial mailing rather than being mailed subsequently as rewards for the return of completed schedules. The explanation of the effectiveness of this particular ploy appears to lie in the sense of obligation that is created in the recipient. Care is needed in selecting the most appropriate type of incentive. It should clearly be seen as a token rather than a payment for the respondent’s efforts and, according to Hoinville and Jowell (1978), should be as neutral as possible. In this respect, they suggest that books of postage stamps or ballpoint pens are cheap, easily packaged in the questionnaire envelopes, and appropriate to the task required of the respondent.

The preparation of a flow chart can help the researcher to plan the timing and the sequencing of the various parts of a postal survey. One such flow chart, suggested by Hoinville and Jowell (1978), is shown in Box 15.5. The researcher might wish to add a chronological chart alongside it to help plan the exact timing of the events shown here.

Validity

Our discussion so far has concentrated on ways of increasing the response rate of postal questionnaires; we have said nothing yet about the validity of this particular technique.

The validity of postal questionnaires can be seen from two viewpoints, according to Belson (1986): first, whether respondents who complete questionnaires do so accurately, and second, whether those who fail to return their questionnaires would have given the same distribution of answers as did the returnees.

The question of accuracy can be checked by means of the intensive interview method, a technique consisting of twelve principal tactics that include familiarization, temporal reconstruction, probing and challenging. The interested reader should consult Belson (1986: 35–8).

The problem of non-response (the issue of ‘volunteer bias’, as Belson calls it) can, in part, be checked on and controlled for, particularly when the postal questionnaire is sent out on a continuous basis. It involves follow-up contact with non-respondents by means of interviewers trained to secure interviews with such people. A comparison is then made between the replies of respondents and non-respondents.

Box 15.5
A flow chart for the planning of a postal survey

- Prepare questionnaires and covering letters
- Prepare incentives (if relevant)
- Prepare stamped addressed return envelopes
- Address and serial-number labels and attach them to outward envelopes
- Enter the serial number from the labels on the questionnaires and on the covering letters
- Insert questionnaires, covering letters, (incentives) and return envelopes into outward envelopes
- Seal and stamp outward envelopes
- Mailing
- Book in completed questionnaires against sample serial numbers
- Mail first reminder to non-respondents
- Mail second reminder to non-respondents
- Transfer questionnaires to data preparation staff
- Prepare final response summaries
- Send letter of thanks to all respondents

Source: Hoinville and Jowell 1978

Processing questionnaire data

Let us assume that researchers have followed the advice we have given about the planning of postal questionnaires and have secured a high response rate to their surveys. Their task is now to reduce the mass of data they have obtained to a form suitable for analysis. ‘Data reduction’, as the process is called, generally consists of coding data in preparation for analysis – by hand in the case of small surveys; by computer when numbers are larger. First, however, prior to coding, the questionnaires have to be checked. This task is referred to as editing.

Editing questionnaires is intended to identify and eliminate errors made by respondents. (In addition to the clerical editing that we discuss in this section, editing checks are also performed by the computer. For an account of computer-run structure checks and valid coding range checks, see Hoinville and Jowell (1978: 150–5).) Moser and Kalton (1977) point to three central tasks in editing:

- Completeness: a check is made that there is an answer to every question. In most surveys, interviewers are required to record an answer to every question (a ‘not applicable’ category always being available). Missing answers can sometimes be cross-checked from other sections of the survey. At worst, respondents can be contacted again to supply the missing information.
- Accuracy: as far as is possible, a check is made that all questions are answered accurately. Inaccuracies arise out of carelessness on the part of either interviewers or respondents. Sometimes a deliberate attempt is made to mislead. A tick in the wrong box, a ring round the wrong code, an error in simple arithmetic – all can reduce the validity of the data unless they are picked up in the editing process.
- Uniformity: a check is made that interviewers have interpreted instructions and questions uniformly. Sometimes the failure to give explicit instructions over the interpretation of respondents’ replies leads to interviewers recording the same answer in a variety of answer codes instead of one. A check on uniformity can help eradicate this source of error.
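A simple completeness check of the kind described above might be sketched as follows; the field names, serial numbers and responses are invented for illustration:

```python
# Illustrative sketch: a clerical-style completeness check run before
# coding begins. Field names and serial numbers are invented.
EXPECTED = ["sex", "marital_status", "q1", "q2"]

def incomplete(responses):
    """Return, per questionnaire serial number, the questions left unanswered."""
    problems = {}
    for serial, answers in responses.items():
        missing = [q for q in EXPECTED if answers.get(q) in (None, "")]
        if missing:
            problems[serial] = missing
    return problems

batch = {
    "001": {"sex": "F", "marital_status": "married", "q1": 4, "q2": 2},
    "002": {"sex": "M", "marital_status": "", "q1": 5},  # two gaps
}
print(incomplete(batch))
```

In practice, as the text notes, a flagged questionnaire would be cross-checked against other sections of the survey or, at worst, the respondent contacted again.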

The primary task of data reduction is coding, that is, assigning a code number to each answer to a survey question. Of course, not all answers to survey questions can be reduced to code numbers. Many open-ended questions, for example, are not reducible in this way for computer analysis. Coding can be built into the construction of the questionnaire itself. In this case, we talk of precoded answers. Where coding is developed after the questionnaire has been administered and answered by respondents, we refer to post-coded answers. Precoding is appropriate for closed-ended questions – male 1, female 2, for example; or single 1, married 2, separated 3, divorced 4. For questions such as those whose answer categories are known in advance, a coding frame is generally developed before the interviewing commences so that it can be printed into the questionnaire itself. For open-ended questions (Why did you choose this particular in-service course rather than XYZ?), a coding frame has to be devised after the completion of the questionnaire. This is best done by taking a random sample of the questionnaires (10 per cent or more, time permitting) and generating a frequency tally of the range of responses as a preliminary to coding classification. Having devised the coding frame, the researcher can make a further check on its validity by using it to code up a further sample of the questionnaires. It is vital to get coding frames right from the outset – extending them or making alterations at a later point in the study is both expensive and wearisome.
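The procedure of tallying a random sample of open-ended responses to devise a post-coding frame might be sketched as follows; the answer texts and categories are invented, and codes are simply assigned by frequency:

```python
# Illustrative sketch: devising a post-coding frame for an open-ended
# question from a 10 per cent random sample of responses. The answer
# texts are invented for the example.
import random
from collections import Counter

answers = (["convenient location"] * 30 + ["course content"] * 50 +
           ["recommended by colleague"] * 20)

random.seed(1)  # reproducible sample for this sketch
sample = random.sample(answers, k=max(1, len(answers) // 10))

tally = Counter(sample)
# Assign code numbers to the observed categories, most frequent first:
frame = {answer: code for code, (answer, _) in
         enumerate(tally.most_common(), start=1)}
print(frame)
```

The resulting frame would then be validated, as the text suggests, by using it to code a further sample of questionnaires before committing to it.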

There are several computer packages that will process questionnaire survey data. At the time of writing one such is SphinxSurvey. This package, like others of its type, assists researchers in the design, administration and processing of questionnaires, either for paper-based or for on-screen administration. Responses can be entered rapidly, and data can be examined automatically, producing graphs and tables, as well as a wide range of statistics (the Plus edition offers lexical analysis of open-ended text, and the Lexica Edition has additional functions for qualitative data analysis). A web site for previewing a demonstration of this programme can be found at http://www.scolari.co.uk and is typical of several of its kind.

While coding is usually undertaken by the researcher, Sudman and Bradburn (1982: 149) also make the case for coding by the respondents themselves, to increase validity. This is particularly valuable in open-ended questionnaire items, though, of course, it does assume not only the willingness of respondents to become involved post hoc but also that the researcher can identify and trace the respondents, which, as was indicated earlier, is an ethical matter.